New output in ComputeChainObjfAndDeriv #2924

Open
danpovey opened this issue Dec 18, 2018 · 3 comments
Labels: in progress (Issue has been taken and is being worked on), stale (Stale bot on the loose)

Comments

@danpovey
Contributor

@hhadian, when you have a chance can you please do the implementation work for the
'numerator_post' thing below? Again, this can go to the 'svd_draft' branch in my personal
repo for now.
I'll need this for both the regular and e2e egs.

diff --git a/src/chain/chain-training.h b/src/chain/chain-training.h
index 6ea70b5ca..63e03c7e3 100644
--- a/src/chain/chain-training.h
+++ b/src/chain/chain-training.h
@@ -99,7 +99,7 @@ struct ChainTrainingOptions {
                            example; you'll want to divide it by 'tot_weight' before
                            displaying it.
    @param [out] l2_term  The l2 regularization term in the objective function, if
-                           the --l2-regularize option is used.  To be added to 'o
+                         the --l2-regularize option is used (else will be set to 0.0).
    @param [out] weight     The weight to normalize the objective function by;
                            equals supervision.weight * supervision.num_sequences *
                            supervision.frames_per_sequence.
@@ -115,6 +115,10 @@ struct ChainTrainingOptions {
                            peak memory use).  xent_output_deriv will be used in
                            the cross-entropy regularization code; it is also
                            used in computing the cross-entropy objective value.
+   @param [out] numerator_post  If non-NULL, then the posterior from the numerator
+                           forward-backward will be written here (note: it won't be
+                           scaled by the supervision weight).  This is intended for
+                           use in the adaptation framework used in "chaina" training.
 */
 void ComputeChainObjfAndDeriv(const ChainTrainingOptions &opts,
                               const DenominatorGraph &den_graph,
@@ -124,7 +128,8 @@ void ComputeChainObjfAndDeriv(const ChainTrainingOptions &opts,
                               BaseFloat *l2_term,
                               BaseFloat *weight,
                               CuMatrixBase<BaseFloat> *nnet_output_deriv,
-                              CuMatrix<BaseFloat> *xent_output_deriv = NULL);
+                              CuMatrix<BaseFloat> *xent_output_deriv = NULL,
+                              Posterior *numerator_post = NULL);
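
The contract described for 'numerator_post' (optional, filled only when non-NULL, not scaled by the supervision weight) can be illustrated with a minimal, self-contained sketch. It is not the requested implementation: `DenseToPosterior` and `min_post` are hypothetical names, the dense input stands in for the occupation probabilities from the numerator forward-backward, and the local `Posterior` typedef only mirrors the shape of Kaldi's type (one list of (pdf-id, probability) pairs per row).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Local stand-in with the same shape as Kaldi's Posterior typedef:
// one entry per row (frame), each a list of (pdf-id, probability) pairs.
typedef std::vector<std::vector<std::pair<int, float> > > Posterior;

// Hypothetical helper, not part of Kaldi.  Converts a dense matrix of
// per-row occupation probabilities into the sparse Posterior form,
// keeping only entries greater than 'min_post'.  The values are copied
// as-is: no supervision weight is applied.
void DenseToPosterior(const std::vector<std::vector<float> > &dense,
                      float min_post,
                      Posterior *numerator_post = NULL) {
  assert(min_post >= 0.0f);
  if (numerator_post == NULL) return;  // Caller did not ask for this output.
  numerator_post->clear();
  numerator_post->resize(dense.size());
  for (std::size_t row = 0; row < dense.size(); ++row) {
    for (std::size_t pdf = 0; pdf < dense[row].size(); ++pdf) {
      if (dense[row][pdf] > min_post) {
        (*numerator_post)[row].push_back(
            std::make_pair(static_cast<int>(pdf), dense[row][pdf]));
      }
    }
  }
}
```

Because the pointer defaults to NULL, existing callers that do not pass the extra argument keep their current behavior, which matches the defaulted parameter in the declaration above.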
 
 

@danpovey
Contributor Author

... and the order of 'numerator_post' should be the same as the order of the rows of 'input'. That should probably be clarified in the documentation.
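
To make the ordering requirement concrete, here is a small sketch of reading one sequence back out of a row-ordered posterior. It assumes a frame-major row layout (the first frame of every sequence, then the second frame of every sequence, and so on); that layout is an assumption of this example, not something stated in this thread, and `RowIndex` and `ExtractSequence` are hypothetical helpers rather than Kaldi functions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Local stand-in with the same shape as Kaldi's Posterior typedef.
typedef std::vector<std::vector<std::pair<int, float> > > Posterior;

// Assumed frame-major layout: row = frame * num_sequences + seq.
inline std::size_t RowIndex(std::size_t frame, std::size_t seq,
                            std::size_t num_sequences) {
  assert(seq < num_sequences);
  return frame * num_sequences + seq;
}

// Hypothetical helper: collects the rows belonging to sequence 'seq'
// from a posterior whose rows follow the layout assumed above.
Posterior ExtractSequence(const Posterior &all_rows, std::size_t seq,
                          std::size_t num_sequences) {
  assert(num_sequences > 0 && seq < num_sequences);
  assert(all_rows.size() % num_sequences == 0);
  const std::size_t frames_per_sequence = all_rows.size() / num_sequences;
  Posterior one_sequence(frames_per_sequence);
  for (std::size_t t = 0; t < frames_per_sequence; ++t)
    one_sequence[t] = all_rows[RowIndex(t, seq, num_sequences)];
  return one_sequence;
}
```

If 'numerator_post' keeps the same row order as the matrix it was computed from, an index mapping like this works unchanged for both; if the two orders differed, every consumer would need its own permutation.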

@hhadian
Contributor

hhadian commented Dec 19, 2018

Will do.

@kkm000 added the "in progress" label Mar 31, 2019
@stale

stale bot commented Jun 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale bot added the "stale" label Jun 19, 2020