
Changes from vijay to the xconfig branch. #30

Open · wants to merge 13 commits into base: xconfig

Conversation

vijayaditya commented:

  1. Moved library files to a sub-directory, as we will have more library files in PR 1066.

danpovey (Owner) commented Nov 8, 2016

If you move them like that, perhaps you could rename them to e.g.
xconfig/utils.py and xconfig/layers.py, and import as xconfig.utils and
xconfig.layers, if you feel that's more Pythonic.


vijayaditya (Author) commented:

@vimalmanohar I have created the package structure according to our discussion.

vijayaditya (Author) commented:

@freewym Would you be able to help with this PR? You might be maintaining these scripts in the future, so it would be good to familiarize yourself with them.

Could you look at the LSTMP xconfig and write one for the normal LSTM (i.e., without the projection layers)?

freewym commented Nov 11, 2016

OK.

freewym commented Nov 14, 2016

@vijayaditya I made a PR to your repo to add an LSTM xconfig layer (without projection).

vijayaditya (Author) commented:

@danpovey Could you check the latest commit and let me know if you agree with the Nones class that I am using?

This is a hacky way to avoid explicitly specifying the type of a config variable when we don't know what the default has to be and want the C++ code to use its own default.

danpovey (Owner) commented:

Regarding the Nones stuff... I think negative values would normally be easier to follow. The Python code could just, say, not print out the config value in the config file if the value was -1, i.e. unset.
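The suggested alternative could be sketched like this (hypothetical helper; the real xconfig code is organized differently): treat -1 as the "unset" sentinel and simply omit that key when emitting the config line, so the C++ code falls back to its compiled-in default.

```python
def config_line(node_name, config):
    # Render one config line, omitting keys whose value is the
    # 'unset' sentinel -1, so the C++ default applies instead.
    # (Illustrative sketch, not the actual xconfig implementation.)
    parts = [node_name]
    for key, value in sorted(config.items()):
        if value == -1:  # unset: let the C++ default apply
            continue
        parts.append("{0}={1}".format(key, value))
    return " ".join(parts)

print(config_line("affine-layer", {"dim": 512, "input-rank": -1}))
# -> affine-layer dim=512
```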

vijayaditya (Author) commented:

Negative values would not work when we would like to specify things like means (we support bias-mean in NaturalGradientAffineComponent), so I thought this was a consistent way to just say that we don't know what this value is.

danpovey (Owner) commented:

Sorry, I think it's going to make the code harder for newbies to understand. I'd rather keep it obvious... e.g. in the case of bias-mean [which it's not clear we would ever need to expose at the xconfig level], you could just leave the default at zero. For cases where we want to, say, set obscure options like natural-gradient options, we could have a config like
natural-gradient-opts='input-rank=20 output-rank=40 alpha=2.0'
(this would involve changing the parsing code to be able to parse quoted strings with ='s in them, which is not hard). If there is a case where you really need to have an unset value for a particular config value, there will usually be a particular value that doesn't make sense, or that's going to remain the C++ default, that you could use as the default, like 0 or -1. This may be ugly, but at least the ugliness is highly localized. The advantage of 'dim=-1' is that any idiot can see what it means.

Also, remember that the script prints out the 'xconfig.expanded', and your nones would appear there and would look rather ugly. I'd rather have more human-interpretable unset values. [and note: if we eventually allow strings with = in them, they'd have to be printed quoted in xconfig.expanded.]
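Parsing quoted values like the `natural-gradient-opts` example could be done with shell-style tokenization, e.g. via `shlex` (a sketch only; the function name and the exact behavior of the real xconfig parser are assumptions):

```python
import shlex

def parse_config_line(line):
    # Parse an xconfig-style line into a layer type and a key->value dict.
    # shlex.split honors shell-style quoting, so a quoted value may itself
    # contain spaces and '=' signs, as in:
    #   natural-gradient-opts='input-rank=20 output-rank=40 alpha=2.0'
    tokens = shlex.split(line)
    layer_type, config = tokens[0], {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")  # split at the FIRST '='
        if not sep:
            raise ValueError("expected key=value, got: " + token)
        config[key] = value
    return layer_type, config

line = "affine-layer dim=512 natural-gradient-opts='input-rank=20 output-rank=40 alpha=2.0'"
print(parse_config_line(line))
```

As the comment above notes, re-printing such a line (e.g. in xconfig.expanded) would require quoting values that contain spaces or '=' again, which `shlex.quote` can do.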

vijayaditya (Author) commented:

OK, got it.

danpovey (Owner) commented:

It would be nice to see your example scripts that you are using for testing.

danpovey (Owner) commented:

BTW, if you make this a PR to the Kaldi repo it would be better; then others (not just me) will see it.

freewym and others added 2 commits November 17, 2016 13:15
Fix an asymmetry in how the derivatives were truncated outside the chunk for BLSTM training.  [Caution: may change BLSTM results.]

In the nnet3 training code there is a mechanism, --{min,max}-deriv-time, to stop processing derivatives outside a particular time range, which can be used to avoid wasteful and possibly harmful computation in, e.g., the +-40-frame context outside the chunk boundaries where the supervision lies. [E.g. the gradients may blow up there.] Due to an oversight, this was previously only applied on the left, i.e. the python script set --min-deriv-time but not --max-deriv-time. This commit fixes that, and also tunes the time values used in the scripts, to limit the derivatives to +-10 frames around the supervised chunk.

Results for BLSTM training are improved where tested.  Caution: if you are tuning BLSTM things, you may need to re-run baselines after you merge this change.
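The time limits described in the commit message could be computed as below (function and parameter names are illustrative; the actual nnet3 training scripts compute this differently in detail). With the supervised chunk at frames [0, chunk_width - 1] and a +-10-frame margin:

```python
def deriv_time_limits(chunk_width, deriv_truncate_margin=10):
    # Limit derivative processing to +-margin frames around the
    # supervised chunk [0, chunk_width - 1]: derivatives outside
    # [min_deriv_time, max_deriv_time] are not computed.
    # (Hypothetical helper, for illustration only.)
    min_deriv_time = -deriv_truncate_margin
    max_deriv_time = chunk_width - 1 + deriv_truncate_margin
    return min_deriv_time, max_deriv_time

print(deriv_time_limits(150))
# -> (-10, 159)
```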
Added (B)LSTM scripts for ami/s5b and tedlium/s5_r2
vijayaditya force-pushed the xconfig_vijay branch 3 times, most recently from 72e3261 to 4611e41 on November 21, 2016 02:04
  2. Added the ability to provide strings with '=' as values in xconfig.
  3. Added swbd recipes for TDNN, LSTM and BLSTM using xconfig.