This repository was archived by the owner on Jul 18, 2024. It is now read-only.

Conversation

@tkornuta-ibm
Contributor

  • Fixes issues with the description of the hidden stream in the decoder with attention (it actually required a different dimension order than the one defined in output_data_definitions)
  • Standardized dimensions of the input/output hidden streams in both RNN models: batch-major!

That, along with fixed sizes of embeddings (padding), enabled us to properly parallelize both the model and the C4 pipeline

…RNN, AttDecGRU): batch first, c4 working in DataParallel
…nerate 19 symbols, which makes it really slow
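The batch-major convention matters because `nn.DataParallel` scatters and gathers tensors along dimension 0, while `nn.GRU` keeps its hidden state layer-major (`[layers, batch, hidden]`). A minimal sketch of the idea (not the PR's actual code; the module name and sizes are illustrative) is to transpose the hidden state at the module boundary so every tensor crossing it is batch-first:

```python
# Hedged sketch: batch-major hidden states for DataParallel compatibility.
# nn.DataParallel splits/joins along dim 0, so inputs AND outputs of the
# wrapped module must be batch-first; nn.GRU's hidden state is
# [layers, batch, hidden], so we transpose it on the way in and out.
import torch
import torch.nn as nn

class BatchFirstGRU(nn.Module):
    def __init__(self, input_size=8, hidden_size=16, num_layers=1):
        super().__init__()
        self.rnn = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)

    def forward(self, x, h0):
        # x:  [batch, seq, input]    -- already batch-major
        # h0: [batch, layers, hidden] -- batch-major at the boundary;
        # transpose to the [layers, batch, hidden] layout nn.GRU expects.
        out, hn = self.rnn(x, h0.transpose(0, 1).contiguous())
        # Return the hidden state batch-major again so DataParallel can
        # gather it along dim 0.
        return out, hn.transpose(0, 1)

model = BatchFirstGRU()
x = torch.randn(4, 5, 8)       # batch=4, seq=5, features=8
h0 = torch.zeros(4, 1, 16)     # batch-major initial hidden state
out, hn = model(x, h0)
print(out.shape, hn.shape)     # [4, 5, 16] and [4, 1, 16], both batch-first
```

With this layout, `nn.DataParallel(model)` can split both `x` and `h0` per GPU and concatenate the per-GPU hidden states back without any reshuffling.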
@tkornuta-ibm tkornuta-ibm requested a review from aasseman May 3, 2019 22:34
@tkornuta-ibm
Contributor Author

Early results:

  • 7 GPUs enabled me to use a batch size of 300
  • on that setting, one epoch (i.e. processing 4 folds of the joined training and validation sets, ~3000 samples) takes around 30 seconds

[Screenshot: Screen Shot 2019-05-03 at 4 31 06 PM]

@tkornuta-ibm tkornuta-ibm merged commit 484b2f6 into develop May 4, 2019
@tkornuta-ibm tkornuta-ibm deleted the parallel_att_decoder branch May 4, 2019 22:21