
v0.8.0 -> v0.9.0 #1452

Closed
wants to merge 1 commit into from

Conversation

@myleott (Contributor) commented Dec 3, 2019

Possibly breaking changes:

  • Set global numpy seed (4a7cd58)
  • Split in_proj_weight into separate k, v, q projections in MultiheadAttention (fdf4c3e); see the migration sketch after this list
  • TransformerEncoder returns namedtuples instead of dict (27568a7)
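
For code or checkpoints written against v0.8.0, the attention change is the one most likely to need action: the fused in_proj_weight/in_proj_bias tensors no longer exist as such. The sketch below shows the shape of the migration; the helper name `split_in_proj` and the assumption that the fused weight is stacked [q; k; v] along dim 0 are illustrative, and fairseq's own checkpoint loading should perform the equivalent upgrade for standard models.

```python
def split_in_proj(state_dict, prefix, embed_dim):
    """Hypothetical helper: split a fused pre-0.9 attention projection.

    Assumes the old fused tensors were stacked [q; k; v] along dim 0 and
    that the new per-projection modules are named q_proj/k_proj/v_proj.
    `state_dict` is a plain dict of tensors and `prefix` is the module
    path (e.g. 'encoder.layers.0.self_attn.').
    """
    weight = state_dict.pop(prefix + 'in_proj_weight')
    bias = state_dict.pop(prefix + 'in_proj_bias', None)
    for i, name in enumerate(('q_proj', 'k_proj', 'v_proj')):
        start, end = i * embed_dim, (i + 1) * embed_dim
        state_dict[prefix + name + '.weight'] = weight[start:end].clone()
        if bias is not None:
            state_dict[prefix + name + '.bias'] = bias[start:end].clone()
    return state_dict
```

Likewise, downstream code that read the encoder output as a dict (e.g. `out['encoder_out']`) now accesses a field on the returned namedtuple (e.g. `out.encoder_out`, assuming that field name carries over).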

New features:

  • Add --fast-stat-sync option (e1ba32a)
  • Add --empty-cache-freq option (315c463)
  • Support criterions with parameters (ba5f829); see the criterion sketch after this list
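
As an illustration of the last item, the sketch below shows a criterion that owns a learnable parameter (a softmax temperature); with this change such parameters can be optimized together with the model. It is a minimal sketch written against the v0.9-era criterion API (`FairseqCriterion.__init__(self, args, task)` and `@register_criterion`); the criterion name, the temperature parameter, and the logging details are illustrative assumptions, not part of this release.

```python
import math

import torch
import torch.nn.functional as F

from fairseq.criterions import FairseqCriterion, register_criterion


@register_criterion('temperature_cross_entropy')  # hypothetical criterion name
class TemperatureCrossEntropyCriterion(FairseqCriterion):
    """Cross-entropy with a learnable softmax temperature.

    The temperature is an nn.Parameter owned by the criterion; as of this
    release such parameters can be trained alongside the model parameters.
    """

    def __init__(self, args, task):
        super().__init__(args, task)
        self.log_temperature = torch.nn.Parameter(torch.zeros(1))

    def forward(self, model, sample, reduce=True):
        net_output = model(**sample['net_input'])
        # Scale the logits by the learned temperature before the softmax.
        logits = net_output[0] / self.log_temperature.exp()
        lprobs = F.log_softmax(logits, dim=-1).view(-1, logits.size(-1))
        target = model.get_targets(sample, net_output).view(-1)
        # self.padding_idx comes from the base class (the target
        # dictionary's pad index).
        loss = F.nll_loss(
            lprobs,
            target,
            ignore_index=self.padding_idx,
            reduction='sum' if reduce else 'none',
        )
        sample_size = sample['ntokens']
        logging_output = {
            'loss': loss.data,
            'ntokens': sample['ntokens'],
            'nsentences': sample['target'].size(0),
            'sample_size': sample_size,
        }
        return loss, sample_size, logging_output

    @staticmethod
    def aggregate_logging_outputs(logging_outputs):
        # Sum per-worker logs and report loss in base 2, as fairseq
        # criterions conventionally do.
        loss_sum = sum(log.get('loss', 0) for log in logging_outputs)
        sample_size = sum(log.get('sample_size', 0) for log in logging_outputs)
        return {
            'loss': loss_sum / sample_size / math.log(2) if sample_size > 0 else 0.0,
            'ntokens': sum(log.get('ntokens', 0) for log in logging_outputs),
            'sample_size': sample_size,
        }
```

Such a criterion would then be selected like any other registered one, e.g. with `--criterion temperature_cross_entropy` (using the hypothetical name above).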

New papers:

  • Simple and Effective Noisy Channel Modeling for Neural Machine Translation (49177c9)
  • Levenshtein Transformer (86857a5, ...)
  • Cross+Self-Attention for Transformer Models (4ac2c5f)
  • Jointly Learning to Align and Translate with Transformer Models (1c66792)
  • Reducing Transformer Depth on Demand with Structured Dropout (dabbef4)
  • Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa) (e23e5ea)
  • BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (a92bcda)
  • CamemBERT: a French BERT (b31849a)

Speed improvements:

  • Add CUDA kernels for LightConv and DynamicConv (f840564)
  • Cythonization of various dataloading components (4fc3953, ...)
  • Don't project mask tokens for MLM training (718677e); see the sketch after this list
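
The last item reflects a standard MLM optimization: only masked positions contribute to the loss, so the expensive vocabulary projection can be restricted to them. The sketch below is an illustration of that idea under assumed names (`mlm_logits`, `output_projection`, `masked_tokens`), not fairseq's exact code.

```python
def mlm_logits(features, masked_tokens, output_projection):
    """Apply the output projection only at masked positions.

    features: (batch, seq, dim) hidden states from the encoder.
    masked_tokens: (batch, seq) bool tensor, True at masked positions.
    output_projection: e.g. an nn.Linear(dim, vocab_size).
    """
    masked_features = features[masked_tokens]    # (n_masked, dim)
    return output_projection(masked_features)    # (n_masked, vocab_size)
```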

@facebook-github-bot (Contributor) left a comment:

@myleott has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@myleott deleted the v0.9.0 branch December 6, 2019 22:58
moussaKam pushed a commit to moussaKam/language-adaptive-pretraining that referenced this pull request Sep 29, 2020
Summary: Pull Request resolved: facebookresearch#1452

Differential Revision: D18798409

Pulled By: myleott

fbshipit-source-id: 860a0d5aaf7377c8c9bd63cdb3b33d464f0e1727
facebook-github-bot pushed a commit that referenced this pull request Nov 20, 2020
Summary: Pull Request resolved: fairinternal/fairseq-py#1452

Test Plan: Imported from OSS

Reviewed By: lematt1991

Differential Revision: D25108462

Pulled By: myleott

fbshipit-source-id: 3c17a9937a4c3edb69f64130dfd866c5f42a4aaf
yzpang pushed a commit to yzpang/gold-off-policy-text-gen-iclr21 that referenced this pull request Feb 19, 2021
Summary: Pull Request resolved: facebookresearch/fairseq#1452

Differential Revision: D18798409

Pulled By: myleott

fbshipit-source-id: 860a0d5aaf7377c8c9bd63cdb3b33d464f0e1727
sshleifer pushed a commit that referenced this pull request Apr 7, 2021
Summary: Pull Request resolved: fairinternal/fairseq-py#1452

Differential Revision: D25108462

Pulled By: myleott

fbshipit-source-id: 3c17a9937a4c3edb69f64130dfd866c5f42a4aaf