eric-haibin-lin released this Nov 27, 2018

Highlights

Models

New Tutorials

New Datasets

  • Sentiment Analysis
    • MR, a movie-review data set of 10,662 sentences labeled with respect to their overall sentiment polarity (positive or negative). (#391)
    • SST_1, an extension of the MR data set with fine-grained labels (#391)
    • SST_2, an extension of the MR data set with binary sentiment polarity labels (#391)
    • SUBJ, a subjectivity data set of 10,000 sentences labeled with respect to their subjectivity status (subjective or objective). (#391)
    • TREC, a question classification data set labeled by question type (#391)

API Updates

  • Changed Vocab constructor from staticmethod to classmethod to handle inheritance (#386)
  • Added Transformer Encoder APIs (#409)
  • Added pre-trained ELMo model to model.get_model API (#227)
  • Added pre-trained BERT model to model.get_model API (#409)
  • Added unknown_lookup setter to TokenEmbedding (#429)
  • Added dtype support to EmbeddingCenterContextBatchify (#416)
  • Propagated exceptions from PrefetchingStream (#406)
  • Added SentencePiece tokenizer and detokenizer (#380)
  • Added CSR format for variable-length data in embedding training (#384)
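
To illustrate why the staticmethod-to-classmethod change for the Vocab constructor (#386) matters for inheritance, here is a minimal sketch in plain Python. The names `BaseVocab`, `SubwordVocab`, `from_json_static`, and `from_json` are hypothetical stand-ins chosen for this example; they are not taken from the GluonNLP source, and the actual change in #386 may differ in detail.

```python
import json


class BaseVocab:
    """Hypothetical stand-in for a vocabulary class (not GluonNLP's Vocab)."""

    def __init__(self, tokens):
        self.idx_to_token = list(tokens)

    @staticmethod
    def from_json_static(json_str):
        # A staticmethod has to name the class explicitly, so a subclass
        # calling it still gets a BaseVocab instance back.
        return BaseVocab(json.loads(json_str))

    @classmethod
    def from_json(cls, json_str):
        # A classmethod receives the calling class as `cls`, so a subclass
        # gets an instance of its own type.
        return cls(json.loads(json_str))


class SubwordVocab(BaseVocab):
    """Hypothetical subclass that adds behavior on top of BaseVocab."""

    def __len__(self):
        return len(self.idx_to_token)


serialized = json.dumps(["<unk>", "hello", "world"])
via_static = SubwordVocab.from_json_static(serialized)
via_class = SubwordVocab.from_json(serialized)

print(type(via_static).__name__)  # BaseVocab
print(type(via_class).__name__)   # SubwordVocab
```

With the staticmethod variant, the subclass silently loses its own type (and here its `__len__`), which is the kind of inheritance problem a classmethod constructor avoids.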

Fixes & Small Changes

  • Included output of nlp.embedding.list_sources() in API docs (#421)
  • Supported symlinks in examples and scripts (#403)
  • Fixed weight tying in GNMT and Transformer (#413)
  • Simplified transformer notebook (#400)
  • Fixed LazyTransformDataStream prefetching (#397)
  • Adopted src/gluonnlp folder layout (#390)
  • Fixed text8 archive file name for downloads from S3 (#388). Thanks @bkktimber!
  • Fixed perplexity reporting for multi-GPU training in the language model notebook (#365). Thanks @ThomasDelteil!
  • Fixed a spelling mistake in the QA script (#379). Thanks @qyhfbqz!