utils

Dataset Preprocessing Modules

  • utils.preprocess_wiki: Generates a TFRecordDataset from a Wikipedia dump extracted by WikiExtractor.
  • utils.preprocess_lang8: Generates a TFRecordDataset from the Lang8 corpus.
  • utils.edits: Module for edit-tagging parallel sentences.
  • utils.errorify: Module for generating synthetic errors in sentences.
  • utils.helpers: Common helper functions shared by the other modules.
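
To give a rough idea of what edit-tagging parallel sentences involves, here is a minimal sketch that aligns a source sentence with its corrected version and lists the token-level edits. It is not the code in utils.edits: the function name `extract_edits` is hypothetical, and the use of Python's standard `difflib.SequenceMatcher` for alignment is an assumption made for illustration. The actual module may use a different alignment method and tag scheme.

```python
from difflib import SequenceMatcher


def extract_edits(source_tokens, target_tokens):
    """Return the non-matching alignment operations between two token lists.

    Hypothetical helper for illustration only. Each result is a tuple of
    (operation, source_span, target_span), where operation is one of
    "replace", "delete", or "insert".
    """
    # autojunk=False disables difflib's popularity heuristic, which only
    # affects long sequences but keeps the behavior predictable.
    matcher = SequenceMatcher(a=source_tokens, b=target_tokens, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":  # skip spans where both sentences agree
            edits.append((op, source_tokens[i1:i2], target_tokens[j1:j2]))
    return edits


if __name__ == "__main__":
    source = "She go to school yesterday".split()
    target = "She went to school yesterday".split()
    print(extract_edits(source, target))
    # [('replace', ['go'], ['went'])]
```

Working on whitespace-split tokens keeps the sketch short; a real pipeline would likely use a proper tokenizer before aligning.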