yunjey/seq2seq-dataloader

DataLoader for Seq2seq

An efficient data loader for text datasets, built on torch.utils.data.Dataset, a custom collate_fn, and torch.utils.data.DataLoader.
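As a rough illustration of how these three pieces fit together, the sketch below pads variable-length source/target sequences inside a collate_fn. It is not the repository's code: `ToyPairDataset`, `pad_sequences`, the padding id `0`, and the toy id sequences are all assumptions made for the example.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyPairDataset(Dataset):
    """Hypothetical dataset of (source, target) sequences already mapped to word ids."""

    def __init__(self, pairs):
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, index):
        src, trg = self.pairs[index]
        return (torch.tensor(src, dtype=torch.long),
                torch.tensor(trg, dtype=torch.long))


def pad_sequences(sequences, pad_id=0):
    """Right-pad 1D tensors to the longest length in the batch (pad_id is an assumption)."""
    lengths = [len(seq) for seq in sequences]
    padded = torch.full((len(sequences), max(lengths)), pad_id, dtype=torch.long)
    for i, seq in enumerate(sequences):
        padded[i, :lengths[i]] = seq
    return padded, lengths


def collate_fn(batch):
    """Merge a list of (src, trg) tensor pairs into padded batch tensors."""
    # Sort by source length, longest first; this ordering is convenient
    # when the batch is later packed for an RNN encoder.
    batch = sorted(batch, key=lambda pair: len(pair[0]), reverse=True)
    src_seqs, trg_seqs = zip(*batch)
    src_padded, src_lengths = pad_sequences(src_seqs)
    trg_padded, trg_lengths = pad_sequences(trg_seqs)
    return src_padded, src_lengths, trg_padded, trg_lengths


# Toy data: each pair is (source ids, target ids).
pairs = [
    ([1, 4, 5, 2], [1, 6, 2]),
    ([1, 7, 2], [1, 8, 9, 10, 2]),
    ([1, 3, 4, 5, 6, 2], [1, 2]),
]
loader = DataLoader(ToyPairDataset(pairs), batch_size=3,
                    shuffle=False, collate_fn=collate_fn)
src, src_len, trg, trg_len = next(iter(loader))
```

Padding per batch rather than to a global maximum length keeps the tensors as small as the longest sequence in each batch, which is the main reason to use a custom collate_fn here.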


Prerequisites


Usage

1. Clone the repository

$ git clone https://github.com/yunjey/seq2seq-dataloader.git
$ cd seq2seq-dataloader

2. Download the nltk tokenizer

$ pip install nltk
$ python
>>> import nltk
>>> nltk.download('punkt')

3. Build the word2id dictionary

$ python build_vocab.py
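The internals of build_vocab.py are not shown here, so the following is only a sketch of what building a word2id dictionary typically involves. The function names `build_word2id` and `encode`, the special tokens `<pad>`, `<start>`, `<end>`, `<unk>`, the `min_count` parameter, and the use of whitespace splitting instead of the nltk tokenizer are all assumptions for illustration.

```python
from collections import Counter


def build_word2id(sentences, min_count=1):
    """Map each sufficiently frequent word to an integer id (hypothetical helper)."""
    counter = Counter()
    for sentence in sentences:
        # Whitespace split keeps the sketch self-contained; a real
        # pipeline would likely use a proper tokenizer instead.
        counter.update(sentence.lower().split())

    # Reserved ids for special tokens (an assumed convention).
    word2id = {"<pad>": 0, "<start>": 1, "<end>": 2, "<unk>": 3}
    for word, count in sorted(counter.items()):
        if count >= min_count and word not in word2id:
            word2id[word] = len(word2id)
    return word2id


def encode(sentence, word2id):
    """Convert a sentence to ids, wrapping it in <start>/<end> markers."""
    unk = word2id["<unk>"]
    ids = [word2id.get(token, unk) for token in sentence.lower().split()]
    return [word2id["<start>"]] + ids + [word2id["<end>"]]


corpus = ["the cat sat", "the dog sat down"]
word2id = build_word2id(corpus)
encoded = encode("the bird sat", word2id)  # "bird" is unseen, so it maps to <unk>
```

Reserving id 0 for padding means padded positions can later be masked or ignored without colliding with a real word.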

4. Check DataLoader

For usage, please see example.ipynb.
