DataLoader for Seq2seq

An efficient data loader for text datasets, built on torch.utils.data.Dataset, a custom collate_fn, and torch.utils.data.DataLoader.

Prerequisites

Usage

1. Clone the repository

$ git clone https://github.com/yunjey/seq2seq-dataloader.git
$ cd seq2seq-dataloader

2. Download nltk tokenizer

$ pip install nltk
$ python
>>> import nltk
>>> nltk.download('punkt')

3. Build word2id dictionary

$ python build_vocab.py
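
As a rough illustration of what a word2id dictionary looks like, here is a minimal sketch. It is not the code in build_vocab.py: the function name `build_word2id`, the `min_count` parameter, and the special tokens are assumptions made for this example, and a plain whitespace split stands in for the nltk tokenizer so the snippet runs without extra downloads.

```python
from collections import Counter


def build_word2id(sentences, min_count=1):
    """Map each word seen at least `min_count` times to an integer id."""
    counter = Counter()
    for sentence in sentences:
        # Whitespace split is a stand-in for nltk.word_tokenize (assumption).
        counter.update(sentence.lower().split())

    # Reserve the first ids for special tokens (names are hypothetical).
    word2id = {"<pad>": 0, "<start>": 1, "<end>": 2, "<unk>": 3}

    # Most frequent words first; ties broken alphabetically for determinism.
    for word, count in sorted(counter.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count and word not in word2id:
            word2id[word] = len(word2id)
    return word2id


word2id = build_word2id(["the cat sat", "the dog sat down"])
print(word2id)
```

Reserving id 0 for padding keeps the dictionary compatible with zero-initialized padded batches.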

4. Check DataLoader

For usage, please see example.ipynb.
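
For readers without the notebook at hand, the sketch below shows how a Dataset, a custom collate_fn, and a DataLoader can fit together for variable-length sequence pairs. It is an illustration under stated assumptions, not the contents of data_loader.py: the class `PairDataset`, the helper `pad_sequences`, and the toy id sequences are hypothetical, and padding with id 0 is assumed.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class PairDataset(Dataset):
    """Holds (source, target) pairs of word-id lists (hypothetical class)."""

    def __init__(self, pairs):
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, index):
        src, trg = self.pairs[index]
        return (torch.tensor(src, dtype=torch.long),
                torch.tensor(trg, dtype=torch.long))


def pad_sequences(sequences):
    """Zero-pad 1D tensors to the longest length in the batch."""
    lengths = [len(seq) for seq in sequences]
    padded = torch.zeros(len(sequences), max(lengths), dtype=torch.long)
    for i, seq in enumerate(sequences):
        padded[i, :lengths[i]] = seq
    return padded, lengths


def collate_fn(batch):
    """Merge a list of (src, trg) tensors into padded mini-batch tensors."""
    # Sort by source length, longest first, as packed RNN inputs often expect.
    batch = sorted(batch, key=lambda pair: len(pair[0]), reverse=True)
    src_seqs, trg_seqs = zip(*batch)
    src, src_lengths = pad_sequences(src_seqs)
    trg, trg_lengths = pad_sequences(trg_seqs)
    return src, src_lengths, trg, trg_lengths


# Toy word-id sequences; real data would come from the word2id dictionary.
pairs = [([4, 5, 6], [7, 8]), ([4, 9], [7, 10, 11, 12]), ([5], [8])]
loader = DataLoader(PairDataset(pairs), batch_size=3, shuffle=False,
                    collate_fn=collate_fn)

src, src_lengths, trg, trg_lengths = next(iter(loader))
print(src.shape, src_lengths)
print(trg.shape, trg_lengths)
```

Padding inside collate_fn means each batch is only as wide as its own longest sequence, rather than the longest sequence in the whole dataset.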
