LightTag/BibSample

Getting Text into Tensorflow with the Dataset API

This repo is an accompaniment to my blog post Getting Text into Tensorflow with the Dataset API. Inside, we build a simple GRU-based model to predict which book of the Bible a particular verse came from.

Structure

  • PrepareBibleExamples.ipynb Takes the raw file of the Bible and splits it into books, chapters and verses. This is the preparation of our raw data. At the end of the notebook, we convert the structured training examples into TFRecords.
  • preppy.py Contains the logic to convert a raw example made in the notebook into a TFRecord.
  • prepare_dataset.py Here we use the Dataset API, so look here if that's what you're after. A minimal sketch of both the TFRecord writing and the Dataset reading appears right after this list.
  • model.py The definition of the model we use. Nothing too fancy here.
  • trainer.py The code to run a training and validation loop. Look here to see how we leverage the Dataset and its Iterator to easily run a train epoch and then a val epoch. A rough sketch of that pattern appears below, after the usage steps.
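To make the TFRecord half concrete, here is a minimal sketch of the kind of logic preppy.py and prepare_dataset.py implement, written against the TF 1.x APIs this repo uses. The feature names (verse_ids, book_id, length) and the file name are illustrative placeholders, not necessarily the keys the repo actually uses.

```python
import tensorflow as tf

def make_example(verse_ids, book_id):
    """Wrap one tokenized verse and its label in a tf.train.SequenceExample."""
    ex = tf.train.SequenceExample()
    ex.context.feature["book_id"].int64_list.value.append(book_id)
    ex.context.feature["length"].int64_list.value.append(len(verse_ids))
    tokens = ex.feature_lists.feature_list["verse_ids"]
    for token_id in verse_ids:
        tokens.feature.add().int64_list.value.append(token_id)
    return ex

# Serialize examples to disk (one toy example here; the notebook writes many).
with tf.python_io.TFRecordWriter("train.tfrecord") as writer:
    writer.write(make_example([4, 8, 15, 16], book_id=2).SerializeToString())

def parse(serialized):
    """Invert make_example: turn one serialized record back into tensors."""
    context, sequence = tf.parse_single_sequence_example(
        serialized,
        context_features={
            "book_id": tf.FixedLenFeature([], tf.int64),
            "length": tf.FixedLenFeature([], tf.int64),
        },
        sequence_features={
            "verse_ids": tf.FixedLenSequenceFeature([], tf.int64),
        },
    )
    return sequence["verse_ids"], context["book_id"], context["length"]

# Build a shuffled, padded, batched Dataset straight from the TFRecord file.
dataset = (tf.data.TFRecordDataset("train.tfrecord")
           .map(parse)
           .shuffle(1000)
           .padded_batch(32, padded_shapes=([None], [], [])))
```

The padded_batch call is what lets variable-length verses share a batch; the stored length can then be used to ignore the padding later on.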

Using this

  • Clone the repo
  • Run through the notebook PrepareBibleExamples.ipynb. This will create the TFRecords.
  • Run python trainer.py
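If you want a feel for what trainer.py is doing before reading it, here is a rough, self-contained sketch of the train-epoch-then-val-epoch pattern, using a reinitializable Iterator and TF 1.x sessions. The datasets and the loss below are stand-ins, not the repo's actual model or data.

```python
import tensorflow as tf

# Two tiny in-memory datasets stand in for the train/val TFRecord datasets.
train_ds = tf.data.Dataset.from_tensor_slices(tf.random_uniform([100, 8])).batch(10)
val_ds = tf.data.Dataset.from_tensor_slices(tf.random_uniform([20, 8])).batch(10)

# One reinitializable iterator shared by both datasets.
iterator = tf.data.Iterator.from_structure(train_ds.output_types,
                                           train_ds.output_shapes)
batch = iterator.get_next()
train_init = iterator.make_initializer(train_ds)
val_init = iterator.make_initializer(val_ds)

# Stand-in for the real model: any op that consumes the batch works here.
loss = tf.reduce_mean(batch)

with tf.Session() as sess:
    for epoch in range(3):
        sess.run(train_init)        # point the iterator at the training set
        while True:
            try:
                sess.run(loss)      # would be the train op in trainer.py
            except tf.errors.OutOfRangeError:
                break               # training dataset exhausted: epoch done
        sess.run(val_init)          # same graph now reads the validation set
        while True:
            try:
                sess.run(loss)
            except tf.errors.OutOfRangeError:
                break               # validation epoch done
```

Because the graph is built once and only the iterator's initializer changes, the same model ops serve both the training and validation passes.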

But if you just do this, what have you learnt? Read the code.

Other stuff

  • Found a mistake or have a suggestion? Pull requests are welcome.
  • This isn't supposed to be a good neural network, so don't learn that from here.
  • Thanks to the WildML blog for the great post on TFRecords and sequences.
  • The Tensorflow guide on the Dataset API is good. Go read it.

Need to label data before putting it into Tensorflow?

This post and repo are brought to you by LightTag, a platform to manage and execute NLP annotations with a team. Check us out at LightTag.io.
