Tensorflow-Easy-Seq2Seq

A tool that allows you to easily train a Seq2Seq, get the embeddings and the outputs without having much knowledge about deep learning.

Requirements

Python 3.*
Tensorflow 1.0.0
SciPy
Scikit-Learn
tqdm
matplotlib
Numpy
Six

Adjusting parameters

Before you create your dataset file and train your model it's adviseable that you adjust the parameters first. This you can do by editing the file config.py. If you don't understand a parameter it's best to leave it alone.

When running a script you can always pass the parameters directly and override the config.py parameters. To get to know what the parameters are you can type python3 script_name.py -h.

Creating dataset file

Firstly put your data in the data folder. Put your inputs in a file called inputs.txt and your outputs in a file called outputs.txt. Each line in the two files corresponds to one datapoint.

For example, chatbot dataset:

data/inputs.txt

Hello, my name is Oliver!
How are you?
Where do you come from?
Do you like carrots?
What do you think about North Korea?

data/outputs.txt

Hi, nice to meet you, my name is Fredrik.
I'm fine thank you!
I come from France.
No, I don't like carrots.
I don't like North Korea, what about you?

Thereafter you can run python3 create_data_set.py. This will create a file called data_set.pkl in a folder called models. This script does everything that you'll need, creating vocabulary and encoding the data to the correct format for training.

Training

To train a model you firstly have to create a dataset file. After that it's as easy as running the script python3 train.py

Get model outputs

Evaluating outputs

To get an output from the Seq2Seq model you can write: python3 evaluate_outputs.py --eval_text="WRITE YOUR INPUT HERE"

Evaluating embeddings

To plot embeddings you can run write: python3 evalute_embeddings --inputs_file_path=embedding_inputs.txt The embeddings will be dimensionality reducted with TSNE to two dimensions and plotted for you with matplotlib. The inputs_file_path is the path with a file that has the inputs written in the same way as in normal inputs.txt.

Get outputs and embeddings from model programmatically

In the src folder there's the files embedding.py and predicting.py, import from these to do it programmatically.

Todos

Tensorboard.
Example data that you can download.
TFRecord data!
Other Seq2Seq models than embedding attention Seq2Seq.

Other

Made by Oliver Edholm, 15 years old.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
src		src
README.md		README.md
config.py		config.py
create_data_set.py		create_data_set.py
evaluate_embeddings.py		evaluate_embeddings.py
evaluate_outputs.py		evaluate_outputs.py
requirements.txt		requirements.txt
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Tensorflow-Easy-Seq2Seq

Requirements

Adjusting parameters

Creating dataset file

Training

Get model outputs

Todos

Other

About

Releases

Packages

Languages

OliverEdholm/Tensorflow-Easy-Seq2Seq

Folders and files

Latest commit

History

Repository files navigation

Tensorflow-Easy-Seq2Seq

Requirements

Adjusting parameters

Creating dataset file

Training

Get model outputs

Todos

Other

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages