SelfAI

A generative deep learning model trained on my Facebook message data to talk like me.

Usage

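To preprocess the input data, first split the "contexts" and "responses" into separate .txt files for the training and validation data, one example per line, so that each line of a source file pairs with the same line of its target file. Then run: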

python preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -save_data data/model
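The split into context and response files could look roughly like the sketch below. It is not the repository's own data script: the message format (a list of dicts with "sender" and "text" keys) and the helper names build_pairs and write_parallel are assumptions made for illustration. It pairs each message from someone else with my immediate reply and writes the two sides to line-aligned files.

```python
from pathlib import Path


def build_pairs(messages, me):
    """Pair each message from someone else with my immediate reply.

    `messages` is assumed to be a chronological list of dicts with
    "sender" and "text" keys (a hypothetical format).
    """
    pairs = []
    for prev, curr in zip(messages, messages[1:]):
        if prev["sender"] != me and curr["sender"] == me:
            # Collapse whitespace so every example fits on one line.
            context = " ".join(prev["text"].split())
            response = " ".join(curr["text"].split())
            if context and response:
                pairs.append((context, response))
    return pairs


def write_parallel(pairs, src_path, tgt_path):
    """Write contexts and responses to two line-aligned text files."""
    src_path, tgt_path = Path(src_path), Path(tgt_path)
    src_path.parent.mkdir(parents=True, exist_ok=True)
    tgt_path.parent.mkdir(parents=True, exist_ok=True)
    with open(src_path, "w", encoding="utf-8") as src, \
            open(tgt_path, "w", encoding="utf-8") as tgt:
        for context, response in pairs:
            src.write(context + "\n")
            tgt.write(response + "\n")
```

For example, write_parallel(build_pairs(messages, "me"), "data/src-train.txt", "data/tgt-train.txt") would produce the two training files, with the validation files built the same way from a held-out slice of the conversation.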

To train the model, run this command. Further details on training parameters are in the OpenNMT documentation.

python train.py -data data/model -dropout .2 -global_attention mlp -start_decay_steps 8 -tensorboard -tensorboard_log_dir self_v1 -save_model checkpoints/model_v1 -src_word_vec_size 256 -tgt_word_vec_size 256 -rnn_size 256
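The checkpoint file passed to translate.py carries the training step in its name. As a convenience, the latest one can be picked programmatically. This is a minimal sketch, assuming checkpoints are named with the model_v1_step_<N>.pt pattern that appears in the translate command; latest_checkpoint is a hypothetical helper, not part of OpenNMT.

```python
import re
from pathlib import Path

# Matches the trailing "_step_<N>.pt" part of a checkpoint file name.
STEP_PATTERN = re.compile(r"_step_(\d+)\.pt$")


def latest_checkpoint(directory, prefix):
    """Return the checkpoint path with the highest step number, or None."""
    best_step, best_path = -1, None
    for path in Path(directory).glob(f"{prefix}_step_*.pt"):
        match = STEP_PATTERN.search(path.name)
        if match is None:
            continue  # Skip files whose step is not a plain integer.
        step = int(match.group(1))
        if step > best_step:
            best_step, best_path = step, path
    return best_path
```

Comparing the step as an integer matters here: a plain string sort would rank step_500 above step_10000.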

Test out the model on the validation data! Adjust the beam size to control how many candidate sequences the decoder keeps at each step.

python translate.py -model checkpoints/model_v1_step_<STEP_SIZE>.pt -src data/src-val.txt -beam_size 3 -replace_unk -verbose
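To give a feel for what the beam size changes, here is a toy beam search over a hand-written next-token table. It is not OpenNMT's decoder: the NEXT table, its probabilities, and the beam_search function are all made up for illustration, and the sketch skips refinements such as length normalization.

```python
import math

# Hypothetical next-token probabilities, keyed by the previous token.
NEXT = {
    "<s>": {"ok": 0.6, "yeah": 0.4},
    "ok": {"cool": 0.4, "</s>": 0.6},
    "yeah": {"yo": 0.95, "</s>": 0.05},
    "cool": {"</s>": 1.0},
    "yo": {"</s>": 1.0},
}


def beam_search(beam_size, max_len=5):
    """Return finished (text, probability) pairs, most probable first."""
    beams = [(0.0, ["<s>"])]  # (log probability, tokens so far)
    finished = []
    for _ in range(max_len):
        # Extend every live hypothesis by every possible next token.
        candidates = []
        for score, tokens in beams:
            for token, prob in NEXT[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the beam_size best extensions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_size]:
            if tokens[-1] == "</s>":
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break
    finished.sort(key=lambda c: c[0], reverse=True)
    # Strip the start and end markers before returning.
    return [(" ".join(tokens[1:-1]), math.exp(score))
            for score, tokens in finished]
```

With beam_size=1 the search is greedy: it commits to "ok" because that is the likeliest first token, and ends with probability 0.36. With beam_size=2 it also keeps "yeah" alive and finds "yeah yo" at about 0.38, a more probable sequence that the greedy search never sees.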

Examples

The model yielded some decent results:

INPUT: can u come at 930
OUTPUT: ill head to ur place

INPUT: im at the library with George
OUTPUT: ok wordd ill come thru dude

INPUT: hey alex do u know if we haf the math test monday for bartlett
OUTPUT: yeah yo

And some not-so-great (but nonetheless funny) ones:

INPUT: nope
OUTPUT: il bring sunglasses for kidz bop

INPUT: okk cya my phones dead tho i should probably go charget itt
OUTPUT: truuu haha we are currently in heated debate

INPUT: 9
OUTPUT: i hate math

Credit to OpenNMT for their Seq2Seq implementation.

