Generate Any Text
Generate any kind of text with character embeddings and an RNN in pure NumPy.
Note that this project is not optimized for performance. Rather, it is a reference implementation of many neural network building blocks.
See the list below for the implemented elements.
Clone the repository and change directory into it
git clone https://github.com/kirbiyik/generate-any-text
cd generate-any-text
This project has only 3 dependencies: NumPy for numeric computation, tqdm for the progress bar, and matplotlib for plotting. You probably already have these. If not, run the command below.
pip install -r requirements.txt
Download the Trump speeches dataset
wget -P data/ https://github.com/ryanmcdermott/trump-speeches/raw/master/speeches.txt
Train the model
python train.py -d data/speeches.txt
Here is example footage of training. The model learns to generate Trump-like tweets after about 3 minutes of training on a CPU.
Models are saved under model-files unless a different location is explicitly specified.
python inference.py -m model-files/char_rnn_epoch3.pkl -s i
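The feature list mentions that CharRNN produces characters "with temperature". As a rough illustration of that idea, the sketch below scales the output scores by a temperature before the softmax and then samples one character index. The function names and signatures are hypothetical and are not taken from this repository, and how inference.py exposes temperature is not shown here.

```python
import numpy as np


def softmax_with_temperature(logits, temperature=1.0):
    """Turn unnormalized scores into probabilities, sharpened or flattened by temperature.

    Hypothetical helper for illustration; not the repository's API.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled = scaled - scaled.max()  # subtract the max for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()


def sample_index(logits, temperature=1.0, rng=None):
    """Draw one character index from the temperature-scaled distribution."""
    rng = np.random.default_rng() if rng is None else rng
    probs = softmax_with_temperature(logits, temperature)
    return int(rng.choice(len(probs), p=probs))


# Example scores for a 3-character vocabulary (made-up numbers).
scores = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(scores, temperature=0.1)  # close to argmax
hot = softmax_with_temperature(scores, temperature=5.0)   # close to uniform
next_index = sample_index(scores, temperature=0.5, rng=np.random.default_rng(0))
```

Low temperatures make the output more repetitive and conservative, while high temperatures make it more varied and more error-prone.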
Click on any item to go to the related line of code.
- CharRNN (Sequence to Sequence RNN to produce characters with temperature)
- Unit tests with numerical gradients to check implementations
- Character Embedding
- Recurrent Neural Networks
- Temporal Affine Layer
- Temporal Softmax
- Adam optimizer
- Solver (Run training loop, update weights, save model, record logs, etc.)
- Data loading, character mapping, masking infrequent characters
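To show what a unit test with numerical gradients can look like, here is a minimal sketch for a single vanilla RNN step. It assumes a tanh nonlinearity, as in the CS231n assignments. All function names and shapes are hypothetical and may differ from the code in this repository.

```python
import numpy as np


def rnn_step_forward(x, h_prev, Wx, Wh, b):
    """One vanilla RNN step (assumed form): h_next = tanh(x Wx + h_prev Wh + b)."""
    return np.tanh(x @ Wx + h_prev @ Wh + b)


def rnn_step_backward(dh, x, h_prev, Wx, Wh, h_next):
    """Analytic gradients of the step above, given the upstream gradient dh."""
    da = dh * (1.0 - h_next ** 2)  # derivative of tanh
    dx = da @ Wx.T
    dh_prev = da @ Wh.T
    dWx = x.T @ da
    dWh = h_prev.T @ da
    db = da.sum(axis=0)
    return dx, dh_prev, dWx, dWh, db


def numerical_gradient(f, param, dout, eps=1e-5):
    """Central-difference gradient of sum(f() * dout) with respect to param.

    f takes no arguments and reads param, which is perturbed in place.
    """
    grad = np.zeros_like(param)
    for i in np.ndindex(param.shape):
        old = param[i]
        param[i] = old + eps
        plus = f()
        param[i] = old - eps
        minus = f()
        param[i] = old  # restore the original value
        grad[i] = np.sum((plus - minus) * dout) / (2 * eps)
    return grad


# Small random problem: batch of 3, input size 4, hidden size 5.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4))
h_prev = rng.standard_normal((3, 5))
Wx = rng.standard_normal((4, 5))
Wh = rng.standard_normal((5, 5))
b = rng.standard_normal(5)
dh = rng.standard_normal((3, 5))

h_next = rnn_step_forward(x, h_prev, Wx, Wh, b)
dx, dh_prev, dWx, dWh, db = rnn_step_backward(dh, x, h_prev, Wx, Wh, h_next)

step = lambda: rnn_step_forward(x, h_prev, Wx, Wh, b)
dWx_num = numerical_gradient(step, Wx, dh)
dWh_num = numerical_gradient(step, Wh, dh)
dx_num = numerical_gradient(step, x, dh)
print("max |dWx - dWx_num|:", np.max(np.abs(dWx - dWx_num)))
```

If the analytic and numerical gradients disagree by more than a tiny amount, the backward pass most likely has a bug.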
Run unit tests with
The Trump speeches dataset is used for demonstration, but any text file can be used. No special format is required. It learns what you feed it!
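Because the input is plain text, preparing it mostly comes down to mapping characters to integer indices. The sketch below illustrates one way to do that, including masking infrequent characters as mentioned in the feature list. The names `build_vocab`, `encode`, `decode`, the `min_count` threshold, and the placeholder symbol are all assumptions made for illustration, not the repository's actual code.

```python
from collections import Counter

import numpy as np

# Hypothetical placeholder that stands in for every masked (rare) character.
UNK = "\u2400"


def build_vocab(text, min_count=2):
    """Map each sufficiently frequent character to an index; index 0 is the placeholder."""
    counts = Counter(text)
    chars = sorted(c for c, n in counts.items() if n >= min_count and c != UNK)
    idx_to_char = [UNK] + chars
    char_to_idx = {c: i for i, c in enumerate(idx_to_char)}
    return char_to_idx, idx_to_char


def encode(text, char_to_idx):
    """Convert text to an integer array, sending unknown characters to index 0."""
    unk = char_to_idx[UNK]
    return np.array([char_to_idx.get(c, unk) for c in text], dtype=np.int64)


def decode(indices, idx_to_char):
    """Convert an index sequence back to text."""
    return "".join(idx_to_char[int(i)] for i in indices)


sample = "hello world, hello numpy!"
char_to_idx, idx_to_char = build_vocab(sample, min_count=2)
encoded = encode("hello w", char_to_idx)
```

Masking rare characters keeps the vocabulary, and therefore the embedding and softmax layers, small when the text contains stray symbols.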
This work is heavily based on Stanford's CS231n assignments. I implemented additional layers and features for the specific needs of this problem.