Sophia (Greek for "wisdom") is a real-time recurrent neural network (RNN) agent based on Theano. It focuses on training and evaluating experimental RNN architectures for regression tasks on long and noisy inputs. Once trained, the RNN can be used as a plugin to an existing project (written in any language) to perform real-time estimation by communicating input/output data via interprocess communication (IPC) or over a network (TCP).
- BPTT(h; h') for more efficient training (see the sketch after this list)
- On-line, minibatch training on sequences of unequal lengths
- Inference on ensemble of heterogeneous models
- Scheduled learning rate annealing with patience
- LSTM/GRU with various unit options (weight normalization, layer normalization, batch normalization, residual gate, initial state learning, etc.), loss functions (L2, L1, Huber), initializations (uniform, orthogonal, Glorot), and optimizers (Nesterov, Adadelta, RMSProp, Adam)
- Options for learning time-dimension (e.g., Phased LSTM) or batch-dimension (e.g., sequence ID) parameters in addition to the standard hidden-dimension parameters
- Option for faster training by unrolling Theano scans (increases compile time)
- Self-documenting, object-oriented code
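
As a rough illustration of what BPTT(h; h') windowing means, here is a minimal sketch (not code from this repo), with h corresponding to `window_size` and h' to `step_size` in the options below:

```python
# Illustrative only: how BPTT(h; h') could slice a length-T sequence into
# overlapping training windows of h frames, advancing h' frames at a time,
# so gradients are truncated to h steps while state carries forward.
T, h, hp = 10, 4, 2
windows = [(start, start + h) for start in range(0, T - h + 1, hp)]
print(windows)   # [(0, 4), (2, 6), (4, 8), (6, 10)]
```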
The project was developed on Ubuntu 16.04.1 LTS; the version numbers below are not hard requirements.
- Python 2.7 or 3.5+
- Theano (0.9.0b1), along with its dependencies (CUDA 8.0.44, cuDNN 5.1.5, libgpuarray 0.6.0)
- ZeroMQ (pyzmq 16.0.2) : used for communicating real-time input/output data with an external process
- Input and target data preprocessed and saved as separate binary files (see below for format)
- Sibyl: `sophia.py` may be used as the real-time RNN engine for this project
  - `master` branch: for anything other than batch normalization
  - `bn` branch: if using batch normalization (removes certain incompatible features)
- There is no application-specific data processing in this project; the code assumes that the input has already been preprocessed, and that any post-processing of the output will be done by the receiving side
- A dataset is contained in a directory with
  - `train.list`: list of training set sequences
  - `dev.list`: list of development set sequences
  - For each sequence, a pair of `sequence_ID.input`/`sequence_ID.target` in arbitrary subdirectories (in whichever way that makes sense for the data)
- `.list` files are `'\n'`-delimited plain-text lists of relative paths of `.input`/`.target` files minus the extensions (i.e., one entry of `relative/path/to/sequence_ID` for each sequence)
- Multiple sequences may share the same `sequence_ID` in different subdirectories if that makes sense for the data; `sequence_ID` can be used as an input feature in this case
- `.input`/`.target` files are binary files, each containing a flat array of little-endian 4-byte floats in [`time`][`dimension`] order (stride of 1 in the `dimension` axis, stride of `dimension` in the `time` axis); a writer sketch follows this list
- Time lengths of `.input` & `.target` must match for the same sequence
- Time lengths for different sequences do not need to match, unless using batch normalization (where all sequences in a minibatch must be synchronized)
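
A minimal sketch of writing one sequence pair in this format; the 100-frame length, 8-/2-dimensional shapes, and file names are made-up examples:

```python
# Hypothetical sequence pair plus its .list entry; shapes/names are examples.
import numpy as np

inp = np.random.rand(100, 8).astype("<f4")   # [time][dimension], little-endian float32
tgt = np.random.rand(100, 2).astype("<f4")   # same time length as the input

inp.tofile("sequence_ID.input")              # flat row-major binary, no header
tgt.tofile("sequence_ID.target")

with open("train.list", "w") as f:
    f.write("sequence_ID\n")                 # relative path, extension stripped

# Reading back requires knowing the dimension (here 8) to restore the shape
x = np.fromfile("sequence_ID.input", dtype="<f4").reshape(-1, 8)
```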
- Options are configured in `train.py`:
option name | explanation |
---|---|
`input_dim`/`target_dim` | integers |
`unit_type` | `'fc'`/`'lstm'`/`'gru'` |
`loss_type` | `'l2'`/`'l1'`/`'huber'` |
`net_width`/`net_depth` | # of params ~ W^2 D |
`batch_size` | minibatch size |
`window_size`/`step_size` | BPTT(window_size; step_size) |
`*_norm` | `False`/`True` (weight/layer/batch norm) |
`residual_gate` | `False`/`True` |
`learn_init_states` | `False`/`True` |
`learn_id_embedding` | `False`/`True` |
`learn_clock_params` | `False`/`True` |
`update_type` | `'sgd'`/`'momentum'`/`'nesterov'` |
`force_type` | `'vanilla'`/`'adadelta'`/`'rmsprop'`/`'adam'` |
`frames_per_epoch` | time_indices * batch_size per epoch |
`lr_*`, `max_retry` | for learning rate annealing with patience |
`unroll_scan` | trades higher memory use & longer compile time for faster training |
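
For orientation, a hypothetical set of values for some of the options above; how `train.py` actually stores them (dict, argparse defaults, etc.) is not prescribed here:

```python
# Hypothetical example values only; consult train.py for the real mechanism.
options = {
    'input_dim'   : 8,
    'target_dim'  : 2,
    'unit_type'   : 'lstm',
    'loss_type'   : 'huber',
    'net_width'   : 512,         # params ~ 512^2 * 3
    'net_depth'   : 3,
    'batch_size'  : 64,
    'window_size' : 128,         # BPTT(128; 64)
    'step_size'   : 64,
    'layer_norm'  : True,
    'update_type' : 'nesterov',
    'force_type'  : 'adam',
}
```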
- Instructions for launching a training instance are provided in the heading of `train.py`
- Multiple instances may be launched if needed
- By providing a `--load_from` flag, the RNN can be trained starting from an already trained RNN; this may help with getting out of saddle points on some tasks
- Sample `train.py` output:
- `sophia.py` provides an interface for real-time communication with an external process, which is not necessarily written in Python; see `sophia.py` for the communication protocol (handshake + data exchange); a client sketch follows this list
- It is assumed that the same pre-/post-processing is applied to the input/output of the RNN as was used during training
- If the use case is simple, it may be implemented directly in Python in a manner similar to `sophia.py`
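
A minimal sketch of what such an external client could look like; the REQ/REP pattern, endpoint, handshake messages, and frame encoding below are assumptions for illustration, and the authoritative protocol lives in `sophia.py`:

```python
# Hypothetical client sketch; sophia.py defines the real handshake and framing.
import numpy as np
import zmq

ctx  = zmq.Context()
sock = ctx.socket(zmq.REQ)               # REQ/REP pattern is an assumption
sock.connect("tcp://127.0.0.1:5555")     # assumed endpoint

sock.send(b"hello")                      # placeholder handshake message
assert sock.recv() == b"ready"           # placeholder handshake reply

x = np.zeros(8, dtype="<f4")             # one input frame, little-endian float32
sock.send(x.tobytes())                   # data exchange: raw input bytes out
y = np.frombuffer(sock.recv(), dtype="<f4")  # RNN's output frame back
```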
- np = numpy, th = theano, tt = theano.tensor
prefix | type | explanation | how it's created |
---|---|---|---|
(none) | built-in/NumPy | variable on CPU memory | the usual way |
`v_` | `th.SharedVariable` | variable on GPU memory; has init value | `th.shared` |
`s_` | symbolic node | graph node; has no substance until compiled | tensor literals (`tt.alloc`/`zeros`/`ones`, etc.), calculations on `v_`/`s_` |
`p_` | symbolic node | same as above, but marked for use as input/output ports for `th.function` | inputs: `tt.scalar`/`vector`/`matrix`/`tensor3`, etc.; outputs: calculations on `v_`/`s_` |
`f_` | callable object | links built-in/NumPy to input/output ports; all `v_` updates take place via `f_` | `th.function` |
- Prefixes are not used inside `Layer.setup_graph`, as there is no need for the distinction there
suffix | explanation |
---|---|
`<tensor_name>_[t][b][i/j/k]` | [time][batch][hidden] dimensions |
`<individual_name>s` | collections (list, dict, etc.) |
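
To make the conventions concrete, a minimal illustration (not code from this repo) of how the prefixes and suffixes combine:

```python
# Minimal illustration of the naming conventions above; shapes are arbitrary.
import numpy as np
import theano as th
import theano.tensor as tt

v_W     = th.shared(np.zeros((4, 4), dtype="float32"))  # v_: shared (GPU) variable
p_x_tbi = tt.tensor3(dtype="float32")   # p_: input port; _tbi = [time][batch][hidden]
s_h_tbi = tt.dot(p_x_tbi, v_W)          # s_: symbolic node, no value until compiled
p_y_tbi = tt.tanh(s_h_tbi)              # p_: output port
f_fwd   = th.function([p_x_tbi], p_y_tbi)   # f_: compiled callable (th.function)

outs = [f_fwd(np.ones((5, 2, 4), dtype="float32"))]  # trailing s: a collection
```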
[optional], <required>, {default}, (miscellaneous)