A collection of Gstreamer plugins based on recurrent neural networks

Recur: a multimedia RNN miscellany.

Recur is a collection of Gstreamer plugins based on recurrent neural networks, along with a character level language modeller. It began as the technical core of an artwork, and two of the plugins (recur and rnnca) are aimed at the rather useless task of learning to produce abstract video. The most interesting plugin for you is probably classify, which classifies audio streams. It has been used with some success for identifying birds and human languages.

Technical overview

The recurrent neural network (RNN) core uses rectified linear units (ReLU) or rectified square root units. It learns via backpropagation through time (BPTT), often using synchronic mini-batches: the weight updates are combined from tens or hundreds of parallel streams.

The calculations are done in 32-bit floats on the CPU, and are quite fast: on x86-64 the core is significantly faster than libatlas and openblas. Recur can achieve this by exploiting knowledge about the ReLU architecture -- in particular, by not bothering to calculate matrix rows that are destined to be multiplied by zero.
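The zero-skipping idea can be sketched as follows. This is not Recur's code: the function name, the row-major layout (one weight row per input unit), and the argument order are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Recur's API.
   weights is row-major with n_in rows and n_out columns, so row i holds
   every weight leaving input unit i. out must have room for n_out floats. */
static void sparse_input_forward(const float *weights, const float *in,
                                 size_t n_in, size_t n_out, float *out)
{
    assert(weights != NULL && in != NULL && out != NULL);
    for (size_t j = 0; j < n_out; j++) {
        out[j] = 0.0f;
    }
    for (size_t i = 0; i < n_in; i++) {
        float x = in[i];
        if (x == 0.0f) {
            continue; /* the whole row would be multiplied by zero */
        }
        const float *row = weights + i * n_out;
        for (size_t j = 0; j < n_out; j++) {
            out[j] += x * row[j];
        }
    }
}
```

After a ReLU layer a large share of the inputs is typically exactly zero, so each skipped row saves n_out multiply-adds. A general-purpose BLAS routine cannot assume that sparsity.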

The data is laid out in memory to facilitate the use of SIMD instructions, but the code avoids assembly blocks and intrinsics. This generally works, and recent versions of GCC and Clang are able to find reasonable SIMD solutions with minimal encouragement.
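A sketch of what a SIMD-friendly layout without intrinsics might look like is below. The 32-byte alignment, the padding scheme, and all of the names are assumptions for illustration rather than Recur's actual choices.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ALIGNMENT 32 /* assumed: bytes, wide enough for 8 floats */
#define FLOATS_PER_BLOCK (ALIGNMENT / sizeof(float))

/* Round a row length up to a whole number of aligned blocks. */
static size_t padded_len(size_t n)
{
    return (n + FLOATS_PER_BLOCK - 1) / FLOATS_PER_BLOCK * FLOATS_PER_BLOCK;
}

/* Allocate zeroed, aligned storage for rows * padded_len(cols) floats.
   Returns NULL on failure; release with free(). */
static float *alloc_aligned_matrix(size_t rows, size_t cols)
{
    size_t bytes = rows * padded_len(cols) * sizeof(float);
    float *m = aligned_alloc(ALIGNMENT, bytes);
    if (m != NULL) {
        memset(m, 0, bytes);
    }
    return m;
}

/* dst += scale * src over one padded row. The restrict qualifiers and
   the block-multiple trip count give the compiler room to vectorise
   this plain loop without any intrinsics. */
static void add_scaled(float *restrict dst, const float *restrict src,
                       float scale, size_t padded_n)
{
    assert(padded_n % FLOATS_PER_BLOCK == 0);
    for (size_t i = 0; i < padded_n; i++) {
        dst[i] += scale * src[i];
    }
}
```

Because every row starts on an aligned boundary and the padding is zero, loops can run over whole blocks with no scalar tail. Compiling with something like `-O3 -march=native` is usually enough for GCC or Clang to emit vector instructions for loops of this shape.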

Recur was originally an artwork that learnt continuously in an effort to recreate a video. It was working in an isolated environment (no keyboard or network, in a distant city) for three months. It was designed to have an interesting and uninterrupted learning journey, rather than reaching a stable end point. Thus it has various optional regularisers that make no sense for a destination-oriented learner (and maybe they didn't work too well for the exhibit either).

The plugins and RNN core are written in the gnu-11 variant of C, while many scripts are written in Python. The nets are saved using the CDB format.

Prerequisites, configuration, and compilation

The core library needs libcdb, which is packaged as libcdb-dev on Debian or tinycdb-devel on Fedora.

The Gstreamer plugins require Gstreamer 1.x, Gstreamer 1.x base plugins, and Glib 2.x development files. These are packaged with most Linux distributions, with names like libgstreamer1.0-dev and libglib2.0-dev.

Gstreamer plugins

There are three working plugins so far:

recur

This one is supposed to try to learn to recreate the typical motion and colour of the video it is watching. In fact it makes a great deal of effort to keep changing and avoid crashing, which is somewhat at cross-purposes to the learning.

make libgstrecur.so
gst-inspect-1.0  --gst-plugin-path=. recur

There is a GTK app:

make gtk-recur
./gtk-recur --help

And also various example pipelines in the Makefile.

rnnca

RNNCA stands for Recurrent Neural Network Cellular Automata. It learns rules for a two-dimensional cellular automaton in imitation of the video it is watching, and uses these to create new video.

make libgstrnnca.so
gst-inspect-1.0  --gst-plugin-path=. rnnca

There is a GTK app to run it:

make rnnca-player
./rnnca-player --help

There is an example on YouTube.
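For readers unfamiliar with cellular automata, the general shape of one update step is sketched below. Everything here is an assumption made for illustration: the 3x3 neighbourhood, the wrap-around edges, the single float per cell, and the names. In rnnca the rule would be the trained network; here it is a plain callback.

```c
#include <assert.h>
#include <stddef.h>

/* A rule maps a 3x3 neighbourhood (row-major, centre at index 4) to the
   cell's next value. Hypothetical type, standing in for the trained net. */
typedef float (*ca_rule)(const float neighbourhood[9]);

/* Compute one generation of a w x h grid with wrap-around edges.
   src and dst must be distinct buffers of w * h floats. */
static void ca_step(const float *src, float *dst, size_t w, size_t h,
                    ca_rule rule)
{
    assert(src != dst && w > 0 && h > 0);
    for (size_t y = 0; y < h; y++) {
        for (size_t x = 0; x < w; x++) {
            float n[9];
            for (size_t dy = 0; dy < 3; dy++) {
                size_t yy = (y + h + dy - 1) % h; /* wrap vertically */
                for (size_t dx = 0; dx < 3; dx++) {
                    size_t xx = (x + w + dx - 1) % w; /* wrap horizontally */
                    n[dy * 3 + dx] = src[yy * w + xx];
                }
            }
            dst[y * w + x] = rule(n);
        }
    }
}

/* Example rule: the mean of the neighbourhood, which blurs the grid. */
static float mean_rule(const float n[9])
{
    float sum = 0.0f;
    for (size_t i = 0; i < 9; i++) {
        sum += n[i];
    }
    return sum / 9.0f;
}
```

Calling `ca_step(frame, next, width, height, mean_rule)` repeatedly, swapping the two buffers each time, yields a sequence of frames. Replacing `mean_rule` with a learnt function is what turns this into something like rnnca.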

classify

This learns to assign class probabilities to a stream of audio. To train it you need a lot of labelled data.

make libgstclassify.so
gst-inspect-1.0  --gst-plugin-path=. classify
classify-train  --help
classify-test  --help
classify-gtk  --help

Documentation is slight. Sorry.

Character level language modelling

The text-predict program learns to predict the next character of a sequence of text. There are a lot of options. The default options will train quickly to a cross entropy around 2.

make text-predict
./text-predict --help

Multi-head character level modelling in Python

If you want to measure the comparative cross-entropy of a text against a number of related character-level language models, and you want to use Python 2.7 to wrangle the text, you are probably looking for charmodel.so. To build it you need the python2.7-dev (or equivalent) package. Try:

make charmodel.so
python -c 'import charmodel'

The caravel project is based on this module.

Copyright and license

Copyright (C) 2014 Douglas Bagnall douglas@halo.gen.nz

This software can be distributed under the terms of the GNU Lesser General Public License, versions 2.1 or greater, or the GNU Library General Public License, version 2.

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Other contributions

The contents of the ccan directory and mdct.c are by various authors, and have various licenses, mostly very liberal. The files text-predict.c, xml-lang-classify, text-confabulate, text-cross-entropy, text-classify, text-classify-results, and the contents of ccan/opt, are covered by the GPLv2. This does not affect your use of the Gstreamer plugins. See licenses/README for more detail.

scripts/pycdb.py is a cut-down version of David Wilson's MIT Licensed python-pure-cdb. The license goes like this:

Copyright (c) 2009-2015 David Wilson dw@botanicus.net

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

and can be found in the file itself and at licenses/MIT.pycdb.