Recur: a multimedia RNN miscellany.
Recur is a collection of Gstreamer plugins based on recurrent neural
networks, along with a character level language modeller. It began as
the technical core of an artwork, and two of the plugins (
rnnca) are aimed at the rather useless task of learning to produce
abstract video. The most interesting plugin for you is probably
classify, which classifies audio streams. It has been used with some
success for identifying birds and human languages.
The recurrent neural network (RNN) core uses rectified linear units (ReLU) or rectified square root units. It learns via backpropagation through time (BPTT), often using synchronic mini-batches: the weight updates are combined from tens or hundreds of parallel streams.
The calculations are done in 32 bit floats on the CPU, and are quite fast: on x86-64 it is significantly faster than libatlas and openblas. Recur can achieve this by exploiting knowledge about the ReLU architecture -- in particular, by not bothering to calculate matrix rows that are destined to be multiplied by zero.
The data is laid out in memory to facilitate the use of SIMD instructions, but the code avoids assembly blocks and intrinsics. This generally works, and recent versions of GCC and Clang are able to find reasonable SIMD solutions with minimal encouragement.
Recur was originally an artwork that learnt continuously in an effort to recreate a video. It was working in an isolated environment (no keyboard or network, in a distant city) for three months. It was designed to have an interesting and uninterrupted learning journey, rather than reaching a stable end point. Thus it has various optional regularisers that make no sense for a destination-oriented learner (and maybe they didn't work too well for the exhibit either).
The plugins and RNN core are written in the gnu-11 variant of C, while many scripts are written in Python. The nets are saved using the CDB format.
Prerequisites, configuration, and compilation
The core library needs libcdb,
which will be packaged as
libcdb-dev on Debian or
The Gstreamer plugins require
Gstreamer 1.x, Gstreamer 1.x base plugins, and Glib 2.x development
files. These are packaged with most Linux distributions, with names
There are three working plugins so far:
This one is supposed to try to learn to recreate the typical motion and colour of the video it is watching. In fact it makes a great deal of effort to keep changing and avoid crashing, which is somewhat at cross-purposes to the learning.
make libgstrecur.so gst-inspect-1.0 --gst-plugin-path=. recur
There is a GTK app:
make gtk-recur ./gtk-recur --help
And also various example pipelines in the Makefile.
RNNCA stands for Recurrent Neural Network Cellular Automata. It learns rules for a two dimensional cellular automata in imitation of the video it is watching, and uses these to create new video.
make libgstrnnca.so gst-inspect-1.0 --gst-plugin-path=. rnnca
There is a GTK app to run it:
make rnnca-player ./rnnca-player --help
There is an example on youtube.
This learns to assign class probabilities to a stream of audio. To train it you need a lot of labelled data.
make libgstclassify.so gst-inspect-1.0 --gst-plugin-path=. classify classify-train --help classify-test --help classify-gtk --help
Documentation is slight. Sorry.
Character level language modelling
text-predict program learns to predict the next character of a
sequence of text. There are a lot of options. The defaults options
will train quickly to a cross entropy around 2.
make text-predict ./text-predict --help
Multi-head character level modelling in Python
If you want to measure the comparative cross-entropy of a text against
a number of related character-level language models, and you want to
use Python 2.7 to wrangle the text, you are probably looking for
charmodel.so. To build it you need the
equivalent) package. Try:
make charmodel.so python -c 'import charmodel'
The caravel project is based on this module.
Copyright and license
Copyright (C) 2014 Douglas Bagnall firstname.lastname@example.org
This software can be distributed under the terms of the GNU Lesser General Public License, versions 2.1 or greater, or the GNU Library General Public License, version 2.
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
The contents of the ccan directory and mdct.c are by various authors,
and have with various licenses, mostly very liberal. The files
the contents of
ccan/opt, are covered by the GPLv2. This does not
affect your use of the Gstreamer plugins. See licences/README for more
scripts/pycdb.py is a cut-down version of David Wilson's MIT
Licensed python-pure-cdb. The
license goes like this:
Copyright (c) 2009-2015 David Wilson email@example.com
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
and can be found in the file itself and at