Recursus (with Subwords and WordSquares)

One of three modules of Steps toward neural text generation, trained on various parts of my still subterranean series 'it I-VI'.
In its current stage, the work is split between various training attempts and 'curation', namely the selection and rewriting of generated texts. These act, just as in my database-oriented projects (WordSquares, Subwords), as an 'externalized imagination': instead of looking inward for inspiration, computer processes are crafted and tuned to produce the material for the final work (which is, in turn, reworked and refined either manually or with other processes).
The code requires Python 3, TensorFlow and Keras. In my case these were TensorFlow 1.9.0 and Keras 2.2.2; textgenrnn itself was version 1.4.
There are two paths for making the code work:
- on a local device, first make sure you have TensorFlow and Keras, then install the library:
$ pip install textgenrnn
(There doesn't seem to be a Conda package, but I could install it all the same from within a Conda environment.)
The model in model/ is already trained; more text can be generated through the Jupyter notebook lot_of_it.ipynb. The code is only three cells: one to import, one to restore the model, one to generate text. Changing the file names allows one to restore another model (in the other branches, they have been renamed accordingly).
- online, experimenting with the Google Colaboratory notebook and my copy of it. This is how I trained the network: running for a certain number of epochs (e.g. ten), downloading the weights/config/vocab files, then re-uploading them and training some more in another session.
Sources & Research
I am heavily indebted to Max Woolf's framework in Keras/TensorFlow:
- the repo;
- the already mentioned Google Colaboratory notebook;
- his YouTube tutorial and accompanying walkthrough.
More broadly, for the TensorFlow library in general, and neural networks in particular, I am slowly assimilating material from Parag Mital's course on Kadenze and associated repo, after having followed Rebecca Fiebrink's courses at Goldsmiths and her Machine Learning for Musicians and Artists on Kadenze.
I am still relying mostly on external tools, as a lot of what I wish to achieve seems to have been done already, at least on a technical level. What remains is to dig as deep as possible into the inner workings of these networks, and ultimately reach the point where I can create ones with a personal touch.
(Quite a lot) more thoughts on my trajectory in machine learning and computational literature on Recursus.
Two other branches exist, both following the same method & architecture but trying out different training datasets. Instead of feeding the network almost everything I wrote in the past few years, I am looking at curating the input a bit more. So far, however, the master branch, also trained for longer, has yielded more convincing results.
Just as with the two other modules of Recursus, I designed a Processing sketch with GUI capabilities allowing me to produce images/pdfs of the results, that I can then have printed. The commands are:
- left and right arrows to switch between texts (all to be copied into the text file in data/, separated by "**", with no empty lines before or after);
- up and down arrows to increase/decrease the font size;
- 'j' and 'k' to increase/decrease the horizontal margins (calculated as a ratio of the height, height/x, where x is the value being increased/decreased);
- 'n' and 'm' to do the same with vertical margins;
- 's' to save.
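The "**" separator convention above is easy to replicate outside the sketch; a minimal Python sketch (the sample string is a made-up stand-in for the actual data file):

```python
# Split a data file's contents into individual texts on the "**" separator,
# trimming the surrounding blank lines the sketch does not want.
def split_texts(raw):
    return [t.strip() for t in raw.split("**") if t.strip()]

sample = "first generated text\n**\nsecond generated text"
print(split_texts(sample))  # -> ['first generated text', 'second generated text']
```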
More in the data/ folder, and on Recursus.
P.S.: ‘ait’, also, according to Oxford Dictionaries, “[in place names] A small island in a river. ‘Raven's Ait’”, which might help us remember where riverruns bring us back to.