This repo includes all scripts required to build a VirtualBox 'Appliance' (an easy-to-install pre-configured VM) that can be used by Deep Learning Workshop participants.
This workshop consists of an introduction to deep learning, starting from single-layer networks in the browser and then using the VM/Jupyter setup to train networks with both Theano (+ Lasagne for model components) and TensorFlow (+ some sugar layers). The modules also include pretrained state-of-the-art networks, such as GoogLeNet, in various applications :
- FOSSASIA 2016 : Deep Learning Workshop (2 hours)
- PyCon-SG 2016 : Deep Learning Workshop (1.5 hours)
- DataScienceSG MeetUp : 'Hardcore' session about Deep Learning (2.5 hours)
- Fifth Elephant, India : Deep Learning Workshop (6 hours : 4x 1.5hr classes in one day)
  - Application : Classifying unknown classes of images (~transfer learning)
  - Application : Generative art (~style transfer)
  - Application : RNN Tagger
  - Application : RNN Fun (work-in-progress)
  - Application : Anomaly Detection (mis-shaped MNIST digits)
  - Application : Reinforcement Learning
  - Slides for the talk are here, with an accompanying blog post
- PyDataSG MeetUp : Talk on RNNs and NLP (1.5 hours)
- TensorFlow & Deep Learning MeetUp : Talk on transfer learning (0.5 hours)
- FOSSASIA 2017 : Deep Learning Workshop (1 hour)
- TensorFlow & Deep Learning MeetUp : Talk on CNNs (0.5 hours)
  - Application : Speech Recognition using a CNN (non-workshop version)
  - Slides for the talk are [here](http://redcatlabs.com/2017-03-20_TFandDL_IntroToCNNs/#/), with an accompanying blog post, which includes a video link
- TensorFlow & Deep Learning MeetUp : Generative Art : Style-Transfer (0.5 hours)
  - Application : Generative Art (Style-Transfer)
  - Slides for the talk are here
- APAC Machine Learning & Data Science Community Summit : In the news : AlphaGo and Reinforcement Learning (0.75 hours)
- TensorFlow & Deep Learning MeetUp : Text : Embeddings, RNNs and NER (~1 hour)
- TensorFlow & Deep Learning MeetUp : Advanced Text and Language (0.75 hours)
- FOSSASIA 2018 : Deep Learning Workshop (1 hour)
NB : Ensure Conference Workshop announcement / blurb includes VirtualBox warning label
- Also : for the Art (and potentially other image-focussed) modules, having a few 'personal' images available might be entertaining
The VM itself includes :
- Jupyter (IPython's successor)
  - Running as a server available to the host machine's browser
- Data
  - MNIST training and test sets
  - Trained models from two of the 'big' ImageNet winners
  - Test images for the recognition, 'e-commerce' and style-transfer modules
  - Corpuses and pretrained GloVe for the language examples
  - Locally-runnable versions of a CNN demonstrator, and OpenAI's '3-boxes' Reptile demo
- Tool chain(s) (Python-oriented)
  - Theano / Lasagne
  - TensorFlow and Keras
  - PyTorch (CPU version)
And this repo can itself be run in 'local mode', using scripts in `./local/` to :
- Set up the virtual environment correctly
- Run `jupyter` with the right flags, paths etc
- Scripts to create working Fedora 25 installation inside VM
  - Has working `Python3.x` `virtualenv` with `Jupyter` and `TensorFlow / TensorBoard`
- Script to transform the VM into a VirtualBox appliance
  - Exposing `Jupyter`, `TensorBoard` and `ssh` to host machine
- Locally hosted Convnet.js for :
  - Demonstration of gradient descent ('painting')
- Locally hosted TensorFlow Playground for :
  - Visualising hidden layer, and effect of features, etc
- Locally hosted cnn demo for :
  - Demonstration of how a single CNN 3x3 filter works
- Existing workshop notebooks :
  - Basics
  - MNIST
  - MNIST CNN
  - ImageNet : GoogLeNet
  - ImageNet : Inception 3
  - CNN for simple Voice Recognition
  - 'Anomaly Detection' - identifying mis-shaped MNIST digits
  - 'Commerce' - repurpose a trained network to classify our stuff
  - 'Art' - Style transfer with Lasagne, but using GoogLeNet features for speed
  - 'Reinforcement Learning' - learning to play "Bubble Breaker"
  - 'RNN-Tagger' - Processing text, and learning to do case-less Named Entity Recognition
- Notebook Extras
  - U - VM Upgrade tool
  - X - BLAS configuration fiddle tool
  - Z - GPU chooser (needs Python's `BeautifulSoup`)
- Create rsync-able image containing :
  - VirtualBox appliance image
    - including data sets and pre-trained models
  - VirtualBox binaries for several likely platforms
  - Write to thumb-drives for actual workshop
    - and/or upload to DropBox
- Workshop presentation materials
- Create sync-to-latest-workbooks script to update existing (taken-home) VMs
- Create additional 'applications' modules (see 'ideas.md')
- Monitor TensorBoard - to see whether it reduces its memory footprint enough to switch from Theano...
- 'RNN-Fun' - Discriminative and Generative RNNs
See the local/README file.
Also worth investigating : Google Colab, which allows the Free (as in Beer) use of a K80 GPU in a Jupyter-notebook-like interface. It can also open GitHub-hosted notebooks directly, using a URL of the form :
https://colab.research.google.com/github/USER/REPO/blob/master/NOTEBOOK
For a concrete example, look at this link to the recently revamped Reptile code from OpenAI that is in the MetaLearning folder of this repo.
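The URL pattern above is easy to build programmatically. The helper below is only a sketch: the function name `colab_url`, its `branch` parameter and the example user/repo/notebook values are made up for illustration, and it assumes the notebook lives on a branch that the `blob/BRANCH/` form can address.

```python
from urllib.parse import quote


def colab_url(user, repo, notebook_path, branch="master"):
    """Build a Colab URL that opens a GitHub-hosted notebook directly."""
    # quote() leaves '/' alone, so sub-folders survive, but escapes spaces etc.
    path = quote(notebook_path.lstrip("/"))
    return "https://colab.research.google.com/github/{}/{}/blob/{}/{}".format(
        user, repo, branch, path
    )


# Hypothetical values, purely for illustration :
print(colab_url("someuser", "somerepo", "notebooks/demo.ipynb"))
# -> https://colab.research.google.com/github/someuser/somerepo/blob/master/notebooks/demo.ipynb
```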
Using the code from : http://pascalbugnion.net/blog/ipython-notebooks-and-git.html (and https://gist.github.com/pbugnion/ea2797393033b54674af ), you can have git strip the outputs from notebooks as they are committed. This can be enabled on just one repository, rather than installed globally, as follows...
Within the repository, run :
# Set the permissions for execution :
chmod 754 ./bin/ipynb_optional_output_filter.py
git config filter.dropoutput_ipynb.smudge cat
git config filter.dropoutput_ipynb.clean ./bin/ipynb_optional_output_filter.py
This will add suitable entries to `./.git/config`.
Alternatively, create the entries manually by ensuring that your `.git/config` includes the lines :
[filter "dropoutput_ipynb"]
smudge = cat
clean = ./bin/ipynb_optional_output_filter.py
Note also that this repo includes a `<REPO>/.gitattributes` file containing the following:
*.ipynb filter=dropoutput_ipynb
Doing this causes git to run `ipynb_optional_output_filter.py` in the `REPO/bin` directory, which only uses `import json` to parse the notebook files (and so can be executed as a plain script).
To disable the output-cleansing on a per-notebook basis, add the following first-level entry to the notebook's metadata (Edit-Metadata); `true` is the default :
"git" : { "suppress_outputs" : false },
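To make the mechanism concrete, here is a rough sketch of what such a clean filter does. It is not the contents of `bin/ipynb_optional_output_filter.py`: the function names `strip_outputs` and `clean_text` are made up for this example, and it assumes nbformat 4 notebooks, where cells sit in a top-level `cells` list.

```python
import json


def strip_outputs(notebook):
    """Blank the outputs of every code cell, unless the notebook opts out."""
    # Opt-out flag : "git": {"suppress_outputs": false} at the first level of the metadata.
    git_meta = notebook.get("metadata", {}).get("git", {})
    if not git_meta.get("suppress_outputs", True):  # true is the default
        return notebook

    # Assumption : nbformat 4, where cells live in a top-level "cells" list.
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clean_text(text):
    """Take notebook JSON as a string and return the cleaned JSON string."""
    return json.dumps(strip_outputs(json.loads(text)), indent=1, sort_keys=True) + "\n"


# As a git clean filter, the script would read stdin and write stdout, e.g. :
#   import sys; sys.stdout.write(clean_text(sys.stdin.read()))
```

Because git only stages what the clean filter writes to stdout, the notebook in the working directory keeps its outputs; only the committed version is stripped.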
nbstripout seems to do what we want, and can be installed more easily.
Within the local python environment (or do this globally, as root, if you're committed) :
pip install nbstripout