ALP presentation
======

Asynchronous Learning Platform - by [Thomas Boquet](https://github.com/tboquet) and [Paul Lemaître](https://github.com/DrAnaximandre)


Some features:
* open source!
* Neural Network [keras](https://keras.io/) library support
* partial [sklearn](http://scikit-learn.org/stable/) library support
* GPU or CPU broker that dispatches experiments
* 2 databases that store the models and the results
* a CLI!
* a style-transferred image of a mountain goat.


<img src="last_bouquetin.png" alt="Drawing" style="width: 200px;"/>


Outline
====

* Why ALP?
* What kind of model can I use?
* What do I need to run ALP? What is inside ALP? How stable is it?
* How could ALP help me?
* Live coding!
* Request for features.


Why do we develop ALP?
=======

* **Assertion**: when doing ML to solve a problem, we spend more time on
    + building a model
    + testing different architectures
    + comparing results

    than on the ideas that will actually solve the problem.


* **Our proposition**: to ease that process, we are developing an Asynchronous Learning Platform (ALP) that makes good use of the available hardware (CPU and GPU) and manages the models.


* **Core idea**: the platform relies on independent services running in Docker containers. For the end user, it is just a matter of importing the right modules in a Notebook. ALP runs the experiments asynchronously and stores the architectures and results in persistent databases.
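
The core idea can be illustrated with a minimal sketch that uses only the Python standard library. This is not ALP's API: the thread pool stands in for the broker and its workers, the two dictionaries stand in for the model and result databases, and `run_experiment`, `models_db` and `results_db` are hypothetical names with a made-up toy score.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the two databases (models and results).
models_db = {}
results_db = {}


def run_experiment(exp_id, architecture):
    """Pretend to train a model, then store its architecture and result."""
    models_db[exp_id] = architecture
    time.sleep(0.01)  # placeholder for the actual training time
    # Toy score, not a real metric: deterministic so the example is reproducible.
    results_db[exp_id] = {"score": 1.0 / (1 + architecture["hidden_units"])}
    return exp_id


architectures = [{"hidden_units": n} for n in (8, 16, 32)]

# The executor plays the role of the broker and its workers:
# submit() returns immediately, so the caller is not blocked while models train.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(run_experiment, i, a) for i, a in enumerate(architectures)]
    finished = sorted(f.result() for f in futures)
```

Once every future has completed, both dictionaries hold one entry per experiment, which mirrors the idea of being able to query architectures and results after the fact.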





What kind of model can I use?
===

* So far the [keras](https://keras.io/) library is supported with the TensorFlow backend.
    * out-of-the-box keras models run seamlessly.
    * some tricks are necessary to run your own class/layer/loss (due to serialization challenges).
    
    
    
    
* Several models of the [sklearn](http://scikit-learn.org/stable/) library are supported.
    * Not the ensemble models yet (such as Random Forests).
    * sklearn support is mostly historical and is kept for testing and tutorial purposes.
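
The serialization challenge mentioned for custom classes, layers and losses can be shown with plain Python. The sketch below is an assumption about the general nature of the problem, not a description of ALP's actual tricks: a function defined on the fly cannot be pickled, so one common workaround is to send only its name and look it up in a registry on the receiving side. `CUSTOM_OBJECTS`, `build_config` and `scaled_abs_error` are hypothetical names.

```python
import json
import pickle


def build_config(loss_name, learning_rate):
    """Describe a model with plain data only, so it can be stored or sent."""
    return {"loss": loss_name, "learning_rate": learning_rate}


# Hypothetical registry mapping names to custom objects known to the worker.
CUSTOM_OBJECTS = {
    "scaled_abs_error": lambda y_true, y_pred: 2.0 * abs(y_true - y_pred),
}

# A lambda cannot be pickled: pickle stores functions by importable name.
try:
    pickle.dumps(CUSTOM_OBJECTS["scaled_abs_error"])
    lambda_is_picklable = True
except (pickle.PicklingError, AttributeError, TypeError):
    lambda_is_picklable = False

# Its name, however, serializes fine; the receiver resolves it after loading.
payload = json.dumps(build_config("scaled_abs_error", 0.01))
restored = json.loads(payload)
loss_fn = CUSTOM_OBJECTS[restored["loss"]]
loss_value = loss_fn(1.0, 0.25)
```

keras itself accepts a `custom_objects` argument when loading a model, which follows the same name-to-object idea.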
    



What do I need to run ALP? 
===
ALP requires a machine running Ubuntu with Docker, or nvidia-docker if you have an NVIDIA GPU.


What is inside ALP? 
===
ALP relies on Docker, RabbitMQ, Celery, MongoDB and nvidia-docker, among others. It also supports interfacing with Fuel and therefore depends (so far) on Theano. It is implemented in Python. All the dependencies *should be* in the Docker images. The first launch of ALP might take a while, depending on your bandwidth, as the images need to be pulled.


How stable is it?
===
0.3.0 is the latest stable release. It seldom crashes as long as the user does not try anything too exotic. Effort was put into continuous integration (using Travis) during the early stages of development.



How could ALP help me?
===

We believe it might be useful for several applications such as:

* **hyperparameter/architecture tuning**: ALP can help you deal with the tedious task of logging all the architectures, parameters and results. They are all automatically stored in the databases, and you just have to select the best model given the validation(s) you specified.
* **fitting several models on several data streams**: if you have data streams coming from a source and want to fit many online models, ALP makes it easy. With the support of Fuel generators, you can transform your data on the fly.
* **post analysis**: extract and explore the parameters of models given their score on several data blocks. Sometimes it can be helpful to visualise the successful set of parameters.
* **model deployment in production**: when a model is trained, you can load it and deploy it quickly in production.
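
As a rough illustration of the tuning and post-analysis use cases above, the sketch below selects the best model from a list of stored results and averages a metric per hyperparameter value. It does not use ALP's API or query a real database: the records, their field names and their numbers are invented for the example, and `best_model` and `mean_metric_by_param` are hypothetical helpers.

```python
from collections import defaultdict

# Invented records, shaped like what a results database might return.
results = [
    {"model_id": "a", "params": {"lr": 0.1, "units": 16}, "val_loss": 0.42},
    {"model_id": "b", "params": {"lr": 0.01, "units": 16}, "val_loss": 0.31},
    {"model_id": "c", "params": {"lr": 0.01, "units": 32}, "val_loss": 0.35},
]


def best_model(records, metric="val_loss"):
    """Return the record with the lowest value of the chosen metric."""
    return min(records, key=lambda r: r[metric])


def mean_metric_by_param(records, param, metric="val_loss"):
    """Average the metric over all records sharing a hyperparameter value."""
    groups = defaultdict(list)
    for record in records:
        groups[record["params"][param]].append(record[metric])
    return {value: sum(scores) / len(scores) for value, scores in groups.items()}


best = best_model(results)
by_lr = mean_metric_by_param(results, "lr")
```

Grouping by a single parameter is the simplest form of post-analysis; the same records could just as well feed a plot of the successful parameter sets.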

Live coding
===

Let us go through some basic examples in a Jupyter Notebook.

* defining a sklearn model and running it in ALP
* simple hyperparameter tuning in ALP (with asynchronous fit)




Request for features
===

Let us discuss the features you need so that we can implement them. Some ideas:

* so far the easiest way to work as a team on one machine is to launch one worker per user and allocate resources to each (e.g. a given amount of memory or a given number of GPUs per user). We could develop a fancier version that hands out unused resources when needed.
* support of PyTorch models
* support of NiftyNet
* abstract backend


You can always [open an issue on the repo](https://github.com/tboquet/python-alp/issues).