<center>
<h1>Codemotion Workshop: Deep Learning Fundamentals</h1>
<br />
<h2>Polo Didattico, 11th April 2018</h2>
<img src="./Images/Codemotion_logo_small.png" />
</center>

<p>A notebook is a collection of text and code snippets; the code can be executed in real time against a Python engine (the kernel).</p>

To start this notebook (from the Anaconda Prompt):
<pre>
jupyter notebook
</pre>
<p>Check that "Python 3" appears in the top-right corner of the notebook, and navigate to the folder where this notebook is stored. Among the many keyboard shortcuts, three are essential:</p>
<ol>
<li><strong>Enter</strong>: edit cell.</li>
<li><strong>Ctrl+Enter</strong>: execute cell.</li>
<li><strong>Shift+Enter</strong>: execute cell and move to the next one.</li>
</ol>

## About me

<ul>
<li> Ph.D. in machine learning (Sapienza, 2016).</li>
<li> <strong>Post-doc fellow</strong> at Sapienza, <strong>research fellow</strong> at Stirling University (UK), previously <strong>lecturer</strong> at Perugia University.</li>
<li> Co-organizer of the <strong>Rome Data Science and Machine Learning Meetup</strong>.</li>
<li> Co-founder of the <strong>Italian Association for Machine Learning</strong>.</li>
<li> <strong>Google Developer Expert</strong> for Machine Learning.</li>
</ul>

# About this workshop

## Machine learning vs. classical programming
<center><img src="./Images/Labirinth.png" /></center>
[Two big challenges in machine learning (Léon Bottou)](http://icml.cc/2015/invited/LeonBottouICML2015.pdf)

## When is machine learning useful?

<ul>
<li><strong>Navigating</strong> the labyrinth and finding a way through it is easy for a programmer, because the task is extremely well defined.</li>
<li><strong>Recognizing the mouse and cheese</strong> in the images is instead extremely complex, because the problem is very hard to formulate, and there are countless possible sources of variation in the image.</li>
</ul>

<p>One way to tackle the second problem is an algorithm capable of <em>learning</em> to recognize a mouse from hundreds or thousands of images of mice.</p>

<p>Machine learning is about extracting <em>useful</em> knowledge from data (hence the classical definition, "<em>learn without being explicitly programmed</em>"). In practice it goes by many near-synonyms:</p>
<ul>
<li><strong>Pattern recognition</strong> (historical focus on engineering)</li>
<li><strong>Data mining</strong> (more focus on exploration of data)</li>
<li><strong>Predictive analytics</strong> (focus on predictive modeling)</li>
<li><strong>Knowledge discovery</strong> (common in the databases world)</li>
<li><strong>Inferential statistics</strong> (the term used by people with a statistics background)</li>
</ul>

## Some basic problems in ML

<br />
<div style="background-color:#ADDFFF; padding:0.5em;">
<h3>Supervised learning</h3>
<ul>
<li><strong>Classification</strong>: label an object starting from a set of examples.</li>
<li><strong>Regression</strong>: assign a real value to an object starting from a set of examples.</li>
</ul>
</div>
<br />
<div style="background-color:#BFEE90; padding:0.5em;">
<h3>Unsupervised learning</h3>
<ul>
<li><strong>Clustering</strong>: partition your data in meaningful groups.</li>
<li><strong>Dimensionality reduction / visualization</strong>.</li>
</ul>
</div>
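
<p>As a rough illustration of the two supervised tasks listed above, the sketch below applies a one-nearest-neighbour rule to a tiny, made-up dataset. The data, the function names (<code>nearest_example</code>, <code>classify</code>, <code>regress</code>) and the choice of nearest neighbour are assumptions made for illustration only; they are not part of the workshop material.</p>

```python
# Toy training set, invented for illustration:
# (weight in kg, species label, daily food in grams)
examples = [
    (0.02, 'mouse', 3.0),
    (0.03, 'mouse', 4.0),
    (4.0, 'cat', 60.0),
    (5.0, 'cat', 70.0),
]

def nearest_example(x):
    """Return the training example whose weight is closest to x."""
    return min(examples, key=lambda ex: abs(ex[0] - x))

def classify(x):
    """Classification: predict a discrete label for x."""
    return nearest_example(x)[1]

def regress(x):
    """Regression: predict a real value for x."""
    return nearest_example(x)[2]

print(classify(0.021))  # mouse
print(regress(4.2))     # 60.0
```

<p>Both functions learn from the same examples; the only difference is whether the predicted quantity is a label or a real number.</p>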

### Machine learning vs. deep learning
<ul>
<li>"Classical" machine learning does not scale well to very large or complex datasets, unless the data is carefully preprocessed.</li>
<li><strong>Deep learning</strong> is a subset of ML models that can work directly on high-dimensional data (images, text, ...).</li>
</ul>
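
<p>To give a sense of what "high-dimensional" means for images, the short sketch below counts the input features of a flattened colour image. The 224x224 resolution is only an assumed example size, not a value taken from this workshop.</p>

```python
# Assumed example: a 224x224 RGB image (sizes chosen for illustration only)
height, width, channels = 224, 224, 3

# Once the image is flattened, each pixel contributes one feature per colour channel
n_features = height * width * channels
print(n_features)  # 150528
```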

## What tool should I use?
https://www.kaggle.com/surveys/2017

## Why Python for data science?
<ul>
<li><strong>Strong core libraries</strong> for numerical manipulation.</li>
<li>Huge <strong>active community</strong> surrounding an ecosystem of additional libraries.</li>
<li><strong>General-purpose</strong> language, with different programming styles.</li>
<li>Simple to learn, interactive, quick for prototyping.</li>
</ul>

<center><img src="Images/python_ecosystem.png" /></center>

### Our (main) objectives today:

<ol>
<li>Practical understanding of the Python ecosystem of data science libraries.</li>
<li>Implement an entire supervised learning workflow, with an understanding of all its steps.</li>
<li>Understand the differences between standard machine learning and deep learning.</li>
<li>Implement a new workflow using Google's TensorFlow.</li>
</ol>

# Basic Python concepts

<p>Python was released by Guido van Rossum in 1991. Among its basic characteristics we can cite:</p>
<ol>
<li>Interpreted language.</li>
<li>Indentation instead of braces to delimit code blocks.</li>
<li>Strong focus on readability.</li>
<li>Huge availability of libraries.</li>
</ol>
<p>Its core principles are expressed in a brief text called <a href="http://docs.python-guide.org/en/latest/writing/style/#zen-of-python">The Zen of Python</a>.</p>

In [1]:
# Simple import call (prints The Zen of Python on first use)
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


<p>Python has many data structures pre-implemented, such as dictionaries:</p>

In [3]:
# Adapted from https://wiki.python.org/moin/SimplePrograms

# Define a dictionary (basic Python object)
activities = {8: 'Sleeping',
              9: 'Commuting',
              17: 'Working',
              20: 'Eating',
              22: 'Resting' }
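
<p>A few common dictionary operations, as a minimal sketch. The block repeats the definition of <code>activities</code> so that it runs on its own; the specific lookups are just examples.</p>

```python
# Same dictionary as above, repeated so this cell is self-contained
activities = {8: 'Sleeping', 9: 'Commuting', 17: 'Working',
              20: 'Eating', 22: 'Resting'}

# Access a value by its key
print(activities[17])                 # Working

# get() returns a default instead of raising KeyError for a missing key
print(activities.get(12, 'Unknown'))  # Unknown

# Membership tests look at the keys, not the values
print(9 in activities)                # True

# Iterate over (key, value) pairs
for hour, activity in activities.items():
    print(hour, activity)
```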

A simple example of a for loop and an if clause in Python:

In [4]:
# For loop and if clause (note the indentation)
from time import localtime

current_hour = localtime().tm_hour  # read the clock once, before the loop
for activity_time in sorted(activities):
    if current_hour < activity_time:
        print(activities[activity_time])
        break
else:  # Executed only if the for loop completes without break
    print('Unknown, AFK or sleeping!')

Working
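
<p>The <code>else</code> branch attached to a <code>for</code> loop is easy to misread, so here is a small sketch of its behaviour in isolation. The helper <code>first_even</code> is a hypothetical function written only for this illustration.</p>

```python
def first_even(numbers):
    """Return the first even number in numbers, or None if there is none."""
    for n in numbers:
        if n % 2 == 0:
            result = n
            break          # leaving the loop this way skips the else branch
    else:
        result = None      # reached only if the loop never hit break
    return result

print(first_even([1, 3, 4, 5]))  # 4
print(first_even([1, 3, 5]))     # None
```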


A few more advanced concepts:

In [4]:
# Lambda expressions can define functions on-the-fly
mult_by_two = lambda x: x*2

# A list comprehension builds a new list by processing every element of another
original_list = [1, 2, 3]
new_list = [mult_by_two(i) for i in original_list]

# Zip to iterate over multiple lists
for (l0, l1) in zip(original_list, new_list):
    print('{0} times 2 is {1}'.format(l0, l1))

1 times 2 is 2
2 times 2 is 4
3 times 2 is 6
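
<p>Two small variations on the cell above, as a sketch: a lambda can always be written as a regular <code>def</code>, and a comprehension can filter elements with an <code>if</code> clause. The name <code>mult_by_two_def</code> is made up for this example.</p>

```python
# A regular function equivalent to the lambda mult_by_two
def mult_by_two_def(x):
    return x * 2

original_list = [1, 2, 3]

# A comprehension can also filter elements with an if clause
doubled_odds = [mult_by_two_def(i) for i in original_list if i % 2 == 1]
print(doubled_odds)  # [2, 6]

# map() with a lambda gives the same result as an unfiltered comprehension
doubled_all = list(map(lambda x: x * 2, original_list))
print(doubled_all)   # [2, 4, 6]
```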


<img src="./Images/IPy_header.png" />
<p><strong>IPython</strong> is an enhanced version of the standard Python console:</p>
<ol>
<li>Comprehensive object introspection.</li>
<li>Extensible system of ‘magic’ commands, providing functionality meant for interactive use.</li>
<li>Easily embeddable in other Python programs and GUIs.</li>
<li>Integration with the Python debugger.</li>
</ol>
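
<p>In IPython, appending <code>?</code> to a name displays its signature and docstring interactively. As a rough sketch of what introspection means, the block below retrieves similar information in plain Python with the standard <code>inspect</code> module; the function <code>mult_by_two</code> is redefined here only so the example is self-contained.</p>

```python
import inspect

def mult_by_two(x):
    """Return x multiplied by two."""
    return x * 2

# Signature and docstring of an object
print(inspect.signature(mult_by_two))  # (x)
print(inspect.getdoc(mult_by_two))     # Return x multiplied by two.

# Names of the public attributes and methods of a type
public_names = [name for name in dir(str) if not name.startswith('_')]
print(public_names[:5])
```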

In [5]:
# List of available magic commands
%magic

<p><strong>Anaconda</strong> is a Python distribution that installs everything needed to use Python for scientific computing. It comes with its own package manager, conda, for installing new Python packages:</p>
<pre>
conda install name-of-the-package
</pre>
<p>Alternatively, use pip (pre-installed with Anaconda):</p>
<pre>
pip install name-of-the-package
</pre>
<p>Creating and activating a new environment with conda:</p>
<pre>
conda create --name py27 python=2.7
conda activate py27
</pre>
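
<p>After installing packages or switching environments, it can help to check from Python which interpreter is actually running and whether a module is available. The sketch below assumes a Python 3 interpreter (it would not run in the <code>py27</code> environment created above), and <code>is_installed</code> is a hypothetical helper written for this example.</p>

```python
import importlib.util
import sys

# Version and location of the interpreter running this code
print(sys.version_info[:3])
print(sys.executable)

def is_installed(module_name):
    """Return True if a top-level module can be found in the active environment."""
    return importlib.util.find_spec(module_name) is not None

# json is part of the standard library, so it should always be found
print(is_installed('json'))
```

<p>Calling <code>is_installed</code> with the import name of a package shows whether it is available in the active environment.</p>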

<center><img src="./Images/Jupyter_logo.png" />
<a href="https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks">A gallery of interesting IPython Notebooks</a></center>

<center><img src="./Images/Spyder-windows-screenshot.png" /></center>
[Image Source](https://en.wikipedia.org/wiki/Spyder_(software))