# Machine Learning Demystified for Developers

## Getting Started (Locally)
1. Install Docker for [Mac](https://www.docker.com/docker-mac) / [Windows](https://www.docker.com/docker-windows) / [Linux](https://docs.docker.com/install/)
2. Download and Run:
```shell
git clone git@github.com:atomantic/ml_class.git
cd ml_class
docker run -it -v "$(pwd)":/home/jovyan --rm -p 8888:8888 jupyter/scipy-notebook
```
3. Open the link to the Jupyter environment printed by the `docker run` command (e.g. http://localhost:8888/?token=f02e34b39ff5c834ca0a22335eb89b3b5858d1cc858ae921) ![terminal output](images/run.png)

## Follow Along!

In [1]:
# In Jupyter you can execute command-line programs by prefixing them with a '!'
!python --version

Python 3.6.3


In [2]:
# Import the common packages for exploring Machine Learning
import numpy as np  # <-- 'np' is the conventional short alias for this package
import pandas as pd
import sklearn
import matplotlib
import matplotlib.pyplot as plt

# It's always good to check versions, because the docs differ between releases!
print('NumPy Version',np.__version__)
print('Pandas Version',pd.__version__)
print('Scikit Learn Version',sklearn.__version__)
print('MatplotLib Version',matplotlib.__version__)

NumPy Version 1.12.1
Pandas Version 0.19.2
Scikit Learn Version 0.18.2
MatplotLib Version 2.0.2


![numpy](images/logo_numpy.jpg)
- [docs](https://docs.scipy.org/doc/)
- n-dimensional array object
- random numbers

In [10]:
# Create a simple NumPy array
a = np.array([[1,2],
              [3,4],
              [5,6],
              [7,8],
              [9,10],
              [11,12]])
print("full array:", a)

# NumPy arrays are sliced with [row, column] syntax
# Indices are zero-based!
print("first row:", a[0])
print("first column values from all rows:", a[:,0])
print("second column, second row:", a[1,1])
print("more complex value pulling:", a[2:4,0])

# Your Turn
print("Playground:", a[1])

full array: [[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]]
first row: [1 2]
first column values from all rows: [ 1  3  5  7  9 11]
second column, second row: 4
more complex value pulling: [5 7]
Playground: [3 4]
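
The slicing shown above extends to a few other common patterns. The following is a minimal sketch, not part of the original notebook: it rebuilds the same array `a` and tries negative indices, a step value, and a boolean mask. The variable names `last_row`, `every_other_row`, and `big_rows` are made up for illustration.

```python
import numpy as np

# Same 6x2 array as in the cell above
a = np.array([[1, 2],
              [3, 4],
              [5, 6],
              [7, 8],
              [9, 10],
              [11, 12]])

# Negative indices count from the end of the axis
last_row = a[-1]

# Slices take the form start:stop:step; here we take every second row
every_other_row = a[::2]

# A boolean mask keeps only the rows where the condition is True
big_rows = a[a[:, 0] > 5]

print("shape (rows, columns):", a.shape)
print("last row:", last_row)
print("second column of every other row:", every_other_row[:, 1])
print("rows whose first column is > 5:", big_rows.tolist())
```

Try changing the step or the mask condition in the Playground line to see how the result changes.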


![pandas](images/logo_pandas.png)
- [docs](https://pandas.pydata.org/pandas-docs/stable/)
- powerful data analysis and manipulation
- makes data into something like a spreadsheet

In [4]:
# Let's create a DataFrame with pandas, which has more advanced utility functions built in
df = pd.DataFrame(a)
# with column names for ease of use
df.columns = ['Feature 1','Feature 2']

# ** Note: Jupyter 'pretty prints' the LAST expression in a cell without needing print()
# Anything before it must be wrapped in print() to be shown

print(type(df.values)) # <--- this gets printed
df.values              # <--- but this DOESN'T get printed
df                     # <--- and this does (last expression in the cell)

<class 'numpy.ndarray'>


|   | Feature 1 | Feature 2 |
|---|-----------|-----------|
| 0 | 1         | 2         |
| 1 | 3         | 4         |
| 2 | 5         | 6         |
| 3 | 7         | 8         |
| 4 | 9         | 10        |
| 5 | 11        | 12        |
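
To give a feel for the "spreadsheet-like" utilities, here is a small sketch that is not from the original notebook. It rebuilds the same DataFrame and tries column selection, positional row access, filtering, and a column-wise mean. The names `first_feature`, `third_row`, `filtered`, and `column_means` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the same data as above so this cell runs on its own
a = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
df = pd.DataFrame(a, columns=['Feature 1', 'Feature 2'])

# Select a single column by name (returns a pandas Series)
first_feature = df['Feature 1']

# Select a row by position with .iloc (zero-based, like NumPy)
third_row = df.iloc[2]

# Keep only the rows where a condition holds
filtered = df[df['Feature 2'] > 6]

# Aggregate each column
column_means = df.mean()

print(first_feature.tolist())
print(third_row.tolist())
print(filtered)
print(column_means)
```

Each of these returns a new pandas object, so the original `df` is left unchanged.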


![matplotlib](images/logo_matplotlib.png)

- [docs](https://matplotlib.org/contents.html)
- powerful data visualization
- interactive in IPython/Jupyter notebooks

In [11]:
# multiple plots can be created and shown by giving the plots a figure number
plt.figure(1)
# generate some random data (10,000 numbers between 0 and 1)
x = np.random.rand(10000)
# create a histogram, placing the values in x into 100 buckets
plt.hist(x, 100)
# render it
plt.show()
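
If you want to inspect the bucket counts without drawing anything, `np.histogram` performs the same binning that `plt.hist` relies on. The sketch below is an addition to the notebook; the fixed seed is an assumption made only so that reruns produce the same numbers.

```python
import numpy as np

# Assumption: seed the generator so the result is repeatable
np.random.seed(42)

# 10,000 uniform random numbers in the range [0, 1)
x = np.random.rand(10000)

# Count how many values fall into each of 100 equal-width buckets
counts, edges = np.histogram(x, bins=100)

print("number of buckets:", len(counts))
print("number of bucket edges:", len(edges))   # always one more than the buckets
print("total values counted:", counts.sum())
print("smallest / largest bucket:", counts.min(), counts.max())
```

Because the data is uniform, each bucket should hold roughly 100 values, which is why the plotted histogram looks approximately flat.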

In [6]:
# Use the '%matplotlib' magic to have IPython load matplotlib in interactive mode
%matplotlib notebook

In [7]:
# interactive scatterplot
N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2  # 0 to 15 point radii

plt.figure(1)
plt.scatter(x, y, s=area, c=colors, alpha=0.5)
plt.show()
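
The `s` argument of `plt.scatter` is a marker area in points squared, which is why the cell above squares a radius and multiplies it by pi. The following sketch (an addition, with a seed assumed only for repeatability) separates the radius from the area to make that relationship explicit. The names `radii` and `max_possible_area` are illustrative.

```python
import numpy as np

# Assumption: seed the generator so the result is repeatable
np.random.seed(0)

N = 50
radii = 15 * np.random.rand(N)   # marker radii in points, in the range [0, 15)
area = np.pi * radii**2          # circle area in points^2, suitable for `s=`

# The largest area any marker could have with a 15-point radius
max_possible_area = np.pi * 15**2

print("largest radius (points):", radii.max())
print("largest area (points^2):", area.max())
print("upper bound on area:", max_possible_area)
```

Passing `s=area` to `plt.scatter` then draws markers whose radii vary between 0 and 15 points.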


In [12]:
# Use the '%who' magic to list the variables currently in memory
%who

N	 a	 area	 colors	 df	 matplotlib	 np	 pd	 plt	 
sklearn	 x	 y	 


See [Magic Commands Docs](http://ipython.readthedocs.io/en/stable/interactive/magics.html)

![scikit-learn](images/logo_scikit.png)
- [docs](http://scikit-learn.org/stable/documentation.html)
- complete machine learning toolkit
- clustering tools
- neural networks
- experimental data

### ...We'll Get to This

## But First

Let's continue to [Demystifying ML Terms](/notebooks/02%20-%20Demistifying%20ML%20Terms.ipynb)