# T81-558: Applications of Deep Neural Networks
**Washington University in St. Louis**

Instructor: [Jeff Heaton](http://sites.wustl.edu/jeffheaton)

# Class 1: Python for Machine Learning

# Course Description

Deep learning is a group of exciting new technologies for neural networks. By using a combination of advanced training techniques and neural network architectural components, it is now possible to train neural networks of much greater complexity. This course will introduce the student to deep belief neural networks, rectified linear units (ReLU), convolutional neural networks, and recurrent neural networks. High performance computing (HPC) aspects will demonstrate how deep learning can be leveraged both on graphics processing units (GPUs) and on grids. Deep learning allows a model to learn hierarchies of information in a way that is similar to the function of the human brain. Focus will be primarily upon the application of deep learning, with some introduction to the mathematical foundations of deep learning. Students will use the Python programming language to architect a deep learning model for several real-world data sets and interpret the results of these networks.

# Assignments

Your grade will be calculated according to the following assignments:

Assignment          |Weight|Title
--------------------|------|-------
Class Participation |   10%|Class attendance and participation
Program 1           |   10%|Python for data science
Program 2           |   10%|TensorFlow for classification
Program 3           |   10%|Time series with TensorFlow
Program 4           |   10%|Computer vision with TensorFlow
Mid Term            |   20%|Understanding of deep learning and TensorFlow
Final Project       |   30%|Adapt deep learning to a past Kaggle competition

# Course Book
The following book will be used to supplement in-class discussion.  Internet resources and papers will augment the text with the latest research.

![Artificial Intelligence for Humans, Volume 3: Neural Networks and Deep Learning](http://www.heatonresearch.com/images/books/1505714346-sm.jpg "AIFH Vol3: Deep Learning")

Heaton, J. (2015). *Deep learning and neural networks* (Vol. 3, Artificial Intelligence for Humans). St. Louis, MO: Heaton Research.

**You do not need the other two books in the series.**

# Jeff Heaton
I will be your instructor for this course.  A brief summary of my credentials is given here:

* Master of Information Management (MIM), Washington University in St. Louis, MO
* PhD (candidate) in Computer Science, Nova Southeastern University in Ft. Lauderdale, FL
* Senior Data Scientist, Reinsurance Group of America (RGA)
* Senior Member, IEEE
* jtheaton at domain name of this university
* Other industry certifications: FLMI, ARA, ACS

Social media:

* [Homepage](http://www.heatonresearch.com) - My home page.  Includes my research interests and publications.
* [LinkedIn](https://www.linkedin.com/in/jeffheaton) - My LinkedIn profile, feel free to connect.
* [Twitter](https://twitter.com/jeffheaton) - My Twitter feed.
* [Google Scholar](https://scholar.google.com/citations?user=1jPGeg4AAAAJ&hl=en) - My citations on Google Scholar.
* [ResearchGate](https://www.researchgate.net/profile/Jeff_Heaton) - My profile/research at ResearchGate.
* [Others](http://www.heatonresearch.com/about/) - About me and other social media sites that I am a member of.

# Course Resources

* [IBM Data Scientist Workbench](https://www.datascientistworkbench.com) - Free web-based platform that includes Python, Jupyter Notebooks, and TensorFlow.  No setup needed.
* [Python Anaconda](https://www.continuum.io/downloads) - Python distribution that includes many data science packages, such as NumPy, SciPy, Scikit-Learn, Pandas, and much more.
* [Jupyter Notebooks](http://jupyter.org/) - Easy-to-use environment that combines Python, graphics, and text.
* [TensorFlow](https://www.tensorflow.org/) - Google's mathematics package for deep learning.
* [Kaggle](https://www.kaggle.com/) - Competitive data science.  Good source of sample data.
* [Course GitHub Repository](https://github.com/jeffheaton/t81_558_deep_learning) - All of the course notebooks will be published here.

# What is Deep Learning

This class focuses upon deep learning, a very popular type of machine learning that is based upon the original neural networks popularized in the 1980s. A deep neural network is nothing more than a neural network with many layers, and there is very little difference between how a deep neural network is calculated compared with the original neural network.  We have always been able to create and calculate deep neural networks; what we lacked was an effective means of training them.  Deep learning provides an efficient means to train deep neural networks.
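
To illustrate that a deep network is calculated the same way as a shallow one, here is a minimal sketch of a forward pass in plain Python. The function names, the sigmoid activation, and the hand-picked weights are assumptions made for this illustration; they are not taken from the course materials or from TensorFlow.

```python
import math

def sigmoid(x):
    """Logistic activation, chosen here only for illustration."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of the inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Feed the inputs through every layer in order.

    Depth is simply the length of `layers`; the calculation per layer
    is identical for shallow and deep networks.
    """
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical weights: one layer (shallow) versus three layers (deeper).
shallow = [([[0.5, -0.5]], [0.0])]                 # 2 inputs -> 1 output
deep = [
    ([[0.1, 0.2], [0.3, 0.4]], [0.0, 0.0]),        # 2 -> 2
    ([[0.5, -0.5], [0.2, 0.1]], [0.0, 0.0]),       # 2 -> 2
    ([[1.0, -1.0]], [0.0]),                        # 2 -> 1
]

print(network_forward([1.0, 2.0], shallow))
print(network_forward([1.0, 2.0], deep))
```

The sketch only calculates outputs. Finding good weights (training) is the hard part that deep learning techniques address.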

## What is Machine Learning

If deep learning is a type of machine learning, this raises the question, "What is machine learning?"  The following diagram illustrates how machine learning differs from traditional software development.

![ML vs Traditional Software Development](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/class_1_ml_vs_trad.png "Machine Learning vs Traditional Software Development")

* **Traditional Software Development** - Programmers create programs that specify how to transform input into the desired output.
* **Machine Learning** - Programmers create models that can learn to produce the desired output for given input. This learning fills the traditional role of the computer program. 
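
The contrast between the two approaches can be sketched with a toy problem: converting Celsius to Fahrenheit. In the traditional version the programmer writes the rule. In the learned version a straight line is fitted to example input/output pairs by least squares. The function names and the choice of a linear model are assumptions for this illustration only; the models used later in the class are far more flexible.

```python
def fahrenheit_traditional(celsius):
    """Traditional development: the programmer specifies the transformation."""
    return celsius * 9.0 / 5.0 + 32.0

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept from example pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Machine learning: only examples of input and desired output are given.
examples_c = [-40.0, 0.0, 37.0, 100.0]
examples_f = [-40.0, 32.0, 98.6, 212.0]
slope, intercept = fit_line(examples_c, examples_f)

def fahrenheit_learned(celsius):
    """The learned parameters fill the role of the hand-written rule."""
    return slope * celsius + intercept

print(fahrenheit_traditional(25.0))
print(fahrenheit_learned(25.0))
```

Both functions give nearly the same answer, but only the first required the programmer to know the conversion formula.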

Researchers have applied machine learning to many different areas.  This class will explore three specific domains for the application of deep neural networks:

![Application of Machine Learning](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/class_1_ml_types.png "Application of Machine Learning")

* **Low-Dimension Predictive Modeling** - Several named input values are used to predict another named value that becomes the output.  For example, using four measurements of iris flowers to predict the species.  Neural networks are not always the best choice for low-dimension predictive modeling.
* **Computer Vision** - The use of machine learning to detect patterns in visual data. Neural networks are a good choice for computer vision.
* **Time Series** - The use of machine learning to detect patterns in time.  Common applications of time series are: financial applications, speech recognition, and even natural language processing (NLP).  Recurrent neural networks are a great choice for time series.

## What are Neural Networks

Neural networks are one of the earliest types of machine learning model.  Neural networks were originally introduced in the 1940s and have risen and fallen [several times in popularity](http://hushmagazine.ca/living-2/business/the-believers-the-hidden-story-behind-the-code-that-runs-our-lives). Four researchers have contributed greatly to the development of neural networks through their ups and downs:

![Neural Network Luminaries](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/class_1_luminaries_ann.png "Neural Network Luminaries")

The current luminaries of artificial neural network (ANN) research and ultimately deep learning, in the order they appear in the picture above:

* [Yann LeCun](http://yann.lecun.com/), Facebook and New York University - Optical character recognition and computer vision using convolutional neural networks (CNN).  The founding father of convolutional nets.
* [Geoffrey Hinton](http://www.cs.toronto.edu/~hinton/), Google and University of Toronto - Extensive work on neural networks. Creator of deep learning and early adopter/creator of backpropagation for neural networks.
* [Yoshua Bengio](http://www.iro.umontreal.ca/~bengioy/yoshua_en/index.html), University of Montreal - Extensive research into deep learning, neural networks, and machine learning.  He has so far remained completely in academia.
* [Andrew Ng](http://www.andrewng.org/), Baidu and Stanford University - Extensive research into deep learning, neural networks, and application to robotics.

## Why Deep Learning?

For predictive modeling, neural networks are not that different from other models, such as:

* Support Vector Machines
* Random Forests
* Gradient Boosted Machines

Like these other models, neural networks can perform both **classification** and **regression**.  When applied to relatively low-dimensional predictive modeling tasks, deep neural networks do not necessarily add significant accuracy over other model types.  Andrew Ng describes the advantage of deep neural networks over traditional model types as follows:

![Why Deep Learning?](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/class_1_why_deep.png "Why Deep Learning")

Neural networks also have two additional significant advantages over other machine learning models:



# Software Installation
This is a technical class.  You will need to be able to write and execute Python code that makes use of TensorFlow for deep learning. There are two options available to you to accomplish this:

* Use IBM Data Scientist Workbench online
* Install Python, TensorFlow and some IDE (Jupyter, PyCharm, etc.)

## Using IBM Data Scientist Workbench

![DSWB Logo](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/images/datascientistworkbenchlogo2-2.png?raw=true "DSWB Logo")

This option allows you to skip any issues associated with installing Python and TensorFlow on your machine. Installing Python is relatively easy.  However, TensorFlow has specific instructions for Windows, Linux and Mac. It is straightforward to install TensorFlow onto a Mac or Linux.  Windows is an entirely different prospect, as Google does not offer specific support for Windows at this time.

The IBM Data Scientist Workbench is a web site that provides you with your own environment from which to run Jupyter notebooks.  There is nothing proprietary about the workbench; the same code that runs on the IBM system will also run on your local computer. I will be using the Data Scientist Workbench for many of the examples during class.  To make use of this website you will need to register at the following URL:

* [Data Scientist Workbench](https://datascientistworkbench.com/)

When you first sign up, it will take the workbench some time to set up your environment; this could easily take 30 minutes or more.  While your environment is being set up, you will see a cute icon of a dog chasing his tail.

Upon logging into the workbench, you will see a welcome screen similar to the following:

![DSWB Data](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/dswb_home.png "IBM Data Scientist Workbench")

You will primarily make use of the "My Data" and "Jupyter Notebook" buttons on the above page. Clicking "My Data" will reveal all data that is currently held by your account.  This includes both CSV data files, as well as any Jupyter notebooks you might have loaded or created.

![IBM Data Science Workbench](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/dswb_data.png "DSWB Data")

Clicking "Jupyter Notebook" will start Jupyter Notebook.  This allows you to choose which notebook you would like to work with.  If you downloaded a notebook from my GitHub site you can simply drag the **.ipynb** file to the web browser.  You can also choose to create a new Jupyter notebook that you can later download.  The following screen capture shows Jupyter notebook running in Data Scientist Workbench.

![DSWB Jupyter Notebook](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/dswb_jupyter.png "DSWB Jupyter Notebook")


## Installing Python and TensorFlow

It is also possible to install and run Python/TensorFlow entirely from your own computer.  This will be somewhat difficult for Microsoft Windows, as Google has not yet added official support for TensorFlow on Windows.  Official support is currently only provided for Mac and Linux.

The first step is to install Python 3.x.  I recommend using the Anaconda release of Python, as it already includes many of the data science related packages that will be needed by this class.  Anaconda directly supports: Windows, Mac and Linux.  Download Anaconda from the following URL:

* [Anaconda](https://www.continuum.io/downloads)

Once Anaconda has been installed, you can install Jupyter notebooks with the following command:

```
conda install jupyter
```

Once Jupyter is installed, it is started with the following command:

```
jupyter notebook
```
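
After installation, a quick check like the following sketch can confirm which packages Python is able to find. The helper name `missing_packages` and the exact package list are assumptions for this illustration; `sklearn` is the import name of Scikit-Learn.

```python
import importlib.util

def missing_packages(names):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Assumed list of top-level import names used in this class.
course_packages = ["numpy", "scipy", "pandas", "sklearn", "matplotlib", "tensorflow"]

missing = missing_packages(course_packages)
if missing:
    print("Not installed: {}".format(", ".join(missing)))
else:
    print("All course packages were found.")
```

Run it from a notebook cell or from the command line with the same Python interpreter that Jupyter uses.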


# Python Introduction


* [Anaconda v3.5](https://www.continuum.io/downloads) Scientific Python Distribution, including:
    * [Scikit-Learn](http://scikit-learn.org/)
    * [Pandas](http://pandas.pydata.org/)
    * Others: csv, json, numpy, scipy
* [Jupyter Notebooks](http://jupyter.readthedocs.io/en/latest/install.html)
* [PyCharm IDE](https://www.jetbrains.com/pycharm/)
* [Cx_Oracle](http://cx-oracle.sourceforge.net/)
* [MatPlotLib](http://matplotlib.org/)

## Jupyter Notebooks

Space matters in Python: indent code to define blocks.

Jupyter Notebooks allow Python and Markdown to coexist.
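
As a small sketch of how indentation defines blocks, the loop below uses nothing but whitespace to mark where the loop body and the `if`/`else` branches begin and end. The variable names are chosen only for this illustration.

```python
labels = []
for x in range(3):
    # Indented one level: this is the loop body and runs once per value of x.
    if x % 2 == 0:
        # Indented two levels: this runs only when the condition is true.
        labels.append("even")
    else:
        labels.append("odd")

# Back at the left margin: the loop has ended, so this runs exactly once.
print(labels)
```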

Even $\LaTeX$:

$ f'(x) = \lim_{h\to0} \frac{f(x+h) - f(x)}{h}. $

## Python Versions

* If you see `xrange` instead of `range`, you are dealing with Python 2
* If you see `print x` instead of `print(x)`, you are dealing with Python 2 
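
If you are unsure which version is running, the interpreter can report it. This is a minimal sketch; the helper name `is_python3` is an assumption made for the example.

```python
import sys

def is_python3(version_info=sys.version_info):
    """Return True when the major version number is 3 or higher."""
    return version_info[0] >= 3

print("Running Python {}.{}".format(sys.version_info[0], sys.version_info[1]))
if not is_python3():
    print("This class uses Python 3.x; please switch interpreters.")
```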

## Count to 10 in Python

Use a `for` loop and a `range`.

In [5]:
# Python cares about space!  No curly braces.
# range(1,11) stops before 11, so this counts from 1 to 10.
for x in range(1,11):  # If you ever see xrange, you are in Python 2
    print(x)  # If you ever see print x (no parentheses), you are in Python 2

1
2
3
4
5
6
7
8
9
10


## Printing Numbers and Strings

In [3]:
total = 0  # Named total rather than sum to avoid hiding the built-in sum() function
for x in range(1,10):
    total += x
    print("Adding {}, sum so far is {}".format(x,total))
    
print("Final sum: {}".format(total))

Adding 1, sum so far is 1
Adding 2, sum so far is 3
Adding 3, sum so far is 6
Adding 4, sum so far is 10
Adding 5, sum so far is 15
Adding 6, sum so far is 21
Adding 7, sum so far is 28
Adding 8, sum so far is 36
Adding 9, sum so far is 45
Final sum: 45


## Lists & Sets

In [123]:
c = ['a', 'b', 'c', 'd']
print(c)

['a', 'b', 'c', 'd']


In [7]:
# Iterate over a collection.
for s in c:
    print(s)

a
b
c
d


In [124]:
# Iterate over a collection, and know your index.  (Python is zero-based!)
# The loop variable is named s so that it does not overwrite the list c.
for i,s in enumerate(c):
    print("{}:{}".format(i,s))

0:a
1:b
2:c
3:d


In [21]:
# Manually add items, lists allow duplicates
c = []
c.append('a')
c.append('b')
c.append('c')
c.append('c')
print(c)

['a', 'b', 'c', 'c']


In [19]:
# Manually add items, sets do not allow duplicates
# Sets add, lists append.  I find this annoying.
c = set()
c.add('a')
c.add('b')
c.add('c')
c.add('c')
print(c)

{'c', 'a', 'b'}


In [22]:
# Insert
c = ['a','b','c']
c.insert(0,'a0')
print(c)
# Remove
c.remove('b')
print(c)
# Remove at index
del c[0]
print(c)

['a0', 'a', 'b', 'c']
['a0', 'a', 'c']
['a', 'c']


## Maps/Dictionaries/Hash Tables

In [24]:
map = { 'name': "Jeff", 'address':"123 Main"}
print(map)
print(map['name'])

if 'name' in map:
    print("Name is defined")
    
if 'age' in map:
    print("age defined")
else:
    print("age undefined")

{'name': 'Jeff', 'address': '123 Main'}
Jeff
Name is defined
age undefined


In [26]:
map = { 'name': "Jeff", 'address':"123 Main"}
# All of the keys
print("Keys: {}".format(map.keys()))

# All of the values
print("Values: {}".format(map.values()))

Keys: dict_keys(['name', 'address'])
Values: dict_values(['Jeff', '123 Main'])


## Files

In [38]:
# Read a raw text file (avoid this)
import codecs
import os

path = os.getcwd()

# Always specify your encoding! There is no such thing as "it's just a text file".
# See... http://www.joelonsoftware.com/articles/Unicode.html
# Also see... http://www.utf8everywhere.org/
encoding = 'utf-8'
filename = os.path.join(path,"auto-mpg.csv")

c = 0

with codecs.open(filename, "r", encoding) as fh:
    # Iterate over this line by line...
    for line in fh:
        c+=1 # Only the first 5 lines
        if c>5: break
        print(line.strip())

mpg,cylinders,displacement,horsepower,weight,acceleration,year,origin,name
18,8,307,130,3504,12,70,1,chevrolet chevelle malibu
15,8,350,165,3693,11.5,70,1,buick skylark 320
18,8,318,150,3436,11,70,1,plymouth satellite
16,8,304,150,3433,12,70,1,amc rebel sst


In [127]:
# Read a CSV file
import codecs
import os
import csv

path = os.getcwd()

encoding = 'utf-8'
filename = os.path.join(path,"auto-mpg.csv")

c = 0

with codecs.open(filename, "r", encoding) as fh:
    reader = csv.reader(fh)
    for row in reader:
        c+=1
        if c>5: break
        print(row)


['mpg', 'cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'year', 'origin', 'name']
['18', '8', '307', '130', '3504', '12', '70', '1', 'chevrolet chevelle malibu']
['15', '8', '350', '165', '3693', '11.5', '70', '1', 'buick skylark 320']
['18', '8', '318', '150', '3436', '11', '70', '1', 'plymouth satellite']
['16', '8', '304', '150', '3433', '12', '70', '1', 'amc rebel sst']


In [47]:
# Read a CSV, symbolic headers
import codecs
import os
import csv

path = os.getcwd()

encoding = 'utf-8'
filename = os.path.join(path,"auto-mpg.csv")

c = 0

with codecs.open(filename, "r", encoding) as fh:
    reader = csv.reader(fh)

    # Generate header index using comprehension: maps each column name to its position.
    # Comprehension is cool, but not necessarily a beginner's feature of Python.
    header_idx = {key: value for (value, key) in enumerate(next(reader))}
    
    for row in reader:
        c+=1
        if c>5: break
        print( "Car Name: {}".format(row[header_idx['name']]))


Car Name: chevrolet chevelle malibu
Car Name: buick skylark 320
Car Name: plymouth satellite
Car Name: amc rebel sst
Car Name: ford torino
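
As an alternative to building the header index by hand, the standard library's `csv.DictReader` reads the header row itself and returns each row as a dictionary keyed by column name. The sketch below uses an in-memory sample (the header and the first two data rows shown in the output above) so that it runs without the `auto-mpg.csv` file; the variable names are assumptions for this illustration.

```python
import csv
import io

# In-memory stand-in for auto-mpg.csv: the header plus the first two rows.
sample = io.StringIO(
    "mpg,cylinders,displacement,horsepower,weight,acceleration,year,origin,name\n"
    "18,8,307,130,3504,12,70,1,chevrolet chevelle malibu\n"
    "15,8,350,165,3693,11.5,70,1,buick skylark 320\n"
)

# DictReader consumes the first row as the field names.
reader = csv.DictReader(sample)
names = [row["name"] for row in reader]

for name in names:
    print("Car Name: {}".format(name))
```

With a real file, the `StringIO` object would be replaced by the open file handle.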


In [128]:
# Read a CSV, manual stats
import codecs
import os
import csv
import math

path = os.getcwd()

encoding = 'utf-8'
filename_read = os.path.join(path,"auto-mpg.csv")

with codecs.open(filename_read, "r", encoding) as fh:
    reader = csv.reader(fh)

    # Generate header index using comprehension.
    # Comprehension is cool, but not necessarily a beginner's feature of Python.
    header_idx = {key: value for (value, key) in enumerate(next(reader))}
    headers = header_idx.keys()
    
    # One statistics record per column
    fields = {key: {'count':0,'sum':0,'variance':0} for key in headers}
    
    # Pass 1, means
    row_count = 0
    for row in reader:
        row_count += 1
        for name in headers:
            try:
                value = float(row[header_idx[name]])
                field = fields[name]
                field['count'] += 1
                field['sum'] += value
            except ValueError:
                pass
    
    # Calculate means, toss sums (part of pass 1)
    for field in fields.values():
        # If 90% are not missing (or non-numeric) calculate a mean
        if (field['count']/row_count)>0.9:
            field['mean'] = field['sum'] / field['count']
            del field['sum']
    
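    # Pass 2, standard deviation & variance
    # Rewind the file.  The header row is read again here, but float() fails
    # on its text values, so the ValueError handler below skips it.
    fh.seek(0)
    for row in reader: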
        for name in headers:
            try:
                value = float(row[header_idx[name]])
                field = fields[name]
                # If we failed to calculate a mean, no variance.
                if 'mean' in field:
                    field['variance'] += (value - field['mean'])**2
            except ValueError:
                pass
            
    # Calculate standard deviation, keep variance (part of pass 2)
    for field in fields.values():
        # If no variance, then no standard deviation
        if 'mean' in field:
            field['variance'] /= field['count']
            field['sdev'] = math.sqrt(field['variance'])
        else:
            del field['variance']
    
    # Print summary stats
    for key in sorted(fields.keys()):
        print("{}:{}".format(key,fields[key]))
        
        


acceleration:{'sdev': 2.7542223175940177, 'variance': 7.585740574732961, 'count': 398, 'mean': 15.568090452261291}
cylinders:{'sdev': 1.698865960539558, 'variance': 2.8861455518799946, 'count': 398, 'mean': 5.454773869346734}
displacement:{'sdev': 104.13876352708563, 'variance': 10844.882068950259, 'count': 398, 'mean': 193.42587939698493}
horsepower:{'sdev': 38.442032714425984, 'variance': 1477.7898792169979, 'count': 392, 'mean': 104.46938775510205}
mpg:{'sdev': 7.806159061274433, 'variance': 60.93611928991693, 'count': 398, 'mean': 23.514572864321615}
name:{'sum': 0, 'count': 0}
origin:{'sdev': 0.801046637381194, 'variance': 0.6416757152597181, 'count': 398, 'mean': 1.5728643216080402}
weight:{'sdev': 845.7772335198177, 'variance': 715339.1287404363, 'count': 398, 'mean': 2970.424623115578}
year:{'sdev': 3.6929784655780975, 'variance': 13.638089947223559, 'count': 398, 'mean': 76.01005025125629}


In [92]:
# Python list & map structures
customers = [
    {'name': 'Jeff & Tracy Heaton', 'pets': ['Wynton','Cricket']},
    {'name': 'John Smith', 'pets': ['rover']},
    {'name': 'Jane Doe'}
]

print(customers)

for customer in customers:
    print("{}:{}".format(customer['name'],customer.get('pets','no pets')))

[{'name': 'Jeff & Tracy Heaton', 'pets': ['Wynton', 'Cricket']}, {'name': 'John Smith', 'pets': ['rover']}, {'name': 'Jane Doe'}]
Jeff & Tracy Heaton:['Wynton', 'Cricket']
John Smith:['rover']
Jane Doe:no pets


## Pandas

In [96]:
# Simple dataframe
import os
import pandas as pd

path = os.getcwd()

filename_read = os.path.join(path,"auto-mpg.csv")
df = pd.read_csv(filename_read)
print(df[0:5])

   mpg  cylinders  displacement horsepower  weight  acceleration  year  \
0   18          8           307        130    3504          12.0    70   
1   15          8           350        165    3693          11.5    70   
2   18          8           318        150    3436          11.0    70   
3   16          8           304        150    3433          12.0    70   
4   17          8           302        140    3449          10.5    70   

   origin                       name  
0       1  chevrolet chevelle malibu  
1       1          buick skylark 320  
2       1         plymouth satellite  
3       1              amc rebel sst  
4       1                ford torino  


In [122]:
# Summary statistics with a dataframe
import os
import pandas as pd

path = os.getcwd()

filename_read = os.path.join(path,"auto-mpg.csv")
df = pd.read_csv(filename_read,na_values=['NA','?'])

# Strip non-numerics ('number' matches every integer and float column)
df = df.select_dtypes(include=['number'])

headers = list(df.columns.values)
fields = []

# Only numeric columns remain, so no numeric_only argument is needed.
for field in headers:
    fields.append( {
        'name' : field,
        'mean': df[field].mean(),
        'var': df[field].var(),
        'sdev': df[field].std()
    })
    
for field in fields:
    print(field)

{'name': 'mpg', 'sdev': 7.8159843125657922, 'mean': 23.514572864321607, 'var': 61.089610774274554}
{'name': 'cylinders', 'sdev': 1.7010042445332123, 'mean': 5.4547738693467336, 'var': 2.8934154399200041}
{'name': 'displacement', 'sdev': 104.26983817119591, 'mean': 193.42587939698493, 'var': 10872.199152247384}
{'name': 'horsepower', 'sdev': 38.491159932828488, 'mean': 104.46938775510205, 'var': 1481.5693929745814}
{'name': 'weight', 'sdev': 846.84177419732646, 'mean': 2970.424623115578, 'var': 717140.9905256757}
{'name': 'acceleration', 'sdev': 2.757688929812677, 'mean': 15.568090452261307, 'var': 7.6048482336113885}
{'name': 'year', 'sdev': 3.6976266467326111, 'mean': 76.010050251256288, 'var': 13.672442818627053}
{'name': 'origin', 'sdev': 0.80205487772661477, 'mean': 1.5728643216080402, 'var': 0.64329202688505494}
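
The variance and standard deviation reported by Pandas are slightly larger than the values from the manual two-pass calculation earlier. The manual code divides the summed squared differences by the count (population variance), while the Pandas `var()` and `std()` methods default to `ddof=1`, which divides by the count minus one (sample variance). The sketch below shows the difference on a handful of values (the first five `mpg` values from the dataframe output above) using only the standard library; the variable names are assumptions for this illustration.

```python
import statistics

values = [18.0, 15.0, 18.0, 16.0, 17.0]  # first five mpg values shown earlier
n = len(values)
mean = sum(values) / n
squared_error = sum((v - mean) ** 2 for v in values)

# Divide by n: what the manual two-pass code computes.
population_variance = squared_error / n
# Divide by n - 1: what Pandas computes by default (ddof=1).
sample_variance = squared_error / (n - 1)

print("population:", population_variance, statistics.pvariance(values))
print("sample:    ", sample_variance, statistics.variance(values))
```

Passing `ddof=0` to the Pandas `var()` or `std()` methods requests the population form instead.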
