## Distributed Deep Learning on CPU-GPU

### What is Deep Learning?
Deep learning is a machine learning technique that teaches computers to do what comes naturally to humans: learn by example. Deep learning is a key technology behind driverless cars, enabling them to recognize a stop sign, or to distinguish a pedestrian from a lamppost. It is the key to voice control in consumer devices like phones, tablets, TVs, and hands-free speakers. Deep learning is getting lots of attention lately and for good reason. It’s achieving results that were not possible before.

In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound. Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding human-level performance. Models are trained by using a large set of labeled data and neural network architectures that contain many layers.

### Awesome! But in more detail?
Deep learning is a subfield of machine learning, which is itself a subfield of artificial intelligence. It is concerned with algorithms, called artificial neural networks, that are inspired by the structure and function of the brain.
The *Deep* stands for *Deep Neural Network*.
In other words, deep learning is the area of machine learning in which neural networks are composed of *stacked layers of neurons of different types*.
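
To make "stacked layers" concrete, here is a minimal NumPy sketch of a forward pass through three dense layers. The layer sizes, the random weights and the helper names (`relu`, `sigmoid`, `forward`) are assumptions chosen for illustration; this is not how Keras implements its layers internally.

```python
import numpy as np

def relu(x):
    # Keep positive values, replace negative values with 0
    return np.maximum(x, 0.0)

def sigmoid(x):
    # Squash any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, layers):
    """Pass x through a stack of (weights, bias, activation) layers."""
    for weights, bias, activation in layers:
        x = activation(x @ weights + bias)
    return x

rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 8)), np.zeros(8), relu),     # 4 inputs -> 8 neurons
    (rng.normal(size=(8, 3)), np.zeros(3), relu),     # 8 -> 3 neurons
    (rng.normal(size=(3, 1)), np.zeros(1), sigmoid),  # 3 -> 1 output neuron
]

batch = rng.normal(size=(5, 4))  # 5 samples with 4 features each
output = forward(batch, layers)
print(output.shape)              # (5, 1)
```

Each layer only needs the output of the layer before it, which is what "stacked" means here.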

Because deep learning is inspired by the brain, deep neural networks come in different types.
The most common are:

+ Dense Neural Network (an evolution of the MLP classifier)
+ Convolutional Neural Network (mainly used for vision and recognition tasks; they emulate the visual cortex of the brain)
+ Recurrent Neural Network (mainly used for time-series forecasting and for text and audio processing)

Deep neural networks work well on clusters: because of their computational complexity, they are specifically designed for parallel computation (CPU, or GPU with CUDA), and because they are implemented as graphs, they also lend themselves to distributed computing.

Unfortunately, today we will only see how to distribute the computation over a CPU cluster, since there are no GPUs available in the lab.
Anyway, to work with CUDA on your own PC, it is sufficient to install CUDA and the tensorflow-gpu package properly.
Drop me an email if you'd like to install it at home: [stefano.campese@omnys.com](mailto:stefano.campese@omnys.com)



### How to work with Distributed Deep Learning?

To work with clusters and deep learning in a simple way, three packages are necessary:
+ TensorFlow
+ Keras
+ Dask

### What is TensorFlow?

TensorFlow is an open source software library for numerical computation using data-flow graphs. It was originally developed by the Google Brain Team within Google's Machine Intelligence research organization for machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.
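
The idea of a data-flow graph can be sketched in a few lines of plain Python. This is a toy illustration only: the `graph` dictionary and the `evaluate` helper are hypothetical and are not part of the TensorFlow API. Each node names an operation and the nodes it depends on, and a node is computed only after its inputs are available.

```python
import operator

# Each node is (operation, names of input nodes); constants have no inputs.
graph = {
    "a": (lambda: 2.0, []),
    "b": (lambda: 3.0, []),
    "sum": (operator.add, ["a", "b"]),
    "prod": (operator.mul, ["sum", "b"]),
}

def evaluate(graph, name, cache=None):
    """Evaluate a node by first evaluating the nodes it depends on."""
    if cache is None:
        cache = {}
    if name not in cache:
        op, inputs = graph[name]
        args = [evaluate(graph, dep, cache) for dep in inputs]
        cache[name] = op(*args)
    return cache[name]

result = evaluate(graph, "prod")
print(result)  # (2.0 + 3.0) * 3.0 = 15.0
```

Because the dependencies are explicit, nodes that do not depend on each other (here `a` and `b`) could in principle be computed in parallel or on different machines.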

### What is Keras?

Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

Use Keras if you need a deep learning library that:

+ Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility).
+ Supports both convolutional networks and recurrent networks, as well as combinations of the two.
+ Runs seamlessly on CPU and GPU.

OK, let's see how to create a simple distributed deep dense neural network.

In [None]:
from dask.distributed import Client
from keras.models import Sequential
from keras.layers import Dense
from sklearn.metrics import accuracy_score
from dask_ml.datasets import make_classification

client = Client('')

In [None]:
from dask_tensorflow import start_tensorflow
tf_spec, dask_spec = start_tensorflow(client, ps=2, worker=2)




In [None]:
from dask_ml.model_selection import train_test_split

X, y = make_classification(n_samples=6000, n_features=22, random_state=42, chunks=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2)

In [None]:
def build_deep_neural_network():
    # create model
    
    model = Sequential()
    model.add(Dense(60, input_shape=(22,), kernel_initializer='normal', activation='relu')) #layer 1 with 60 neurons
    model.add(Dense(30, kernel_initializer='normal', activation='relu')) #layer 2 with 30 neurons
    model.add(Dense(1, kernel_initializer='normal', activation='sigmoid')) #layer 3 with 1 neuron (binary classification)

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
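
As a sanity check on the network defined above, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard dense-layer layout (one weight per input-unit pair plus one bias per unit); the helper name `dense_params` is introduced here for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [22, 60, 30, 1]  # input features, then units per Dense layer
per_layer = [dense_params(i, o) for i, o in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [1380, 1830, 31]
print(total)      # 3241
```

Calling `model.summary()` on the Keras model is a convenient way to compare these numbers with what Keras reports.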


In [None]:
from sklearn.metrics import accuracy_score
model = build_deep_neural_network()

model.fit(X_train, y_train, verbose=1, epochs=1)
scores = model.evaluate(X_test,y_test, verbose=1)
print("Accuracy: "+str(scores[1]))

Let's have a look at the cluster dashboard! Take a look at the Graph tab!

### Mushrooms recognition: poisonous or edible?

In [None]:
import dask.dataframe as dd

df = dd.read_csv('https://www.dropbox.com/s/7u3lm737ogbqsg8/mushrooms_categorized.csv?dl=1')

df = df.repartition(npartitions=4)

df

In [None]:
print(df.head())

In [None]:
y = df['class'].values.compute()#to_dask_array(lengths=True)
X = df[df.columns.difference(['class'])].values.compute()#.to_dask_array(lengths=True)
print(X[2])

In [None]:
from dask_ml.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2, train_size=0.8)

In [None]:
def build_deep_neural_network():
    # create model
    
    model = Sequential()
    model.add(Dense(32, activation='relu', input_shape=(22,)))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
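
The model above is compiled with `loss='binary_crossentropy'`. As a rough sketch of what that loss measures, the following plain-Python function computes the mean of -[y·log(p) + (1-y)·log(1-p)]. The clipping constant `eps` and the example probabilities are assumptions of this sketch, not values taken from Keras or from the dataset.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident and correct predictions give a small loss ...
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
# ... while confident and wrong predictions are penalised heavily.
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(round(confident_right, 4))  # 0.1054
print(round(confident_wrong, 4))  # 2.3026
```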

In [None]:
from sklearn.metrics import accuracy_score
from keras.layers import Dropout
model = build_deep_neural_network()

model.fit(X_train, y_train, verbose=1, epochs=10, batch_size=64)
print(model.predict(X_test))
scores = model.evaluate(X_test,y_test, verbose=1)
print("Accuracy: "+str(scores[1]))
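
With a sigmoid output, `model.predict` returns probabilities rather than class labels. A minimal sketch of how such probabilities can be turned into labels and scored is shown below; the 0.5 threshold and the example values are illustrative assumptions, not results from the mushroom dataset.

```python
def to_labels(probabilities, threshold=0.5):
    """Map each probability to class 1 if it reaches the threshold, else 0."""
    return [1 if p >= threshold else 0 for p in probabilities]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

probs = [0.92, 0.08, 0.61, 0.40]  # made-up sigmoid outputs
labels = to_labels(probs)
acc = accuracy([1, 0, 0, 0], labels)

print(labels)  # [1, 0, 1, 0]
print(acc)     # 0.75
```

Since `model.predict` returns an array of shape `(n_samples, 1)`, it would need to be flattened (for example with `.ravel()`) before being passed to helpers like these.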

### Dog or Cat?

Let's try to predict whether an image shows a cat or a dog! Let's start with a dense deep network.
First, load the remote dataset.

In [None]:
import dask.dataframe as dd

df = dd.read_csv('https://www.dropbox.com/s/lo8hmhczoo6c7yo/dogs_and_cats.csv?dl=1')

y = df['label'].values.compute()#to_dask_array(lengths=True)
X = df[df.columns.difference(['label'])].values.compute()#.to_dask_array(lengths=True)


Split into train and test sets.

In [None]:
from dask_ml.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2, train_size=0.8)


Let's define the model

In [None]:

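def build_deep_neural_network():
    # create model
    model = Sequential()  # the model must be created before layers are added
    model.add(Dense(1024, activation='relu', input_shape=(12288,)))  # 12288 = 64*64*3 flattened pixels
    model.add(Dense(256, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))  # single output: binary classification
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model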

Train and test the model!

In [None]:
from sklearn.metrics import accuracy_score
from keras.layers import Dropout
model = build_deep_neural_network()

model.fit(X_train, y_train, verbose=1, epochs=10, batch_size=32)
print(model.predict(X_test))
scores = model.evaluate(X_test,y_test, verbose=1)
print("Accuracy: "+str(scores[1]))

The accuracy does not seem good! Let's try a *Convolutional Neural Network*.
Before we start working with this kind of network, we need to reshape the data back to its original form.
Convolutional neural networks, in fact, work on multidimensional data. Since the original dataset stores each dog or cat image as a one-dimensional array, we have to reshape each element to its original form: 64x64x3, where 64x64 is the resolution and 3 stands for the RGB channels (in other words, the image is in colour, not grayscale).

So let's start: reshape the images the right way!

In [None]:
X_train, X_test = X_train.reshape((len(X_train), 64,64, 3)),  X_test.reshape((len(X_test), 64,64, 3))
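
The reshape above can be checked on a small synthetic array. The sketch below uses made-up data (consecutive integers instead of real pixels) and assumes that each row stores the pixels in height, width, channel order; whether the CSV columns actually follow that order is an assumption.

```python
import numpy as np

n_images, height, width, channels = 2, 64, 64, 3

# Two fake "images", each stored as one flat row of 64*64*3 = 12288 values
flat = np.arange(n_images * height * width * channels).reshape(n_images, -1)
print(flat.shape)    # (2, 12288)

# Same call pattern as in the cell above
images = flat.reshape((len(flat), height, width, channels))
print(images.shape)  # (2, 64, 64, 3)

# Flattening again recovers the original rows, so no value is lost or moved
assert np.array_equal(images.reshape(len(images), -1), flat)
```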

In [None]:
from keras.layers import Conv2D
from keras.layers import Flatten
def build_deep_cnn_neural_network():
    # create model
    
    model = Sequential()
    model.add(Conv2D(128, kernel_size=(3,3), name='c', strides=(1, 1), padding='valid', activation='relu', input_shape=(64,64,3)))
    model.add(Flatten())
    model.add(Dense(64, activation='relu', name='d'))
    #model.add(Dropout(0.2))
    model.add(Dense(32, activation='relu', name='d1'))
    #model.add(Dropout(0.2))
    model.add(Dense(1, activation='sigmoid', name='d2'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
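
To get a feeling for the size of the network defined above, the output shape and parameter count of its first layers can be computed by hand. The helper names below are hypothetical; the formulas assume `padding='valid'` and the standard layout of one kernel per filter plus one bias per filter.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial output size of a convolution with 'valid' padding."""
    return (size - kernel) // stride + 1

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """One kernel per filter across all input channels, plus one bias each."""
    return kernel_h * kernel_w * in_channels * filters + filters

side = conv_output_size(64, 3)              # 64x64 input, 3x3 kernel
conv_params = conv2d_params(3, 3, 3, 128)   # RGB input, 128 filters
flattened = side * side * 128               # values produced by Flatten()
first_dense_params = flattened * 64 + 64    # Dense(64) right after Flatten()

print(side)                # 62
print(conv_params)         # 3584
print(flattened)           # 492032
print(first_dense_params)  # 31490112
```

Under these assumptions, almost all of the parameters sit in the first dense layer after `Flatten()`, not in the convolution itself.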

In [None]:
from sklearn.metrics import accuracy_score
from keras.layers import Dropout
model = build_deep_cnn_neural_network()

model.fit(X_train, y_train, verbose=1, epochs=10, batch_size=32)
print(model.predict(X_test))
scores = model.evaluate(X_test,y_test, verbose=1)
print("Accuracy: "+str(scores[1]))

### Exercise 1
Take the first implementation of "dog or cat", scale the features with the MinMaxScaler using a feature range of (-1, 1), and then re-run everything.

In [None]:
import dask.dataframe as dd


df = dd.read_csv('https://www.dropbox.com/s/lo8hmhczoo6c7yo/dogs_and_cats.csv?dl=1')

y = df['label'].values.compute()
X = df[df.columns.difference(['label'])].values.compute()

In [None]:
from dask_ml.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(-1,1)) #put your code here

from dask_ml.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2, train_size=0.8)

scaler = scaler.fit(X_train) #put your code here
X_train = scaler.transform(X_train) #put your code here 
X_test = scaler.transform(X_test) #put your code here
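
What min-max scaling with a (-1, 1) range does to a single feature can be sketched in plain Python. The helper names, the example pixel values and the handling of constant columns are assumptions of this sketch, not the dask-ml implementation.

```python
def fit_min_max(column):
    """Learn the minimum and maximum from the training values only."""
    return min(column), max(column)

def min_max_scale(value, data_min, data_max, feature_range=(-1.0, 1.0)):
    """Map [data_min, data_max] linearly onto feature_range."""
    low, high = feature_range
    if data_max == data_min:
        return low  # assumption of this sketch for constant columns
    unit = (value - data_min) / (data_max - data_min)  # now in [0, 1]
    return unit * (high - low) + low

train_pixels = [0, 64, 255]  # made-up values for one pixel column
data_min, data_max = fit_min_max(train_pixels)
scaled = [min_max_scale(v, data_min, data_max) for v in train_pixels]

print(scaled[0], scaled[-1])  # -1.0 1.0
```

As in the cell above, the minimum and maximum are learned from the training data and then reused unchanged for the test data.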

In [None]:
model = Sequential()
model.add(Dense(1024, activation='relu', input_shape=(12288,)))
model.add(Dense(64, activation='relu', name='d'))
model.add(Dense(32, activation='relu', name='d1'))
model.add(Dense(1, activation='sigmoid', name='d2'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

In [None]:
from sklearn.metrics import accuracy_score


model.fit(X_train, y_train, verbose=1, epochs=10, batch_size=32, validation_split=0.1) #put your code here
scores = model.evaluate(X_test,y_test, verbose=1) #put your code here
print("Accuracy: "+str(scores[1]))

What happened?
Comment on the results.

Try to predict on X_train and see what happens!

In [None]:
scores = model.evaluate(X_train,y_train, verbose=1) #put your code here
print("Accuracy: "+str(scores[1]))

This phenomenon is called overfitting: the model performs much better on the training data than on data it has never seen.
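
An extreme case of overfitting can be reproduced with a toy "model" that simply memorises its training data. The `Memorizer` class and the synthetic data below are hypothetical and deliberately simplistic; they only illustrate the gap between training and test accuracy.

```python
class Memorizer:
    """Stores every training example; answers 0 for anything unseen."""

    def fit(self, X, y):
        self.table = dict(zip(X, y))
        return self

    def predict(self, X):
        return [self.table.get(x, 0) for x in X]

def fraction_correct(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Synthetic data: the label is 1 for odd numbers and 0 for even numbers
toy_X_train = list(range(10))
toy_y_train = [x % 2 for x in toy_X_train]
toy_X_test = list(range(10, 20))
toy_y_test = [x % 2 for x in toy_X_test]

toy_model = Memorizer().fit(toy_X_train, toy_y_train)
train_acc = fraction_correct(toy_y_train, toy_model.predict(toy_X_train))
test_acc = fraction_correct(toy_y_test, toy_model.predict(toy_X_test))

print(train_acc)  # 1.0
print(test_acc)   # 0.5
```

The memoriser is perfect on what it has seen and no better than a constant guess on new data, because it never learned the underlying rule.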

### Exercise 2
Take the second implementation of "dog or cat", double the convolutional layer, scale the features with the MinMaxScaler, and then re-run everything.

In [None]:
from dask_ml.preprocessing import MinMaxScaler

scaler = MinMaxScaler()#put your code here

from dask_ml.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2, train_size=0.8)

scaler = scaler.fit(X_train) #put your code here
X_train = scaler.transform(X_train) #put your code here 
X_test = scaler.transform(X_test) #put your code here

In [None]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), name='c', strides=(1, 1), padding='valid', activation='relu', input_shape=(64,64,3)))
model.add(Conv2D(32, kernel_size=(3,3), name='c1', strides=(1, 1), padding='valid', activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu', name='d'))
#model.add(Dropout(0.2))
model.add(Dense(32, activation='relu', name='d1'))
#model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid', name='d2'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
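
Stacking a second convolution changes the shapes that reach `Flatten()`. The sketch below traces the spatial size through two 3x3 convolutions with `padding='valid'` and stride 1, as used in the model above; the helper name `valid_conv_shape` is introduced here for illustration.

```python
def valid_conv_shape(height, width, kernel=3, stride=1):
    """Output height and width of a convolution with 'valid' padding."""
    return ((height - kernel) // stride + 1, (width - kernel) // stride + 1)

shape = (64, 64)
shapes = []
for _ in range(2):  # two stacked Conv2D layers
    shape = valid_conv_shape(*shape)
    shapes.append(shape)

flattened = shape[0] * shape[1] * 32  # 32 filters in the last Conv2D

print(shapes)     # [(62, 62), (60, 60)]
print(flattened)  # 115200
```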

In [None]:
from sklearn.metrics import accuracy_score
X_train, X_test = X_train.reshape((len(X_train), 64,64, 3)),  X_test.reshape((len(X_test), 64,64, 3))

model.fit(X_train, y_train, verbose=1, epochs=10, batch_size=32, validation_split=0.1) #put your code here
scores = model.evaluate(X_test,y_test, verbose=1) #put your code here
print("Accuracy: "+str(scores[1]))