# Gradient Descent Notebook

### Introduction
In this notebook, we will experiment with scikit-learn's gradient descent models, SGDRegressor and SGDClassifier.

Let's first try them on a dataset we've seen before, the weather data in denverstapleton.csv.
The loading and scaling steps are the same as in earlier notebooks.

In [52]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [9]:
from sklearn.preprocessing import StandardScaler
dtypes = {'DATE': 'str',
          'ACMH': 'float64',
          'ACMH_ATTRIBUTES': 'str',
          'ACSH': 'float64',
          'ACSH_ATTRIBUTES': 'str',
          'PRCP': 'float64',
          'PRCP_ATTRIBUTES': 'str',
          'PSUN': 'float64',
          'PSUN_ATTRIBUTES': 'str',
          'SNOW': 'float64',
          'SNOW_ATTRIBUTES': 'str',
          'TAVG': 'float64',
          'TAVG_ATTRIBUTES': 'str',
          'TMAX': 'float64',
          'TMAX_ATTRIBUTES': 'str',
          'TMIN': 'float64',
          'TMIN_ATTRIBUTES': 'str',
          'TOBS': 'float64',
          'TOBS_ATTRIBUTES': 'str',
          'TSUN': 'float64',
          'TSUN_ATTRIBUTES': 'str'}
data = pd.read_csv('denverstapleton.csv',
                   usecols=dtypes.keys(),
                   dtype=dtypes,
                   parse_dates=['DATE'])
overlap = (data['TSUN'].notna() &
           data['TMAX'].notna() &
           data['PRCP'].notna())
# We must convert the data to numpy arrays
sun = data.loc[overlap, 'TSUN'].to_numpy()
temperature = data.loc[overlap, 'TMAX'].to_numpy()
prcp = data.loc[overlap, 'PRCP'].to_numpy()
date = data.loc[overlap, 'DATE'].to_numpy().astype('O')

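# Standardize the target with its own scaler
# (transformed_sun is kept for reference; the models below are fit on the raw sun values)
target_scaler = StandardScaler()
transformed_sun = target_scaler.fit_transform(sun[:, np.newaxis])

# X is a matrix with one column per feature
X = np.c_[temperature, prcp]

# Standardize the features with a separate scaler, so the two are not confused
feature_scaler = StandardScaler()
transformed_X = feature_scaler.fit_transform(X)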
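
To make the scaling step concrete, here is a minimal sketch of what StandardScaler computes for each column. The values in `toy_X` are made up for illustration and are not taken from the weather file.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy values standing in for the temperature and precipitation columns
toy_X = np.array([[50.0, 0.0],
                  [60.0, 0.1],
                  [70.0, 0.0],
                  [80.0, 0.5]])

toy_scaler = StandardScaler()
toy_scaled = toy_scaler.fit_transform(toy_X)

# The same result by hand: subtract each column's mean, divide by its standard deviation
manual = (toy_X - toy_X.mean(axis=0)) / toy_X.std(axis=0)

print(toy_scaled.mean(axis=0))  # each column mean is (numerically) zero
print(toy_scaled.std(axis=0))   # each column standard deviation is one
print(np.allclose(toy_scaled, manual))
```

Gradient descent tends to behave better when all features are on a similar scale, which is why the scaling matters more for the SGD models than for ordinary least squares.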

For reference, here is the linear model that we used earlier:

In [14]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

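# Fit the model
reg = LinearRegression()
reg.fit(transformed_X, sun)
# Get the predictions
prediction = reg.predict(transformed_X)
# Some predictions are negative, which is not a valid value for sun, so clip them to zero
prediction[prediction < 0] = 0

# Note: score() recomputes the predictions internally, so it does not see the clipping
print(f'score: {reg.score(transformed_X, sun)}')
# mean_squared_error expects (y_true, y_pred)
print(f'MSE: {mean_squared_error(sun, prediction)}')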

score: 0.34537643153583997
MSE: 31019.652227810413
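
The two numbers printed above measure different things: `score` is the coefficient of determination R² (which is what LinearRegression.score returns), while the MSE is the average squared error in the units of the target. As a small sketch, both can be computed by hand and checked against scikit-learn. The `y_true` and `y_pred` arrays below are invented for illustration.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical targets and predictions, not taken from the weather data
y_true = np.array([100.0, 300.0, 500.0, 700.0])
y_pred = np.array([150.0, 250.0, 550.0, 650.0])

# Mean squared error: the average of the squared residuals
mse_by_hand = np.mean((y_true - y_pred) ** 2)

# R^2: one minus (residual sum of squares / total sum of squares around the mean)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_by_hand = 1 - ss_res / ss_tot

print(mse_by_hand, mean_squared_error(y_true, y_pred))
print(r2_by_hand, r2_score(y_true, y_pred))
```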


Now, let's try a model that uses gradient descent.
The model we are going to use is SGDRegressor, which stands for Stochastic Gradient Descent Regressor.


In [16]:
from sklearn.linear_model import SGDRegressor


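# Same workflow as before, but the weights are found by stochastic gradient descent
reg = SGDRegressor()
reg.fit(transformed_X, sun)
# Get the predictions
prediction = reg.predict(transformed_X)
# Some predictions are negative, so clip them to zero
prediction[prediction < 0] = 0

# Note: score() recomputes the predictions internally, so it does not see the clipping
print(f'score: {reg.score(transformed_X, sun)}')
# mean_squared_error expects (y_true, y_pred)
print(f'MSE: {mean_squared_error(sun, prediction)}')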

score: 0.34486042805740735
MSE: 30921.299197452012
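
To see why the two models land on nearly the same answer, it may help to look at the kind of update loop that stochastic gradient descent runs. The following is a simplified sketch in plain NumPy, not scikit-learn's actual implementation: it uses a constant step size and no regularization, whereas SGDRegressor by default adds an L2 penalty and shrinks the learning rate over time. The synthetic data and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: y = 3*x0 - 2*x1 + 5 + small noise
X_toy = rng.normal(size=(500, 2))
y_toy = 3.0 * X_toy[:, 0] - 2.0 * X_toy[:, 1] + 5.0 + rng.normal(scale=0.1, size=500)

w = np.zeros(2)       # weights, one per feature
b = 0.0               # intercept
learning_rate = 0.01  # assumed constant step size

for epoch in range(20):
    # Visit the samples in a fresh random order each epoch
    for i in rng.permutation(len(X_toy)):
        error = X_toy[i] @ w + b - y_toy[i]
        # Gradient of 0.5 * error**2 with respect to w and b, for this one sample
        w -= learning_rate * error * X_toy[i]
        b -= learning_rate * error

print(w, b)
```

Each update uses a single sample, so individual steps are noisy, but on a problem like this the weights should settle close to the values that least squares would find directly.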


### Exercise 1

Now, let's try it on another dataset we've seen before: the Titanic.
Again, we load it as before.

In [137]:
titanic = pd.read_csv('titanic.csv')
titanic

| | Survived | Pclass | Name | Sex | Age | Siblings/Spouses Aboard | Parents/Children Aboard | Fare |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 3 | Mr. Owen Harris Braund | male | 22.0 | 1 | 0 | 7.2500 |
| 1 | 1 | 1 | Mrs. John Bradley (Florence Briggs Thayer) Cum... | female | 38.0 | 1 | 0 | 71.2833 |
| 2 | 1 | 3 | Miss. Laina Heikkinen | female | 26.0 | 0 | 0 | 7.9250 |
| 3 | 1 | 1 | Mrs. Jacques Heath (Lily May Peel) Futrelle | female | 35.0 | 1 | 0 | 53.1000 |
| 4 | 0 | 3 | Mr. William Henry Allen | male | 35.0 | 0 | 0 | 8.0500 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 882 | 0 | 2 | Rev. Juozas Montvila | male | 27.0 | 0 | 0 | 13.0000 |
| 883 | 1 | 1 | Miss. Margaret Edith Graham | female | 19.0 | 0 | 0 | 30.0000 |
| 884 | 0 | 3 | Miss. Catherine Helen Johnston | female | 7.0 | 1 | 2 | 23.4500 |
| 885 | 1 | 1 | Mr. Karl Howell Behr | male | 26.0 | 0 | 0 | 30.0000 |


In [157]:
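survived = titanic['Survived'].to_numpy()
data = titanic.to_numpy()[:, 1:]  # drop column 0 (Survived) so the label is not used as a feature

def gender_to_num(s):
    """Encode 'male' as 0 and anything else as 1."""
    return 0 if s == 'male' else 1

gender = titanic['Sex'].map(gender_to_num).to_numpy()

# Columns: Pclass, Sex (encoded), Age, Siblings/Spouses Aboard, Parents/Children Aboard, Fare
# data has object dtype because of the Name and Sex strings, so cast the result to float
numerical_data = np.c_[data[:, 0], gender, data[:, 3:]].astype(float)

numerical_data.shape  # not displayed: a cell only shows its last expression
survived.shape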

(887,)
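
As an alternative sketch, the same encoding can be done without leaving pandas, by selecting the feature columns by name instead of by position. The toy frame below copies a few values from the first three rows of the table above. The variable names are illustrative only.

```python
import pandas as pd

# A few rows copied from the table above, with only some of the columns
toy = pd.DataFrame({
    'Pclass': [3, 1, 3],
    'Sex': ['male', 'female', 'female'],
    'Age': [22.0, 38.0, 26.0],
    'Fare': [7.25, 71.2833, 7.925],
})

# Map the strings to numbers with a dictionary instead of a helper function
toy['Sex'] = toy['Sex'].map({'male': 0, 'female': 1})

# Select features by name, so the column order is explicit, and get a float array
toy_features = toy[['Pclass', 'Sex', 'Age', 'Fare']].to_numpy(dtype=float)

print(toy_features.dtype, toy_features.shape)
```

Selecting by name avoids the positional slicing (`data[:, 3:]`), which silently breaks if the column order of the CSV ever changes.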

This time, let's use SGDClassifier. Try to get the model to classify the data points, and see how you do. Feel free to use any data points or all of them. 

In [1]:
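from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix

# numerical_data and survived come from the preprocessing cell above;
# run that cell first, otherwise this cell raises the NameError shown below
train_X, test_X, train_y, test_y = train_test_split(numerical_data, survived)

# Scale the features, then fit a linear classifier by stochastic gradient descent
reg = Pipeline([('scaler', StandardScaler()), ('classifier', SGDClassifier(alpha=0.01))])

reg.fit(train_X, train_y)
print(f'accuracy: {reg.score(test_X, test_y)}')
predictions = reg.predict(test_X)
confusion_matrix(test_y, predictions)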

NameError: name 'numerical_data' is not defined
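
Since the cell above did not produce a result, here is a self-contained sketch of the same pipeline on synthetic data, showing how to read the confusion matrix. The dataset comes from `make_classification` and is not the Titanic data. The `random_state` values are an added assumption to make the run repeatable.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data with six features, like numerical_data
toy_X, toy_y = make_classification(n_samples=400, n_features=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(toy_X, toy_y, random_state=0, stratify=toy_y)

toy_clf = make_pipeline(StandardScaler(), SGDClassifier(alpha=0.01, random_state=0))
toy_clf.fit(X_tr, y_tr)

# Rows are the true classes, columns are the predicted classes
cm = confusion_matrix(y_te, toy_clf.predict(X_te))
tn, fp, fn, tp = cm.ravel()

# Accuracy is the diagonal of the matrix divided by the total count
accuracy = (tn + tp) / cm.sum()
print(cm)
print(accuracy, toy_clf.score(X_te, y_te))
```

The confusion matrix carries more information than the single accuracy number: it separates the two kinds of mistakes (false positives and false negatives), which can matter when the classes are imbalanced.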

Now, try plugging in data describing yourself.

In [177]:
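def do_i_survive(model, Pclass, name, sex, age, siblings_spouses_aboard, parents_children_aboard, fare):
    # name is accepted for readability, but it is not one of the model's features
    # The feature order must match the columns of numerical_data
    datapoint = np.array([Pclass, gender_to_num(sex), age,
                          siblings_spouses_aboard, parents_children_aboard, fare]).reshape(1, -1)
    print(datapoint)
    return model.predict(datapoint)

do_i_survive(reg,
             # ...fill in the rest
             )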

[[ 3  0 10  1  2 65]]


array([1])

Of course, these exercises aren't especially interesting, since we had already gotten very good results on these datasets using other methods. The real power of gradient descent shows when training larger networks, which we will look at in a later lesson.