**Drill: Playing with layers**

Now it's your turn. Using the space below, experiment with different hidden layer structures. You can try this on a subset of the data to improve runtime. See how things vary. See what seems to matter the most. Feel free to manipulate other parameters as well. It may also be beneficial to do some real feature selection work.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier


import warnings
warnings.filterwarnings(action="ignore")

In [2]:
artworks = pd.read_csv('https://tf-assets-prod.s3.amazonaws.com/tf-curric/data-science/Artworks.csv')

In [3]:
artworks.columns

Index(['Title', 'Artist', 'ConstituentID', 'ArtistBio', 'Nationality',
       'BeginDate', 'EndDate', 'Gender', 'Date', 'Medium', 'Dimensions',
       'CreditLine', 'AccessionNumber', 'Classification', 'Department',
       'DateAcquired', 'Cataloged', 'ObjectID', 'URL', 'ThumbnailURL',
       'Circumference (cm)', 'Depth (cm)', 'Diameter (cm)', 'Height (cm)',
       'Length (cm)', 'Weight (kg)', 'Width (cm)', 'Seat Height (cm)',
       'Duration (sec.)'],
      dtype='object')

In [4]:
artworks.shape

(138025, 29)

In [5]:
artworks = artworks.sample(frac=.001)

In [6]:
# Select Columns.
artworks = artworks[['Artist', 'Nationality', 'Gender', 'Date', 'Department',
                    'DateAcquired', 'URL', 'ThumbnailURL', 'Height (cm)', 'Width (cm)']]

# Convert URLs to booleans.
artworks['URL'] = artworks['URL'].notnull()
artworks['ThumbnailURL'] = artworks['ThumbnailURL'].notnull()

# Drop films and some other tricky rows.
artworks = artworks[artworks['Department']!='Film']
artworks = artworks[artworks['Department']!='Media and Performance Art']
artworks = artworks[artworks['Department']!='Fluxus Collection']

# Drop missing data.
artworks = artworks.dropna()

In [7]:
artworks.head()

Unnamed: 0,Artist,Nationality,Gender,Date,Department,DateAcquired,URL,ThumbnailURL,Height (cm),Width (cm)
54155,Michael Spano,(American),(Male),1982,Photography,1996-12-10,False,False,69.1,89.4
47188,Lee Friedlander,(American),(Male),1983,Photography,1995-06-15,False,False,56.8,38.5
11084,Josef Albers,(American),(Male),1970-1972,Drawings & Prints,1975-05-09,False,False,28.7,36.4
103626,Unknown photographer,(),(),c. 1930,Photography,2010-10-07,True,True,7.0,11.4
97286,Per Kirkeby,(Danish),(Male),1969,Drawings & Prints,2008-10-08,True,True,10.0,11.9


In [8]:
# Get data types.
artworks.dtypes

Artist           object
Nationality      object
Gender           object
Date             object
Department       object
DateAcquired     object
URL                bool
ThumbnailURL       bool
Height (cm)     float64
Width (cm)      float64
dtype: object

In [9]:
artworks['DateAcquired'] = pd.to_datetime(artworks.DateAcquired)
artworks['YearAcquired'] = artworks.DateAcquired.dt.year
artworks['YearAcquired'].dtype

dtype('int64')

In [10]:
# Remove multiple nationalities, genders, and artists.
# Raw strings keep the regex escapes intact; the replacement labels are plain text.
artworks.loc[artworks['Gender'].str.contains(r'\) \('), 'Gender'] = '(multiple_persons)'
artworks.loc[artworks['Nationality'].str.contains(r'\) \('), 'Nationality'] = '(multiple_nationalities)'
artworks.loc[artworks['Artist'].str.contains(','), 'Artist'] = 'Multiple_Artists'

# Convert dates to start date, cutting down number of distinct examples.
# Keep the first four-digit year; rows with no year become NaN.
artworks['Date'] = artworks.Date.str.extract(r'([0-9]{4})', expand=False)

# Final column drops.
X = artworks.drop(['Department', 'DateAcquired', 'Artist', 'Nationality', 'Date'], axis=1)

# Create dummies separately.
artists = pd.get_dummies(artworks.Artist)
nationalities = pd.get_dummies(artworks.Nationality)
dates = pd.get_dummies(artworks.Date)

# Concatenate with the other variables. The artist dummies slow this down considerably, so we leave them out for now.
X = pd.get_dummies(X, sparse=True)
X = pd.concat([X, nationalities, dates], axis=1)

Y = artworks.Department
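
As a small illustration of the date cleanup in the cell above, the sketch below applies the same four-digit pattern to a few made-up `Date` strings. The values in `dates_demo` are assumptions chosen to resemble the formats seen in `artworks.head()`, not rows taken from the dataset.

```python
import pandas as pd

# Hypothetical Date strings (assumption), mimicking formats such as
# a plain year, a year range, an approximate year, and an undated work.
dates_demo = pd.Series(['1982', '1970-1972', 'c. 1930', 'n.d.'])

# Keep the first run of four digits; strings with no match become NaN.
years_demo = dates_demo.str.extract(r'([0-9]{4})', expand=False)

print(years_demo.tolist())
```

Ranges collapse to their start year, and entries without a year come back as missing values, which `pd.get_dummies` then encodes as a row of zeros.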

In [11]:
Y.value_counts()/len(Y)

Drawings & Prints        0.592233
Photography              0.242718
Architecture & Design    0.135922
Painting & Sculpture     0.029126
Name: Department, dtype: float64

In [12]:
# Alright! We've done our prep, let's build the model.
# Neural networks are hugely computationally intensive.
# This may take several minutes to run.

# Establish and fit the model with the default single hidden layer of 100 units.
mlp1 = MLPClassifier()
mlp1.fit(X, Y)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
              beta_2=0.999, early_stopping=False, epsilon=1e-08,
              hidden_layer_sizes=(100,), learning_rate='constant',
              learning_rate_init=0.001, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='adam', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)

In [13]:
mlp1.score(X, Y)

0.5922330097087378

In [14]:
score1= cross_val_score(mlp1, X, Y, cv=5)
print( 'Accuracy of dataset in mlp1 model: %0.2f (+/- %0.2f)' % (score1.mean(), score1.std() * 2))

Accuracy of dataset in mlp1 model: 0.55 (+/- 0.31)


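With default parameters, the MLP scores 0.59 on the training data. That equals the share of the largest class (Drawings & Prints, 0.592), which suggests the model is predicting that class for every row. Cross-validated accuracy is 0.55 (+/- 0.31).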
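
One way to judge whether a score like 0.59 means anything is to compare it with a majority-class baseline. The sketch below uses scikit-learn's `DummyClassifier` on synthetic labels; the class counts (61, 25, 14, 3) are an assumption chosen to roughly match the class shares printed earlier, and `X_demo` is random noise that the baseline ignores.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Synthetic labels (assumption) with class shares similar to those above.
y_demo = np.array(
    ['Drawings & Prints'] * 61
    + ['Photography'] * 25
    + ['Architecture & Design'] * 14
    + ['Painting & Sculpture'] * 3
)

# Random features (assumption); a most_frequent baseline never looks at them.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(len(y_demo), 5))

# Always predict the most common class.
baseline = DummyClassifier(strategy='most_frequent')
baseline.fit(X_demo, y_demo)

baseline_accuracy = baseline.score(X_demo, y_demo)
print('Majority-class baseline accuracy: %0.4f' % baseline_accuracy)
```

A model whose accuracy matches this baseline has probably not learned anything beyond the class frequencies.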

In [15]:
mlp2 = MLPClassifier(hidden_layer_sizes=(500,))
mlp2.fit(X, Y)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
              beta_2=0.999, early_stopping=False, epsilon=1e-08,
              hidden_layer_sizes=(500,), learning_rate='constant',
              learning_rate_init=0.001, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='adam', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)

In [16]:
mlp2.score(X, Y)

0.8446601941747572

In [17]:
score2 = cross_val_score(mlp2, X, Y, cv=5)
print( 'Accuracy of dataset in mlp2 model: %0.2f (+/- %0.2f)' % (score2.mean(), score2.std() * 2))

Accuracy of dataset in mlp2 model: 0.47 (+/- 0.36)


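Widening the hidden layer to 500 units raised the training score from 0.59 to 0.84, but cross-validated accuracy fell from 0.55 to 0.47. A higher training score with lower held-out accuracy suggests overfitting.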
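
The gap between training score and cross-validated accuracy can be measured in a single call with `cross_validate` and `return_train_score=True`. This is a minimal sketch on synthetic data from `make_classification`; the dataset, the layer width of 200, and `max_iter=500` are assumptions for illustration, not settings used in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (assumption): 200 rows, 20 features, 3 classes.
X_demo, y_demo = make_classification(
    n_samples=200, n_features=20, n_informative=5,
    n_classes=3, random_state=0)

# A fairly wide single hidden layer; the width is an arbitrary choice.
wide_mlp = MLPClassifier(hidden_layer_sizes=(200,), max_iter=500, random_state=0)

# Score each fold on both its training part and its held-out part.
cv_results = cross_validate(wide_mlp, X_demo, y_demo, cv=5, return_train_score=True)

train_mean = cv_results['train_score'].mean()
test_mean = cv_results['test_score'].mean()
print('Train accuracy:    %0.2f' % train_mean)
print('Held-out accuracy: %0.2f' % test_mean)
print('Gap:               %0.2f' % (train_mean - test_mean))
```

A large positive gap is the usual sign that extra capacity is memorizing the training folds rather than generalizing.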

In [18]:
mlp3 = MLPClassifier(activation='logistic')
mlp3.fit(X, Y)

MLPClassifier(activation='logistic', alpha=0.0001, batch_size='auto',
              beta_1=0.9, beta_2=0.999, early_stopping=False, epsilon=1e-08,
              hidden_layer_sizes=(100,), learning_rate='constant',
              learning_rate_init=0.001, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='adam', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)

In [19]:
mlp3.score(X, Y)

0.6699029126213593

In [20]:
score3 = cross_val_score(mlp3, X, Y, cv=5)
print( 'Accuracy of dataset in mlp3 model: %0.2f (+/- %0.2f)' % (score3.mean(), score3.std() * 2))

Accuracy of dataset in mlp3 model: 0.52 (+/- 0.10)


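Switching to logistic activation with the default 100 units raised the training score to 0.67 (from 0.59 with ReLU). Cross-validated accuracy was 0.52, slightly below the 0.55 of the default model, though with a much narrower spread (+/- 0.10 versus +/- 0.31).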

In [21]:
mlp4 = MLPClassifier(hidden_layer_sizes=(500, ),activation='logistic')
mlp4.fit(X, Y)

MLPClassifier(activation='logistic', alpha=0.0001, batch_size='auto',
              beta_1=0.9, beta_2=0.999, early_stopping=False, epsilon=1e-08,
              hidden_layer_sizes=(500,), learning_rate='constant',
              learning_rate_init=0.001, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='adam', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)

In [22]:
mlp4.score(X, Y)

0.7961165048543689

In [23]:
score4 = cross_val_score(mlp4, X, Y, cv=5)
print( 'Accuracy of dataset in mlp4 model: %0.2f (+/- %0.2f)' % (score4.mean(), score4.std() * 2))

Accuracy of dataset in mlp4 model: 0.55 (+/- 0.06)


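Combining logistic activation with a wider layer of 500 units raised the training score to 0.80 and cross-validated accuracy to 0.55 (+/- 0.06), compared with 0.67 and 0.52 for the 100-unit logistic model.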

In [24]:
mlp5 = MLPClassifier(hidden_layer_sizes=(500,30,))
mlp5.fit(X, Y)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
              beta_2=0.999, early_stopping=False, epsilon=1e-08,
              hidden_layer_sizes=(500, 30), learning_rate='constant',
              learning_rate_init=0.001, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='adam', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)

In [25]:
mlp5.score(X, Y)

0.5922330097087378

In [26]:
score5 = cross_val_score(mlp5, X, Y, cv=5)
print( 'Accuracy of dataset in mlp5 model: %0.2f (+/- %0.2f)' % (score5.mean(), score5.std() * 2))

Accuracy of dataset in mlp5 model: 0.49 (+/- 0.25)


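Adding a second hidden layer of 30 units (with ReLU) dropped the training score back to 0.59, the majority-class share, from 0.84 with the single 500-unit layer. Cross-validated accuracy was 0.49 (+/- 0.25).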

In [27]:
mlp6 = MLPClassifier(hidden_layer_sizes=(500,30, ),activation='logistic')
mlp6.fit(X, Y)

MLPClassifier(activation='logistic', alpha=0.0001, batch_size='auto',
              beta_1=0.9, beta_2=0.999, early_stopping=False, epsilon=1e-08,
              hidden_layer_sizes=(500, 30), learning_rate='constant',
              learning_rate_init=0.001, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='adam', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)

In [28]:
mlp6.score(X, Y)

0.7087378640776699

In [29]:
score6 = cross_val_score(mlp6, X, Y, cv=5)
print( 'Accuracy of dataset in mlp6 model: %0.2f (+/- %0.2f)' % (score6.mean(), score6.std() * 2))

Accuracy of dataset in mlp6 model: 0.50 (+/- 0.13)


With the same two layers, logistic activation raised the training score to 0.71 (from 0.59 with ReLU), while cross-validated accuracy stayed about the same at 0.50 (+/- 0.13).

In [30]:
mlp7 = MLPClassifier(hidden_layer_sizes=(500,30,20,10,5))
mlp7.fit(X, Y)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
              beta_2=0.999, early_stopping=False, epsilon=1e-08,
              hidden_layer_sizes=(500, 30, 20, 10, 5), learning_rate='constant',
              learning_rate_init=0.001, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='adam', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)

In [31]:
mlp7.score(X, Y)

0.5922330097087378

In [32]:
score7 = cross_val_score(mlp7, X, Y, cv=5)
print( 'Accuracy of dataset in mlp7 model: %0.2f (+/- %0.2f)' % (score7.mean(), score7.std() * 2))

Accuracy of dataset in mlp7 model: 0.52 (+/- 0.29)


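With five hidden layers, the training score was again 0.59, the majority-class share, and cross-validated accuracy was 0.52 (+/- 0.29). Adding depth did not improve on the shallower models here.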

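In general, widening the model, adding layers, or changing the activation function can either raise or lower the MLP's score, depending on the dataset. On this small sample, cross-validated accuracy stayed between 0.47 and 0.55 for every configuration, below the 0.59 share of the largest class, so none of the differences here are conclusive.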
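
Rather than fitting each configuration by hand, the same comparison could be run as a grid search. The sketch below is one possible setup, not the method used above: it runs on synthetic data, and the pipeline step names (`scale`, `mlp`), the grid values, and `max_iter=300` are all assumptions. The `StandardScaler` step is included because scikit-learn's documentation notes that MLPs are sensitive to feature scaling.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 200 rows, 20 features, 3 classes.
X_demo, y_demo = make_classification(
    n_samples=200, n_features=20, n_informative=5,
    n_classes=3, random_state=0)

# Scale inside the pipeline so each fold is scaled using its own training part.
pipeline = Pipeline([
    ('scale', StandardScaler()),
    ('mlp', MLPClassifier(max_iter=300, random_state=0)),
])

# Hypothetical grid: two layer structures crossed with two activations.
param_grid = {
    'mlp__hidden_layer_sizes': [(50,), (50, 10)],
    'mlp__activation': ['relu', 'logistic'],
}

search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_demo, y_demo)

print(search.best_params_)
print('Best cross-validated accuracy: %0.2f' % search.best_score_)
```

Fixing `random_state` makes the runs repeatable, so differences between configurations are less likely to come from random weight initialization alone.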