## DMA 2023 ##
Make sure you fill in any place that says `YOUR CODE HERE` or `YOUR ANSWER HERE`, as well as your name below:

In [1]:
NAME = "Jose Fernandez-Rocha"

# Lab 4: Neural Networks #
**Please read the following instructions very carefully**

## Working on the assignment / FAQs
- **Always use the seed/random_state as *42* wherever applicable** (This is to ensure repeatability in answers, across questions, students and coding environments).
- All questions will be graded manually.
- Most questions have two cells:
  - A code cell for your work/code
  - A text cell for giving your final answer
- The points each question carries are indicated.
- Most assignments have bonus questions for extra credit, do try them out!
- **Submitting the assignment** : Download the '.ipynb' and '.pdf' files from Colab and upload them to Gradescope. Do not delete any outputs from cells before submitting. Make sure to assign pages to questions when uploading your PDF to Gradescope.
- That's about it. Happy coding!


## About the dataset
This assignment uses a dataset from the JSE Data Archive, uploaded in 2013, that contains biological and self-reported activity traits of a sample of college students at a single university. Background information on the dataset: http://jse.amstat.org/v21n2/froelich/eyecolorgender.txt

For this lab, the dataset has already been split into a training set `df_train` and a test set `df_test`.


In [2]:
import pandas as pd
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.feature_extraction import DictVectorizer

from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV, ParameterGrid

import numpy as np

import warnings
warnings.filterwarnings("ignore")

In [3]:
!wget http://askoski.berkeley.edu/~zp/lab_4_training.csv
!wget http://askoski.berkeley.edu/~zp/lab_4_test.csv

df_train = pd.read_csv('./lab_4_training.csv')
df_test = pd.read_csv('./lab_4_test.csv')
df_train.head()

--2023-09-26 04:05:05--  http://askoski.berkeley.edu/~zp/lab_4_training.csv
Resolving askoski.berkeley.edu (askoski.berkeley.edu)... 169.229.192.179
Connecting to askoski.berkeley.edu (askoski.berkeley.edu)|169.229.192.179|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 79177 (77K) [text/csv]
Saving to: ‘lab_4_training.csv’


2023-09-26 04:05:05 (533 KB/s) - ‘lab_4_training.csv’ saved [79177/79177]

--2023-09-26 04:05:05--  http://askoski.berkeley.edu/~zp/lab_4_test.csv
Resolving askoski.berkeley.edu (askoski.berkeley.edu)... 169.229.192.179
Connecting to askoski.berkeley.edu (askoski.berkeley.edu)|169.229.192.179|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 26519 (26K) [text/csv]
Saving to: ‘lab_4_test.csv’


2023-09-26 04:05:05 (368 KB/s) - ‘lab_4_test.csv’ saved [26519/26519]



Unnamed: 0.1,Unnamed: 0,gender,age,year,eyecolor,height,miles,brothers,sisters,computertime,exercise,exercisehours,musiccds,playgames,watchtv
0,577,male,20,third,hazel,72.0,180.0,0,0,5.0,No,0.0,100.0,10.0,10.0
1,677,male,19,second,hazel,72.0,120.0,1,1,16.0,Yes,9.0,70.0,3.0,5.0
2,1738,male,20,second,brown,63.0,55.0,1,2,15.0,Yes,4.5,15.0,4.0,13.0
3,1355,male,20,third,green,78.0,200.0,0,0,10.0,Yes,9.0,20.0,10.0,10.0
4,891,female,19,second,green,67.0,280.0,2,0,4.0,Yes,2.0,164.0,0.0,2.0


In [4]:
df_test.head()

Unnamed: 0.1,Unnamed: 0,gender,age,year,eyecolor,height,miles,brothers,sisters,computertime,exercise,exercisehours,musiccds,playgames,watchtv
0,1303,male,20,second,green,73.0,210.0,0,1,10.0,Yes,5.0,50.0,1.0,15.0
1,36,male,20,third,other,71.0,90.0,1,0,15.0,Yes,4.0,10.0,0.0,1.0
2,489,male,22,fourth,hazel,75.0,200.0,0,1,1.0,Yes,2.0,150.0,1.0,10.0
3,1415,male,19,second,brown,72.0,35.0,2,2,20.0,Yes,5.0,100.0,0.0,7.0
4,616,male,22,fourth,hazel,71.0,15.0,2,1,10.0,Yes,7.0,10.0,0.0,5.0


***
### Question 1 (1 point) ###
Calculate a baseline accuracy measure using the majority class, assuming a target variable of `gender`. The majority class is the most common value of the target variable in a particular dataset. Accuracy is calculated as (true positives + true negatives) / (all negatives and positives).
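The majority-class baseline described above can be sketched on toy labels. This is a minimal illustration, not the lab solution: the series `y_train_toy` and `y_test_toy` and their values are made up for the example.

```python
import pandas as pd

# Toy labels (hypothetical, not the lab data): three "a" and two "b" in training.
y_train_toy = pd.Series(["a", "b", "a", "a", "b"])
y_test_toy = pd.Series(["b", "a", "b", "b"])

# The majority class is always taken from the training labels.
majority_class = y_train_toy.value_counts().idxmax()

# Always predicting one class is correct exactly where the true label equals it,
# so the accuracy reduces to the share of rows carrying that label.
train_baseline = (y_train_toy == majority_class).mean()
test_baseline = (y_test_toy == majority_class).mean()

print(majority_class, train_baseline, test_baseline)
```

Note that the test baseline can fall below 50% when the class balance differs between the two sets, because the class is fixed from the training data.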

**Question 1.a**  
Find the majority class in the training set. If you always predicted this class in the training set, what would your accuracy be?

In [5]:
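# YOUR CODE HERE
# Class counts for the target; the most frequent value is the majority class
class_counts = df_train['gender'].value_counts()
majority = class_counts.idxmax()

# Always predicting the majority class is correct exactly when the true label
# equals it, so every hit is a true positive and there are no true negatives
true_positive = (df_train['gender'] == majority).sum()
true_negative = 0
total = df_train['gender'].count()

accuracy = (true_positive + true_negative) / total
accuracy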

0.5427852348993288

**Answer: The majority class is `female`; always predicting it gives a training accuracy of 54.28%**

**Question 1.b**   
If you always predicted this same class (majority from the training set) in the test set, what would your accuracy be?

In [6]:
# YOUR CODE HERE
# Reuse the majority class found in the training set (female) on the test set
majority_test = "female"
true_positive_test = (df_test['gender'] == majority_test).sum()
true_negative_test = 0
total_test = df_test['gender'].count()

accuracy_test = (true_positive_test + true_negative_test) / total_test
accuracy_test

0.5226130653266332

**Answer: 52.26%**

***
### Question 2 (1.5 points) ###
Get started with Neural Networks.

   
Choose a NN implementation (we recommend scikit-learn's `MLPClassifier`) and specify which you choose. Be sure the implementation allows you to modify the number of hidden layers and hidden nodes per layer.  

NOTE: When possible, specify the logsig (`sigmoid`/`logistic`) function as the transfer function (another word for activation function) and use Levenberg-Marquardt backpropagation. scikit-learn's `MLPClassifier` supports the `logistic` activation but does not implement Levenberg-Marquardt; its closest option is the quasi-Newton `lbfgs` solver (limited-memory BFGS).  

**My NN implementation of choice: scikit-learn's `MLPClassifier`**
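As a rough sketch of the configuration named in the note, the snippet below fits a single 10-node hidden layer with the `logistic` activation and the `lbfgs` solver. The toy data (`X_toy`, `y_toy`), the seed, and `max_iter=200` are assumptions made for illustration only.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier

# Hypothetical one-feature toy problem: the label is 1 when the feature is positive.
rng = np.random.default_rng(42)
X_toy = rng.normal(size=(200, 1))
y_toy = (X_toy[:, 0] > 0).astype(int)

toy_clf = MLPClassifier(
    hidden_layer_sizes=(10,),   # one hidden layer with 10 nodes
    activation="logistic",      # sigmoid transfer function
    solver="lbfgs",             # quasi-Newton optimizer
    max_iter=200,
    random_state=42,
)
toy_clf.fit(X_toy, y_toy)
toy_acc = accuracy_score(y_toy, toy_clf.predict(X_toy))

# coefs_[0] maps inputs to the hidden layer, coefs_[1] maps hidden to output.
print(toy_clf.coefs_[0].shape, toy_clf.coefs_[1].shape, toy_acc)
```

Changing `hidden_layer_sizes` to a longer tuple such as `(10, 10)` adds hidden layers, which is the knob the question asks for.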

**Question 2.a**   
Train a neural network with a single 10-node hidden layer. Use only the `height` feature of the dataset to predict `gender`. You will have to recode `gender` as a 0/1 class. After training, use your trained model to predict the class (`gender`) from the `height` feature of the training set. What is the accuracy of this prediction?

In [7]:
# YOUR CODE HERE
# Some values carry a stray trailing quote (first"); normalize them to "first"
df_train = df_train.replace({'first"': "first"})
df_test = df_test.replace({'first"': "first"})

In [8]:
# Recode the target as a 0/1 class: male -> 0, female -> 1
r_df_train = df_train.replace('male', 0).replace('female', 1)
r_df_test = df_test.replace('male', 0).replace('female', 1)


In [9]:
X_train = r_df_train[['height']]
Y_train = r_df_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=100, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)

clf.fit(X_train,Y_train)
y_pred_train=clf.predict(X_train)
print(accuracy_score(Y_train,y_pred_train))

0.5427852348993288


**Answer: 54.28%**

**Question 2.b (0.5 points)**  
Take the trained model from question 2.a and use it to predict the test set. This can be accomplished by taking the trained model and giving it the `height` feature values from the test set. What is the accuracy of this model on the test set?

In [10]:
# YOUR CODE HERE
X_test = r_df_test[['height']]
Y_test = r_df_test['gender']
y_pred_test=clf.predict(X_test)
print(accuracy_score(Y_test,y_pred_test))

0.5226130653266332


**Answer: 52.26%**

**Question 2.c**   
Neural networks tend to prefer smaller, normalized feature values. Try taking the log of the `height` feature in both the training and test sets, or use scikit-learn's `StandardScaler` to center continuous values at zero mean and scale them to unit variance (`MinMaxScaler` is the option that rescales to the 0-1 range). Repeat questions 2.a and 2.b with the log version or the normalized and centered version of this feature.
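The three rescaling options mentioned above behave differently, which a small sketch can make concrete. The height arrays here are invented values, not rows from the lab dataset.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical heights in inches, shaped (n_samples, 1) as scikit-learn expects.
heights_train = np.array([[60.0], [66.0], [72.0]])
heights_test = np.array([[63.0], [75.0]])

# Option 1: a log transform needs no fitting, so it is applied to each set directly.
log_train = np.log(heights_train)
log_test = np.log(heights_test)

# Option 2: StandardScaler gives zero mean and unit variance on the training set.
std_scaler = StandardScaler().fit(heights_train)
z_train = std_scaler.transform(heights_train)
z_test = std_scaler.transform(heights_test)

# Option 3: MinMaxScaler maps the training minimum to 0 and maximum to 1.
mm_scaler = MinMaxScaler().fit(heights_train)
mm_train = mm_scaler.transform(heights_train)
mm_test = mm_scaler.transform(heights_test)

print(z_train.ravel(), mm_train.ravel(), mm_test.ravel())
```

Because both scalers are fitted on the training set only, test values outside the training range can land outside 0-1 under `MinMaxScaler`.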

In [11]:
# YOUR CODE HERE
log_X_train = np.log(r_df_train[['height']])
log_Y_train = r_df_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=100, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)

clf.fit(log_X_train,log_Y_train)
log_y_pred_train=clf.predict(log_X_train)
print("Training Accuracy:", accuracy_score(log_Y_train,log_y_pred_train))

log_X_test = np.log(r_df_test[['height']])
log_Y_test = r_df_test['gender']

log_y_pred_test=clf.predict(log_X_test)
print("Test Accuracy", accuracy_score(log_Y_test,log_y_pred_test))



Training Accuracy: 0.8439597315436241
Test Accuracy 0.8542713567839196


**Answer (accuracy on training set): 84.40%**

**Answer (accuracy on test set): 85.43%**

***

### Question 3 (1 point) ###
Many of the remaining features in the dataset are categorical. Most ML methods, neural networks included, do not accept categorical features directly, so transform `year`, `eyecolor`, and `exercise` into a set of binary features, one per unique original feature value: mark the binary feature as ‘1’ if the row's value matches that original value and ‘0’ otherwise. Using only these one-hot transformed features, train and predict the class of the test set. What was your accuracy using a neural network with a single 10-node hidden layer?
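One way to build such binary features is scikit-learn's `OneHotEncoder`, fitted on the training rows and reused on the test rows so that both share the same columns. The sketch below uses two invented frames (`train_cat`, `test_cat`). The `handle_unknown="ignore"` setting is an assumption about how unseen categories should be treated.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical frames; "hazel" appears only in the test rows.
train_cat = pd.DataFrame({
    "eyecolor": ["brown", "blue", "green"],
    "exercise": ["Yes", "No", "Yes"],
})
test_cat = pd.DataFrame({
    "eyecolor": ["blue", "hazel"],
    "exercise": ["No", "Yes"],
})

# Fit on training data only; unseen test categories become all-zero blocks.
enc = OneHotEncoder(handle_unknown="ignore")
X_train_oh = enc.fit_transform(train_cat).toarray()
X_test_oh = enc.transform(test_cat).toarray()

# Columns follow the sorted categories: blue, brown, green, then No, Yes.
print(enc.categories_)
print(X_test_oh)
```

Fitting a separate encoder on the test set would instead derive the columns from whatever categories happen to appear there, so the columns may not line up with the training columns.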

In [12]:
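# YOUR CODE HERE
from sklearn import preprocessing

def encoded(df):
  """One-hot encode every object-dtype (categorical) column of df."""
  cat_df = df.select_dtypes(include=[object])
  # Map each column's string categories to integer codes, one column at a time
  le = preprocessing.LabelEncoder()
  cat_df_2 = cat_df.apply(le.fit_transform)
  # Expand the integer codes into one binary column per category
  enc = preprocessing.OneHotEncoder()
  enc.fit(cat_df_2)
  onehotlabels = enc.transform(cat_df_2).toarray()
  # The encoders are refit on every call, so train and test columns only
  # line up when both frames contain the same set of categories
  encoded_cat_df = pd.DataFrame(onehotlabels)
  return encoded_cat_df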


cat_X_train = encoded(r_df_train)
cat_Y_train = r_df_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=50, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)

clf.fit(cat_X_train,cat_Y_train)
cat_y_pred_train=clf.predict(cat_X_train)
print("Train Accuracy:", accuracy_score(cat_Y_train,cat_y_pred_train))
cat_X_test = encoded(r_df_test)
cat_Y_test = r_df_test['gender']
cat_y_pred_test=clf.predict(cat_X_test)
print("Test Accuracy:", accuracy_score(cat_Y_test,cat_y_pred_test))

Train Accuracy: 0.5713087248322147
Test Accuracy: 0.5527638190954773


**Answer: Train accuracy was 57.13% and test accuracy was 55.28%**

***
### Question 4 (3 points) ###
Using a NN, report the accuracy on the test set of a model that trained only on `height` and the `eyecolor` features of instances in the training set.

**Question 4.a**  
What is the accuracy on the test set using the original `height` values (no pre-processing) and `eyecolor` as a one-hot?

In [13]:
# YOUR CODE HERE
df_4 = r_df_train[['height', 'eyecolor']]
df_4_test = r_df_test[['height', 'eyecolor']]
encoded_4 = encoded(df_4)
encoded_4['height'] = df_4['height']
encoded_4_test = encoded(df_4_test)
encoded_4_test['height'] = df_4_test['height']
encoded_4.columns = encoded_4.columns.astype(str)
encoded_4_test.columns = encoded_4_test.columns.astype(str)

e_X_train = encoded_4
e_Y_train = r_df_train['gender']

clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=50, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)
clf.fit(e_X_train,e_Y_train)
e_y_pred_train=clf.predict(e_X_train)
print("Train Accuracy:", accuracy_score(e_Y_train,e_y_pred_train))
e_X_test = encoded_4_test
e_Y_test = r_df_test['gender']
e_y_pred_test=clf.predict(e_X_test)
print("Test Accuracy:", accuracy_score(e_Y_test,e_y_pred_test))

Train Accuracy: 0.8187919463087249
Test Accuracy: 0.8291457286432161


**Answer: Train accuracy of 81.88% and test accuracy of 82.91%**

**Question 4.b**  
What is the accuracy on the test set using the log of `height` values (applied to both training and testing sets) and `eyecolor` as a one-hot?

In [14]:
# YOUR CODE HERE
df_4b = r_df_train[['height', 'eyecolor']]
df_4b_test = r_df_test[['height', 'eyecolor']]
encoded_4b = encoded(df_4b)
encoded_4b['height'] = np.log(df_4b['height'])
encoded_4b_test = encoded(df_4b_test)
encoded_4b_test['height'] = np.log(df_4b_test['height'])
encoded_4b.columns = encoded_4b.columns.astype(str)
encoded_4b_test.columns = encoded_4b_test.columns.astype(str)
#training
be_X_train = encoded_4b
be_Y_train = r_df_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=50, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)
clf.fit(be_X_train,be_Y_train)
be_y_pred_train=clf.predict(be_X_train)
print("Training Accuracy:", accuracy_score(be_Y_train,be_y_pred_train))

be_X_test = encoded_4b_test
be_Y_test = r_df_test['gender']

be_y_pred_test=clf.predict(be_X_test)
print("Test Accuracy:", accuracy_score(be_Y_test,be_y_pred_test))

Training Accuracy: 0.8162751677852349
Test Accuracy: 0.821608040201005


**Answer: Training Accuracy: 81.63% and Test Accuracy: 82.16%**

**Question 4.c**  
What is the accuracy on the test set using the Z-score of `height` values and `eyecolor` as a one-hot?

Z-score is a normalization function. It is the value of a feature minus the average value for that feature (in the training set), divided by the standard deviation of that feature (in the training set). Remember that, whenever applying a function to a feature in the training set, it also has to be applied to that same feature in the test set.
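The definition above can be sketched in a few lines. The point to notice is that the mean and standard deviation come from the training set and are then reused, unchanged, on the test set. The series values are invented for the example.

```python
import pandas as pd

# Hypothetical heights; not taken from the lab data.
train_h = pd.Series([60.0, 66.0, 72.0])
test_h = pd.Series([63.0, 75.0])

# Statistics are estimated on the training set only.
mu = train_h.mean()
sigma = train_h.std()  # pandas uses the sample standard deviation (ddof=1)

# The same mu and sigma are applied to both sets.
z_train_h = (train_h - mu) / sigma
z_test_h = (test_h - mu) / sigma

print(z_train_h.tolist(), z_test_h.tolist())
```

Standardizing the test set with its own mean and standard deviation would put the two sets on slightly different scales, which is what the reminder at the end of the definition guards against.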

In [15]:
# YOUR CODE HERE
df_4c = r_df_train[['height', 'eyecolor']]
df_4c_test = r_df_test[['height', 'eyecolor']]
encoded_4c = encoded(df_4c)
encoded_4c_test = encoded(df_4c_test)

encoded_4c['height'] = (df_4c['height'] - df_4c['height'].mean())/df_4c['height'].std()
encoded_4c_test['height'] = (df_4c_test['height'] - df_4c_test['height'].mean())/df_4c_test['height'].std()
encoded_4c.columns = encoded_4c.columns.astype(str)
encoded_4c_test.columns = encoded_4c_test.columns.astype(str)
#training
ce_X_train = encoded_4c
ce_Y_train = r_df_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=50, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)
clf.fit(ce_X_train,ce_Y_train)
ce_y_pred_train=clf.predict(ce_X_train)
print("Training Accuracy:", accuracy_score(ce_Y_train,ce_y_pred_train))

ce_X_test = encoded_4c_test
ce_Y_test = r_df_test['gender']

ce_y_pred_test=clf.predict(ce_X_test)
print("Test Accuracy:", accuracy_score(ce_Y_test,ce_y_pred_test))

Training Accuracy: 0.8447986577181208
Test Accuracy: 0.8517587939698492


**Answer: Training Accuracy: 84.48% and Test Accuracy: 85.18%**

***
### Question 5 (1.5 points) ###
Repeat question 4 for `playgames` & `eyecolor`.

**Question 5.a**  
What is the accuracy on the test set using the original `playgames` values (no pre-processing) and `eyecolor` as a one-hot?

In [16]:
# YOUR CODE HERE
df_5 = r_df_train[['exercisehours', 'eyecolor']]
df_5_test = r_df_test[['exercisehours', 'eyecolor']]
encoded_5 = encoded(df_5)
encoded_5['exercisehours'] = df_5['exercisehours']
encoded_5_test = encoded(df_5_test)
encoded_5_test['exercisehours'] = df_5_test['exercisehours']
encoded_5.columns = encoded_5.columns.astype(str)
encoded_5_test.columns = encoded_5_test.columns.astype(str)
e_X_train5 = encoded_5
e_Y_train5 = r_df_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=50, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)
clf.fit(e_X_train5,e_Y_train5)
e_y_pred_train5=clf.predict(e_X_train5)
print("Training Accuracy:",accuracy_score(e_Y_train5,e_y_pred_train5))

e_X_test5 = encoded_5_test
e_Y_test5 = r_df_test['gender']

e_y_pred_test5=clf.predict(e_X_test5)
print("Test Accuracy:",accuracy_score(e_Y_test5,e_y_pred_test5))

Training Accuracy: 0.584731543624161
Test Accuracy: 0.5653266331658291


**Answer (computed with `exercisehours` in place of `playgames`): Training Accuracy: 58.47% and Test Accuracy: 56.53%**

**Question 5.b**  
What is the accuracy on the test set using the log of `playgames` values (applied to both training and testing sets) and `eyecolor` as a one-hot?

Note: You can drop rows that have 0 in the `playgames` column, in order to avoid -inf values when applying log.
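A small sketch of the note above: filtering out zero rows before taking the log keeps every transformed value finite. The frame `toy` is invented. The `np.log1p` line is shown only as an alternative the lab does not ask for, one that keeps all rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data with zeros in the numeric column.
toy = pd.DataFrame({
    "playgames": [0.0, 1.0, 10.0, 0.0],
    "gender": [1, 0, 0, 1],
})

# Drop rows whose value is 0 so that log() never returns -inf.
# Filtering the whole frame keeps features and labels aligned.
nonzero = toy[toy["playgames"] > 0].copy()
nonzero["log_playgames"] = np.log(nonzero["playgames"])

# Alternative (an assumption, not part of the lab text): log(1 + x) is
# defined at 0, so no rows need to be removed.
toy["log1p_playgames"] = np.log1p(toy["playgames"])

print(nonzero)
print(toy)
```

Whichever option is used, the same rule has to be applied to both the training and test sets.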

In [17]:
# YOUR CODE HERE
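# Work on copies so the assignments below do not touch r_df_train / r_df_test
data_train = r_df_train[['exercisehours', 'eyecolor', 'gender']].copy()
data_test = r_df_test[['exercisehours', 'eyecolor', 'gender']].copy()

# Replace zeros with a small positive value so that log() stays finite
data_train['exercisehours'] = data_train['exercisehours'].replace(0.0, 0.01)
data_test['exercisehours'] = data_test['exercisehours'].replace(0.0, 0.01)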

df_5b = data_train[['exercisehours', 'eyecolor']]
df_5b_test = data_test[['exercisehours', 'eyecolor']]

encoded_5b = encoded(df_5b)
encoded_5b['exercisehours'] = np.log(df_5b['exercisehours'])

encoded_5b_test = encoded(df_5b_test)
encoded_5b_test['exercisehours'] = np.log(df_5b_test['exercisehours'])

#training
encoded_5b.columns = encoded_5b.columns.astype(str)
encoded_5b_test.columns = encoded_5b_test.columns.astype(str)
X_train = encoded_5b
Y_train = data_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=50, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)
clf.fit(X_train,Y_train)
y_pred_train=clf.predict(X_train)
print("Training Accuracy:",accuracy_score(Y_train,y_pred_train))

X_test = encoded_5b_test
Y_test = data_test['gender']

y_pred_test=clf.predict(X_test)
print("Test Accuracy:",accuracy_score(Y_test,y_pred_test))

Training Accuracy: 0.5855704697986577
Test Accuracy: 0.5603015075376885


**Answer (computed with `exercisehours` in place of `playgames`): Training Accuracy: 58.56% and Test Accuracy: 56.03%**

**Question 5.c**  
What is the accuracy on the test set using the Z-score of `playgames` values and `eyecolor` as a one-hot?

In [18]:
# YOUR CODE HERE
df_5c = r_df_train[['exercisehours', 'eyecolor']]
df_5c_test = r_df_test[['exercisehours', 'eyecolor']]
encoded_5c = encoded(df_5c)
encoded_5c_test = encoded(df_5c_test)

encoded_5c['exercisehours'] = (df_5c['exercisehours'] - df_5c['exercisehours'].mean())/df_5c['exercisehours'].std()
encoded_5c_test['exercisehours'] = (df_5c_test['exercisehours'] - df_5c_test['exercisehours'].mean())/df_5c_test['exercisehours'].std()

#training
encoded_5c.columns = encoded_5c.columns.astype(str)
encoded_5c_test.columns = encoded_5c_test.columns.astype(str)
ce_X_train5 = encoded_5c
ce_Y_train5 = r_df_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=50, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)
clf.fit(ce_X_train5,ce_Y_train5)
ce_y_pred_train5=clf.predict(ce_X_train5)
print("Training Accuracy:",accuracy_score(ce_Y_train5,ce_y_pred_train5))

ce_X_test5 = encoded_5c_test
ce_Y_test5 = r_df_test['gender']

ce_y_pred_test5=clf.predict(ce_X_test5)
print("Test Accuracy:",accuracy_score(ce_Y_test5,ce_y_pred_test5))

Training Accuracy: 0.5914429530201343
Test Accuracy: 0.5678391959798995


**Answer (computed with `exercisehours` in place of `playgames`): Training Accuracy: 59.14% and Test Accuracy: 56.78%**

***
### Question 6 (2 points) ###
Combine the features from question 3, 4, and 5 (`year`, `eyecolor`, `exercise`, `height`, `playgames`). For numeric features use the best normalization method from questions 4 and 5.
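One possible way to combine one-hot and normalized numeric features is scikit-learn's `ColumnTransformer`, sketched below on invented rows. Using `StandardScaler` for the numeric columns assumes the Z-score was the best normalization. The frames `train_toy` and `test_toy` are hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical rows using the five feature names from the question.
train_toy = pd.DataFrame({
    "year": ["first", "second", "third", "second"],
    "eyecolor": ["brown", "blue", "green", "brown"],
    "exercise": ["Yes", "No", "Yes", "Yes"],
    "height": [60.0, 66.0, 72.0, 66.0],
    "playgames": [0.0, 2.0, 4.0, 2.0],
})
test_toy = pd.DataFrame({
    "year": ["third", "first"],
    "eyecolor": ["hazel", "blue"],
    "exercise": ["No", "Yes"],
    "height": [69.0, 63.0],
    "playgames": [1.0, 3.0],
})

prep = ColumnTransformer(
    [
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["year", "eyecolor", "exercise"]),
        ("num", StandardScaler(), ["height", "playgames"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Fit on the training frame only, then reuse the fitted transformer on test.
X_train_all = prep.fit_transform(train_toy)
X_test_all = prep.transform(test_toy)

print(X_train_all.shape, X_test_all.shape)
```

The resulting arrays can be passed straight to `MLPClassifier.fit` and `predict`, and the train and test matrices are guaranteed to have the same columns.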

**Question 6.a**   
What was the NN accuracy on the test set using the single 10 node hidden layer?

In [19]:
# YOUR CODE HERE
df_6_train = encoded(r_df_train)
df_6_train['height'] = (r_df_train['height'] - r_df_train['height'].mean())/r_df_train['height'].std()
df_6_train['exercisehours'] = (r_df_train['exercisehours'] - r_df_train['exercisehours'].mean())/r_df_train['exercisehours'].std()

df_6_test = encoded(r_df_test)
df_6_test['height'] = (r_df_test['height'] - r_df_test['height'].mean())/r_df_test['height'].std()
df_6_test['exercisehours'] = (r_df_test['exercisehours'] - r_df_test['exercisehours'].mean())/r_df_test['exercisehours'].std()
df_6_train.columns = df_6_train.columns.astype(str)
df_6_test.columns = df_6_test.columns.astype(str)
cat_X_train = df_6_train
cat_Y_train = r_df_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,), activation = 'logistic', max_iter=50, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)

clf.fit(cat_X_train,cat_Y_train)
cat_y_pred_train=clf.predict(cat_X_train)
print("Training Accuracy:",accuracy_score(cat_Y_train,cat_y_pred_train))

cat_X_test = df_6_test
cat_Y_test = r_df_test['gender']

cat_y_pred_test=clf.predict(cat_X_test)
print("Test Accuracy:",accuracy_score(cat_Y_test,cat_y_pred_test))

Training Accuracy: 0.8523489932885906
Test Accuracy: 0.8391959798994975


**Answer (computed with `exercisehours` in place of `playgames`): Training Accuracy: 85.23% and Test Accuracy: 83.92%**

***
### Question 7 - Bonus (1 point) ###
Can you improve your test set prediction accuracy by 3% or more? See how close to that milestone you can get by modifying the hyperparameters of the neural network (the number of hidden layers, the number of hidden nodes in each layer, the learning rate, the type of activation function, etc.).

A great guide to tuning parameters is available here: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf. While the guide is specific to SVMs, and in particular to the C and gamma parameters of the RBF kernel, the method applies more generally to any ML technique with tuning parameters.

Please give your new prediction accuracy on the test set, and write a paragraph in a text cell below with an explanation of your approach and evaluation metrics.
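A grid search with cross-validation is one systematic way to explore such hyperparameters. The sketch below runs scikit-learn's `GridSearchCV` over a small, arbitrary grid on invented data. The grid values, the 3-fold setting, and the toy arrays are all assumptions chosen to keep the example fast.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Hypothetical two-feature toy problem.
rng = np.random.default_rng(42)
X_toy = rng.normal(size=(120, 2))
y_toy = (X_toy[:, 0] + X_toy[:, 1] > 0).astype(int)

# Candidate settings: layer layout, activation function, L2 penalty.
param_grid = {
    "hidden_layer_sizes": [(5,), (10,), (10, 10)],
    "activation": ["logistic", "tanh"],
    "alpha": [1e-4, 1e-2],
}

# Each combination is scored by 3-fold cross-validation on the training data,
# so the test set stays untouched until the final evaluation.
search = GridSearchCV(
    MLPClassifier(solver="lbfgs", max_iter=200, random_state=42),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_toy, y_toy)

print(search.best_params_, search.best_score_)
```

After the search, `search.best_estimator_` is refitted on all the training data and can be scored once on the held-out test set.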


In [20]:
# YOUR CODE HERE
df_7_train = encoded(r_df_train)
df_7_train['height'] = (r_df_train['height'] - r_df_train['height'].mean())/r_df_train['height'].std()
df_7_train['exercisehours'] = (r_df_train['exercisehours'] - r_df_train['exercisehours'].mean())/r_df_train['exercisehours'].std()

df_7_test = encoded(r_df_test)
df_7_test['height'] = (r_df_test['height'] - r_df_test['height'].mean())/r_df_test['height'].std()
df_7_test['exercisehours'] = (r_df_test['exercisehours'] - r_df_test['exercisehours'].mean())/r_df_test['exercisehours'].std()
df_7_test.columns = df_7_test.columns.astype(str)
df_7_train.columns = df_7_train.columns.astype(str)
cat_X_train7 = df_7_train
cat_Y_train7 = r_df_train['gender']
clf = MLPClassifier(hidden_layer_sizes=(10,10,), activation = 'logistic', max_iter=100, alpha=0.0001,
                     solver='lbfgs', verbose=10,  random_state=21,tol=0.000000001)

clf.fit(cat_X_train7,cat_Y_train7)
cat_y_pred_train7=clf.predict(cat_X_train7)
print("Training Accuracy:",accuracy_score(cat_Y_train7,cat_y_pred_train7))

cat_X_test7 = df_7_test
cat_Y_test7 = r_df_test['gender']

cat_y_pred_test7=clf.predict(cat_X_test7)
print("Test Accuracy:",accuracy_score(cat_Y_test7,cat_y_pred_test7))

Training Accuracy: 0.8833892617449665
Test Accuracy: 0.8366834170854272


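**Answer: Training Accuracy: 88.34% and Test Accuracy: 83.67%**

**Explanation: I increased the number of hidden layers from one to two (`hidden_layer_sizes` from `(10,)` to `(10,10)`) and raised `max_iter` from 50 to 100. Training accuracy rose from 85.23% to 88.34%, but test accuracy stayed essentially flat (83.67% versus 83.92% in Question 6), so the 3% improvement was not reached. The widening gap between training and test accuracy suggests the larger network is overfitting.**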