# Neural Network Smartphone Activity Detector

In this activity, you will train a neural network to use smartphone data to predict the activity of the user. 

This dataset has already been separated into input features and target activities. Additional information on the dataset can be found at the link below.

http://archive.ics.uci.edu/ml/datasets/Smartphone-Based+Recognition+of+Human+Activities+and+Postural+Transitions

### Data Pre-Processing

Prepare the data for the neural network. This includes splitting the data into training and testing datasets, scaling the input features, and encoding the categorical target values.

In [2]:
from pathlib import Path

import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

In [25]:
# Read the compiled CSV into a DataFrame
data = Path("Resources/spotify_main_raw_data.csv")
df = pd.read_csv(data)
df.shape

(388, 22)

In [26]:
# Drop columns that are not used as features (identifiers, URLs, and metadata)
raw_data = df.drop(
    ["Unnamed: 0", "artist", "track", "type", "id", "uri",
     "track_href", "analysis_url", "time_signature"],
    axis=1,
)
raw_data.head(6)

,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,duration_ms,playlist
0,0.564,0.364,10,-5.845,0,0.0631,0.759,0.0,0.0839,0.591,86.538,192277,ryan
1,0.701,0.519,1,-6.382,1,0.0516,0.314,0.0,0.207,0.498,89.977,223044,ryan
2,0.309,0.74,7,-5.917,0,0.0456,0.00854,0.0258,0.119,0.166,144.861,834720,ryan
3,0.552,0.637,5,-6.568,1,0.0445,0.464,1.6e-05,0.136,0.333,97.97,314280,ryan
4,0.655,0.885,7,-4.116,1,0.0438,0.00117,0.000473,0.0448,0.938,100.088,189293,ryan
5,0.564,0.534,4,-6.05,1,0.0238,0.28,0.0,0.121,0.409,85.964,206533,ryan


In [27]:
# Define the features X set and the target y vector
X = raw_data.drop("playlist",axis=1)
y = raw_data.loc[:, ["playlist"]]


In [28]:
# Check the class balance: this is a multiclass problem with five playlists
y.playlist.value_counts()

sarah    100
terry     99
ryan      84
alex      55
abdul     50
Name: playlist, dtype: int64

In [29]:
# Split the dataset into training and testing data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Check the dimensions of the training and testing data
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((291, 12), (97, 12), (291, 1), (97, 1))
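The playlist counts shown earlier are uneven (100 tracks for sarah versus 50 for abdul), and a plain random split can shift those proportions between the training and testing sets. One option, which this notebook does not use, is to pass `stratify` to `train_test_split`. The sketch below rebuilds only the label counts from the output above; the `features` list is a placeholder assumption, not the real audio features.

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Label counts copied from the value_counts() output above.
counts = {"sarah": 100, "terry": 99, "ryan": 84, "alex": 55, "abdul": 50}
labels = [name for name, n in counts.items() for _ in range(n)]

# Placeholder features (assumption): one dummy column per track.
features = [[i] for i in range(len(labels))]

# stratify=labels keeps each playlist's share roughly equal in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, random_state=1, stratify=labels
)

test_counts = Counter(y_te)
for name, n in counts.items():
    print(f"{name}: {test_counts[name]} of {n} tracks in the test set")
```

With the default test size of 25%, each playlist should contribute about a quarter of its tracks to the test set.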

In [30]:
# Scale the training and testing input features using StandardScaler
X_scaler = StandardScaler()
X_scaler.fit(X_train)

X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)
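To make the scaling step concrete, here is a minimal sketch of the per-column transformation that `StandardScaler` applies: subtract the mean, then divide by the population standard deviation. It uses only the six `tempo` values visible in the `raw_data` preview above, so the numbers are illustrative and differ from what the scaler learns from the full `X_train`.

```python
from statistics import mean, pstdev

# Tempo values of the six tracks shown in the raw_data preview above.
tempo = [86.538, 89.977, 144.861, 97.97, 100.088, 85.964]

# StandardScaler divides by the population standard deviation (ddof=0).
mu = mean(tempo)
sigma = pstdev(tempo)

# z-score: how many standard deviations each value lies from the mean.
scaled = [(t - mu) / sigma for t in tempo]
print([round(z, 3) for z in scaled])
```

In the notebook the scaler is fit on `X_train` only, and the same mean and standard deviation are then reused to transform `X_test`, so no information from the test set leaks into training.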

In [31]:
# Apply One-hot encoding to the target labels
enc = OneHotEncoder()
enc.fit(y_train)

encoded_y_train = enc.transform(y_train).toarray()
encoded_y_test = enc.transform(y_test).toarray()
encoded_y_train[0]

array([1., 0., 0., 0., 0.])
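The one-hot row above is easier to read once the column order is known. By default `OneHotEncoder` sorts the categories, and the fitted order is available as `enc.categories_`. The sketch below reproduces that ordering in plain Python; `decode_one_hot` is a hypothetical helper written for illustration, not part of scikit-learn.

```python
# Playlist names from the value_counts() output, in the sorted order
# that OneHotEncoder uses by default.
categories = sorted(["sarah", "terry", "ryan", "alex", "abdul"])

def decode_one_hot(row):
    """Return the label whose column holds the 1 in a one-hot row."""
    return categories[row.index(max(row))]

first_row = [1.0, 0.0, 0.0, 0.0, 0.0]  # encoded_y_train[0] from the output above
print(categories)
print(decode_one_hot(first_row))
```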

# Build a Deep Neural Network

In [41]:
# Create a sequential model
model = Sequential()

In [42]:
# Add the first hidden layer; its input dimension is X.shape[1] = 12, the number of feature columns
model.add(Dense(50, activation="relu", input_dim=X.shape[1]))

# Add the second hidden layer; Keras infers its input size (50) from the previous layer
model.add(Dense(50, activation="relu"))

# Add the output layer: 5 units (one per playlist) with softmax activation
model.add(Dense(5, activation="softmax"))

In [43]:
# The targets are one-hot encoded into 5 columns, one per playlist
y_train.playlist.value_counts()

# 5 outputs since there are 5 playlists
number_outputs = 5

In [44]:
# Compile the model using categorical_crossentropy for the loss function, the adam optimizer,
# and add accuracy to the training metrics
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
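As a rough illustration of what `categorical_crossentropy` measures, the sketch below computes the loss for a single one-hot target by hand. The two prediction vectors are hypothetical. A model that spreads its probability evenly over five classes scores ln(5), about 1.609, which is close to the first-epoch loss in the training log further down.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy for one sample: -sum(y_true * log(y_pred))."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

y_true = [1.0, 0.0, 0.0, 0.0, 0.0]             # one-hot target
uniform = [0.2] * 5                             # hypothetical: even guess over 5 classes
confident = [0.9, 0.025, 0.025, 0.025, 0.025]   # hypothetical: confident and correct

print(round(categorical_crossentropy(y_true, uniform), 4))
print(round(categorical_crossentropy(y_true, confident), 4))
```

The loss falls as the probability assigned to the correct class rises, which is the quantity the optimizer drives down during training.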


In [45]:
# Print the model summary
model.summary()


Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_6 (Dense)             (None, 50)                650       
                                                                 
 dense_7 (Dense)             (None, 50)                2550      
                                                                 
 dense_8 (Dense)             (None, 5)                 255       
                                                                 
Total params: 3,455
Trainable params: 3,455
Non-trainable params: 0
_________________________________________________________________
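The parameter counts in the summary can be checked by hand: a `Dense` layer has one weight per input-unit pair plus one bias per unit. A small sketch, with `dense_params` as a hypothetical helper name:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# 12 input features, two hidden layers of 50 units, 5 output classes.
layer_sizes = [12, 50, 50, 5]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [650, 2550, 255]
print(sum(per_layer))  # 3455
```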


In [46]:
# Use the training data to fit (train) the model
# @NOTE: Experiment with the number of training epochs to find the minimum iterations required to achieve a good accuracy
model.fit(
    X_train_scaled,
    encoded_y_train,
    epochs=20,
    shuffle=True,
    verbose=2
)

Epoch 1/20
10/10 - 0s - loss: 1.5961 - accuracy: 0.2612 - 189ms/epoch - 19ms/step
Epoch 2/20
10/10 - 0s - loss: 1.5054 - accuracy: 0.3505 - 6ms/epoch - 598us/step
Epoch 3/20
10/10 - 0s - loss: 1.4431 - accuracy: 0.4227 - 5ms/epoch - 500us/step
Epoch 4/20
10/10 - 0s - loss: 1.3942 - accuracy: 0.4399 - 5ms/epoch - 500us/step
Epoch 5/20
10/10 - 0s - loss: 1.3535 - accuracy: 0.4639 - 6ms/epoch - 599us/step
Epoch 6/20
10/10 - 0s - loss: 1.3170 - accuracy: 0.5052 - 5ms/epoch - 500us/step
Epoch 7/20
10/10 - 0s - loss: 1.2803 - accuracy: 0.5326 - 5ms/epoch - 500us/step
Epoch 8/20
10/10 - 0s - loss: 1.2504 - accuracy: 0.5533 - 5ms/epoch - 483us/step
Epoch 9/20
10/10 - 0s - loss: 1.2261 - accuracy: 0.5430 - 5ms/epoch - 500us/step
Epoch 10/20
10/10 - 0s - loss: 1.1978 - accuracy: 0.5567 - 6ms/epoch - 600us/step
Epoch 11/20
10/10 - 0s - loss: 1.1707 - accuracy: 0.5464 - 5ms/epoch - 517us/step
Epoch 12/20
10/10 - 0s - loss: 1.1505 - accuracy: 0.5567 - 6ms/epoch - 600us/step
Epoch 13/20
10/10 - 0s -

<keras.callbacks.History at 0x1aec2268f48>
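For the epoch experiment suggested in the note above, it can help to read the per-epoch metrics programmatically. `model.fit` returns a `History` object whose `history` dictionary holds one list per metric. If the return value were captured (for example `history = model.fit(...)`, which this notebook does not do), `history.history["accuracy"]` would hold a list like the one below. The sketch uses the training accuracies for epochs 1 to 12 copied from the log, and `first_epoch_reaching` is a hypothetical helper.

```python
def first_epoch_reaching(values, target):
    """Return the 1-based epoch at which values first reach target, or None."""
    for epoch, value in enumerate(values, start=1):
        if value >= target:
            return epoch
    return None

# Training accuracies for epochs 1-12, copied from the log above.
accuracy_log = [0.2612, 0.3505, 0.4227, 0.4399, 0.4639, 0.5052,
                0.5326, 0.5533, 0.5430, 0.5567, 0.5464, 0.5567]

print(first_epoch_reaching(accuracy_log, 0.55))  # first epoch at or above 55%
print(first_epoch_reaching(accuracy_log, 0.90))  # None: never reached in these epochs
```

These are training accuracies, so they say little about generalization; a held-out validation set would be a better guide for choosing the number of epochs.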

# Evaluate the Model

In [47]:
# Evaluate the model using the testing data
model_loss, model_accuracy = model.evaluate(X_test_scaled, encoded_y_test, verbose=2)
print(f"Normal Neural Network - Loss: {model_loss}, Accuracy: {model_accuracy}")

4/4 - 0s - loss: 1.4570 - accuracy: 0.4742 - 61ms/epoch - 15ms/step
Normal Neural Network - Loss: 1.4570331573486328, Accuracy: 0.47422680258750916
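An accuracy of about 0.47 is easier to judge against simple baselines. The sketch below compares it with two of them: always predicting the largest test-set class, and guessing each of the five classes equally often. The class counts are the `support` values from the classification report in this notebook.

```python
# Test-set class counts (the "support" column of the classification report).
support = {"abdul": 14, "alex": 14, "ryan": 18, "sarah": 24, "terry": 27}
total = sum(support.values())

majority_label = max(support, key=support.get)
majority_baseline = support[majority_label] / total  # always predict the largest class
uniform_baseline = 1 / len(support)                  # guess each class equally often

model_accuracy = 0.4742  # from the evaluation output above
print(majority_label, round(majority_baseline, 4), uniform_baseline)
print(model_accuracy > majority_baseline)
```

The model beats both baselines, although there is still a wide gap between its accuracy and a reliable classifier.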


In [48]:
# Make predictions
predicted = model.predict(X_test_scaled)
predicted = enc.inverse_transform(predicted).flatten().tolist()
results = pd.DataFrame({
    "Actual": y_test.playlist.values,
    "Predicted": predicted
})
results.head(10)

,Actual,Predicted
0,sarah,terry
1,sarah,sarah
2,sarah,sarah
3,ryan,sarah
4,terry,ryan
5,abdul,abdul
6,abdul,sarah
7,terry,terry
8,sarah,sarah
9,sarah,sarah
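The prediction step relies on the softmax output layer: `model.predict` returns one row of five probabilities per track, and decoding a row amounts to taking the column with the largest probability. A minimal sketch of that idea, where the `logits` values are hypothetical raw scores and the category order is the sorted order used by the encoder:

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

categories = ["abdul", "alex", "ryan", "sarah", "terry"]  # sorted encoder order
logits = [0.1, -0.4, 0.3, 1.2, 0.9]                        # hypothetical raw scores

probs = softmax(logits)
predicted_label = categories[probs.index(max(probs))]
print([round(p, 3) for p in probs], predicted_label)
```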


In [49]:
# Print the Classification Report
from sklearn.metrics import classification_report
print(classification_report(results.Actual, results.Predicted))

              precision    recall  f1-score   support

       abdul       0.54      0.50      0.52        14
        alex       0.25      0.14      0.18        14
        ryan       0.44      0.39      0.41        18
       sarah       0.35      0.50      0.41        24
       terry       0.69      0.67      0.68        27

    accuracy                           0.47        97
   macro avg       0.45      0.44      0.44        97
weighted avg       0.48      0.47      0.47        97
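The report's columns are linked by simple arithmetic. Recall is the number of correctly predicted tracks divided by the support, so multiplying the two recovers the approximate number of correct predictions per playlist, and their sum over all classes gives the overall accuracy. The sketch below uses the rounded figures from the report, so the per-class counts are estimates.

```python
# Recall and support per class, copied from the classification report above.
report = {
    "abdul": (0.50, 14),
    "alex":  (0.14, 14),
    "ryan":  (0.39, 18),
    "sarah": (0.50, 24),
    "terry": (0.67, 27),
}

# recall = correct / support, so correct is roughly recall * support.
correct = {name: round(recall * n) for name, (recall, n) in report.items()}
total = sum(n for _, n in report.values())

accuracy = sum(correct.values()) / total
print(correct)
print(round(accuracy, 4))
```

The recovered accuracy agrees with the value reported by `model.evaluate`.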


