In [3]:
!git clone https://github.com/NataliaVrabcova/assessment-1-neural-networks

Cloning into 'assessment-1-neural-networks'...
remote: Enumerating objects: 9, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 9 (delta 1), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (9/9), 2.01 MiB | 11.41 MiB/s, done.
Resolving deltas: 100% (1/1), done.


In [5]:
import pandas as pd

# Adjust the path to match your repository structure
data = pd.read_csv('/content/assessment-1-neural-networks/healthcare_noshows_appointments.csv')

# Verify the dataset is loaded
print(data.head())

      PatientId  AppointmentID Gender ScheduledDay AppointmentDay  Age  \
0  2.987250e+13        5642903      F   2016-04-29     2016-04-29   62   
1  5.589978e+14        5642503      M   2016-04-29     2016-04-29   56   
2  4.262962e+12        5642549      F   2016-04-29     2016-04-29   62   
3  8.679512e+11        5642828      F   2016-04-29     2016-04-29    8   
4  8.841186e+12        5642494      F   2016-04-29     2016-04-29   56   

       Neighbourhood  Scholarship  Hipertension  Diabetes  Alcoholism  \
0    JARDIM DA PENHA        False          True     False       False   
1    JARDIM DA PENHA        False         False     False       False   
2      MATA DA PRAIA        False         False     False       False   
3  PONTAL DE CAMBURI        False         False     False       False   
4    JARDIM DA PENHA        False          True      True       False   

   Handcap  SMS_received  Showed_up  Date.diff  
0    False         False       True          0  
1    False        

In [6]:
# Importing libraries
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from keras.models import Sequential
from keras.layers import Dense, Dropout, Input
from keras.optimizers import Adam

Preprocessing the Data

In [7]:
# Selecting features and target variable
X = data.drop(['Showed_up', 'PatientId', 'AppointmentID'], axis=1)
y = data['Showed_up']

# Encoding categorical variables
X = pd.get_dummies(X, drop_first=True)

# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Feature scaling
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

Defining the architecture of the neural network:

In [8]:
# Defining the neural network model
model = Sequential([
    Input(shape=(X_train.shape[1],)),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(32, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(loss='binary_crossentropy', optimizer=Adam(), metrics=['accuracy'])

**The Adam optimizer** was chosen because it adapts the step size for each parameter and copes well with sparse gradients. The default learning rate of 0.001 was used as a starting point; it works well across a wide range of tasks, although it does not guarantee stable convergence and may still need tuning for this dataset.
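To make the "adaptive step size" idea concrete, here is a minimal NumPy sketch of a single Adam update. It is an illustration only, not the Keras implementation; the function name `adam_step` and the toy weights and gradients are hypothetical, and the hyperparameter defaults are assumed to mirror the commonly used values (learning rate 0.001, beta1 0.9, beta2 0.999).

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update (illustrative sketch); t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Hypothetical weights with gradients of very different magnitudes
w = np.array([0.5, -0.3, 0.1])
grad = np.array([10.0, -0.01, 0.5])
m = np.zeros_like(w)
v = np.zeros_like(w)

w_new, m, v = adam_step(w, grad, m, v, t=1)
step = w_new - w
print(step)  # each component moves by roughly lr, opposite to its gradient's sign
```

On the first step every weight moves by about the learning rate regardless of how large its gradient is, which is why a single default value such as 0.001 is often a workable starting point.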

Train the model with your preprocessed data:

In [14]:
# Train the model
history = model.fit(X_train, y_train, epochs=25, batch_size=32, validation_data=(X_test, y_test))

Epoch 1/25
2675/2675 ━━━━━━━━━━━━━━━━━━━━ 15s 6ms/step - accuracy: 0.8028 - loss: 0.4330 - val_accuracy: 0.8000 - val_loss: 0.4476
Epoch 2/25
2675/2675 ━━━━━━━━━━━━━━━━━━━━ 14s 3ms/step - accuracy: 0.8012 - loss: 0.4341 - val_accuracy: 0.8004 - val_loss: 0.4489
Epoch 3/25
2675/2675 ━━━━━━━━━━━━━━━━━━━━ 8s 3ms/step - accuracy: 0.8015 - loss: 0.4326 - val_accuracy: 0.7999 - val_loss: 0.4517
Epoch 4/25
2675/2675 ━━━━━━━━━━━━━━━━━━━━ 10s 3ms/step - accuracy: 0.8008 - loss: 0.4356 - val_accuracy: 0.7988 - val_loss: 0.4533
Epoch 5/25
2675/2675 ━━━━━━━━━━━━━━━━━━━━ 8s 2ms/step - accuracy: 0.8012 - loss: 0.4342 - val_accuracy: 0.7992 - val_loss: 0.4503
Epoch 6/25
2675/2675 ━━━━━━━━━━━━━━━━━━━━ 8s 3ms/step - accuracy: 0.8010 - loss: 0.4329 - val_accuracy: 0.7991 - val_loss: 0.4490
Epoch 7/25

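The model was trained for 25 epochs, a number chosen to give the model enough iterations to learn patterns in the data. In the epochs shown above, however, the training loss stays near 0.433 and the validation loss is lowest in the first epoch (0.4476), so the extra epochs bring little measurable gain. Early stopping could be used in future experiments to choose the number of epochs from validation performance. Note also that the test set is passed as `validation_data`; if epochs are selected this way, a separate validation split should be used so that the test metrics remain unbiased.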
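The early stopping rule mentioned above can be sketched in plain Python. The helper `early_stopping_epoch` and the `patience` value are hypothetical choices for illustration; the validation losses are the six values visible in the training log.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops at the first epoch where the validation loss has failed
    to improve by more than min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# val_loss for epochs 1-6, copied from the training output
val_losses = [0.4476, 0.4489, 0.4517, 0.4533, 0.4503, 0.4490]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch, best_epoch)  # prints: 4 1
```

With a patience of 3, training would have stopped after epoch 4 and kept the epoch 1 weights. In Keras the same behaviour is available through the `EarlyStopping` callback (parameters `monitor`, `patience`, `restore_best_weights`), passed to `model.fit` via `callbacks`.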

Evaluate the model’s performance:

In [16]:
# Evaluate the model
y_pred = (model.predict(X_test) > 0.5).astype(int)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

669/669 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
Accuracy: 0.7997


Evaluate the model on the test set

In [13]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test Loss: {test_loss}\nTest Accuracy: {test_accuracy}")

669/669 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.7976 - loss: 0.4510
Test Loss: 0.44962742924690247
Test Accuracy: 0.7992335557937622


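Test Loss:
  This value is the mean binary cross-entropy over the test data. It measures how far the predicted probabilities are from the true labels, and it penalizes confident wrong predictions more heavily than hesitant ones. A lower loss means the predicted probabilities are closer to the actual outcomes.

Test Accuracy:
  This metric shows the proportion of correct predictions (true positives + true negatives) out of the total predictions on the test set, using a 0.5 threshold on the predicted probability. A value closer to 1 indicates better performance, but when the classes are imbalanced it should be compared against the accuracy of always predicting the majority class.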
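As a small worked example of both metrics, the sketch below computes binary cross-entropy, accuracy, and a majority-class baseline with NumPy. The labels and probabilities are toy values chosen for illustration, not taken from the dataset.

```python
import numpy as np

# Toy labels and predicted probabilities (illustrative values only)
y_true = np.array([1, 1, 1, 1, 0])
y_prob = np.array([0.9, 0.8, 0.7, 0.6, 0.6])

# Binary cross-entropy: mean of -[y*log(p) + (1-y)*log(1-p)]
eps = 1e-7
p = np.clip(y_prob, eps, 1 - eps)  # avoid log(0)
bce = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Accuracy at a 0.5 threshold
y_pred = (y_prob > 0.5).astype(int)
accuracy = np.mean(y_pred == y_true)

# Baseline: always predict the most frequent class
majority_class = np.bincount(y_true).argmax()
baseline_accuracy = np.mean(y_true == majority_class)

print(f"BCE: {bce:.4f}, accuracy: {accuracy:.2f}, baseline: {baseline_accuracy:.2f}")
```

In this toy case the classifier reaches 0.80 accuracy, yet so does the rule "always predict class 1", so accuracy alone does not show that anything was learned. The same comparison can be made on the notebook's data using `y_test` and `y_pred` from the evaluation cell.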



In [18]:
import numpy as np

# Given parameters
X = np.random.rand(100, 14)  # Example data (100 samples, 14 features)
W1 = np.random.rand(14, 64)  # Weights for first hidden layer (14 input features, 64 neurons)
b1 = np.random.rand(1, 64)   # Bias for first hidden layer (64 neurons)
learning_rate = 0.01

# Forward Propagation for the first hidden layer
Z1 = np.dot(X, W1) + b1  # Linear transformation
A1 = np.maximum(0, Z1)   # ReLU activation

# Assuming we have a loss gradient with respect to A1
dA1 = np.random.rand(100, 64)  # Gradient of loss with respect to A1

# Backward Propagation for the first hidden layer
dZ1 = dA1 * (Z1 > 0)  # Derivative of ReLU
dW1 = np.dot(X.T, dZ1)  # Gradient of loss with respect to W1
db1 = np.sum(dZ1, axis=0, keepdims=True)  # Gradient of loss with respect to b1

# Update weights and biases
W1 -= learning_rate * dW1
b1 -= learning_rate * db1

print("Updated Weights:", W1)
print("Updated Biases:", b1)

    Forward Propagation:
        The input X is multiplied by the weight matrix W1 and the bias b1 is added to give Z1.
        The ReLU activation function then gives A1 = max(0, Z1), applied element-wise.

    Backward Propagation:
        Given the gradient of the loss with respect to A1 (here a random placeholder, dA1), the gradient with respect to Z1 is dA1 wherever Z1 > 0 and zero elsewhere.
        From dZ1 we obtain dW1 = X^T dZ1 and db1 (the sum of dZ1 over samples), which are used to update the weights W1 and biases b1 by gradient descent.

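This approach provides a clear view of how the data propagates through the network and how the weights and biases are adjusted during training. The example uses random data with 14 features and covers only the first hidden layer; the same steps apply to the actual dataset once X is replaced by the scaled training matrix and dA1 by the gradient propagated back from the later layers.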
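One way to confirm that the backward-pass formula for dW1 is correct is a finite-difference gradient check. The sketch below is an assumption-laden illustration: it uses small hypothetical shapes and a surrogate scalar loss, sum(G * A1), chosen so that the gradient with respect to A1 is exactly the fixed matrix G (which plays the role of dA1).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((5, 3))             # 5 samples, 3 features (small for a quick check)
W1 = rng.standard_normal((3, 4))   # 3 inputs, 4 hidden units
b1 = rng.standard_normal((1, 4))
G = rng.random((5, 4))             # fixed upstream gradient, stands in for dA1

def loss(W, b):
    # Surrogate scalar loss whose gradient w.r.t. A1 is exactly G
    A1 = np.maximum(0, X @ W + b)
    return np.sum(G * A1)

# Analytic gradient, same formulas as in the cell above
Z1 = X @ W1 + b1
dZ1 = G * (Z1 > 0)
dW1 = X.T @ dZ1

# Numerical gradient via central finite differences
h = 1e-6
num_dW1 = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        Wp, Wm = W1.copy(), W1.copy()
        Wp[i, j] += h
        Wm[i, j] -= h
        num_dW1[i, j] = (loss(Wp, b1) - loss(Wm, b1)) / (2 * h)

max_err = np.max(np.abs(num_dW1 - dW1))
print("max abs difference:", max_err)
```

A very small difference indicates that the analytic and numerical gradients agree. The same loop applied to `b1` would check `db1`.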

After the update step, the weights and biases differ from their initial values by the learning rate times the computed gradients. Because dA1 in this example is random rather than derived from a real loss, the change only demonstrates the mechanics of the update; it does not show that a model is learning. In actual training, dA1 comes from backpropagating the loss through the later layers, and repeated updates of this kind are what adjust the weights to better capture the relationships in the dataset.

Monitoring the training loss together with the validation loss and accuracy gives a clearer picture of how well the model performs after such updates. A steadily decreasing training loss indicates progress towards convergence, while a validation loss that stops improving suggests the model has plateaued or is starting to overfit.