# Final Project: Deep Neural Network with KerasTuner
## Model 3: Predict Glacier Retreat with TensorFlow's Keras

The Keras Tuner hyperparameters used here are:

{'activation': 'relu', 'first_units': 21, 'num_layers': 4, 'units_0': 11, 'units_1': 6, 'units_2': 16, 'units_3': 11, 'units_4': 16, 'tuner/epochs': 20, 'tuner/initial_epoch': 0, 'tuner/bracket': 0, 'tuner/round': 0}

The first pass with the Keras Tuner hyperparameters gave Loss: 1.162386417388916, Accuracy: 0.6666666865348816.

After several iterations, the final model improved to Loss: 0.8554542064666748, Accuracy: 0.7777777910232544.
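
As a minimal sketch of how the hyperparameter dictionary above maps onto the network built later in this notebook, the snippet below reads the hidden layer widths from it. It assumes the tuner's build function named its per-layer widths `units_0`, `units_1`, and so on, and that only the first `num_layers` of them are active; the helper `hidden_layer_widths` is hypothetical and not part of KerasTuner. Under that assumption, `units_4` is a leftover value that the four-layer model does not use.

```python
# Hyperparameters reported by Keras Tuner (copied from the text above)
best_hps = {
    'activation': 'relu', 'first_units': 21, 'num_layers': 4,
    'units_0': 11, 'units_1': 6, 'units_2': 16, 'units_3': 11, 'units_4': 16,
    'tuner/epochs': 20, 'tuner/initial_epoch': 0,
    'tuner/bracket': 0, 'tuner/round': 0,
}


def hidden_layer_widths(hps):
    """Return the widths of the active hidden layers (hypothetical helper).

    Assumes layer i takes its width from the key 'units_i' and that only
    the first 'num_layers' entries are used.
    """
    return [hps[f"units_{i}"] for i in range(hps["num_layers"])]


widths = hidden_layer_widths(best_hps)
print(widths)  # [11, 6, 16, 11]
```

These four widths match the `hidden_nodes_layer1` to `hidden_nodes_layer4` values set in cell 9.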

In [1]:
# Imports
import pandas as pd
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
from pathlib import Path

---

### Step 1: Prepare the data for use in a neural network model

Note: The raw data files were cleaned and prepared in the Jupyter source file "Data_Cleaning_Glacier_Retreat". The output of that file is in the "resources" folder and includes a data set for each of the following:

1.  Environmental Parameters:  Global Temperature, Global Sea Rise, and Global CO2
        filename: env_parameters_1.csv

2.  Population, Economic, and Farming Parameters: World Population, Urban Population,
        investment, and cereal output production by acre.
        filename: pop_farm_parameters_2.csv

3.  Change in Temperature by Country:  Average delta T by country.
        filename: dT_Country_parameters_3.csv

4.  Change in Forestation by Country:  Percent change in forestation by country.
        filename: deforest_parameters_4.csv

---
### Third Model: Glacier Retreat with Change in Temperature by Country

In [2]:
# Load the dataset of temperature change by country
# A forward slash avoids the invalid "\d" escape and works on every OS
file_path = "resources/dT_Country_parameters_3.csv"
df_parameters_1 = pd.read_csv(file_path)

# Review the DataFrame
df_parameters_1.head()

Unnamed: 0,year,glacier_retreat,AFG,ALB,DZA,AND,AGO,ARG,AUS,AUT,...,USA,VIR,URY,VUT,VEN,VNM,PSE,WLD,ZMB,ZWE
0,1979,0,0.361,0.203,0.654,0.058,0.291,0.266,0.375,-0.112,...,-0.306,-0.062,0.128,0.103,0.202,0.416,0.606,0.226,0.249,0.189
1,1980,1,0.6,-0.414,0.232,-0.188,0.279,0.373,0.887,-0.274,...,0.412,0.718,0.571,0.147,0.346,0.368,-0.204,0.332,0.138,-0.024
2,1981,0,0.483,-0.351,0.215,0.178,-0.071,0.378,0.495,0.277,...,0.871,0.514,0.5,0.099,0.134,0.49,-0.215,0.443,-0.158,-0.36
3,1982,1,-0.346,0.173,0.399,1.044,0.164,0.359,0.186,0.384,...,-0.343,0.052,0.574,0.072,0.116,0.196,-0.562,0.086,0.34,0.17
4,1983,1,0.164,-0.128,0.56,0.859,0.487,0.046,0.633,1.062,...,0.54,0.552,-0.052,0.215,0.469,0.082,-1.068,0.46,1.064,1.223


In [3]:
# Check the glacier retreat value counts
df_parameters_1["glacier_retreat"].value_counts()
# Small sample (44 rows), roughly balanced between the two classes

glacier_retreat
0    25
1    19
Name: count, dtype: int64
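
With so few rows, it helps to know what accuracy a model would reach by always predicting the more common class. The sketch below computes that baseline from the value counts shown above; the variable names are illustrative only.

```python
# Class counts copied from the value_counts() output above
counts = {0: 25, 1: 19}

total = sum(counts.values())

# The majority class is the label with the largest count
majority_class = max(counts, key=counts.get)

# Accuracy of a model that always predicts the majority class
baseline_accuracy = counts[majority_class] / total

print(f"rows: {total}, majority class: {majority_class}, "
      f"baseline accuracy: {baseline_accuracy:.3f}")
```

Any model accuracy reported later in the notebook is best read against this baseline of roughly 0.57 on the full data set.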

### Step 2: Using the preprocessed data, create the features (`X`) and target (`y`) datasets. The target dataset should be defined by the preprocessed DataFrame column `glacier_retreat`. The remaining columns should define the features dataset.

In [4]:
# Define the target set y using the glacier_retreat column
# Remember that .values creates a numpy array
y = df_parameters_1["glacier_retreat"].values
y

array([0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1,
       1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0],
      dtype=int64)

In [5]:
# Define features set X by selecting remaining columns, drop year and 
# glacier_retreat
X = df_parameters_1.drop(["year","glacier_retreat"],axis=1)

# Review the features DataFrame
display(X.head(3),X.tail(3))

Unnamed: 0,AFG,ALB,DZA,AND,AGO,ARG,AUS,AUT,BHS,BHR,...,USA,VIR,URY,VUT,VEN,VNM,PSE,WLD,ZMB,ZWE
0,0.361,0.203,0.654,0.058,0.291,0.266,0.375,-0.112,0.133,0.856,...,-0.306,-0.062,0.128,0.103,0.202,0.416,0.606,0.226,0.249,0.189
1,0.6,-0.414,0.232,-0.188,0.279,0.373,0.887,-0.274,0.377,0.351,...,0.412,0.718,0.571,0.147,0.346,0.368,-0.204,0.332,0.138,-0.024
2,0.483,-0.351,0.215,0.178,-0.071,0.378,0.495,0.277,-0.03,0.408,...,0.871,0.514,0.5,0.099,0.134,0.49,-0.215,0.443,-0.158,-0.36


Unnamed: 0,AFG,ALB,DZA,AND,AGO,ARG,AUS,AUT,BHS,BHR,...,USA,VIR,URY,VUT,VEN,VNM,PSE,WLD,ZMB,ZWE
41,0.498,1.498,1.926,2.562,1.162,1.123,1.416,2.315,1.611,2.027,...,1.324,1.32,0.89,1.226,1.35,1.477,1.455,1.711,0.891,0.389
42,1.327,1.536,2.33,1.533,1.553,1.031,0.629,1.395,0.879,2.464,...,1.144,0.922,0.79,1.147,0.734,1.114,1.787,1.447,0.822,-0.125
43,2.012,1.518,1.688,3.243,1.212,0.643,0.754,2.498,1.48,2.017,...,1.217,0.894,0.382,1.479,0.533,1.033,1.074,1.394,0.686,-0.49


### Step 3: Split the features and target sets into training and testing datasets.


In [6]:
# Split the preprocessed data into a training and testing dataset
# Assign the function a random_state equal to 13
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.05, random_state=13)
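
The arithmetic behind this split is worth spelling out. As a sketch, assuming scikit-learn rounds a fractional test count up (which is consistent with the support of 3 in the classification report at the end of this notebook), `test_size=0.05` on 44 rows leaves only three test samples:

```python
import math

n_samples = 44      # rows in the DataFrame
test_size = 0.05    # fraction passed to train_test_split

# A float test_size is converted to a row count by rounding up
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

# With n_test samples, accuracy can only take n_test + 1 distinct values
possible_accuracies = [round(k / n_test, 4) for k in range(n_test + 1)]

# Keras uses a default batch size of 32, so this many batches run per epoch
batches_per_epoch = math.ceil(n_train / 32)

print(n_train, n_test, possible_accuracies, batches_per_epoch)
```

With three test samples, one prediction changes the accuracy by a third, so the test metrics in this notebook are very coarse. The two batches per epoch match the `2/2` shown in the training log.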

### Step 4: Use scikit-learn's `StandardScaler` to scale the features data.

In [7]:
# Create a StandardScaler instance
scaler = StandardScaler()

# Fit the scaler to the features training dataset
X_scaler = scaler.fit(X_train)

# Scale the training and testing features with the fitted scaler
X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)
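
The scaling step can be illustrated for a single column. The sketch below uses the first five `AFG` values from the table above and treats the first four as training rows and the fifth as a test row; that assignment is hypothetical and is not the actual split. It applies the same rule as `StandardScaler`: subtract the training mean and divide by the training population standard deviation.

```python
from statistics import fmean, pstdev

# First five AFG values from the DataFrame preview above.
# Which rows are "train" and which are "test" is an assumption for illustration.
train_values = [0.361, 0.6, 0.483, -0.346]
test_values = [0.164]

# Statistics are computed from the training data only
mu = fmean(train_values)
sigma = pstdev(train_values)  # population standard deviation (ddof=0)

# The same mean and standard deviation are applied to both sets
train_scaled = [(x - mu) / sigma for x in train_values]
test_scaled = [(x - mu) / sigma for x in test_values]

print([round(v, 3) for v in train_scaled])
print([round(v, 3) for v in test_scaled])
```

Fitting on the training rows only, as cell 7 does, keeps information about the test rows out of the scaling parameters.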


---

## Compile and Evaluate a Model Using a Neural Network

### Step 1: Create a deep neural network by assigning the number of input features, the number of layers, and the number of neurons on each layer using TensorFlow's Keras.



In [8]:
# Define the number of inputs (features) to the model
number_input_features = len(X_train.columns)
# Review the number of features
number_input_features

150

In [9]:
# Note: layer widths are the KerasTuner values units_0 to units_3

# Define the number of hidden nodes in each layer
hidden_nodes_layer1 = 11
hidden_nodes_layer2 = 6
hidden_nodes_layer3 = 16
hidden_nodes_layer4 = 11

# Define the number of neurons in the output layer
nn_output_layer = 1

In [10]:
# Create the Sequential model instance
nn = tf.keras.models.Sequential()

# Declare the input shape explicitly; passing input_dim to the first Dense
# layer raises a Keras warning
nn.add(tf.keras.Input(shape=(number_input_features,)))

# Add the hidden layers
nn.add(tf.keras.layers.Dense(units=hidden_nodes_layer1, activation="relu"))
nn.add(tf.keras.layers.Dense(units=hidden_nodes_layer2, activation="relu"))
nn.add(tf.keras.layers.Dense(units=hidden_nodes_layer3, activation="relu"))
nn.add(tf.keras.layers.Dense(units=hidden_nodes_layer4, activation="relu"))

# Add the output layer to the model specifying the number of output neurons
# and activation function
nn.add(tf.keras.layers.Dense(units=nn_output_layer, activation="sigmoid"))


In [11]:
# Display the Sequential model summary
nn.summary()
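
The number of trainable parameters in this stack of Dense layers can be worked out by hand. As a sketch: each Dense layer has one weight per input-output pair plus one bias per output. The layer sizes below are taken from cells 8 and 9.

```python
# Input features, four hidden layers, and one output neuron
layer_sizes = [150, 11, 6, 16, 11, 1]

# Each Dense layer has n_in * n_out weights plus n_out biases
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)  # [1661, 72, 112, 187, 12]
print(total_params)      # 2044
```

Roughly two thousand parameters fitted to about forty training rows leaves a lot of room for overfitting, which is worth keeping in mind when reading the test results below.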

### Step 2: Compile and fit the model using the `binary_crossentropy` loss function, the `adam` optimizer, and the `accuracy` evaluation metric.


In [12]:
# Compile the Sequential model
nn.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])

In [13]:
# Fit the model using 100 epochs and the training data

fit_model = nn.fit(X_train_scaled, y_train, epochs=100)

Epoch 1/100
2/2 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.5348 - loss: 0.6898
Epoch 2/100
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.5348 - loss: 0.6786
Epoch 3/100
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.5615 - loss: 0.6705
Epoch 4/100
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.5615 - loss: 0.6635
Epoch 5/100
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.5719 - loss: 0.6566
Epoch 6/100
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.5927 - loss: 0.6457
Epoch 7/100
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.5719 - loss: 0.6412
Epoch 8/100
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.6623 - loss: 0.6293
Epoch 9/100
2/2 ━━━━━━━━━━━━━━━━━━━━

### Step 3: Evaluate the model using the test data to determine the model's loss and accuracy.


In [14]:
# Evaluate the model loss and accuracy metrics using the evaluate method and the test data
model_loss, model_accuracy = nn.evaluate(X_test_scaled,y_test,verbose=2)

# Display the model loss and accuracy results
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")

1/1 - 0s - 108ms/step - accuracy: 0.6667 - loss: 1.1624
Loss: 1.162386417388916, Accuracy: 0.6666666865348816


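Note: This model offers little predictive value. On the three test samples it predicts class 0 every time, so the accuracy of 0.667 only reflects that two of the three samples are class 0.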

Future work: Glacier retreat may lag behind temperature change. Try shifting
the target relative to the features by one or more years.
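
One way to sketch this shift is to pair the features of year t with the retreat label of year t + lag. The example below uses the first five rows of the `WLD` and `glacier_retreat` columns shown earlier; the function `lag_pairs` is a hypothetical helper written for illustration. In pandas, the same alignment could be done with `df["glacier_retreat"].shift(-lag)` followed by dropping the rows that become NaN.

```python
# First five rows of the dataset (year, glacier_retreat, WLD temperature change)
years = [1979, 1980, 1981, 1982, 1983]
retreat = [0, 1, 0, 1, 1]
world_dt = [0.226, 0.332, 0.443, 0.086, 0.46]


def lag_pairs(features, target, lag):
    """Pair features[t] with target[t + lag] (hypothetical helper)."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return list(zip(features, target))
    # Drop the last `lag` feature rows and the first `lag` target rows
    return list(zip(features[:-lag], target[lag:]))


# Temperature change in year t paired with glacier retreat in year t + 1
pairs = lag_pairs(world_dt, retreat, lag=1)
for year, (dt, label) in zip(years, pairs):
    print(f"features from {year} -> retreat label from {year + 1}: {dt}, {label}")
```

Each year of lag removes one row, which matters with only 44 rows available.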

Future work: Vary the feature set X by removing some columns and keeping others.




### Step 4: Save and export the model to file.



In [15]:
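# Set the model's file path
file_path = Path('saved_models/kt_model_3.keras')

# Create the output folder if it does not exist yet
file_path.parent.mkdir(parents=True, exist_ok=True)

# Export your model to a keras file
nn.save(file_path)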

---
### Predict Glacier Retreat by Using your Neural Network Model

### Step 1: Reload the saved model.

In [16]:
# Set the model's file path
file_path = Path('saved_models/kt_model_3.keras')

# Load the model to a new object
nn = tf.keras.models.load_model(file_path)

### Step 2: Make predictions on the testing data and save the predictions to a DataFrame.

In [17]:
# Make predictions with the test data
predictions = nn.predict(X_test_scaled,verbose=2)

# Display a sample of the predictions
predictions[0:5]

1/1 - 0s - 44ms/step


array([[4.5044910e-02],
       [2.5747294e-04],
       [3.2077745e-01]], dtype=float32)
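
The test loss reported in the evaluation step can be reproduced from these three probabilities. The sketch below computes the mean binary cross-entropy by hand. The label order `[1, 0, 0]` is an assumption: the notebook does not print `y_test`, but the classification report shows one positive and two negative samples, and this is the ordering that matches the reported loss.

```python
import math

# Predicted probabilities copied from the output above
probs = [4.5044910e-02, 2.5747294e-04, 3.2077745e-01]

# Assumed test labels (not printed in the notebook); see the note above
y_true = [1, 0, 0]


def binary_crossentropy(labels, probabilities):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    losses = [
        -(y * math.log(p) + (1 - y) * math.log(1 - p))
        for y, p in zip(labels, probabilities)
    ]
    return sum(losses) / len(losses)


loss = binary_crossentropy(y_true, probs)
print(f"{loss:.4f}")
```

Almost all of the loss comes from the single positive sample, which the model scores at about 0.045. A confident wrong prediction is penalised heavily by this loss function.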

In [18]:
# Save the predictions to a DataFrame and round the predictions to binary results
predictions_df = pd.DataFrame(columns=["predictions"], data=predictions)
predictions_df["predictions"] = predictions_df["predictions"].round(0)
predictions_df

Unnamed: 0,predictions
0,0.0
1,0.0
2,0.0


### Step 3: Display a classification report with the y test data and predictions

In [19]:
# Print the classification report with the y test data and predictions
# zero_division=0 reports 0.0 without a warning when a class is never predicted
print(classification_report(y_test, predictions_df["predictions"].values, zero_division=0))

              precision    recall  f1-score   support

           0       0.67      1.00      0.80         2
           1       0.00      0.00      0.00         1

    accuracy                           0.67         3
   macro avg       0.33      0.50      0.40         3
weighted avg       0.44      0.67      0.53         3
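
The per-class numbers in this report can be recomputed by hand, as a sketch. The labels below follow the support column (one positive and two negative samples); their order is an assumption and does not affect these metrics. Every prediction is 0, as shown in the predictions DataFrame above. The function `precision_recall_f1` is a hypothetical helper, not the scikit-learn implementation.

```python
# Labels consistent with the report's support column; order is an assumption
y_true = [1, 0, 0]
# The model predicted class 0 for every test sample
y_pred = [0, 0, 0]


def precision_recall_f1(true, pred, label):
    """Compute precision, recall, and F1 for one class (hypothetical helper)."""
    tp = sum(t == label and p == label for t, p in zip(true, pred))
    fp = sum(t != label and p == label for t, p in zip(true, pred))
    fn = sum(t == label and p != label for t, p in zip(true, pred))
    # Report 0.0 when a denominator is zero
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


for label in (0, 1):
    p, r, f = precision_recall_f1(y_true, y_pred, label)
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Class 1 is never predicted, so its precision has a zero denominator and is reported as 0.00. That is also why scikit-learn emits an undefined-metric warning for this report.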



