# Block 40: Workshop
## Monitoring Jungle Health in National Parks 
## Scenario:   
In the United States, national parks face growing threats, including climate change, deforestation, and human activity. Safeguarding these vital ecosystems for future generations is of utmost importance to the National Park Service. 
## Objective:  
Your objective is to develop a deep learning model that assesses the health of jungles in various national parks across the USA. The health status of a jungle is represented as a binary value: 1 for a healthy jungle and 0 for an unhealthy jungle. 
## Problem Statement:  
To construct a robust model, it is recommended that you implement hyperparameter tuning and incorporate dropout in your neural network architecture. This will help ensure the model's accuracy and generalizability, enabling the National Park Service to effectively monitor and conserve these invaluable ecosystems. 
## Dataset Columns:  

* Park Name: Name of the national park 

* Average Temperature: Average annual temperature of the park in Celsius 

* Rainfall: Annual rainfall in mm. 

* Human Intervention: Number of human-made constructions or interventions in the park (e.g., roads, buildings) 

* Wildlife Population: Number of wild animals spotted in a year in the park 

* Vegetation Density: Percentage of area covered by vegetation. 

* Air Quality Index: Air Quality Index of the Park 

* Water Quality: Quality of water sources in the park on a scale of 1–10 (10 being the cleanest) 

* Jungle Health: 1 if the jungle is healthy, 0 otherwise. 

In [24]:
#pip install keras_tuner


####  The "keras-tuner" library is a tool for hyperparameter tuning of machine learning models built with the Keras library.

### Import the required libraries. 

In [17]:
import pandas as pd
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
import keras_tuner
from keras_tuner import RandomSearch
from sklearn.preprocessing import OneHotEncoder

### Data Preprocessing:
1. Load the dataset.

In [2]:
###
### YOUR CODE HERE
###
df = pd.read_csv('jungle_health_data.csv')
df

Unnamed: 0,Park_Name,Average_Temperature,Rainfall,Human_Intervention,Wildlife_Population,Vegetation_Density,Air_Quality_Index,Water_Quality,Jungle_Health
0,Park_7,15.120311,595.661641,36,4883,62.208080,447.863178,6.539063,0
1,Park_4,24.521912,576.043696,24,4478,79.900456,14.472158,2.734220,0
2,Park_8,21.859258,1992.414515,17,3461,69.255130,119.822892,6.457475,0
3,Park_5,12.044253,1818.071068,18,3726,92.740289,16.194569,7.097170,1
4,Park_7,28.375010,2104.261116,22,1598,97.620765,167.921744,5.731933,0
...,...,...,...,...,...,...,...,...,...
995,Park_10,19.012393,1496.472001,41,4354,83.400163,400.401344,2.387328,0
996,Park_10,27.649189,1313.348140,19,1424,56.666341,316.061214,8.700793,0
997,Park_8,12.113870,2406.040042,37,2670,92.592296,251.756756,5.632026,0
998,Park_2,29.765505,1545.079433,24,2402,90.859553,30.054278,1.951305,0


2. Pre-process the data.

In [18]:
###
### YOUR CODE HERE
###

df = pd.get_dummies(df)
df.replace({True: 1, False: 0}, inplace = True)

df

Unnamed: 0,Average_Temperature,Rainfall,Human_Intervention,Wildlife_Population,Vegetation_Density,Air_Quality_Index,Water_Quality,Jungle_Health,Park_Name_Park_1,Park_Name_Park_10,Park_Name_Park_2,Park_Name_Park_3,Park_Name_Park_4,Park_Name_Park_5,Park_Name_Park_6,Park_Name_Park_7,Park_Name_Park_8,Park_Name_Park_9
0,15.120311,595.661641,36,4883,62.208080,447.863178,6.539063,0,0,0,0,0,0,0,0,1,0,0
1,24.521912,576.043696,24,4478,79.900456,14.472158,2.734220,0,0,0,0,0,1,0,0,0,0,0
2,21.859258,1992.414515,17,3461,69.255130,119.822892,6.457475,0,0,0,0,0,0,0,0,0,1,0
3,12.044253,1818.071068,18,3726,92.740289,16.194569,7.097170,1,0,0,0,0,0,1,0,0,0,0
4,28.375010,2104.261116,22,1598,97.620765,167.921744,5.731933,0,0,0,0,0,0,0,0,1,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,19.012393,1496.472001,41,4354,83.400163,400.401344,2.387328,0,0,1,0,0,0,0,0,0,0,0
996,27.649189,1313.348140,19,1424,56.666341,316.061214,8.700793,0,0,1,0,0,0,0,0,0,0,0
997,12.113870,2406.040042,37,2670,92.592296,251.756756,5.632026,0,0,0,0,0,0,0,0,0,1,0
998,29.765505,1545.079433,24,2402,90.859553,30.054278,1.951305,0,0,0,1,0,0,0,0,0,0,0
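The one-hot encoding step above can also be written so that the indicator columns come out as integers directly, without a separate `replace` call. The following is a minimal sketch on a hypothetical three-row sample (only the column names mirror the workshop dataset), assuming a pandas version in which `get_dummies` accepts `dtype`:

```python
import pandas as pd

# Hypothetical three-row sample; only the column names mirror the workshop dataset.
sample = pd.DataFrame({
    "Park_Name": ["Park_7", "Park_4", "Park_7"],
    "Rainfall": [595.7, 576.0, 2104.3],
})

# Encode only the categorical column and request integer 0/1 indicators directly.
encoded = pd.get_dummies(sample, columns=["Park_Name"], dtype=int)

print(encoded)
```

Naming the column explicitly via `columns=` also documents which feature is being treated as categorical.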


3. Data Splitting
- Split the data into features and target.
- Split the data into training and test sets.

In [24]:
###
### YOUR CODE HERE
###
from sklearn.model_selection import train_test_split
y = df.Jungle_Health
X = df.drop('Jungle_Health', axis=1)

x_train, x_test, y_train, y_test = train_test_split(X, y, train_size=0.8)

x_val, x_test, y_val, y_test = train_test_split(x_test, y_test, train_size=0.5)
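The split above is random on every run and leaves the features on their original scales (rainfall in the thousands, water quality between 1 and 10). One possible refinement, sketched here on synthetic stand-in data rather than the workshop CSV, is to fix `random_state`, stratify on the label, and standardize the features with a scaler fitted on the training split only. The names `X_demo`, `y_demo`, and the `_s` suffixes are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the workshop data: 1000 rows, 3 features on very different scales.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=[20.0, 1500.0, 5.0], scale=[5.0, 600.0, 2.0], size=(1000, 3))
y_demo = (rng.random(1000) < 0.2).astype(int)  # roughly 20% positives

# 80% train, then split the remaining 20% in half for validation and test.
x_tr, x_rest, y_tr, y_rest = train_test_split(
    X_demo, y_demo, train_size=0.8, stratify=y_demo, random_state=42)
x_va, x_te, y_va, y_te = train_test_split(
    x_rest, y_rest, train_size=0.5, stratify=y_rest, random_state=42)

# Fit the scaler on the training split only, then reuse it for the other splits.
scaler = StandardScaler().fit(x_tr)
x_tr_s, x_va_s, x_te_s = (scaler.transform(part) for part in (x_tr, x_va, x_te))

print(x_tr_s.shape, x_va_s.shape, x_te_s.shape)
```

Fitting the scaler on the training split alone keeps validation and test statistics out of the preprocessing, so the later evaluation is not influenced by held-out data.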

4. Define the model-building function. 
- Define a function to build a neural network model with hyperparameters.
- Create a sequential model (a linear stack of layers).
- Compile the model.

In [22]:
# Define a function to build a neural network model with hyperparameters
def build_model(hp):
    # Create a Sequential model (a linear stack of layers)
    model = keras.Sequential()
    
    # Input layer
    model.add(layers.InputLayer(input_shape=(x_train.shape[1],)))
    
    # Hidden layers: The number of hidden layers and their properties are determined by a hyperparameter search.
    for i in range(hp.Int('num_layers', 1, 5)):  # Iterate over a range of possible numbers of hidden layers (1 to 5).
        # Add a dense (fully connected) layer with variable units and ReLU activation.
        model.add(layers.Dense(units=hp.Int('units_' + str(i), 32, 256, 32), activation='relu'))
        
        # Add a dropout layer with variable dropout rate.
        model.add(layers.Dropout(rate=hp.Float('dropout_' + str(i), 0.0, 0.5, step=0.1)))
    
    # Output layer: A single neuron with sigmoid activation for binary classification.
    model.add(layers.Dense(1, activation='sigmoid'))
    
    # Compile the model:
    # - Use the Adam optimizer with a variable learning rate.
    # - Binary cross-entropy is used as the loss function for binary classification.
    # - Accuracy is used as a metric for evaluation.
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),  # Learning rate is a hyperparameter choice.
        loss='binary_crossentropy',  # Binary cross-entropy loss for binary classification.
        metrics=['accuracy'])  # Model performance metric.
    
    return model  # Return the compiled model.
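To make the role of the `Dropout` layers in `build_model` more concrete, here is a small NumPy sketch of inverted dropout, the scheme in which surviving activations are scaled by `1 / (1 - rate)` during training and nothing is changed at inference. The helper name `dropout` and its arguments are assumptions for illustration; this is not the Keras implementation itself.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: zero a fraction `rate` of units and rescale the survivors."""
    if not training or rate == 0.0:
        return x  # inference: activations pass through unchanged
    keep_mask = rng.random(x.shape) >= rate   # keep each unit with probability 1 - rate
    return x * keep_mask / (1.0 - rate)       # rescale so the expected value is preserved

rng = np.random.default_rng(0)
activations = np.ones(100_000)

train_out = dropout(activations, rate=0.3, training=True, rng=rng)
infer_out = dropout(activations, rate=0.3, training=False, rng=rng)

print(train_out.mean(), infer_out.mean())
```

Because the survivors are rescaled, the average activation stays close to its original value during training, which is why no extra correction is needed when the model is evaluated.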

5. Hyperparameter Tuning:  
- Import the necessary Keras Tuner module.
- Define the tuner with hyperparameter tuning configuration.
- Summarize the search space, showing the range of hyperparameters to explore.
- Perform the hyperparameter search by training the models.
- Retrieve the optimal hyperparameters from the search results (best combination).

In [25]:
# Import the necessary Keras Tuner module
from keras_tuner import RandomSearch

# Define the tuner with hyperparameter tuning configuration
tuner = RandomSearch(
    hypermodel=build_model,
    objective='val_loss',
    max_trials=5,
    directory='results',
    project_name='custom_training',
)

# Perform the hyperparameter search by training the models.
# The held-out validation split (x_val, y_val) is used to score each trial.
tuner.search(x_train, y_train, epochs=5, validation_data=(x_val, y_val))

# Summarize the search space, showing the range of hyperparameters to explore.
# Called after the search so that per-layer hyperparameters registered during the trials are listed too.
tuner.search_space_summary()

# Retrieve the best model and the optimal hyperparameters (best combination) from the search results
best_model = tuner.get_best_models()[0]
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

Trial 5 Complete [00h 00m 02s]
val_loss: 1.0335578918457031

Best val_loss So Far: 0.12092425674200058
Total elapsed time: 00h 00m 12s
Search space summary
Default search space size: 12
num_layers (Int)
{'default': None, 'conditions': [], 'min_value': 1, 'max_value': 5, 'step': 1, 'sampling': 'linear'}
units_0 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 256, 'step': 32, 'sampling': 'linear'}
dropout_0 (Float)
{'default': 0.0, 'conditions': [], 'min_value': 0.0, 'max_value': 0.5, 'step': 0.1, 'sampling': 'linear'}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}
units_1 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 256, 'step': 32, 'sampling': 'linear'}
dropout_1 (Float)
{'default': 0.0, 'conditions': [], 'min_value': 0.0, 'max_value': 0.5, 'step': 0.1, 'sampling': 'linear'}
units_2 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 256, 'step': 32, 'sam
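
The idea behind random search can be sketched in plain Python: draw a handful of configurations from the search space, score each one, and keep the best. The sketch below mirrors the ranges used in `build_model`, but `toy_objective` is a hypothetical stand-in for validation loss and none of this reflects how Keras Tuner is implemented internally. It also shows why a search space with up to five layers has 12 named hyperparameters, which matches the "Default search space size: 12" line in the output.

```python
import random

def sample_config(rng):
    """Draw one configuration from a search space shaped like the one in build_model."""
    config = {
        "num_layers": rng.randint(1, 5),                 # 1 to 5 hidden layers, inclusive
        "learning_rate": rng.choice([1e-2, 1e-3, 1e-4]),
    }
    # Per-layer hyperparameters exist only for the layers that are actually built.
    for i in range(config["num_layers"]):
        config[f"units_{i}"] = rng.choice(range(32, 257, 32))   # 32, 64, ..., 256
        config[f"dropout_{i}"] = rng.choice([0.0, 0.1, 0.2, 0.3, 0.4, 0.5])
    return config

def toy_objective(config):
    """Hypothetical stand-in for validation loss; a real run would train a model here."""
    total_units = sum(v for k, v in config.items() if k.startswith("units_"))
    return abs(total_units - 300) / 300 + config["learning_rate"]

rng = random.Random(0)
trials = [sample_config(rng) for _ in range(5)]   # corresponds to max_trials = 5
scores = [toy_objective(c) for c in trials]
best_config = trials[scores.index(min(scores))]

# With five layers there are 1 + 1 + 5 + 5 = 12 named hyperparameters in total.
max_names = (
    {"num_layers", "learning_rate"}
    | {f"units_{i}" for i in range(5)}
    | {f"dropout_{i}" for i in range(5)}
)
print(best_config, len(max_names))
```

With only five trials, a large part of such a space is never visited, so the "best" configuration is best among the sampled ones rather than a global optimum.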

6. Model Training
- Build a model using the best hyperparameters found during the hyperparameter tuning.
- Train the model on the training data.

In [29]:
# Build a model using the best hyperparameters found during the hyperparameter tuning.
model = tuner.hypermodel.build(best_hps)

# Train the model on the training data.
# - x_train and y_train are the training data and labels.
# - epochs: Number of full passes over the training data.
# - validation_split: Fraction of the training data held out for validation during training.
model.fit(x_train, y_train, epochs=10, validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7f04ca4af8e0>

7. Model Evaluation
- Evaluate the trained model on the test dataset to assess its performance.
- Use the model's evaluate method to calculate the test loss and test accuracy.
- Print the test accuracy as a percentage.

In [46]:
# Evaluate the trained model on the test dataset to assess its performance.

# Use the model's evaluate method to calculate the test loss and test accuracy.
# - x_test and y_test are the test data and labels.
final_scores = model.evaluate(x_test, y_test)

# Print the test loss as a plain number and the test accuracy as a percentage.
# - The loss is a binary cross-entropy value, not a fraction, so it is not formatted as a percentage.
# - The accuracy is a fraction between 0 and 1, so it is converted to a percentage using formatting.
print(f'Final Model Test Loss = {final_scores[0]:.5f}')
print(f'Final Model Test Accuracy = {final_scores[1]:.3%}')

Final Model Test Loss = 0.21338
Final Model Test Accuracy = 93.000%
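
The two numbers returned by `model.evaluate` measure different things: accuracy is a fraction between 0 and 1, while binary cross-entropy is an unbounded average of negative log-probabilities. The sketch below computes both by hand on hypothetical labels and predicted probabilities. The helper names, the clipping constant `eps`, and the 0.5 threshold are assumptions for illustration, not a reproduction of the Keras internals.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

# Hypothetical labels and predicted probabilities, for illustration only.
labels = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.8, 0.4]

loss = binary_cross_entropy(labels, probs)
acc = accuracy(labels, probs)
print(f"loss = {loss:.5f}, accuracy = {acc:.1%}")
```

Every prediction in this toy example falls on the correct side of the threshold, yet the loss is still above zero because the probabilities are not fully confident. A single confidently wrong prediction can push the loss above 1, which is why it is not meaningful as a percentage.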
