#### **Introduction**

This notebook documents a project focused on predicting forest cover types using the [Kaggle dataset: Forest Cover Type Prediction Dataset](https://www.kaggle.com/competitions/forest-cover-type-prediction/data). The project is part of the **Big Data** module of ENIT's 3rd-year MIndS and is undertaken by **Group 4**: Chaima Balti, Roukaya Lakhzouri, and Salsabil Rouahi. We are working under the supervision of our professor, **Moez Ben Haj Hmida**.


#### Load the Scikit-learn model


In [1]:
import gradio as gr
import numpy as np
import joblib
import os


In [2]:

folder_path = 'models_folder'
model_filename = os.path.join(folder_path, 'Random_Forest.joblib')
model = joblib.load(model_filename)
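
If the model file is not where the notebook expects it, a loader that checks the path first makes it clear which file is missing. The following is a minimal sketch, assuming the same `models_folder/Random_Forest.joblib` layout as above; the helper name `load_model` is hypothetical and not part of the original notebook.

```python
import os


def load_model(folder_path, filename):
    """Load a joblib-serialized model, failing early with a clear message."""
    model_path = os.path.join(folder_path, filename)
    if not os.path.isfile(model_path):
        raise FileNotFoundError(f"Model file not found: {model_path}")
    # Imported here so the path check runs before joblib is needed.
    import joblib
    return joblib.load(model_path)
```

It would be called as `model = load_model('models_folder', 'Random_Forest.joblib')`. With recent scikit-learn versions, a fitted estimator also exposes `n_features_in_`, which can be compared with the number of values the interface sends.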


#### Define The Prediction Function


This function predicts the forest cover type for a given set of environmental and geographic features using a previously trained Random Forest model. It returns the path of the image that corresponds to the predicted cover type, so the prediction can be displayed directly.

In [3]:

# Define the classes
# Map each class to its corresponding image path
CLASSES = {
    'Spruce/Fir': "images/Spruce.jpg",
    'Lodgepole Pine': "images/Lodgepole Pine.jpg",
    'Ponderosa Pine': "images/Ponderosa Pine.jpg",
    'Cottonwood/Willow': "images/Cottonwood.jpg",
    'Aspen': "images/Aspen.jpg",
    'Douglas-fir': "images/Douglas-fir.jpg",
    'Krummholz': "images/krummholz.jpg"
}

def predict_forest_cover(Id, Elevation, Aspect, Slope, Horizontal_Distance_To_Hydrology,
                         Vertical_Distance_To_Hydrology, Horizontal_Distance_To_Roadways,
                         Hillshade_9am, Hillshade_Noon, Hillshade_3pm,
                         Horizontal_Distance_To_Fire_Points, Wilderness_Area1, Wilderness_Area2,
                         Wilderness_Area3, Wilderness_Area4, *Soil_Types):
    # Assemble features in the order the interface sends them (Id first)
    features = [
        Id, Elevation, Aspect, Slope, Horizontal_Distance_To_Hydrology,
        Vertical_Distance_To_Hydrology, Horizontal_Distance_To_Roadways,
        Hillshade_9am, Hillshade_Noon, Hillshade_3pm,
        Horizontal_Distance_To_Fire_Points, Wilderness_Area1, Wilderness_Area2,
        Wilderness_Area3, Wilderness_Area4, *Soil_Types
    ]
    
    # Convert to numpy array and reshape for sklearn
    features = np.array(features).reshape(1, -1)

    # Predict using the trained model
    prediction = model.predict(features)
    
    # Map the numeric prediction to the class label
    class_label = list(CLASSES.keys())[int(prediction[0]) - 1]

    return CLASSES[class_label]
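
The lookup above relies on the insertion order of `CLASSES` matching the numeric labels 1 to 7, and an unexpected label of 0 would silently select the last entry through negative indexing. A more explicit variant could look like the sketch below. It assumes the model outputs the Kaggle `Cover_Type` codes; the names `COVER_TYPE_NAMES`, `CLASS_IMAGES`, and `image_for_label` are hypothetical.

```python
# Cover_Type codes as used in the Kaggle dataset (1 to 7).
COVER_TYPE_NAMES = {
    1: 'Spruce/Fir',
    2: 'Lodgepole Pine',
    3: 'Ponderosa Pine',
    4: 'Cottonwood/Willow',
    5: 'Aspen',
    6: 'Douglas-fir',
    7: 'Krummholz',
}

# Same image paths as the CLASSES dictionary in the notebook.
CLASS_IMAGES = {
    'Spruce/Fir': "images/Spruce.jpg",
    'Lodgepole Pine': "images/Lodgepole Pine.jpg",
    'Ponderosa Pine': "images/Ponderosa Pine.jpg",
    'Cottonwood/Willow': "images/Cottonwood.jpg",
    'Aspen': "images/Aspen.jpg",
    'Douglas-fir': "images/Douglas-fir.jpg",
    'Krummholz': "images/krummholz.jpg",
}


def image_for_label(label):
    """Return the image path for a numeric cover type, rejecting unknown codes."""
    label = int(label)
    if label not in COVER_TYPE_NAMES:
        raise ValueError(f"Unexpected cover type label: {label}")
    return CLASS_IMAGES[COVER_TYPE_NAMES[label]]
```

Inside `predict_forest_cover`, the last two lines could then be replaced by `return image_for_label(prediction[0])`.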



##### Define Gradio interface

In [4]:
import random 

This section defines a Gradio-based web interface to interact with the trained Random Forest model for forest cover type prediction. Users can manually input features or generate random features, and the app displays the corresponding forest cover type as an image.

### Key Components:
#### Random Feature Generator:

The `generate_random_features` function creates a set of random input features for testing purposes. It generates values for:

- A record Id, followed by environmental and geographic data such as Elevation, Aspect, Slope, etc.
- Binary values for the 4 wilderness areas and the 40 soil types, each drawn independently.

In [5]:
# Function to generate random features
def generate_random_features():
    random_features = [
        random.randint(0, 10000),  # Id
        random.randint(0, 4000),  # Elevation
        random.randint(0, 360),  # Aspect
        random.randint(0, 90),   # Slope
        random.randint(0, 5000), # Horizontal Distance to Hydrology
        random.randint(-500, 500),  # Vertical Distance to Hydrology
        random.randint(0, 10000),  # Horizontal Distance to Roadways
        random.randint(0, 255),  # Hillshade 9am
        random.randint(0, 255),  # Hillshade Noon
        random.randint(0, 255),  # Hillshade 3pm
        random.randint(0, 10000),  # Horizontal Distance to Fire Points
        random.choice([0, 1]),   # Wilderness Area 1
        random.choice([0, 1]),   # Wilderness Area 2
        random.choice([0, 1]),   # Wilderness Area 3
        random.choice([0, 1])    # Wilderness Area 4
    ] + [random.choice([0, 1]) for _ in range(40)]  # Soil Types
    return random_features
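
Because each binary flag above is drawn independently, a random sample can have several wilderness areas or soil types set at once. Assuming that these columns are one-hot encoded in the training data (exactly one wilderness area and one soil type per record), the sketch below draws flags that respect that constraint. The helper names `random_one_hot` and `generate_one_hot_flags` are hypothetical.

```python
import random


def random_one_hot(size):
    """Return a list of `size` zeros with a single 1 at a random position."""
    values = [0] * size
    values[random.randrange(size)] = 1
    return values


def generate_one_hot_flags():
    """Random wilderness-area and soil-type flags with one active column each."""
    return random_one_hot(4) + random_one_hot(40)
```

The result has 44 entries and could replace the four `random.choice` lines and the soil-type list comprehension in `generate_random_features`.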

#### Gradio Interface:

The app is built using Gradio's Blocks layout, which provides an intuitive interface for inputting data and visualizing predictions.

Interface Design:

##### Input Fields:
- Numeric inputs for environmental features (e.g., Elevation, Aspect, Slope).
- Checkboxes for wilderness areas and soil types.

##### Buttons:
- A "Predict" button to submit inputs for forest cover prediction.
- A "Generate Random Features" button to auto-fill inputs with randomly generated values.

##### Output:
An image display to show the predicted forest cover type.
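
The interface passes its values to the prediction function by position: first the numeric fields, then the wilderness checkboxes, then the soil-type checkboxes. The sketch below spells out that order as a list of names. The names mirror the Kaggle column names and the `Id` field that the interface defines; `FEATURE_ORDER` and the other constants are hypothetical helpers, and whether the saved model was trained on exactly these columns is an assumption.

```python
NUMERIC_FEATURES = [
    "Id", "Elevation", "Aspect", "Slope",
    "Horizontal_Distance_To_Hydrology", "Vertical_Distance_To_Hydrology",
    "Horizontal_Distance_To_Roadways",
    "Hillshade_9am", "Hillshade_Noon", "Hillshade_3pm",
    "Horizontal_Distance_To_Fire_Points",
]
WILDERNESS_FEATURES = [f"Wilderness_Area{i}" for i in range(1, 5)]
SOIL_FEATURES = [f"Soil_Type{i}" for i in range(1, 41)]

# Positional order of the values sent by the interface: 11 + 4 + 40 = 55.
FEATURE_ORDER = NUMERIC_FEATURES + WILDERNESS_FEATURES + SOIL_FEATURES
```

Comparing `len(FEATURE_ORDER)` with the number of Gradio components is a quick way to catch a field that was added or dropped on one side only.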

In [6]:


# Define the Gradio interface
with gr.Blocks() as app:
    with gr.Row():
        gr.Markdown("### Forest Cover Prediction")
    
    # Inputs for numeric fields
    input_fields = []
    with gr.Row():
        input_fields.append(gr.Number(label="Id", interactive=True, precision=0))
        input_fields.append(gr.Number(label="Elevation", interactive=True))
        input_fields.append(gr.Number(label="Aspect", interactive=True))
    with gr.Row():
        input_fields.append(gr.Number(label="Slope", interactive=True))
        input_fields.append(gr.Number(label="Horizontal Distance to Hydrology", interactive=True))
        input_fields.append(gr.Number(label="Vertical Distance to Hydrology", interactive=True))
    with gr.Row():
        input_fields.append(gr.Number(label="Horizontal Distance to Roadways", interactive=True))
        input_fields.append(gr.Number(label="Hillshade 9am", interactive=True))
        input_fields.append(gr.Number(label="Hillshade Noon", interactive=True))
    with gr.Row():
        input_fields.append(gr.Number(label="Hillshade 3pm", interactive=True))
        input_fields.append(gr.Number(label="Horizontal Distance to Fire Points", interactive=True))

    # Wilderness areas
    wilderness_areas = []
    with gr.Row():
        wilderness_areas.append(gr.Checkbox(label="Wilderness Area 1"))
        wilderness_areas.append(gr.Checkbox(label="Wilderness Area 2"))
        wilderness_areas.append(gr.Checkbox(label="Wilderness Area 3"))
        wilderness_areas.append(gr.Checkbox(label="Wilderness Area 4"))

    # Soil types
    soil_types = []
    for i in range(0, 40, 4):
        with gr.Row():
            for j in range(4):
                soil_types.append(gr.Checkbox(label=f"Soil Type {i + j + 1}"))

    # Buttons and outputs
    with gr.Row():
        predict_button = gr.Button("Predict")
        generate_button = gr.Button("Generate Random Features")
    
    output = gr.Image(label="Cover Type")

    # Callback that fills every input with a randomly generated value
    def on_generate():
        random_values = generate_random_features()
        return [gr.update(value=value) for value in random_values]

    generate_button.click(on_generate, inputs=None, outputs=[*input_fields, *wilderness_areas, *soil_types])
    predict_button.click(predict_forest_cover, inputs=[*input_fields, *wilderness_areas, *soil_types], outputs=output)

# Launch the app
app.launch()

* Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
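
If a numeric field is left empty when "Predict" is clicked, the prediction function may receive a missing value and fail inside NumPy or scikit-learn. The sketch below is one way to detect that before predicting, assuming an empty field reaches the callback as `None`; the helper name `find_missing` is hypothetical.

```python
def find_missing(values, names):
    """Return the names of inputs whose value is None."""
    return [name for name, value in zip(names, values) if value is None]
```

A caller could stop and report the returned names when the list is not empty. Values such as `0` and `False` are kept, since they are valid inputs.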


