<p>
  <img src="index/arcgis_classify_logo.jpg" alt="Logo" width="200">
</p>

# ArcGIS_Classify
* ArcGIS_Classify is a Python-based Jupyter Notebook for ArcGIS Pro for quick and easy land-use classifications.
* This project was created as a final submission for the course "Python in QGIS and ArcGIS Pro" by Sven Harpering and Philippe Rieffel at Universität Münster.
* The authors are Jonas Starke and Kieran Galbraith. For questions, contact us at: jstarke@uni-muenster.de or k_galb01@uni-muenster.de.
* More info at: github.com/kgalb01/ArcGIS_Classify

## 1 What are Land Use / Land Cover Classifications (LULC)?
<details open>
* LULC classifications identify the physical material on the surface of the earth, such as forests or water
* Identification is based mostly on reflective properties captured in multispectral imagery
    - Images are captured by satellites, like Sentinel-2, or by airborne drones

![MultiSpectralImagery_Example](index/multi_spec_example.png)

<sup>*Figure 1: Diagram showing how multispectral images are captured. Source: „Fernerkundung und maschinelle Lernverfahren“ by Hanna Meyer, 2021*</sup>

* The classification is often done with a model based on a machine learning or deep learning algorithm
    * For model training, an algorithm is fed training data that includes labeled examples of different land cover types
    * This helps the algorithm recognize patterns and features associated with each type
* Based on the model, it is then possible to create a prediction map in which previously unlabeled data is classified, enabling efficient, large-scale land use and land cover mapping
* A finished, very basic LULC could look like this:

![LULC_Example](index/lulc_terra-classifier_example.jpg)

<sup>*Figure 2: Screenshot showing a LULC from Münster using a model trained with data from Dortmund. Source: Terra Classifier App created for the course "Geosoftware II" by Edzer Pebesma & Christian Knoth, 2024*</sup>
</details>
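
The training and prediction workflow described above can be sketched in a few lines. This is a minimal illustration, not the notebook's actual pipeline: it assumes scikit-learn's `RandomForestClassifier` as the model, and the reflectance values in `class_means` are made-up numbers chosen only so the three classes are easy to separate.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Hypothetical mean reflectance per class for four bands (red, green, blue, NIR)
class_means = {
    "water": [0.03, 0.05, 0.07, 0.02],
    "forest": [0.04, 0.07, 0.03, 0.40],
    "urban": [0.25, 0.23, 0.22, 0.28],
}

# Build labeled training pixels by adding noise around each class mean
X_parts, y_train = [], []
for label, means in class_means.items():
    X_parts.append(rng.normal(loc=means, scale=0.01, size=(50, 4)))
    y_train.extend([label] * 50)
X_train = np.vstack(X_parts)
y_train = np.array(y_train)

# Train the model on the labeled examples
model = RandomForestClassifier(n_estimators=50, random_state=42)
model.fit(X_train, y_train)

# "Unlabeled" image of 20 x 30 pixels with 4 bands
image = rng.normal(loc=class_means["forest"], scale=0.01, size=(20, 30, 4))

# Flatten to (n_pixels, n_bands), predict, and reshape back into a map
pixels = image.reshape(-1, 4)
prediction_map = model.predict(pixels).reshape(20, 30)
print(prediction_map.shape)
```

Each cell of `prediction_map` holds the predicted class label for the corresponding pixel, which is what a prediction map is in its simplest form.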

## 2 Necessities
* For a proper LULC, you'll need:
    1. An area of interest (AOI) and remote sensing data of the AOI
        - We recommend Sentinel-2 data, as it is easy to get and easy to use
        - More info here: [Copernicus Browser](https://browser.dataspace.copernicus.eu)
    2. An area of training (AOT) and remote sensing data of the AOT (can be the same as the area of interest, but doesn't have to be)
    3. Training data from the area of training: A .GeoJSON or .gpkg file containing labeled examples of land cover types of your area of training
        - NOTE: You'll need at least three entries for each label and at least three labels for the LULC to work properly. The more you have, the better.
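
The minimum requirements from the note above (at least three labels, at least three entries per label) can be checked before training. The following is a small sketch using only the standard library. The helper name `check_training_features` is hypothetical, and it assumes each feature stores its class name in a `Label` property, as the drawing tool in section 3.1 does.

```python
from collections import Counter

MIN_LABELS = 3
MIN_ENTRIES_PER_LABEL = 3

def check_training_features(features, label_field="Label"):
    """Return a list of human-readable problems; an empty list means OK."""
    counts = Counter(f["properties"][label_field] for f in features)
    problems = []
    if len(counts) < MIN_LABELS:
        problems.append(
            f"Only {len(counts)} label(s) found, need at least {MIN_LABELS}."
        )
    for label, n in sorted(counts.items()):
        if n < MIN_ENTRIES_PER_LABEL:
            problems.append(
                f"Label '{label}' has {n} entries, need at least {MIN_ENTRIES_PER_LABEL}."
            )
    return problems

# Example: three labels, but "urban" has only two entries
features = (
    [{"properties": {"Label": "water"}}] * 3
    + [{"properties": {"Label": "forest"}}] * 3
    + [{"properties": {"Label": "urban"}}] * 2
)
print(check_training_features(features))
# → ["Label 'urban' has 2 entries, need at least 3."]
```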

In the following part, we'll start with a detailed tutorial. If you're already fairly familiar with training data, you can skip to the code block in "3.1 Integrating training data" and hit "Run".

But first things first: hit the "Run" button on the next code block to install all necessary modules and packages, so we can be sure that everything works as intended.


In [None]:
# Install necessary packages
%pip install ipywidgets ipyfilechooser ipyleaflet shapely numpy gdal pyproj matplotlib geopandas scikit-learn joblib rasterio seaborn

In [2]:
# Import all necessary packages

# Import standard libraries
import os
import json
import pickle

# Import IPython and Jupyter-related packages
from IPython.display import display, clear_output
import ipywidgets as widgets
import ipyfilechooser as filechooser

# Import geospatial and mapping packages
from ipyleaflet import Map, basemaps, GeoJSON, DrawControl, basemap_to_tiles, Polygon as LeafletPolygon
import geopandas as gpd
from shapely.geometry import shape, mapping
from pyproj import Proj, transform
from osgeo import gdal

# Import raster and image processing packages
import rasterio
from rasterio.windows import from_bounds
import rasterio.mask

# Import machine learning and data processing packages
import numpy as np
import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Import visualization packages
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
from matplotlib.backends.backend_pdf import PdfPages
import seaborn as sns

In [31]:
# In case you want to start all over again, hit the "Run" button on this cell.
%reset -f

## 3 Tutorial: Create training data
<details open>

* NOTE: This is not the ideal way of creating training data. The ideal way is through ground truthing, but for testing purposes this method should be fine.

* If you run into trouble creating your own training data and want to try a different method, you can follow this tutorial on [YouTube](https://www.youtube.com/watch?v=O-yYfS1EFxg)
* Step 1: 
    - Run the following code block to either a) upload already existing training data, b) use the example data included in this project, or c) create new data.

![train_example_1](index/train_example_1.jpg)

<sup>*Figure 6: Screenshot from the UI for the training data upload*</sup>

* Step 2: 
    - NOTE: If you integrated your own training data or used the existing one, you can skip this step.
    - Search for your area of training, which means the area you want to use to train your model.
        * The more similar the area of interest and the area of training are to each other, the more accurate your results will be.
    - Scan the map to find representative examples of each class and use the provided map tools to create labeled points or polygons.
        * HINT: You can hit "Toggle Map" on top of the map to change the layer to a satellite map. This might help with the accuracy of your training data.
    - Once you feel like you have enough classes and entries for each class, hit the "Save GeoJSON" button on top of the map and navigate to where you want to save your data.
        * NOTE: You'll need to name your file "your_filename.geojson" to properly save your work!

![train_example_2](index/train_example_2.jpg)

<sup>*Figure 7: Screenshot from the UI for the training data creation process*</sup>

* Step 3:
    - The created training data should look something like the following:

![train_example_0](index/train_example_0.jpg)

<sup>*Figure 8: Screenshot from the training data from Dortmund*</sup>

Your training data should now be integrated into the system. To test that, move on to the next code block and hit "Run".
</details>
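
For reference, the structure of a training file can be sketched as follows. The properties `Label`, `ClassID`, and `fid` match what the drawing tool in section 3.1 writes for each feature. The helper `make_feature`, the coordinates, and the class names are hypothetical values chosen for illustration.

```python
import json

def make_feature(lon, lat, label, class_id, fid):
    """Build one labeled GeoJSON point feature (coordinates are lon, lat)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"Label": label, "ClassID": class_id, "fid": fid},
    }

# Hypothetical example points; ClassID is a string because it comes from a text input
features = [
    make_feature(7.4650, 51.5136, "Urban", "1", 1),
    make_feature(7.4205, 51.5652, "Forest", "2", 2),
]

collection = {"type": "FeatureCollection", "features": features}
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Writing `geojson_text` to a file ending in `.geojson` gives a file that tools such as `geopandas.read_file` should be able to open.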

### 3.1 Integrating training data

In [None]:
# Incorporate training data for the machine learning model


# Global variables to store the paths of the TIF and GeoJSON files
uploaded_tif_path = None
uploaded_geojson_path = None

# Define the button widgets
button_set_path = widgets.Button(description="Set path")  # Button to set the path for training data
button_use_example = widgets.Button(description="Example data")  # Button to use example data
button_create_new = widgets.Button(description="New data")  # Button to create new data
button_abort = widgets.Button(description="Cancel")  # Button to abort the operation
button_save = widgets.Button(description="Save GeoJSON", disabled=True)  # Button to save the GeoJSON
button_toggle_map = widgets.ToggleButton(description="Toggle Map", value=False)  # Toggle button to switch map layers

# Variable to store the training data path
training_data_path = ""
geojson_data = None

# Define the file chooser widget
fc = filechooser.FileChooser()

# Container for storing drawn features
drawn_features = []

# Create a map centered on Münster
m = Map(center=(51.9606649, 7.6261347), zoom=12)

# Create draw control and add it to the map
draw_control = DrawControl()
m.add_control(draw_control)

# Widgets for user input
label_input = widgets.Text(description="Label:")  # Text input for label
class_id_input = widgets.Text(description="ClassID:")  # Text input for class ID
submit_button = widgets.Button(description="Submit")  # Button to submit user input

# Container for user input widgets
input_widgets = widgets.VBox([label_input, class_id_input, submit_button])
display(input_widgets)
input_widgets.layout.display = 'none'

# Define the satellite layer and OSM layer
satellite_layer = basemap_to_tiles(basemaps.Esri.WorldImagery)
osm_layer = basemap_to_tiles(basemaps.OpenStreetMap.Mapnik)

# Add initial OSM layer to the map
m.add_layer(osm_layer)

# Handle drawing created event
def handle_draw_created(event=None, action=None, geo_json=None):
    """
    Function to handle the event when a feature is drawn on the map.
    Parameters:
    - event: The event object.
    - action: The action performed.
    - geo_json: The GeoJSON representation of the drawn feature.
    """
    global current_feature
    current_feature = geo_json  # Extract the geojson data from the event
    input_widgets.layout.display = 'block'  # Show input widgets for label and class ID

# Attach the draw created event to the draw control
draw_control.on_draw(handle_draw_created)

# Handle submit button click
def on_submit_clicked(b):
    """
    Function to handle the event when the submit button is clicked.
    Parameters:
    - b: The button object representing the submit button.
    Global Variables:
    - drawn_features: A list to store the drawn features.
    - current_feature: A dictionary representing the current feature being submitted.
    """
    global drawn_features, current_feature
    current_feature['properties'] = {
        'Label': label_input.value,
        'ClassID': class_id_input.value,
        'fid': len(drawn_features) + 1
    }
    drawn_features.append(current_feature)
    
    # Clear inputs
    label_input.value = ''
    class_id_input.value = ''
    input_widgets.layout.display = 'none'
    
    # Enable save button
    button_save.disabled = False

submit_button.on_click(on_submit_clicked)

# Define the button click event handlers
def on_set_path_clicked(b):
    """
    Function to handle the event when the set path button is clicked.
    Parameters:
    - b: The button object representing the set path button.
    """
    display(fc)
    fc.register_callback(on_file_chosen)

def on_file_chosen(chooser):
    """
    Function to handle the event when a file is chosen using the file chooser.
    Parameters:
    - chooser: The file chooser object.
    Global Variables:
    - training_data_path: The path of the chosen file.
    """
    global training_data_path, uploaded_geojson_path, geojson_data
    training_data_path = chooser.selected
    uploaded_geojson_path = training_data_path  # Set global variable for GeoJSON path
    print(f"Training data path set to: {training_data_path}")
    load_geojson()

def on_use_example_clicked(b):
    """
    Function to handle the event when the use example button is clicked.
    Parameters:
    - b: The button object representing the use example button.
    Global Variables:
    - training_data_path: The path of the example data file.
    """
    global training_data_path, uploaded_geojson_path, geojson_data
    training_data_path = "Data/training_dortmund.geojson"
    uploaded_geojson_path = training_data_path  # Set global variable for GeoJSON path
    print(f"Example data path set to: {training_data_path}")
    load_geojson()

def on_create_new_clicked(b):
    """
    Function to handle the event when the create new button is clicked.
    Parameters:
    - b: The button object representing the create new button.
    """
    print("Create new data clicked")
    clear_output()
    display(button_save)
    display(button_toggle_map)
    display(m)
    display(input_widgets)

def on_abort_clicked(b):
    # Clear the output to hide the buttons
    clear_output()
    print("User aborted the operation")

def load_geojson():
    global geojson_data
    try:
        geojson_data = gpd.read_file(uploaded_geojson_path)
        print("GeoJSON data loaded successfully.")
    except Exception as e:
        print(f"Error loading GeoJSON data: {e}")

def save_geojson(geojson_data, path):
    """
    Function to save the GeoJSON data to a file.
    Parameters:
    - geojson_data: The GeoJSON data to be saved.
    - path: The path of the file to save the GeoJSON data.
    """
    with open(path, 'w') as f:
        json.dump(geojson_data, f)
    print(f"GeoJSON data saved to: {path}")

def on_save_clicked(b):
    """
    Function to handle the event when the save button is clicked.
    Parameters:
    - b: The button object representing the save button.
    """
    global drawn_features
    clear_output()
    fc_save = filechooser.FileChooser()
    display(fc_save)
    fc_save.register_callback(on_save_path_chosen)

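def on_save_path_chosen(chooser):
    """
    Function to handle the event when a save path is chosen using the file chooser.
    Parameters:
    - chooser: The file chooser object.
    Global Variables:
    - training_data_path, uploaded_geojson_path: Updated to the saved file so it can be reloaded.
    """
    global training_data_path, uploaded_geojson_path
    path = chooser.selected
    feature_collection = {
        "type": "FeatureCollection",
        "features": drawn_features
    }
    save_geojson(feature_collection, path)
    # Point the loader at the file that was just written before reloading it;
    # otherwise load_geojson() would try to read an unset path
    training_data_path = path
    uploaded_geojson_path = path
    load_geojson()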

# Handle map toggle
def on_toggle_map_clicked(change):
    """
    Function to handle the event when the map toggle button is clicked.
    Parameters:
    - change: The change event object.
    """
    if change['new']:
        m.remove_layer(osm_layer)
        m.add_layer(satellite_layer)
    else:
        m.remove_layer(satellite_layer)
        m.add_layer(osm_layer)

button_toggle_map.observe(on_toggle_map_clicked, 'value')

# Assign the event handlers to the buttons
button_set_path.on_click(on_set_path_clicked)
button_use_example.on_click(on_use_example_clicked)
button_create_new.on_click(on_create_new_clicked)
button_abort.on_click(on_abort_clicked)
button_save.on_click(on_save_clicked)

# Display the buttons
display(button_set_path, button_use_example, button_create_new, button_abort)


### 3.2 Validating the GeoJSON data

In [4]:
# Display the map with the GeoJSON data

# Function to center the map on a random point from the GeoJSON data
def get_random_center(gdf):
    """
    Function to get a random center point from the GeoDataFrame.
    Parameters:
    - gdf: The GeoDataFrame containing the training data.
    Returns:
    - A tuple representing the coordinates of the random center point.
    """
    random_feature = gdf.sample(1).geometry.iloc[0]
    
    if random_feature.geom_type == 'Point':
        return random_feature.y, random_feature.x
    elif random_feature.geom_type in ['Polygon', 'MultiPolygon']:
        centroid = random_feature.centroid
        return centroid.y, centroid.x
    else:
        raise ValueError("Unsupported geometry type")

# Check if geojson_data is already loaded
if 'geojson_data' in globals() and isinstance(geojson_data, gpd.GeoDataFrame):
    # Get a random center point from the geojson_data
    try:
        center_point = get_random_center(geojson_data)
    except ValueError as e:
        print(f"Error in geometry type: {e}")
        center_point = (51.9606649, 7.6261347)  # Default to Münster if there's an error
else:
    print("No training data loaded. Please load the training data in the first notebook.")
    center_point = (51.9606649, 7.6261347)  # Default to Münster if no data is loaded

# Create a map centered on the random point
m = Map(center=center_point, zoom=12)

# Define the satellite layer and OSM layer
satellite_layer = basemap_to_tiles(basemaps.Esri.WorldImagery)
osm_layer = basemap_to_tiles(basemaps.OpenStreetMap.Mapnik)

# Add initial OSM layer to the map
m.add_layer(osm_layer)

def display_geojson():
    """
    Function to display the GeoJSON data on the map.
    """
    if geojson_data is None:
        print("No training data loaded. Please load the training data first.")
    else:
        geojson_layer = GeoJSON(data=json.loads(geojson_data.to_json()))
        m.add_layer(geojson_layer)
        display(m)

# Display the map with GeoJSON data
display_geojson()



## 4 Tutorial: Fetch Sentinel-2 data
<details open>

Next up, we’ll download and integrate Sentinel-2 data. If you already have remote sensing data, you can skip this tutorial and hit “run” on the next code block.

* Step 0: Visit the Copernicus Browser and sign up/in.
* Step 1: 
    - In the browser, switch to the "Search" tab.
    - Set the search to "Sentinel-2", "L2A", and to your desired cloud coverage (we recommend no more than 20%).
    - At the bottom, you'll need to determine the time period from which you want to gather the data.
    - On the right side, you have to set your area of interest using the provided map tools.
    - If you set everything correctly, hit the "Search" button at the bottom of the "Search" tab and wait for the results.

![Copernicus_Example_1](index/cop_example_1.jpg)

<sup>*Figure 3: Screenshot from the Copernicus browser, showing how to set the "Search" tab*</sup>
    
* Step 2:
    - If the search was successful, you'll see which images overlap your area of interest. Click on the one that fits your needs/interests the most.
        * If there isn't a single image that completely covers your area of interest, you'll need to create a mosaic. More about that later. In that case, you should download every image covering your area of interest.
    - You'll be greeted with a new window showing previews of the available data. Hit the download button on the one you're confident has the least amount of clouds.
    - Wait for the download to finish.
    - NOTE: If you can't find images, go back to the "Search" tab from Step 1 and change either the cloud coverage or the timeframe for your data.

![Copernicus_Example_2](index/cop_example_2.jpg)

<sup>*Figure 4: Screenshot from the Copernicus browser, showing how to set the "Download" section*</sup>

* Step 3:
    - Once downloaded, you'll have a .zip file containing your satellite data. Extract that file and navigate to "GRANULE/.../IMG_DATA/R10m".
    - Here you'll find single files for the "Red", "Green", and "Blue" bands of the Sentinel-2 satellite, as well as the "Near infrared" band.
    - Copy the "..._B02_...", "..._B03_...", "..._B04_...", and "..._B08_..." files into a separate folder so you can easily find them.
    - Continue with the rest of the tutorial; we'll need these files later.

![Copernicus_Example_3](index/cop_example_3.jpg)

<sup>*Figure 5: Screenshot from the Windows file browser, showing the file selection for the "Red", "Green", "Blue", and "Near infrared" bands of the downloaded data*</sup>
</details>
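
If you copied the four band files into one folder as described in Step 3, locating them can be sketched like this. The helper `find_band_files` is hypothetical and not part of the notebook. It assumes the file names contain the band code between underscores (for example `..._B02_...`) and end in `.jp2`.

```python
from pathlib import Path

# Band codes as used in this tutorial: blue, green, red, near infrared
BAND_CODES = {"blue": "B02", "green": "B03", "red": "B04", "nir": "B08"}

def find_band_files(folder):
    """Map each band name to the first matching .jp2 file, or None if missing."""
    folder = Path(folder)
    found = {}
    for name, code in BAND_CODES.items():
        matches = sorted(folder.glob(f"*_{code}_*.jp2"))
        found[name] = matches[0] if matches else None
    return found

# Usage: look in the current directory and report which bands are missing
bands = find_band_files(".")
missing = [name for name, path in bands.items() if path is None]
print("Missing bands:", missing)
```

The returned paths are the ones you would pick in the file choosers of section 4.1.1.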


### 4.1 Integrating Sentinel-2 data of the Area of Interest

#### 4.1.1 Incorporating and processing multiple bands
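
The cell in this section combines the red, green, blue, and near-infrared arrays into one composite with `np.dstack`. The sketch below shows what that stacking does to the array shape, using tiny constant arrays as stand-ins for real bands. The final reshape to one row per pixel is an assumption about how such a composite is typically passed to a scikit-learn model, not something this cell does itself.

```python
import numpy as np

rows, cols = 3, 4

# Stand-in bands: each is a 2D array of identical shape
red = np.full((rows, cols), 1, dtype=np.uint16)
green = np.full((rows, cols), 2, dtype=np.uint16)
blue = np.full((rows, cols), 3, dtype=np.uint16)
nir = np.full((rows, cols), 4, dtype=np.uint16)

# Stack along a new third axis: result has shape (rows, cols, bands)
stack = np.dstack((red, green, blue, nir))
print(stack.shape)

# One row per pixel, one column per band
pixels = stack.reshape(-1, stack.shape[-1])
print(pixels.shape)
```

All bands must have the same number of rows and columns, which is why the 10 m resolution files are used for all four.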

In [None]:
# Incorporate the Sentinel-2 data for the machine learning model in the form of bands in multiple files for the Area of Interest (AOI)

aoi_path = None
class Sentinel2Processor:
    def __init__(self):
        self.bbox = None
        self.blue_path = None
        self.green_path = None
        self.red_path = None
        self.nir_path = None
        self.create_widgets()

    def create_widgets(self):
        self.question1()

    def question1(self):
        """
        Function to ask the user if they have Sentinel-2 data in the form of bands in multiple files.
        """
        self.clear_output()
        question = widgets.Label("Do you have Sentinel-2 data in the form of bands in multiple files?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q1_yes)
        no_button.on_click(self.q1_no)

        display(question, yes_button, no_button)

    def q1_yes(self, b):
        """
        Function to handle the event when the user clicks "Yes" in response to the first question.
        """
        self.clear_output()
        question = widgets.Label("Do you need / want to crop your raster data?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q2_yes)
        no_button.on_click(self.q2_no)

        display(question, yes_button, no_button)

    def q1_no(self, b):
        """
        Function to handle the event when the user clicks "No" in response to the first question.
        """
        self.clear_output()
        display(widgets.Label("Please use the tutorial to gather Sentinel-2 data"))

    def q2_yes(self, b):
        """
        Function to handle the event when the user clicks "Yes" in response to the second question.
        """
        self.clear_output()
        display(widgets.Label("Draw a rectangle on the map to select the bounding box"))

        m = Map(center=(51.23, 9.35), zoom=9, basemap=basemap_to_tiles(basemaps.OpenStreetMap.Mapnik))
        draw_control = DrawControl(rectangle={'shapeOptions': {'color': '#0000FF'}})
        m.add_control(draw_control)
        display(m)

        finish_button = widgets.Button(description="Finish")
        cancel_button = widgets.Button(description="Cancel")

        def handle_draw(target, action, geo_json):
            """
            Function to handle the event when a feature is drawn on the map.
            """
            bbox = shape(geo_json['geometry']).bounds
            self.bbox = self.convert_bbox_to_utm(bbox)
            rect = LeafletPolygon(locations=[[[bbox[1], bbox[0]], [bbox[1], bbox[2]], [bbox[3], bbox[2]], [bbox[3], bbox[0]], [bbox[1], bbox[0]]]], color="blue", fill_opacity=0.5)
            m.add_layer(rect)
            print(f"BBOX coordinates (Lat/Lon): {bbox}")
            print(f"BBOX coordinates (UTM): {self.bbox}")

        def finish(b):
            """
            Function to handle the event when the user clicks "Finish" in response to the second question.
            """
            if self.bbox:
                self.q3()
            else:
                display(widgets.Label("No bounding box selected."))

        def cancel(b):
            """
            Function to handle the event when the user clicks "Cancel" in response to the second question.
            """
            self.clear_output()
            display(widgets.Label("User aborted the operation."))

        draw_control.on_draw(handle_draw)
        finish_button.on_click(finish)
        cancel_button.on_click(cancel)

        display(finish_button, cancel_button)

    def q2_no(self, b):
        """
        Function to handle the event when the user clicks "No" in response to the second question.
        """
        self.q3()

    def q3(self):
        """
        Function to ask the user for the path to the blue band data.
        """
        self.ask_for_band("blue", "B02")

    def ask_for_band(self, color, code):
        """
        Function to ask the user for the path to a specific band data.
        Parameters:
        - color: The color of the band.
        - code: The code of the band.
        """
        self.clear_output()
        question = widgets.Label(f"Please set the path to the data of the {color} band ('_ _ _ {code} _ _ _ .jp2')")
        file_chooser = filechooser.FileChooser()

        upload_button = widgets.Button(description="Upload")

        def handle_upload(b):
            """
            Function to handle the event when the user clicks "Upload" to select a file.
            """
            selected_file = file_chooser.selected
            if selected_file and code in selected_file:
                setattr(self, f"{color}_path", selected_file)
                if color == "blue":
                    self.ask_for_band("green", "B03")
                elif color == "green":
                    self.ask_for_band("red", "B04")
                elif color == "red":
                    self.ask_for_band("nir", "B08")
                elif color == "nir":
                    self.ask_for_final_confirmation()
            else:
                self.clear_output()
                display(widgets.Label(f"Data integration of the {color} band not successful"))

        upload_button.on_click(handle_upload)

        display(question, file_chooser, upload_button)

    def ask_for_final_confirmation(self):
        """
        Function to ask the user for final confirmation before starting the processing.
        """
        self.clear_output()
        display(widgets.Label("All bands successfully incorporated. Processing now ready - do you want to continue?"))
        
        continue_button = widgets.Button(description="Continue")
        cancel_button = widgets.Button(description="Cancel")

        continue_button.on_click(self.process_bands)
        cancel_button.on_click(self.cancel_operation)

        display(continue_button, cancel_button)

    def cancel_operation(self, b):
        """
        Function to handle the event when the user clicks "Cancel" in response to the final confirmation.
        """
        self.clear_output()
        display(widgets.Label("User aborted the operation."))

    def clear_output(self):
        """
        Function to clear the output area.
        """
        clear_output()

    def process_bands(self, b=None):
        """
        Function to process the selected bands.
        """
        self.clear_output()
        display(widgets.Label("Starting the processing. This might take a while."))

        if not self.bbox:
            display(widgets.Label("No bounding box selected, using default bbox."))
            bbox = (394861, 5746419, 420134, 5767397)
        else:
            bbox = self.bbox

        blue_tif = self.convert_jp2_to_tif(self.blue_path)
        green_tif = self.convert_jp2_to_tif(self.green_path)
        red_tif = self.convert_jp2_to_tif(self.red_path)
        nir_tif = self.convert_jp2_to_tif(self.nir_path)

        blue, _ = self.load_and_clip_raster(blue_tif, bbox)
        green, _ = self.load_and_clip_raster(green_tif, bbox)
        red, _ = self.load_and_clip_raster(red_tif, bbox)
        nir, _ = self.load_and_clip_raster(nir_tif, bbox)

        if blue.size == 0 or green.size == 0 or red.size == 0 or nir.size == 0:
            self.clear_output()
            display(widgets.Label("Error: One of the bands is empty after clipping. Please check the bounding box and try again."))
            return

        # Stack the bands to create a composite image
        global Sen2AOI
        Sen2AOI = np.dstack((red, green, blue, nir))

        display(widgets.Label("Processing successful, you can now go on to the next code block!"))

    def convert_jp2_to_tif(self, input_path):
        """
        Function to convert a JP2 file to a TIFF file.
        Parameters:
        - input_path: The path to the input JP2 file.
        Returns:
        - The path to the output TIFF file.
        """
        in_image = gdal.Open(input_path)
        driver = gdal.GetDriverByName("GTiff")
        out_tif_path = input_path.replace('.jp2', '.tif')
        out_image = driver.CreateCopy(out_tif_path, in_image, 0)
        in_image = None
        out_image = None
        return out_tif_path

    def load_and_clip_raster(self, file_path, bbox):
        """
        Function to load and clip a raster image based on a bounding box.
        Parameters:
        - file_path: The path to the raster image file.
        - bbox: The bounding box coordinates.
        Returns:
        - The clipped image and its transformation.
        """
        with rasterio.open(file_path, 'r') as src:
            print(f"Loading with bbox: {bbox}")
            window = from_bounds(*bbox, src.transform)
            print(f"Window: {window}")
            clipped_image = src.read(1, window=window)
            print(f"Dimensions after clipping: {clipped_image.shape}")
            transform = src.window_transform(window)
        return clipped_image, transform

    def convert_bbox_to_utm(self, bbox):
        """
        Function to convert a bounding box from WGS84 to UTM Zone 32N.
        Parameters:
        - bbox: The input bounding box.
        Returns:
        - The converted bounding box in UTM coordinates.
        """
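        # Local import keeps this method self-contained; Transformer replaces the
        # deprecated Proj(init=...) / transform() pattern
        from pyproj import Transformer

        # always_xy=True keeps the (lon, lat) input order of the bounding box
        transformer = Transformer.from_crs("EPSG:4326", "EPSG:32632", always_xy=True)

        x_min, y_min = transformer.transform(bbox[0], bbox[1])
        x_max, y_max = transformer.transform(bbox[2], bbox[3])
        return (x_min, y_min, x_max, y_max)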

# Run the application
interest = Sentinel2Processor()

#### 4.1.2 Incorporating and processing single file data

In [None]:
# Incorporating the Sentinel-2 data for the machine learning model in the form of a single .TIF file for the Area of Interest (AOI)

aoi_path = None
class Sentinel2Processor:
    def __init__(self):
        self.bbox = None
        self.file_path = None
        self.aoi_path = None  # AOI path
        self.create_widgets()

    def create_widgets(self):
        """Create and display the initial widgets for user interaction."""
        self.question1()

    def question1(self):
        """
        Ask the user if they have Sentinel-2 data in the form of a single .TIF file.
        """
        self.clear_output()
        question = widgets.Label("Do you have Sentinel-2 data in the form of a single file containing multiple bands?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q1_yes)
        no_button.on_click(self.q1_no)

        display(question, yes_button, no_button)

    def q1_yes(self, b):
        """
        Handle the event when the user clicks "Yes" in response to the first question.
        """
        self.clear_output()
        question = widgets.Label("Do you need / want to crop your raster data?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q2_yes)
        no_button.on_click(self.q2_no)

        display(question, yes_button, no_button)

    def q1_no(self, b):
        """
        Handle the event when the user clicks "No" in response to the first question.
        """
        self.clear_output()
        display(widgets.Label("Please use the tutorial to gather Sentinel-2 data."))

    def q2_yes(self, b):
        """
        Handle the event when the user clicks "Yes" in response to the second question.
        """
        self.clear_output()
        display(widgets.Label("Draw a rectangle on the map to select the bounding box"))

        self.map_widget = Map(center=(51.23, 9.35), zoom=9, basemap=basemap_to_tiles(basemaps.OpenStreetMap.Mapnik))
        draw_control = DrawControl(rectangle={'shapeOptions': {'color': '#0000FF'}})
        self.map_widget.add_control(draw_control)
        display(self.map_widget)

        finish_button = widgets.Button(description="Finish")
        cancel_button = widgets.Button(description="Cancel")

        draw_control.on_draw(self.handle_draw)
        finish_button.on_click(self.finish_bbox)
        cancel_button.on_click(self.cancel_bbox)

        display(finish_button, cancel_button)

    def q2_no(self, b):
        """
        Handle the event when the user clicks "No" in response to the second question.
        """
        self.q3()

    def handle_draw(self, target, action, geo_json):
        """
        Handle the event when a feature is drawn on the map.
        """
        bbox = shape(geo_json['geometry']).bounds
        self.bbox = self.convert_bbox_to_utm(bbox)
        rect = LeafletPolygon(locations=[[[bbox[1], bbox[0]], [bbox[1], bbox[2]], [bbox[3], bbox[2]], [bbox[3], bbox[0]], [bbox[1], bbox[0]]]], color="blue", fill_opacity=0.5)
        self.map_widget.add_layer(rect)
        print(f"BBOX coordinates (Lat/Lon): {bbox}")
        print(f"BBOX coordinates (UTM): {self.bbox}")

    def finish_bbox(self, b):
        """
        Handle the event when the user clicks "Finish" after drawing the bounding box.
        """
        if self.bbox:
            self.q3()
        else:
            display(widgets.Label("No bounding box selected."))

    def cancel_bbox(self, b):
        """
        Handle the event when the user clicks "Cancel" in response to the bounding box question.
        """
        self.clear_output()
        display(widgets.Label("User aborted the operation."))

    def q3(self):
        """
        Ask the user if they have a .TIF file or a .grd file.
        """
        self.clear_output()
        question = widgets.Label("Do you have a .TIF file or .grd?")
        tif_button = widgets.Button(description=".TIF")
        grd_button = widgets.Button(description=".grd")

        tif_button.on_click(self.q3_tif)
        grd_button.on_click(self.q3_grd)

        display(question, tif_button, grd_button)

    def q3_tif(self, b):
        """
        Handle the event when the user clicks ".TIF" in response to the third question.
        """
        self.clear_output()
        self.ask_for_file(".tif")

    def q3_grd(self, b):
        """
        Handle the event when the user clicks ".grd" in response to the third question.
        """
        self.clear_output()
        question = widgets.Label("The Python modules used here can't read .grd files. Do you want to convert your data to the .TIF file format?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q4_yes)
        no_button.on_click(self.q4_no)

        display(question, yes_button, no_button)

    def q4_yes(self, b):
        """
        Handle the event when the user clicks "Yes" in response to the fourth question.
        """
        self.clear_output()
        self.ask_for_file(".grd")

    def q4_no(self, b):
        """
        Handle the event when the user clicks "No" in response to the fourth question.
        """
        self.clear_output()
        display(widgets.Label("User aborted the operation. Please use the tutorial to gather Sentinel-2 data."))

    def ask_for_file(self, extension):
        """
        Ask the user for the path to a file of a specific type.
        Parameters:
        - extension: The file extension to look for (e.g., ".tif" or ".grd").
        """
        self.clear_output()
        question = widgets.Label(f"Please select the file with extension '{extension}'")
        file_chooser = filechooser.FileChooser()

        upload_button = widgets.Button(description="Upload")

        def handle_upload(b):
            """
            Handle the event when the user clicks "Upload" to select a file.
            """
            selected_file = file_chooser.selected
            if selected_file and selected_file.endswith(extension):
                global aoi_path  # Expose the AOI path to later cells
                self.file_path = selected_file
                self.aoi_path = selected_file  # Set AOI path
                aoi_path = selected_file  # Assign to global variable
                print(f"File path set to: {self.file_path}")
                if extension == ".grd":
                    self.convert_grd_to_tif()
                else:
                    self.process_tiff_and_plot()
            else:
                display(widgets.Label(f"Data integration not successful. File must be a {extension} file."))

        upload_button.on_click(handle_upload)

        display(question, file_chooser, upload_button)

    def convert_grd_to_tif(self):
        """
        Convert a .grd file to .tif format and process the .tif file.
        """
        self.clear_output()
        output_tiff_path = self.file_path.replace('.grd', '.tif')

        try:
            self._convert_grd_to_tiff(self.file_path, output_tiff_path)
            display(widgets.Label("Data conversion successful"))
            self.file_path = output_tiff_path
            self.process_tiff_and_plot()
        except Exception as e:
            # Show the underlying error so the user can see why the conversion failed
            display(widgets.Label(f"Data integration and transformation not successful: {e}"))

    def _convert_grd_to_tiff(self, input_grd_path, output_tiff_path):
        """
        Convert a GRD file to a TIF file using GDAL.
        """
        dataset = gdal.Open(input_grd_path)
        if dataset is None:
            raise FileNotFoundError(f"Cannot open {input_grd_path}")

        geo_transform = dataset.GetGeoTransform()
        projection = dataset.GetProjection()

        driver = gdal.GetDriverByName('GTiff')
        if driver is None:
            raise RuntimeError("GTiff driver is not available")

        output_dataset = driver.Create(output_tiff_path, dataset.RasterXSize, dataset.RasterYSize, dataset.RasterCount, gdal.GDT_Float32)
        output_dataset.SetGeoTransform(geo_transform)
        output_dataset.SetProjection(projection)

        for band in range(1, dataset.RasterCount + 1):
            input_band = dataset.GetRasterBand(band)
            output_band = output_dataset.GetRasterBand(band)
            data = input_band.ReadAsArray()
            output_band.WriteArray(data)

        dataset = None
        output_dataset = None

    def process_tiff_and_plot(self):
        """
        Process the .tif file, display the number of bands, and plot the RGB composite.
        """
        self.clear_output()
        dataset = gdal.Open(self.file_path)
        if dataset is None:
            display(widgets.Label("Failed to open the TIF file."))
            return

        num_bands = dataset.RasterCount
        display(widgets.Label(f"Number of bands: {num_bands}"))

        if num_bands >= 3:
            self.plot_rgb_composite(dataset)

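    def plot_rgb_composite(self, dataset):
        """
        Stack up to the first 4 bands of the Sentinel-2 image into Sen2AOI
        and plot the first 3 of them as a composite image.
        """
        global Sen2AOI
        # Read at most 4 bands; a 3-band file has no fourth band to read
        n_bands = min(4, dataset.RasterCount)
        bands = [dataset.GetRasterBand(i + 1).ReadAsArray() for i in range(n_bands)]

        for i, band in enumerate(bands):
            print(f"Band {i+1} min: {band.min()}, max: {band.max()}")

        Sen2AOI = np.stack(bands, axis=-1)
        print(Sen2AOI.shape)

        # Min-max normalize each band to [0, 1]; a constant band is set to 0 to avoid division by zero
        Sen2AOI = Sen2AOI.astype(np.float32)
        for i in range(Sen2AOI.shape[-1]):
            band_min, band_max = Sen2AOI[..., i].min(), Sen2AOI[..., i].max()
            span = band_max - band_min
            Sen2AOI[..., i] = (Sen2AOI[..., i] - band_min) / span if span > 0 else 0.0

        # imshow would interpret a fourth channel as alpha, so plot only the first 3 bands
        plt.imshow(Sen2AOI[..., :3])
        plt.title('Sentinel-2 Stacked Image')
        plt.show()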

    def convert_bbox_to_utm(self, bbox):
        """
        Convert latitude/longitude bounding box to UTM coordinates.
        """
        in_proj = Proj(init='epsg:4326')  # WGS84
        out_proj = Proj(init='epsg:32632')  # UTM Zone 32N

        x_min, y_min = transform(in_proj, out_proj, bbox[0], bbox[1])
        x_max, y_max = transform(in_proj, out_proj, bbox[2], bbox[3])
        return (x_min, y_min, x_max, y_max)

    def clear_output(self):
        """
        Clear the output area to remove previous widgets and information.
        """
        clear_output(wait=True)


# Run the application
interest = Sentinel2Processor()

### 4.2 Integrating Sentinel-2 data of the Area of Training
<details>
    <summary>Click me for more Info</summary>
    
* Now that we have processed the Sentinel-2 data for the Area of Interest, we need to do the same for the Training Area. Run the code whenever you're ready, and we'll guide you through the process again.
* Remember: The Training Area is where your training data is located and where your model will be trained. The Area of Interest is where the algorithm will predict land use/land cover.

</details>
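
The difference between the two areas can be sketched in a few lines: a model is fitted on labelled pixels from the Area of Training and then applied to every pixel of the Area of Interest. The sketch below is only an illustration, not the notebook's method. It uses a nearest-centroid rule as a simple stand-in for the Random Forest trained later, and all array values, function names and class ids are hypothetical.

```python
import numpy as np

def fit_centroids(pixels, labels):
    """Return the class ids and the mean spectrum of each class (stand-in for model training)."""
    classes = np.unique(labels)
    centroids = np.stack([pixels[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_map(image, classes, centroids):
    """Classify every pixel of an (H, W, bands) image by its nearest class centroid."""
    h, w, n_bands = image.shape
    flat = image.reshape(-1, n_bands)
    # Distance from every pixel to every centroid: shape (pixels, classes)
    dists = np.linalg.norm(flat[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)].reshape(h, w)

# Area of Training: labelled pixels with 4 bands each (hypothetical values)
aot_pixels = np.array([[0.1, 0.1, 0.1, 0.8],
                       [0.2, 0.1, 0.1, 0.9],
                       [0.1, 0.2, 0.6, 0.1],
                       [0.1, 0.3, 0.7, 0.1]])
aot_labels = np.array([0, 0, 1, 1])  # hypothetical ids: 0 = vegetation, 1 = water

# Area of Interest: an unlabelled 1 x 2 pixel image with the same 4 bands
aoi_image = np.array([[[0.15, 0.1, 0.1, 0.85], [0.1, 0.25, 0.65, 0.1]]])

classes, centroids = fit_centroids(aot_pixels, aot_labels)
prediction = predict_map(aoi_image, classes, centroids)
print(prediction)
```

Both areas must provide the same bands in the same order, otherwise the patterns learned in the Area of Training do not carry over to the Area of Interest.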

#### 4.2.1 Incorporating and processing multiple bands for the Area of Training
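The cell below converts the drawn bounding box from WGS84 to UTM Zone 32N (EPSG:32632), which fits the default map centre of this notebook. For areas elsewhere, the matching zone can be derived from the coordinates. The following is a minimal sketch of that rule; the function name `utm_epsg_for` is hypothetical, and the special zones around Norway and Svalbard are deliberately ignored.

```python
def utm_epsg_for(lon, lat):
    """Return the EPSG code of the WGS84 / UTM zone that contains (lon, lat).

    Simplified rule: the special zones around Norway and Svalbard are ignored.
    """
    if not -180.0 <= lon <= 180.0 or not -90.0 <= lat <= 90.0:
        raise ValueError("lon/lat out of range")
    # UTM zones are 6 degrees wide and numbered 1..60, starting at 180 degrees west
    zone = min(int((lon + 180.0) // 6) + 1, 60)
    # EPSG 326xx covers the northern hemisphere, 327xx the southern hemisphere
    base = 32600 if lat >= 0 else 32700
    return base + zone

# Default map centre used in this notebook (lat 51.23, lon 9.35)
epsg_default = utm_epsg_for(9.35, 51.23)
print(epsg_default)
```

If your raster uses a different UTM zone than 32N, the bounding box has to be converted to that zone before clipping, since the clip window is computed in the raster's own coordinates.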

In [None]:
# Incorporating the Sentinel-2 data for the machine learning model in the form of multiple band files for the Area of Training (AOT)

class Sentinel2Processor:
    def __init__(self):
        self.bbox = None
        self.blue_path = None
        self.green_path = None
        self.red_path = None
        self.nir_path = None
        self.create_widgets()

    def create_widgets(self):
        self.question1()

    def question1(self):
        """
        Function to ask the user if they have Sentinel-2 data in the form of bands in multiple files.
        """
        self.clear_output()
        question = widgets.Label("Do you have Sentinel-2 data in the form of bands in multiple files?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q1_yes)
        no_button.on_click(self.q1_no)

        display(question, yes_button, no_button)

    def q1_yes(self, b):
        """
        Function to handle the event when the user clicks 'Yes' in response to the first question.
        """
        self.clear_output()
        question = widgets.Label("Is your Area of Training, as in the place where your Training data are located, the same as your Area of Interest, for which you want to create a prediction map?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q1a_yes)
        no_button.on_click(self.q1a_no)

        display(question, yes_button, no_button)

    def q1_no(self, b):
        """
        Function to handle the event when the user clicks 'No' in response to the first question.
        """
        self.clear_output()
        question = widgets.Label("Do you need / want to crop your raster data?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q2_yes)
        no_button.on_click(self.q2_no)

        display(question, yes_button, no_button)

    def q1a_yes(self, b):
        """
        Function to handle the event when the user clicks 'Yes' in response to the second question about Area of Training.
        """
        self.clear_output()
        try:
            global Sen2AOT
            if 'Sen2AOI' in globals():
                print("Sen2AOI is present in globals.")
                print(f"Type of Sen2AOI: {type(Sen2AOI)}")
                print(f"Size of Sen2AOI: {Sen2AOI.shape}")
                if isinstance(Sen2AOI, np.ndarray):
                    # Copying the data
                    Sen2AOT = Sen2AOI.copy()
                    display(widgets.Label("Data for the Area of Interest found. Data copied to Sen2AOT."))
                else:
                    display(widgets.Label("Data for the Area of Interest is not a valid ndarray."))
            else:
                display(widgets.Label("No data for your Area of Interest found. Please upload these data first."))
        except Exception as e:
            display(widgets.Label(f"An error occurred: {e}"))

    def q1a_no(self, b):
        """
        Function to handle the event when the user clicks 'No' in response to the second question about Area of Training.
        """
        self.clear_output()
        display(widgets.Label("Do you need / want to crop your raster data?"))
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q2_yes)
        no_button.on_click(self.q2_no)

        display(yes_button, no_button)

    def q2_yes(self, b):
        """
        Function to handle the event when the user clicks 'Yes' in response to the second question.
        """
        self.clear_output()
        display(widgets.Label("Draw a rectangle on the map to select the bounding box"))

        m = Map(center=(51.23, 9.35), zoom=9, basemap=basemap_to_tiles(basemaps.OpenStreetMap.Mapnik))
        draw_control = DrawControl(rectangle={'shapeOptions': {'color': '#0000FF'}})
        m.add_control(draw_control)
        display(m)

        finish_button = widgets.Button(description="Finish")
        cancel_button = widgets.Button(description="Cancel")

        def handle_draw(target, action, geo_json):
            """
            Function to handle the event when a feature is drawn on the map.
            """
            bbox = shape(geo_json['geometry']).bounds
            self.bbox = self.convert_bbox_to_utm(bbox)
            rect = LeafletPolygon(locations=[[[bbox[1], bbox[0]], [bbox[1], bbox[2]], [bbox[3], bbox[2]], [bbox[3], bbox[0]], [bbox[1], bbox[0]]]], color="blue", fill_opacity=0.5)
            m.add_layer(rect)
            print(f"BBOX coordinates (Lat/Lon): {bbox}")
            print(f"BBOX coordinates (UTM): {self.bbox}")

        def finish(b):
            """
            Function to handle the event when the user clicks 'Finish' in response to the second question.
            """
            if self.bbox:
                self.q3()
            else:
                display(widgets.Label("No bounding box selected."))

        def cancel(b):
            """
            Function to handle the event when the user clicks 'Cancel' in response to the second question.
            """
            self.clear_output()
            display(widgets.Label("User aborted the operation."))

        draw_control.on_draw(handle_draw)
        finish_button.on_click(finish)
        cancel_button.on_click(cancel)

        display(finish_button, cancel_button)

    def q2_no(self, b):
        """
        Function to handle the event when the user clicks 'No' in response to the second question.
        """
        self.q3()

    def q3(self):
        """
        Function to ask the user for the path to the blue band data.
        """
        self.ask_for_band("blue", "B02")

    def ask_for_band(self, color, code):
        """
        Function to ask the user for the path to a specific band data.
        Parameters:
        - color: The color of the band.
        - code: The code of the band.
        """
        self.clear_output()
        question = widgets.Label(f"Please set the path to the data of the {color} band ('_ _ _ {code} _ _ _ .jp2')")
        file_chooser = filechooser.FileChooser()

        upload_button = widgets.Button(description="Upload")

        def handle_upload(b):
            """
            Function to handle the event when the user clicks 'Upload' to select a file.
            """
            selected_file = file_chooser.selected
            if selected_file and code in selected_file:
                setattr(self, f"{color}_path", selected_file)
                if color == "blue":
                    self.ask_for_band("green", "B03")
                elif color == "green":
                    self.ask_for_band("red", "B04")
                elif color == "red":
                    self.ask_for_band("nir", "B08")
                elif color == "nir":
                    self.ask_for_final_confirmation()
            else:
                self.clear_output()
                display(widgets.Label(f"Data integration of the {color} band not successful"))

        upload_button.on_click(handle_upload)

        display(question, file_chooser, upload_button)

    def ask_for_final_confirmation(self):
        """
        Function to ask the user for final confirmation before starting the processing.
        """
        self.clear_output()
        display(widgets.Label("All bands successfully incorporated. Processing now ready - do you want to continue?"))
        
        continue_button = widgets.Button(description="Continue")
        cancel_button = widgets.Button(description="Cancel")

        continue_button.on_click(self.process_bands)
        cancel_button.on_click(self.cancel_operation)

        display(continue_button, cancel_button)

    def cancel_operation(self, b):
        """
        Function to handle the event when the user clicks 'Cancel' in response to the final confirmation.
        """
        self.clear_output()
        display(widgets.Label("User aborted the operation."))

    def clear_output(self):
        """
        Function to clear the output area.
        """
        clear_output()

    def process_bands(self, b=None):
        """
        Function to process the selected bands.
        """
        self.clear_output()
        display(widgets.Label("Starting the processing. This might take a while."))

        if not self.bbox:
            display(widgets.Label("No bounding box selected, using default bbox."))
            bbox = (394861, 5746419, 420134, 5767397)
        else:
            bbox = self.bbox

        blue_tif = self.convert_jp2_to_tif(self.blue_path)
        green_tif = self.convert_jp2_to_tif(self.green_path)
        red_tif = self.convert_jp2_to_tif(self.red_path)
        nir_tif = self.convert_jp2_to_tif(self.nir_path)

        blue, _ = self.load_and_clip_raster(blue_tif, bbox)
        green, _ = self.load_and_clip_raster(green_tif, bbox)
        red, _ = self.load_and_clip_raster(red_tif, bbox)
        nir, _ = self.load_and_clip_raster(nir_tif, bbox)

        if blue.size == 0 or green.size == 0 or red.size == 0 or nir.size == 0:
            self.clear_output()
            display(widgets.Label("Error: One of the bands is empty after clipping. Please check the bounding box and try again."))
            return

        # Stack the bands to create a composite image
        global Sen2AOT
        Sen2AOT = np.dstack((red, green, blue, nir))

        display(widgets.Label("Processing successful, you can now go on to the next code block!"))

    def convert_jp2_to_tif(self, input_path):
        """
        Function to convert a JP2 file to a TIFF file.
        Parameters:
        - input_path: The path to the input JP2 file.
        Returns:
        - The path to the output TIFF file.
        """
        in_image = gdal.Open(input_path)
        driver = gdal.GetDriverByName("GTiff")
        out_tif_path = input_path.replace('.jp2', '.tif')
        out_image = driver.CreateCopy(out_tif_path, in_image, 0)
        in_image = None
        out_image = None
        return out_tif_path

    def load_and_clip_raster(self, file_path, bbox):
        """
        Function to load and clip a raster image based on a bounding box.
        Parameters:
        - file_path: The path to the raster image file.
        - bbox: The bounding box coordinates.
        Returns:
        - The clipped image and its transformation.
        """
        with rasterio.open(file_path, 'r') as src:
            print(f"Loading with bbox: {bbox}")
            window = from_bounds(*bbox, src.transform)
            print(f"Window: {window}")
            clipped_image = src.read(1, window=window)
            print(f"Dimensions after clipping: {clipped_image.shape}")
            transform = src.window_transform(window)
        return clipped_image, transform

    def convert_bbox_to_utm(self, bbox):
        """
        Function to convert a bounding box from WGS84 to UTM Zone 32N.
        Parameters:
        - bbox: The input bounding box.
        Returns:
        - The converted bounding box in UTM coordinates.
        """
        in_proj = Proj(init='epsg:4326')  # WGS84
        out_proj = Proj(init='epsg:32632')  # UTM Zone 32N

        x_min, y_min = transform(in_proj, out_proj, bbox[0], bbox[1])
        x_max, y_max = transform(in_proj, out_proj, bbox[2], bbox[3])
        return (x_min, y_min, x_max, y_max)

# Run the application
train = Sentinel2Processor()

#### 4.2.2 Incorporating and processing single file data for the Area of Training
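The cell below rescales every band to the range [0, 1] before plotting. A minimal sketch of that min-max scaling, assuming an array of shape (height, width, bands), could look like this. The helper name `normalize_bands` and the demo values are hypothetical, and the guard for constant bands is an addition that avoids a division by zero.

```python
import numpy as np

def normalize_bands(stack):
    """Min-max scale each band of an (H, W, bands) array to [0, 1]."""
    out = stack.astype(np.float32)  # astype returns a copy, the input stays untouched
    for i in range(out.shape[-1]):
        band = out[..., i]
        lo, hi = band.min(), band.max()
        # A constant band has hi == lo; set it to 0 instead of dividing by zero
        out[..., i] = (band - lo) / (hi - lo) if hi > lo else 0.0
    return out

# Hypothetical 2 x 2 image with 2 bands; the second band is constant
demo = np.array([[[0, 5], [10, 5]],
                 [[20, 5], [40, 5]]], dtype=np.uint16)
scaled = normalize_bands(demo)
print(scaled[..., 0])
```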

In [None]:
# Incorporating the Sentinel-2 data for the machine learning model in form of multiple bands in a single file for the Area of Training (AOT)

class Sentinel2Processor:
    def __init__(self):
        self.bbox = None
        self.file_path = None
        self.create_widgets()

    def create_widgets(self):
        """Create and display the initial widgets for user interaction."""
        self.question1()

    def question1(self):
        """
        Ask the user if they have Sentinel-2 data in the form of a single .TIF file.
        """
        self.clear_output()
        question = widgets.Label("Do you have Sentinel-2 data in the form of a single file containing multiple bands?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q1_yes)
        no_button.on_click(self.q1_no)

        display(question, yes_button, no_button)

    def q1_yes(self, b):
        """
        Handle the event when the user clicks "Yes" in response to the first question.
        """
        self.clear_output()
        question = widgets.Label("Is your Area of Training, as in the place where your Training data are located, the same as your Area of Interest, for which you want to create a prediction map?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q1_yes_aoi)
        no_button.on_click(self.q1_no_aoi)

        display(question, yes_button, no_button)

    def q1_no(self, b):
        """
        Handle the event when the user clicks "No" in response to the first question.
        """
        self.clear_output()
        display(widgets.Label("Please use the tutorial to gather Sentinel-2 data."))

    def q1_yes_aoi(self, b):
        """
        Handle the event when the user clicks "Yes" in response to the Area of Training question.
        """
        self.clear_output()
        if 'Sen2AOI' in globals():
            sen2aoi = globals()['Sen2AOI']
            # Copy the data so later changes to Sen2AOT do not alter Sen2AOI
            globals()['Sen2AOT'] = sen2aoi.copy()
            display(widgets.Label("Data for your Area of Training has been successfully copied to Sen2AOT."))
        else:
            display(widgets.Label("No data for your Area of Interest found. Please upload these data first!"))

    def q1_no_aoi(self, b):
        """
        Handle the event when the user clicks "No" in response to the Area of Training question.
        """
        self.clear_output()
        self.q2_no()

    def q2_yes(self, b):
        """
        Handle the event when the user clicks "Yes" in response to the second question.
        """
        self.clear_output()
        display(widgets.Label("Draw a rectangle on the map to select the bounding box"))

        self.map_widget = Map(center=(51.23, 9.35), zoom=7, basemap=basemap_to_tiles(basemaps.OpenStreetMap.Mapnik))
        draw_control = DrawControl(rectangle={'shapeOptions': {'color': '#0000FF'}})
        self.map_widget.add_control(draw_control)
        display(self.map_widget)

        finish_button = widgets.Button(description="Finish")
        cancel_button = widgets.Button(description="Cancel")

        draw_control.on_draw(self.handle_draw)
        finish_button.on_click(self.finish_bbox)
        cancel_button.on_click(self.cancel_bbox)

        display(finish_button, cancel_button)

    def q2_no(self, b=None):
        """
        Handle the event when the user clicks "No" in response to the second question.
        """
        self.q3()

    def handle_draw(self, target, action, geo_json):
        """
        Handle the event when a feature is drawn on the map.
        """
        bbox = shape(geo_json['geometry']).bounds
        self.bbox = self.convert_bbox_to_utm(bbox)
        rect = LeafletPolygon(locations=[[[bbox[1], bbox[0]], [bbox[1], bbox[2]], [bbox[3], bbox[2]], [bbox[3], bbox[0]], [bbox[1], bbox[0]]]], color="blue", fill_opacity=0.5)
        self.map_widget.add_layer(rect)
        print(f"BBOX coordinates (Lat/Lon): {bbox}")
        print(f"BBOX coordinates (UTM): {self.bbox}")

    def finish_bbox(self, b):
        """
        Handle the event when the user clicks "Finish" after drawing the bounding box.
        """
        if self.bbox:
            self.q3()
        else:
            display(widgets.Label("No bounding box selected."))

    def cancel_bbox(self, b):
        """
        Handle the event when the user clicks "Cancel" in response to the bounding box question.
        """
        self.clear_output()
        display(widgets.Label("User aborted the operation."))

    def q3(self):
        """
        Ask the user if they have a .TIF file or a .grd file.
        """
        self.clear_output()
        question = widgets.Label("Do you have a .TIF file or .grd?")
        tif_button = widgets.Button(description=".TIF")
        grd_button = widgets.Button(description=".grd")

        tif_button.on_click(self.q3_tif)
        grd_button.on_click(self.q3_grd)

        display(question, tif_button, grd_button)

    def q3_tif(self, b):
        """
        Handle the event when the user clicks ".TIF" in response to the third question.
        """
        self.clear_output()
        self.ask_for_file(".tif")

    def q3_grd(self, b):
        """
        Handle the event when the user clicks ".grd" in response to the third question.
        """
        self.clear_output()
        question = widgets.Label("The Python modules used here can't read .grd files. Do you want to convert your data to the .TIF file format?")
        yes_button = widgets.Button(description="Yes")
        no_button = widgets.Button(description="No")

        yes_button.on_click(self.q4_yes)
        no_button.on_click(self.q4_no)

        display(question, yes_button, no_button)

    def q4_yes(self, b):
        """
        Handle the event when the user clicks "Yes" in response to the fourth question.
        """
        self.clear_output()
        self.ask_for_file(".grd")

    def q4_no(self, b):
        """
        Handle the event when the user clicks "No" in response to the fourth question.
        """
        self.clear_output()
        display(widgets.Label("User aborted the operation. Please use the tutorial to gather Sentinel-2 data."))

    def ask_for_file(self, extension):
        """
        Ask the user for the path to a file of a specific type.
        Parameters:
        - extension: The file extension to look for (e.g., ".tif" or ".grd").
        """
        self.clear_output()
        question = widgets.Label(f"Please select the file with extension '{extension}'")
        file_chooser = filechooser.FileChooser()

        upload_button = widgets.Button(description="Upload")

        def handle_upload(b):
            """
            Handle the event when the user clicks "Upload" to select a file.
            """
            selected_file = file_chooser.selected
            if selected_file and selected_file.endswith(extension):
                global uploaded_tif_path  # Store the file path in a global variable
                self.file_path = selected_file
                uploaded_tif_path = self.file_path  # Assign to global variable
                if extension == ".grd":
                    self.convert_grd_to_tif()
                else:
                    self.process_tiff_and_plot()
            else:
                display(widgets.Label(f"Data integration not successful. File must be a {extension} file."))

        upload_button.on_click(handle_upload)

        display(question, file_chooser, upload_button)

    def convert_grd_to_tif(self):
        """
        Convert a .grd file to .tif format and process the .tif file.
        """
        self.clear_output()
        output_tiff_path = self.file_path.replace('.grd', '.tif')

        try:
            self._convert_grd_to_tiff(self.file_path, output_tiff_path)
            display(widgets.Label("Data conversion successful"))
            self.file_path = output_tiff_path
            global uploaded_tif_path
            uploaded_tif_path = self.file_path  # Update global variable with the new file path
            self.process_tiff_and_plot()
        except Exception as e:
            # Show the underlying error so the user can see why the conversion failed
            display(widgets.Label(f"Data integration and transformation not successful: {e}"))

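    def _convert_grd_to_tiff(self, input_grd_path, output_tiff_path):
        """
        Convert a GRD file to a TIF file using GDAL.
        """
        dataset = gdal.Open(input_grd_path)
        if dataset is None:
            raise FileNotFoundError(f"Cannot open {input_grd_path}")

        # CreateCopy writes all bands and the georeferencing to a new GeoTIFF
        driver = gdal.GetDriverByName('GTiff')
        output_dataset = driver.CreateCopy(output_tiff_path, dataset, 0)

        # Dereference both datasets so GDAL flushes and closes the files
        dataset = None
        output_dataset = None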

    def process_tiff_and_plot(self):
        """
        Process the .tif file, display the number of bands, and plot the RGB composite.
        """
        self.clear_output()
        with rasterio.open(self.file_path) as dataset:
            bands = []
            for i in range(1, dataset.count + 1):
                band = dataset.read(i)
                bands.append(band)

            stacked_bands = np.stack(bands, axis=-1)
            print(stacked_bands.shape)

        num_bands = stacked_bands.shape[2]
        display(widgets.Label(f"Number of bands: {num_bands}"))

        if num_bands >= 3:
            self.plot_rgb_composite(stacked_bands)

    def plot_rgb_composite(self, stacked_bands):
        """
        Plot an RGB composite of the Sentinel-2 image using the first 3 bands.
        """
        global Sen2AOT
        for i in range(stacked_bands.shape[2]):
            print(f"Band {i+1} min: {stacked_bands[..., i].min()}, max: {stacked_bands[..., i].max()}")

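        # Min-max normalize each band to [0, 1]; a constant band is set to 0 to avoid division by zero
        Sen2AOT = stacked_bands.astype(np.float32)
        for i in range(Sen2AOT.shape[-1]):
            band_min, band_max = Sen2AOT[..., i].min(), Sen2AOT[..., i].max()
            span = band_max - band_min
            Sen2AOT[..., i] = (Sen2AOT[..., i] - band_min) / span if span > 0 else 0.0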

        plt.imshow(Sen2AOT[..., :3])
        plt.title('Sentinel-2 RGB Composite')
        plt.show()

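    def convert_bbox_to_utm(self, bbox):
        """
        Convert a (min_lon, min_lat, max_lon, max_lat) bounding box to UTM Zone 32N coordinates.
        """
        p1 = Proj(proj='latlong', datum='WGS84')
        p2 = Proj(proj='utm', zone=32, datum='WGS84')  # Zone 32N, as in the other cells
        # bbox is a flat tuple of four floats, so transform its two corners
        x_min, y_min = transform(p1, p2, bbox[0], bbox[1])
        x_max, y_max = transform(p1, p2, bbox[2], bbox[3])
        return (x_min, y_min, x_max, y_max)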

    def clear_output(self):
        """
        Clear the output area to remove previous widgets and information.
        """
        clear_output(wait=True)

# Instantiate and run the Sentinel2Processor
train = Sentinel2Processor()


## 5 Modeltraining
* This part should be fairly easy: just follow the steps in the UI. If you've integrated all of the other data correctly, the model training should be smooth and quick
* For metrics on the model's accuracy and explanations of the values, skip to the end of this document
* Note that we'll need to calculate the "NDVI", or "Normalized Difference Vegetation Index", which helps the algorithm differentiate vegetation a bit better. You don't have to do anything for that except make sure that at least the "red" and the "nir" bands are included in your raster data

![NDVI_example](index/ndvi_example.png)

<sup>*Figure 5: Calculated NDVI based on the values in the 'red' and the 'nir' channels of the area of training (Dortmund)*</sup>

* NOTE: Calculating the NDVI led to the majority of errors during this project, so it is currently removed. We still plan to include it in the future, as it enhances the model's ability to see and understand naturally occurring patterns
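
For reference, the NDVI is defined as (NIR - Red) / (NIR + Red). The sketch below shows one way to compute it with NumPy; it is not the code that was removed from this notebook. It assumes a stack of shape (height, width, bands) with the red band at index 0 and the NIR band at index 3, which is the order used when the bands are stacked in section 4.2.1. The function name `compute_ndvi` and the demo values are hypothetical.

```python
import numpy as np

def compute_ndvi(stack, red_index=0, nir_index=3):
    """Compute NDVI = (NIR - Red) / (NIR + Red) for an (H, W, bands) array."""
    red = stack[..., red_index].astype(np.float32)
    nir = stack[..., nir_index].astype(np.float32)
    denom = nir + red
    ndvi = np.zeros_like(denom)
    # Divide only where the denominator is non-zero; other pixels keep the value 0
    np.divide(nir - red, denom, out=ndvi, where=denom != 0)
    return ndvi

# Hypothetical 1 x 2 image: one vegetated pixel and one empty (all-zero) pixel
demo = np.array([[[1000, 0, 0, 3000], [0, 0, 0, 0]]], dtype=np.uint16)
ndvi = compute_ndvi(demo)
print(ndvi)
```

Casting to float before subtracting matters here: Sentinel-2 values are usually stored as unsigned integers, and NIR - Red would wrap around wherever Red is larger than NIR.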

In [None]:
# Start the model training process using the Sentinel-2 data

# Ensure that the paths are set
if 'uploaded_tif_path' not in globals() or 'uploaded_geojson_path' not in globals():
    raise ValueError("Please upload the .tif and .geojson files and confirm the paths before starting the model training.")

tif_path = uploaded_tif_path
geojson_path = uploaded_geojson_path

# Ensure that the paths are not None
if not tif_path or not geojson_path:
    raise ValueError("One of the paths is not correctly set. Please make sure both files are uploaded correctly.")

# Load GeoJSON training data
train_data = gpd.read_file(geojson_path)

# Display CRS of the GeoJSON dataset
print("GeoJSON CRS:", train_data.crs)

# Convert labels to numerical values
label_encoder = LabelEncoder()
train_data['Label'] = label_encoder.fit_transform(train_data['Label'])

# Load Sentinel-2 bands as raster stack
with rasterio.open(tif_path) as src:
    sentinel_meta = src.meta
    sentinel_crs = src.crs  # Get CRS of the raster
    
    # Display CRS of the raster dataset
    print("Raster CRS:", sentinel_crs)

    # Check if the CRS of the GeoJSON matches the raster's CRS and reproject if necessary
    if train_data.crs != sentinel_crs:
        print("Reprojecting GeoJSON data to match the raster's CRS.")
        train_data = train_data.to_crs(sentinel_crs)

    # Extract polygon geometries from the GeoJSON file
    geoms = train_data.geometry.apply(mapping)

    # Initialize list to store the extracted data
    extracted_data = []

    # Loop through each polygon geometry and label
    for geom, label in zip(geoms, train_data['Label']):
        try:
            # Mask the raster with the current polygon; filled=False returns a masked array,
            # so pixels outside the polygon can be dropped instead of being labeled as nodata
            out_image, out_transform = rasterio.mask.mask(src, [geom], crop=True, filled=False)
            # Keep only pixels that are valid (unmasked) in every band
            inside = ~np.ma.getmaskarray(out_image).any(axis=0).ravel()
            # Reshape into a 2D array, where each row is a pixel and the columns are the bands
            out_image = out_image.data.reshape(out_image.shape[0], -1).T[inside]
            # Create an array of the same length as out_image, filled with the current label
            labels = np.full(out_image.shape[0], label)
            # Add data to the list
            extracted_data.append(np.column_stack((out_image, labels)))
        except ValueError:
            print(f"Polygon with label {label} does not intersect with the raster and will be skipped.")

# Combine all extracted data into an array
if extracted_data:  # Only proceed if valid data is present
    extracted_data = np.vstack(extracted_data)
else:
    raise ValueError("No data could be extracted. Please check your input data.")

# Split predictors (bands) and labels
X = extracted_data[:, :-1]  # Predictors (B02, B03, etc.)
y = extracted_data[:, -1]   # Labels

# Split data into training and test sets (fixed seed so the split is reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, stratify=y, random_state=42)

# Train the Random Forest model
rf_model = RandomForestClassifier(n_estimators=500, random_state=42)
rf_model.fit(X_train, y_train)

# Display trained model and variable importance
print("Random Forest Model:")
print(rf_model)

# Feature importances, sorted from most to least important
importances = rf_model.feature_importances_
indices = np.argsort(importances)[::-1]
for idx in indices:
    # idx is the zero-based position of the band in the raster stack
    print(f"Band {idx + 1}: {importances[idx]:.4f}")

# Calculate accuracy on the test set
y_pred = rf_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model accuracy: {accuracy:.4f}")

# Save trained model and label encoder
joblib.dump(rf_model, "RF_Model.pkl")
joblib.dump(label_encoder, "Label_Encoder.pkl")
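
Overall accuracy alone can hide classes that the model predicts poorly. The following is a minimal NumPy sketch of a confusion matrix with per-class precision and recall; the labels are made up and are not output from this notebook, and `confusion_matrix_np` is a hypothetical helper. scikit-learn's `confusion_matrix` and `classification_report` (both in `sklearn.metrics`) provide the same information for `y_test` and `y_pred`.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # Count every (true, predicted) pair, including repeated pairs
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

# Made-up encoded labels for three classes (hypothetical data)
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0])

cm = confusion_matrix_np(y_true, y_pred, num_classes=3)
recall = np.diag(cm) / cm.sum(axis=1)     # share of each true class that was found
precision = np.diag(cm) / cm.sum(axis=0)  # share of each predicted class that is correct
accuracy = np.trace(cm) / cm.sum()

print(cm)
print("Recall per class:", recall)
print("Precision per class:", precision)
print(f"Overall accuracy: {accuracy:.4f}")
```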


## 6 Prediction Map
* This last part should be fairly easy: just follow the steps in the UI
* NOTE: All data still has to be loaded in the Python environment, so make sure not to reset anything!
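
The prediction cell flattens the AOI raster from `(bands, rows, cols)` to one row per pixel before calling the model and reshapes the result back into a map afterwards. The following is a minimal NumPy sketch of that round trip; the array values and the stand-in "prediction" are made up for illustration and do not involve the trained model.

```python
import numpy as np

bands, rows, cols = 3, 2, 4
# Hypothetical raster stack with shape (bands, rows, cols)
stack = np.arange(bands * rows * cols).reshape(bands, rows, cols)

# Flatten to (pixels, bands): each row holds all band values of one pixel
pixels = stack.reshape(bands, -1).T
print(pixels.shape)

# Stand-in for rf_model.predict: one class code per pixel (assumption for this sketch)
fake_prediction = pixels[:, 0] % 2

# Reshape the flat predictions back to the raster's (rows, cols) layout
prediction_map = fake_prediction.reshape(rows, cols)
print(prediction_map)
```

Because flattening and reshaping both follow row-major order, pixel `i` in the flat array ends up at row `i // cols`, column `i % cols` of the map.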

In [None]:
# Start the prediction process using the trained model and the Area of Interest (AOI) data

if 'aoi_path' not in globals() or 'Sen2AOI' not in globals():
    raise ValueError("AOI data was not loaded correctly. Please load the AOI data before starting the prediction.")

# Load the model and label encoder
rf_model = joblib.load("RF_Model.pkl")
label_encoder = joblib.load("Label_Encoder.pkl")

with rasterio.open(aoi_path) as src:
    aoi_meta = src.meta
    aoi_bands = src.read()  # Read all bands

aoi_reshaped = aoi_bands.reshape(len(aoi_bands), -1).T

print("Calculating the prediction! This might take a while. I suggest grabbing a coffee while this loads.")

aoi_prediction = rf_model.predict(aoi_reshaped)
aoi_prediction = aoi_prediction.reshape(aoi_bands.shape[1:])

# Assign colors dynamically based on the number of classes in the label encoder
num_classes = len(label_encoder.classes_)
colors = plt.get_cmap('Dark2')(np.linspace(0, 1, num_classes))  # Sample one color per class; swap in 'tab10' or another color scheme if preferred
cmap = mcolors.ListedColormap(colors)

# Plot the prediction for the AOI
plt.figure(figsize=(10, 10))
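# Fix the color limits so class code k always maps to color k, even if a class is missing in the AOI
im = plt.imshow(aoi_prediction, cmap=cmap, vmin=-0.5, vmax=num_classes - 0.5)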
plt.title("Random Forest Prediction")

# Create a dynamic legend: label the colorbar ticks with the class names
cbar = plt.colorbar(im, ticks=np.arange(num_classes))
cbar.ax.set_yticklabels(label_encoder.classes_)
plt.show()

# Adjust the output path to save in the same directory as the AOI file
output_dir = os.path.dirname(aoi_path)
output_tif_filename = "RF_Prediction_AOI.tif"
output_pdf_filename = "RF_Prediction_AOI.pdf"
output_tif_path = os.path.join(output_dir, output_tif_filename)
output_pdf_path = os.path.join(output_dir, output_pdf_filename)

# Create the output directory if it doesn't exist (an empty string means the current directory)
if output_dir:
    os.makedirs(output_dir, exist_ok=True)

# Adjust metadata: one band of uint8 class codes; no nodata value, because 0 is a valid class code
output_meta = aoi_meta.copy()
output_meta.update({"count": 1, "dtype": 'uint8', "nodata": None})

# Save the predictions as a new TIFF file in the specified directory
with rasterio.open(output_tif_path, 'w', **output_meta) as dst:
    dst.write(aoi_prediction.astype(rasterio.uint8), 1)

print(f"Prediction saved to: {output_tif_path}")

# Save the prediction as a PDF
with PdfPages(output_pdf_path) as pdf:
    plt.figure(figsize=(10, 10))
    # Fix the color limits so class code k always maps to color k
    im = plt.imshow(aoi_prediction, cmap=cmap, vmin=-0.5, vmax=num_classes - 0.5)
    plt.title("Random Forest Prediction")
    cbar = plt.colorbar(im, ticks=np.arange(num_classes))
    cbar.ax.set_yticklabels(label_encoder.classes_)
    pdf.savefig()  # Save the current figure to the PDF
    plt.close()

print(f"Prediction saved as PDF to: {output_pdf_path}")

print("Thank you for testing out our App! There is a lot of room for improvement and we're always grateful for helpful criticism and suggestions!")


* After the prediction is done, you should be able to save your results as a .TIF or .PDF file
* You can also read the model's metrics, such as accuracy, in Chapter 5: Modeltraining if you'd like
* The finished prediction map could look like this:


![class_example](index/class_example.jpg)

<sup>*Figure 6: Calculated prediction of Münster based on a spatial prediction model of Dortmund*</sup>
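
The saved raster stores integer class codes rather than label names. `LabelEncoder` keeps the original names in sorted order in `classes_`, so a code can be used as an index into that array (`label_encoder.inverse_transform` does the same lookup). The following is a minimal sketch of summarizing a prediction as area per class; the class names, the small array, and the 10 m pixel size are assumptions for illustration.

```python
import numpy as np

# Hypothetical stand-in for label_encoder.classes_
class_names = np.array(["forest", "urban", "water"])

# Hypothetical prediction raster with one class code per pixel
prediction = np.array([[0, 0, 1],
                       [2, 1, 0]])

# Assumption: 10 m x 10 m pixels, as for the 10 m Sentinel-2 bands
pixel_area_m2 = 10 * 10

# Count how many pixels fall into each class code
codes, counts = np.unique(prediction, return_counts=True)
summary = {str(class_names[c]): int(n) * pixel_area_m2 for c, n in zip(codes, counts)}

for name, area in summary.items():
    print(f"{name}: {area} m²")
```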

Thank you for testing our App! For any bugs or recommendations feel free to contact us!