# Team: Wildland Fire Azure AI
## Project: Jasper Fire Recovery - Bridging Technology and Disaster Recovery

### Introduction
###### In this project, we explore how technology, specifically geospatial data and automated workflows, can support disaster recovery efforts. The Jasper Fire serves as a real-world case study for testing our approach. The goal is to show how code and spatial data can enhance situational awareness, optimize resource allocation, and ultimately help us learn how natural environments can be restored to their pre-disaster states.

### Project Overview: 
###### This project investigates recovery from the 2000 Jasper Fire in the Black Hills of South Dakota. Using Landsat satellite imagery, machine learning models, and Azure Machine Learning Studio, we analyze ecological recovery and explore actionable insights for forest management and wildfire resilience.

![Jasper Fire Burn Scar](https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-30_185318_UTC/391249586-31db6714-f3a1-4f44-b377-8fb557f8d2e9.png)

### Objectives
###### * Utilize Landsat imagery to assess post-fire recovery.
###### * Analyze ecological metrics like burn severity and vegetation recovery. 
###### * Develop machine learning workflows for geospatial data analysis. 
###### * Highlight ongoing restoration efforts.

### Background

###### The Jasper Fire, ignited in August 2000, burned over 83,000 acres in the Black Hills National Forest. This catastrophic event led to significant ecological damage and has required extensive restoration efforts. 

#### Challenges of Recovery 

###### * Manual replanting efforts are limited to a 2-week window each year.
###### * Replanting focuses on native Ponderosa Pine at a rate of 400 acres per year.
###### * Increased wildfire frequency due to climate change poses ongoing challenges.

#### Solution: Remote Sensing & AI/ML for Wildfire Recovery 
###### 
###### We developed a prototype solution using geospatial analysis and AI/ML tools within Azure ML Studio to:
###### * Track vegetation recovery
###### * Assess land use changes
###### * Guide future fire resilience efforts

#### Details of the Solution
###### * **Data Storage:** Azure Storage Blob for scalable data storage.
###### * **Geospatial Analysis:**  Landsat imagery to monitor vegetation and land use.
###### * **Python Notebooks:**  Automate data analysis and visualization.
###### * **AI/ML Models:**  Azure ML Studio to predict fire-prone areas and track vegetation health.


## Packages 

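###### To run this notebook you will need Python 3.8 and the following libraries (the last four are imported by cells further down): 
###### rasterio 
###### matplotlib 
###### numpy 
###### scikit-learn
###### pandas
###### pygments
###### azureml-core 
###### geopandas
###### gdal
###### azure-ai-ml
###### azure-identity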


#### Compute and Kernel

###### Make sure your kernel is Python 3.8 - AzureML

## 1. Install Libraries 

In [None]:
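# Create the conda environment (run once)
!conda create -n azureml_py38 -c conda-forge python=3.8 -y
# `!conda activate` runs in a temporary subshell and does not change the
# notebook kernel, so select the "Python 3.8 - AzureML" kernel in the kernel picker instead.

# Install the required libraries into the active kernel
%pip install rasterio matplotlib numpy scikit-learn pandas pygments azureml-core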

In [None]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from osgeo import gdal

# Import other essential libraries for your analysis
# For example:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

In [None]:
from azureml.core import Workspace, Dataset

# Connect to your workspace
ws = Workspace.from_config()  # This assumes you have a config file (config.json)

# Access your data using paths relative to the datastore root
dataset = Dataset.File.from_files(path=[
    (ws.datastores['workspaceblobstore'], 'UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/'), 
    (ws.datastores['workspaceblobstore'], 'UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/'), 
])

# Explore the dataset: a FileDataset lists its files with to_path()
# (to_pandas_dataframe() is only available on a TabularDataset)
dataset.to_path()

In [None]:
import sys
import platform

print("Python version:", sys.version)
print("Python executable:", sys.executable)
print("Platform:", platform.platform())

# Check for specific packages and their versions
try:
    import numpy
    print("NumPy version:", numpy.__version__)
except ImportError:
    print("NumPy not found.")

try:
    import pandas
    print("Pandas version:", pandas.__version__)
except ImportError:
    print("Pandas not found.")

# Add more checks for other packages you need (e.g., matplotlib, scikit-learn, gdal, etc.)
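
###### The per-package checks above can be collapsed into a single loop. The sketch below uses `importlib.metadata` from the standard library (available from Python 3.8). The helper name `report_versions` and the package list are illustrative assumptions, not part of the original notebook.

```python
from importlib import metadata


def report_versions(packages):
    """Return {package: version or None} for the given distribution names."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


# Distribution names as used by pip; adjust the list to your environment
for name, version in report_versions(
    ["numpy", "pandas", "matplotlib", "scikit-learn", "rasterio", "azureml-core"]
).items():
    print(f"{name}: {version or 'not found'}")
```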

In [None]:
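# Import necessary libraries
import numpy as np
import rasterio
import geopandas as gpd
import matplotlib.pyplot as plt

# Load Landsat imagery using rasterio
with rasterio.open('path/to/landsat/image.tif') as src:
    # Cast to float so integer bands cannot overflow or truncate in the ratio
    landsat_data = src.read().astype('float32')
    landsat_meta = src.meta

# Perform band calculations using NumPy
# Assumes the bands are stacked in Landsat 8/9 order, so index 4 is NIR and index 3 is red
nir, red = landsat_data[4], landsat_data[3]
denominator = nir + red
# Pixels with a zero denominator stay NaN instead of triggering a divide-by-zero
ndvi = np.divide(nir - red, denominator, out=np.full_like(denominator, np.nan), where=denominator != 0)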

# Visualize NDVI using matplotlib
plt.figure(figsize=(10, 8))
plt.imshow(ndvi, cmap='RdYlGn')
plt.title('NDVI from Landsat Imagery')
plt.colorbar()
plt.show()

## 2. Landsat Imagery Data Sets 

#### Data Collection
###### We used the following workflow to acquire and preprocess data:
###### * **Earth Explorer:** Landsat satellite imagery of the Jasper Fire region was accessed through the USGS Earth Explorer platform.
###### * **ESPA:** The imagery was downloaded in GeoTIFF format in two batches: jasper-landsat-130 and jasper-landsat-054.
###### * **Azure Storage Blob:** The TIFF files were uploaded to Azure Storage Blob for collaborative access by the team.

#### Landsat Data sourced from Earth Explorer stored in Azure Storage Blob 

In [None]:
# Landsat 130 Paths
landsat_130_ui_path = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/"
landsat_130_azureml_path = "azureml://subscriptions/6328cbe3-a1c5-406e-a25d-72120ce95fdf/resourcegroups/MSLA/workspaces/jasper/datastores/workspaceblobstore/paths/UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/"
landsat_130_blob_url = "https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/"

# Landsat 054 Paths
landsat_054_ui_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"
landsat_054_azureml_path = "azureml://subscriptions/6328cbe3-a1c5-406e-a25d-72120ce95fdf/resourcegroups/MSLA/workspaces/jasper/datastores/workspaceblobstore/paths/UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"
landsat_054_blob_url = "https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"


In [None]:
from azureml.core import Workspace, Dataset

# Load the workspace configuration
ws = Workspace.from_config()

# Access the default datastore
datastore = ws.get_default_datastore()

# Define the paths to your datasets
landsat_054_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"
landsat_130_path = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/"

# Create datasets for the paths
landsat_054_dataset = Dataset.File.from_files((datastore, landsat_054_path))
landsat_130_dataset = Dataset.File.from_files((datastore, landsat_130_path))

# List the files in the datasets
print("Files in Landsat 054 dataset:")
for file_path in landsat_054_dataset.to_path():
    print(file_path)

print("\nFiles in Landsat 130 dataset:")
for file_path in landsat_130_dataset.to_path():
    print(file_path)


## 3. Jasper Fire Area Landsat Data Exploration and Visualization 

### Study Area - Jasper Fire within Black Hills National Forest

![Black Hills National Forest](https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-30_192011_UTC/Screenshot%202024-11-30%20111812.png)





### Burn Boundary and Land Markers 

#### The burn scar is in the southwest corner of the Black Hills National Forest

##### Burn Scar from NBR Images 2020 

![Burn Scar from NBR Images in Black and White](https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-30_192011_UTC/photos/photos/Screenshot%202024-11-30%20144038.png)

#### The burn scar remains 20 years after the wildfire

##### General Context of Fire Location 
![Jasper Fire Burn Boundary](https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-30_213818_UTC/Jasper3.jpg)


#### Data Analysis
###### Burn Severity Mapping: Using indices like dNBR (Differenced Normalized Burn Ratio) to measure the severity of the fire’s impact on vegetation.
###### Vegetation Recovery: Using NDVI (Normalized Difference Vegetation Index) to track regrowth over the past 25 years.
######
###### This data also supports further analyses in the future, such as:
###### Fuel Load Modeling: Leveraging 3D geospatial data, including LiDAR, to model fuel loads and better understand forest recovery dynamics.
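
###### The indices named above share one formula shape: NDVI = (NIR - Red) / (NIR + Red), NBR = (NIR - SWIR2) / (NIR + SWIR2), and dNBR = pre-fire NBR - post-fire NBR. The following is a minimal sketch of those calculations with NumPy. The helper `normalized_difference` and all reflectance values are hypothetical and chosen only for illustration; they are not taken from the project data.

```python
import numpy as np


def normalized_difference(a, b):
    """Compute (a - b) / (a + b), returning NaN where the denominator is zero."""
    a = np.asarray(a, dtype="float32")
    b = np.asarray(b, dtype="float32")
    denominator = a + b
    out = np.full_like(denominator, np.nan)
    return np.divide(a - b, denominator, out=out, where=denominator != 0)


# Hypothetical 2x2 reflectance values (the last pixel has no data in any band)
nir_pre = np.array([[0.40, 0.35], [0.30, 0.0]])
swir2_pre = np.array([[0.10, 0.15], [0.10, 0.0]])
nir_post = np.array([[0.20, 0.35], [0.30, 0.0]])
swir2_post = np.array([[0.30, 0.15], [0.10, 0.0]])
red_post = np.array([[0.15, 0.10], [0.10, 0.0]])

nbr_pre = normalized_difference(nir_pre, swir2_pre)    # NBR before the fire
nbr_post = normalized_difference(nir_post, swir2_post)  # NBR after the fire
dnbr = nbr_pre - nbr_post                               # higher dNBR = more severe burn
ndvi_post = normalized_difference(nir_post, red_post)   # vegetation greenness

print("dNBR:\n", dnbr)
print("NDVI:\n", ndvi_post)
```

###### Casting to float before dividing matters because Landsat bands are often stored as integers, and returning NaN for empty pixels keeps them out of later statistics.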

### Normalized Burn Ratio (NBR)

In [None]:
from azureml.core import Workspace, Dataset

# Load the workspace configuration
ws = Workspace.from_config()

# Access the default datastore
datastore = ws.get_default_datastore()

# Define the paths to your datasets
landsat_054_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"
landsat_130_path = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/"

# Create datasets for the paths
landsat_054_dataset = Dataset.File.from_files((datastore, landsat_054_path))
landsat_130_dataset = Dataset.File.from_files((datastore, landsat_130_path))

# Helper function to filter and sort files
def filter_and_sort_files(dataset):
    file_list = dataset.to_path()
    
    # Filter files containing "NBR" in their names
    filtered_files = [file_path for file_path in file_list if "NBR" in file_path]
    
    # Extract year and month from the filenames and sort
    sorted_files = sorted(
        filtered_files,
        key=lambda x: (int(x.split("_")[3][:4]), int(x.split("_")[3][4:6]))  # Extract year and month
    )
    
    return sorted_files

# Apply the helper function to both datasets
sorted_landsat_054_files = filter_and_sort_files(landsat_054_dataset)
sorted_landsat_130_files = filter_and_sort_files(landsat_130_dataset)

# Print the sorted files
print("Filtered and sorted files in Landsat 054 dataset:")
for file_path in sorted_landsat_054_files:
    print(file_path)

print("\nFiltered and sorted files in Landsat 130 dataset:")
for file_path in sorted_landsat_130_files:
    print(file_path)
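
###### The sort key above relies on the fourth underscore-separated field of each file name being the acquisition date (YYYYMMDD). A hedged sketch of a more explicit helper follows. The function name `acquisition_date` and the example file names are hypothetical; the naming pattern is assumed from the sort key, not confirmed against the actual files.

```python
import os
from datetime import date, datetime


def acquisition_date(file_path):
    """Parse the acquisition date from a Landsat-style file name.

    Assumes the pattern SENSOR_LEVEL_PATHROW_YYYYMMDD_..., in which the
    fourth underscore-separated field is the acquisition date.
    """
    fields = os.path.basename(file_path).split("_")
    return datetime.strptime(fields[3], "%Y%m%d").date()


# Hypothetical file names, for illustration only
example_files = [
    "/LC08_L2SP_033030_20240712_20240719_02_T1_SR_NBR.tif",
    "/LC08_L2SP_033030_20130714_20200912_02_T1_SR_NBR.tif",
]
sorted_files = sorted(example_files, key=acquisition_date)
print(sorted_files[0])
```

###### Taking the base name first keeps the field index stable even when a parent folder contains underscores, as the `UI/2024-11-26_040251_UTC/` folders do.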


### NBR (Normalized Burn Ratio)
##### Image from 2023
![NBR in 2023](https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-30_220544_UTC/Screenshot%202024-11-30%20135822.png)



In [None]:
from azureml.core import Workspace, Dataset

# Load the workspace configuration
ws = Workspace.from_config()

# Access the default datastore
datastore = ws.get_default_datastore()

# Define the paths to your datasets
landsat_054_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"
landsat_130_path = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/"

# Create datasets for the paths
landsat_054_dataset = Dataset.File.from_files((datastore, landsat_054_path))
landsat_130_dataset = Dataset.File.from_files((datastore, landsat_130_path))

# Helper function to filter, separate, and sort files
def filter_separate_and_sort_files(dataset):
    file_list = dataset.to_path()
    
    # Separate files into NBR and NBR2
    nbr_files = [file_path for file_path in file_list if "NBR" in file_path and "NBR2" not in file_path]
    nbr2_files = [file_path for file_path in file_list if "NBR2" in file_path]
    
    # Extract year and month from the filenames and sort each group
    sorted_nbr_files = sorted(
        nbr_files,
        key=lambda x: (int(x.split("_")[3][:4]), int(x.split("_")[3][4:6]))  # Extract year and month
    )
    sorted_nbr2_files = sorted(
        nbr2_files,
        key=lambda x: (int(x.split("_")[3][:4]), int(x.split("_")[3][4:6]))  # Extract year and month
    )
    
    return sorted_nbr_files, sorted_nbr2_files

# Apply the helper function to both datasets
nbr_files_054, nbr2_files_054 = filter_separate_and_sort_files(landsat_054_dataset)
nbr_files_130, nbr2_files_130 = filter_separate_and_sort_files(landsat_130_dataset)

# Print the results
print("Filtered and sorted NBR files in Landsat 054 dataset:")
for file_path in nbr_files_054:
    print(file_path)

print("\nFiltered and sorted NBR2 files in Landsat 054 dataset:")
for file_path in nbr2_files_054:
    print(file_path)

print("\nFiltered and sorted NBR files in Landsat 130 dataset:")
for file_path in nbr_files_130:
    print(file_path)

print("\nFiltered and sorted NBR2 files in Landsat 130 dataset:")
for file_path in nbr2_files_130:
    print(file_path)


In [None]:
from azureml.core import Workspace, Dataset

# Load the workspace configuration
ws = Workspace.from_config()

# Access the default datastore
datastore = ws.get_default_datastore()

# Define the paths to your datasets
landsat_054_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"
landsat_130_path = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/"

# Create datasets for the paths
landsat_054_dataset = Dataset.File.from_files((datastore, landsat_054_path))
landsat_130_dataset = Dataset.File.from_files((datastore, landsat_130_path))

# ... (rest of your code)

# Apply the helper function to both datasets
nbr_files_054, nbr2_files_054 = filter_separate_and_sort_files(landsat_054_dataset)
nbr_files_130, nbr2_files_130 = filter_separate_and_sort_files(landsat_130_dataset)

# Combine the file paths
all_nbr_files = nbr_files_054 + nbr_files_130  # Concatenate the lists

# Print the results
print("All NBR files:")
for file_path in all_nbr_files:
    print(file_path)

# ... (rest of your code to visualize the NBR data)

#### Composite of Multiple Rasters (July 2020 and July 2024)
![Black and White NBR Composite](https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-30_192011_UTC/photos/Screenshot%202024-11-30%20142947.png)

In [None]:
import os
import matplotlib.pyplot as plt
from azureml.core import Workspace, Dataset
import rasterio
from rasterio.errors import RasterioIOError

# Define paths for specific years (adjust as needed)
landsat_130_path_2013 = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/*2013*_SR_NBR.tif"
landsat_130_path_2024 = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/*2024*_SR_NBR.tif"

# Create FileDatasets for the specific files
landsat_130_dataset_2013 = Dataset.File.from_files((datastore, landsat_130_path_2013))
landsat_130_dataset_2024 = Dataset.File.from_files((datastore, landsat_130_path_2024))

# Download the datasets to a local directory
download_path = './nbr_files'
os.makedirs(download_path, exist_ok=True)
landsat_130_dataset_2013.download(target_path=download_path, overwrite=True)
landsat_130_dataset_2024.download(target_path=download_path, overwrite=True)

# Get the downloaded file paths
downloaded_files = [
    os.path.join(download_path, f) for f in os.listdir(download_path) if f.endswith('.tif')
]

# Visualize the downloaded NBR TIFF files
for tiff_file_path in downloaded_files:
    try:
        with rasterio.open(tiff_file_path) as dataset:
            nbr_data = dataset.read(1)  # Read the NBR band

            # Visualize the data
            plt.figure(figsize=(10, 6))
            plt.imshow(nbr_data, cmap='viridis')
            plt.colorbar(label='Pixel Values')
            plt.title(f'NBR Visualization: {tiff_file_path}')
            plt.xlabel('Column')
            plt.ylabel('Row')
            plt.show()

    except Exception as e:
        print(f"Error processing file {tiff_file_path}: {e}")
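
###### The plots above show raw pixel values. Spectral index rasters are commonly delivered as 16-bit integers with a scale factor and a fill value, so it can help to convert them to the -1 to 1 range before plotting. The sketch below assumes a scale factor of 0.0001 and a fill value of -9999. Both are assumptions about these files and should be checked against the file metadata (for example `dataset.nodata` in rasterio) before use. The helper name `rescale_index` is hypothetical.

```python
import numpy as np


def rescale_index(raw, scale=0.0001, nodata=-9999):
    """Convert a scaled integer index raster to floats, with fill pixels as NaN.

    scale and nodata are assumed defaults; confirm them against the file metadata.
    """
    raw = np.asarray(raw)
    values = raw.astype("float32") * scale
    values[raw == nodata] = np.nan
    return values


# Hypothetical raw pixel values, for illustration only
raw_nbr = np.array([[6000, -2000], [-9999, 0]], dtype="int16")
nbr = rescale_index(raw_nbr)
print(nbr)
```

###### The rescaled array can then be passed to `plt.imshow` with `vmin=-1` and `vmax=1` so that images from different dates share one color scale.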

In [None]:
import os
import matplotlib.pyplot as plt
from azureml.core import Workspace, Dataset
from azureml.data.data_reference import DataReference
import rasterio
from rasterio.errors import RasterioIOError

# Load workspace and datastore
ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Define paths with wildcard for NBR files
landsat_130_path = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/*_SR_NBR.tif"
landsat_054_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/*_SR_NBR.tif"

# Create FileDatasets
landsat_130_dataset = Dataset.File.from_files((datastore, landsat_130_path))
landsat_054_dataset = Dataset.File.from_files((datastore, landsat_054_path))

# Helper function to filter NBR TIFF files by year
def filter_nbr_files_by_year(dataset):
    file_list = dataset.to_path()
    nbr_files = [file for file in file_list if "SR_NBR.tif" in file]
    files_by_year = {}
    for file in nbr_files:
        year = file.split("_")[3][:4]  # Extract year (YYYY)
        if year not in files_by_year:
            files_by_year[year] = []
        files_by_year[year].append(file)
    return files_by_year

# Filter NBR files by year
nbr_files_130_by_year = filter_nbr_files_by_year(landsat_130_dataset)

# Download the NBR files once; download() returns the local path of each file.
# (DataReference.as_download() returns a DataReference, not a path rasterio can open.)
download_dir = './nbr_130'
os.makedirs(download_dir, exist_ok=True)
local_files = landsat_130_dataset.download(target_path=download_dir, overwrite=True)
local_by_name = {os.path.basename(p): p for p in local_files}

# Visualize the NBR TIFF files for 2013 and 2024
for year, files in nbr_files_130_by_year.items():
    if year in ['2013', '2024']:  # Filter for the years you want
        print(f"Visualizing NBR files for year {year}...")
        for file_path in files:
            print(f"  Visualizing: {file_path}")
            try:
                # Open the local copy that matches this datastore-relative path
                local_path = local_by_name[os.path.basename(file_path)]
                with rasterio.open(local_path) as dataset:
                    nbr_data = dataset.read(1)  # Read the NBR band

                    # Visualize the data
                    plt.figure(figsize=(10, 6))
                    plt.imshow(nbr_data, cmap='viridis')
                    plt.colorbar(label='Pixel Values')
                    plt.title(f'NBR Visualization: {file_path}')
                    plt.xlabel('Column')
                    plt.ylabel('Row')
                    plt.show()

            except Exception as e:
                print(f"Error processing file {file_path}: {e}")

### Hand Replanting is Essential to Forest Post-Fire Recovery  
###### Each year the US Forest Service has a two-week window in April to replant nursery-grown native Ponderosa Pine throughout the burn area. This effort has continued for over 20 years.

###### This is a visualization of the areas within the Jasper Fire area that have been replanted. 

![New Trees Planted](https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-30_192011_UTC/Screenshot%202024-11-30%20134507.png)

In [None]:
import os  # Needed for os.path.basename in the plot titles
from azureml.core import Workspace, Dataset
import rasterio
from rasterio.errors import RasterioIOError  # Import RasterioIOError
import matplotlib.pyplot as plt

# Load the workspace configuration
ws = Workspace.from_config()

# Access the default datastore
datastore = ws.get_default_datastore()

# Define the paths to your datasets
landsat_054_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"
landsat_130_path = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/"

# Create datasets for the paths
landsat_054_dataset = Dataset.File.from_files((datastore, landsat_054_path))
landsat_130_dataset = Dataset.File.from_files((datastore, landsat_130_path))

# ... (rest of your code)

# Apply the helper function to both datasets
nbr_files_054, nbr2_files_054 = filter_separate_and_sort_files(landsat_054_dataset)
nbr_files_130, nbr2_files_130 = filter_separate_and_sort_files(landsat_130_dataset)

# Combine the file paths
all_nbr_files = nbr_files_054 + nbr_files_130  # Concatenate the lists

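# to_path() returns datastore-relative paths that rasterio cannot open directly,
# so download both datasets and look up each file's local copy by name
local_files = (
    list(landsat_054_dataset.download(target_path='./landsat_054', overwrite=True))
    + list(landsat_130_dataset.download(target_path='./landsat_130', overwrite=True))
)
local_by_name = {os.path.basename(p): p for p in local_files}

# Visualize the NBR data
for file_path in all_nbr_files:
    tiff_file_path = local_by_name[os.path.basename(file_path)]
    try:
        with rasterio.open(tiff_file_path) as dataset:
            nbr_data = dataset.read(1)  # Read the NBR data
            plt.imshow(nbr_data, cmap='RdYlGn')  # Use a suitable colormap for NBR
            plt.colorbar()
            plt.title(f'NBR - {os.path.basename(tiff_file_path)}')  # Set title with file name
            plt.show()

    except RasterioIOError as e:
        print(f"Error opening TIFF file: {e}")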

### Growth Over a 4-Year Period Shown by Change Detection Analysis

#### Change Detection Shows Regrowth That Matches the Vector Data for Hand Replanting

![Change Detection](https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-30_192011_UTC/photos/photos/photos/Screenshot%202024-11-30%20145349.png)
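
###### One simple form of change detection differences two NBR rasters of the same area from different dates and thresholds the result. The sketch below illustrates that idea only and is not necessarily the method used to produce the image above. The function name `classify_change`, the threshold of 0.1, and the pixel values are all hypothetical.

```python
import numpy as np


def classify_change(nbr_early, nbr_late, threshold=0.1):
    """Label each pixel as regrowth (1), decline (-1), or no change (0).

    threshold is an illustrative value, not one taken from the project.
    """
    early = np.asarray(nbr_early, dtype="float32")
    late = np.asarray(nbr_late, dtype="float32")
    delta = late - early
    labels = np.zeros(delta.shape, dtype="int8")
    labels[delta > threshold] = 1    # NBR increased: vegetation regrowth
    labels[delta < -threshold] = -1  # NBR decreased: vegetation loss
    return delta, labels


# Hypothetical NBR values for the same four pixels on two dates
nbr_2020 = np.array([[0.10, 0.30], [0.50, 0.20]])
nbr_2024 = np.array([[0.40, 0.32], [0.20, 0.20]])
delta, labels = classify_change(nbr_2020, nbr_2024)
regrowth_share = np.mean(labels == 1)

print(labels)
print(f"Share of pixels showing regrowth: {regrowth_share:.0%}")
```

###### Both rasters must cover the same grid (same shape, extent, and resolution) for a pixel-by-pixel difference to be meaningful.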


In [None]:
import os
import matplotlib.pyplot as plt
from azureml.core import Workspace, Dataset
from azureml.data.data_reference import DataReference
import rasterio
from rasterio.errors import RasterioIOError

# Load workspace and datastore (as before)
ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Define paths (as before)
landsat_130_path = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/*_SR_NBR.tif"
landsat_054_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/*_SR_NBR.tif"

# Create FileDatasets (as before)
landsat_130_dataset = Dataset.File.from_files((datastore, landsat_130_path))
landsat_054_dataset = Dataset.File.from_files((datastore, landsat_054_path))

# ... (your filter_nbr_files_by_year function)

# Filter NBR files by year
nbr_files_130_by_year = filter_nbr_files_by_year(landsat_130_dataset)

# Download the NBR files once; download() returns the local path of each file.
# (DataReference.as_download() returns a DataReference, not a path rasterio can open.)
download_dir = './nbr_130'
os.makedirs(download_dir, exist_ok=True)
local_files = landsat_130_dataset.download(target_path=download_dir, overwrite=True)
local_by_name = {os.path.basename(p): p for p in local_files}

# Visualize the NBR TIFF files
for year, files in nbr_files_130_by_year.items():
    print(f"Visualizing NBR files for year {year}...")
    for file_path in files:
        print(f"  Visualizing: {file_path}")
        try:
            # Open the local copy that matches this datastore-relative path
            local_path = local_by_name[os.path.basename(file_path)]
            with rasterio.open(local_path) as dataset:
                nbr_data = dataset.read(1)  # Read the NBR band

                # Visualize the data
                plt.figure(figsize=(10, 6))
                plt.imshow(nbr_data, cmap='viridis')
                plt.colorbar(label='Pixel Values')
                plt.title(f'NBR Visualization: {file_path}')
                plt.xlabel('Column')
                plt.ylabel('Row')
                plt.show()

        except Exception as e:
            print(f"Error processing file {file_path}: {e}")

In [None]:
from azureml.core import Workspace, Dataset
import matplotlib.pyplot as plt
import numpy as np
import rasterio
from rasterio.plot import show

# Load the workspace configuration
ws = Workspace.from_config()

# Access the default datastore
datastore = ws.get_default_datastore()

# Define the paths to your datasets
landsat_054_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"
landsat_130_path = "UI/2024-11-26_040251_UTC/11022024-201615-130-20241126T031215Z-001/"

# Create datasets for the paths
landsat_054_dataset = Dataset.File.from_files((datastore, landsat_054_path))
landsat_130_dataset = Dataset.File.from_files((datastore, landsat_130_path))

# Filter function for NBR files
def filter_nbr_files(dataset):
    file_list = dataset.to_path()
    nbr_files = [file_path for file_path in file_list if "NBR" in file_path and "NBR2" not in file_path]
    return nbr_files

# Extract and analyze the data quality for NBR images
# nbr_files must be paths that rasterio can open (local or mounted files)
def analyze_data_quality(nbr_files):
    mean_values = []
    std_values = []
    valid_dates = []
    missing_files = 0

    # Iterate over the NBR files to analyze them
    for file in nbr_files:
        try:
            # Load the file into a numpy array using rasterio
            with rasterio.open(file) as src:
                data = src.read(1)  # Read the first band (NBR data)
            if data.size == 0 or np.all(data == 0):
                missing_files += 1
                continue
            # The acquisition date (YYYYMMDD) is the fourth underscore-separated
            # field of the file name, not part of the folder path
            valid_dates.append(file.split("/")[-1].split("_")[3])
            # Compute statistics per image so rasters of different shapes can be mixed
            mean_values.append(np.mean(data))
            std_values.append(np.std(data))

        except Exception as e:
            print(f"Error loading {file}: {e}")
            missing_files += 1

    # Return the per-image statistics as numpy arrays for plotting
    if mean_values:
        return np.array(mean_values), np.array(std_values), valid_dates, missing_files
    else:
        return None, None, None, missing_files

# Process both datasets
nbr_files_054 = filter_nbr_files(landsat_054_dataset)
nbr_files_130 = filter_nbr_files(landsat_130_dataset)

# Analyze data quality
mean_values_054, std_values_054, valid_dates_054, missing_files_054 = analyze_data_quality(nbr_files_054)
mean_values_130, std_values_130, valid_dates_130, missing_files_130 = analyze_data_quality(nbr_files_130)

# Visualize results (Mean values over time)
def plot_data_quality_trends(valid_dates, mean_values, std_values, dataset_name):
    plt.figure(figsize=(10, 6))
    plt.plot(valid_dates, mean_values, label=f'{dataset_name} Mean NBR', color='b', marker='o')
    plt.fill_between(valid_dates, mean_values - std_values, mean_values + std_values, color='blue', alpha=0.2)
    plt.xlabel('Date')
    plt.ylabel('Mean NBR Value')
    plt.title(f'{dataset_name} NBR Data Quality Over Time')
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.legend()
    plt.show()

# Plot for Landsat 054 and Landsat 130
if mean_values_054 is not None:
    plot_data_quality_trends(valid_dates_054, mean_values_054, std_values_054, "Landsat 054")

if mean_values_130 is not None:
    plot_data_quality_trends(valid_dates_130, mean_values_130, std_values_130, "Landsat 130")

# Report missing data
print(f"Missing NBR files in Landsat 054 dataset: {missing_files_054}")
print(f"Missing NBR files in Landsat 130 dataset: {missing_files_130}")
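
###### One way to summarize the trend in the per-image means is a least-squares line through them, whose slope gives the average change in mean NBR per year. The sketch below uses `np.polyfit` on hypothetical values; the years and means are invented for illustration and are not results from this project. With real output, the date strings in `valid_dates` would first need to be converted to numeric years.

```python
import numpy as np

# Hypothetical acquisition years and mean NBR values, for illustration only
years = np.array([2013, 2016, 2019, 2022, 2024], dtype="float64")
mean_nbr = np.array([0.10, 0.16, 0.22, 0.28, 0.32])

# Fit mean_nbr = slope * (year - first year) + intercept.
# Measuring years from the first observation keeps the fit well conditioned.
slope, intercept = np.polyfit(years - years[0], mean_nbr, deg=1)

print(f"Mean NBR change per year: {slope:.4f}")
print(f"Fitted mean NBR in {int(years[0])}: {intercept:.4f}")
```

###### A positive slope is consistent with vegetation recovery, although a straight line is only a rough summary and seasonal differences between images can dominate the signal.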


In [None]:
import math
import os
import rasterio
import matplotlib.pyplot as plt
from collections import defaultdict

# Helper function to group files by acquisition year, read from the
# YYYYMMDD field of the Landsat file name
def group_files_by_year(file_paths):
    files_by_year = defaultdict(list)
    for file_path in file_paths:
        year = os.path.basename(file_path).split("_")[3][:4]
        files_by_year[year].append(file_path)
    return files_by_year

# Group NBR files by year
nbr_files_by_year = group_files_by_year(nbr_files_054 + nbr_files_130)

# Visualize one GeoTIFF per year, three plots per row
n_cols = 3
n_rows = math.ceil(len(nbr_files_by_year) / n_cols)
plt.figure(figsize=(15, 10))
for i, (year, files) in enumerate(sorted(nbr_files_by_year.items())):
    file_path = files[0]  # Pick the first file for the year
    with rasterio.open(file_path) as src:
        data = src.read(1)  # Read the first band
        bounds = src.bounds  # Geo-referenced bounds
        crs = src.crs  # Coordinate reference system

    # Plot the data
    plt.subplot(n_rows, n_cols, i + 1)  # Grid grows with the number of years
    plt.imshow(data, cmap="viridis")
    plt.colorbar(label="NBR Values")
    plt.title(f"Year: {year}")
    plt.axis("off")

    # Optionally, print GeoTIFF metadata (e.g., bounds, CRS)
    print(f"Year: {year}, File: {file_path}")
    print(f"Bounds: {bounds}, CRS: {crs}")

plt.tight_layout()
plt.show()



In [None]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
import os
import rasterio
from rasterio.plot import show
import matplotlib.pyplot as plt

# Step 1: Authenticate and initialize Azure ML Client
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Step 2: Define datasets
datasets = ["jasper-landsat-130", "jasper-landsat-054"]

# Step 3: Function to process NBR .tif files
def process_tif_file(tif_path):
    try:
        # Open and read the .tif file
        with rasterio.open(tif_path) as dataset:
            nbr_data = dataset.read(1)  # Read the first band
            print(f"File Metadata for {os.path.basename(tif_path)}: {dataset.meta}")

        # Visualize the NBR data
        plt.figure(figsize=(10, 8))
        plt.title(f"NBR Visualization: {os.path.basename(tif_path)}")
        plt.imshow(nbr_data, cmap="RdYlGn", vmin=-1, vmax=1)  # Set NBR range
        plt.colorbar(label="NBR Value")
        plt.xlabel("Column Index")
        plt.ylabel("Row Index")
        plt.grid(False)
        plt.show()

    except Exception as e:
        print(f"Error processing file {tif_path}: {e}")

# Step 4: Process each dataset
for dataset_name in datasets:
    print(f"Processing dataset: {dataset_name}")
    
    # Fetch dataset details from Azure Datastore
    data_asset = ml_client.data.get(name=dataset_name, version="2")  # Update version if needed
    dataset_path = data_asset.path
    
    # List all files in the dataset
    # Note: data_asset.path is usually a remote URI (for example azureml://...);
    # os.listdir only works when it points to a local or mounted folder
    print(f"Dataset path: {dataset_path}")
    tif_files = [
        f"{dataset_path}/{file}" for file in os.listdir(dataset_path) if file.endswith(".tif")
    ]
    
    # Process each .tif file
    for tif_file in tif_files:
        print(f"Processing file: {tif_file}")
        process_tif_file(tif_file)



In [None]:
from azureml.core import Workspace, Dataset
import pandas as pd
import os

ws = Workspace.from_config()
datastore = ws.get_default_datastore()

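# Create Datasets for Landsat 130 and 054
# NOTE: landsat_130_ui_path and landsat_054_ui_path must already be defined
# (datastore-relative folder paths for each dataset).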
landsat_130_dataset = Dataset.File.from_files(path=(datastore, landsat_130_ui_path))
landsat_054_dataset = Dataset.File.from_files(path=(datastore, landsat_054_ui_path))

# Download the files
landsat_130_dataset.download(target_path='./landsat_130', overwrite=True)
landsat_054_dataset.download(target_path='./landsat_054', overwrite=True)

# Assuming your Landsat data is in a format like CSV or GeoJSON
# Adjust the following code based on your actual data format

def load_csv_folder(folder):
    """Read every CSV file in a folder into one DataFrame (empty if none are found)."""
    frames = [
        pd.read_csv(os.path.join(folder, filename))  # Or pd.read_json(), etc.
        for filename in sorted(os.listdir(folder))
        if filename.endswith('.csv')  # Or your file extension
    ]
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()

# Read the downloaded files into pandas DataFrames
landsat_130_df = load_csv_folder('./landsat_130')
landsat_054_df = load_csv_folder('./landsat_054')

# Now you have landsat_130_df and landsat_054_df as pandas DataFrames

In [None]:
from azureml.core import Workspace, Datastore, Dataset
from azureml.data.datapath import DataPath
import os



In [None]:
# Define the paths
blob_path = "UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/"
datastore_path = [(datastore, blob_path)]

# Create a FileDataset that points at the files under blob_path
dataset = Dataset.File.from_files(path=datastore_path)


In [None]:
# Download files locally for inspection
local_paths = dataset.download(target_path="data", overwrite=True)  # Returns a list of downloaded paths

# List all downloaded files from the paths
import os

# Check each path for files
for path in local_paths:
    if os.path.isdir(path):  # If it's a directory, list files inside
        print(f"Files in directory {path}:")
        print(os.listdir(path))
    else:  # If it's a file, just print the file path
        print(f"Downloaded file: {path}")



In [None]:
# Filter .tif files from the list of downloaded paths
tif_files = [file for file in local_paths if file.endswith('.tif')]
print(tif_files[:10])  # Print the first 10 .tif files



In [None]:
print(f"Total files: {len(local_paths)}")


In [None]:
from collections import Counter

# Count file extensions among the downloaded paths
extensions = [os.path.splitext(file)[1] for file in local_paths]
extension_counts = Counter(extensions)
print(extension_counts)


In [None]:
!pip install mltable

In [None]:
!pip install azure-ai-ml
!pip install azure-identity


In [None]:
from azure.ai.ml import MLClient, command, Input
from azure.ai.ml.constants import AssetTypes, InputOutputModes
from azure.identity import DefaultAzureCredential

# Initialize MLClient
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Get your dataset
data_asset = ml_client.data.get("jasper-landsat-054", version="1")

# Define the job command to list TIFF files
# (SDK v2 command expressions use double curly braces: ${{inputs.<name>}})
job = command(
    command="find ${{inputs.data}} -type f -name '*.tif'",
    inputs={
        "data": Input(
            path=data_asset.id,
            type=AssetTypes.URI_FOLDER,
            mode=InputOutputModes.RO_MOUNT,
        ),
    },
    environment="azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
)

# Submit the job
returned_job = ml_client.jobs.create_or_update(job)

# Print job details
print(f"Job submitted. ID: {returned_job.id}")



In [None]:
import os

# `local_paths` is the list of file paths returned by dataset.download()
local_paths = dataset.download(target_path="data", overwrite=True)

# Iterate through each path in the list and print the files
for path in local_paths:
    if os.path.isdir(path):  # Check if the path is a directory
        print(f"Contents of {path}:")
        print(os.listdir(path))
    else:
        print(f"File: {path}")


In [None]:
import os
import re
import rasterio
import matplotlib.pyplot as plt

# List all files
files = [path for path in local_paths if os.path.isfile(path)]

# Debugging: Print the first 5 files for inspection
print(f"Total files found: {len(files)}")
print("Sample file paths:")
for file in files[:5]:  # Print a sample
    print(file)

# Updated regex to capture the correct date portion (e.g., 20220708 from _20220708_)
date_pattern = re.compile(r"_(\d{4})(\d{2})(\d{2})_")  # Matches _YYYYMMDD_

# Function to filter files by year
def filter_files_by_year(files, year):
    filtered_files = []
    for file in files:
        match = date_pattern.search(file)
        if match:
            # Extract the year, month, and day from the date
            file_year = match.group(1)  # Extract YYYY
            file_month = match.group(2)  # Extract MM (optional for future use)
            file_day = match.group(3)  # Extract DD (optional for future use)

            # Debug: Print matched years and full dates
            print(f"Matched date: {file_year}-{file_month}-{file_day} for file: {file}")

            if file_year == year:
                filtered_files.append(file)
    return filtered_files

# Visualize raster files for a given year
def visualize_files_by_year(files, year):
    print(f"\nVisualizing data for the year {year}:")
    year_files = filter_files_by_year(files, year)
    if not year_files:
        print(f"No files found for the year {year}.")
        return
    
    for file in year_files:
        try:
            with rasterio.open(file) as src:
                data = src.read(1)  # Read the first band
                plt.figure(figsize=(10, 6))
                plt.imshow(data, cmap='viridis')
                plt.colorbar(label="Pixel Values")
                plt.title(f"Visualization of {os.path.basename(file)}")
                plt.show()
        except Exception as e:
            print(f"Could not process file {file}: {e}")

# Loop through the years 2020–2024 and visualize
years = ["2020", "2021", "2022", "2023", "2024"]
for year in years:
    visualize_files_by_year(files, year)
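
The year filter above rescans the whole file list once per year. As a minimal sketch of an alternative, assuming file names embed the acquisition date as `_YYYYMMDD_` (the same assumption as the regex above), the paths can be grouped by year in a single pass. The helper name `group_files_by_year` and the sample file names are hypothetical and used only for illustration.

```python
import os
import re
from collections import defaultdict

# Same assumption as the cell above: the date appears as _YYYYMMDD_ in the file name.
DATE_PATTERN = re.compile(r"_(\d{4})(\d{2})(\d{2})_")

def group_files_by_year(paths):
    """Return {year: [paths]} for every path whose file name contains a date."""
    grouped = defaultdict(list)
    for path in paths:
        # Match against the file name only, so dated folder names are ignored.
        match = DATE_PATTERN.search(os.path.basename(path))
        if match:
            grouped[match.group(1)].append(path)
    return dict(grouped)

# Hypothetical file names, for illustration only.
sample_paths = [
    "data/scene_20220708_NBR.tif",
    "data/scene_20230615_NDVI.tif",
    "data/scene_20230801_NDVI.tif",
    "data/readme.txt",
]
files_by_year = group_files_by_year(sample_paths)
print(sorted(files_by_year))
```

The resulting dictionary could then be looped over in place of the per-year calls, with each year's list passed to the plotting code.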


In [None]:
from azureml.core import Workspace, Dataset, Datastore
import os
import re
import rasterio
import numpy as np
import matplotlib.pyplot as plt

# Step 1: Connect to Azure ML Workspace
workspace = Workspace.from_config()

# Step 2: Access Datastore
datastore = Datastore.get(workspace, 'workspaceblobstore')

# Step 3: Define a single path for the NDVI files
relative_path = 'UI/2024-11-19_143459_UTC/11022024-201612-054-20241119T143214Z-001/'  # Adjust if necessary

# Step 4: Access dataset
ndvi_dataset = Dataset.File.from_files((datastore, relative_path))

# Step 5: Download dataset locally (download() returns a list of file paths)
local_path = ndvi_dataset.download(target_path='.', overwrite=True)

# Step 6: Collect all files from the downloaded paths
# (os.walk expects a single directory, not a list, so filter the list directly)
all_files = [path for path in local_path if os.path.isfile(path)]

# Step 7: Filter files for NDVI in June, July, August 2023
# Accepts both _YYYYMM_ and _YYYYMMDD_ date stamps in the file name
pattern = re.compile(r".*NDVI.*_2023(06|07|08)(\d{2})?_.*\.tif$")
filtered_files = [file for file in all_files if pattern.search(file)]

# Debugging: Check filtered files
print(f"Total NDVI files for June, July, August 2023: {len(filtered_files)}")
for file in filtered_files:
    print(f"Matched file: {file}")

# Step 8: Process and visualize NDVI files
if not filtered_files:
    print("No NDVI files found for the specified criteria.")
else:
    for file_path in filtered_files:
        try:
            print(f"Processing file: {file_path}")
            with rasterio.open(file_path) as src:
                # Read NDVI data
                data = src.read(1)  # Assuming first band contains NDVI
                mean_value = np.nanmean(data)
                print(f"Mean NDVI value: {mean_value}")
                
                # Visualize NDVI data
                plt.figure(figsize=(10, 6))
                plt.imshow(data, cmap='viridis')
                plt.colorbar(label='NDVI Values')
                plt.title(f"NDVI Visualization: {os.path.basename(file_path)}")
                plt.show()
        except Exception as e:
            print(f"Error processing {file_path}: {e}")


### NDVI 

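##### The Normalized Difference Vegetation Index (NDVI) is a remote sensing index that indicates the health and density of vegetation in an area. It is computed from near-infrared (NIR) and red reflectance as (NIR - Red) / (NIR + Red), and ranges from -1 to 1.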
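
As a rough illustration of the index, the sketch below computes NDVI from two small reflectance arrays with NumPy. It is not the project's processing code: the function name `compute_ndvi` and the sample values are hypothetical, and in practice the arrays would come from the red and near-infrared bands of a Landsat scene.

```python
import numpy as np

def compute_ndvi(nir, red):
    """Compute NDVI = (NIR - Red) / (NIR + Red); NaN where NIR + Red is zero."""
    nir = np.asarray(nir, dtype="float64")
    red = np.asarray(red, dtype="float64")
    denom = nir + red
    ndvi = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels stay NaN.
    np.divide(nir - red, denom, out=ndvi, where=denom != 0)
    return ndvi

# Hypothetical reflectance values, for illustration only.
nir_band = np.array([[0.50, 0.40], [0.30, 0.00]])
red_band = np.array([[0.10, 0.20], [0.30, 0.00]])

ndvi = compute_ndvi(nir_band, red_band)
print(ndvi)
print("Mean NDVI (ignoring NaN):", np.nanmean(ndvi))
```

Leaving empty pixels as NaN keeps them out of `np.nanmean`, which matters when a scene contains fill values outside the area of interest.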

##### Burn Scar 2022

![NDVI Burn Scar](https://firerecovery.blob.core.windows.net/azureml-blobstore-d516a889-2242-4cd1-9ee4-4a3315f1782b/UI/2024-11-30_192011_UTC/photos/photos/photos/photos/Screenshot%202024-11-30%20150106.png)





### Results and Discussion
###### As a group, we learned a great deal about Azure Blob Storage, Azure Machine Learning Studio, machine learning models, and geospatial packages. We also gained a better understanding of forest fires and the long process of forest recovery. In the case of the Jasper Fire, recovery has not been a purely natural regrowth; it has required careful, diligent, long-range intervention under tight constraints.

###### The TIFF files are very large, so it took some time just to learn how to acquire them and bring them into Azure Blob Storage.
###### 
###### Additional challenges included getting library dependencies to work together and resolving numerous errors.

###### There is considerable potential for machine learning tools such as Azure AI to explore satellite imagery and spectral indices (NDVI, NBR), and to analyze burn severity, vegetation recovery, and change over time.
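
One common way to quantify burn severity from NBR is the differenced NBR (dNBR): pre-fire NBR minus post-fire NBR, where NBR = (NIR - SWIR) / (NIR + SWIR). The sketch below illustrates that calculation on small arrays. The function names and sample values are hypothetical, and it is an illustration under those assumptions, not the analysis performed in this project.

```python
import numpy as np

def compute_nbr(nir, swir):
    """Compute NBR = (NIR - SWIR) / (NIR + SWIR); NaN where NIR + SWIR is zero."""
    nir = np.asarray(nir, dtype="float64")
    swir = np.asarray(swir, dtype="float64")
    denom = nir + swir
    nbr = np.full(denom.shape, np.nan)
    np.divide(nir - swir, denom, out=nbr, where=denom != 0)
    return nbr

def compute_dnbr(nbr_prefire, nbr_postfire):
    """dNBR = pre-fire NBR - post-fire NBR; larger values suggest more severe burn."""
    return nbr_prefire - nbr_postfire

# Hypothetical reflectance values: the first pixel burns, the second is unchanged.
nir_pre, swir_pre = np.array([0.50, 0.50]), np.array([0.10, 0.10])
nir_post, swir_post = np.array([0.20, 0.50]), np.array([0.40, 0.10])

dnbr = compute_dnbr(compute_nbr(nir_pre, swir_pre), compute_nbr(nir_post, swir_post))
print(dnbr)
```

Repeating the same calculation with later post-fire scenes gives a simple per-pixel view of change over time, since dNBR tends to fall as vegetation returns.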


### Conclusion


####  Takeaways: Key Insights for the Jasper Fire Recovery

###### * Understanding the long-term recovery process highlights the need for continued innovation in restoration and resilience planning.
###### * Satellite imagery and AI-driven analysis enable repeated, large-scale tracking of vegetation regrowth, land use changes, and fire-prone areas.
###### * Manual replanting is essential for forest restoration, but it is constrained by time and resource limits.
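
The replanting constraint can be put in rough numbers using the figures from the Background section (over 83,000 acres burned, about 400 acres replanted per year). The sketch below assumes, purely for illustration, that a given fraction of the burned area needs manual replanting; the fractions are hypothetical and are not project results.

```python
# Figures taken from the Background section of this document.
BURNED_ACRES = 83_000
REPLANT_ACRES_PER_YEAR = 400

def years_to_replant(fraction_needing_replanting):
    """Years needed to replant the given fraction of the burn area at the current rate."""
    acres = BURNED_ACRES * fraction_needing_replanting
    return acres / REPLANT_ACRES_PER_YEAR

# Hypothetical fractions, for illustration only.
for fraction in (0.10, 0.25, 1.00):
    print(f"{fraction:.0%} of burn area: {years_to_replant(fraction):.1f} years")
```

Even under the smallest of these assumed fractions, the timescale is measured in decades, which is why remote monitoring of natural regrowth is useful for deciding where manual effort should go.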


### References 
###### ● Chen, X., Vogelmann, J. E., Rollins, M., Ohlen, D., Key, C. H., Yang, L., Huang, C., & Shi, H. (2011). Detecting post-fire burn severity and vegetation recovery using multitemporal remote sensing spectral indices and field-collected composite burn index data in a ponderosa pine forest. International Journal of Remote Sensing, 32(23), 7905–7927. https://doi.org/10.1080/01431161.2010.524678
###### ● Crookston, N. L., & Dixon, G. E. (2005). The forest vegetation simulator: A review of its structure, content, and applications. Computers and Electronics in Agriculture, 49, 60–80. https://doi.org/10.1016/j.compag.2005.08.005
###### ● Ex, S. A., Smith, F. W., Keyser, T. L., & Rebain, S. A. (2016). Development and evaluation of equations for estimating canopy fuel characteristics for 10 major conifers of the western United States. Western Journal of Applied Forestry, 31(4), 161–169. https://doi.org/10.5849/wjaf.15-076
###### ● Fulé, P. Z., Waltz, A. E. M., Covington, W. W., & Heinlein, T. A. (2001). Measuring forest restoration effectiveness in reducing hazardous fuels. Journal of Forestry, 99(1), 24–29. https://doi.org/10.1093/jof/99.1.24
###### ● Hawley, C. M., Loudermilk, E. L., Rowell, E. M., & Pokswinski, S. (2018). A novel approach to 3D fuels biomass sampling for 3D fuel characterization. MethodsX, 5, 1597–1604. https://doi.org/10.1016/j.mex.2018.10.002
###### ● Keyser, T., Smith, F. W., & Shepperd, W. (2009). Estimating canopy fuels and their impact on potential fire behavior for ponderosa pine in the Black Hills, South Dakota. DigitalCommons@University of Nebraska - Lincoln. https://digitalcommons.unl.edu/jfspresearch/138
###### ● Maxwell, A. E., Gallagher, M. R., Minicuci, N., Bester, M. S., Loudermilk, E. L., Pokswinski, S. M., & Skowronski, N. S. (2023). Impact of reference data sampling density for estimating plot-level average shrub heights using terrestrial laser scanning data. Fire, 6(3), 98. https://doi.org/10.3390/fire6030098
###### ● Parks, S. A., Dobrowski, S. Z., Parisien, M.-A., Miller, C., & Hudak, A. T. (2022). Burn severity mapping: A synthesis of the state of the science and a look to the future. International Journal of Wildland Fire, 31, 539–560. https://doi.org/10.1071/WF22050
###### ● Picotte, J. J., Cansler, C. A., Kolden, C. A., Lutz, J. A., Key, C., Benson, N. C., & Robertson, K. M. (2021). Development of nationally consistent and ecologically relevant fire severity metrics from Landsat time series. Remote Sensing of Environment, 263, 112569. https://doi.org/10.1016/j.rse.2021.112569
###### ● Reiner, A. L., Baker, C., Wahlberg, M., Rau, B. M., & Birch, J. D. (2022). Region-specific remote-sensing models for predicting burn severity, basal area change, and canopy cover change following fire in the southwestern United States. Fire, 5(5), 137. https://doi.org/10.3390/fire5050137
###### ● Ross, C. W., Loudermilk, E. L., Skowronski, N., Pokswinski, S., Hiers, J. K., & O’Brien, J. (2022). LiDAR voxel-size optimization for canopy gap estimation. Remote Sensing, 14(5), 1054. https://doi.org/10.3390/rs14051054
###### ● Rowell, E., Loudermilk, E. L., Pokswinski, S., Hiers, J., O’Brien, J., Mathey, J., & Robertson, K. (2020). Coupling terrestrial laser scanning with 3D fuel biomass sampling for advancing wildland fuels characterization. Forest Ecology and Management, 462, 117945. https://doi.org/10.1016/j.foreco.2020.117945
###### ● Scheffler, D., & Frantz, D. (2022). Regionally optimized spectral harmonization of Landsat-5, Landsat-7, and Landsat-8 for improved forest monitoring in the Bavarian Forest National Park, Germany. International Journal of Applied Earth Observation and Geoinformation, 115, 103126. https://doi.org/10.1016/j.jag.2022.103126
###### ● Smith, A. M. S., Lentile, L. B., Hudak, A. T., & Morgan, P. (2007). Evaluation of linear spectral unmixing and ΔNBR for predicting post-fire recovery in a North American ponderosa pine forest. International Journal of Remote Sensing, 28(22), 5159–5166. https://doi.org/10.1080/01431160701395161
###### ● U.S. Department of Agriculture, Forest Service. (2003). Fire and Fuels Extension to the Forest Vegetation Simulator. https://www.fs.fed.us/fmsc/fvs/FFEguide.pdf
###### ● Xi, Z., Chasmer, L., & Hopkinson, C. (2023). Delineating and reconstructing 3D forest fuel components and volumes with terrestrial laser scanning. Remote Sensing, 15(19), 4778. https://doi.org/10.3390/rs15194778


#### Contact
###### For any inquiries, please reach out to the team:
###### 
###### Manuel Malla: manuel.malla@studentambassadors.com
###### Yash Padhara: Yash.Padhara@studentambassadors.com
###### Sneha Pandey: sneha.pandey@studentambassadors.com
###### Philippa Burgess: philippa.burgess@studentambassadors.com