In [1]:
#@title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the "License")

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# 🔥 Wildfire spread forecasting -- Overview

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/wildfires-forecasting/notebooks/1-overview.ipynb)

In 2021, wildfires destroyed [7 million acres of wildland](https://www.ncei.noaa.gov/access/monitoring/monthly-report/fire/202113)--roughly the same area as the state of Massachusetts. These wildfires destroyed homes, towns, and people's lives.

<figure>
<img alt="Exterior image of a house mostly destroyed by flames"
     src="https://media.cnn.com/api/v1/images/stellar/prod/200908110238-07-wildfires-0907-malden-wa.jpg?q=x_17,y_443,h_876,w_1556,c_crop/h_720,w_1280"/>
<figcaption><i>Figure. The 2020 Babb Road wildfire destroying a home in Malden, WA</i></figcaption>
</figure>


For a wildfire to catch hold and spread, a set of conditions must exist in an environment. These conditions have been measured and recorded in multiple sources--sources that are available in Earth Engine. Imagine if you could build an ML model that predicts the likelihood and spread of wildfires!

This sample is broken into the following notebooks:

* 🧭 **Overview**. Go through what we want to achieve and explore the data we want to use as inputs and outputs for our model.
* 🗄️ [**Create the dataset**](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/wildfires-forecasting/notebooks/2-dataset.ipynb): Use [Apache Beam](https://beam.apache.org/) to fetch data from [Earth Engine](https://earthengine.google.com/) and create a dataset for our model in [Dataflow](https://cloud.google.com/dataflow).
* 🧠 **Train the model**: Build a simple _Fully Convolutional Network_ in [PyTorch](https://pytorch.org/) and train it in [Vertex AI](https://cloud.google.com/vertex-ai/docs/training/custom-training) with the dataset we created.
* 🔮 **Model predictions**: Get predictions from the model with data it has never seen before.

This sample uses geospatial satellite and topographical data from [Google Earth Engine](https://earthengine.google.com/). Using satellite imagery, you'll build and train a model that predicts the potential spread of a "current" wildfire.

+ ⏲️ Time estimate: TT hours
+ 💰 Cost estimate: Around \\$DD USD (free if you use \\$300 Cloud credits)

💚 This is one of many machine learning how-to samples inspired by real climate solutions aired on the People and Planet AI 🎥 series.

## 📒 Using this interactive notebook

Click the **run** icons ▶️ of each section within this notebook.

![Run cell](data/images/run-cell.png)

> 💡 Alternatively, you can run the currently selected cell with `Ctrl + Enter` (or `⌘ + Enter` on a Mac).

This **notebook code lets you train and deploy an ML model** from end-to-end. When you run a code cell, the code runs in the notebook's runtime, so you're not making any changes to your personal computer.

> ⚠️ **To avoid any errors**, run the sections in order and wait for each one to finish before clicking the next “run” icon.

This sample must be connected to a **Google Cloud project**; nothing else is required.

You can use an existing project, or you can create a new Cloud project [with free Cloud credits](https://cloud.google.com/free/docs/gcp-free-tier).

## 🎬 Before you begin

Let's start by cloning the GitHub repository and installing some dependencies.

In [None]:
# Now let's get the code from GitHub and navigate to the sample.
!git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
%cd python-docs-samples/people-and-planet-ai/wildfires-forecasting

Next, we have to authenticate Earth Engine and initialize it.
Since we've already authenticated to this [Colab](https://www.youtube.com/watch?v=rNgswRZ2C1Y) and saved our credentials as the [Google default credentials](https://google-auth.readthedocs.io/en/master/reference/google.auth.html#google.auth.default),
we can reuse those credentials for Earth Engine.

> 💡 Since we're making **large amounts of automated requests to Earth Engine**, we want to use the
[high-volume endpoint](https://developers.google.com/earth-engine/cloud/highvolume).

In [2]:
import ee
import google.auth

def ee_init() -> None:
    """Authenticate and initialize Earth Engine with the default credentials."""
    # Use the Earth Engine High Volume endpoint.
    #   https://developers.google.com/earth-engine/cloud/highvolume
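    # google.auth.default() also returns the project ID tied to the default
    # credentials; it can be None when no project is configured.
    credentials, project = google.auth.default(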
        scopes=[
            "https://www.googleapis.com/auth/cloud-platform",
            "https://www.googleapis.com/auth/earthengine",
        ]
    )
    ee.Initialize(
        credentials,
        project=project,
        opt_url="https://earthengine-highvolume.googleapis.com",
    )

In [3]:
ee_init()

# 📚 Understand the data

Before we begin, let's consider what we want to achieve and the datasets we chose for that purpose.

## 🎯 **Goal**: Time series forecasting and image segmentation

The goal of our model is to use satellite images to estimate the likelihood and potential spread of wildfires in a given geographical region. The output layer combines a time series forecast (how likely the fire is to spread) with a classification (on fire or not on fire).

## 🛰 Inputs: Satellite images

To achieve our goal, we must combine multiple geographical datasets into a single dataset (or map in this case). Each input--also known as a "feature" or "independent variable"--is stored as a single band within the resulting map. The following list shows the datasets used for this example:

* **USGS/SRTMGL1_003**: NASA SRTM Digital Elevation 30m
* **GRIDMET/DROUGHT**: CONUS Drought Indices
* **ECMWF/ERA5/DAILY**: Daily Aggregates - Latest Climate Reanalysis Produced by ECMWF / Copernicus Climate Change Service
* **IDAHO_EPSCOR/GRIDMET**: University of Idaho Gridded Surface Meteorological Dataset
* **CIESIN/GPWv411/GPW_Population_Density**: Population Density (Gridded Population of the World Version 4.11)

The following table shows the model input variables, their source datasets, and the name used for each variable in our model.

| Feature | Original Source | Variable name |
| --------|:----------------|:--------------|
| Elevation | `USGS/SRTMGL1_003` | `elevation` |
| Palmer Drought Severity Index | `GRIDMET/DROUGHT` | `pdsi` |
| Avg air temperature at 2m height | `ECMWF/ERA5/DAILY` | `mean_2m_air_temperature` |
| Total precipitation | `ECMWF/ERA5/DAILY` | `total_precipitation` |
| 10m u-component of wind (daily avg) | `ECMWF/ERA5/DAILY` | `u_component_of_wind_10m` |
| 10m v-component of wind (daily avg) | `ECMWF/ERA5/DAILY` | `v_component_of_wind_10m` |
| Precipitation amount | `IDAHO_EPSCOR/GRIDMET` | `pr` |
| Specific humidity | `IDAHO_EPSCOR/GRIDMET` | `sph` |
| Wind direction | `IDAHO_EPSCOR/GRIDMET` | `th` |
| Minimum temperature | `IDAHO_EPSCOR/GRIDMET` | `tmmn` |
| Maximum temperature | `IDAHO_EPSCOR/GRIDMET` | `tmmx` |
| Wind velocity at 10m | `IDAHO_EPSCOR/GRIDMET` | `vs` |
| Energy release component | `IDAHO_EPSCOR/GRIDMET` | `erc` |
| Population density (per square km) | `CIESIN/GPWv411/GPW_Population_Density` | `population_density` |




In [7]:
# Maps each Earth Engine dataset ID to the bands used as model inputs.
INPUTS = {
    'USGS/SRTMGL1_003': ['elevation'],
    'GRIDMET/DROUGHT': ['pdsi'],
    'ECMWF/ERA5/DAILY': [
        'mean_2m_air_temperature',
        'total_precipitation',
        'u_component_of_wind_10m',
        'v_component_of_wind_10m',
    ],
    'IDAHO_EPSCOR/GRIDMET': [
        'pr',
        'sph',
        'th',
        'tmmn',
        'tmmx',
        'vs',
        'erc',
    ],
    'CIESIN/GPWv411/GPW_Population_Density': ['population_density'],
    'MODIS/006/MOD14A1': ['FireMask'],
}
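
As a rough illustration of how each entry in a mapping like `INPUTS` corresponds to one band in the combined map, the sketch below flattens such a mapping into an ordered list of `(dataset, band)` pairs. The helper name `flatten_bands` and the shortened `EXAMPLE_INPUTS` mapping are hypothetical and exist only for this illustration; the sample's actual dataset-building code may be organized differently.

```python
from typing import Dict, List, Tuple

# Hypothetical, shortened copy of the inputs mapping, for illustration only.
EXAMPLE_INPUTS: Dict[str, List[str]] = {
    "USGS/SRTMGL1_003": ["elevation"],
    "ECMWF/ERA5/DAILY": ["mean_2m_air_temperature", "total_precipitation"],
    "IDAHO_EPSCOR/GRIDMET": ["pr", "erc"],
}


def flatten_bands(datasets: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return one (dataset, band) pair per band, in insertion order."""
    return [
        (dataset, band)
        for dataset, bands in datasets.items()
        for band in bands
    ]


band_pairs = flatten_bands(EXAMPLE_INPUTS)
for index, (dataset, band) in enumerate(band_pairs):
    print(f"band {index}: {band} (from {dataset})")
```

The position of each pair in the list could then serve as the band index in the combined map, assuming the bands are stacked in that same order.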

## 🗺 **Outputs**: Fire mask

Finally, we need to give the model a set of labels to apply to each section of the map. These labels tell the training program (PyTorch) what we want to infer from the input data. In other words, this dataset represents the "dependent variable" that our model attempts to predict. For our model, we will use the "Terra Thermal Anomalies & Fire Daily Global 1km (MODIS/006/MOD14A1)" dataset from Earth Engine, specifically its `FireMask` band.

In [14]:
LABELS = {
    'MODIS/006/MOD14A1': ['FireMask'],
}
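
`FireMask` stores a per-pixel class code rather than a yes/no value. In the MOD14A1 product, codes 7, 8, and 9 mark fire pixels with low, nominal, and high confidence. One possible way to turn the band into the "on fire or not on fire" label described earlier is sketched below with NumPy. The function name `fire_mask_to_binary` and the `min_confidence` threshold are assumptions made for illustration, and are not necessarily what this sample's training code does.

```python
import numpy as np

# MOD14A1 FireMask class codes for fire pixels (low, nominal, high confidence).
FIRE_LOW, FIRE_NOMINAL, FIRE_HIGH = 7, 8, 9


def fire_mask_to_binary(
    fire_mask: np.ndarray, min_confidence: int = FIRE_LOW
) -> np.ndarray:
    """Return 1 where the class code is a fire at or above min_confidence, else 0."""
    is_fire = (fire_mask >= min_confidence) & (fire_mask <= FIRE_HIGH)
    return is_fire.astype(np.uint8)


# Made-up 2x3 patch of class codes: 3 = water, 4 = cloud, 5 = non-fire land.
patch = np.array([[5, 7, 9], [3, 8, 4]])
labels = fire_mask_to_binary(patch)
strict_labels = fire_mask_to_binary(patch, min_confidence=FIRE_NOMINAL)
```

Raising `min_confidence` trades recall for precision: fewer pixels are labeled as fire, but those that are labeled carry a more confident detection.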

# BONEYARD