[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/forestdatapartnership/whisp/blob/main/notebooks/Colab_whisp_geojson_to_csv.ipynb)

# Whisp a geojson

Python Notebook pathway for [Whisp](https://openforis.org/solutions/whisp/) running in the cloud via [Google Colab](https://colab.google/).

**To open:** click the badge at the top.

**To run:** click the play buttons (or press Shift + Enter).

**Requirements:** a Google Earth Engine (GEE) account and a registered Google Cloud project.



- **Aim:** support compliance with zero-deforestation regulations
- **Input:** GeoJSON file of plot boundaries or points
- **Output:** CSV table and GeoJSON file containing statistics and risk indicators

### Setup Google Earth Engine

In [3]:
import ee

# Google Earth Engine project name: change to your own project.
# If unsure, see: https://developers.google.com/earth-engine/cloud/assets
gee_project_name = "ee-dnsalazar10"

# NB: opens a browser window to grant access
ee.Authenticate()

# initialize with chosen project
ee.Initialize(project=gee_project_name)

### Install and import packages

In [4]:
# Install openforis-whisp (if not already installed)
!pip install --pre openforis-whisp

Collecting openforis-whisp
  Downloading openforis_whisp-2.0.0a6-py3-none-any.whl.metadata (16 kB)
Collecting country_converter<2.0.0,>=0.7 (from openforis-whisp)
  Downloading country_converter-1.3.1-py3-none-any.whl.metadata (25 kB)
Collecting geojson<3.0.0,>=2.5.0 (from openforis-whisp)
  Downloading geojson-2.5.0-py2.py3-none-any.whl.metadata (15 kB)
Collecting pandera<1.0.0,>=0.22.1 (from pandera[io]<1.0.0,>=0.22.1->openforis-whisp)
  Downloading pandera-0.26.1-py3-none-any.whl.metadata (10 kB)
Collecting typing_inspect>=0.6.0 (from pandera<1.0.0,>=0.22.1->pandera[io]<1.0.0,>=0.22.1->openforis-whisp)
  Downloading typing_inspect-0.9.0-py3-none-any.whl.metadata (1.5 kB)
Collecting black (from pandera[io]<1.0.0,>=0.22.1->openforis-whisp)
  Downloading black-25.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl.metadata (81 kB)

In [5]:
import openforis_whisp as whisp

top-level pandera module will be **removed in a future version of pandera**.
If you're using pandera to validate pandas objects, we highly recommend updating
your import:

```
# old import
import pandera as pa

# new import
import pandera.pandas as pa
```

If you're using pandera to validate objects from other compatible libraries
like pyspark or polars, see the supported libraries section of the documentation
for more information on how to import pandera:

https://pandera.readthedocs.io/en/stable/supported_libraries.html





### Get a geojson

- Files are stored temporarily and can be viewed in the panel on the left (click the folder icon).
- Press refresh if updates are not showing.
- Alternatively, you can work with files in your Google Drive by mounting it: `from google.colab import drive; drive.mount('/content/drive')`

In [13]:
# Function to upload a geojson file.
# Download an example here: https://github.com/andyarnell/whisp/tree/package-test-new-structure/tests/fixtures
def import_geojson():
    from google.colab import files

    # files.upload() returns {filename: bytes}; take the first uploaded file
    fn, content = next(iter(files.upload().items()))
    path = f'/content/{fn}'
    with open(path, 'wb') as f:
        f.write(content)
    return path

In [14]:
GEOJSON_EXAMPLE_FILEPATH = import_geojson()
print(f"GEOJSON_EXAMPLE_FILEPATH: {GEOJSON_EXAMPLE_FILEPATH}")

Saving test1_poly1.geojson to test1_poly1.geojson
GEOJSON_EXAMPLE_FILEPATH: /content/test1_poly1.geojson
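
Before sending the file to Whisp, it can help to confirm that the upload is a GeoJSON FeatureCollection and to see which geometry types it contains. The following is a minimal sketch using only the standard library; the function names `summarize_feature_collection` and `summarize_geojson_file` are hypothetical helpers, not part of `openforis_whisp`.

```python
import json
from collections import Counter


def summarize_feature_collection(data):
    """Return (feature_count, geometry_type_counts) for a parsed GeoJSON dict."""
    if data.get("type") != "FeatureCollection":
        raise ValueError(f"Expected a FeatureCollection, got {data.get('type')!r}")
    features = data.get("features", [])
    # A feature may carry a null geometry; count those under "missing".
    counts = Counter(
        (feature.get("geometry") or {}).get("type", "missing") for feature in features
    )
    return len(features), dict(counts)


def summarize_geojson_file(path):
    """Load a GeoJSON file from disk and summarize it."""
    with open(path, encoding="utf-8") as f:
        return summarize_feature_collection(json.load(f))
```

In this notebook the check could be run as `summarize_geojson_file(GEOJSON_EXAMPLE_FILEPATH)`.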


### Whisp it

In [15]:
# Choose countries to process (currently three countries: 'co', 'ci', 'br')
iso2_codes_list = ['co', 'ci', 'br']  # Example ISO2 codes for including country specific data
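
If the country list comes from user input, a small check can catch typos before the codes reach Whisp. This is only a sketch: the supported set below is an assumption copied from the comment in the cell above, and `normalize_iso2_codes` is a hypothetical helper, not a Whisp function.

```python
# Assumption: the supported codes are the three named in the notebook comment.
SUPPORTED_ISO2_CODES = {"co", "ci", "br"}


def normalize_iso2_codes(codes, supported=SUPPORTED_ISO2_CODES):
    """Lower-case, strip and de-duplicate ISO2 codes; reject unknown ones."""
    cleaned = [code.strip().lower() for code in codes]
    unknown = sorted(set(cleaned) - set(supported))
    if unknown:
        raise ValueError(f"Unsupported country codes: {unknown}")
    # dict.fromkeys drops duplicates while preserving the original order
    return list(dict.fromkeys(cleaned))


print(normalize_iso2_codes([" CO", "br", "co"]))
```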

In [20]:
import pandas as pd
import geopandas as gpd
import ee
import openforis_whisp as whisp

# Read the geojson file into a GeoDataFrame
gdf = gpd.read_file(GEOJSON_EXAMPLE_FILEPATH)

# Convert any Timestamp columns to strings
for col in gdf.columns:
    if pd.api.types.is_datetime64_any_dtype(gdf[col]):
        gdf[col] = gdf[col].astype(str)

# Convert the GeoDataFrame to a GeoJSON-like dict (via __geo_interface__) and then to an Earth Engine FeatureCollection
feature_collection = ee.FeatureCollection(gdf.__geo_interface__)

df_stats = whisp.whisp_formatted_stats_ee_to_df(
    feature_collection=feature_collection,
    # external_id_column="user_id",# optional - specify which input column/property to map to the external ID.
    national_codes=iso2_codes_list,
    # unit_type='percent', # optional - to change unit type. Default is 'ha'.
    )

Whisp multiband image compiled
Creating schema for national_codes: ['co', 'ci', 'br']
external_id
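
The Timestamp-to-string step in the cell above is likely needed because feature properties are passed on as JSON, and date/time objects are not JSON serializable by default. The sketch below illustrates the same idea on a plain properties dict using the standard library. `stringify_datetimes` is a hypothetical helper, and it uses ISO 8601 formatting rather than the `astype(str)` formatting used in the notebook.

```python
import json
from datetime import date, datetime


def stringify_datetimes(properties):
    """Return a copy of a properties dict with date/datetime values as ISO 8601 strings."""
    return {
        key: value.isoformat() if isinstance(value, (date, datetime)) else value
        for key, value in properties.items()
    }


props = {"plot": "A1", "surveyed": datetime(2020, 12, 31, 12, 0)}

try:
    json.dumps(props)  # fails: datetime objects cannot be serialized directly
except TypeError as err:
    print(f"Not serializable: {err}")

print(json.dumps(stringify_datetimes(props)))
# → {"plot": "A1", "surveyed": "2020-12-31T12:00:00"}
```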


### Display results

In [21]:
df_stats

Unnamed: 0,plotId,external_id,Area,Geometry_type,Country,ProducerCountry,Admin_Level_1,Centroid_lon,Centroid_lat,Unit,...,nBR_MapBiomas_col9_palmoil_2020,nBR_MapBiomas_col9_pc_2020,nBR_INPE_TCamz_cer_annual_2020,nBR_MapBiomas_col9_soy_2020,nBR_MapBiomas_col9_annual_crops_2020,nBR_INPE_TCamz_pasture_2020,nBR_INPE_TCcer_pasture_2020,nBR_MapBiomas_col9_pasture_2020,nCI_Cocoa_bnetd,geo
0,1,,6.571,Polygon,COL,CO,Quindío,-75.777852,4.441885,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.7793..."
1,2,,10.26,Polygon,COL,CO,Quindío,-75.777832,4.441812,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.7796..."
2,3,,0.032,Polygon,COL,CO,Quindío,-75.776914,4.441445,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.7770..."
3,4,,2.656,Polygon,COL,CO,Quindío,-75.792858,4.432392,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.7950..."
4,5,,0.194,Polygon,COL,CO,Quindío,-75.792245,4.431444,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.7925..."
5,6,,42.437,Polygon,COL,CO,Quindío,-75.797883,4.433053,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.8049..."
6,7,,113.566002,Polygon,COL,CO,Quindío,-75.78335,4.433162,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.7903..."
7,8,,21.764999,Polygon,COL,CO,Quindío,-75.782425,4.43103,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.7858..."
8,9,,9.543,Polygon,COL,CO,Quindío,-75.778936,4.435003,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.7809..."
9,10,,4.822,Polygon,COL,CO,Quindío,-75.786327,4.427279,ha,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"{'type': 'Polygon', 'coordinates': [[[-75.7880..."


### Add risk category columns

In [22]:
# Adds risk columns to the end of the dataframe
df_w_risk = whisp.whisp_risk(df=df_stats, national_codes=iso2_codes_list)

Using unit type: ha


### Display updated table
- Scroll to the far right to see the added columns

In [23]:
df_w_risk

Unnamed: 0,plotId,external_id,Area,Geometry_type,Country,ProducerCountry,Admin_Level_1,Centroid_lon,Centroid_lat,Unit,...,Ind_05_primary_2020,Ind_06_nat_reg_forest_2020,Ind_07_planted_plantations_2020,Ind_08_planted_plantations_after_2020,Ind_09_treecover_after_2020,Ind_10_agri_after_2020,Ind_11_logging_concession_before_2020,risk_pcrop,risk_acrop,risk_timber
0,1,,6.571,Polygon,COL,CO,Quindío,-75.777852,4.441885,ha,...,no,yes,no,no,yes,yes,no,low,low,low
1,2,,10.26,Polygon,COL,CO,Quindío,-75.777832,4.441812,ha,...,no,yes,no,no,yes,yes,no,low,low,low
2,3,,0.032,Polygon,COL,CO,Quindío,-75.776914,4.441445,ha,...,no,yes,no,no,yes,yes,no,low,low,low
3,4,,2.656,Polygon,COL,CO,Quindío,-75.792858,4.432392,ha,...,no,yes,no,no,yes,yes,no,low,low,low
4,5,,0.194,Polygon,COL,CO,Quindío,-75.792245,4.431444,ha,...,no,yes,no,no,yes,yes,no,more_info_needed,more_info_needed,high
5,6,,42.437,Polygon,COL,CO,Quindío,-75.797883,4.433053,ha,...,yes,yes,no,no,yes,yes,no,low,low,low
6,7,,113.566002,Polygon,COL,CO,Quindío,-75.78335,4.433162,ha,...,no,yes,no,no,yes,yes,no,low,low,low
7,8,,21.764999,Polygon,COL,CO,Quindío,-75.782425,4.43103,ha,...,no,yes,no,no,yes,yes,no,low,low,low
8,9,,9.543,Polygon,COL,CO,Quindío,-75.778936,4.435003,ha,...,no,yes,no,no,yes,yes,no,low,low,low
9,10,,4.822,Polygon,COL,CO,Quindío,-75.786327,4.427279,ha,...,no,yes,no,no,yes,yes,no,low,low,low


### Export table with risk columns to CSV (temporary storage)

In [24]:
df_w_risk.to_csv("whisp_output_table_w_risk.csv",index=False)
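
Once the CSV exists, a quick tally of the risk categories gives an overview without scrolling through the table. The following is a minimal sketch using only the standard library. `count_risk_levels` is a hypothetical helper, and the sample rows are made up for illustration; only the column names `risk_pcrop` and `risk_timber` come from the table above.

```python
import csv
import io
from collections import Counter


def count_risk_levels(csv_file, column="risk_pcrop"):
    """Count how often each value occurs in one risk column of an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# Illustrative in-memory sample standing in for the exported file
sample = io.StringIO(
    "plotId,risk_pcrop,risk_timber\n"
    "1,low,low\n"
    "2,low,low\n"
    "5,more_info_needed,high\n"
)
print(count_risk_levels(sample, column="risk_timber"))
```

For the real output, open the exported file instead of the sample, for example `with open("whisp_output_table_w_risk.csv", newline="", encoding="utf-8") as f: print(count_risk_levels(f))`.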

### Export table with risk columns to geojson (temporary storage)

In [25]:
# Builds a geojson file containing the Whisp columns.
# Uses the geometry column "geo" to create the spatial features.
whisp.convert_df_to_geojson(df_w_risk, "whisp_output_table_w_risk.geojson")

GeoJSON saved to whisp_output_table_w_risk.geojson


### Download outputs to local storage
- Saves files in the "Downloads" folder on your machine.
- If you see a "Downloads blocked" button at the top of the browser, click it to allow file downloads.
- Alternatively, right-click the file in the folder panel on the left and choose 'Download'.

In [26]:
from google.colab import files
files.download('whisp_output_table_w_risk.csv')


In [27]:
files.download('whisp_output_table_w_risk.geojson') # spatial output
