<img width="100" src="https://carbonplan-assets.s3.amazonaws.com/monogram/dark-small.png" style="margin-left:0px;margin-top:20px"/>

Forest Emissions Tracking - Phase I
===================================

_by Joe Hamman and Jeremy Freeman (CarbonPlan)_

March 29, 2020

## Introduction
In general, greenhouse gases (GHGs) arising from forest land use changes can be attributed to both natural factors (e.g. wildfire) and human activities (e.g. deforestation). Our approach builds on an existing body of research that provides high-resolution satellite-based estimates of aboveground biomass (Spawn et al., 2020), forest cover change (Hansen et al., 2013), and change attribution (Curtis et al., 2018). While many of the necessary data products already exist, we can integrate, extend, or update them to provide global, current estimates that can be combined with the other resources produced by the coalition.

Specifically, for any given spatial extent and time duration ($t1$ to $t2$), we can use three quantities — existing biomass, forest cover change, and change attribution — to estimate the effective GHG emissions from land use changes. The simplest estimate is:

$\Delta Biomass (t) = TotalBiomass (t) * \Delta ForestCover (\%)$

$Emissions (tCO_2) = \Delta Biomass (t) * 0.5 (tC/t) * 3.67 (tCO_2 / tC)$

where $\Delta ForestCover$ is the fraction of pixels within the given spatial extent that experienced a stand-replacement disturbance between $t1$ and $t2$, and $TotalBiomass$ is the aboveground biomass at time $t1$. The factor 0.5 converts biomass to carbon, and 3.67 converts carbon to $CO_2$ (the ratio of their molecular weights, 44/12). This estimate can be further refined by attributing, for each pixel, the source of forest cover loss (e.g. wildfire, deforestation, etc.), and using those sources to express emissions fractionally and/or to exclude certain categories from total estimates (e.g. rotational clear-cutting within tree plantations). Pixel-wise estimates can then be aggregated into province- and country-wide estimates.
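
As a rough illustration of the estimate above, the sketch below applies the two equations to a handful of made-up pixels and then excludes one attribution category from the total. The pixel values, the cause labels, the `EXCLUDED_CAUSES` set, and the function name `estimate_emissions` are all hypothetical and are not part of the workflow that follows.

```python
# Conversion factors taken from the equations above.
TC_PER_TBM = 0.5     # tonnes of carbon per tonne of biomass
TCO2_PER_TC = 3.67   # tonnes of CO2 per tonne of carbon


def estimate_emissions(total_biomass_t, forest_cover_loss_frac):
    """Return estimated emissions (tCO2) from biomass (t) and a loss fraction."""
    d_biomass = total_biomass_t * forest_cover_loss_frac
    return d_biomass * TC_PER_TBM * TCO2_PER_TC


# Hypothetical pixels: (biomass at t1 in tonnes, loss fraction, attributed cause).
pixels = [
    (12.0, 1.0, "wildfire"),
    (8.0, 0.0, "none"),
    (20.0, 1.0, "plantation_harvest"),
    (10.0, 0.5, "deforestation"),
]

# Hypothetical set of causes to leave out of the refined total.
EXCLUDED_CAUSES = {"plantation_harvest"}

# Aggregate pixel-wise estimates, with and without the excluded category.
total = sum(estimate_emissions(b, f) for b, f, _ in pixels)
total_excluding = sum(
    estimate_emissions(b, f) for b, f, cause in pixels if cause not in EXCLUDED_CAUSES
)

print(f"all causes: {total:.3f} tCO2")
print(f"excluding plantation harvest: {total_excluding:.3f} tCO2")
```

The same aggregation could be grouped by province or country instead of summed over all pixels.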

## Setup

To begin, we'll import a handful of Python libraries and set a few constants.

In [None]:
%matplotlib inline
import dask
import intake
import xarray as xr
from tqdm import tqdm
import numcodecs


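TC02_PER_TC = 3.67  # tonnes of CO2 per tonne of carbon (44 / 12)
TC_PER_TBM = 0.5  # tonnes of carbon per tonne of biomass
SQM_PER_HA = 10000  # square meters per hectare
ORNL_SCALING = 0.1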

In [None]:
from dask_gateway import Gateway

gateway = Gateway()
options = gateway.cluster_options()
options.worker_cores = 2
options.worker_memory = 24
cluster = gateway.new_cluster(cluster_options=options)
cluster.adapt(minimum=1, maximum=300)
cluster


In [None]:
client = cluster.get_client()
client

In [None]:
# data catalog
cat = intake.open_catalog("https://raw.githubusercontent.com/carbonplan/forest-emissions-tracking/master/catalog.yaml")

In [None]:
import fsspec

with fsspec.open('https://storage.googleapis.com/earthenginepartners-hansen/GFC-2018-v1.6/treecover2000.txt') as f:
    lines = f.read().decode().splitlines() 
print(len(lines))

# all tiles
# lats = []
# lons = []
# for line in lines:
#     pieces = line.split('_')
#     lats.append(pieces[-2])
#     lons.append(pieces[-1].split('.')[0])
    
# conus tiles (longitudes are zero-padded to three digits in the tile names)
conus_lats = ['60N', '50N', '40N', '30N']
conus_lons = ['140W', '130W', '120W', '110W', '100W', '090W', '080W', '070W', '060W']

# keep only the tiles that fall within the conus bounding box
lats = []
lons = []
for line in lines:
    pieces = line.split('_')
    lat = pieces[-2]
    lon = pieces[-1].split('.')[0]
    
    if (lat in conus_lats) and (lon in conus_lons):
        lats.append(lat)
        lons.append(lon)
print(len(lats))
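
The parsing step above relies on the tile label being the last two underscore-separated fields of each line in the listing file. The sketch below shows that on two illustrative lines. The `example.com` URLs and the helper name `parse_tile` are hypothetical, and the file name pattern (two-digit latitude, three-digit zero-padded longitude) is an assumption about the listing format.

```python
def parse_tile(line):
    """Split one listing line into its (lat, lon) tile labels."""
    pieces = line.strip().split("_")
    lat = pieces[-2]                 # e.g. '40N'
    lon = pieces[-1].split(".")[0]   # drop the file extension, e.g. '080W'
    return lat, lon


# Illustrative lines, assumed to mirror the format of the listing file.
sample_lines = [
    "https://example.com/GFC-2018-v1.6/Hansen_GFC-2018-v1.6_treecover2000_40N_080W.tif",
    "https://example.com/GFC-2018-v1.6/Hansen_GFC-2018-v1.6_treecover2000_00N_010E.tif",
]

parsed = [parse_tile(line) for line in sample_lines]
print(parsed)
```

Because the labels are compared as strings, any zero padding in the file names has to be matched exactly by the entries in the filter lists.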

In [None]:
def _preprocess(da, lat=None, lon=None):
    da = da.rename({"x": "lon", "y": "lat"}).squeeze(drop=True)
    if lat is not None:
        da = da.assign_coords(lat=lat, lon=lon)
    return da


def open_hansen_2018_tile(lat, lon, emissions=False):
    ds = xr.Dataset()

    # main Hansen data layers
    variables = ["treecover2000", "gain", "lossyear", "datamask"] #, "first", "last"]
    for v in variables:
        da = cat.hansen_2018(variable=v, lat=lat, lon=lon).to_dask().pipe(_preprocess)
        # force coords to be identical
        if ds:
            da = da.assign_coords(lat=ds.lat, lon=ds.lon)
        ds[v] = da

    ds["treecover2000"] /= 100.0
    ds["lossyear"] += 2000

    # Hansen biomass
    ds["agb"] = (
        cat.hansen_biomass(lat=lat, lon=lon).to_dask().pipe(_preprocess, lat=ds.lat, lon=ds.lon)
    )
    if emissions:
        # Hansen emissions
        ds["emissions_ha"] = (
            cat.hansen_emissions_ha(lat=lat, lon=lon)
            .to_dask()
            .pipe(_preprocess, lat=ds.lat, lon=ds.lon)
        )
        ds["emissions_px"] = (
            cat.hansen_emissions_px(lat=lat, lon=lon)
            .to_dask()
            .pipe(_preprocess, lat=ds.lat, lon=ds.lon)
        )

    return ds

In [None]:
# open a single 10x10 degree tile of the Hansen 30x30 m data
lat = lats[1]
lon = lons[1]
box = dict(lat=slice(0, 40000, 100), lon=slice(0, 40000, 100))

ds = open_hansen_2018_tile(lat, lon)
display(ds)

In [None]:
encoding = {'emissions': {'compressor': numcodecs.Blosc()}}

In [None]:
def calc_emissions(ds):
    d_biomass = ds['agb'] * ds['d_treecover']
    emissions = d_biomass * TC_PER_TBM * TC02_PER_TC
    return emissions


def calc_one_tile(ds):
    # calculate d_treecover
    years = xr.DataArray(range(2001, 2019), dims=('year', ), name='year')
    loss_frac = []
    for year in years:
        loss_frac.append(xr.where((ds['lossyear'] == year), ds['treecover2000'], 0))
    ds['d_treecover'] = xr.concat(loss_frac, dim=years)
    ds['emissions'] = calc_emissions(ds)
    return ds


@dask.delayed
def process_one_tile(lat, lon):
    url = f'gs://carbonplan-scratch/global-forest-emissions/{lat}_{lon}.zarr'
    mapper = fsspec.get_mapper(url)

    # check the store itself (not the url string) for consolidated metadata
    if '.zmetadata' in mapper:
        # skip - dataset is already complete
        return url

    with dask.config.set(scheduler='threads'):
        ds = open_hansen_2018_tile(lat, lon)
        ds = calc_one_tile(ds)[['emissions']]
        ds = ds.chunk({'lat': 4000, 'lon': 4000, 'year': 2})
        # consolidated=True writes .zmetadata, which marks the store as complete
        ds.to_zarr(mapper, encoding=encoding, mode='w', consolidated=True)
        return url
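
To make the per-year logic in `calc_one_tile` easier to follow, here is a minimal pure-Python sketch of the same idea on three made-up pixels: for each year, a pixel contributes its year-2000 tree cover fraction only if its loss year matches, and that fraction is then scaled by biomass and the two conversion factors. The function name `yearly_emissions` and all input values are hypothetical, and plain lists stand in for the xarray objects used above.

```python
TC_PER_TBM = 0.5     # tonnes of carbon per tonne of biomass
TCO2_PER_TC = 3.67   # tonnes of CO2 per tonne of carbon


def yearly_emissions(lossyear, treecover, agb, years):
    """Return {year: [emissions per pixel]} for flat lists of pixel values."""
    out = {}
    for year in years:
        out[year] = [
            # tree cover counts as lost only in the pixel's recorded loss year
            (tc if ly == year else 0.0) * biomass * TC_PER_TBM * TCO2_PER_TC
            for ly, tc, biomass in zip(lossyear, treecover, agb)
        ]
    return out


# Hypothetical three-pixel tile.
lossyear = [2001, 2000, 2003]   # 2000 means no loss was recorded
treecover = [0.8, 0.9, 0.5]     # tree cover fraction in the year 2000
agb = [100.0, 120.0, 60.0]      # aboveground biomass per pixel

result = yearly_emissions(lossyear, treecover, agb, range(2001, 2004))
for year, values in result.items():
    print(year, [round(v, 2) for v in values])
```

Each pixel is nonzero in at most one year, so summing over the year dimension recovers the total emissions attributed to that pixel.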

In [None]:
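from dask.distributed import wait

tiles = []
for lat, lon in tqdm(list(zip(lats, lons))):
    tiles.append(client.persist(process_one_tile(lat, lon), retries=1))

# block until every tile has finished before shutting down the cluster
wait(tiles)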

In [None]:
client.close()
cluster.close()