<img width="50" src="https://carbonplan-assets.s3.amazonaws.com/monogram/dark-small.png" style="margin-left:0px;margin-top:20px"/>

# MTBS to Cloud Optimized GeoTIFF

_by Joe Hamman (CarbonPlan), June 5, 2020_

This notebook converts MTBS 30m yearly rasters to Cloud Optimized GeoTIFF and
stages them in a Google Cloud Storage bucket.

**Inputs:**

- `DATA.zip` from MTBS website

**Outputs:**

- One COG per year: `gs://carbonplan-data/raw/MTBS/30m/<YEAR>.tif`
- The accompanying `.htm` file for each year:
  `gs://carbonplan-data/raw/MTBS/30m/<YEAR>.htm`

**Notes:**

- No reprojection or processing of the data is done in this notebook.


In [None]:
import io
import os.path

import gcsfs
# note: this import shadows the builtin `zip` for the rest of the notebook
from fsspec.implementations import zip
from rasterio.io import MemoryFile
from rio_cogeo.cogeo import cog_translate
from rio_cogeo.profiles import cog_profiles

# run `gcloud auth login` on the command line, or try switching token to `browser`
fs = gcsfs.GCSFileSystem(
    project="carbonplan",
    token="/Users/jhamman/.config/gcloud/legacy_credentials/joe@carbonplan.org/adc.json",
)

The input for this script is a zip file called `DATA.zip`, downloaded from
https://www.mtbs.gov/direct-download using the following selections:

```
  - [select] Burn Severity Mosaics
    -> [select] Continental U.S.
      -> [click] all years
        -> [click] Download 34 Files
```

This file does not need to be unzipped: the rest of the script reads it, and
the per-year zip files nested inside it, directly.
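
The nested-archive access pattern can be sketched with the standard library
alone. This is an illustration, not the code this notebook runs (the notebook
uses `fsspec`'s `ZipFileSystem`). The helper names and the member names inside
the toy archive are hypothetical; the real layout of `DATA.zip` may differ.

```python
import io
import zipfile


def make_nested_zip():
    """Build a tiny stand-in for DATA.zip: an outer zip holding one inner zip."""
    inner_buf = io.BytesIO()
    with zipfile.ZipFile(inner_buf, "w") as inner:
        # Hypothetical member names, used only for this illustration
        inner.writestr("mtbs_CONUS_1984.tif", b"fake raster bytes")
        inner.writestr("mtbs_CONUS_1984_metadata.htm", b"<html></html>")

    outer_buf = io.BytesIO()
    with zipfile.ZipFile(outer_buf, "w") as outer:
        outer.writestr(
            "composite_data/MTBS_BSmosaics/1984/mtbs_CONUS_1984.zip",
            inner_buf.getvalue(),
        )
    outer_buf.seek(0)
    return outer_buf


def list_nested_members(outer_file):
    """Return {inner zip name: [member names]} without extracting to disk."""
    members = {}
    with zipfile.ZipFile(outer_file) as outer:
        for name in outer.namelist():
            if not name.endswith(".zip"):
                continue
            # Read the inner archive into memory and open it as its own zip
            inner_bytes = io.BytesIO(outer.read(name))
            with zipfile.ZipFile(inner_bytes) as inner:
                members[name] = inner.namelist()
    return members


nested = list_nested_members(make_nested_zip())
```

Each inner archive is held in memory as a whole, so peak memory use is roughly
the size of the largest inner zip.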


In [None]:
# raw zip file (expand `~` explicitly rather than relying on the filesystem layer)
raw_zips = os.path.expanduser("~/Downloads/DATA.zip")

# This is where we'll write the COGs when we're done
bucket = "carbonplan-data/raw/MTBS/30m/"

# This is the COG profile:
dst_profile = cog_profiles.get("deflate")

In [None]:
def translate(fo, out_file):
    """Translate a file object (`fo`) to a Cloud Optimized GeoTIFF.

    The resulting COG is written to `out_file` on the filesystem (`fs`)
    defined above, using the COG profile (`dst_profile`) defined above.
    """
    with MemoryFile() as mem_dst:
        # Important: pass `mem_dst.name` as the output dataset path
        cog_translate(fo, mem_dst.name, dst_profile, in_memory=True)
        print(f"writing to {out_file}")
        with fs.open(out_file, "wb") as f:
            f.write(mem_dst.read())

In [None]:
# Iterate through the outer zip file. Each entry under `composite_data` is a
# nested zip for one year, which is opened in memory.
# Write only files with `tif` or `htm` suffixes to the cloud bucket.
# Warning: this step takes a while to run. Go get some coffee.
root = zip.ZipFileSystem(raw_zips).get_mapper("composite_data")
for key in root:
    year = key.split("/")[1]
    sub = io.BytesIO(root[key])
    r2 = zip.ZipFileSystem(sub).get_mapper("")

    for fname in r2:
        if fname.endswith("tif"):
            fo = io.BytesIO(r2[fname])
            out_name = os.path.join(bucket, f"{year}.tif")
            translate(fo, out_name)
        elif fname.endswith("htm"):
            out_name = os.path.join(bucket, f"{year}.htm")
            with fs.open(out_name, "wb") as f:
                f.write(r2[fname])
        else:
            continue
        print(f"done with {out_name}")
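
The mapping from archive entries to bucket paths in the loop above can be
checked without touching the archive or the bucket. The sketch below pulls
that logic into a pure function. `output_path` is a hypothetical helper, and
the example `key` is an assumed layout, not one confirmed by the source.

```python
import posixpath


def output_path(bucket, key, fname):
    """Return the bucket path for one archive member, or None if it is skipped.

    `key` is the inner zip's path relative to `composite_data`. The year is
    taken from its second path component, mirroring the loop above.
    """
    year = key.split("/")[1]
    for suffix in ("tif", "htm"):
        if fname.endswith(suffix):
            return posixpath.join(bucket, f"{year}.{suffix}")
    return None


bucket = "carbonplan-data/raw/MTBS/30m/"
key = "MTBS_BSmosaics/1984/mtbs_CONUS_1984.zip"  # assumed key layout

tif_path = output_path(bucket, key, "mtbs_CONUS_1984.tif")
htm_path = output_path(bucket, key, "mtbs_CONUS_1984_metadata.htm")
skipped = output_path(bucket, key, "readme.txt")
```

Because the output name depends only on the year, an inner zip containing more
than one `tif` would overwrite the same object. Printing the planned paths
first is a cheap way to spot that before the long-running conversion.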