<a href="https://colab.research.google.com/github/agroimpacts/nmeo/blob/class%2Ff2023/planet_downloader_retiler.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Downloading and preparing PlanetScope data

This notebook provides instructions for working with selected samples to query the Planet API, download the necessary NICFI tiles by date, and then reprocess them into tiles defined by a different grid format.

## Requirements

### Files
You will need one input file:

- malawi_tiles_full.geojson

You will select a subset of tiles from this file, collect the NICFI tiles that cover them, and then retile those. You will first have to select an area of 5 x 5 tiles.


### Code
You will first have to clone our `maputil` repository from GitHub, which can be done by running the code below (first we mount Drive).

In [None]:
from google.colab import drive
root = '/content/gdrive'
drive.mount(root)

#### Clone and/or update maputil

In [2]:
import os
from datetime import datetime as dt
repo_path = f"{root}/MyDrive/repos"
clone_path = 'https://github.com/agroimpacts/maputil.git'
if not os.path.exists(repo_path):
    print(f"Making {repo_path}")
    os.makedirs(repo_path, exist_ok=True)

if not os.path.exists(f"{repo_path}/maputil"):
    !git -C "{repo_path}" clone "{clone_path}"
else:
    !git -C "{repo_path}/maputil" pull

# os.chdir(f"{repo_path}/maputil")

Already up to date.


### Install and import other necessary packages

Having cloned that repo, you will want to include the Python modules it contains in the import statements.

In [3]:
%%capture
%pip install affine
%pip install leafmap
%pip install localtileserver
%pip install boto3
%pip install urllib3==2.0.3
# import rioxarray

In [3]:
os.chdir(f"{repo_path}/maputil")
import sys
import importlib
from pathlib import Path
import affine
import pandas as pd
# import rioxarray as rxr
import leafmap.leafmap as leafmap
import localtileserver
import geopandas as gpd
import inspect
# from botocore.utils import PROTOCOL_TLS

libpath = f"{repo_path}"
# sys.path.insert(0, libpath)
# import maputil
# importlib.reload(maputil)
# importlib.reload(maputil.utils)
from maputil import *


## Set up the parameters to run the downloader

First, at the prompt below, enter the path on Google Drive to a text file containing the Planet API key for your account. You can find that key by logging into your [Planet account](https://account.planet.com/) and looking under the My Settings tab. Copy the key and save it into a simple text file (e.g. mykey.txt) that has only one line containing the key. This is safer than copying it into your notebook, where it might eventually become public.

In [None]:
#@title #### Enter path to Planet key
x = input("Enter the path on Drive where your key is stored, "
          "e.g. keys/mykey.txt: ")
key_path = f"{root}/MyDrive/{x}"

# read the key, closing the file afterwards and stripping stray whitespace
with open(key_path) as f:
    PLANET_API_KEY = f.read().strip()
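
If the key file might contain blank lines or trailing spaces, a slightly more defensive reader can help. The sketch below is only an illustration: `read_api_key` is a hypothetical helper, not part of `maputil`, and the demo writes a throwaway file with a fake key.

```python
import tempfile
from pathlib import Path


def read_api_key(path):
    """Return the first non-empty line of a key file (hypothetical helper)."""
    for line in Path(path).read_text().splitlines():
        if line.strip():
            return line.strip()
    raise ValueError(f"No key found in {path}")


# demo with a temporary file holding a fake key
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mykey.txt"
    demo_path.write_text("\n  not-a-real-key  \n")
    demo_key = read_api_key(demo_path)

print(len(demo_key))  # print the length rather than the key itself
```

Raising an error on an empty file makes a misplaced key fail here rather than later, when the API rejects the request.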

The following parameters should be set for running the code.

- API_URL: Provided
- proj_path: The path where you want to place all files
- quad_dir: The place to download Planet basemap quads
- tile_dir: The place where tiles clipped out of basemap quads will go
- temp_dir: The place for temporary files made while clipping out the tiles
- log_dir: The place for log files
- dst_width: The output width in pixels of the final PlanetScope tile (2358)
- dst_height: The output height in pixels of the final PlanetScope tile (2358)
- nbands: The number of output bands in the imagery (4)
- dst_crs: The output coordinate reference system ('EPSG:4326')
- tile_path: The path to the tile file provided for this exercise (malawi_tiles_full.geojson), which you can get [here](https://drive.google.com/file/d/1yTFp7IjvCVvPr9mBRD8L5PC20oGc_g5I/view?usp=sharing).

Change the paths below to match those on your system.

In [28]:
API_URL = 'https://api.planet.com/basemaps/v1/mosaics'

# change proj_path to the one you want
proj_path = f"{root}/MyDrive/data/nmeo"  # main output path
quad_dir = f"{proj_path}/quads"  # for downloaded NICFI quads
tile_dir = f"{proj_path}/tiles"  # for output tiles
temp_dir = f"{proj_path}/temp"   # temporary directory for intermediate outputs
log_dir = f"{proj_path}/logs"  # directory for log files

# output parameters for tiles
dst_width = 2358
dst_height = 2358
nbands = 4
dst_crs = 'EPSG:4326'

tile_path = f"{proj_path}/inputs/malawi_tiles_full.geojson"
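
As a rough check on what these output dimensions imply, the arithmetic below estimates the pixel size of a retiled image. It assumes each tile spans 0.05 degrees on a side, which is read off the tile coordinates printed in this notebook (e.g. latitudes -9.36 to -9.41), and it uses an approximate equatorial conversion of 111,320 m per degree. Both values are assumptions for illustration only.

```python
tile_size_deg = 0.05  # assumed tile edge length in degrees
dst_width = 2358      # output width in pixels, as set above
dst_height = 2358     # output height in pixels, as set above

# pixel size in degrees along each axis
res_x = tile_size_deg / dst_width
res_y = tile_size_deg / dst_height

# approximate ground distance per degree at the equator (assumption)
metres_per_degree = 111_320
res_m = res_x * metres_per_degree

print(f"{res_x:.3e} degrees per pixel, roughly {res_m:.2f} m at the equator")
```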

### Read in tile file and select sub-area

Read in the Malawi tiles. The first three columns (`tile`, `tile_col`, `tile_row`) are converted to integers and then to strings, so that tile IDs can be matched as text later on (e.g. `"927486"`).

In [7]:
tiles = gpd.read_file(tile_path)
tiles[tiles.columns[0:3]] = tiles[tiles.columns[0:3]].astype(int).astype(str)
tiles.head()

Unnamed: 0,tile,tile_col,tile_row,geometry
0,841570,1009,938,"POLYGON ((32.90900 -9.36000, 32.90900 -9.41000..."
1,841571,1010,938,"POLYGON ((32.95900 -9.36000, 32.95900 -9.41000..."
2,841572,1011,938,"POLYGON ((33.00900 -9.36000, 33.00900 -9.41000..."
3,841590,1009,939,"POLYGON ((32.90900 -9.41000, 32.90900 -9.46000..."
4,841591,1010,939,"POLYGON ((32.95900 -9.41000, 32.95900 -9.46000..."


View the tiles on the map, and then use the rectangle tool to select a small number of tiles (e.g. 10) anywhere in Malawi.  

In [None]:
m = leafmap.Map()
m.add_basemap()
m.add_basemap("SATELLITE")
m.add_gdf(tiles, zoom_to_layer=True)
m

Save the polygon you draw to a geojson.

In [None]:
m.save_draw_features(f"{proj_path}/inputs/aoi.geojson")

### Select the tiles using the AOI

In [22]:
aoi = gpd.read_file(f"{proj_path}/inputs/aoi.geojson")
tiles_aoi = tiles.sjoin(aoi)\
    .drop(columns="index_right")
tiles_aoi.shape

(20, 4)

## Get the NICFI grid catalog for the tiles

### Query the catalog

Find the Planet basemap quads that intersect the selected tiles.
We first create a function that builds the correct date string for the imagery, because the mosaics cover periods of different lengths (six months, three months, or a single month, depending on the date). The date period has to be exact when querying the API to get the images back from Planet.

In [9]:
def year_date(year, month):
    """Return the mosaic date string covering a given year and month.

    Both arguments are strings: a four-digit year and a zero-padded month.
    """
    if year < "2021" and month < "06":
        # December-May mosaic ending in this year
        date = f"{int(year) - 1}-12_{year}-05"
    elif year < "2020" and month == "12":
        # December-May mosaic starting in this year
        date = f"{year}-12_{int(year) + 1}-05"
    elif year < "2020" and "06" <= month < "12":
        # June-November mosaic
        date = f"{year}-06_{year}-11"
    elif year == "2020" and "06" <= month < "09":
        # June-August 2020 mosaic
        date = f"{year}-06_{year}-08"
    else:
        # single-month mosaics (includes December 2020)
        date = f"{year}-{month}"
    return date
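
To see how the branches behave, it can help to run the function on a few sample dates. The sketch below restates `year_date` compactly so that it runs on its own, then prints the date string for one month from each period. The sample dates are arbitrary choices for illustration.

```python
def year_date(year, month):
    # compact restatement of the function above, so this example is runnable alone
    if year < "2021" and month < "06":
        return f"{int(year) - 1}-12_{year}-05"
    if year < "2020" and month == "12":
        return f"{year}-12_{int(year) + 1}-05"
    if year < "2020" and "06" <= month < "12":
        return f"{year}-06_{year}-11"
    if year == "2020" and "06" <= month < "09":
        return f"{year}-06_{year}-08"
    return f"{year}-{month}"


# one sample month from each kind of period
samples = [("2019", "03"), ("2019", "08"), ("2019", "12"),
           ("2020", "07"), ("2020", "12"), ("2021", "06")]
results = {f"{y}-{m}": year_date(y, m) for y, m in samples}
for key, value in results.items():
    print(key, "->", value)
```

Note that the comparisons are between strings, so months must be zero-padded ("06", not "6") for the ordering to work.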

Running the query returns a GeoDataFrame containing the quad name, date, geometry, and file name, along with the URL used for downloading quads.

But first specify the year and month(s) of interest. In this example, we are using June 2021.

In [10]:
date = year_date("2021", "06")

Then fetch the intersecting basemap quad grid.

In [27]:
pdl = PlanetDownloader()
quads_gdf, quads_url = pdl.get_basemap_grid(
    PLANET_API_KEY, API_URL, dates=[date],
    bbox=tiles_aoi.dissolve().iloc[0]["geometry"].bounds
)

None does not exist. Creating the catalog...


In [12]:
quads_gdf.head()

Unnamed: 0,tile,date,geometry,file
0,1221-931,2021-06,"POLYGON ((34.80469 -16.13026, 34.80469 -15.961...",planet_medres_normalized_analytic_2021-06_mosa...
1,1222-931,2021-06,"POLYGON ((34.98047 -16.13026, 34.98047 -15.961...",planet_medres_normalized_analytic_2021-06_mosa...
2,1221-930,2021-06,"POLYGON ((34.80469 -16.29905, 34.80469 -16.130...",planet_medres_normalized_analytic_2021-06_mosa...
3,1222-930,2021-06,"POLYGON ((34.98047 -16.29905, 34.98047 -16.130...",planet_medres_normalized_analytic_2021-06_mosa...
4,1221-929,2021-06,"POLYGON ((34.80469 -16.46769, 34.80469 -16.299...",planet_medres_normalized_analytic_2021-06_mosa...


After querying the catalog, combine the results with the tile grid into a single GeoDataFrame that lists the quads you need to cover each tile, which you can then save to a geojson.

Now let's join the two catalogs, reorganize a bit and save to disk.

In [26]:
tiles_quads = tiles_aoi.sjoin(quads_gdf.rename(columns={"tile": "quad"}))\
    .drop(columns="index_right")\
    .iloc[:,[0,4,5,6,3]]

In [29]:
tiles_quads.to_file(
    Path(proj_path) / "tiles_quads.geojson", driver="GeoJSON"
)

## Download quads and retile

Once you have the catalog, you can start to download quads and retile them. However, we are first going to test downloading just one quad. We will first set up some directories.

In [30]:
# set up tile, quad, and temporary directories
for d in [tile_dir, quad_dir, temp_dir]:
    os.makedirs(d, exist_ok=True)

Next, we will query the catalog we have created to see which tiles intersect just a single quad (a tile can intersect up to 4).

In [37]:
tiles_quads[["tile", "quad"]]\
    .groupby("tile")\
    .count()\
    .reset_index()\
    .query("quad==1")

Unnamed: 0,tile,quad
5,927486,1
7,927488,1
8,927489,1
9,927490,1
10,927506,1
12,927508,1
13,927509,1
14,927510,1


Let's use the first one, 927486. We need to select that row from the DataFrame, build a download URL, and then download the quad.

In [53]:
row = tiles_quads[tiles_quads.tile=="927486"]  # quad to download
download_url = f"{quads_url}/<id>/full?api_key={PLANET_API_KEY}"

link = get_quad_download_url(download_url, row.quad.iloc[0])
filename = f"{quad_dir}/{row.file.iloc[0]}.tif"
download_tiles_helper(link, filename)

Downloaded: /content/gdrive/MyDrive/data/nmeo/quads/planet_medres_normalized_analytic_2021-06_mosaic_1221-930.tif
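
The `<id>` in `download_url` is a placeholder for the quad ID. One plausible reading is that it is simply swapped for the quad name before the request is made. The sketch below illustrates that idea only: `fill_quad_id` is a hypothetical stand-in, not the actual `get_quad_download_url` from `maputil`, and the template URL (including `MOSAIC_ID` and `YOUR_KEY`) is an assumed shape, not a working link.

```python
def fill_quad_id(url_template, quad_id):
    """Substitute a quad ID for the "<id>" placeholder (hypothetical helper)."""
    if "<id>" not in url_template:
        raise ValueError("URL template has no <id> placeholder")
    return url_template.replace("<id>", quad_id)


# assumed template shape; MOSAIC_ID and YOUR_KEY are dummy values
template = (
    "https://api.planet.com/basemaps/v1/mosaics/MOSAIC_ID/quads/"
    "<id>/full?api_key=YOUR_KEY"
)
link = fill_quad_id(template, "1221-930")
print(link)
```

Because the key ends up inside the URL, avoid printing or saving real download links in a notebook you plan to share.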


Check your quads folder; you should see the file there. We will continue on with this on Monday!
