# Analysis Ready Data Tutorial Part 2: Use Case 1

Time-series analysis (e.g. change detection and trend detection) is a powerful application of satellite imagery. However, a great deal of processing is required to prepare imagery for analysis. Analysis Ready Data (ARD), preprocessed time-series stacks of overhead imagery, allows for time-series analysis without any additional processing of the imagery. See [Analysis Ready Data Defined](https://medium.com/planet-stories/analysis-ready-data-defined-5694f6f48815) for an excellent introduction to and discussion of ARD.

In [Part 1](ard_1_intro_and_best_practices.ipynb) of this tutorial, we introduced ARD and covered the hows and whys of using the Data and Orders APIs to create and interpret ARD.

This second part of the tutorial focuses on the first of two use cases. The use case addressed in this tutorial is:

* As a software engineer at an ag-tech company, I'd like to be able to order Planet imagery programmatically in a way that enables the data scientist at my organization to create time-series algorithms (e.g. monitoring NDVI curves over time) without further data cleaning and processing.

Please see the first part of the tutorial for an introduction to the Data and Orders APIs along with best practices. A lot of functionality developed in that tutorial will be copied here in a compact form.

## Introduction

Two things are interesting about this use case. First, we are calculating NDVI, and second, we are compositing scenes together. What is NDVI, what is compositing, and why do we want to do either?

Great questions!

First, NDVI stands for normalized difference vegetation index. It is used a **LOT** to find out if vegetation is growing. You can find out more about NDVI at [USGS](https://www.usgs.gov/land-resources/eros/phenology/science/ndvi-foundation-remote-sensing-phenology?qt-science_center_objects=0#qt-science_center_objects) and [Wikipedia](https://en.wikipedia.org/wiki/Normalized_difference_vegetation_index). What we care about here is that NDVI uses the red and near-infrared bands of an image and returns one band with values that range from -1 to 1. So, we expect a single-band image for each order.
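To make the formula concrete, here is a minimal NumPy sketch of the NDVI calculation, NDVI = (NIR - red) / (NIR + red). The `ndvi` helper and the pixel values are made up for illustration; in this tutorial the calculation itself is done for us by the Orders API bandmath tool.

```python
import numpy as np


def ndvi(red, nir):
    """Return (nir - red) / (nir + red), with NaN where the denominator is 0."""
    red = np.asarray(red, dtype="float32")
    nir = np.asarray(nir, dtype="float32")
    denom = nir + red
    # start from NaN so pixels with a zero denominator stay marked as no-data
    out = np.full(denom.shape, np.nan, dtype="float32")
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


# hypothetical reflectance values for three pixels
red = np.array([1000, 3000, 0])
nir = np.array([3000, 1000, 0])
result = ndvi(red, nir)
print(result)
```

The first pixel (strong near-infrared, weak red) comes out positive, the second comes out negative, and the third has no signal at all, so it is left as NaN rather than dividing by zero.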

Compositing is a way to stitch multiple scenes together for maximum coverage. We want this because for a time series, we just want one image for each date and we want that one image to have the most coverage to minimize holes in our data. The composite tool takes in multiple scenes and returns one image. If we feed it scenes from a whole timestack, we still just get one image back! So, to avoid that disaster, we group our scenes by date and only composite the scenes that were collected on the same date.
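As a rough illustration of why compositing increases coverage, the sketch below merges two partially covered single-band arrays by keeping the first valid pixel found. The `first_valid_composite` helper and the use of NaN to mark missing pixels are assumptions made for this example; the Composite tool's actual behavior (for example, how it orders overlapping scenes) may differ.

```python
import numpy as np


def first_valid_composite(scenes):
    """Fill each pixel with the first non-NaN value found across scenes."""
    composite = np.full(scenes[0].shape, np.nan, dtype="float32")
    for scene in scenes:
        holes = np.isnan(composite)      # pixels not yet filled
        composite[holes] = scene[holes]  # fill them from this scene
    return composite


# two hypothetical same-day scenes that each cover part of the AOI
scene_a = np.array([[0.2, np.nan], [np.nan, np.nan]], dtype="float32")
scene_b = np.array([[0.9, 0.4], [np.nan, 0.6]], dtype="float32")

composite = first_valid_composite([scene_a, scene_b])
coverage = np.count_nonzero(~np.isnan(composite)) / composite.size
print(composite)
print(coverage)
```

Each scene alone covers at most three of the four pixels, and neither covers the lower-left one, so the composite still has a hole there. If the two scenes came from different dates, the merged image would mix observations from both dates, which is why we only composite scenes collected on the same date.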


## Implementation

The use case we will cover is: *As a software engineer at an ag-tech company, I'd like to be able to order Planet imagery programmatically in a way that enables the data scientist at my organization to create time-series algorithms (e.g. monitoring NDVI curves over time) without further data cleaning and processing.*

For this use case, the area of interest and time range are not specified. The requirement of no further processing means we should apply a strict usable pixel filter. For time-series analysis, the daily coverage of the PlanetScope (PS) satellites is ideal. We would like a single image per day that covers the entire area of interest (AOI), but it may take multiple scenes to cover the AOI. Therefore, we will use the Composite tool to make one composite for each day in the time series. This is a little tricky because the Composite tool composites all of the scenes in an order. So we need to group the scene ids returned by the Data API by day, then submit a separate order for each day.

To summarize, these are the steps:
1. [Initialize API client](#Step-1:-Initialize-API-client)
1. [Search Data API](#Step-2:-Search-Data-API)
1. [Group IDs by Date](#Step-3:-Group-IDs-by-Date)
1. [Submit Orders](#Step-4:-Submit-Orders)
1. [Download Orders](#Step-5:-Download-Orders)
1. [Unzip and Verify Orders](#Step-6:-Unzip-and-Verify-Orders)

Note that, because visualizing the NDVI images and UDM2s is processing-intensive, we will cover visualization in the next notebook, [Analysis Ready Data Tutorial Part 2: Use Case 1 - Visualization](ard_2_use_case_1_visualize_images.ipynb).

#### Import Dependencies

In [None]:
import asyncio
from copy import copy
from datetime import datetime
from itertools import chain
import json
import os
from pathlib import Path
from pprint import pprint
import shutil
import time
from zipfile import ZipFile

import numpy as np
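from planet import Auth
from planet import Session, DataClient, OrdersClient
# data_filter and order_request are used below to build the search filter and the order requests
from planet import data_filter, order_request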

#### Step 1: Initialize API client

In [None]:
# if your Planet API Key is not set as an environment variable, you can paste it below
API_KEY = os.environ.get('PL_API_KEY', 'PASTE_YOUR_KEY_HERE')

client = Auth.from_key(API_KEY)

#### Step 2: Search Data API

The goal of this step is to get the scene ids that meet the search criteria for this use case.

In [None]:
# define test data for the filter

# iowa crops aoi
test_aoi_geom = {
    "type": "Polygon",
    "coordinates": [
        [
            [-93.299129, 42.699599],
            [-93.299674, 42.812757],
            [-93.288436, 42.861921],
            [-93.265332, 42.924817],
            [-92.993873, 42.925124],
            [-92.993888, 42.773637],
            [-92.998396, 42.754529],
            [-93.019154, 42.699988],
            [-93.299129, 42.699599]
        ]
    ]
}

Let's search for:
* the geometry above
* PSScene images
* Date Range: April 1st - May 1st 2019
* Clear Percent: 90% or above
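For reference, the same criteria can be written out as a plain filter dictionary. This is a sketch of the Data API filter JSON as we understand it, built with a hypothetical `build_search_filter` helper and a small placeholder AOI; in the tutorial itself we build the filter with the SDK's `data_filter` helpers instead.

```python
import json
from datetime import datetime


def build_search_filter(aoi, start, end, min_clear_percent):
    """Combine geometry, date, and clear-percent criteria with an AndFilter."""
    def iso(dt):
        # format timestamps as UTC strings, e.g. 2019-04-01T00:00:00Z
        return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

    return {
        "type": "AndFilter",
        "config": [
            {"type": "GeometryFilter", "field_name": "geometry", "config": aoi},
            {"type": "DateRangeFilter", "field_name": "acquired",
             "config": {"gte": iso(start), "lte": iso(end)}},
            {"type": "RangeFilter", "field_name": "clear_percent",
             "config": {"gte": min_clear_percent}},
        ],
    }


# small placeholder AOI, not the Iowa polygon used in this tutorial
example_aoi = {
    "type": "Polygon",
    "coordinates": [[[-93.3, 42.7], [-93.3, 42.9], [-93.0, 42.9],
                     [-93.0, 42.7], [-93.3, 42.7]]],
}
example_filter = build_search_filter(
    example_aoi, datetime(2019, 4, 1), datetime(2019, 5, 1), 90)
print(json.dumps(example_filter, indent=2))
```

Every criterion sits inside the `AndFilter`, so a scene must satisfy all three to be returned.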

In [None]:
# create an API Request from the search specifications

item_type = ['PSScene']

geom_filter = data_filter.geometry_filter(test_aoi_geom)
# gte=90 matches the "90% or above" criterion (a bare positional 90 would mean strictly greater than)
clear_percent_filter = data_filter.range_filter('clear_percent', gte=90)
date_range_filter = data_filter.date_range_filter(
    "acquired",
    datetime(month=4, day=1, year=2019),
    datetime(month=5, day=1, year=2019))
# alternative cloud filter, shown for reference; it is not included in the combined filter below
cloud_cover_filter = data_filter.range_filter('cloud_cover', None, 0.1)

combined_filter = data_filter.and_filter([geom_filter, clear_percent_filter, date_range_filter])

async with Session() as sess:
    cl = DataClient(sess)
    request = await cl.create_search(name='temp_search2',
                                     search_filter=combined_filter,
                                     item_types=item_type)


In [None]:
# Let's look at our search request.
# Note: This is just the request's structure; the search hasn't been run yet
request

In [None]:
# Search the Data API
async with Session() as sess:
    cl = DataClient(sess)
    items = await cl.run_search(search_id=request['id'])
    item_list = [i async for i in items]

In [None]:
print(len(item_list))

#### Step 3: Group IDs by Date

In [None]:
# check out an item just for fun
print(item_list[0])

In [None]:
# let's grab this first item in our list and look at the date it was acquired
item = item_list[0]
acquired_date = item['properties']['acquired'].split('T')[0]
acquired_date

In [None]:
# We can create a function to get the acquired dates for all of our search results
def get_acquired_date(item):
    return item['properties']['acquired'].split('T')[0]

acquired_dates = [get_acquired_date(item) for item in item_list]

In [None]:
# Let's look at the unique values of Acquired Date for our results
unique_acquired_dates = set(acquired_dates)
unique_acquired_dates

In [None]:
# We can also list our Image IDs grouped based on Acquired Date

def get_date_item_ids(date, all_items):
    return [i['id'] for i in all_items if get_acquired_date(i) == date]

def get_ids_by_date(items):
    acquired_dates = [get_acquired_date(item) for item in items]
    unique_acquired_dates = set(acquired_dates)
    
    ids_by_date = dict((d, get_date_item_ids(d, items))
                       for d in unique_acquired_dates)
    return ids_by_date
    
ids_by_date = get_ids_by_date(item_list)
pprint(ids_by_date)

In [None]:
ids_by_date[list(unique_acquired_dates)[0]]
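The grouping above scans the full item list once per date. As an alternative, the same mapping can be built in a single pass with `collections.defaultdict`. This is a minimal sketch using a hypothetical `group_ids_by_date` helper and made-up search results that contain only the fields used here.

```python
from collections import defaultdict


def group_ids_by_date(items):
    """Map each acquisition date (YYYY-MM-DD) to a list of scene ids."""
    ids_by_date = defaultdict(list)
    for item in items:
        date = item["properties"]["acquired"].split("T")[0]
        ids_by_date[date].append(item["id"])
    return dict(ids_by_date)


# made-up search results with only the fields used here
example_items = [
    {"id": "scene_a", "properties": {"acquired": "2019-04-02T16:31:10.5Z"}},
    {"id": "scene_b", "properties": {"acquired": "2019-04-02T16:31:12.1Z"}},
    {"id": "scene_c", "properties": {"acquired": "2019-04-05T16:40:00.0Z"}},
]
example_groups = group_ids_by_date(example_items)

# iterate in sorted order so the dates come out chronologically
for date in sorted(example_groups):
    print(date, example_groups[date])
```

Because a `set` has no guaranteed order, sorting the dates (as in the loop above) is a simple way to process the days chronologically.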

#### Step 4: Submit Orders

Now that we have the scene ids for each collect date, we can create the orders for each date. The output of each order is a single zip file that contains one composited scene and one composited UDM2.

For this step we will use only the Python API. See Part 1 for a demonstration of how to use the CLI.

##### Step 4.1: Build Order Toolchain

In [None]:
item_type = 'PSScene'
bundle = 'analytic_sr_udm2'
name = 'tutorial_order2'

In [None]:
# specify tools

# clip to AOI
clip_tool = {'clip': {'aoi': test_aoi_geom}}

# convert to NDVI
bandmath_tool = {'bandmath': {
    "pixel_type": "32R",
    "b1": "(b4 - b3) / (b4+b3)"
}}

# composite
composite_tool = {
      "composite": {
      }
}

tools = [clip_tool, bandmath_tool, composite_tool]
pprint(tools)

In [None]:
# Build the order request using the Python SDK's order_request feature
# We will put this into a function so we can loop over all of our dates/image_IDs of interest.
def build_order_request(ids):
    products = [order_request.product(ids, bundle, item_type)]
    request = order_request.build_request('test_order_sdk_method_2', products=products, tools=tools)
    return request

In [None]:
list_of_order_requests = []

for date in list(unique_acquired_dates):
    ids = ids_by_date[date]
    list_of_order_requests.append(build_order_request(ids))
    
print(list_of_order_requests)
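Each order request is a dictionary that names the order, lists the scene ids together with their item type and product bundle, and attaches the toolchain. The sketch below assembles a request of roughly that shape by hand with a hypothetical `sketch_order_request` helper, placeholder scene ids, and a trimmed-down toolchain. It is meant only to show the structure; the SDK's `order_request` helpers remain the way we build requests in this tutorial.

```python
import json


def sketch_order_request(name, ids, item_type, bundle, tools):
    """Assemble an order request dictionary for one date's scenes."""
    return {
        "name": name,
        "products": [{
            "item_ids": list(ids),
            "item_type": item_type,
            "product_bundle": bundle,
        }],
        "tools": list(tools),
    }


# placeholder scene ids and a trimmed-down toolchain
example_tools = [
    {"bandmath": {"pixel_type": "32R", "b1": "(b4 - b3) / (b4 + b3)"}},
    {"composite": {}},
]
example_request = sketch_order_request(
    "example_order", ["scene_a", "scene_b"], "PSScene",
    "analytic_sr_udm2", example_tools)
print(json.dumps(example_request, indent=2))
```

Since all ids in one request end up in the same composite, building one request per date is what keeps each composite limited to a single day.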

##### Step 4.2: Submit Orders

In this section, for the sake of demonstration, we limit our orders to 2. Feel free to increase this limit if you want!

In [None]:
order_limit = 2
list_orders = []

# Place the order
# use a loop variable that does not shadow the imported order_request module
for order_req in list_of_order_requests[:order_limit]:
    async with Session() as sess:
        cl = OrdersClient(sess)
        order = await cl.create_order(order_req)
    list_orders.append(order)

In [None]:
# View the orders info
list_orders

#### Step 5: Download Orders

##### Step 5.1: Wait Until Orders are Successful

Before we can download the orders, they have to be prepared on the server. We wait for this in Step 5.2, immediately before running the download.
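Waiting for an order generally means polling its state until it reaches a final one. The sketch below shows that pattern with a fake order in place of a real API call. The `wait_for_final_state` and `make_fake_order` helpers are hypothetical, and the set of final states is our assumption about the Orders API; in the tutorial, the SDK's `wait` method handles this for us.

```python
import asyncio

# order states after which no further change is expected (assumed)
FINAL_STATES = {"success", "partial", "failed", "cancelled"}


async def wait_for_final_state(get_state, delay=0.01, max_attempts=100):
    """Poll get_state() until it returns a final state or attempts run out."""
    for _ in range(max_attempts):
        state = await get_state()
        if state in FINAL_STATES:
            return state
        await asyncio.sleep(delay)
    raise TimeoutError("order did not reach a final state in time")


def make_fake_order(states):
    """Return a coroutine function that replays a fixed list of states."""
    remaining = list(states)

    async def get_state():
        # keep returning the last state once the list is exhausted
        return remaining.pop(0) if len(remaining) > 1 else remaining[0]

    return get_state


fake_order = make_fake_order(["queued", "running", "running", "success"])
final_state = asyncio.run(wait_for_final_state(fake_order))
print(final_state)
```

Inside a notebook, where an event loop is already running, replace `asyncio.run(...)` with `await wait_for_final_state(fake_order)`. Note that a final state is not necessarily a successful one, so it is worth checking the returned state before downloading.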

##### Step 5.2: Run Download

For this step we will use the Planet Python SDK's Orders client because we want to be able to download multiple orders at once, something the CLI does not yet support.

In [None]:
# Since we have multiple Order IDs, let's get them into a list
order_id_list = []

for order in list_orders:
    order_id = order["id"]
    order_id_list.append(order_id)
    
print(order_id_list)

In [None]:
# establish the directory where we want to download the data
data_dir = os.path.join('data', 'use_case_1')

# make the download directory if it doesn't exist
Path(data_dir).mkdir(parents=True, exist_ok=True)

In [None]:
# First, we will make sure the orders have reached a downloadable state. 
# Then, we will download the orders.
# This may take several minutes.

async with Session() as sess:
    client = OrdersClient(sess)
    # wait on every submitted order, so this still works if order_limit is changed
    await asyncio.gather(*[client.wait(order_id) for order_id in order_id_list])

In [None]:
async with Session() as sess:
    client = OrdersClient(sess)
    # download every order; the third argument is the overwrite flag, not the client
    await asyncio.gather(*[
        client.download_order(order_id, data_dir, overwrite=True)
        for order_id in order_id_list
    ])

In [None]:
# Let's check our downloaded file locations
!ls data/use_case_1

In [None]:
data_dir

In [None]:
def get_download_locations(download_dir, order_id_list):
    # collect the downloaded file paths listed in each order's manifest
    locations = []
    for order_id in order_id_list:
        manifest_file = os.path.join(download_dir, order_id, 'manifest.json')
        with open(manifest_file, 'r') as src:
            manifest = json.load(src)
        location = [os.path.join(download_dir, order_id, f['path'])
                    for f in manifest['files']]
        locations.append(location)
    return locations

locations = get_download_locations(data_dir, order_id_list)

# un-nest our locations object
locations = list(chain.from_iterable(locations))

pprint(locations)

#### Step 6: Unzip and Verify Orders

In this step we will simply unzip the orders and view one of the ordered composite images.

##### 6.1: Unzip Order

In this section, we will unzip each order into a directory named after the downloaded zip file.

In [None]:
def unzip(filename):
    location = Path(filename)
    
    zipdir = location.parent / location.stem
    with ZipFile(location) as myzip:
        myzip.extractall(zipdir)
    return zipdir

In [None]:
def get_unzipped_files(zipdir):
    filedir = zipdir / 'files'
    filenames = os.listdir(filedir)
    return [filedir / f for f in filenames]

In [None]:
locations

In [None]:
# Now we can un-zip all our files using the functions we defined above
for i in locations:
    zipdir = unzip(i)
    file_paths = get_unzipped_files(zipdir)
    pprint(file_paths)
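To try the unzip step without a real order, the sketch below builds a tiny stand-in archive in a temporary directory and extracts it into a sibling directory named after the archive. The `unzip_next_to_archive` helper mirrors the `unzip` function above; the archive name and its contents are made up for the example.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile


def unzip_next_to_archive(zip_path):
    """Extract zip_path into a sibling directory named after the archive."""
    zip_path = Path(zip_path)
    target = zip_path.parent / zip_path.stem
    with ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    # build a tiny stand-in archive; a real order would contain GeoTIFFs
    archive_path = Path(tmp) / "example_order.zip"
    with ZipFile(archive_path, "w") as archive:
        archive.writestr("files/composite.tif", b"not a real image")

    extracted_dir = unzip_next_to_archive(archive_path)
    extracted_names = sorted(
        p.relative_to(extracted_dir).as_posix()
        for p in extracted_dir.rglob("*") if p.is_file())
    print(extracted_dir.name, extracted_names)
```

Extracting into a directory named after the archive keeps the contents of different orders from overwriting each other.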

##### 6.2: Verify Orders

In this section we will view the orders manually in QGIS. In the next part of this tutorial, we will visualize the NDVI composite image with the UDM2. But for now, we just want to make sure we got what we ordered.

In the explorer, navigate to the data folder (should be `*/notebooks/analysis-ready-data/data/use_case_1/`). In any of the subfolders (named with the `order_id`), go into `files` and find the file named `composite.tif`. Drag that file into QGIS to visualize.
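A quick numeric check can complement the visual one. The sketch below assumes the single NDVI band has already been read into a NumPy array (for example with a raster library of your choice) and that missing pixels are stored as NaN. Both are assumptions; the `summarize_ndvi` helper and the sample values are made up for illustration.

```python
import numpy as np


def summarize_ndvi(band):
    """Report the value range and the share of valid (non-NaN) pixels."""
    band = np.asarray(band, dtype="float32")
    valid = ~np.isnan(band)
    return {
        "min": float(band[valid].min()),
        "max": float(band[valid].max()),
        "valid_fraction": float(valid.mean()),
        # NDVI should never fall outside [-1, 1]
        "in_range": bool(np.all(np.abs(band[valid]) <= 1)),
    }


# made-up 2x2 NDVI band with one missing pixel
example_band = np.array([[0.1, 0.8], [np.nan, -0.2]], dtype="float32")
summary = summarize_ndvi(example_band)
print(summary)
```

Values outside the -1 to 1 range, or a very low share of valid pixels, would suggest that the order did not produce the single-band NDVI composite we expected.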