# Analysis Ready Data Tutorial Part 2: Use Case 1

Time-series analysis (e.g. change detection and trend detection) is a powerful application of satellite imagery. However, a great deal of processing is required to prepare imagery for analysis. Analysis Ready Data (ARD), preprocessed time-series stacks of overhead imagery, allows for time-series analysis without any additional processing of the imagery. See [Analysis Ready Data Defined](https://medium.com/planet-stories/analysis-ready-data-defined-5694f6f48815) for an excellent introduction to and discussion of ARD.

In [Part 1](ard_1_intro_and_best_practices.ipynb) of this tutorial, we introduced ARD and covered the hows and whys of using the Data and Orders APIs to create and interpret ARD.

This second part of the tutorial focuses on the first of two use cases. The use case addressed in this tutorial is:

* As a software engineer at an ag-tech company, I'd like to be able to order Planet imagery programmatically in a way that enables the data scientist at my organization to create time-series algorithms (e.g. monitoring NDVI curves over time) without further data cleaning and processing.

Please see the first part of the tutorial for an introduction to the Data and Orders APIs along with best practices. A lot of functionality developed in that tutorial will be copied here in a compact form.

## Introduction

Two things are interesting about this use case. First, we are calculating NDVI, and second, we are compositing scenes together. We are also using UDM2s. What are NDVI and UDM2s, what is compositing, and why do we want to do it?

Great questions!

NDVI stands for Normalized Difference Vegetation Index. It is commonly used to find out if vegetation is growing. You can find out more about NDVI at [USGS](https://www.usgs.gov/land-resources/eros/phenology/science/ndvi-foundation-remote-sensing-phenology?qt-science_center_objects=0#qt-science_center_objects) and [Wikipedia](https://en.wikipedia.org/wiki/Normalized_difference_vegetation_index). What we care about here is that NDVI uses the red and near-infrared bands of an image and returns one band with values that range from -1 to 1. So, we expect a single-band image for each order.
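
To make the NDVI calculation concrete, here is a minimal NumPy sketch of the formula `(NIR - red) / (NIR + red)`. The function name `ndvi` and the sample reflectance values are made up for illustration. In this tutorial the calculation itself is done server-side by the Orders API.

```python
import numpy as np


def ndvi(red, nir):
    """Return (NIR - red) / (NIR + red), with NaN where the denominator is zero."""
    red = np.asarray(red, dtype="float32")
    nir = np.asarray(nir, dtype="float32")
    total = nir + red
    # Start from NaN so pixels with a zero denominator stay marked as missing
    out = np.full(total.shape, np.nan, dtype="float32")
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# Hypothetical reflectance values for a 2x2 image
red = np.array([[0.05, 0.10], [0.20, 0.0]])
nir = np.array([[0.45, 0.30], [0.20, 0.0]])

result = ndvi(red, nir)
print(result)
```

Dense vegetation reflects strongly in the near-infrared and absorbs red light, so healthy crops push the value toward 1, while bare soil and water sit near or below 0.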

[UDM2s](https://developers.planet.com/docs/data/udm-2/), or Usable Data Masks, use machine learning image segmentation techniques to identify which pixels in an image are clear and which are affected by cloud, light haze, heavy haze, or snow. The resulting mask layers, packaged as a GeoTIFF, help visualize which parts of the image contain these elements and which parts are clear.
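
As a rough illustration of how a UDM2 can be used once it is loaded into memory, the sketch below works on a synthetic array instead of a real file. It assumes the mask is an 8-band array shaped `(bands, rows, cols)` with the clear map in the first band. Check the band order against the UDM2 documentation linked above before relying on it. The function names are hypothetical.

```python
import numpy as np

# Assumed UDM2 band order (1-based): 1=clear, 2=snow, 3=shadow,
# 4=light haze, 5=heavy haze, 6=cloud, 7=confidence, 8=unusable.
CLEAR_BAND_INDEX = 0  # zero-based position of band 1


def clear_fraction(udm2):
    """Fraction of pixels flagged clear in a (bands, rows, cols) array."""
    clear = udm2[CLEAR_BAND_INDEX] == 1
    return float(clear.mean())


def mask_unclear(values, udm2):
    """Replace pixels that are not flagged clear with NaN."""
    clear = udm2[CLEAR_BAND_INDEX] == 1
    return np.where(clear, values, np.nan)


# Synthetic 8-band, 2x2 mask: three clear pixels and one cloudy pixel
udm2 = np.zeros((8, 2, 2), dtype="uint8")
udm2[0] = [[1, 1], [1, 0]]  # clear band
udm2[5] = [[0, 0], [0, 1]]  # cloud band

ndvi_values = np.array([[0.8, 0.5], [0.1, 0.3]])
fraction = clear_fraction(udm2)
masked = mask_unclear(ndvi_values, udm2)
print(fraction)
print(masked)
```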

Compositing is a way to stitch multiple scenes together for maximum coverage. We want this because for a time series, we just want one image for each date and we want that one image to have the most coverage to minimize holes in our data. The composite tool takes in multiple scenes and returns one image. If we feed it scenes from a whole timestack, we still just get one image back! So, to avoid that disaster, we group our scenes by date and only composite the scenes that were collected on the same date.


## Implementation

The use case we will cover is: *As a software engineer at an ag-tech company, I'd like to be able to order Planet imagery programmatically in a way that enables the data scientist at my organization to create time-series algorithms (e.g. monitoring NDVI curves over time) without further data cleaning and processing.*

For this use case, the area of interest and time range are not specified, so we choose our own. The requirement of no further processing means we should apply a strict usable-pixel filter. For time-series analysis, the daily coverage of PlanetScope (PS) satellites is ideal, and we would like a single image that covers the entire area of interest (AOI) for each day. However, it may take multiple scenes to cover the entire AOI. Therefore, we will use the Composite tool to make a composite for each day in the time series. This is a little tricky because the Composite tool simply composites all of the scenes associated with the ids in an order. So we need to group the scene ids we get from the Data API by day, then submit one order per day.
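
The grouping idea can be sketched in a few lines. This version assumes each search result is a dictionary with an `id` and a `properties.acquired` timestamp, which matches the items shown later in this notebook. The ids in `sample_items` are placeholders. Note that the date is taken from the timestamp as-is, so a "day" here is a UTC day.

```python
from collections import defaultdict


def group_ids_by_date(items):
    """Map each acquired date (YYYY-MM-DD) to the ids collected on that date."""
    groups = defaultdict(list)
    for item in items:
        # 'acquired' looks like '2019-04-02T16:36:33Z'; keep the date part only
        date = item["properties"]["acquired"].split("T")[0]
        groups[date].append(item["id"])
    return dict(groups)


# Placeholder items standing in for Data API search results
sample_items = [
    {"id": "scene_a", "properties": {"acquired": "2019-04-02T16:36:33Z"}},
    {"id": "scene_b", "properties": {"acquired": "2019-04-02T16:36:31Z"}},
    {"id": "scene_c", "properties": {"acquired": "2019-04-08T16:37:38Z"}},
]

grouped = group_ids_by_date(sample_items)
print(grouped)
```

Each key then becomes one order, so the Composite tool only ever sees scenes from a single day.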

We will be searching for PSScene images acquired between April 1st and May 1st, 2019, with a clear percentage of 90% or above.

To summarize, these are the steps:
1. [Initialize API client](#Step-1:-Initialize-API-client)
2. [Search Data API](#Step-2:-Search-Data-API)
3. [Group IDs by Date](#Step-3:-Group-IDs-by-Date)
4. [Submit Orders](#Step-4:-Submit-Orders)
5. [Download Orders](#Step-5:-Download-Orders)
6. [Unzip and Verify Orders](#Step-6:-Unzip-and-Verify-Orders)

Note that, because visualizing the NDVI images and [UDM2s](https://developers.planet.com/docs/data/udm-2/) is processing-intensive, we cover visualization in the next notebook, [Analysis Ready Data Tutorial Part 2: Use Case 1 - Visualization](ard_2_use_case_1_visualize_images.ipynb).

Open in Colab below:

<a target="_blank" href="https://colab.research.google.com/github/planetlabs/notebooks/blob/master/jupyter-notebooks/analysis-ready-data/ard_2_use_case_1.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

#### Import Dependencies

In [1]:
import asyncio
from copy import copy
from datetime import datetime
from itertools import chain
import json
import os
from pathlib import Path

from pprint import pprint
from zipfile import ZipFile
import numpy as np
import planet
from planet import Auth
from planet import Session, DataClient, OrdersClient, data_filter
from planet.order_request import build_request, product

#### Step 1: Initialize API client

In [2]:
"""
If your Planet API Key is not set as an environment variable, you can paste it below.
Note: please be sure to follow the security guidelines put forth by your
organization when storing and using this API Key.
"""
API_KEY = os.environ.get('PL_API_KEY', 'PASTE_YOUR_KEY_HERE')

client = planet.Auth.from_key(API_KEY)

#### Step 2: Search Data API

The goal of this step is to get the scene ids that meet the search criteria for this use case.

In [3]:
# Define test data for the filter

# Iowa crops aoi
test_aoi_geom = {
    "type": "Polygon",
    "coordinates": [
        [
            [-93.299129, 42.699599],
            [-93.299674, 42.812757],
            [-93.288436, 42.861921],
            [-93.265332, 42.924817],
            [-92.993873, 42.925124],
            [-92.993888, 42.773637],
            [-92.998396, 42.754529],
            [-93.019154, 42.699988],
            [-93.299129, 42.699599]
        ]
    ]
}

In [4]:
# Create an API Request from the search specifications

item_type = ['PSScene']

geom_filter = data_filter.geometry_filter(test_aoi_geom)
# Keep scenes that are at least 90% clear
clear_percent_filter = data_filter.range_filter('clear_percent', gte=90)
date_range_filter = data_filter.date_range_filter(
    "acquired", gt=datetime(2019, 4, 1), lt=datetime(2019, 5, 1))
# Defined for reference only; it is not included in combined_filter below
cloud_cover_filter = data_filter.range_filter('cloud_cover', lt=0.1)

combined_filter = data_filter.and_filter(
    [geom_filter, clear_percent_filter, date_range_filter])

async with Session() as sess:
    cl = sess.client('data')
    request = await cl.create_search(name='temp_search2', search_filter=combined_filter, item_types=item_type)
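
The filter helpers used above return plain dictionaries. As a hedged sketch, the combined filter is roughly equivalent to the hand-written structure below. The simplified rectangle `aoi`, the helper `to_rfc3339`, and the comparison operators shown are illustrative assumptions, and the SDK's exact output may differ in detail.

```python
import json
from datetime import datetime


def to_rfc3339(dt):
    """Format a naive datetime as a UTC timestamp string."""
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


# Simplified rectangle standing in for the tutorial AOI
aoi = {
    "type": "Polygon",
    "coordinates": [[[-93.3, 42.7], [-93.3, 42.9], [-93.0, 42.9],
                     [-93.0, 42.7], [-93.3, 42.7]]],
}

search_filter = {
    "type": "AndFilter",
    "config": [
        {"type": "GeometryFilter", "field_name": "geometry", "config": aoi},
        {"type": "RangeFilter", "field_name": "clear_percent",
         "config": {"gte": 90}},
        {"type": "DateRangeFilter", "field_name": "acquired",
         "config": {"gte": to_rfc3339(datetime(2019, 4, 1)),
                    "lte": to_rfc3339(datetime(2019, 5, 1))}},
    ],
}

print(json.dumps(search_filter, indent=2))
```

Seeing the raw structure helps when comparing a saved search in the API response with the filter you intended to send.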

In [5]:
# Let's look at our search request.
# Note: This is just the request's structure, the search hasn't been run yet
pprint(request)

{'__daily_email_enabled': False,
 '_links': {'_self': 'https://api.planet.com/data/v1/searches/bdae6a6a26af4d239de12b82433e8515',
            'results': 'https://api.planet.com/data/v1/searches/bdae6a6a26af4d239de12b82433e8515/results'},
 'created': '2023-07-07T14:17:17.369726Z',
 'filter': {'config': [{'config': {'coordinates': [[[-93.299129, 42.699599],
                                                    [-93.299674, 42.812757],
                                                    [-93.288436, 42.861921],
                                                    [-93.265332, 42.924817],
                                                    [-92.993873, 42.925124],
                                                    [-92.993888, 42.773637],
                                                    [-92.998396, 42.754529],
                                                    [-93.019154, 42.699988],
                                                    [-93.299129, 42.699599]]],
                        

In [6]:
# Search the Data API
async with Session() as sess:
    cl = sess.client('data')
    items = cl.run_search(search_id=request['id'])
    item_list = [i async for i in items]

In [7]:
print(len(item_list))

80


#### Step 3: Group IDs by Date

In [8]:
# Check out an item just for fun
print(item_list[0])

{'_links': {'_self': 'https://api.planet.com/data/v1/item-types/PSScene/items/20190426_163458_0e3a', 'assets': 'https://api.planet.com/data/v1/item-types/PSScene/items/20190426_163458_0e3a/assets/', 'thumbnail': 'https://tiles.planet.com/data/v1/item-types/PSScene/items/20190426_163458_0e3a/thumb'}, '_permissions': ['assets.basic_analytic_4b:download', 'assets.basic_analytic_4b_rpc:download', 'assets.basic_analytic_4b_xml:download', 'assets.basic_udm2:download', 'assets.ortho_analytic_3b:download', 'assets.ortho_analytic_3b_xml:download', 'assets.ortho_analytic_4b:download', 'assets.ortho_analytic_4b_sr:download', 'assets.ortho_analytic_4b_xml:download', 'assets.ortho_udm2:download', 'assets.ortho_visual:download'], 'assets': ['basic_analytic_4b', 'basic_analytic_4b_rpc', 'basic_analytic_4b_xml', 'basic_udm2', 'ortho_analytic_3b', 'ortho_analytic_3b_xml', 'ortho_analytic_4b', 'ortho_analytic_4b_sr', 'ortho_analytic_4b_xml', 'ortho_udm2', 'ortho_visual'], 'geometry': {'coordinates': [[[

In [9]:
# Let's grab this first item in our list and look at the date it was acquired
item = item_list[0]
acquired_date = item['properties']['acquired'].split('T')[0]
pprint(acquired_date)

'2019-04-26'


In [10]:
# We can create a function to get the acquired dates for all of our search results
def get_acquired_date(item):
    return item['properties']['acquired'].split('T')[0]

acquired_dates = [get_acquired_date(item) for item in item_list]

In [11]:
# Let's look at the unique values of Acquired Date for our results
unique_acquired_dates = set(acquired_dates)
pprint(unique_acquired_dates)

{'2019-04-02',
 '2019-04-08',
 '2019-04-15',
 '2019-04-19',
 '2019-04-20',
 '2019-04-21',
 '2019-04-23',
 '2019-04-24',
 '2019-04-26'}


In [12]:
# We can also list our Image IDs grouped based on Acquired Date

def get_date_item_ids(date, all_items):
    """
    Get the item IDs for items with a specific acquired date.

    Args:
        date (str): The target acquired date in string format (e.g., '2023-06-27').
        all_items (list): A list of item dictionaries, each containing an 'id' field and a 'properties' dictionary with an 'acquired' timestamp.

    Returns:
        list: A list of item IDs that have the specified acquired date.
    """
    return [i['id'] for i in all_items if get_acquired_date(i) == date]


def get_ids_by_date(items):
    """
    Returns a dictionary mapping acquired dates to lists of item IDs.

    Args:
        items (list): A list of items.

    Returns:
        dict: A dictionary where the keys are acquired dates and the values are lists of item IDs.
    """
    unique_acquired_dates = {get_acquired_date(item) for item in items}

    # Sort the dates so the dictionary is built in chronological order
    return {d: get_date_item_ids(d, items)
            for d in sorted(unique_acquired_dates)}


ids_by_date = get_ids_by_date(item_list)
pprint(ids_by_date)

{'2019-04-02': ['20190402_163633_0e16',
                '20190402_163631_0e16',
                '20190402_163634_0e16'],
 '2019-04-08': ['20190408_163738_1025',
                '20190408_163736_1025',
                '20190408_163735_1025',
                '20190408_164038_100e',
                '20190408_164036_100e',
                '20190408_164037_100e',
                '20190408_164035_100e',
                '20190408_164034_100e',
                '20190408_154008_1020',
                '20190408_154005_1020',
                '20190408_154006_1020',
                '20190408_154004_1020',
                '20190408_154007_1020'],
 '2019-04-15': ['20190415_170304_85_1068', '20190415_170302_79_1068'],
 '2019-04-19': ['20190419_164002_1035',
                '20190419_164003_1035',
                '20190419_164000_1035',
                '20190419_164001_1035',
                '20190419_164004_1035'],
 '2019-04-20': ['20190420_164137_1002',
                '20190420_164136_1002',
      

In [13]:
# Sets are unordered, so sort the dates to reliably pick the earliest one
ids_by_date[sorted(unique_acquired_dates)[0]]

['20190402_163633_0e16', '20190402_163631_0e16', '20190402_163634_0e16']

#### Step 4: Submit Orders

Now that we have the scene ids for each collect date, we can create the orders for each date. The output of each order is a single zip file that contains one composited scene and one composited UDM2.

For this step we will just use the Python API. See [Part 1](ard_1_intro_and_best_practices.ipynb) of this tutorial for a demonstration of how to use the CLI.

##### Step 4.1: Build Order Toolchain

In [14]:
item_type = 'PSScene'
bundle = 'analytic_sr_udm2'
name = 'tutorial_order2'

In [15]:
# Specify tools

# Clip to AOI
clip_tool = {'clip': {'aoi': test_aoi_geom}}

# Convert to NDVI
bandmath_tool = {'bandmath': {
    "pixel_type": "32R",
    "b1": "(b4 - b3) / (b4+b3)"
}}

# Composite
composite_tool = {
    "composite": {
    }
}

tools = [clip_tool, bandmath_tool, composite_tool]
pprint(tools)

[{'clip': {'aoi': {'coordinates': [[[-93.299129, 42.699599],
                                    [-93.299674, 42.812757],
                                    [-93.288436, 42.861921],
                                    [-93.265332, 42.924817],
                                    [-92.993873, 42.925124],
                                    [-92.993888, 42.773637],
                                    [-92.998396, 42.754529],
                                    [-93.019154, 42.699988],
                                    [-93.299129, 42.699599]]],
                   'type': 'Polygon'}}},
 {'bandmath': {'b1': '(b4 - b3) / (b4+b3)', 'pixel_type': '32R'}},
 {'composite': {}}]


In [16]:
# Build the order request using the Python SDK's order_request feature
# We will put this into a function so we can loop over all of our dates/image_IDs of interest.
def build_order_request(ids):
    products = [product(ids, bundle, item_type)]
    request = build_request('test_order_sdk_method_2',
                            products=products, tools=tools)
    return request
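
For reference, the request that `build_request` returns is a plain dictionary with `name`, `products`, and `tools` keys. The sketch below builds a comparable dictionary by hand and puts the collect date in the order name, which can make orders easier to tell apart later. The function `build_daily_request`, the `ndvi_composite_` prefix, and the scene ids are hypothetical.

```python
def build_daily_request(date, ids, tools,
                        item_type="PSScene", bundle="analytic_sr_udm2"):
    """Build an order request dictionary for the scenes of a single day."""
    return {
        "name": f"ndvi_composite_{date}",
        "products": [{
            "item_ids": list(ids),
            "item_type": item_type,
            "product_bundle": bundle,
        }],
        # Tools run in list order: NDVI first, then composite
        "tools": list(tools),
    }


example_tools = [
    {"bandmath": {"pixel_type": "32R", "b1": "(b4 - b3) / (b4+b3)"}},
    {"composite": {}},
]

daily_request = build_daily_request(
    "2019-04-02", ["scene_a", "scene_b"], example_tools)
print(daily_request["name"])
```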

In [17]:
list_of_order_requests = []

# Sort the dates so the requests are built in chronological order
for date in sorted(unique_acquired_dates):
    ids = ids_by_date[date]
    list_of_order_requests.append(build_order_request(ids))

pprint(list_of_order_requests)

[{'name': 'test_order_sdk_method_2',
  'products': [{'item_ids': ['20190402_163633_0e16',
                             '20190402_163631_0e16',
                             '20190402_163634_0e16'],
                'item_type': 'PSScene',
                'product_bundle': 'analytic_sr_udm2'}],
  'tools': [{'clip': {'aoi': {'coordinates': [[[-93.299129, 42.699599],
                                               [-93.299674, 42.812757],
                                               [-93.288436, 42.861921],
                                               [-93.265332, 42.924817],
                                               [-92.993873, 42.925124],
                                               [-92.993888, 42.773637],
                                               [-92.998396, 42.754529],
                                               [-93.019154, 42.699988],
                                               [-93.299129, 42.699599]]],
                              'type': 'Polygon'}}},
     

##### Step 4.2: Submit Orders

In this section, for the sake of demonstration, we limit our orders to 2. Feel free to increase this limit if you want!

In [18]:
order_limit = 2
list_orders = []

# Place the orders, reusing a single session for all requests
async with Session() as sess:
    cl = sess.client('orders')
    for order_request in list_of_order_requests[:order_limit]:
        order = await cl.create_order(order_request)
        list_orders.append(order)
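
If you raise the order limit, you may want to submit requests concurrently while capping how many are in flight at once. The following is a sketch of that pattern using only `asyncio`. `fake_create_order` is a stand-in for a real client call such as `cl.create_order`, and `submit_all` is a hypothetical helper, not part of the SDK.

```python
import asyncio


async def submit_all(create_order, requests, max_concurrent=2):
    """Submit every request, running at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def submit_one(request):
        async with semaphore:
            return await create_order(request)

    # gather returns results in the same order as the input requests
    return await asyncio.gather(*(submit_one(r) for r in requests))


async def fake_create_order(request):
    """Stand-in for an API call; returns a minimal order description."""
    await asyncio.sleep(0)
    return {"id": f"order-for-{request['name']}", "state": "queued"}


requests = [{"name": "day_1"}, {"name": "day_2"}, {"name": "day_3"}]
orders = asyncio.run(submit_all(fake_create_order, requests))
print([o["id"] for o in orders])
```

Inside a notebook, where an event loop is already running, use `await submit_all(...)` instead of `asyncio.run(...)`.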

In [19]:
# View the orders info
pprint(list_orders)

[{'_links': {'_self': 'https://api.planet.com/compute/ops/orders/v2/6fe46be0-e53b-49d2-b3a3-16dbc600b4d8'},
  'created_on': '2023-07-07T14:17:19.871Z',
  'error_hints': [],
  'id': '6fe46be0-e53b-49d2-b3a3-16dbc600b4d8',
  'last_message': 'Preparing order',
  'last_modified': '2023-07-07T14:17:19.871Z',
  'name': 'test_order_sdk_method_2',
  'products': [{'item_ids': ['20190402_163633_0e16',
                             '20190402_163631_0e16',
                             '20190402_163634_0e16'],
                'item_type': 'PSScene',
                'product_bundle': 'analytic_sr_udm2'}],
  'state': 'queued',
  'tools': [{'clip': {'aoi': {'coordinates': [[[-93.299129, 42.699599],
                                               [-93.299674, 42.812757],
                                               [-93.288436, 42.861921],
                                               [-93.265332, 42.924817],
                                               [-92.993873, 42.925124],
                     

#### Step 5: Download Orders

##### Step 5.1: Wait Until Orders are Successful

Before we can download the orders, they have to be prepared on the server.
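
The SDK's `wait` method, used in the download cell further down, handles this waiting for us. Conceptually it is a polling loop like the sketch below, which assumes an order passes through `queued` and `running` before ending in one of `success`, `partial`, `failed`, or `cancelled`. The function `wait_for_final_state` and its parameters are hypothetical, and `get_state` stands in for a call that fetches the current order state.

```python
import time

# Assumed set of states after which an order no longer changes
FINAL_STATES = {"success", "partial", "failed", "cancelled"}


def wait_for_final_state(get_state, delay=5.0, max_attempts=200,
                         sleep=time.sleep):
    """Poll get_state() until it returns a final state or attempts run out."""
    for _ in range(max_attempts):
        state = get_state()
        if state in FINAL_STATES:
            return state
        sleep(delay)
    raise TimeoutError(f"no final state after {max_attempts} attempts")


# Simulated sequence of states reported by the server
states = iter(["queued", "running", "running", "success"])
final = wait_for_final_state(lambda: next(states), delay=0,
                             sleep=lambda seconds: None)
print(final)
```

Only `success` and `partial` orders have files to download, so it is worth checking the returned state before moving on.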

##### Step 5.2: Run Download

For this step we will use the Planet Python Orders API because we want to be able to download multiple orders at once, something the CLI does not yet support.

In [20]:
# Since we have multiple Order IDs, let's get them into a list
order_id_list = [order["id"] for order in list_orders]

print(order_id_list)

['6fe46be0-e53b-49d2-b3a3-16dbc600b4d8', 'b4945b89-0f89-4fae-9cd7-255020978060']


In [21]:
# Establish the directory where we want to download the data
data_dir = os.path.join('data', 'use_case_1')

# Make the download directory if it doesn't exist
Path(data_dir).mkdir(parents=True, exist_ok=True)

In [22]:
"""
First, we will make sure the orders have reached a downloadable state. 
Then, we will download the orders.
This may take several minutes.
"""

async with Session() as sess:
    cl = sess.client('orders')
    # Wait for every order in the list, however many were submitted
    await asyncio.gather(*(cl.wait(order_id) for order_id in order_id_list))


In [None]:
async with Session() as sess:
    cl = sess.client('orders')
    # Download every order in the list, however many were submitted
    await asyncio.gather(
        *(cl.download_order(order_id, data_dir) for order_id in order_id_list)
    )

In [None]:
# Let's check our downloaded file locations
!ls data/use_case_1

50c290fd-687a-4395-9f9a-0e389823479a b9ddd81d-0539-435b-b635-6bed947ef2ee
6d00095c-474b-4c69-8024-bc21a5241c25 bc6f0bce-ea98-4cbd-8a5a-df8eae6742c4
7309fefd-2493-4ba6-bba8-9d55276d272f c4180539-85ab-4a13-911b-ed20600b4e0d
8011394b-7118-4178-a132-8d7c84306c1d f06c7576-351b-4890-9a86-a215d210400c
8be63c8c-291f-4dd2-9c5f-c921b6702da4 fe1eb38d-907d-469c-ad66-7a3392c7942a
8f7ea899-c2d2-4789-8a0d-da23572dd580 locations.zip
b3aedba2-2594-4104-8098-c8d7787ddaa5


In [None]:
pprint(data_dir)

'data/use_case_1'


In [None]:
def get_download_locations(download_dir, order_id_list):
    """
    Retrieves the download locations of files based on the provided order IDs.

    Args:
        download_dir (str): The directory where the download files are stored.
        order_id_list (list): A list of order IDs.

    Returns:
        list: One list of file paths per order, read from that order's manifest.
    """
    # Keep the result local so repeated calls do not accumulate entries
    locations = []
    for order_id in order_id_list:
        manifest_file = os.path.join(download_dir, order_id, 'manifest.json')
        with open(manifest_file, 'r') as src:
            manifest = json.load(src)
        # Build paths from the download_dir argument, not the global data_dir
        location = [os.path.join(download_dir, order_id, f['path'])
                    for f in manifest['files']]
        locations.append(location)
    return locations


locations = get_download_locations(data_dir, order_id_list)

# Un-nest our locations object
locations = list(chain.from_iterable(locations))

pprint(locations)

['data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164135_1002_metadata.json',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_163150_0e19_metadata.json',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/composite.tif',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164134_1002_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_163148_0e19_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_163149_0e19_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164133_1002_metadata.json',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_163152_0e19_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164133_1002_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164136_1002_metadata.json',
 'data/use_
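
Most of the downloaded files are per-scene metadata. To pick out just the composite images from a list like the one above, a small filter on the file name is enough. This is a sketch with placeholder paths. `find_composites` is a hypothetical helper, and `PurePosixPath` is used because the sample paths use forward slashes.

```python
from pathlib import PurePosixPath


def find_composites(paths, filename="composite.tif"):
    """Return only the paths whose final component equals filename."""
    return [p for p in paths if PurePosixPath(p).name == filename]


# Placeholder paths shaped like the download locations
sample_locations = [
    "data/use_case_1/order_a/scene_1_metadata.json",
    "data/use_case_1/order_a/composite.tif",
    "data/use_case_1/order_b/composite.tif",
    "data/use_case_1/order_b/scene_2_metadata.json",
]

composites = find_composites(sample_locations)
print(composites)
```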

#### Step 6: Unzip and Verify Orders

In this step we will simply unzip the orders and view one of the ordered composite images.

##### 6.1: Unzip Order

In this section, we will unzip each order into a directory named after the downloaded zip file.

In [None]:
def unzip(filename):
    location = Path(filename)

    zipdir = location.parent / location.stem
    with ZipFile(location) as myzip:
        myzip.extractall(zipdir)
    return zipdir

In [None]:
def get_unzipped_files(zipdir):
    filedir = zipdir / 'files'
    filenames = os.listdir(filedir)
    return [filedir / f for f in filenames]
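
To see how these two helpers behave without touching real order data, the sketch below repeats their definitions and runs them on a throwaway archive created in a temporary directory. The archive name and its contents are invented for the example.

```python
import os
import tempfile
from pathlib import Path
from zipfile import ZipFile


def unzip(filename):
    """Extract an archive into a sibling directory named after the file."""
    location = Path(filename)
    zipdir = location.parent / location.stem
    with ZipFile(location) as myzip:
        myzip.extractall(zipdir)
    return zipdir


def get_unzipped_files(zipdir):
    """List the paths inside the 'files' folder of an extracted archive."""
    filedir = zipdir / 'files'
    return [filedir / f for f in os.listdir(filedir)]


with tempfile.TemporaryDirectory() as tmp:
    # Build a small archive with the same 'files/' layout the helpers expect
    archive = Path(tmp) / "example_order.zip"
    with ZipFile(archive, "w") as zf:
        zf.writestr("files/composite.tif", b"placeholder")
        zf.writestr("files/notes.txt", "placeholder")

    zipdir = unzip(archive)
    zipdir_name = zipdir.name
    extracted = sorted(p.name for p in get_unzipped_files(zipdir))

print(zipdir_name, extracted)
```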

In [None]:
pprint(locations)

['data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164135_1002_metadata.json',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_163150_0e19_metadata.json',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/composite.tif',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164134_1002_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_163148_0e19_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_163149_0e19_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164133_1002_metadata.json',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_163152_0e19_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164133_1002_3B_AnalyticMS_metadata_clip.xml',
 'data/use_case_1/7309fefd-2493-4ba6-bba8-9d55276d272f/20190420_164136_1002_metadata.json',
 'data/use_

##### 6.2: Verify Orders

In this section we will view the orders manually in QGIS. In the next part of this tutorial, we will visualize the NDVI composite image with the UDM2. But for now, we just want to make sure we got what we ordered.

In the explorer, navigate to the data folder (should be `*/notebooks/analysis-ready-data/data/use_case_1/`). In any of the subfolders (named with the `order_id`), go into `files` and find the file named `composite.tif`. Drag that file into QGIS to visualize.
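
If QGIS is not at hand, a very light sanity check is to confirm that a downloaded file at least starts with a TIFF signature. This does not validate the image contents, only the first four bytes. `looks_like_tiff` is a hypothetical helper, and the files below are temporary stand-ins, not real composites.

```python
import tempfile
from pathlib import Path

# Byte-order marker plus magic number for classic TIFF and BigTIFF
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*", b"II+\x00", b"MM\x00+")


def looks_like_tiff(path):
    """Return True if the file begins with a known TIFF signature."""
    with open(path, "rb") as src:
        return src.read(4) in TIFF_SIGNATURES


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in files: one with a TIFF header, one plain text
    tiff_path = Path(tmp) / "composite.tif"
    tiff_path.write_bytes(b"II*\x00" + b"\x00" * 16)
    text_path = Path(tmp) / "not_an_image.tif"
    text_path.write_bytes(b"hello world")

    tiff_ok = looks_like_tiff(tiff_path)
    text_ok = looks_like_tiff(text_path)

print(tiff_ok, text_ok)
```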