# **Machine-to-Machine (M2M) Landsat Download: Bands, Bundles, and Band Groups**

### **What is M2M?**
EarthExplorer (EE) serves as a key data access portal, offering a range of tools for searching, discovering, and downloading data and metadata from the USGS Earth Resources Observation and Science (EROS) data repository. The Machine-to-Machine (M2M) API allows users to search the EROS archive and retrieve download URLs and metadata from programming languages such as Python or PHP. Users build JSON payloads, pass them to M2M endpoints, and receive JSON responses in return.

### **Requesting Access**
Users will need to log on to their [**EROS Registration Service (ERS)**](https://ers.cr.usgs.gov/) accounts to view the M2M documentation. To submit download requests through M2M, ERS users must be authorized.
Steps to request M2M access:
1. Log in to [**EROS Registration Service (ERS)**](https://ers.cr.usgs.gov/) (See [**How to Create an ERS Account**](https://www.youtube.com/watch?v=Ut6kxbuP_nk))
2. Select the **Access Request** menu
3. From the **Access Controls** page, use the **Request Access** button
4. In the **Access Type** selector, choose **Access to EE's Machine to Machine interface (MACHINE)**, then complete the questions

<div class="alert alert-warning"><h4>
The USGS/EROS User Services team assesses these requests. Contact USGS EROS User Services for more information: <a href="https://www.usgs.gov/staff-profiles/usgs-eros-customer-services" target="_blank">custserv@usgs.gov</a> and visit the <a href="https://www.usgs.gov/centers/eros/science/earthexplorer-help-index#ers" target="_blank">EarthExplorer Help Index</a>   </h4></div>


### **Use Case Scenario**
This notebook illustrates how to download Landsat Collection 2 ***bands***, ***bundles***, and ***band groups*** using the M2M API. In this context, ***"bands"*** are individual band files in **.TIF** format, ***"bundles"*** contain all files for a product as a single **.tar** file, and ***"band groups"*** deliver all files for a product as individual files (the same files that a bundle packs into its .tar file). The area of interest (AOI) used in the code below is the Lago de Valencia area in Venezuela.

You will be allowed to select the download ***filetype*** before proceeding.


After 1. [**Setting a File Download Type**](#setdownloadtype), this example shows the user how to:

2. [**Create Area of Interest GeoJSON file**](#createaoi)
3. [**Retrieve Scenes using the *scene-search* endpoint**](#retrievescenes)
4. [**Request a *scene-list-add***](#scenelistadd)
5. [**Get *download-options***](#downloadoptions)
6. [**Set *data-file-group* Ids**](#setdatafilegroups)
7. [**Select Products for Downloading**](#selectproducts)
8. [**Sending *download-request* and *download-retrieve* commands**](#downloadrequest)
9. [**Removing a scene-list with *scene-list-remove***](#scenelistremove)
10. [**Logout of M2M endpoint**](#logout)

##### To view other M2M examples visit the [Machine-to-Machine (M2M) API Examples](https://m2m.cr.usgs.gov/api/docs/examples) page.


## **Setup**

### **Import necessary libraries**

In [None]:
import json
import requests
from getpass import getpass
import sys
import time
import re
import threading
import datetime
import os
import pandas as pd
import geopandas as gpd

import warnings
warnings.filterwarnings("ignore")

### **Define send request function**
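This function sends a POST request to an M2M endpoint and returns the ***data*** field of the parsed JSON response. The same cell also defines ***read_credentials***, which reads `key:value` pairs from a text file.

Input parameters include:
- **url** (*str*): The URL of the M2M endpoint
- **data** (*dict*): The payload to be sent with the request
- **headers** (*dict*, optional): HTTP headers, e.g. the `X-Auth-Token` header that carries the API key
- **exitIfNoResponse** (*bool*, optional): If True (the default), exit on error; otherwise return False

Returns:
- The contents of the response's **data** field, or **False** on error when ***exitIfNoResponse*** is False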

In [None]:
# Read credentials from a text file made of "key:value" lines
def read_credentials(file_path):
    credentials = {}
    with open(file_path, 'r') as file:
        for line in file:
            if ':' in line:
                key, value = line.strip().split(':', 1)
                # Strip whitespace so "username: name" and "username:name" both work
                credentials[key.strip()] = value.strip()
    return credentials

# Send a POST request to an M2M endpoint and return the 'data' field of the JSON response
def sendRequest(url, data, headers=None, exitIfNoResponse=True):
    json_data = json.dumps(data)

    try:
        # requests.post accepts headers=None, so one call covers both cases
        response = requests.post(url, data=json_data, headers=headers)
        httpStatusCode = response.status_code

        output = json.loads(response.text)
        if output['errorCode'] is not None:
            print(output['errorCode'], "- ", output['errorMessage'])
            if exitIfNoResponse: sys.exit()
            else: return False

        if httpStatusCode == 404:
            print("404 Not Found")
            if exitIfNoResponse: sys.exit()
            else: return False
        elif httpStatusCode == 401:
            print("401 Unauthorized")
            if exitIfNoResponse: sys.exit()
            else: return False
        elif httpStatusCode == 400:
            print("Error Code", httpStatusCode)
            if exitIfNoResponse: sys.exit()
            else: return False
        else:
            return output['data']
    except Exception as e:
        # Covers connection failures and responses that are not valid JSON
        print("An error occurred: ", str(e))
        if exitIfNoResponse: sys.exit()
        else: return False
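
As a complement to ***sendRequest***, the following minimal sketch shows one way to separate HTTP-level failures from API-level errors by raising an exception instead of calling `sys.exit()`. It assumes only that the response envelope carries the `errorCode`, `errorMessage`, and `data` keys that ***sendRequest*** reads above. The names `M2MError` and `unwrap_response` are hypothetical and are not part of the M2M API.

```python
class M2MError(Exception):
    """Hypothetical exception type for this sketch (not part of M2M)."""


def unwrap_response(status_code, body):
    """Return body['data'], or raise M2MError if the call failed.

    status_code: HTTP status as an int
    body: parsed JSON envelope (dict) with 'errorCode', 'errorMessage', 'data'
    """
    # Check the transport-level status first
    if status_code >= 400:
        raise M2MError(f"HTTP {status_code}")
    # Then check the API-level error fields
    if body.get('errorCode') is not None:
        raise M2MError(f"{body['errorCode']} - {body.get('errorMessage')}")
    return body.get('data')


# Usage with a hand-written envelope (no network access needed)
ok_body = {'errorCode': None, 'errorMessage': None, 'data': {'results': []}}
print(unwrap_response(200, ok_body))
```

Raising an exception lets the calling cell decide whether to stop or retry, which can be more convenient in a notebook than exiting the interpreter.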

### **Define download function**

In [None]:
def downloadFile(url):
    sema.acquire()
    try:
        response = requests.get(url, stream=True)
        disposition = response.headers['content-disposition']
        filename = re.findall("filename=(.+)", disposition)[0].strip("\"")
        print(f"    Downloading: {filename}...")

        # Stream the body to disk in chunks and close the file when done
        with open(os.path.join(data_dir, filename), 'wb') as f:
            for chunk in response.iter_content(chunk_size=1024 * 1024):
                f.write(chunk)
        sema.release()
    except Exception as e:
        print(f"\nFailed to download from {url} ({e}). Will try to re-download.")
        sema.release()
        runDownload(threads, url)

### **Define *runDownload* function**
This helper starts the ***downloadFile*** function above in a new thread so that multiple files can be downloaded simultaneously.

In [None]:
def runDownload(threads, url):
    thread = threading.Thread(target=downloadFile, args=(url,))
    threads.append(thread)
    thread.start()

### **Set up output directory**
This section creates the ***data*** and ***utils*** directories if they do not already exist.

In [None]:
data_dir = 'data'
utils_dir = 'utils'
dirs = [ data_dir, utils_dir]

for d in dirs:
    if not os.path.exists(d):
        try:
            os.makedirs(d)
            print(f"Directory '{d}' created successfully.")
        except OSError as e:
            print(f"Error creating directory '{d}': {e}")
    else:
        print(f"Directory '{d}' already exists.")

### **<span style="color:#DF0101">Deleting files in *data* directory</span>**

<span style="color:#DF0101">This script is designed to clear all files in the data directory, providing users with a clean slate to examine the types of files downloaded with each download type. Uncomment this section if the **data** directory is not empty.</span> 

In [None]:
# # List all files in the directory
# files = os.listdir(data_dir)

# # Loop through the files and delete them one by one
# for file in files:
#     file_path = os.path.join(data_dir, file)
    
#     try:
#         # Check if the file exists and is a regular file (not a directory)
#         if os.path.isfile(file_path):
#             os.remove(file_path)
#             print(f"Deleted file: {file_path}")
#         else:
#             print(f"Skipped non-file: {file_path}")
#     except Exception as e:
#         print(f"Error deleting file: {file_path} - {str(e)}")

# # Print a message indicating the process is complete
# print("File deletion complete.")

### **Set download thread parameters**

In [None]:
maxthreads = 5 # Threads count for downloads
sema = threading.Semaphore(value=maxthreads)
label = datetime.datetime.now().strftime("%Y%m%d_%H%M%S") # Customized label using date time
threads = []

## **User Input**
#### **Log in with an [EROS Registration Service (ERS)](https://ers.cr.usgs.gov/) *username* and application *token* read from a credentials file**

In [None]:
# Read credentials from the file
cred_file_path = 'C:\\Workspace\\descarga_landsat\\credenciales.txt'
credentials = read_credentials(cred_file_path)
username = credentials['username']
token = credentials['token']

# Generate the API key using the login-token endpoint
serviceUrl = "https://m2m.cr.usgs.gov/api/api/json/stable/"
url = serviceUrl + "login-token"
payload = {"username": username, "token": token}
response = requests.post(url, json=payload)

if response.status_code == 200:
    apiKey = response.json()['data']
    print('\nLogin Successful, API Key Received!')
    headers = {'X-Auth-Token': apiKey}
else:
    print("\nLogin was unsuccessful. Please check your token and try again.")
    print(response.json())  # Print the error message from the response
    raise Exception("Login failed")

# Print the API key only if login was successful
if apiKey:
    print("\nAPI Key: " + apiKey + "\n")

#### **OR**

#### **Log in by setting an [EROS Registration Service (ERS)](https://ers.cr.usgs.gov/) *username* and *password***

In [None]:
# username = ""
# password = ""

In [None]:
# print("Logging in...\n")
    
# serviceUrl = "https://m2m.cr.usgs.gov/api/api/json/stable/"
# login_payload = {'username' : username, 'password' : password}
    
# apiKey = sendRequest(serviceUrl + "login", login_payload)
# headers = {'X-Auth-Token': apiKey}  # later cells pass this header with every request
    
# print("API Key: " + apiKey + "\n")

## **1. Set File Download Type <a id="setdownloadtype"></a>**

#### <span style="color:#DF0101">Uncomment your preferred download type before proceeding.</span> 

- #### **Bundle**
For downloading bundles of Landsat scenes (downloads data for select scenes as ***.tar*** files).

In [None]:
# fileType = 'bundle'

- #### **Band**
For downloading individual band files as ***.TIF*** files.
Set ***bandNames*** to the bands you want, e.g. ***'SR_B2'*** selects Band 2 from the Surface Reflectance products.

In [None]:
fileType = 'band'
bandNames = {'SR_B3', 'SR_B5', 'ANG'}

- #### **Band Group**
Downloads all bands for a given product as individual files.

In [None]:
# fileType = 'band_group'

## **2. Create an Area of Interest (AOI) GeoJSON Text File** <a id="createaoi"></a>

Polygon coordinates are given as [longitude, latitude] pairs in the sequence SW, NW, NE, SE, SW, covering all four corners, with the first (SW) point repeated to close the polygon.

In [None]:
from geojson import Polygon, Feature, FeatureCollection, dump

polygon = Polygon([[[-68.00847232151489, 9.993051075414456],  # Lower-left corner
                    [-68.00741959094407, 10.334374973060754],  # Upper-left corner
                    [-67.49442572538628, 10.332388252838966],  # Upper-right corner
                    [-67.49602212224453, 9.991131292108372],   # Lower-right corner
                    [-68.00847232151489, 9.993051075414456]]])  # Closing the polygon

features = []
features.append(Feature(geometry=polygon, properties={"name": "Lago de Valencia Area", "country": "Venezuela"}))

# Create the feature collection
feature_collection = FeatureCollection(features)

# Save to a GeoJSON file
with open('./utils/Lago_de_Valencia_Venezuela_aoi.geojson', 'w') as f:
    dump(feature_collection, f)

- #### Open the GeoJSON file as a geopandas dataframe

In [None]:
aoi_geodf =  gpd.read_file('./utils/Lago_de_Valencia_Venezuela_aoi.geojson') #aoi

- #### View the Coordinate Reference System (CRS)
The CRS used is the World Geodetic System 1984 with units in degrees.

In [None]:
aoi_geodf.crs

### **Plot AOI using Folium**
The ***folium*** module is used for creating interactive web maps.

In [None]:
import folium
# Center the map on the AOI centroid; folium expects location as [latitude, longitude]
m = folium.Map(location=[aoi_geodf.centroid.y[0], aoi_geodf.centroid.x[0]], zoom_start=8, tiles="openstreetmap",
               width="90%", height="90%", attributionControl=0)

In [None]:
for _, r in aoi_geodf.iterrows():
    sim_geo = gpd.GeoSeries(r["geometry"]).simplify(tolerance=0.001)
    geo_j = sim_geo.to_json()
    geo_j = folium.GeoJson(data=geo_j, style_function=lambda x: {"fillColor": "blue"})
    geo_j.add_to(m)
m

## **3. Retrieve Scenes using the [***scene-search***](https://m2m.cr.usgs.gov/api/docs/reference/#scene-search) endpoint** <a id="retrievescenes"></a>

The [***scene-search***](https://m2m.cr.usgs.gov/api/docs/reference/#scene-search) endpoint searches for scenes within a dataset collection. A ***'datasetName'*** parameter is required to identify the dataset to search. In this case, we will be downloading bands from the ***Landsat 8-9 Operational Land Imager and Thermal Infrared Sensor Collection 2 Level-2*** dataset, whose datasetName is ***landsat_ot_c2_l2***.

In [None]:
datasetName = 'landsat_ot_c2_l2'

<div class="alert alert-info">
    <h4>To find the <b><i>datasetName</i></b> for other collections of Landsat data:</h4>
    <ol>
        <li>
            Send a <a href="https://m2m.cr.usgs.gov/api/docs/reference/#dataset-search" target="_blank">
                <b><i>dataset-search</i></b>
            </a> request and extract the <b><i>'datasetAlias'</i></b> field. No input parameters are necessary!
        </li>
        <li>
            Run the <b><i>dataset-search</i></b> endpoint using the
            <a href="https://m2m.cr.usgs.gov/api/test/json/" target="_blank">
                <b><i>Machine-to-Machine (M2M) Test Page</i></b>
            </a>.
        </li>
    </ol>
</div>


### **3.1 Setup [*scene-filters*](https://m2m.cr.usgs.gov/api/docs/datatypes/#sceneFilter):**

The [***spatialFilter***](https://m2m.cr.usgs.gov/api/docs/datatypes/#spatialFilter), [***acquisitionFilter***](https://m2m.cr.usgs.gov/api/docs/datatypes/#acquisitionFilter) and [***cloudCoverFilter***](https://m2m.cr.usgs.gov/api/docs/datatypes/#cloudCoverFilter) are [***scene-filters***](https://m2m.cr.usgs.gov/api/docs/datatypes/#sceneFilter) used below to identify the scenes matching the filter criteria using [**scene-search**](https://m2m.cr.usgs.gov/api/docs/reference/#scene-search).

- #### ***spatialFilter***
There are two ***filterType*** values:

> - [***geojson***](https://m2m.cr.usgs.gov/api/docs/datatypes/#spatialFilterGeoJson) which needs a ***['geoJson'](https://m2m.cr.usgs.gov/api/docs/datatypes/#geoJson)*** field name labeled with a geometric ***'type'*** e.g. ***'Polygon'*** and includes the coordinate array of the polygon:

~~~~
        spatialFilter = {'filterType' : 'geojson',
                         'geoJson' : {'type': 'Polygon',
                                      'coordinates': [[[-148.9555, 61.4834],
                                                       [-150.9495, 61.4834],
                                                       [-150.9495, 61.0031],
                                                       [-148.9555, 61.0031],
                                                       [-148.9555, 61.4834]]]}}
~~~~

> - [***mbr***](https://m2m.cr.usgs.gov/api/docs/datatypes/#spatialFilterMbr) (Minimum Bounding Rectangle) where the user should include the ***'lowerLeft'*** (southwest point) and ***'upperRight'*** (northeast point) of the rectangle as ***{'latitude': 61.4834, 'longitude' : -148.9555}*** coordinates:

```
        spatialFilter =  {'filterType' : 'mbr',
                    'lowerLeft' : {'latitude' : 61.0031,
                                   'longitude' : -150.9495},
                   'upperRight' : { 'latitude' : 61.4834,
                                   'longitude' : -148.9555}}
```
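
The two corners of an ***mbr*** filter can also be derived from a polygon ring rather than typed by hand. The sketch below does this for the Anchorage ring used in the ***geojson*** example; `mbr_from_ring` is a hypothetical helper name, and the approach assumes the ring does not cross the antimeridian.

```python
def mbr_from_ring(ring):
    """Build an 'mbr' spatial filter from a list of [longitude, latitude] pairs."""
    lons = [pt[0] for pt in ring]
    lats = [pt[1] for pt in ring]
    return {
        'filterType': 'mbr',
        # lowerLeft is the southwest corner: smallest latitude and longitude
        'lowerLeft': {'latitude': min(lats), 'longitude': min(lons)},
        # upperRight is the northeast corner: largest latitude and longitude
        'upperRight': {'latitude': max(lats), 'longitude': max(lons)},
    }


ring = [[-148.9555, 61.4834], [-150.9495, 61.4834], [-150.9495, 61.0031],
        [-148.9555, 61.0031], [-148.9555, 61.4834]]
mbr = mbr_from_ring(ring)
print(mbr)
```

Computing the corners from the ring keeps the southwest and northeast points from being swapped by accident.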

In [None]:
# spatialFilter =  {'filterType' : 'mbr',
#                     'lowerLeft' : {'latitude' : aoi_geodf.bounds.miny[0],\
#                                    'longitude' : aoi_geodf.bounds.minx[0]},
#                    'upperRight' : { 'latitude' : aoi_geodf.bounds.maxy[0],\
#                                    'longitude' : aoi_geodf.bounds.maxx[0]}}

spatialFilter = {
    'filterType': 'geojson',
    'geoJson': {
        'type': 'Polygon',
        'coordinates': [
            [
                [-68.00847232151489, 9.993051075414456],  # Lower-left corner
                [-68.00741959094407, 10.334374973060754],  # Upper-left corner
                [-67.49442572538628, 10.332388252838966],  # Upper-right corner
                [-67.49602212224453, 9.991131292108372],   # Lower-right corner
                [-68.00847232151489, 9.993051075414456]    # Closing the polygon
            ]
        ]
    }
}

- #### **acquisitionFilter**
The [***acquisitionFilter***](https://m2m.cr.usgs.gov/api/docs/datatypes/#acquisitionFilter) holds the ***'start'*** and ***'end'*** dates in ISO format (**yyyy-mm-dd**) for the time period of interest. In this notebook it is stored in the ***temporalFilter*** variable and passed as the ***'acquisitionFilter'*** in the search payload.

In [None]:
temporalFilter = {'start' : '2024-07-05', 'end' : '2024-08-06'}
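
If the time window is defined relative to an end date, the ISO strings can be generated rather than typed. This is a small sketch using only the standard library; `make_acquisition_filter` is a hypothetical helper name, not an M2M function.

```python
from datetime import date, timedelta


def make_acquisition_filter(end, days):
    """Return {'start', 'end'} ISO date strings covering `days` days up to `end`."""
    if days < 0:
        raise ValueError("days must be non-negative")
    start = end - timedelta(days=days)
    # date.isoformat() yields the yyyy-mm-dd form used by the filter
    return {'start': start.isoformat(), 'end': end.isoformat()}


example_filter = make_acquisition_filter(date(2024, 8, 6), 32)
print(example_filter)
```

With these inputs the helper reproduces the window used in this notebook, 2024-07-05 through 2024-08-06.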

- #### ***cloudCoverFilter***
To filter scenes by cloud cover, you can use the [***cloudCoverFilter***](https://m2m.cr.usgs.gov/api/docs/datatypes/#cloudCoverFilter) and specify a ***'min'*** and ***'max'*** percentage (%) value.

In [None]:
cloudCoverFilter = {'min' : 0, 'max' : 10}

### **3.2 Combine all parameters into a *payload* used for sending requests:**

In [None]:
search_payload = {
    'datasetName' : datasetName,
    'sceneFilter' : {
        'spatialFilter' : spatialFilter,
        'acquisitionFilter' : temporalFilter,
        'cloudCoverFilter' : cloudCoverFilter
    }
}

In [None]:
search_payload
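
The same payload can be assembled by a small helper that checks the filter values before anything is sent. The following is a sketch under stated assumptions: `build_search_payload` is a hypothetical name, the bounding-box numbers are rounded approximations of this notebook's AOI, and the checks reflect plain common sense about percentages and date order rather than documented M2M validation rules.

```python
def build_search_payload(dataset_name, spatial_filter, acquisition_filter,
                         cloud_min=0, cloud_max=100):
    """Assemble a scene-search payload after basic sanity checks."""
    if not 0 <= cloud_min <= cloud_max <= 100:
        raise ValueError("cloud cover must satisfy 0 <= min <= max <= 100")
    # ISO yyyy-mm-dd strings sort chronologically, so string comparison is enough
    if acquisition_filter['start'] > acquisition_filter['end']:
        raise ValueError("'start' must not be later than 'end'")
    return {
        'datasetName': dataset_name,
        'sceneFilter': {
            'spatialFilter': spatial_filter,
            'acquisitionFilter': acquisition_filter,
            'cloudCoverFilter': {'min': cloud_min, 'max': cloud_max},
        },
    }


example_payload = build_search_payload(
    'landsat_ot_c2_l2',
    {'filterType': 'mbr',                                   # rounded AOI corners
     'lowerLeft': {'latitude': 9.99, 'longitude': -68.01},
     'upperRight': {'latitude': 10.33, 'longitude': -67.49}},
    {'start': '2024-07-05', 'end': '2024-08-06'},
    cloud_max=10,
)
print(example_payload['sceneFilter']['cloudCoverFilter'])
```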

### **3.3 Search for Scenes:**

In [None]:
scenes = sendRequest(serviceUrl + "scene-search", search_payload, headers)

In [None]:
pd.json_normalize(scenes['results'])

## **4. Request a [*scene-list-add*](https://m2m.cr.usgs.gov/api/docs/reference/#scene-list-add)**<a id="scenelistadd"></a>
A [***scene-list-add***](https://m2m.cr.usgs.gov/api/docs/reference/#scene-list-add) request compiles a list of scene or product IDs for download. A user-defined ***`listId`*** name is required.

### **4.1 Create a list of entityIds**
The [***scene-list-add***](https://m2m.cr.usgs.gov/api/docs/reference/#scene-list-add) endpoint requires a list of **`entityIds`**. These are [Landsat Scene Identifiers](https://www.usgs.gov/centers/eros/science/landsat-collection-2-data-dictionary#landsat_scene_id) that remain consistent for all levels and products available for each scene. Users can also enter a list of **`displayIds`** (also known as [Landsat Product Identifiers](https://www.usgs.gov/centers/eros/science/landsat-collection-2-data-dictionary#landsat_product_id)) in place of a list of **`entityIds`**, but must then set the **`idField`** to **`'displayId'`**. Both the **`entityId`** and **`displayId`** are returned by a [***scene-search***](https://m2m.cr.usgs.gov/api/docs/reference/#scene-search).

Here's an example of what the [***scene-list-add***](https://m2m.cr.usgs.gov/api/docs/reference/#scene-list-add) payload would look like when a user wants to add scenes to a list using their entityId or displayId values:

- **entityIds**
~~~~
    {'listId': 'temp_landsat_ot_c2_l2_list',
     'idField': 'entityId',
     'entityIds': ['LC80680172020070LGN00', 'LC80680182020070LGN00'],
     'datasetName': 'landsat_ot_c2_l2'}
~~~~ 
- **displayIds**
~~~~ 
    {'listId': 'temp_landsat_ot_c2_l2_list',
     'idField': 'displayId',
     'entityIds': ['LC08_L2SP_068017_20200310_20200822_02_T1', 'LC08_L2SP_068018_20200310_20200822_02_T1'],
     'datasetName': 'landsat_ot_c2_l2'}
~~~~

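Note that the `'entityIds'` parameter name is the same in both payloads; only the `'idField'` value and the kind of identifier in the list change.
The IDs shown here are illustrative. The cell below builds the list from the results of the [***scene-search***](https://m2m.cr.usgs.gov/api/docs/reference/#scene-search) submitted above.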
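
The two identifier styles encode the same acquisition date in different ways: the Landsat Scene Identifier stores the year and day of year, while the Landsat Product Identifier stores a calendar date. The sketch below extracts the date from the first example ID of each kind shown above. The function names are hypothetical, and the fixed positions assume the standard Landsat identifier layouts.

```python
from datetime import datetime


def acquisition_date_from_entity_id(entity_id):
    """Scene ID layout LXSPPPRRRYYYYDDDGSIVV: zero-based positions 9-15 hold YYYYDDD."""
    return datetime.strptime(entity_id[9:16], '%Y%j').date()


def acquisition_date_from_display_id(display_id):
    """Product ID: the fourth underscore-separated field is the acquisition date (YYYYMMDD)."""
    return datetime.strptime(display_id.split('_')[3], '%Y%m%d').date()


entity_date = acquisition_date_from_entity_id('LC80680172020070LGN00')
display_date = acquisition_date_from_display_id('LC08_L2SP_068017_20200310_20200822_02_T1')
print(entity_date, display_date, entity_date == display_date)
```

Both IDs resolve to the same day, which is a quick way to confirm that an entityId and a displayId refer to the same acquisition.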

<div class="alert alert-info" style="font-size: 14px;">
    <h4>
        Though it is possible to use a list of displayIds, you should optimize your workflow by using a list of <mark style="background-color: #ACD8E2; color: #0B4C5F;"><i>'entityIds'</i></mark> for the <mark style="background-color: #ACD8E2; color: #0B4C5F;"><i>'entityIds'</i></mark> parameter in the <a href="https://m2m.cr.usgs.gov/api/docs/reference/#scene-list-add" target="_blank"><b><i>scene-list-add</i></b></a> request. This eliminates the need for individual entityId lookups for each displayId, significantly improving overall efficiency.
    </h4>
</div>

In [None]:
idField = 'entityId'

entityIds = []

for result in scenes['results']:
    # Add this scene to the download list if bulk download is available
    if result['options']['bulk']:
        entityIds.append(result[idField])
    
entityIds

### **4.2 Add scenes to a list using [*scene-list-add*](https://m2m.cr.usgs.gov/api/docs/reference/#scene-list-add)**

In [None]:
listId = f"temp_{datasetName}_list" # customized list id
scn_list_add_payload = {
    "listId": listId,
    'idField' : idField,
    "entityIds": entityIds,
    "datasetName": datasetName
}
scn_list_add_payload

- #### The ***scene-list-add*** returns the number of scenes added to the list

In [None]:
count = sendRequest(serviceUrl + "scene-list-add", scn_list_add_payload, headers) 
count

- #### You can view the items added to the scene-list using [***scene-list-get***](https://m2m.cr.usgs.gov/api/docs/reference/#scene-list-get)

In [None]:
sendRequest(serviceUrl + "scene-list-get", {'listId' : scn_list_add_payload['listId']}, headers) 

## **5. Get [*download-options*](https://m2m.cr.usgs.gov/api/docs/reference/#download-options)**<a id="downloadoptions"></a>
The [***download-options***](https://m2m.cr.usgs.gov/api/docs/reference/#download-options) request is used to discover downloadable products for each dataset. A ***'datasetName'*** parameter is required. The ***listId*** created above is used in the request.

In [None]:
download_opt_payload = {
    "listId": listId,
    "datasetName": datasetName
}

if fileType == 'band_group':
    download_opt_payload['includeSecondaryFileGroups'] = True

download_opt_payload

In [None]:
products = sendRequest(serviceUrl + "download-options", download_opt_payload, headers)
pd.json_normalize(products)

## **6. Set [*dataset-file-groups*](https://m2m.cr.usgs.gov/api/docs/reference/#dataset-file-groups) Ids** <a id="setdatafilegroups"></a>
In this script, this step is only needed for ***band_group*** downloads.
Given a ***datasetName***, the [***dataset-file-groups***](https://m2m.cr.usgs.gov/api/docs/reference/#dataset-file-groups) request lists all configured file groups for a dataset collection.

In [None]:
filegroups = sendRequest(serviceUrl + "dataset-file-groups", {'datasetName' : datasetName}, headers)  
pd.json_normalize(filegroups['secondary'])

#### ***fileGroupIds*** 
The ***landsat_ot_c2_l2*** dataset has two file groups:
- ***ls_c2l2_st_band*** (Landsat Collection-2 Level-2 Surface Temperature Bands)
- ***ls_c2l2_sr_band*** (Landsat Collection-2 Level-2 Surface Reflectance Bands)
    

In [None]:
fileGroupIds = {"ls_c2l2_sr_band"}

## **7. Select Products for Downloading**<a id="selectproducts"></a>
The script below uses the [**dataset**](https://m2m.cr.usgs.gov/api/docs/reference/#dataset) endpoint, which retrieves a dataset by id or name, to look up the secondary dataset for ***band_group*** downloads.
> Note that ***'products'*** are results from the endpoint [***download-options***](https://m2m.cr.usgs.gov/api/docs/reference/#download-options).

In [None]:

# Select products
print("Selecting products...")
downloads = []
if fileType == 'bundle':
    # Select bundle files
    print("    Selecting bundle files...")
    for product in products:        
        if product["bulkAvailable"] and product['downloadSystem'] != 'folder':               
            downloads.append({"entityId":product["entityId"], "productId":product["id"]})


elif fileType == 'band':
    # Select band files
    print("    Selecting band files...")
    for product in products:  
        if product["secondaryDownloads"] is not None and len(product["secondaryDownloads"]) > 0:
            for secondaryDownload in product["secondaryDownloads"]:
                for bandName in bandNames:
                    if secondaryDownload["bulkAvailable"] and bandName in secondaryDownload['displayId']:
                        downloads.append({"entityId":secondaryDownload["entityId"], "productId":secondaryDownload["id"]})


elif fileType == 'band_group':        
    # Get secondary dataset ID and file group IDs with the scenes
    print("    Checking for scene band groups and get secondary dataset ID and file group IDs with the scenes...")
    sceneFileGroups = []
    entityIds = []
    datasetId = None
    for product in products:  
        if product["secondaryDownloads"] is not None and len(product["secondaryDownloads"]) > 0:
            for secondaryDownload in product["secondaryDownloads"]:
                if secondaryDownload["bulkAvailable"] and secondaryDownload["fileGroups"] is not None:
                    if datasetId is None:
                        datasetId = secondaryDownload['datasetId']
                    for fg in secondaryDownload["fileGroups"]:                            
                        if fg not in sceneFileGroups:
                            sceneFileGroups.append(fg)
                        if secondaryDownload['entityId'] not in entityIds:
                            entityIds.append(secondaryDownload['entityId'])

    # Send dataset request to get the secondary dataset name by the dataset ID
    data_req_payload = {
        "datasetId": datasetId,
    }
    results = sendRequest(serviceUrl + "dataset", data_req_payload, headers)
    secondaryDatasetName = results['datasetAlias']

    # Add secondary scenes to a list
    secondaryListId = f"temp_{datasetName}_secondary_list" # customized list id
    sec_scn_add_payload = {
        "listId": secondaryListId,
        "entityIds": entityIds,
        "datasetName": secondaryDatasetName
    }

    print("    Adding secondary scenes to list...")
    count = sendRequest(serviceUrl + "scene-list-add", sec_scn_add_payload, headers)    
    print("    Added", count, "secondary scenes\n")

    # Compare the provided file groups Ids with the scenes' file groups IDs
    if fileGroupIds:
        fileGroups = []
        for fg in fileGroupIds:
            fg = fg.strip() 
            if fg in sceneFileGroups:
                fileGroups.append(fg)
    else:
        fileGroups = sceneFileGroups
else:
    # Select all available files
    for product in products:        
        if product["bulkAvailable"]:
            if product['downloadSystem'] != 'folder':            
                downloads.append({"entityId":product["entityId"], "productId":product["id"]})
            if product["secondaryDownloads"] is not None and len(product["secondaryDownloads"]) > 0:
                for secondaryDownload in product["secondaryDownloads"]:
                    if secondaryDownload["bulkAvailable"]:
                        downloads.append({"entityId":secondaryDownload["entityId"], "productId":secondaryDownload["id"]})            
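
The ***band*** branch above selects files with a substring test (`bandName in secondaryDownload['displayId']`), which can over-match when one band name is a prefix of another. The sketch below shows a stricter suffix test as one possible alternative. The `matches_band` helper and the sample `displayId` values are hypothetical; they assume secondary-download display IDs end in `_<band>.<extension>`, so check that against the actual ***download-options*** output before relying on it.

```python
def matches_band(display_id, band_names):
    """True if the file stem ends with '_<band>' for any requested band."""
    stem = display_id.rsplit('.', 1)[0]          # drop the file extension
    return any(stem.endswith('_' + band) for band in band_names)


sample_ids = [                                   # hypothetical displayId values
    'LC09_L2SP_004053_20240710_20240711_02_T1_SR_B3.TIF',
    'LC09_L2SP_004053_20240710_20240711_02_T1_SR_B5.TIF',
    'LC09_L2SP_004053_20240710_20240711_02_T1_ST_B10.TIF',
    'LC09_L2SP_004053_20240710_20240711_02_T1_ANG.txt',
]
wanted = {'SR_B3', 'SR_B5', 'ANG'}
selected = [d for d in sample_ids if matches_band(d, wanted)]
print(len(selected), "files selected")

# The substring test matches 'ST_B1' inside 'ST_B10'; the suffix test does not
print('ST_B1' in sample_ids[2], matches_band(sample_ids[2], {'ST_B1'}))
```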

## **8. Send [*download-request*](https://m2m.cr.usgs.gov/api/docs/reference/#download-request)**<a id="downloadrequest"></a>

In [None]:
if fileType != 'band_group':
    download_req2_payload = {
        "downloads": downloads,
        "label": label
    }
else:
    if len(fileGroups) > 0:
        download_req2_payload = {
            "dataGroups": [
                {
                    "fileGroups": fileGroups,
                    "datasetName": secondaryDatasetName,
                    "listId": secondaryListId
                }
            ],
            "label": label
        }
    else:
        print('No file groups found')
        sys.exit()

print("Sending download request ...")
download_request_results = sendRequest(serviceUrl + "download-request", download_req2_payload, headers)
print("Done sending download request")

if len(download_request_results['newRecords']) == 0 and len(download_request_results['duplicateProducts']) == 0:
    print('No records returned, please update your scenes or scene-search filter')
    sys.exit()


### **Using [*download-retrieve*](https://m2m.cr.usgs.gov/api/docs/reference/#download-retrieve) to get URLs and start downloading:**

The [***download-retrieve***](https://m2m.cr.usgs.gov/api/docs/reference/#download-retrieve) endpoint returns the available and requested downloads for a given ***label***. URLs that are ready are passed to the ***runDownload*** function defined above, and downloads that are still being prepared are checked again every 30 seconds.

In [None]:
# Attempt the download URLs
for result in download_request_results['availableDownloads']:       
    print(f"Get download url: {result['url']}\n" )
    runDownload(threads, result['url'])
    
preparingDownloadCount = len(download_request_results['preparingDownloads'])
preparingDownloadIds = []
if preparingDownloadCount > 0:
    for result in download_request_results['preparingDownloads']:  
        preparingDownloadIds.append(result['downloadId'])

    download_ret_payload = {"label" : label}                
    # Retrieve download URLs
    print("Retrieving download urls...\n")
    download_retrieve_results = sendRequest(serviceUrl + "download-retrieve", download_ret_payload, headers, False)
    if download_retrieve_results != False:
        print(f"    Retrieved: \n" )
        for result in download_retrieve_results['available']:
            if result['downloadId'] in preparingDownloadIds:
                preparingDownloadIds.remove(result['downloadId'])
                runDownload(threads, result['url'])
                print(f"       {result['url']}\n" )
            
        for result in download_retrieve_results['requested']:   
            if result['downloadId'] in preparingDownloadIds:
                preparingDownloadIds.remove(result['downloadId'])
                runDownload(threads, result['url'])
                print(f"       {result['url']}\n" )
    
    # Didn't get all download URLs, retrieve again after 30 seconds
    while len(preparingDownloadIds) > 0: 
        print(f"{len(preparingDownloadIds)} downloads are not available yet. Waiting for 30s to retrieve again\n")
        time.sleep(30)
        download_retrieve_results = sendRequest(serviceUrl + "download-retrieve", download_ret_payload, headers, False)
        if download_retrieve_results != False:
            for result in download_retrieve_results['available']:                            
                if result['downloadId'] in preparingDownloadIds:
                    preparingDownloadIds.remove(result['downloadId'])
                    print(f"    Get download url: {result['url']}\n" )
                    runDownload(threads, result['url'])
                    
print("\nDownloading files... Please do not close the program\n")
for thread in threads:
    thread.join()
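
The retry loop above keeps polling until every prepared download becomes available, with no upper limit on the number of attempts. A bounded variant is sketched below so the idea can be tried without network access. Here `poll_until_ready` and `fetch_available` are hypothetical names; `fetch_available` stands in for a ***download-retrieve*** call that returns the `available` list, and the `example.invalid` URLs are placeholders.

```python
import time


def poll_until_ready(fetch_available, pending_ids, interval=30,
                     max_attempts=20, sleep=time.sleep):
    """Poll until every id in pending_ids has a URL or the attempts run out.

    fetch_available: callable returning a list of {'downloadId', 'url'} dicts
    Returns (urls, still_pending).
    """
    pending = set(pending_ids)
    urls = []
    for _ in range(max_attempts):
        for item in fetch_available():
            if item['downloadId'] in pending:
                pending.remove(item['downloadId'])
                urls.append(item['url'])
        if not pending:
            break
        sleep(interval)                  # wait before asking again
    return urls, pending


# Simulated responses: the second download becomes available on the second poll
responses = iter([
    [{'downloadId': 1, 'url': 'https://example.invalid/a'}],
    [{'downloadId': 1, 'url': 'https://example.invalid/a'},
     {'downloadId': 2, 'url': 'https://example.invalid/b'}],
])
urls, left = poll_until_ready(lambda: next(responses), [1, 2],
                              sleep=lambda seconds: None)
print(urls, left)
```

Returning the IDs that never became available lets the caller report them instead of waiting indefinitely.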

## **9. Remove the scene list with [*scene-list-remove*](https://m2m.cr.usgs.gov/api/docs/reference/#scene-list-remove)**<a id="scenelistremove"></a>
Use [***scene-list-remove***](https://m2m.cr.usgs.gov/api/docs/reference/#scene-list-remove) to delete items from a specified list; a ***`listId`*** is required. Since no ***`datasetName`*** is provided, the entire list is removed. This is important because scene lists are preserved between sessions.

In [None]:
remove_scnlst_payload = {
    "listId": listId
}
sendRequest(serviceUrl + "scene-list-remove", remove_scnlst_payload, headers)

if fileType == 'band_group':    
    # Remove the secondary scene list
    remove_scnlst2_payload = {
        "listId": secondaryListId
    }
    sendRequest(serviceUrl + "scene-list-remove", remove_scnlst2_payload, headers)

## **10. [*Logout*](https://m2m.cr.usgs.gov/api/docs/reference/#logout)**<a id="logout"></a>
Log out so that the API key can no longer be used.

<div class="alert alert-block alert-warning">
    <h4> <b>NOTE:</b> The Machine-to-Machine API key expires after 2 hours of inactivity. </h4>
</div>  

In [None]:
# endpoint = "logout"  
# if sendRequest(serviceUrl + endpoint, None, headers) is None:
#     print("\nLogged Out\n")
# else:
#     print("\nLogout Failed\n")

## **11. List Downloads**

In [None]:
os.listdir(data_dir)


<div class="alert alert-block alert-info">
    <h1> Contact Information </h1>
    <h3> Material written by Tonian Robinson<sup>1</sup> </h3>
    <ul>
        <b>Contact:</b> custserv@usgs.gov <br> 
        <b>Voice:</b> +1-605-594-6151 <br>
        <b>Organization:</b> USGS EROS User Services <br>
        <b>Date last modified:</b> 01-Jul-2024 <br>
    </ul>
    
<sup>1</sup>Earth Space Technology Services LLC., contractor to the U.S. Geological Survey, Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota, 57198-001, USA. Work performed under USGS contract G0121D0001.
</div>