**Source: https://stac.astrogeology.usgs.gov/docs/tutorials/advanced_python/**
In this tutorial we use the API to query several databases (STAC collections) and then retrieve URLs that meet a given criterion. For this example, the criterion is the incidence angle of each image. Let's start by setting up our file. We will assume we want images that fall within a certain bounding box.

In [5]:
from pystac_client import Client
import pandas as pd
import geopandas as gpd
from shapely.geometry import shape
pd.set_option("display.max_colwidth", 150)
import pvl
import aiohttp
import asyncio

# Set the pystac_client logger level (use logging.DEBUG to see API calls)
import logging
logging.basicConfig()
logger = logging.getLogger('pystac_client') 
logger.setLevel(logging.INFO)

# Bounding box for this example: [min lon, min lat, max lon, max lat]
bounding_box = [3, 4, 5, 6]

# Use async to define how to get incidence angle
async def get_incidence(session, url):
    async with session.get(url) as response:
        lbl_as_text = await response.text()
        lbl = pvl.loads(lbl_as_text)
        incidence_angle = lbl['Caminfo']['Geometry']['IncidenceAngle']
        return incidence_angle

loop = asyncio.get_event_loop()

# Setting up the catalog
catalog = Client.open("https://stac.astrogeology.usgs.gov/api/")
print(catalog)

<Client id=usgs-stac-api>
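
Outside a notebook there is no running event loop, so a coroutine such as `get_incidence` has to be driven explicitly. The sketch below shows one way to do that with `asyncio.run` and `asyncio.gather`, capping concurrency with a semaphore. To stay runnable offline, `fake_fetch_angle` is a hypothetical stand-in for `get_incidence`, and the URLs and the `MAX_CONCURRENT` value are made up for illustration.

```python
import asyncio

MAX_CONCURRENT = 5  # assumed limit; tune to what the server tolerates


async def fake_fetch_angle(url: str) -> float:
    """Hypothetical stand-in for get_incidence: no network access."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return float(len(url))     # dummy "angle" derived from the URL


async def bounded_fetch(semaphore: asyncio.Semaphore, url: str) -> float:
    # At most MAX_CONCURRENT coroutines are inside this block at once.
    async with semaphore:
        return await fake_fetch_angle(url)


async def fetch_all(urls: list[str]) -> list[float]:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    tasks = [bounded_fetch(semaphore, url) for url in urls]
    # gather returns results in the same order as the input awaitables.
    return await asyncio.gather(*tasks)


# Placeholder URLs for illustration only.
example_urls = [f"https://example.invalid/label_{i}.pvl" for i in range(12)]
angles = asyncio.run(fetch_all(example_urls))
print(angles)
```

Because `gather` preserves input order, the resulting list can be assigned straight back to a column built from the same sequence of URLs.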


With this setup code in place, we can create a GeoDataFrame to store the results of the query: for example, the link to the image, the image ID, the geometry, and the incidence angle. In the example below, the data is stored in `items_gdf`.

In [6]:
# Names of the databases (STAC collections) you want to retrieve images from
databases = [
    'kaguya_terrain_camera_stereoscopic_uncontrolled_observations',
    'kaguya_terrain_camera_spsupport_uncontrolled_observations',
    'kaguya_terrain_camera_monoscopic_uncontrolled_observations',
]
results = []
result_df = pd.DataFrame()
# Iterate through all the Kaguya databases
for database in databases:
    result = catalog.search(collections=[database], bbox=bounding_box, max_items=200)
    results.append(result)

# Get all items as a dictionary
items = []
for r in results:
    items += r.item_collection_as_dict()['features']
for i in items:
    i['geometry'] = shape(i['geometry'])
items_gdf = gpd.GeoDataFrame(pd.json_normalize(items))

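# Use async to find the incidence angles.
# Top-level `async with`/`await` works in Jupyter/IPython; in a plain script,
# wrap this block in an async function and run it with asyncio.run().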
async with aiohttp.ClientSession() as session:
    tasks = []
    for url in items_gdf['assets.caminfo_pvl.href'].values:
        tasks.append(asyncio.ensure_future(get_incidence(session, url)))
    angles = await asyncio.gather(*tasks)

items_gdf['incidence_angle'] = angles
print(items_gdf)

       type stac_version                         id  \
0   Feature        1.0.0  TC1W2B0_01_05187N056E0034   
1   Feature        1.0.0  TC2W2B0_01_05187N061E0034   
2   Feature        1.0.0  TC1W2B0_01_05187N043E0034   
3   Feature        1.0.0  TC2W2B0_01_05187N047E0034   
4   Feature        1.0.0  TC2W2B0_01_05187N034E0035   
..      ...          ...                        ...   
65  Feature        1.0.0  TC1S2B0_01_05522N064E0040   
66  Feature        1.0.0  TC1S2B0_01_05522N050E0040   
67  Feature        1.0.0  TC1S2B0_01_05522N037E0040   
68  Feature        1.0.0  TC1S2B0_01_05521N057E0050   
69  Feature        1.0.0  TC1S2B0_01_05521N043E0050   

                                                                                                                                                 geometry  \
0   POLYGON ((2.84276 4.853, 2.8417 4.90848, 2.83955 5.04374, 2.83763 5.17901, 2.83614 5.31456, 2.83379 5.44859, 2.83141 5.58266, 2.82948 5.71775, 2.8...   
1   POLYGON ((2.83876 5.3
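
With the incidence angles attached, the selection step described at the start of the tutorial becomes a simple filter. The sketch below uses plain dictionaries so that it runs without pandas. The records, the URLs, and the 30 to 60 degree window are all invented for illustration.

```python
# Hypothetical records mimicking the columns used in this tutorial.
records = [
    {"id": "image_a", "incidence_angle": 12.5,
     "assets.image.href": "https://example.invalid/a.tif"},
    {"id": "image_b", "incidence_angle": 34.0,
     "assets.image.href": "https://example.invalid/b.tif"},
    {"id": "image_c", "incidence_angle": 58.2,
     "assets.image.href": "https://example.invalid/c.tif"},
    {"id": "image_d", "incidence_angle": 71.9,
     "assets.image.href": "https://example.invalid/d.tif"},
]

MIN_ANGLE = 30.0  # assumed lower bound, in degrees
MAX_ANGLE = 60.0  # assumed upper bound, in degrees


def within_window(record: dict, low: float, high: float) -> bool:
    """Return True when the record's incidence angle lies in [low, high]."""
    return low <= record["incidence_angle"] <= high


# Keep only the records that satisfy the criterion, then collect their URLs.
selected = [r for r in records if within_window(r, MIN_ANGLE, MAX_ANGLE)]
selected_urls = [r["assets.image.href"] for r in selected]
print(selected_urls)
```

On the GeoDataFrame itself, an equivalent selection would be `items_gdf[items_gdf['incidence_angle'].between(MIN_ANGLE, MAX_ANGLE)]`, assuming the column holds plain numbers.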

Now that we have all the images that intersect our bounding box, we want to download them locally. All of these images are hosted in the cloud, so this is a great time to use rclone. First, let's build the paths to where the data is hosted in the cloud.

In [3]:
with open("kaguyatc_images_to_download.txt", 'w') as f:
    for index in range(len(items_gdf)):
        old_url = items_gdf['assets.image.href'].iloc[index]
        updated_url = f"s3_noauth:/{old_url.split('/')[2].split('.')[0]}/{'/'.join(old_url.split('/')[3:])}\n"
        f.write(updated_url)
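
The string slicing in the cell above can be easier to follow as a small helper. The sketch below produces the same `remote:/bucket/key` form using `urllib.parse`. It assumes virtual-hosted-style URLs, where the bucket is the first label of the host name, so it would not suit bucket names that contain dots. The function name `to_rclone_path` and the example bucket and key are hypothetical.

```python
from urllib.parse import urlsplit


def to_rclone_path(url: str, remote: str = "s3_noauth") -> str:
    """Convert an HTTPS S3 URL into an rclone path on the given remote."""
    parts = urlsplit(url)
    # Assumption: the bucket is the first label of the host name.
    bucket = parts.netloc.split(".")[0]
    # The object key is the URL path without its leading slash.
    key = parts.path.lstrip("/")
    return f"{remote}:/{bucket}/{key}"


# Placeholder URL for illustration only.
example = "https://example-bucket.s3.us-west-2.amazonaws.com/some/prefix/image.tif"
print(to_rclone_path(example))
```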

Here `s3_noauth` is the name of the remote as defined in your rclone config file. rclone needs an `rclone.conf` file to know how to reach the bucket. It is normally stored in your home directory at `~/.config/rclone/rclone.conf` (running `rclone config file` prints the exact location). Open it in an editor, for example `vim ~/.config/rclone/rclone.conf`, and insert the following:

[s3_noauth]
type = s3
provider = AWS
env_auth = false
region = us-west-2
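
If you would rather create this entry from Python than edit the file by hand, the standard-library `configparser` module can write the same INI section. This is only a sketch: it writes to a temporary scratch file (an assumption made here so that an existing `rclone.conf` is never touched), not to the real config location.

```python
import configparser
import tempfile
from pathlib import Path

# Same keys and values as the [s3_noauth] section shown above.
config = configparser.ConfigParser()
config["s3_noauth"] = {
    "type": "s3",
    "provider": "AWS",
    "env_auth": "false",
    "region": "us-west-2",
}

# Write to a scratch location for this sketch, not the real rclone.conf.
conf_path = Path(tempfile.mkdtemp()) / "rclone.conf"
with conf_path.open("w") as f:
    config.write(f)

print(conf_path.read_text())
```

rclone can be pointed at a non-default file with its global `--config` option, which is handy for trying out a generated config before merging it into your own.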

Now rclone is set up and ready to use! There are Python packages that wrap rclone, but they are a bit clunky and not as well supported. Instead, we recommend creating the following script to download your files quickly and easily. Save it as `download_files.sh`:

#!/bin/bash

input_file=$1
destination=$2

while IFS= read -r file; do
    echo "${file}"
    rclone copy "${file}" "${destination}"
done < "${input_file}"

### .bat form for Windows

@echo off
set input_file=%1
set destination=%2

for /f "usebackq delims=" %%f in ("%~1") do (
    echo %%f
    rclone copy "%%f" "%~2"
)

It can then be run with a command like:

```
bash download_files.sh kaguyatc_images_to_download.txt /path/to/download/folder
```
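
If you want to stay in Python without a wrapper package, the same loop can be driven through `subprocess`. The sketch below builds one `rclone copy` command per line of the list file. The `execute` flag is our own addition and defaults to `False`, so the commands are only printed and the example runs even where rclone is not installed. The demo list file and its bucket paths are placeholders.

```python
import subprocess
import tempfile
from pathlib import Path


def build_commands(list_file: Path, destination: str) -> list[list[str]]:
    """Return one `rclone copy` argument list per non-blank line."""
    commands = []
    for line in list_file.read_text().splitlines():
        source = line.strip()
        if source:  # skip blank lines
            commands.append(["rclone", "copy", source, destination])
    return commands


def download_all(list_file: Path, destination: str, execute: bool = False) -> None:
    """Print each command; run it only when execute is True."""
    for command in build_commands(list_file, destination):
        print(" ".join(command))
        if execute:
            # check=True raises CalledProcessError if rclone exits non-zero.
            subprocess.run(command, check=True)


# Demo with a throwaway list file containing placeholder paths.
demo_list = Path(tempfile.mkdtemp()) / "kaguyatc_images_to_download.txt"
demo_list.write_text(
    "s3_noauth:/example-bucket/a.tif\n\ns3_noauth:/example-bucket/b.tif\n"
)
download_all(demo_list, "/path/to/download/folder")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.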

That's it! You have successfully downloaded images using rclone and async. Hopefully this will speed up your downloading and fetching processes.