# Finding Landsat Scenes

This tutorial uses the USGS Machine-to-Machine (M2M) API to find Landsat scenes and then download them. 

To use the M2M service, you first need to request access (https://m2m.cr.usgs.gov/api/docs/json/).


In [None]:
import rsgislib.dataaccess.usgs_m2m
import rsgislib.tools.utils
import rsgislib.tools.httptools
import datetime
import os

## Username and Password

You need to be careful with your username and password, so you should not write them into the notebook. RSGISLib provides a tool and functions for applying a basic encoding to the username and password so that they are not stored as plain text. Note that this simple encoding is not secure, so the encoded file still needs to be kept private.

To create the encoded file, you can use the command line tool `rsgisuserpassfile.py` as shown below:

`rsgisuserpassfile.py userinfo.txt`

Once you have created the `userinfo.txt` file, you can read it into your notebook/script using the `get_username_password` function shown below:

In [None]:
username, password = rsgislib.tools.utils.get_username_password(input_file = "userinfo.txt")
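
To illustrate why a simple encoding is not a security measure, the sketch below uses base64 purely as a stand-in. It is an assumption that the RSGISLib encoding behaves in a comparable way; the actual scheme may differ. The credential shown is made up.

```python
import base64

# Hypothetical credential, for illustration only.
secret = "my-password"

# A reversible encoding hides the text from a casual glance...
encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")

# ...but anyone who has the encoded text can decode it without a key.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded == secret)
```

In practice this means `userinfo.txt` should be treated like a plain-text password file: keep it out of version control and restrict who can read it.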

## API Key

For all M2M calls, you need to pass your API key, which is generated for your session from your username and password. Note that if the `RSGIS_USGS_USER` and `RSGIS_USGS_PASS` environment variables are defined on the system you are using, you do not need to pass the username and password into the function.

The API key is generated using the following function:

In [None]:
api_key = rsgislib.dataaccess.usgs_m2m.usgs_login(username=username, password=password)
api_key
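
If you rely on the environment variables instead of passing credentials, it can help to confirm they are set before logging in. A minimal sketch; the helper name `usgs_credentials_in_env` is hypothetical and not part of RSGISLib:

```python
import os


def usgs_credentials_in_env(environ=os.environ):
    """Return True if both USGS credential variables are set and non-empty."""
    names = ("RSGIS_USGS_USER", "RSGIS_USGS_PASS")
    return all(environ.get(name) for name in names)


# Reports whether the current environment already holds both values.
print(usgs_credentials_in_env())
```

The function only checks that the variables exist; it does not validate the credentials against the USGS service.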

## Search by WRS2 Row/Path

I commonly search by Landsat WRS2 row and path. To do that, first find the latitude and longitude at the centre of the row/path of interest, then search using that point. Alternatively, you can search by a bounding box or any other point.

For this example, we'll search for row 24, path 204, which is one of the scenes covering Aberystwyth.

In [None]:
lon, lat = rsgislib.dataaccess.usgs_m2m.get_wrs_pt(
        api_key=api_key, row=24, path=204, grid_version=2
    )
print(f"{lon}, {lat}")

## Search for Scenes

The datasets which are available are listed below:

| Dataset Name | Dataset ID |
|-|-|
| Landsat 4/5 TM Collection 1 Level 1 | `landsat_tm_c1` |
| Landsat 4/5 TM Collection 2 Level 1 | `landsat_tm_c2_l1` |
| Landsat 4/5 TM Collection 2 Level 2 | `landsat_tm_c2_l2` |
| Landsat 7 ETM+ Collection 1 Level 1 | `landsat_etm_c1` |
| Landsat 7 ETM+ Collection 2 Level 1 | `landsat_etm_c2_l1` |
| Landsat 7 ETM+ Collection 2 Level 2 | `landsat_etm_c2_l2` |
| Landsat 8 Collection 1 Level 1 | `landsat_8_c1` |
| Landsat 8/9 Collection 2 Level 1 | `landsat_ot_c2_l1` |
| Landsat 8/9 Collection 2 Level 2 | `landsat_ot_c2_l2` |
| Sentinel 2A | `sentinel_2a` |

You can also specify minimum and maximum cloud cover thresholds, start and end dates, and months of interest, alongside the point or bounding box to be searched.

The function below can then be used to search for scenes:

In [None]:
# Specify the dataset you want to download:
usgs_dataset = "landsat_ot_c2_l2"

In [None]:
scn_lst, meta_data_dict = rsgislib.dataaccess.usgs_m2m.usgs_search(
        dataset=usgs_dataset,
        api_key=api_key,
        start_date=datetime.datetime(year=2020, month=4, day=1),
        end_date=datetime.datetime(year=2020, month=6, day=30),
        cloud_min=0,
        cloud_max=20,
        pt=[lat, lon],
        bbox=None,
        poly_geom=None,
        months=None,
        full_meta=False,
        max_n_rslts=10,
        start_n=None,
    )
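
The search above takes a single start/end pair. If you want the same season across several years, one option is to build one date window per year and run a search for each. The sketch below only builds the windows; the helper name `seasonal_windows` is hypothetical, and the `months` argument of `usgs_search` may be a more direct route (its format is not covered here).

```python
import datetime


def seasonal_windows(first_year, last_year, start_month_day=(4, 1), end_month_day=(6, 30)):
    """Return one (start, end) datetime pair per year, inclusive of both years."""
    windows = []
    for year in range(first_year, last_year + 1):
        start = datetime.datetime(year, *start_month_day)
        end = datetime.datetime(year, *end_month_day)
        windows.append((start, end))
    return windows


# April to June for 2018, 2019 and 2020.
for start, end in seasonal_windows(2018, 2020):
    print(start.date(), end.date())
```

Each pair could then be passed as `start_date` and `end_date` in a separate search call.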

In [None]:
scn_lst

## Get Download IDs

The next step is to get a list of download IDs for the scenes, which is done using the function below:


In [None]:
scn_dsp_ids, scn_ent_ids = rsgislib.dataaccess.usgs_m2m.get_download_ids(scn_lst)

In [None]:
scn_dsp_ids
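
Conceptually, the two lists pair a human-readable display ID with the entity ID used when building the download list. The sketch below shows that pairing on mock records, assuming each record carries `displayId` and `entityId` keys as M2M scene-search results do. The identifier values are made up, and this is not necessarily how `get_download_ids` is implemented.

```python
# Mock scene records; the identifier values are illustrative only.
mock_scn_lst = [
    {"displayId": "LC08_L2SP_204024_20200420_20200822_02_T1", "entityId": "LC82040242020111LGN00"},
    {"displayId": "LC08_L2SP_204024_20200522_20200820_02_T2", "entityId": "LC82040242020143LGN00"},
]

# Build the two parallel lists, keeping the record order.
display_ids = [scn["displayId"] for scn in mock_scn_lst]
entity_ids = [scn["entityId"] for scn in mock_scn_lst]

for dsp_id, ent_id in zip(display_ids, entity_ids):
    print(dsp_id, ent_id)
```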

## Create Scene List

Next, you need to create a download list on the USGS server. You need to provide a unique name for the list and the time period for which it should be stored. See the [RSGISLib documentation](http://rsgislib.org/dev/rsgislib_dataaccess.html#rsgislib.dataaccess.usgs_m2m.create_scene_list) for how the time period should be specified; I generally keep lists for 1 week, which is `P1W`.

In [None]:
# Define a unique base name for the search and download list.
dnwld_name = "aber_r24_p204"

In [None]:
scn_lst_add_info = rsgislib.dataaccess.usgs_m2m.create_scene_list(
            api_key=api_key,
            dataset=usgs_dataset,
            scn_ent_ids=scn_ent_ids,
            lst_name=f"{dnwld_name}_lst",
            lst_period="P1W",
        )
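
The `P1W` value has the form of an ISO 8601 duration. Assuming that is the format expected, the sketch below converts the simplest day and week forms into a `timedelta`, so you can estimate when a list would expire. The helper name is hypothetical, and months or years are deliberately not handled because they have no fixed length.

```python
import datetime
import re


def simple_duration_to_timedelta(period):
    """Convert a day- or week-only duration such as 'P1W' or 'P10D' to a timedelta."""
    match = re.fullmatch(r"P(\d+)([DW])", period)
    if match is None:
        raise ValueError(f"Unsupported duration: {period!r}")
    count, unit = int(match.group(1)), match.group(2)
    if unit == "W":
        return datetime.timedelta(weeks=count)
    return datetime.timedelta(days=count)


# Hypothetical creation time, used only to show the arithmetic.
created = datetime.datetime(2020, 7, 1)
print(created + simple_duration_to_timedelta("P1W"))
```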

## Check Scenes are available for Download

Once you have created the scene list, you need to get the download list: 

In [None]:
dwlds_lst = rsgislib.dataaccess.usgs_m2m.check_dwnld_opts(
            api_key=api_key,
            lst_name=f"{dnwld_name}_lst",
            dataset=usgs_dataset,
            dwnld_filetype="bundle",
            rm_lst=True,
        )

## Request Downloads

You can now request the download URLs:

In [None]:
avail_dwn_urls, prep_dwnld_ids = rsgislib.dataaccess.usgs_m2m.request_downloads(
            api_key=api_key,
            dwlds_lst=dwlds_lst,
            dwnld_label=f"{dnwld_name}_dwnld",
        )

## Get the List of URLs

The following loop extracts the URL for each scene ID and keeps only tier 1 (T1) scenes, because tier 2 (T2) data is often poorly registered. The result is a dictionary of URLs to be downloaded, keyed by output file name:

In [None]:
file_urls = dict()
for scn_id in scn_dsp_ids:
    print(scn_id)
    for url_id in avail_dwn_urls:
        if scn_id in avail_dwn_urls[url_id]:
            if "T1" in scn_id:
                file_urls[f"{scn_id}.tar"] = avail_dwn_urls[url_id]

In [None]:
file_urls
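
The tier check above relies on the Landsat product identifier convention, in which the underscore-separated fields hold the sensor and satellite, processing level, path and row, acquisition date, processing date, collection number and tier. A minimal sketch of splitting an ID into those fields; the function name is hypothetical and the example ID is made up for illustration:

```python
import datetime


def parse_landsat_display_id(display_id):
    """Split a Landsat Collection product ID into its named fields."""
    parts = display_id.split("_")
    if len(parts) != 7:
        raise ValueError(f"Unexpected product ID layout: {display_id!r}")
    sensor_sat, level, path_row, acq_date, _proc_date, collection, tier = parts
    return {
        "sensor_satellite": sensor_sat,
        "processing_level": level,
        "path": int(path_row[:3]),
        "row": int(path_row[3:]),
        "acquired": datetime.datetime.strptime(acq_date, "%Y%m%d").date(),
        "collection": collection,
        "tier": tier,
    }


info = parse_landsat_display_id("LC08_L2SP_204024_20200420_20200822_02_T1")
print(info["path"], info["row"], info["acquired"], info["tier"])
```

Comparing `info["tier"] == "T1"` is a stricter test than a substring match, since it only looks at the tier field.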

## Create a Database of the URLs to download

When there is a long list of URLs to download, it is useful to keep track of whether each scene has been successfully downloaded. To do this, RSGISLib uses the [pysondb](https://github.com/pysonDB/pysonDB) module, which will need to be installed on your system.

In [None]:
ls_scns_file = "aber_r24_p204_scns.json"
rsgislib.tools.httptools.create_file_listings_db(
    db_json=ls_scns_file,
    file_urls=file_urls,
)
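
To make the idea of a download-tracking file concrete, here is a minimal stand-in written with only the standard library. It is not the pysondb or RSGISLib schema: the field names (`filename`, `url`, `downloaded`) and both helper functions are hypothetical.

```python
import json
import os
import tempfile


def write_tracking_file(path, file_urls):
    """Write one record per file, each initially marked as not downloaded."""
    records = [
        {"filename": name, "url": url, "downloaded": False}
        for name, url in file_urls.items()
    ]
    with open(path, "w") as f:
        json.dump(records, f, indent=2)


def pending_files(path):
    """Return the file names that are still waiting to be downloaded."""
    with open(path) as f:
        records = json.load(f)
    return [rec["filename"] for rec in records if not rec["downloaded"]]


# Demo with a placeholder URL, written to a temporary directory.
demo_urls = {"scene_a.tar": "https://example.com/scene_a.tar"}
demo_path = os.path.join(tempfile.mkdtemp(), "demo_scns.json")
write_tracking_file(demo_path, demo_urls)
print(pending_files(demo_path))
```

A downloader built on this pattern would set `downloaded` to true after each successful transfer, so that a rerun skips completed files.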

## Log Out of the USGS M2M System

Finally, you should log out of the USGS M2M system using the function below:

In [None]:
rsgislib.dataaccess.usgs_m2m.usgs_logout(api_key)

## Perform Download

The final stage is to actually download the data requested. This is performed using the following functions:

### Create Output Directory

In [None]:
out_dwnld_dir = "aber_r24_p204"
# Create the output directory if it does not already exist.
os.makedirs(out_dwnld_dir, exist_ok=True)

### Run Download

RSGISLib has a function which downloads the URLs within the [pysondb](https://github.com/pysonDB/pysonDB) JSON database and records those which were successfully downloaded. You can open the database file (`aber_r24_p204_scns.json`) in any text editor to check which files have been downloaded, or just run the function again and it will only try to download the scenes which have not already been downloaded.

In [None]:
rsgislib.tools.httptools.download_http_files_use_lst_db(
    db_json=ls_scns_file,
    out_dir_path=out_dwnld_dir,
    http_user=None,
    http_pass=None,
    use_wget=False,
    wget_time_out=60,
    check_file_exists=True,
)
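
After the download has finished, a quick sanity check is to confirm that each `.tar` file in the output directory opens as a tar archive. This is a sketch using only the standard library; the helper name `find_invalid_tars` is hypothetical, and the check does not prove that an archive is complete.

```python
import os
import tarfile


def find_invalid_tars(dir_path):
    """Return the names of .tar files in dir_path that are not readable tar archives."""
    invalid = []
    for name in sorted(os.listdir(dir_path)):
        if name.endswith(".tar"):
            full_path = os.path.join(dir_path, name)
            if not tarfile.is_tarfile(full_path):
                invalid.append(name)
    return invalid


# Only runs the check if the download directory from this tutorial exists.
if os.path.isdir("aber_r24_p204"):
    print(find_invalid_tars("aber_r24_p204"))
```

Any file reported here is a candidate for deleting and downloading again.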