# End-to-End Tests

End-to-end tests benchmark HTTP requests to an xarray tile server. The tile server under test is [https://dev-titiler-xarray.delta-backend.com/](https://dev-titiler-xarray.delta-backend.com/), which is deployed following the instructions in [https://github.com/developmentseed/titiler-xarray](https://github.com/developmentseed/titiler-xarray).

Tests are run locally using [Locust](https://locust.io/).

## Setup

Commands below are run within this directory.

```bash
python -m pip install --upgrade virtualenv
virtualenv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

### Import Libraries + Define Helper Functions

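```python
import pandas as pd


def csv_to_pandas(file_path, drop_cols=None, sort_by=None):
    """Load a CSV into a DataFrame, optionally dropping columns and sorting rows."""
    df = pd.read_csv(file_path)
    if drop_cols:
        df = df.drop(columns=drop_cols)
    if sort_by:
        df = df.sort_values(sort_by)
    return df
```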

## Generate Dataset Specs and Lists of URLs to Test

A variety of datasets is used to demonstrate how resolution, shape, and chunk size affect performance for different tiles. The datasets are defined in `gen_test_urls.py`, which generates the dataset information in the table below as well as the lists of URLs to test, stored in `urls/`.

To modify the datasets used, update `gen_test_urls.py` and re-run the script. You can skip running `gen_test_urls.py` if the list of URLs has already been generated (see `urls/{collection_name}_urls.txt`) and is up to date with any changes in `gen_test_urls.py`.
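One way to decide whether the URL lists need regenerating is to compare file modification times. The sketch below is only a heuristic and is not part of this repository: the helper name `urls_are_current` is hypothetical, and it assumes the generator script and the `urls/` directory sit in the current directory, as described above.

```python
from pathlib import Path


def urls_are_current(script_path="gen_test_urls.py", urls_dir="urls"):
    """Return True if URL lists exist and none is older than the generator script.

    Hypothetical helper: modification time is only a proxy for "up to date".
    """
    script = Path(script_path)
    url_files = sorted(Path(urls_dir).glob("*_urls.txt"))
    if not script.exists() or not url_files:
        return False
    script_mtime = script.stat().st_mtime
    return all(f.stat().st_mtime >= script_mtime for f in url_files)


if __name__ == "__main__":
    if urls_are_current():
        print("URL lists look current; gen_test_urls.py can be skipped")
    else:
        print("URL lists are missing or stale; re-run gen_test_urls.py")
```

A modification-time check cannot detect changes made outside `gen_test_urls.py` (for example, a dataset that changed upstream), so re-running the script remains the safe choice when in doubt.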

Note: for the FWI-GEOS-5-Hourly dataset (or any dataset in veda-data-store and veda-data-store-staging), the `gen_test_urls.py` script requires data access via a role from the SMCE VEDA AWS account.

If you have role-based access to those buckets, assume that role (using MFA) before running the script.

Otherwise, skip this dataset or contact the SMCE team for access. Without access, `gen_test_urls.py` simply skips that dataset via a try/except statement.
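The skip-on-failure behaviour can be illustrated with a short sketch. This is not the code in `gen_test_urls.py`; the function `collect_dataset_specs`, the `fake_open` opener, and the example names and URLs are all hypothetical and exist only to show the pattern.

```python
def collect_dataset_specs(sources, open_dataset):
    """Open each source with `open_dataset`, skipping any source that raises."""
    specs, skipped = {}, []
    for name, url in sources.items():
        try:
            specs[name] = open_dataset(url)
        except Exception as err:  # e.g. missing credentials for a restricted bucket
            print(f"Skipping {name}: {err}")
            skipped.append(name)
    return specs, skipped


def fake_open(url):
    """Stand-in opener (assumption): pretends the VEDA buckets are restricted."""
    if url.startswith("s3://veda-data-store"):
        raise PermissionError("access denied")
    return {"source": url}


# Hypothetical dataset names and URLs, for illustration only.
sources = {
    "restricted-example": "s3://veda-data-store-staging/example.zarr",
    "public-example": "s3://example-public-bucket/example.zarr",
}
specs, skipped = collect_dataset_specs(sources, fake_open)
```

Catching a broad `Exception` keeps the run going for the remaining datasets, at the cost of also hiding unrelated errors, so printing the reason for each skip is worthwhile.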

```bash
mkdir -p urls
python gen_test_urls.py
```

## Inspect Datasets

```python
df = csv_to_pandas('./zarr_info.csv', drop_cols=['variable', 'compression'], sort_by=['lat_resolution'])
df = df[['source', 'lat_resolution', 'lon_resolution', 'shape', 'chunks', 'chunk_size_mb', 'number_coord_chunks']]
df
```

| | source | lat_resolution | lon_resolution | shape | chunks | chunk_size_mb | number_coord_chunks |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | s3://yuvipanda-test1/cmr/gpm3imergdl.zarr | 0.1 | 0.1 | {'time': 8149, 'lon': 3600, 'lat': 1800} | {'time': 10, 'lon': 3600, 'lat': 1800} | 247.192383 | 3 |
| 0 | s3://veda-data-store-staging/EIS/zarr/FWI-GEOS... | 0.25 | 0.3125 | {'time': 26880, 'lat': 533, 'lon': 1152} | {'time': 120, 'lat': 100, 'lon': 100} | 4.577637 | 3 |
| 3 | https://ncsa.osn.xsede.org/Pangeo/pangeo-forge... | 0.25 | 0.25 | {'time': 15044, 'zlev': 1, 'lat': 720, 'lon': ... | {'time': 1, 'zlev': 1, 'lat': 720, 'lon': 1440} | 1.977539 | 4 |
| 1 | s3://power-analysis-ready-datastore/power_901_... | 0.5 | 0.625 | {'time': 492, 'lat': 361, 'lon': 576} | {'time': 492, 'lat': 25, 'lon': 25} | 2.346039 | 43 |
| 2 | s3://cmip6-pds/CMIP6/CMIP/NASA-GISS/GISS-E2-1-... | 2.0 | 2.5 | {'time': 1980, 'lat': 90, 'lon': 144} | {'time': 600, 'lat': 90, 'lon': 144} | 29.663086 | 3 |
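The `chunk_size_mb` values listed above appear consistent with the uncompressed chunk size: the number of elements in one chunk, times the bytes per element, divided by 2^20. The sketch below checks this for two rows. The helper `chunk_size_mb` is hypothetical (it is not taken from `gen_test_urls.py`), and the 4-byte element size is an assumption, since the listing does not show the data type.

```python
from math import prod


def chunk_size_mb(chunks, itemsize):
    """Uncompressed size of one chunk: element count times bytes per element, over 2**20."""
    return prod(chunks.values()) * itemsize / 2**20


# Chunk shapes copied from the listing above; itemsize=4 (e.g. float32) is an assumption.
gpm = chunk_size_mb({"time": 10, "lon": 3600, "lat": 1800}, itemsize=4)
fwi = chunk_size_mb({"time": 120, "lat": 100, "lon": 100}, itemsize=4)
print(round(gpm, 6), round(fwi, 6))  # 247.192383 4.577637
```

Under that assumption, both results match the listed values, which suggests the column reports in-memory chunk size rather than the compressed size stored on disk.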


## Run Locust and Inspect Results

```bash
./run-all.sh
```

```bash
# Upload results to S3 (--recursive copies every file under results/)
aws s3 cp results/ s3://nasa-eodc-data-store/e2e/results/$(date +"%Y-%m-%d_%H_%M_%S")/ --recursive
```