# Tile Server End-to-End Benchmarks

In addition to code benchmarks, it is useful to have end-to-end (e2e) benchmarks that demonstrate the performance of the tile server. Instructions for running them are at https://github.com/developmentseed/tile-benchmarking/tree/main/03-e2e/README.md.

Below, we include an example of how to plot results from one execution of these benchmarks.

In [1]:
# download results from s3
!aws s3 cp --recursive s3://nasa-eodc-data-store/tile-benchmarking-results/2023-09-19_22-18-25/ downloaded_results/

fatal error: An error occurred (InvalidAccessKeyId) when calling the ListObjectsV2 operation: The AWS Access Key Id you provided does not exist in our records.



Median response time is the median over multiple requests (usually 10) for the same tile, that is, the same x, y, and z parameters. Tested tiles ranged from zoom 0 to zoom 6. In testing, the most important factors in determining response time were the number of coordinate chunks and the chunk size.

**Based on this, the recommendation would be to create chunks as small as possible (e.g. 1 timestep per chunk for the full spatial extent if visualization is the main goal) and not to chunk coordinate data.**
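To make "chunk size" concrete, the following is a minimal sketch of the arithmetic behind the recommendation. The chunk shapes, the 600 x 1440 grid, and the 4-byte (float32) item size are assumptions chosen for illustration, and the sizes are uncompressed; they are not measurements from the benchmarked datasets.

```python
from math import prod

def chunk_nbytes(chunk_shape, itemsize=4):
    """Return the uncompressed size in bytes of one chunk.

    itemsize=4 assumes float32 values (an assumption, not taken from the datasets).
    """
    return prod(chunk_shape) * itemsize

# Hypothetical chunk shapes in (time, lat, lon) order for a 600 x 1440 grid.
one_timestep = chunk_nbytes((1, 600, 1440))
many_timesteps = chunk_nbytes((365, 600, 1440))

print(f"1 timestep per chunk:    {one_timestep / 2**20:.1f} MiB")
print(f"365 timesteps per chunk: {many_timesteps / 2**20:.1f} MiB")
```

Under these assumptions, a chunk holding a single timestep is 365 times smaller than one holding a year of daily timesteps, even though a rendered tile shows only one timestep.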

In [2]:
# Import libraries
import os
import pandas as pd
import hvplot.pandas
import holoviews as hv
pd.options.plotting.backend = 'holoviews'
import warnings
warnings.filterwarnings('ignore')




Parse results into a dataframe.

In [3]:
# Specify the directory path and the suffix
directory_path = "downloaded_results/"
suffix = "_urls_stats.csv"  # one stats CSV per benchmarked dataset

# List all files in the directory
all_files = os.listdir(directory_path)

# Filter the files to only include those that end with the specified suffix
files_with_suffix = [f"{directory_path}{f}" for f in all_files if f.endswith(suffix)]

In [4]:
dfs = []
for file in files_with_suffix:
    df = pd.read_csv(file)
    df['file'] = file
    dfs.append(df)

merged_df = pd.concat(dfs)
merged_df 

Unnamed: 0,Type,Name,Request Count,Failure Count,Median Response Time,Average Response Time,Min Response Time,Max Response Time,Average Content Size,Requests/s,...,75%,80%,90%,95%,98%,99%,99.9%,99.99%,100%,file
0,GET,/tiles/0/0/0.png?reference=False&variable=tas&...,10,0,280.0,369.161288,251.240442,672.973636,3612.000000,0.129508,...,440,500,670,670,670,670,670,670,670,downloaded_results/600_1440_1_CMIP6_daily_GISS...
1,GET,/tiles/1/0/1.png?reference=False&variable=tas&...,10,0,250.0,260.582695,221.343795,335.845542,1266.000000,0.129508,...,280,290,340,340,340,340,340,340,340,downloaded_results/600_1440_1_CMIP6_daily_GISS...
2,GET,/tiles/1/1/0.png?reference=False&variable=tas&...,10,0,240.0,282.260892,227.447890,385.395605,3354.000000,0.129508,...,370,380,390,390,390,390,390,390,390,downloaded_results/600_1440_1_CMIP6_daily_GISS...
3,GET,/tiles/2/0/0.png?reference=False&variable=tas&...,10,0,250.0,269.354307,225.462393,439.454533,1667.000000,0.129508,...,270,270,440,440,440,440,440,440,440,downloaded_results/600_1440_1_CMIP6_daily_GISS...
4,GET,/tiles/2/2/0.png?reference=False&variable=tas&...,20,0,240.0,262.816168,226.358204,443.092825,1661.000000,0.259016,...,270,300,340,440,440,440,440,440,440,downloaded_results/600_1440_1_CMIP6_daily_GISS...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
88,GET,/tiles/6/60/41.png?reference=False&variable=TS...,10,0,410.0,431.334475,378.773467,493.383819,737.000000,0.063951,...,450,490,490,490,490,490,490,490,490,downloaded_results/power_901_monthly_meteorolo...
89,GET,/tiles/6/61/43.png?reference=False&variable=TS...,10,0,440.0,478.745012,385.681737,642.301311,761.000000,0.063951,...,540,570,640,640,640,640,640,640,640,downloaded_results/power_901_monthly_meteorolo...
90,GET,/tiles/6/63/24.png?reference=False&variable=TS...,10,0,430.0,464.574641,383.566941,634.129623,800.000000,0.063951,...,500,560,630,630,630,630,630,630,630,downloaded_results/power_901_monthly_meteorolo...
91,GET,/tiles/6/63/52.png?reference=False&variable=TS...,10,0,510.0,507.804980,409.169504,648.463082,718.000000,0.063951,...,560,600,650,650,650,650,650,650,650,downloaded_results/power_901_monthly_meteorolo...


Filter out the aggregated rows (those summarizing results across all tile endpoints) from the merged data frame, then add columns for zoom and dataset.

In [7]:

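# Drop the 'Aggregated' summary rows; copy so the new columns are not set on a view
df_filtered = merged_df[merged_df['Name'] != 'Aggregated'].copy()
# Zoom is the first path component after /tiles/
df_filtered['zoom'] = [int(path.split('/')[2]) for path in df_filtered['Name']]
# Dataset name is the file name without the stats suffix
df_filtered['dataset'] = [file.split('/')[1].replace('_urls_stats.csv', '') for file in df_filtered['file']]
df_filtered.head()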

Unnamed: 0,Type,Name,Request Count,Failure Count,Median Response Time,Average Response Time,Min Response Time,Max Response Time,Average Content Size,Requests/s,...,90%,95%,98%,99%,99.9%,99.99%,100%,file,zoom,dataset
0,GET,/tiles/0/0/0.png?reference=False&variable=tas&...,10,0,280.0,369.161288,251.240442,672.973636,3612.0,0.129508,...,670,670,670,670,670,670,670,downloaded_results/600_1440_1_CMIP6_daily_GISS...,0,600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr
1,GET,/tiles/1/0/1.png?reference=False&variable=tas&...,10,0,250.0,260.582695,221.343795,335.845542,1266.0,0.129508,...,340,340,340,340,340,340,340,downloaded_results/600_1440_1_CMIP6_daily_GISS...,1,600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr
2,GET,/tiles/1/1/0.png?reference=False&variable=tas&...,10,0,240.0,282.260892,227.44789,385.395605,3354.0,0.129508,...,390,390,390,390,390,390,390,downloaded_results/600_1440_1_CMIP6_daily_GISS...,1,600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr
3,GET,/tiles/2/0/0.png?reference=False&variable=tas&...,10,0,250.0,269.354307,225.462393,439.454533,1667.0,0.129508,...,440,440,440,440,440,440,440,downloaded_results/600_1440_1_CMIP6_daily_GISS...,2,600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr
4,GET,/tiles/2/2/0.png?reference=False&variable=tas&...,20,0,240.0,262.816168,226.358204,443.092825,1661.0,0.259016,...,340,440,440,440,440,440,440,downloaded_results/600_1440_1_CMIP6_daily_GISS...,2,600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr
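The zoom and dataset columns above depend on splitting strings at fixed positions. As a hedged alternative, the sketch below parses the same values with a regular expression and `pathlib`, using only the standard library. The helper names `parse_tile` and `dataset_from_file` are hypothetical, and the `/tiles/{z}/{x}/{y}.png` ordering is an assumption based on the paths shown in the output; the request path in the example is shortened.

```python
import re
from pathlib import PurePosixPath

# Matches the /tiles/{z}/{x}/{y}.png prefix seen in the Name column.
# The z/x/y ordering is assumed from the paths in the results.
TILE_PATTERN = re.compile(r"^/tiles/(\d+)/(\d+)/(\d+)\.png")

def parse_tile(name):
    """Return (zoom, x, y) for a tile request path, or None if it does not match."""
    match = TILE_PATTERN.match(name)
    if match is None:
        return None
    zoom, x, y = (int(group) for group in match.groups())
    return zoom, x, y

def dataset_from_file(file, suffix="_urls_stats.csv"):
    """Strip the directory and the stats suffix from a results file path."""
    filename = PurePosixPath(file).name
    return filename[: -len(suffix)] if filename.endswith(suffix) else filename

print(parse_tile("/tiles/1/0/1.png?reference=False&variable=tas"))
print(parse_tile("Aggregated"))
print(dataset_from_file(
    "downloaded_results/600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr_urls_stats.csv"
))
```

Returning `None` for non-tile rows such as `Aggregated` means the parser does not raise on a row that the filter missed, and taking the file name rather than a fixed path component keeps working if the results directory is nested more deeply.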


Plot results.

In [8]:
df_filtered.hvplot.scatter(
    x='zoom',
    y='Median Response Time',
    by='dataset',
    width=1000,
    height=600
)

Note: multiple tiles are tested per zoom level, except at zoom 0, which has a single tile. This is why zooms 1-6 show multiple results per dataset.
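Because each zoom level above 0 contributes several points per dataset, it can help to collapse them to one value per dataset and zoom before comparing datasets. The sketch below shows one way to do that with the standard library. The records are hypothetical stand-ins for rows of `df_filtered`, with illustrative values and response times assumed to be in milliseconds; they are not benchmark results.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (dataset, zoom, median response time) records; values are
# illustrative only and assumed to be in milliseconds.
records = [
    ("dataset_a", 0, 280.0),
    ("dataset_a", 1, 250.0),
    ("dataset_a", 1, 240.0),
    ("dataset_b", 6, 410.0),
    ("dataset_b", 6, 440.0),
    ("dataset_b", 6, 430.0),
]

# Collect the per-tile medians for each (dataset, zoom) pair.
grouped = defaultdict(list)
for dataset, zoom, response_ms in records:
    grouped[(dataset, zoom)].append(response_ms)

# Take the median of the per-tile medians within each group.
summary = {key: median(values) for key, values in sorted(grouped.items())}
for (dataset, zoom), value in summary.items():
    print(f"{dataset} zoom {zoom}: {value} ms")
```

In the notebook itself, the equivalent on the real data would be `df_filtered.groupby(['dataset', 'zoom'])['Median Response Time'].median()`.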