# Tile Server End-to-End Benchmarks

In addition to code benchmarks, end-to-end (e2e) benchmarks demonstrate the performance of the tile server itself. Instructions for running the end-to-end benchmarks are documented at https://github.com/developmentseed/tile-benchmarking/tree/main/03-e2e/README.md.

The tested tiles ranged from zoom 0 to zoom 5. In testing, the most important factors in determining response time were the number of coordinate chunks and the chunk size. Each reported time is the median response time across multiple requests (usually 10) for the same tile (i.e. the same x, y, and z parameters).

**Based on this, we recommend creating chunks that are as small as possible (e.g. 1 timestep per chunk for the full spatial extent if visualization is the main goal) and not chunking coordinate data.**
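To make the two factors concrete, the sketch below estimates the uncompressed size of one chunk and the number of chunks needed to cover the spatial extent, using two of the chunk shapes that appear in the results further down. The formulas and the 600 x 1440 grid shape (inferred from the dataset names) are assumptions, not the code used to produce `zarr_info.csv`, although they do reproduce the `chunk_size_mb` and `number_of_spatial_chunks` values reported for these datasets.

```python
from math import prod

# Bytes per element for the dtypes that appear in the benchmarked datasets.
ITEMSIZE = {"int16": 2, "float32": 4, "float64": 8}


def chunk_size_mb(chunks: dict, dtype: str) -> float:
    """Uncompressed size of one chunk, in units of 2**20 bytes."""
    return prod(chunks.values()) * ITEMSIZE[dtype] / 2**20


def number_of_spatial_chunks(chunks: dict, shape: dict) -> float:
    """Fractional number of chunks needed to cover the full lat/lon extent."""
    return (shape["lat"] / chunks["lat"]) * (shape["lon"] / chunks["lon"])


# Assumed grid shape, inferred from dataset names such as 600_1440_1_CMIP6_...
grid = {"lat": 600, "lon": 1440}

one_step = {"time": 1, "lat": 600, "lon": 1440}      # 1 timestep, full extent
year_tiles = {"time": 365, "lat": 262, "lon": 262}   # many timesteps, spatial tiles

for chunks in (one_step, year_tiles):
    size = round(chunk_size_mb(chunks, "float32"), 6)
    count = round(number_of_spatial_chunks(chunks, grid), 6)
    print(chunks, size, count)
```

The first layout keeps each chunk small and covers the whole extent with a single chunk, which is the shape the recommendation above points toward for visualization.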

Below, we include an example of how to plot results from one execution of these benchmarks.

First we download the results:

In [2]:
%%capture
# download results from s3
!aws s3 cp --recursive s3://nasa-eodc-data-store/tile-benchmarking-results/2023-09-19_22-18-25/ downloaded_results/

In [5]:
# Import libraries
import os
import pandas as pd
import hvplot.pandas
import holoviews as hv
pd.options.plotting.backend = 'holoviews'
import warnings
warnings.filterwarnings('ignore')


Then we parse results into a dataframe.

In [6]:
# Specify the directory path and the suffix
directory_path = "downloaded_results/"
suffix = "_urls_stats.csv"  # Per-URL statistics files, one per dataset

# List all files in the directory
all_files = os.listdir(directory_path)

# Filter the files to only include those that end with the specified suffix
files_with_suffix = [f"{directory_path}{f}" for f in all_files if f.endswith(suffix)]

In [7]:
dfs = []
for file in files_with_suffix:
    df = pd.read_csv(file)
    df['file'] = file
    dfs.append(df)

merged_df = pd.concat(dfs)

Then we filter out the rows with aggregated results from the merged data frame. The "Aggregated" rows represent aggregations across tile endpoints, whereas we are going to plot results by dataset and zoom level.

In [None]:
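# Drop the "Aggregated" rows; .copy() avoids assigning new columns to a slice of merged_df
df_filtered = merged_df[merged_df['Name'] != 'Aggregated'].copy()
# The zoom level is the third '/'-separated component of the request path
df_filtered['zoom'] = [int(path.split('/')[2]) for path in df_filtered['Name']]
# The dataset name is the results file name without its suffix
df_filtered['dataset'] = [file.split('/')[1].replace('_urls_stats.csv', '') for file in df_filtered['file']]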
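
The two list comprehensions above rely on fixed positions within each string. As a small illustration of what they extract, here is the same logic on single values. The tile path layout `/tiles/{z}/{x}/{y}` is a hypothetical example, since the actual contents of the `Name` column are not shown here, and `parse_zoom` and `parse_dataset` are illustrative helper names.

```python
def parse_zoom(name: str) -> int:
    """Zoom level from a request path; assumes it is the third '/'-separated part."""
    return int(name.split("/")[2])


def parse_dataset(file: str, suffix: str = "_urls_stats.csv") -> str:
    """Dataset name from a path like 'downloaded_results/<dataset>_urls_stats.csv'."""
    return file.split("/")[1].replace(suffix, "")


# Hypothetical request path: the leading '/' yields an empty first component,
# so index 2 is the zoom level.
print(parse_zoom("/tiles/3/4/2"))
print(parse_dataset("downloaded_results/cmip6-kerchunk_urls_stats.csv"))
```

If the request paths had a different number of leading components, the index `2` would need to change accordingly.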

A plot cannot capture as much information about the datasets as a table, so we include both options below. Below, we print information about the datasets alongside the results for explanatory context.

In [41]:
# Print the results for each zoom level (0 through 5)

df_filtered_min = df_filtered[['dataset', 'zoom', 'Median Response Time']]
zarr_info = pd.read_csv('https://raw.githubusercontent.com/developmentseed/tile-benchmarking/main/03-e2e/zarr_info.csv')
zarr_info_min = zarr_info[['collection_name', 'chunks', 'chunk_size_mb', 'number_of_spatial_chunks', 'dtype']]
results_info_merged = zarr_info_min.merge(df_filtered_min, left_on='collection_name', right_on='dataset', how='outer')

for zoom in range(6):
    print(f"Results for Zoom {zoom}")
    this_zoom_results = results_info_merged[results_info_merged['zoom'] == zoom].drop(columns=['dataset'])
    average_response = this_zoom_results.groupby('collection_name')['Median Response Time'].mean().reset_index()
    average_response.rename(columns={'Median Response Time': 'Average Median Response'}, inplace=True)
    
    # Merge this with the original dataframe
    result = pd.merge(average_response, zarr_info_min, on='collection_name', how='left')
    
    display(result.sort_values(['chunk_size_mb', 'number_of_spatial_chunks']))    

Results for Zoom 0


| collection_name | Average Median Response | chunks | chunk_size_mb | number_of_spatial_chunks | dtype |
|---|---|---|---|---|---|
| aws-noaa-oisst-feedstock_reference | 470.0 | `{'time': 1, 'zlev': 1, 'lat': 720, 'lon': 1440}` | 1.977539 | 1.0 | int16 |
| power_901_monthly_meteorology_utc.zarr | 2800.0 | `{'time': 492, 'lat': 25, 'lon': 25}` | 2.346039 | 332.6976 | float64 |
| 600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr | 280.0 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-kerchunk | 280.0 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-pds_GISS-E2-1-G_historical_tas | 470.0 | `{'time': 600, 'lat': 90, 'lon': 144}` | 29.663086 | 1.0 | float32 |
| 365_262_262_CMIP6_daily_GISS-E2-1-G_tas.zarr | 3400.0 | `{'time': 365, 'lat': 262, 'lon': 262}` | 95.577469 | 12.586679 | float32 |
| 600_1440_29_CMIP6_daily_GISS-E2-1-G_tas.zarr | 570.0 | `{'time': 29, 'lat': 600, 'lon': 1440}` | 95.581055 | 1.0 | float32 |


Results for Zoom 1


| collection_name | Average Median Response | chunks | chunk_size_mb | number_of_spatial_chunks | dtype |
|---|---|---|---|---|---|
| aws-noaa-oisst-feedstock_reference | 450.0 | `{'time': 1, 'zlev': 1, 'lat': 720, 'lon': 1440}` | 1.977539 | 1.0 | int16 |
| power_901_monthly_meteorology_utc.zarr | 1050.0 | `{'time': 492, 'lat': 25, 'lon': 25}` | 2.346039 | 332.6976 | float64 |
| 600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr | 245.0 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-kerchunk | 255.0 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-pds_GISS-E2-1-G_historical_tas | 450.0 | `{'time': 600, 'lat': 90, 'lon': 144}` | 29.663086 | 1.0 | float32 |
| 365_262_262_CMIP6_daily_GISS-E2-1-G_tas.zarr | 725.0 | `{'time': 365, 'lat': 262, 'lon': 262}` | 95.577469 | 12.586679 | float32 |
| 600_1440_29_CMIP6_daily_GISS-E2-1-G_tas.zarr | 540.0 | `{'time': 29, 'lat': 600, 'lon': 1440}` | 95.581055 | 1.0 | float32 |


Results for Zoom 2


| collection_name | Average Median Response | chunks | chunk_size_mb | number_of_spatial_chunks | dtype |
|---|---|---|---|---|---|
| aws-noaa-oisst-feedstock_reference | 446.666667 | `{'time': 1, 'zlev': 1, 'lat': 720, 'lon': 1440}` | 1.977539 | 1.0 | int16 |
| power_901_monthly_meteorology_utc.zarr | 667.5 | `{'time': 492, 'lat': 25, 'lon': 25}` | 2.346039 | 332.6976 | float64 |
| 600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr | 245.0 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-kerchunk | 240.0 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-pds_GISS-E2-1-G_historical_tas | 456.666667 | `{'time': 600, 'lat': 90, 'lon': 144}` | 29.663086 | 1.0 | float32 |
| 365_262_262_CMIP6_daily_GISS-E2-1-G_tas.zarr | 828.0 | `{'time': 365, 'lat': 262, 'lon': 262}` | 95.577469 | 12.586679 | float32 |
| 600_1440_29_CMIP6_daily_GISS-E2-1-G_tas.zarr | 528.333333 | `{'time': 29, 'lat': 600, 'lon': 1440}` | 95.581055 | 1.0 | float32 |


Results for Zoom 3


| collection_name | Average Median Response | chunks | chunk_size_mb | number_of_spatial_chunks | dtype |
|---|---|---|---|---|---|
| aws-noaa-oisst-feedstock_reference | 430.0 | `{'time': 1, 'zlev': 1, 'lat': 720, 'lon': 1440}` | 1.977539 | 1.0 | int16 |
| power_901_monthly_meteorology_utc.zarr | 482.857143 | `{'time': 492, 'lat': 25, 'lon': 25}` | 2.346039 | 332.6976 | float64 |
| 600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr | 244.0 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-kerchunk | 242.727273 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-pds_GISS-E2-1-G_historical_tas | 451.111111 | `{'time': 600, 'lat': 90, 'lon': 144}` | 29.663086 | 1.0 | float32 |
| 365_262_262_CMIP6_daily_GISS-E2-1-G_tas.zarr | 696.908492 | `{'time': 365, 'lat': 262, 'lon': 262}` | 95.577469 | 12.586679 | float32 |
| 600_1440_29_CMIP6_daily_GISS-E2-1-G_tas.zarr | 532.0 | `{'time': 29, 'lat': 600, 'lon': 1440}` | 95.581055 | 1.0 | float32 |


Results for Zoom 4


| collection_name | Average Median Response | chunks | chunk_size_mb | number_of_spatial_chunks | dtype |
|---|---|---|---|---|---|
| aws-noaa-oisst-feedstock_reference | 432.142857 | `{'time': 1, 'zlev': 1, 'lat': 720, 'lon': 1440}` | 1.977539 | 1.0 | int16 |
| power_901_monthly_meteorology_utc.zarr | 460.714286 | `{'time': 492, 'lat': 25, 'lon': 25}` | 2.346039 | 332.6976 | float64 |
| 600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr | 237.700313 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-kerchunk | 238.666667 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-pds_GISS-E2-1-G_historical_tas | 454.285714 | `{'time': 600, 'lat': 90, 'lon': 144}` | 29.663086 | 1.0 | float32 |
| 365_262_262_CMIP6_daily_GISS-E2-1-G_tas.zarr | 587.857143 | `{'time': 365, 'lat': 262, 'lon': 262}` | 95.577469 | 12.586679 | float32 |
| 600_1440_29_CMIP6_daily_GISS-E2-1-G_tas.zarr | 531.333333 | `{'time': 29, 'lat': 600, 'lon': 1440}` | 95.581055 | 1.0 | float32 |


Results for Zoom 5


| collection_name | Average Median Response | chunks | chunk_size_mb | number_of_spatial_chunks | dtype |
|---|---|---|---|---|---|
| aws-noaa-oisst-feedstock_reference | 431.153846 | `{'time': 1, 'zlev': 1, 'lat': 720, 'lon': 1440}` | 1.977539 | 1.0 | int16 |
| power_901_monthly_meteorology_utc.zarr | 434.074074 | `{'time': 492, 'lat': 25, 'lon': 25}` | 2.346039 | 332.6976 | float64 |
| 600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr | 245.0756 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-kerchunk | 245.769231 | `{'time': 1, 'lat': 600, 'lon': 1440}` | 3.295898 | 1.0 | float32 |
| cmip6-pds_GISS-E2-1-G_historical_tas | 450.740741 | `{'time': 600, 'lat': 90, 'lon': 144}` | 29.663086 | 1.0 | float32 |
| 365_262_262_CMIP6_daily_GISS-E2-1-G_tas.zarr | 504.444765 | `{'time': 365, 'lat': 262, 'lon': 262}` | 95.577469 | 12.586679 | float32 |
| 600_1440_29_CMIP6_daily_GISS-E2-1-G_tas.zarr | 530.0 | `{'time': 29, 'lat': 600, 'lon': 1440}` | 95.581055 | 1.0 | float32 |


## Interpretation of results

As we saw in previous tests, chunk size matters at all zoom levels, whereas the number of spatial chunks matters much more at low zoom levels than at high zoom levels.
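
One rough way to see the zoom dependence is to compare each dataset's zoom 0 result with its zoom 5 result. The sketch below does this for four of the datasets, with the values copied by hand from the zoom 0 and zoom 5 results above. The ratio is only an illustrative summary chosen here, not a metric produced by the benchmark tooling.

```python
# (number_of_spatial_chunks, average median response at zoom 0, at zoom 5),
# copied from the zoom 0 and zoom 5 results in this document.
results = {
    "power_901_monthly_meteorology_utc.zarr": (332.6976, 2800.0, 434.074074),
    "365_262_262_CMIP6_daily_GISS-E2-1-G_tas.zarr": (12.586679, 3400.0, 504.444765),
    "600_1440_1_CMIP6_daily_GISS-E2-1-G_tas.zarr": (1.0, 280.0, 245.0756),
    "600_1440_29_CMIP6_daily_GISS-E2-1-G_tas.zarr": (1.0, 570.0, 530.0),
}

# Ratio of the zoom 0 response to the zoom 5 response for each dataset.
slowdown = {name: z0 / z5 for name, (_, z0, z5) in results.items()}

for name, (n_chunks, _, _) in results.items():
    print(f"{name}: {n_chunks:g} spatial chunks, zoom 0 / zoom 5 = {slowdown[name]:.2f}")
```

The two datasets with many spatial chunks are several times slower at zoom 0 than at zoom 5, while the two with a single spatial chunk stay close to a ratio of 1. The 29-timestep dataset remains roughly twice as slow as the 1-timestep dataset at both zoom levels, which is consistent with chunk size mattering regardless of zoom.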