We need to find a 5 km route through the canals of Amsterdam and an appropriate time for the event.

## Finding a route

We consider the following factors:

* Water quality: Less polluted waters are safer for swimmers.
* Canal traffic: We want to minimise the impact on canal traffic. Canals with less traffic are also likely to have better water quality.

### (Swim) water quality

We have limited data available for water quality.
However, we do know that Amsterdam uses a combined sewer system, so one of the major sources of water pollution is sewer overflows caused by heavy rainfall [@leemans_wu_2017].
Thus, one precaution we can take is to avoid sewer overflow points.

The Waternet sewerage network data is available on [Overheid.nl](https://data.overheid.nl/dataset/xnhveaeyheww2w).
Unfortunately, the download link for the WFS data returned a 404 error.
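Instead, we used the provided API to retrieve the sewer nodes, then saved them as a [GeoJSON file](../../data/sewage_nodes.geojson).
(The API coordinates are in the Dutch RD New system, EPSG:28992, with a NAP height, i.e. the compound EPSG:7415; the code below converts them to EPSG:4326 for the GeoJSON file.)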
We then filtered the data to get the sewage overflow nodes.

In [26]:
from urllib.request import urlopen
import json
import time
import os.path
import geopandas as gpd
from pyproj import Transformer

FILENAME_SEWER_NODES = "data/sewer_nodes.geojson"
FILENAME_SEWAGE_OVERFLOW_NODES = "data/sewage_overflow_nodes.geojson"
URL_SEWER_NODES = "https://api.data.amsterdam.nl/v1/leidingeninfrastructuur/waternet_rioolknopen/?page_size=1000"
SEWAGE_OVERFLOW_TYPES = [
    "Uitlaat gemengde overstort", # Mixed overflow
    "Uitlaat vuilwater nooduitlaat", # Black water emergency outlet
    "(Externe) overstortput", # (External) overflow
    "Overstort met signalering", # Overflow with signaling
    "Interne overstortput", # Internal overflow
    "Nooduitlaat met signalering" # Emergency overflow with signaling
]

def get_sewer_nodes(url, geojson_filename, is_test_run=False):
    """Return sewer nodes as geodataframe. If test run, retrieve max 3 pages of results from API."""
    if os.path.exists(geojson_filename):
        print("Sewer nodes data has already been parsed to GeoJSON in '{}'".format(geojson_filename))
        gdf = gpd.read_file(geojson_filename)
    else:
        print("Sewer nodes GeoJSON file does not exist. Requesting data from API {}".format(url))
        geojson_data = retrieve_sewer_nodes_data_from_api(url, is_test_run)
        
        # Save data for future use
        with open(geojson_filename, "w", encoding='utf-8') as outfile:
            json.dump(geojson_data, outfile)
        print("Sewer nodes data saved to file '{}'".format(geojson_filename))

        # Coordinates were converted to lon/lat (EPSG:4326) when parsing the API results
        gdf = gpd.GeoDataFrame.from_features(geojson_data['features'], crs="EPSG:4326")
    return gdf

def request_sewer_nodes_data(url, is_test_run):
    """Retrieve all sewer node results from API. For test runs, stop after first 3 pages of results."""
    api_response = json.load(urlopen(url))

    sewer_node_entries = []
    count = 0

    while api_response is not None:
        time.sleep(0.5) # pause between requests to avoid overloading the server

        print("Retrieving page", count)
        data = api_response["_embedded"]["waternet_rioolknopen"]
        sewer_node_entries += data

        if "next" in api_response["_links"]: # has next page of results
            api_response = json.load(urlopen(api_response["_links"]["next"]["href"]))
        else: # is last page of results
            api_response = None

        count += 1

        if is_test_run and count >= 3:
            break
    
    return sewer_node_entries

def parse_sewer_node_entries(sewer_node_entries):
    """Parse sewer node results to GeoJSON"""
    transformer = Transformer.from_crs("EPSG:7415", "EPSG:4326")

    geojson = {
        "type": "FeatureCollection",
        "features": []
    }

    for entry in sewer_node_entries:
        # Coordinates are (x, y, z) in RD New + NAP height; only x and y are needed
        x, y = entry["geometrie"]["coordinates"][:2]
        lat, lon = transformer.transform(x, y)
        feature = {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [lon, lat]
            },
            "properties": {
                "id": entry["id"],
                "typeKnoop": entry["typeKnoop"]
            }
        }
        geojson["features"].append(feature)
    
    return geojson


def retrieve_sewer_nodes_data_from_api(url, is_test_run=False):
    """Return sewer nodes data as GeoJSON. In test run, retrieve max 3 pages of results from API."""
    data_entries = request_sewer_nodes_data(url, is_test_run)
    geojson_data = parse_sewer_node_entries(data_entries)
    return geojson_data


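def get_sewage_overflow_nodes(url, sewer_nodes_filename, sewage_overflow_nodes_filename, is_test_run=False):
    """Return sewage overflow nodes as geodataframe, filtering the sewer nodes by type if no saved file exists."""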
    if os.path.exists(sewage_overflow_nodes_filename):
        print("Sewage overflow nodes data already exists in file '{}'".format(sewage_overflow_nodes_filename))
        gdf_overflows = gpd.read_file(sewage_overflow_nodes_filename)
    else:
        gdf_sewer_nodes = get_sewer_nodes(url, sewer_nodes_filename, is_test_run)
        gdf_overflows = gdf_sewer_nodes[gdf_sewer_nodes["typeKnoop"].isin(SEWAGE_OVERFLOW_TYPES)]
        gdf_overflows.to_file(sewage_overflow_nodes_filename, driver="GeoJSON")
        print("Sewage overflow nodes data saved to file '{}'".format(sewage_overflow_nodes_filename))
    return gdf_overflows

gdf_overflows = get_sewage_overflow_nodes(
    URL_SEWER_NODES, FILENAME_SEWER_NODES, FILENAME_SEWAGE_OVERFLOW_NODES, 
    is_test_run=True
)




Sewer nodes data has already been parsed to GeoJSON in 'data/sewer_nodes.geojson'
Sewage overflow nodes data saved to file 'data/sewage_overflow_nodes.geojson'


### Canal traffic

Our route should avoid areas of high canal traffic, to minimise impact on boats.
This would also result in a route with cleaner water.

Waternet commissioned TNO to produce a model that predicts traffic densities in the canals [@snelder_op_2013]. The model's predictions are shown below:

![image](images/Ams%20Canal%20Speeds-Layout1-06.jpg)

### Routes used by previous open water swim events

The Amsterdam City Swim is held every summer in the canals of Amsterdam.
We have the routes for 2019 and 2023, which are identical apart from the direction of travel.

![Amsterdam City Swim 2019 route](images/Ams%20Canal%20Speeds-Layout1-09.jpg)

![Amsterdam City Swim 2023 route](images/Ams%20Canal%20Speeds-Layout1-11.jpg)

![Amsterdam City Swim 2019 & 2023 routes](images/Ams%20Canal%20Speeds-Layout1-12.jpg)


### Identifying potential routes

Based on the City Swims, Amsterdam Oost seems to be a suitable area for open water swimming events.
Visual comparison also shows that Amsterdam Oost has relatively fewer sewage overflow points and less canal traffic.

![Amsterdam Oost](images/Ams%20Canal%20Speeds-Layout1-13.jpg)





We therefore identified three potential 5 km routes in this area, shown in the images below.
Our recommendation is the third route, as it passes the fewest sewage overflow points.

![Route 1](images/Ams%20Canal%20Speeds-Layout1-14.jpg)

![Route 2](images/Ams%20Canal%20Speeds-Layout1-15.jpg)

![Route 3](images/Ams%20Canal%20Speeds-Layout1-16.jpg)
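
The comparison of routes by overflow points was done visually, but it could also be sketched in code. The example below is a minimal illustration, not the method used for the maps above: it counts the overflow points that lie within a given distance of a route drawn as a polyline. The function names, the 25 m threshold, and the coordinates are all hypothetical, and the coordinates are assumed to be in a projected system measured in metres (such as EPSG:28992).

```python
import math

def point_segment_distance(p, a, b):
    """Distance in metres from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def count_overflows_near_route(route, overflow_points, max_distance_m=25.0):
    """Count overflow points within max_distance_m of any segment of the route."""
    segments = list(zip(route, route[1:]))
    count = 0
    for point in overflow_points:
        nearest = min(point_segment_distance(point, a, b) for a, b in segments)
        if nearest <= max_distance_m:
            count += 1
    return count

# Hypothetical routes and overflow points, in metres (not real Amsterdam data)
route_a = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 500.0)]
route_b = [(0.0, 200.0), (1000.0, 200.0)]
overflows = [(100.0, 10.0), (500.0, -20.0), (1010.0, 250.0), (400.0, 190.0), (2000.0, 2000.0)]

counts = {
    "route_a": count_overflows_near_route(route_a, overflows),
    "route_b": count_overflows_near_route(route_b, overflows),
}
best_route = min(counts, key=counts.get)  # route with the fewest nearby overflow points
print(counts, best_route)
```

To apply this to the overflow nodes saved earlier, the points would first need reprojecting from longitude/latitude to a metric system, for example with GeoPandas' `to_crs("EPSG:28992")`.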


## Determining the optimal time with least canal traffic

The canals used for the open water swim will need to be closed off for the event, but we want to minimise the impact on the canal boat routes.
Thus, for the continuity of boat traffic, we look towards hosting the event outside of ‘rush hours’ on the canals. 

As can be seen in the figure below [@snelder_op_2013], the busiest hours on the water usually start around 15:00.
For that reason, the swimming event should be finished before 15:00.

![image](images/canal-traffic-by-time-of-day.png)


Next, we need to determine the start time of the race. We work from the following assumptions:

1. People swim at speeds of about 8 km/h [@thornton_speed_2019].
1. The required length of the swim route is 5 km.
1. For safety reasons, each wave should have a maximum of 120 swimmers [@british_triathlon_open_nodate].
1. We plan for around 3000 swimmers for the swim meet, similar to the Amsterdam City Swim [@amsterdam_city_swim_swim_nodate].

In [2]:
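# Swim time per wave and number of waves, from the assumptions listed above
import math
swim_speed_kmh, route_length_km = 8, 5
max_swimmers_per_wave, total_swimmers = 120, 3000

swim_time_min = route_length_km / swim_speed_kmh * 60  # 37.5 min, roughly 40 minutes per 5 km round
num_waves = math.ceil(total_swimmers / max_swimmers_per_wave)  # 25 waves to accommodate all 3000 swimmers
print(f"Swim time per wave: {swim_time_min:.1f} min, waves needed: {num_waves}")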



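Thus, the first wave starts at 07:00, with a new wave every 18 minutes. The 25th and last wave then starts at 14:12 and, at about 40 minutes per swim, is out of the water before 15:00. (With 20-minute intervals the last wave would only start at 15:00.)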
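
A wave interval can be checked against the 15:00 cut-off with a short calculation. The sketch below is illustrative only and uses the assumptions listed above (8 km/h, 5 km, 120 swimmers per wave, 3000 swimmers); the helper names are hypothetical. It computes the largest whole-minute interval for which the last wave still finishes by the deadline, then the resulting start and finish times of that wave.

```python
import math

def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def to_hhmm(minutes):
    """Format minutes since midnight as 'HH:MM', dropping fractional minutes."""
    minutes = int(minutes)
    return f"{minutes // 60:02d}:{minutes % 60:02d}"

def max_wave_interval(first_start, deadline, swim_time_min, num_waves):
    """Largest whole-minute interval so that the last wave finishes by the deadline."""
    if num_waves < 2:
        return None  # a single wave needs no interval
    available = to_minutes(deadline) - to_minutes(first_start) - swim_time_min
    return math.floor(available / (num_waves - 1))

swim_time_min = 5 / 8 * 60          # 5 km at 8 km/h
num_waves = math.ceil(3000 / 120)   # 3000 swimmers, at most 120 per wave

interval_min = max_wave_interval("07:00", "15:00", swim_time_min, num_waves)
last_start = to_minutes("07:00") + interval_min * (num_waves - 1)
last_finish = last_start + swim_time_min

print(f"{num_waves} waves, one every {interval_min} min")
print(f"Last wave starts at {to_hhmm(last_start)} and finishes at about {to_hhmm(last_finish)}")
```

Changing the start time, deadline, or swim speed in the calls above shows how sensitive the schedule is to each assumption.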

## Conclusion

We propose holding the swim meet from 07:00 to 15:00, along the following route:

![Route 3](images/Ams%20Canal%20Speeds-Layout1-16.jpg)
