# Project Group - 20 

Members: 

1. Sheikh Arfahmi Bin Sheikh Arzimi

2. Ewan Brett

3. Cedric Nissen

4. Nils Hollnagel

5. Luka Rehviašvili

Student numbers: 

1. 6452868

2. 6525318

3. 6560733

4. 6540848

5. 6299318

# Research Objective

To develop an interactive dashboard that aids effective crowd flow management for the SAIL 2025 event in Amsterdam.

# Introduction 

Effective crowd management is essential to ensure public safety and improve the visitor experience at large-scale events such as SAIL 2025 in Amsterdam. Without sufficient monitoring, crowd managers may lack oversight of crowd densities, potentially increasing the risk of overcrowding and related safety incidents. Past tragedies, such as the Seoul Halloween crowd crush in 2022, Houston’s Astroworld Festival in 2021 and Germany’s Love Parade in 2010, underscore the importance of proactive crowd monitoring and prediction systems. This project therefore aims to develop an interactive dashboard that helps the SAIL Crowd Monitoring Team (CMT) make informed decisions in real time to manage crowd levels effectively and efficiently. Beyond SAIL, the dashboard could also serve as a modular tool for other large-scale event organisers worldwide. The project is part of the broader crowd management strategy and will be integrated with physical control measures to form a holistic solution to crowd management challenges.

# Contribution Statement

*Be specific. Some of the tasks can be coding (expect everyone to do this), background research, conceptualisation, visualisation, data analysis, data modelling*

**Author 1: Sheikh**: Future Crowd Flow Prediction (ML Module)

**Author 2: Nils**: User Authentication & Security

**Author 3: Cedric**: Reports, Settings & Multi-User Management

**Author 4: Ewan**: Visualisation / Dashboard Interface

**Author 5: Luka**: Current Crowd Flow Data Pipeline

# Data Used

## Confirmed Datasets

1.	Crowd flow (based on sensor counts)

    •	Number of sensors: 46 locations, bidirectional

    •	Refresh rate: Every 3 minutes

2.	SAIL Event timetable (https://www.sail.nl/programma-en-plattegrond)

3.	Geospatial information of the SAIL 2025 area (OpenStreetMap, ArcGIS)
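To illustrate what the 3-minute, bidirectional sensor counts could look like once loaded, the sketch below aggregates them to hourly totals and net flow per sensor. The column names (`timestamp`, `sensor_id`, `count_dir_a`, `count_dir_b`) and the sample values are assumptions made for illustration; the actual feed format has not been confirmed.

```python
import pandas as pd

# Hypothetical sample: one sensor, two directions, one reading every 3 minutes.
timestamps = pd.date_range("2025-08-20 12:00", periods=40, freq="3min")
counts = pd.DataFrame(
    {
        "timestamp": timestamps,
        "sensor_id": "CMSA-GAKH-01",
        "count_dir_a": 10,  # people counted in one direction per interval
        "count_dir_b": 4,   # people counted in the opposite direction
    }
)

# Total movement past the sensor and net flow (direction A minus direction B)
counts["total"] = counts["count_dir_a"] + counts["count_dir_b"]
counts["net"] = counts["count_dir_a"] - counts["count_dir_b"]

# Twenty 3-minute readings make up one hour; sum them per sensor and hour
hourly = counts.groupby(
    ["sensor_id", pd.Grouper(key="timestamp", freq="60min")]
)[["total", "net"]].sum()

print(hourly)
```

Keeping both directions as separate columns means totals and net flow can each be derived later without re-reading the raw feed.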

## Potential Datasets (Pending Requests)

1.	Vessel position

2.	Positions of traffic marshals

3.	NS Train (live) Timetables (https://ndovloket.nl/index.html) 

4.	GVB (live) Timetables - Metros, Trams, Buses (https://ndovloket.nl/index.html) 

5.	Meteorological data (KNMI data; alternatively, copy the table from https://www.wunderground.com/history/weekly/nl/schiphol/EHAM/date/2025-8-20 into Excel and export it as CSV)



# Data Pipeline

1.	Data Ingestion 

    •	Usage of confirmed datasets listed above. 

    •	If access is granted, the potential datasets listed above will also be included.

2.	Data Processing/Transformation 

    •	Compile all the data into time-series pandas DataFrames for easy reading, updating and plotting.

    •	Clean and standardise data to ensure consistency across sources.

3.	Data Storage 

    •	Store all data on shared drives that are backed up on university servers – OneDrive.

    •	Have a redundant version of the data, in case a file becomes corrupted and unusable. 

4.	Data Analysis & Prediction 

    •	Analysis of the retrieved data 

            i.	Cleaning

            ii.	Removing the duplicates 

            iii. Making sure there are no inconsistencies

            iv.	Bias checks

                1.	Inspect the representation across key groups 

                2.	Check missingness, errors and label quality by group

    •	Use of appropriate data analysis tools (Python, with Excel where needed)

    •	The prediction approach will be chosen based on need and the initial analysis results

5.	Data Visualisation & Delivery

    •	Deliver the dashboard as a web application using Streamlit

    •	Display interactive maps, charts and tables according to the preferences of the CMT, adapting each view to the size and category of the dataset.
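The processing and analysis steps above (compiling a time-series DataFrame, removing duplicates, checking missingness by group) can be sketched roughly as follows. This is a minimal illustration, not the project's pipeline: the long-format input with `timestamp`, `sensor_id` and `count` columns, the sensor labels and the helper name `to_time_series` are all hypothetical.

```python
import pandas as pd

# Hypothetical raw feed in long format: one row per sensor reading.
raw = pd.DataFrame(
    {
        "timestamp": [
            "2025-08-20 12:00", "2025-08-20 12:03", "2025-08-20 12:03",
            "2025-08-20 12:09", "2025-08-20 12:00", "2025-08-20 12:03",
        ],
        "sensor_id": ["A", "A", "A", "A", "B", "B"],
        "count": [12, 15, 15, 9, 30, 28],
    }
)


def to_time_series(raw, freq="3min"):
    """Return a wide DataFrame: one row per time step, one column per sensor."""
    df = raw.copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    # Keep the first reading when a sensor reports the same timestamp twice
    df = df.drop_duplicates(subset=["timestamp", "sensor_id"])
    wide = df.pivot(index="timestamp", columns="sensor_id", values="count")
    # Put every sensor on the same regular grid; gaps stay NaN so they remain visible
    grid = pd.date_range(wide.index.min(), wide.index.max(), freq=freq)
    return wide.reindex(grid)


series = to_time_series(raw)
# Missingness per sensor, as a simple by-group quality check
missing_per_sensor = series.isna().sum()

print(series)
print(missing_per_sensor)
```

Leaving gaps as `NaN` rather than filling them keeps sensor outages visible to the later analysis and prediction steps; whether and how to impute is a separate decision.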


```python
import pandas as pd


def load_sensor_locations():
    """Load the sensor metadata and add numeric 'Lat' and 'Lon' columns."""
    sensor_loc = pd.read_csv("data/sensor_location.csv")
    # 'Lat/Long' holds "lat, lon" strings; split on the comma and cast to float
    sensor_loc[["Lat", "Lon"]] = (
        sensor_loc["Lat/Long"].str.split(",", expand=True).astype(float)
    )
    return sensor_loc


sensor_loc = load_sensor_locations()
sensor_loc
```

| Unnamed: 0 | Objectummer | Locatienaam | Lat/Long | Breedte | Effectieve breedte | Lat | Lon |
|---|---|---|---|---|---|---|---|
| 0 | CMSA-GAKH-01 | Kalverstraat t.h.v. 1 | 52.372634, 4.892071 | 8.0 | 67 | 52.372634 | 4.892071 |
| 1 | CMSA-GAWW-11 | Korte Niezel | 52.374616, 4.899830 | 38.0 | 34 | 52.374616 | 4.89983 |
| 2 | CMSA-GAWW-12 | Oudekennissteeg | 52.373860, 4.898690 | 3.0 | 26 | 52.37386 | 4.89869 |
| 3 | CMSA-GAWW-13 | Stoofsteeg | 52.372439, 4.897689 | 26.0 | 22 | 52.372439 | 4.897689 |
| 4 | CMSA-GAWW-14 | Oudezijds Voorburgwal t.h.v. 91 | 52.373538, 4.898166 | 4.0 | 36 | 52.373538 | 4.898166 |
| 5 | CMSA-GAWW-15 | Oudezijds Achterburgwal t.h.v. 86 | 52.372916, 4.898207 | 32.0 | 28 | 52.372916 | 4.898207 |
| 6 | CMSA-GAWW-16 | Oudezijds Achterburgwal t.h.v. 91 | 52.372628, 4.898233 | 31.0 | 27 | 52.372628 | 4.898233 |
| 7 | CMSA-GAWW-17 | Oudezijds Voorburgwal t.h.v. 206 | 52.372782, 4.896649 | 51.0 | 47 | 52.372782 | 4.896649 |
| 8 | CMSA-GAWW-19 | Molensteeg | 52.373587, 4.899815 | 29.0 | 25 | 52.373587 | 4.899815 |
| 9 | CMSA-GAWW-20 | Oudebrugsteeg | 52.375350, 4.897480 | 57.0 | 53 | 52.37535 | 4.89748 |
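Once `Lat` and `Lon` are numeric, a quick consistency check can catch swapped or malformed coordinates before they reach the map. The sketch below rebuilds the first three rows of the output above so that it runs on its own; the bounding box values and the helper name `find_suspect_coordinates` are assumptions chosen for illustration, not official limits of the event area.

```python
import pandas as pd

# First three rows of the sensor table, reduced to the relevant columns.
sensor_loc = pd.DataFrame(
    {
        "Objectummer": ["CMSA-GAKH-01", "CMSA-GAWW-11", "CMSA-GAWW-12"],
        "Lat": [52.372634, 52.374616, 52.373860],
        "Lon": [4.892071, 4.899830, 4.898690],
    }
)

# Assumed rough bounding box around Amsterdam (illustrative values only).
LAT_RANGE = (52.30, 52.45)
LON_RANGE = (4.75, 5.05)


def find_suspect_coordinates(df):
    """Return rows whose coordinates are missing or fall outside the bounding box."""
    # Series.between is False for NaN, so missing values are flagged as well
    lat_ok = df["Lat"].between(*LAT_RANGE)
    lon_ok = df["Lon"].between(*LON_RANGE)
    return df[~(lat_ok & lon_ok)]


suspects = find_suspect_coordinates(sensor_loc)
print(f"{len(suspects)} of {len(sensor_loc)} sensors have suspect coordinates")
```

A check like this would fit under the "no inconsistencies" step of the pipeline, since a swapped latitude and longitude would otherwise only show up as a misplaced marker on the dashboard map.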
