This repository contains code that downloads YouTube videos listed in a mapping.csv file and performs object detection on them using YOLOv11. The primary objective of this work is to evaluate pedestrian behaviour in a cross-country or cross-cultural context using freely available YouTube videos.
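For orientation, below is a minimal sketch of what a single YOLOv11 detection call looks like with the ultralytics package; the weight file and video path are placeholders, and this sketch is not the repository's actual pipeline (which also handles downloading, tracking, and CSV export).

from ultralytics import YOLO

# Minimal illustration only; file names below are placeholders.
model = YOLO("yolo11x.pt")  # YOLOv11 weights, fetched automatically by ultralytics if missing
results = model.predict(source="video.mp4", classes=[0], conf=0.25)  # class 0 = person in COCO
for result in results:
    print(f"{len(result.boxes)} pedestrian(s) detected in this frame")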
This study presents a comprehensive cross-cultural evaluation of pedestrian behaviour during road crossings, examining variations between developed and developing states worldwide. As urban landscapes evolve and autonomous vehicles (AVs) become integral to future transportation, understanding pedestrian behaviour becomes paramount for ensuring safe interactions between humans and AVs. Through an extensive review of global pedestrian studies, we analyse key factors influencing crossing behaviour, such as cultural norms, socioeconomic factors, infrastructure development, and regulatory frameworks. Our findings reveal distinct patterns in pedestrian conduct across different regions. Developed states generally exhibit more structured and rule-oriented crossing behaviours, influenced by established traffic regulations and advanced infrastructure. In contrast, developing states often witness a higher degree of informal and adaptive behaviour due to limited infrastructure and diverse cultural influences. These insights underscore the necessity for AVs to adapt to diverse pedestrian behaviour on a global scale, emphasising the importance of incorporating cultural nuances into AV programming and decision-making algorithms. As the integration of AVs into urban environments accelerates, this research contributes valuable insights for enhancing the safety and efficiency of autonomous transportation systems. By recognising and accommodating diverse pedestrian behaviours, AVs can navigate complex and dynamic urban settings, ensuring a harmonious coexistence with human road users across the globe.
The dataset is available on Kaggle. It will soon be made available in permanent FAIR storage.
If you use this work for academic purposes, please cite the following paper:
Alam, M. S., Martens, M. H., & Bazilinskyy, P. (2025). Understanding global pedestrian behaviour in 401 cities with dashcam videos on YouTube. Under review. Available at https://bazilinskyy.github.io/publications/alam2025crossing
The code is open-source and free to use. It is aimed at, but not limited to, academic research. We welcome forking of this repository, pull requests, and any contributions in the spirit of open science and open-source code. For inquiries about collaboration, you may contact Md Shadab Alam (md_shadab_alam@outlook.com) or Pavlo Bazilinskyy (pavlo.bazilinskyy@gmail.com).
Tested with Python 3.10.18 and the uv package manager.
Follow these steps to set up the project.
Step 1: Install uv. uv is a fast Python package and environment manager. Install it using one of the following methods:
macOS / Linux (bash/zsh):
curl -LsSf https://astral.sh/uv/install.sh | sh
Windows (PowerShell):
irm https://astral.sh/uv/install.ps1 | iex
Alternative (if you already have Python and pip):
pip install uv
Step 2: Fix permissions (if needed):
Sometimes uv needs to create a folder under ~/.local/share/uv/python (macOS/Linux) or %LOCALAPPDATA%\uv\python (Windows).
If this folder was created by another tool (e.g. sudo), you may see an error like:
error: failed to create directory ... Permission denied (os error 13)
To fix it, ensure you own the directory. macOS / Linux (bash/zsh):
mkdir -p ~/.local/share/uv
chown -R "$(id -un)":"$(id -gn)" ~/.local/share/uv
chmod -R u+rwX ~/.local/share/uv
Windows (PowerShell):
# Create directory if it doesn't exist
New-Item -ItemType Directory -Force "$env:LOCALAPPDATA\uv"
# Ensure you (the current user) own it
# (usually not needed, but if permissions are broken)
icacls "$env:LOCALAPPDATA\uv" /grant "$($env:UserName):(OI)(CI)F"
Step 3: After installing, verify:
uv --version
Step 4: Clone the repository:
git clone https://github.com/Shaadalam9/pedestrians-in-youtube.git
cd pedestrians-in-youtube
Step 5: Ensure the correct Python version. If you don't already have Python 3.10.18 installed, let uv fetch it:
uv python install 3.10.18
The repository should contain a .python-version file, so uv will automatically use this version.
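For reference, the .python-version file pins the interpreter as a single line:

3.10.18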
Step 6: Create and sync the virtual environment. This will create .venv in the project folder and install dependencies exactly as locked in uv.lock:
uv sync --frozen
Step 7: Activate the virtual environment:
macOS / Linux (bash/zsh):
source .venv/bin/activate
Windows (PowerShell):
.\.venv\Scripts\Activate.ps1
Windows (cmd.exe):
.\.venv\Scripts\activate.bat
Step 8: Ensure that datasets are present. Place required datasets (including mapping.csv) into the data/ directory:
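As an illustration of a possible layout (file names other than mapping.csv are placeholders and depend on your own YOLO output), the data/ directory could look like:

data/
  mapping.csv
  <CSV files produced by the YOLO detection/tracking step>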
Step 9: Run the code:
python3 analysis.py
Configuration of the project is defined in the config file. Please use default.config as a reference for the required structure. If no custom config file is provided, default.config is used. The config file has the following parameters:
- data: Directory containing data (CSV output from YOLO).
- videos: Directories containing the videos used to generate the data.
- mapping: CSV file that contains mapping data for the cities referenced in the data.
- prediction_mode: Configures YOLO for object detection.
- tracking_mode: Configures YOLO for object tracking.
- always_analyse: Always conduct analysis even when pickle files are present (good for testing).
- display_frame_tracking: Displays the frame tracking during analysis.
- save_annotated_img: Saves the annotated frames produced by YOLO.
- delete_labels: Deletes label files from YOLO output.
- delete_frames: Deletes frames from YOLO output.
- delete_youtube_video: Deletes saved YouTube videos.
- compress_youtube_video: Compresses YouTube videos (using the H.265 codec by default).
- delete_runs_files: Deletes files containing YOLO output after analysis.
- check_missing_mapping: Identifies all missing CSV files.
- min_max_videos: Produces snippets of the fastest and slowest crossing pedestrians.
- track_buffer_sec: Keep tracks for longer (buffer in seconds).
- analysis_level: Specifies the analysis level; supported values include city and country.
- client: Specifies the client type for downloading YouTube videos; accepted values are "WEB", "ANDROID" or "ios".
- model: Specifies the YOLO model to use; supported/tested versions include v8x and v11x.
- boundary_left: Specifies the x-coordinate of one edge of the crossing area used to detect road crossings (normalised between 0 and 1; see the worked example after this list).
- boundary_right: Specifies the x-coordinate of the opposite edge of the crossing area used to detect road crossings (normalised between 0 and 1).
- use_geometry_correction: Specifies the distance threshold for applying geometry correction. If set to 0, geometry correction is skipped.
- population_threshold: Specifies the minimum population a city must have to be included in the analysis.
- footage_threshold: Specifies the minimum amount of footage required for a city to be included in the analysis.
- min_city_population_percentage: Specifies the minimum proportion of a country's population that a city must have to be included in the analysis.
- min_speed: Specifies the minimum speed limit for pedestrian crossings to be included in the analysis.
- max_speed: Specifies the maximum speed limit for pedestrian crossings to be included in the analysis.
- countries_analyse: Lists the countries to be analysed.
- confidence: Sets the confidence threshold parameter for YOLO.
- update_ISO_code: Updates the ISO code of the country in the mapping file during analysis.
- update_pop_country: Updates the country's population in the mapping file during analysis.
- update_gini_value: Updates the GINI value of the country in the mapping file during analysis.
- update_pytubefix: Updates the pytubefix library each time analysis starts.
- font_family: Specifies the font family to be used in outputs.
- font_size: Specifies the font size to be used in outputs.
- plotly_template: Defines the template for Plotly figures.
- logger_level: Level of console output. Can be: debug, info, warning, error.
- sleep_sec: Number of seconds to pause at the end of the loop in main.py.
- git_pull: Pull changes from the git repository at the end of the loop in main.py.
- email_send: Send an email about completion of the job at the end of the loop in main.py. See the following paragraph for the additional parameters in the secret file.
- email_sender: Email address of the "sender" of the email.
- email_recipients: List of email addresses for sending the message.
- max_workers: Specifies the maximum number of segment-processing worker threads (i.e., how many segments can be analysed in parallel). Increasing this increases concurrent segment processing, subject to GPU/CPU and I/O limits.
- download_workers: Specifies the maximum number of concurrent video download/prepare workers. Increasing this allows multiple videos to be downloaded/prepared in parallel (useful when network/FTP is the bottleneck).
- max_active_segments_per_video: Specifies the maximum number of segments from the same video that are allowed to be processed concurrently.
  - If set to 1, the scheduler tends to distribute workers across different videos (e.g., with max_workers=3, it will try to process 3 different videos at once).
  - If set to 2 or more, multiple workers may process segments from the same video simultaneously, which can improve throughput when one video has many segments but reduces "video diversity" across workers.
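As a worked example of the normalised boundary parameters (all numbers below are hypothetical, not defaults), a boundary value is converted back to a pixel column by multiplying it by the frame width:

# Hypothetical illustration of boundary_left / boundary_right
frame_width = 1920                           # example frame width in pixels
boundary_left, boundary_right = 0.25, 0.75   # example config values, normalised to [0, 1]
left_px = int(boundary_left * frame_width)   # 480
right_px = int(boundary_right * frame_width) # 1440
print(f"Crossing area spans pixel columns {left_px}-{right_px}")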
For working with the external APIs of VideoFiles, GeoNames, BEA, TomTom, Trafikab, and Numbeo (paid), the API keys need to be placed in a file called secret (no extension) in the root of the project. The file needs to be formatted as default.secret. The email SMTP server, account, and password also need to be set there. This is optional if you only run the analysis on the dataset. To run the main.py script, at least an empty secret file copied directly from the template is required.
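For example, on macOS/Linux the template can be copied to create the required secret file in the project root:

cp default.secret secret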
Video: https://www.youtube.com/watch?v=_Wyg213IZDI.
Video: https://www.youtube.com/watch?v=0K9vaQxKZ9k.
Video: https://www.youtube.com/watch?v=3jVszt_78_k.
Video: https://www.youtube.com/watch?v=uFG1_JBZUmM.
Video: https://www.youtube.com/watch?v=U0pdQ8eZtHY.
Video: https://www.youtube.com/watch?v=rdx7UFXYSz0.
Locations of cities with footage in the dataset. Note: continents are based on geography, i.e., cities in Russia east of the Ural mountains are shown as Asia.
Locations of cities with footage in the dataset with a density overlay of population. Note: continents are based on geography, i.e., cities in Russia east of the Ural mountains are shown as Asia.
Locations of cities with footage in the dataset with a density overlay of the number of videos in the dataset. Note: continents are based on geography, i.e., cities in Russia east of the Ural mountains are shown as Asia.
Locations of cities with footage in the dataset with a density overlay of the total number of seconds of footage in the dataset. Note: continents are based on geography, i.e., cities in Russia east of the Ural mountains are shown as Asia.
Total time of footage over the number of videos in the dataset at the city level. Note: continents are based on geography, i.e., cities in Russia east of the Ural mountains are shown as Asia.
Total time of footage over the number of videos in the dataset at the country level. Note: continents are based on geography, i.e., cities in Russia east of the Ural mountains are shown as Asia.
Distribution of videos by continent. Note: continents are based on geography, i.e., cities in Russia east of the Ural mountains are shown as Asia.
Distribution of segments (parts of videos included in the dataset) by type of vehicle.
Locations of cities with footage in the dataset. Note: continents are based on geography, i.e., cities in Russia east of the Ural mountains are shown as Asia.
Locations of cities with footage in the dataset with a density map of the total amount of footage considered. Note: continents are based on geography, i.e., cities in Russia east of the Ural mountains are shown as Asia.
Total time of footage over number of detected pedestrians.
Distribution of crossing speed in the dataset.
Crossing decision time (sorted by countries).
Crossing decision time in day (sorted by countries).
Crossing decision time in night (sorted by countries).
Crossing decision time (sorted by average of day and night).
Crossing decision time (sorted by day).
Crossing decision time (sorted by night).
Crossing speed (sorted by countries).
Crossing speed in day (sorted by countries).
Crossing speed in night (sorted by countries).
Crossing speed (sorted by average of day and night).
Crossing speed in day (sorted by average of day and night).
Crossing speed in night (sorted by average of day and night).
Crossing speed over crossing decision time.
Crossing speed over crossing decision time, during daytime.
Crossing speed over crossing decision time, during night time.
Crossing speed over population of city.
Crossing decision time over population of city.
Crossing speed over traffic mortality.
Crossing decision time over traffic mortality.
Crossing speed over literacy rate.
Crossing decision time over literacy rate.
Crossing speed over Gini coefficient.
Crossing decision time over Gini coefficient.
Crossing speed over traffic index.
Crossing decision time over traffic index.
Correlation matrix at daytime.
Correlation matrix at night time.
Correlation matrix for Africa.
Correlation matrix for Oceania.
Correlation matrix for Europe.
Correlation matrix for North America.
Correlation matrix for South America.
Road crossings with traffic signals (normalised over time and number of detected pedestrians).
Road crossings without traffic signals (normalised over time and number of detected pedestrians).
Road crossings with and without traffic signals (normalised over time and number of detected pedestrians).
To add more videos to the mapping file, run python add_video.py. It opens a Flask web form which allows you to add new footage. The form detects whether the city is already present in the dataset and adds new videos to the existing row in the mapping file. Providing the state is optional and is recommended for the USA 🇺🇸 and Canada 🇨🇦. Providing the country is mandatory.
Adding a new video to a city. In this case for Delft, Netherlands 🇳🇱 (with no state provided).
For each video, it is possible to add multiple segments (parts of the video). To add a new segment/video, it is mandatory to provide the following information: Time of day, Vehicle, Start time (seconds) (a counter of the current second is shown under the embedded video), End time (seconds) (it must be larger than the start time), and FPS (to see the FPS of the video, click with the secondary mouse button on the video and go to "Stats for nerds" 🤓; the FPS value is shown after the resolution, e.g., "1920x1080@30"). The form attempts to fetch all other values automatically from various APIs and by analysing the video. All values can be adjusted by hand in the mapping file in case of mistakes or missing information.
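As a small, purely illustrative example (the numbers are hypothetical), the FPS value relates segment boundaries given in seconds to frame indices in the video:

fps = 30                     # from "Stats for nerds", e.g. "1920x1080@30"
start_s, end_s = 12, 45      # hypothetical start and end times of a segment, in seconds
start_frame = start_s * fps  # 360
end_frame = end_s * fps      # 1350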
Each video can contain multiple segments (each new segment starting at the same timestamp as the end of the previous segment, or later). Video-level values (including FPS) do not have to be provided again for each new segment; only the start time, end time, time of day, and vehicle type of each new segment need to be given.
The form detects that there is no entry for Delft, Netherlands in the mapping file yet and allows the first video for that city to be added. The latitude and longitude coordinates are fetched automatically for new cities. They are shown on the embedded map under the video. Dragging the marker adjusts the fetched coordinates.
If the city already exists in the data, the form extends the entry for that city with the new video. In this example, a new video is added to Kyiv, Ukraine 🇺🇦. The values in Start time and End time under the embedded video also indicate that one or multiple segments for this video are already present in the mapping file; in this case, a new segment would be added to the video.
The form accepts the following shortcuts and click events:
- A: pastes the current timestamp of the video into the "Start time (seconds)" field.
- S: pastes the current timestamp of the video into the "End time (seconds)" field.
- D: pastes the value of "Last second" (red value under the embedded video) into the "End time (seconds)" field and sets the "Start time (seconds)" field to 0. Clicking on "Current time" results in the same behaviour.
- Q: selects the "Day" value for the "Time of day" field.
- W: selects the "Night" value for the "Time of day" field.
If you have any questions or suggestions, feel free to reach out to md_shadab_alam@outlook.com or pavlo.bazilinskyy@gmail.com.
This project is licensed under the MIT License - see the LICENSE file for details.


