This repository contains the code, datasets, and tutorial notebook from my YouTube video:
Watch the full tutorial on YouTube: https://youtu.be/mFMenAad5vA
Please like and subscribe to my channel for more step-by-step tutorials. Thank you.
In this project, we take a messy geospatial dataset (with missing coordinates, swapped lat/lon, inconsistent country codes, duplicates, and mixed formats) and clean it step by step in Python. Finally, we build an interactive Folium map with marker clusters and a heatmap.
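As a rough illustration of the kind of cleaning described above, here is a minimal sketch in plain Python. It is not the notebook's code: the column names (`name`, `lat`, `lon`), the sample rows, and the helper functions `parse_coord` and `fix_lat_lon` are all hypothetical, and the notebook itself may handle these cases differently.

```python
import math
from typing import Optional, Tuple


def parse_coord(value) -> Optional[float]:
    """Convert a raw cell to float; return None for blanks or non-numeric text."""
    if value is None:
        return None
    try:
        # Accept decimal commas such as "48,8566" (one of the "mixed formats").
        number = float(str(value).strip().replace(",", "."))
    except ValueError:
        return None
    return number if math.isfinite(number) else None


def fix_lat_lon(raw_lat, raw_lon) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) in valid ranges, un-swapping when that is detectable."""
    lat, lon = parse_coord(raw_lat), parse_coord(raw_lon)
    if lat is None or lon is None:
        return None  # missing coordinate
    if abs(lat) <= 90 and abs(lon) <= 180:
        return lat, lon  # already plausible
    if abs(lon) <= 90 and abs(lat) <= 180:
        return lon, lat  # latitude was out of range, but the swapped pair is valid
    return None  # out of range either way


# Hypothetical sample rows, not taken from geospatial_dirty_locations.csv.
rows = [
    {"name": "Paris", "lat": "48.8566", "lon": "2.3522"},
    {"name": "Sydney", "lat": "151.2093", "lon": "-33.8688"},  # swapped
    {"name": "Unknown", "lat": "", "lon": "10.0"},             # missing latitude
    {"name": "Paris", "lat": "48,8566", "lon": "2,3522"},      # duplicate, decimal commas
]

cleaned = []
seen = set()
for row in rows:
    fixed = fix_lat_lon(row["lat"], row["lon"])
    if fixed is None:
        continue
    # Treat rows with the same name and (rounded) coordinates as duplicates.
    key = (row["name"], round(fixed[0], 6), round(fixed[1], 6))
    if key in seen:
        continue
    seen.add(key)
    cleaned.append({"name": row["name"], "lat": fixed[0], "lon": fixed[1]})

print(cleaned)
```

Note that a range check can only detect a swap when the value in the latitude column falls outside ±90. A swapped pair where both values are within ±90 looks valid and needs other evidence, such as the row's country, to catch.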
- geospatial_dirty_locations.csv: messy input dataset
- country_lookup.csv: helper table to normalize country names and ISO codes
- geospatial_cleaning_tutorial.ipynb: ready-to-run Jupyter Notebook with all cleaning steps
- geospatial_cleaned.csv: cleaned dataset after running the notebook
- geospatial_cleaned_map.html: interactive map export (open in any browser)
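To give a feel for how a helper table can normalize inconsistent country values, here is a small sketch in plain Python. The lookup rows and the column names `variant` and `iso2` are assumptions made for illustration; the real country_lookup.csv may be laid out differently, and the notebook may do this step another way.

```python
from typing import Dict, List, Optional

# Hypothetical lookup rows; the actual country_lookup.csv columns may differ.
LOOKUP_ROWS: List[Dict[str, str]] = [
    {"variant": "United States", "iso2": "US"},
    {"variant": "USA", "iso2": "US"},
    {"variant": "US", "iso2": "US"},
    {"variant": "Germany", "iso2": "DE"},
    {"variant": "Deutschland", "iso2": "DE"},
    {"variant": "DE", "iso2": "DE"},
]


def normalize_key(text: str) -> str:
    """Lowercase, drop periods, and collapse whitespace so variants compare equal."""
    return " ".join(text.casefold().replace(".", "").split())


def build_index(rows: List[Dict[str, str]]) -> Dict[str, str]:
    """Map each normalized variant spelling to its ISO 3166-1 alpha-2 code."""
    return {normalize_key(row["variant"]): row["iso2"] for row in rows}


def to_iso2(raw, index: Dict[str, str]) -> Optional[str]:
    """Return the ISO code for a raw country value, or None if it is unknown."""
    if raw is None:
        return None
    key = normalize_key(str(raw))
    if not key:
        return None
    return index.get(key)


index = build_index(LOOKUP_ROWS)
for raw in ["U.S.A.", "  germany ", "de", "Atlantis", ""]:
    print(repr(raw), "->", to_iso2(raw, index))
```

Returning None for unknown values, instead of guessing, keeps unmatched rows visible so they can be reviewed or added to the lookup table.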
- Clone this repository:
    git clone https://github.com/DataGeekIsMyName/geospatial-data-cleaning-python.git
    cd geospatial-data-cleaning-python
- Install the dependencies:
    pip install pandas numpy folium
    pip install geopandas shapely fiona pyproj
This project is for educational purposes. The datasets are synthetic and for demo use only. You are free to fork, adapt, and use it in your own projects with attribution.
Support My Work
Subscribe on YouTube: DataGeekIsMyName
Buy Me a Coffee: https://buymeacoffee.com/datageekismyname
Donate via PayPal: https://www.paypal.com/donate/?hosted_button_id=ZCL24X55R9C5G
Thank you for your support!