CORE Skills Data Science Springboard - Day 13 - Geospatial Data and Pitching for Success
The aim of today's session is to introduce the basics of working with spatial data in a data science context. We'll see how simple extensions to Pandas let you incorporate data with spatial attributes, cover the different types of spatial data, and show how these can be used in a feature extraction pipeline for machine learning.
You should aim to understand some of the basic spatial data gotchas (looking at you, projections), and have an overview of the basic parts of the PyData stack that allow you to work with spatial data.
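To get a feel for why projections are a gotcha, here is a minimal sketch using only the Python standard library. It assumes a spherical Earth with a mean radius of 6371 km, and the helper names (`haversine_km`, `one_degree_of_longitude_km`) are made up for this illustration rather than taken from any package. It measures how far apart two points are on the ground when they differ by exactly one degree of longitude, at a few different latitudes.

```python
import math

# Assumption: treat the Earth as a sphere with its mean radius.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def one_degree_of_longitude_km(lat):
    """Ground distance covered by one degree of longitude at the given latitude."""
    return haversine_km(0.0, lat, 1.0, lat)


# Hypothetical sample latitudes, chosen only to show the trend.
for lat in (0, 30, 60):
    km = one_degree_of_longitude_km(lat)
    print(f"1 degree of longitude at latitude {lat:>2}: {km:.1f} km")
```

One degree of longitude is roughly 111 km at the equator but only about half that at 60 degrees latitude, so treating raw longitude/latitude values as if they were planar x/y coordinates will distort distances and areas more and more as you move away from the equator.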
Pre-session Reading & Resources
If you read nothing else, take a look at this beautiful walk-through of urban planning via OpenStreetMap using Python: https://geoffboeing.com/2017/04/urban-form-analysis-openstreetmap/
If you don't already have access to a commercial package like ArcGIS or MapInfo through your work and want a good free GIS package, then QGIS is definitely worth a look: https://www.qgis.org. It will do most of what the big commercial packages do and is really useful as a geospatial data viewer.
For an interactive approach to the warping that map projections do, take a look at https://thetruesize.com
For those of you interested in R, the textbook Urban Analytics has some great examples of problem-driven geospatial processing to look through. You can get the source code here: https://github.com/alexsingleton/urban_analytics and a copy of the textbook ($) here: https://au.sagepub.com/en-gb/oce/urban-analytics/book249267
For a massive list of other geospatial resources, check out Awesome Geospatial: https://github.com/sacridini/Awesome-Geospatial