## Welcome to the OSM data analysis workshop!

In this workshop, we will learn some techniques to analyze and visualize OpenStreetMap data. 

There are many ways to go about this. We will use **Jupyter Notebooks**, an interactive data analysis and visualization environment that uses **Python**. Why?

1. It's interactive: you see results (and mistakes) right away.
2. It's popular: there is a lot of help out there, and many tools to make your life easier.
3. It's free: everything is open source.



At the end of this workshop, we will be able to make something like this:
![chart](https://images.rtijn.org/2024/mappingusa-workshop/output.png)

The aim of this workshop is not to teach you Python or Jupyter. There are excellent tutorials that cover both.

In particular, I recommend the [online book about "Spatial Python"](https://pygis.io/docs/a_intro.html) at PyGis.io.

## Agenda

1. Getting OSM data via Overpass
2. Manipulating data with Pandas
3. Street network data with osmnx

## Part 1. OSM data from Overpass 
For the first part, we will work with some data from the *Overpass API*. Overpass lets you query data from OSM to suit a lot of needs, including ours. We use the convenient [Overpass Turbo](https://overpass-turbo.eu/#) interface, which includes some special tricks that make our lives easier.

Let's start with the simplest Overpass query:

```
node(1);
out;
```
[execute it here](https://overpass-turbo.eu/s/1G7f)

This simply fetches the first node (point) ever in OpenStreetMap, the one with an `id` of 1.
(Fun fact: this node was not originally created at its current location, and it did not always represent a communications tower. Do not ever assume that objects in OSM are stable.)

![node 1 on the map](https://images.rtijn.org/2024/mappingusa-workshop/node1.png)

This is not very useful yet. Let's add a little flavor.

```
node[shop]({{bbox}});
out;
```
[execute it here](https://overpass-turbo.eu/s/1G7j)


Now, we're fetching all the nodes that represent shops within the current map view on Overpass Turbo. So, the results you get will depend on the map view.

![overpass map view](https://images.rtijn.org/2024/mappingusa-workshop/overpass-1.png)

This looks nice, but the data is not useful for analysis in its current form. It's just dots on a map.

A button in the top right of the Overpass Turbo interface lets you switch between a map view and a data view. Let's have a look:

![overpass data view](https://images.rtijn.org/2024/mappingusa-workshop/overpass-2.png)

Better! But this data is still pretty hard to work with. As aspiring data scientists, what we want is **columnar data**: think spreadsheets.

Overpass will output CSV data if you tell it to in the first line:

```
[out:csv("shop","name",::version,::timestamp,::lon,::lat)];
node[shop]({{bbox}});
out;
```
[execute it here](https://overpass-turbo.eu/s/1G7l)

![overpass csv output](https://images.rtijn.org/2024/mappingusa-workshop/overpass-3.png)
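
One thing to keep in mind: `{{bbox}}` is one of the Overpass Turbo shortcuts, which Turbo replaces with the current map view before sending the query. If you want to run the same query from Python, you have to supply the bounding box yourself, in the order south, west, north, east. Below is a minimal sketch of how that could look. The helper names `build_shop_query` and `fetch_csv` are made up for this example, the coordinates are arbitrary placeholders, and the server URL is the public Overpass instance (any other Overpass server should work as well). The actual request is left commented out.

```python
import urllib.parse
import urllib.request

# Public Overpass API instance (an assumption; other Overpass servers exist).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_shop_query(south, west, north, east):
    """Return the CSV shop query with an explicit bounding box.

    Overpass expects the box as (south, west, north, east).
    """
    return (
        '[out:csv("shop","name",::version,::timestamp,::lon,::lat)];\n'
        f"node[shop]({south},{west},{north},{east});\n"
        "out;"
    )


def fetch_csv(query, url=OVERPASS_URL):
    """POST the query to Overpass and return the response body as text."""
    payload = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(url, data=payload, timeout=60) as response:
        return response.read().decode("utf-8")


# Example bounding box; these coordinates are arbitrary placeholders.
query = build_shop_query(40.75, -111.9, 40.77, -111.87)
print(query)

# csv_text = fetch_csv(query)  # uncomment to actually send the request
```

Please be gentle with the public servers: keep the bounding box small while experimenting.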

It's not much, but we have some data we can work with! Let's move on.

## Part 2. Data processing with Pandas

In the first notebook, we ended up with a small amount of CSV data to work with. Let's see what we can do with it.

We will use `Pandas`, a Python module that specializes in reading, manipulating and analyzing large datasets.

Let's import it into our environment.

```python
import pandas as pd
```

Cool.

For the exercise, I saved the output in `data/shops.csv`. We will load it into a `DataFrame`. A DataFrame holds the data you want to work with, and you can perform all kinds of operations on it.

```python
df_shops = pd.read_csv('data/shops.csv', delimiter='\t')

df_shops.head()
```

Very nice. 

In this workshop, I will gloss over some of the details, like the `delimiter` parameter in the command above. We're on a mission to see some results, and there are lots of resources to learn Pandas in depth.
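
If you are curious about that `delimiter` parameter anyway: Overpass separates the fields in its CSV output with tabs by default, while pandas assumes commas unless told otherwise. The sketch below shows the difference on a tiny sample. The rows are made up for illustration and only mimic the shape of the real output.

```python
import io

import pandas as pd

# Made-up sample in the same shape as the Overpass CSV output (tab-separated).
sample = (
    "shop\tname\t@version\t@timestamp\t@lon\t@lat\n"
    "bakery\tExample Bakery\t3\t2023-05-01T12:00:00Z\t-111.89\t40.76\n"
    "books\tExample Books\t1\t2021-11-15T08:30:00Z\t-111.88\t40.75\n"
)

# With the default comma separator, each row lands in a single column.
df_wrong = pd.read_csv(io.StringIO(sample))
print(df_wrong.shape)

# Telling pandas about the tab separator gives one column per field.
df_right = pd.read_csv(io.StringIO(sample), delimiter="\t")
print(df_right.shape)
print(list(df_right.columns))
```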

This dataset is small and quite boring, but we are working with real OSM data!
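
Even a small table like this supports the usual pandas operations. Here is a minimal sketch of two common ones, counting values and filtering rows. It uses a handful of made-up rows (`df_example` is not the workshop data) so that it runs on its own.

```python
import pandas as pd

# Made-up rows standing in for the real shops data.
df_example = pd.DataFrame(
    {
        "shop": ["bakery", "books", "bakery", "supermarket"],
        "name": ["Example Bakery", "Example Books", "Other Bakery", "Example Market"],
        "@version": [3, 1, 2, 7],
    }
)

# How many shops of each type are there?
shop_counts = df_example["shop"].value_counts()
print(shop_counts)

# Which shops have a version number above 2?
higher_versions = df_example[df_example["@version"] > 2]
print(higher_versions[["shop", "name", "@version"]])
```

The same two lines should work on `df_shops` once it is loaded, since it has the same `shop` and `@version` columns.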

To give you a flavor of what's possible, with a few lines of code, we can create a map of this data in our environment. Don't worry if you don't understand the code. We will start working with more interesting data in the next notebook.

```python
# folium is a mapping library
import folium

# first, we create the map object and define its center as the mean of the coordinates
# we choose a zoom level of 12
m = folium.Map(location=[df_shops['@lat'].mean(), df_shops['@lon'].mean()], zoom_start=12)

# we add a marker for each shop, with the name as a label
for lat, lon, name in zip(df_shops['@lat'], df_shops['@lon'], df_shops['name']):
    popup = folium.Popup(name, parse_html=True)
    folium.Marker(
        location=[lat, lon],
        popup=popup,
        icon=folium.Icon(color='blue', icon='shopping-cart', prefix='fa')
    ).add_to(m)

# display the map inline
m
```
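
One possible refinement: shops without a `name` tag come through as missing values (`NaN`) in the `name` column, so their markers may end up without a useful popup. A minimal sketch of one way to handle that is to fall back to the shop type. The rows below are made up, and the `label` column is a name chosen for this example.

```python
import pandas as pd

# Made-up rows; the second shop has no name tag.
df_example = pd.DataFrame(
    {
        "shop": ["bakery", "books"],
        "name": ["Example Bakery", None],
        "@lat": [40.76, 40.75],
        "@lon": [-111.89, -111.88],
    }
)

# Use the name where there is one, otherwise fall back to the shop type.
df_example["label"] = df_example["name"].fillna(df_example["shop"])
print(df_example[["name", "label"]])
```

With the real data, a `label` column built this way could take the place of `df_shops['name']` in the marker loop.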