![DataDunkers.ca Banner](https://github.com/Data-Dunkers/lessons/blob/main/images/top-banner.jpg?raw=true)

# Open Data Introduction

Many governments and organizations publish [Open Data](https://en.wikipedia.org/wiki/Open_data) that is free to use. For example, the [Edmonton Open Data Portal](https://data.edmonton.ca) includes many interesting datasets such as:

|Description|Link|
|-|-|
|E-Scooter Locations|https://data.edmonton.ca/api/views/vq44-ni9f/rows.csv?accessType=DOWNLOAD|
|Bike Counters|https://data.edmonton.ca/api/views/tq23-qn4m/rows.csv?accessType=DOWNLOAD|
|Public Charging Stations for Electric Vehicles|https://data.edmonton.ca/api/views/xzhy-xe8z/rows.csv?accessType=DOWNLOAD|
|Soccer Fields|https://data.edmonton.ca/api/views/6avx-8i8e/rows.csv?accessType=DOWNLOAD|
|Public Washroom Locations|https://data.edmonton.ca/api/views/fw8s-c5qn/rows.csv?accessType=DOWNLOAD|
|Edible Fruit Tree Locations|https://data.edmonton.ca/api/views/eecg-fc54/rows.csv?accessType=DOWNLOAD|
|Fire Hydrant Locations|https://data.edmonton.ca/api/views/x4n2-2ke2/rows.csv?accessType=DOWNLOAD|
|Spray Park Locations|https://data.edmonton.ca/api/views/jyra-si4k/rows.csv?accessType=DOWNLOAD|
|Open City Wi-Fi Locations|https://data.edmonton.ca/api/views/vbxz-36ag/rows.csv?accessType=DOWNLOAD|

## Import Code Libraries and Data

Run the cell below to import the required Python libraries and a dataset.

In [None]:
link = 'https://data.edmonton.ca/api/views/vq44-ni9f/rows.csv?accessType=DOWNLOAD'

import pandas as pd
import plotly.express as px
import folium
from folium.plugins import FastMarkerCluster
data = pd.read_csv(link)
data
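
The links in the table above all follow the same pattern, with only the dataset ID changing. As a small sketch, a helper like the one below could build the link for any of them. The function name `csv_link` is our own invention and not part of any library, and the pattern is only assumed to hold for this portal.

```python
# Hypothetical helper: builds a CSV download link from a dataset ID,
# following the pattern of the links in the table above.
def csv_link(dataset_id, portal='https://data.edmonton.ca'):
    return f'{portal}/api/views/{dataset_id}/rows.csv?accessType=DOWNLOAD'

# 'tq23-qn4m' is the ID that appears in the Bike Counters link.
bike_link = csv_link('tq23-qn4m')
print(bike_link)
```

Passing the resulting link to `pd.read_csv` would load that dataset instead.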

## Exploring Data

Let's see which columns we have.

In [None]:
data.columns

We can also take a look at the unique values in one column.

In [None]:
column = 'Vendor'

data[column].unique()
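
To see how often each value appears, and not just which values exist, `value_counts()` can help. Here is a minimal sketch on a small made-up table. The vendor names and vehicle types are placeholders, not values taken from the real dataset.

```python
import pandas as pd

# Made-up sample rows; the real dataset has many more rows and columns.
sample = pd.DataFrame({
    'Vendor': ['Vendor A', 'Vendor B', 'Vendor A', 'Vendor A'],
    'Vehicle Type': ['e-scooter', 'e-bike', 'e-scooter', 'e-bike'],
})

column = 'Vendor'
counts = sample[column].value_counts()  # number of rows per unique value, largest first
print(counts)
```

The same call should work on the real data with `data[column].value_counts()`.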

## Visualizing Data

Numerical data can be visualized with scatter plots (`px.scatter`) or bar charts (`px.bar`). We can also create [histograms](https://en.wikipedia.org/wiki/Histogram) with `px.histogram` to count how many values fall into each group or range.

In [None]:
px.scatter(data, x='Current Battery Fuel Level', y='Current Range in Meters', color='Vehicle Type', title='Current Range vs. Battery Level')

In [None]:
px.histogram(data, x='Current Range in Meters', title='Current Range Frequencies')
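
A histogram groups values into ranges (called bins) and counts how many values land in each one. The sketch below does that counting by hand with pandas so the idea is visible as numbers. The range values and the bin edges are made up for illustration.

```python
import pandas as pd

# Made-up range values in meters; the real column has many more entries.
ranges = pd.Series([1200, 4800, 5200, 9900, 14000, 300], name='Current Range in Meters')

# Bin edges chosen for illustration: 0-5000, 5000-10000, 10000-15000.
bins = [0, 5000, 10000, 15000]
binned = pd.cut(ranges, bins=bins)               # assign each value to a bin
counts = binned.value_counts().sort_index()      # count per bin, in bin order
print(counts)
```

Each count corresponds to the height of one bar in a histogram drawn with the same bins.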

If the dataset has latitude and longitude data, we can create a map using [Folium](https://python-visualization.github.io/folium/latest/user_guide.html).

In [None]:
m = folium.Map(location=[data['Latitude'].median(), data['Longitude'].median()], zoom_start=12)
FastMarkerCluster(data=list(zip(data['Latitude'], data['Longitude']))).add_to(m)
display(m)
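
Rows with missing coordinates cannot be placed on a map and may cause problems when drawing markers, so it can be worth dropping them first. This is a sketch with made-up coordinates; whether the real dataset has missing values is something to check rather than assume.

```python
import pandas as pd

# Made-up rows; None stands for a missing coordinate.
sample = pd.DataFrame({
    'Latitude': [53.5461, None, 53.5232],
    'Longitude': [-113.4938, -113.5000, None],
})

# Keep only rows where both coordinates are present.
clean = sample.dropna(subset=['Latitude', 'Longitude'])
points = list(zip(clean['Latitude'], clean['Longitude']))
print(points)
```

A list like `points` is the same shape of input that `FastMarkerCluster` receives in the cell above.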

We can also filter a dataset, for example to include just e-bikes.

In [None]:
filtered_data = data[data['Vehicle Type'] == 'e-bike']
filtered_data
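
Filters can also be combined. As a sketch on a small made-up table, the example below keeps only e-bikes with more than 5000 meters of range. The rows and the 5000 meter cutoff are invented for illustration.

```python
import pandas as pd

# Made-up sample rows using two column names from the dataset.
sample = pd.DataFrame({
    'Vehicle Type': ['e-bike', 'e-scooter', 'e-bike', 'e-scooter'],
    'Current Range in Meters': [12000, 8000, 3000, 15000],
})

# Each condition goes in parentheses; & means "and", | means "or".
long_range_bikes = sample[(sample['Vehicle Type'] == 'e-bike') &
                          (sample['Current Range in Meters'] > 5000)]
print(long_range_bikes)
```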

Then we can create a map from `filtered_data`.

There are also other map styles (called `tiles`) that we can try. Which names work depends on the installed Folium version:

* `openstreetmap`
* `cartodb positron`
* `cartodb dark_matter`

Older Folium versions also accepted `stamen terrain`, `stamen toner`, `stamen watercolor`, `mapbox bright`, and `mapbox control room` (the last two with limited zoom levels). Recent versions may no longer include these as built-in styles, so the example below uses `cartodb positron`.

In [None]:
m = folium.Map(location=[data['Latitude'].median(), data['Longitude'].median()], zoom_start=12, tiles='cartodb positron')
FastMarkerCluster(data=list(zip(filtered_data['Latitude'], filtered_data['Longitude']))).add_to(m)
display(m)

Check out the [next notebook](open-data-challenge.ipynb) to continue your own analysis.

[![Data Dunkers License](https://github.com/Data-Dunkers/lessons/blob/main/images/bottom-banner.jpg?raw=true)](https://github.com/Data-Dunkers/lessons/blob/main/LICENSE.md)