![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

# Exploring Vancouver, British Columbia Open Data

In this notebook we will explore several ways to visualize the City of Vancouver's public art open data set.

Link to dataset: https://opendata.vancouver.ca/explore/dataset/public-art/information/

Contains information licensed under the Open Government Licence – Vancouver. See https://opendata.vancouver.ca/pages/licence/ 


We will begin by downloading the data using the open data portal's publicly available [API](https://en.wikipedia.org/wiki/Application_programming_interface).

Run the cell below using the >| Run button (or press SHIFT + ENTER). 

In [None]:
# Get data from URL
import requests as r
# Parse data
import pandas as pd
import plotly.io as pio

# Get data
print("Downloading data")
link = "https://tinyurl.com/ycjwdfhk"

try:
    API_response = r.get(link)
    # Stop here if the server returned an error status
    API_response.raise_for_status()
    data = API_response.json()
    # Parse data: flatten the nested JSON records into a table
    records = pd.json_normalize(data['records'])
    print("Success!")
    # Append coordinates
    lon = []
    lat = []
    for item in records['fields.geom.coordinates'].to_list():
        # Records without a location hold NaN (a float) instead of a [lon, lat] list
        if not isinstance(item, float):
            lon.append(item[0])
            lat.append(item[1])
        else:
            lon.append(0)
            lat.append(0)

    records['longitude'] = lon
    records['latitude'] = lat
    print("Run the next cell to see the data.")
except Exception as error:
    print("ERROR: Could not download data.")
    print(error)

In [None]:
print("First five rows of the dataset:")
records.head()
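
The dotted column names shown above, such as `fields.status`, come from flattening the nested JSON records. The cell below is a minimal sketch of that step on a small made-up payload; the `sample` dictionary and its values are assumptions for illustration only and are not taken from the real dataset.

```python
import pandas as pd

# Hypothetical payload shaped like the API response (all values are made up)
sample = {
    "records": [
        {"recordid": "a1",
         "fields": {"status": "In place", "type": "Mural",
                    "geom": {"type": "Point", "coordinates": [-123.10, 49.28]}}},
        # This record has no "geom" entry, so its coordinates will be NaN
        {"recordid": "b2",
         "fields": {"status": "Removed", "type": "Sculpture"}},
    ]
}

# Nested dictionaries become columns whose names are joined with dots
flat = pd.json_normalize(sample["records"])
print(sorted(flat.columns))

# Split the [longitude, latitude] lists, using 0 when the location is missing
coords = flat["fields.geom.coordinates"]
flat["longitude"] = [c[0] if isinstance(c, list) else 0 for c in coords]
flat["latitude"] = [c[1] if isinstance(c, list) else 0 for c in coords]
print(flat[["fields.status", "longitude", "latitude"]])
```

Lists such as the coordinate pair are kept whole inside a single column, which is why they are split into `longitude` and `latitude` in a separate step.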

## Visualizing using matplotlib

Let's group the data by `fields.status` and count the number of records in each group.

In [None]:
grouped_by_status = records.groupby("fields.status").size().reset_index(name="Count")
grouped_by_status
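
To see what the grouping step does, here is a small sketch on toy data. The `toy` table and its status values are made up for illustration; `value_counts()` is shown only as an alternative way to get the same counts.

```python
import pandas as pd

# Toy table with a single status column (values are made up)
toy = pd.DataFrame({
    "fields.status": ["In place", "Removed", "In place",
                      "In place", "Removed", "In storage"]
})

# size() counts the rows in each group; reset_index() turns the result
# back into a regular table with a column named "Count"
counts = toy.groupby("fields.status").size().reset_index(name="Count")
print(counts)

# Alternative: value_counts() gives the same numbers as a Series
alt = toy["fields.status"].value_counts().sort_index()
print(alt)
```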

Import the matplotlib module `pyplot` using the alias `plt`.

In [None]:
import matplotlib.pyplot as plt

#### Bar chart

In [None]:
plt.bar(grouped_by_status["fields.status"],grouped_by_status["Count"]);
plt.title("BAR CHART: Status of Art");
plt.xlabel("Status");
plt.ylabel("Count");

#### Pie chart

In [None]:
labels = grouped_by_status["fields.status"]
patches, texts = plt.pie(grouped_by_status["Count"], startangle=90)
plt.legend(patches, labels, loc="best")
plt.title("PIE CHART: Status of Art")
plt.show()

#### Histogram

In [None]:
plt.hist(records['fields.geo_local_area'].dropna());
plt.tick_params('x', labelrotation=90);
plt.title("Histogram: Art Geographical Local Area");
plt.xlabel("Geo Local Area");
plt.ylabel("Count");
plt.show()
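
With text categories, `plt.hist` places the values into a fixed number of bins (10 by default), so when there are more than 10 distinct neighbourhoods some of them may end up sharing a bar. One alternative, sketched below on made-up labels, is to count each category explicitly and draw a bar chart. The `areas` values are assumptions for illustration only.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical neighbourhood labels (made up for illustration)
areas = pd.Series(["Downtown", "Kitsilano", "Downtown", None,
                   "Strathcona", "Downtown", "Kitsilano"])

# Count each category explicitly; value_counts() skips missing values by default
area_counts = areas.value_counts()

# One bar per category, so no two neighbourhoods share a bar
plt.bar(area_counts.index, area_counts.values)
plt.tick_params('x', labelrotation=90)
plt.title("BAR CHART: Art per neighbourhood (toy data)")
plt.xlabel("Neighbourhood")
plt.ylabel("Count")
plt.show()
```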

#### Scatter plot

In [None]:
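# Keep only the rows that have both a type and a year of installation
non_nan = records.dropna(subset=["fields.type", "fields.yearofinstallation"])
plt.scatter(non_nan["fields.type"], pd.to_numeric(non_nan["fields.yearofinstallation"]))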
plt.tick_params('x', labelrotation=90)
plt.title("SCATTER PLOT: Year of Installation against kind of installation");
plt.xlabel("Type of art");
plt.ylabel("Year");
plt.show()

#### All plots in a single plot

In [None]:
# Create figure 2 x 2
fig, axs = plt.subplots(2, 2, figsize=(12, 12))
# Bar chart
axs[0, 0].bar(grouped_by_status["fields.status"],grouped_by_status["Count"]);
axs[0, 0].set_title("BAR CHART: Status of Art");
axs[0, 0].set_xlabel("Status");
axs[0, 0].set_ylabel("Count");
# Histogram
axs[1, 0].hist(records['fields.geo_local_area'].dropna());
axs[1,0].tick_params('x', labelrotation=90)
axs[1, 0].set_title("HISTOGRAM: Neighbourhood where art is located");
axs[1, 0].set_xlabel("Neighbourhood");
axs[1, 0].set_ylabel("Count");
# Pie chart
labels = grouped_by_status["fields.status"]
patches, texts = axs[0, 1].pie(grouped_by_status["Count"], startangle=90)
axs[0, 1].legend(patches, labels, loc="best")
axs[0, 1].set_title("PIE CHART: Status of Art")
# Scatter plot
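# Keep only the rows that have both a type and a year of installation
non_nan = records.dropna(subset=["fields.type", "fields.yearofinstallation"])
axs[1, 1].scatter(non_nan["fields.type"], pd.to_numeric(non_nan["fields.yearofinstallation"]))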
axs[1,1].tick_params('x', labelrotation=90)
axs[1, 1].set_title("SCATTER PLOT: Year of Installation against kind of installation");
axs[1, 1].set_xlabel("Type of art");
axs[1, 1].set_ylabel("Year");

## Visualizing using Plotly

Import Plotly Express and the Plotly I/O module.

In [None]:
import plotly.express as px
import plotly.io as pio

In [None]:
fig = px.histogram(records,x="fields.neighbourhood",title="Histogram, art per neighborhood")
fig.show()
pio.write_html(fig,"art_per_neighbourhood_Histogram.html", auto_open=True)

In [None]:
fig = px.pie(records,"fields.type",title="Pie chart: type of art")
fig.show()
pio.write_html(fig,"type_of_art_piechart.html", auto_open=True)

In [None]:
fig = px.bar(records,'fields.status',title="Bar chart: status of art")

fig.show()

pio.write_html(fig,"status_of_art_barchart.html", auto_open=True)

In [None]:
fig = px.scatter(records, 'fields.neighbourhood', 'fields.type',
                 marginal_y="box", marginal_x="histogram",
                 color="fields.status",
                 title="Scatter plot (main plot) of type of art vs neighbourhood. Histogram (top), box plot (right)")

fig.show()

pio.write_html(fig,"type_art_vs_neighborhood_scatter.html", auto_open=True)

## Cufflinks

In [None]:
#load the "cufflinks" library under the short name "cf"
import cufflinks as cf

#command to display graphics correctly in a Jupyter notebook
cf.go_offline()

#### Exercise

Run the code below to generate visualizations of the kinds of art. 

In [None]:
# Group data by fields type
Type_of_field = records.groupby("fields.type").size().reset_index(name="Count")
Type_of_field

In [None]:
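import os
import plotly

# asFigure=True makes iplot return the figure instead of only displaying it
fig = Type_of_field.iplot(kind='bar',
                          y="Count",
                          x="fields.type",
                          title="Bar chart: Type of art",
                          xTitle="Type of art", yTitle="Count",
                          asFigure=True)

# Make sure the output folder exists before saving the chart
os.makedirs("cufflinks", exist_ok=True)
plotly.offline.plot(fig, filename="cufflinks/example.html")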


In [None]:
Type_of_field.iplot(kind='pie',values="Count",labels="fields.type",
                   title="Pie chart: Type of art")


In [None]:

px.density_heatmap(records,"fields.type",
                   "fields.artists",
                   title="Heatmap of type of art against the number of artists involved")


#### Exercise

Re-run the code above, replacing `fields.type` with either `fields.geo_local_area` or `fields.neighbourhood`. Create three new cells for your code.

## Folium

In [None]:
!pip install folium

In [None]:
import folium
# We want to cluster the markers using the MarkerCluster class from folium plugins
from folium.plugins import MarkerCluster 

# ✏️ Your code here
latitude = 49.2827
longitude = -123.1207

# Initial coordinates 
SC_COORDINATES = [latitude, longitude]

# Create a map using our initial coordinates and OpenStreetMap tiles
map_osm = folium.Map(location=SC_COORDINATES, zoom_start=10, tiles='OpenStreetMap')

# Display the map 
display(map_osm)


In [None]:
#Create marker cluster and add to our map
marker_cluster = MarkerCluster().add_to(map_osm)

# Iterate over each record and add a marker at its latitude and longitude
for _, row in records.iterrows():
    # Skip records without a location (stored as 0, 0 when the data was downloaded)
    if row['latitude'] == 0 and row['longitude'] == 0:
        continue
    # Use folium.Marker, with latitude and longitude specifying the location
    folium.Marker(location=[row['latitude'], row['longitude']],
                  # Show the artwork URL in the popup
                  popup=folium.Popup(row['fields.url'], sticky=True),
                  # Make color/style changes here
                  icon=folium.Icon(color='green', icon='fa-tint', prefix='fa')
                  ).add_to(marker_cluster)

# Show the map
display(map_osm)
map_osm.save('van_art.html')

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)