![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

# Trees in Strathcona County

Strathcona County collects data on all trees that are on public land. We are going to explore this dataset.

## Getting Ready

This section sets up what the rest of the notebook needs. Most of the code cells here are ready to run, so you won't need to modify them. You don't need to understand everything these cells do to complete the challenges. However, feel free to ask mentors about anything that makes you curious.

### Importing Libraries

`▸Run` the cell below to import the required Python libraries.

In [None]:
%pip install -q pyodide_http plotly folium nbformat
import pyodide_http
pyodide_http.patch_all()
import pandas as pd
import plotly.express as px
import folium
from folium.plugins import FastMarkerCluster
print('Setup Complete')

### Importing Data

We'll use a data set provided by Strathcona County on [data.strathcona.ca](https://data.strathcona.ca/Environment/Trees/ig6t-pdus). It contains tree locations and types, updated four times per year.

Alternatively, we can look at [Edible Fruit Trees](https://data.edmonton.ca/Environmental-Services/Edible-Fruit-Trees/h4ti-be2n) in Edmonton from `https://data.edmonton.ca/api/views/eecg-fc54/rows.csv?accessType=DOWNLOAD`.

In [None]:
trees = pd.read_csv('https://opendata.arcgis.com/api/v3/datasets/8a68e1b6c525481bbd2b524616739ee1_0/downloads/data?format=csv&spatialRefId=3776')
#trees = pd.read_csv('https://data.edmonton.ca/api/views/eecg-fc54/rows.csv?accessType=DOWNLOAD')
trees
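
Before analysing the data, it can help to confirm that the columns used later in this notebook (`species`, `latitude`, `longitude`) are present. The sketch below uses a tiny made-up CSV in place of the real download, so its values are illustrative only, and `missing_columns` is a hypothetical helper rather than part of pandas.

```python
import io
import pandas as pd

# A tiny made-up CSV standing in for the real download (values are illustrative only)
sample_csv = io.StringIO(
    "species,latitude,longitude\n"
    "Green Ash,53.52,-113.30\n"
    "White Spruce,53.54,-113.28\n"
)
sample = pd.read_csv(sample_csv)

def missing_columns(df, required):
    """Return the names in `required` that are not columns of `df` (hypothetical helper)."""
    return [name for name in required if name not in df.columns]

required = ['species', 'latitude', 'longitude']
print(sample.shape)                       # (rows, columns)
print(missing_columns(sample, required))  # an empty list means every column is there
```

Calling `missing_columns(trees, required)` on the real dataframe would do the same check for the downloaded data.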

## Analysis

We can now do some analysis of the dataset, such as figuring out which tree types are the most common.

We'll group the data by `species` and use the `.size()` method to count how many trees of each kind there are. The `.sort_values()` method then sorts the result by the `count` column we created, largest first.
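
To see what these methods do before running them on the full dataset, here is a minimal sketch on a made-up table of six trees. The species names are illustrative only.

```python
import pandas as pd

# Made-up species names, for illustration only
toy = pd.DataFrame({'species': ['Spruce', 'Ash', 'Spruce', 'Elm', 'Spruce', 'Ash']})

# groupby + size counts the rows in each group; reset_index turns the result into a DataFrame
toy_counts = toy.groupby('species').size().reset_index(name='count')

# Sort so the most common species comes first
toy_counts = toy_counts.sort_values(by='count', ascending=False)
print(toy_counts)

# value_counts() is a shorter route to the same counts
print(toy['species'].value_counts())
```

The grouped version is used in this notebook because it produces a dataframe with named `species` and `count` columns, which the charts below refer to.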

In [None]:
counts_by_name = trees.groupby('species').size().reset_index(name='count')
counts_by_name.sort_values(by='count', ascending=False, inplace=True)
counts_by_name

You can now see the most common types of trees in Strathcona County. Let's visualize the data with a pie chart.

In [None]:
px.pie(counts_by_name.head(5), values='count', names='species', title='Most Common Trees in Strathcona County')

We can also show the ten most common species as a bar graph.

In [None]:
px.bar(counts_by_name.head(10), x='species', y='count', title='Most Common Trees in Strathcona County')
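
Both charts show only the top few species, so a pie chart of the top five displays each species' share of those five rather than of all trees. One possible way to show shares of the whole dataset is to lump the remaining species into an `Other` row, sketched here on made-up counts.

```python
import pandas as pd

# Made-up counts, already sorted from most to least common
toy_counts = pd.DataFrame({
    'species': ['Spruce', 'Ash', 'Elm', 'Pine', 'Birch'],
    'count': [50, 25, 15, 6, 4],
})

top_n = 3  # how many species to keep as separate slices
top = toy_counts.head(top_n)
other_total = toy_counts['count'].iloc[top_n:].sum()

# Add one 'Other' row so the slices add up to every tree in the table
other_row = pd.DataFrame({'species': ['Other'], 'count': [other_total]})
with_other = pd.concat([top, other_row], ignore_index=True)

# Each row's share of all trees, as a percentage
with_other['percent'] = 100 * with_other['count'] / with_other['count'].sum()
print(with_other)
```

A table built this way from `counts_by_name` could be passed to `px.pie` in place of `counts_by_name.head(5)`.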

## Mapping Data

Since we have a dataframe with `latitude` and `longitude` columns, we will use the Python library called `folium` to visualize our data on a map.

First we will create and display a map. To figure out where the center of the map should be, we'll find the median values from those columns.
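
One reason to prefer the median over the mean for a map centre is that a single bad record barely moves it. The sketch below illustrates this with made-up latitudes and Python's built-in `statistics` module; whether the real dataset contains any such bad records is an assumption, not something checked here.

```python
from statistics import mean, median

# Made-up latitudes: four nearby points and one bad record entered as 0.0
latitudes = [53.50, 53.52, 53.54, 53.56, 0.0]

print(mean(latitudes))    # pulled far south by the single bad value
print(median(latitudes))  # stays among the real points
```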

In [None]:
median_latitude = trees['latitude'].median()
median_longitude = trees['longitude'].median()

tree_map = folium.Map(location=[median_latitude, median_longitude], zoom_start=10)
display(tree_map)

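There are also other map styles that we can try by passing a name as the `tiles` argument of `folium.Map`. Which names are built in depends on the installed folium version, so some of these may not work: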

* `openstreetmap`
* `stamen terrain`
* `stamen toner`
* `stamen watercolor`
* `cartodb positron`
* `cartodb dark_matter`
* `mapbox bright` (Limited zoom levels)
* `mapbox control room` (Limited zoom levels)

We can now add the tree locations to our map.

In the cell below we will add markers using the `FastMarkerCluster` class that we imported from `folium.plugins`. Each marker will be created from the `latitude` and `longitude` coordinates and added to our map called `tree_map`.

The cell may take a while to run. You'll know it's still running if you see a `[*]` by the top left of the cell.
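
`FastMarkerCluster` is given a plain list of `[latitude, longitude]` pairs, which is what `.values.tolist()` produces. The sketch below shows that shape on made-up rows and drops rows with missing coordinates first. Whether the real dataset has any such rows is not checked in this notebook, so treat the `dropna()` step as a precaution.

```python
import pandas as pd

# Made-up rows; the last one has no coordinates
toy = pd.DataFrame({
    'species': ['Spruce', 'Ash', 'Elm'],
    'latitude': [53.52, 53.54, None],
    'longitude': [-113.30, -113.28, None],
})

# Keep only rows that have both coordinates, then convert to a plain list of pairs
points = toy[['latitude', 'longitude']].dropna().values.tolist()
print(points)
```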

In [None]:
tree_map.add_child(FastMarkerCluster(trees[['latitude','longitude']].values.tolist()))
display(tree_map)

We can also create a map with a different marker style and a text popup label using the code below.

In [None]:
callback = ('function (row) {' 
                'var marker = L.marker(new L.LatLng(row[0], row[1]));'
                'var icon = L.AwesomeMarkers.icon({'
                "icon: 'tree',"
                "iconColor: 'green',"
                "markerColor: 'white',"
                "prefix: 'fa'});"
                'marker.setIcon(icon);'
                "var popup = L.popup({maxWidth: '300'});"
                "const display_text = {text: row[2]};"
                "var mytext = $(`<div id='mytext' class='display_text' style='width: 100.0%; height: 100.0%;'> ${display_text.text}</div>`)[0];"
                "popup.setContent(mytext);"
                "marker.bindPopup(popup);"
                'return marker};')

median_latitude = trees['latitude'].median()
median_longitude = trees['longitude'].median()

tree_map = folium.Map(location=[median_latitude, median_longitude], zoom_start=10)
tree_map.add_child(FastMarkerCluster(trees[['latitude','longitude','species']].values.tolist(), callback=callback))
display(tree_map)
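
The callback reads each row by position: `row[0]` is the latitude, `row[1]` the longitude, and `row[2]` the popup text, so the order of the selected columns matters. The sketch below, on made-up rows, prepares a `label` column (a hypothetical name) that fills in missing species names so every popup has some text.

```python
import pandas as pd

# Made-up rows; the second tree has no species recorded
toy = pd.DataFrame({
    'latitude': [53.52, 53.54],
    'longitude': [-113.30, -113.28],
    'species': ['Spruce', None],
})

# Replace missing names so every popup has some text
toy['label'] = toy['species'].fillna('Unknown species')

# Column order matters: the callback reads row[0], row[1], row[2] by position
rows = toy[['latitude', 'longitude', 'label']].values.tolist()
print(rows)
```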

You can now continue your own analysis in the [next notebook](trees-challenge.ipynb).

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)