# Data Circle Notebook 4

This notebook introduces graphs you can make with libraries other than the ones we have used so far. Specifically, it walks through graph examples from **seaborn**.

In [None]:
import pandas as pd

In [None]:
df = pd.read_csv("PittsTrees.csv")

## First start by importing seaborn

The website with the functions is available here:
https://seaborn.pydata.org/generated/seaborn.objects.Plot.html

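If importing it using the line below gives you trouble, seaborn may not be installed yet. Installing it first (for example with `pip install seaborn`) usually fixes that.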

In [None]:
import seaborn as sns 

## Step 1: Decide which columns you would like to use!

In [None]:
df.columns

## Step 2: Make a mini data frame

In [None]:
quality_trees = df[['common_name', 'growth_space_type', 'air_quality_benfits_total_dollar_value']]
quality_trees

## Step 3: If needed, make more changes to your data frame
For my graphs, I want to look at the distributions of air quality benefits for the top 8 most planted trees.

To do this, I first need to get the list of the top 8 most planted trees (tree_types) and then keep only those 8 trees in my new data frame.

In [None]:
tree_types = df[['common_name', 'id']].groupby('common_name').count().sort_values('id', ascending = False).iloc[0:8].reset_index().iloc[:,0]
tree_types
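As a side note, here is a shorter way to get a "top N" list, sketched on a tiny made-up table (the rows below are invented for illustration and are not from PittsTrees.csv). It uses pandas' `value_counts`, which counts the rows for each name directly, so it can differ slightly from the cell above if some `id` values are missing.

```python
import pandas as pd

# Toy stand-in for the tree data (made-up rows, not from PittsTrees.csv)
toy = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "common_name": ["Maple", "Oak", "Maple", "Pear", "Maple", "Oak"],
})

# value_counts() counts the rows for each name and sorts from most to least common;
# head(2) keeps the top 2, and .index holds the names themselves
top_two = toy["common_name"].value_counts().head(2).index.tolist()
print(top_two)  # ['Maple', 'Oak']
```

On the real data you would swap `toy` for `df` and `head(2)` for `head(8)`.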

In [None]:
quality_trees = quality_trees[quality_trees['common_name'].isin(tree_types)].dropna()
quality_trees
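If you want to see what each part of that filtering line does, this small sketch runs the same two steps (`isin`, then `dropna`) one at a time on made-up rows. The column name `benefit` and the list `keep` are placeholders for this example only.

```python
import pandas as pd

# Made-up rows; None marks a missing benefit value
toy = pd.DataFrame({
    "common_name": ["Maple", "Oak", "Pear", "Maple"],
    "benefit": [1.5, None, 2.0, 3.0],
})
keep = ["Maple", "Oak"]  # placeholder for a list like tree_types

# Step 1: keep only rows whose name is in the list (drops the Pear row)
filtered = toy[toy["common_name"].isin(keep)]

# Step 2: drop rows with any missing value (drops the Oak row)
cleaned = filtered.dropna()

print(len(toy), len(filtered), len(cleaned))  # 4 3 2
```

Printing the row counts before and after is a quick way to check how much data each step removes.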

## Step 4: Graph time!!
Here we are using the **seaborn violin plot**, which lets us compare the distribution of a numeric variable across several categories on the same graph. (https://seaborn.pydata.org/generated/seaborn.violinplot.html)

My graph could be used to see which of the top 8 trees tend to provide higher air quality benefits, and how spread out those benefits are for each tree. This information could be helpful in deciding which trees to plant more of. 

In [None]:
sns.violinplot(data = quality_trees, x = 'air_quality_benfits_total_dollar_value', y = 'common_name')
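A violin plot is easier to read with a few numbers next to it. This sketch, on made-up values with a placeholder column called `benefit`, shows one way you might get the quartiles for each tree type using `groupby` and `describe`.

```python
import pandas as pd

# Made-up values for illustration only
toy = pd.DataFrame({
    "common_name": ["Maple", "Maple", "Maple", "Oak", "Oak", "Oak"],
    "benefit": [1.0, 2.0, 3.0, 2.0, 4.0, 9.0],
})

# describe() gives count, mean, std, min, quartiles, and max per group;
# here we keep just the three quartile columns
summary = toy.groupby("common_name")["benefit"].describe()[["25%", "50%", "75%"]]
print(summary)
```

The `50%` column is the median, which roughly lines up with the middle marker that seaborn draws inside each violin by default.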

# Another graph example
Here is another useful seaborn plot: a **heatmap** (https://seaborn.pydata.org/generated/seaborn.heatmap.html)

- It is useful for showing how a numeric value changes across combinations of two categorical variables. 

I am using a heatmap in this example to look at the total air quality benefits for each combination of tree type and its growth space type. This could help show which combinations provide the highest air quality benefits. 

To start, I find the 5 most common growth space types and then filter my data frame down to those.

In [None]:
space_types = df[['growth_space_type', 'id']].groupby('growth_space_type').count().sort_values('id', ascending = False).iloc[0:5].reset_index().iloc[:,0]
space_types

In [None]:
quality_trees = quality_trees[quality_trees['growth_space_type'].isin(space_types)]
quality_trees

For heatmaps, you usually need to use groupby and pivot to reshape your data frame into the wide format (one row per index value, one column per category) that the heatmap plot function expects.

In [None]:
heatmap_trees = quality_trees.groupby(['common_name', 'growth_space_type']).sum().reset_index()
heatmap_trees

In [None]:
heatmap_trees = heatmap_trees.pivot(values = 'air_quality_benfits_total_dollar_value', 
                                    columns = 'common_name', index = 'growth_space_type')
heatmap_trees
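The groupby and pivot steps above can also be done in one call with pandas' `pivot_table`. This is only a sketch on made-up rows, with `benefit` standing in for the real dollar-value column.

```python
import pandas as pd

# Made-up rows; "Maple" in a "Tree Lawn" appears twice so there is something to sum
toy = pd.DataFrame({
    "common_name": ["Maple", "Maple", "Oak", "Oak", "Maple"],
    "growth_space_type": ["Tree Lawn", "Open", "Tree Lawn", "Open", "Tree Lawn"],
    "benefit": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# pivot_table groups by the index/columns pair and aggregates in one step
wide = toy.pivot_table(
    values="benefit",
    index="growth_space_type",
    columns="common_name",
    aggfunc="sum",
)
print(wide)
```

Either approach gives the same shape of table, so use whichever one is easier for you to read.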

## Graph time pt. 2 yay!!
Notice how a heatmap is just a color representation of what is shown in the table above.

An example of a conclusion you could make from this graph is that the 'Open or Unrestricted' space type shows the widest range of air quality benefits across the tree types. You could also say that we might want to plant more London Planetrees in tree lawns or parkways, because that combination produces the highest overall air quality benefits. 

In [None]:
sns.heatmap(heatmap_trees)
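If the colors alone are hard to compare, you could try printing the number inside each cell. This sketch uses a small made-up table (not the real tree data) with `sns.heatmap`'s `annot`, `fmt`, and `cmap` options; the color map choice is just an example.

```python
import pandas as pd
import seaborn as sns

# Small made-up table already in the wide format a heatmap expects
toy_wide = pd.DataFrame(
    {"Maple": [2.0, 6.0], "Oak": [4.0, 3.0]},
    index=["Open", "Tree Lawn"],
)

# annot=True writes each value in its cell; fmt=".0f" rounds to whole numbers
ax = sns.heatmap(toy_wide, annot=True, fmt=".0f", cmap="viridis")

# sns.heatmap returns a Matplotlib Axes, so you can label it like any other plot
ax.set_xlabel("common_name")
ax.set_ylabel("growth_space_type")
```

On the real data you would pass `heatmap_trees` in place of `toy_wide`.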

## Now try it with your own dataset!
Check out more of seaborn's cool graphs here: https://seaborn.pydata.org/generated/seaborn.objects.Plot.html

If you are really loving graphs, this is just the beginning. There are many more libraries with more graph functions, such as **Matplotlib**, **Plotly**, and **GeoPandas**. We probably won't go into those unless you ask specifically for your project!