# Dominant Tree Genera in Vancouver's Neighbourhoods: A Study of Popularity, Characteristics, and Their Location

@author: Dhruvin Modi

August 11, 2024

## Foreword

This report visualizes the three most popular tree genera in each neighbourhood of Vancouver, together with their characteristics. It uses the `Vancouver Street Trees` dataset, which is available [here](https://opendata.vancouver.ca/explore/dataset/street-trees/information/?disjunctive.species_name&disjunctive.common_name&disjunctive.on_street&disjunctive.neighbourhood_name). I use this dataset to build user-friendly, interactive visualizations with `Altair` that offer an in-depth understanding of the most popular tree genera found in Vancouver's neighbourhoods.

## Introduction to Research

## Question(s) of interest

This visualization was prepared for a private tree management firm in Vancouver to guide them through the most popular tree genera in the city. The firm had specific questions, driven by its business interests, that needed to be analyzed and visualized.

The questions of interest below are answered through this visualization to offer more insight into these groups of tree genera.

1. What are the top three most popular tree genera in each neighbourhood of Vancouver, and is there diversity in the planting of these genera?
2. How do these popular tree genera in Vancouver compare in terms of diameter, height, and overall size, and do certain tree genera consistently have larger or smaller sizes?
3. Which planting areas are popular for cultivating large and small tree genera? Are tree genera predominantly concentrated in specific planting areas, or are they evenly distributed across different plant area sizes?
4. What are the typical sizes of popular tree genera that thrive in close proximity to urban infrastructure?

## Introduction to Data

To visualize the top three tree genera across Vancouver's neighbourhoods, it is vital to work from an official dataset that contains the right information. For this purpose, we use a dataset from `The City of Vancouver`, available at this [link](https://raw.githubusercontent.com/UBC-MDS/data_viz_wrangled/main/data/Trees_data_sets/small_unique_vancouver.csv). It is stored in `.csv` format, which is easy to read with the `Pandas` library. This dataset is a subset of a larger public dataset officially published by `The City of Vancouver`, which can be found [here](https://opendata.vancouver.ca/explore/dataset/street-trees/information/?disjunctive.species_name&disjunctive.common_name&disjunctive.height_range_id&disjunctive.on_street&disjunctive.neighbourhood_name). The dataset is used under the terms of the `Open Government Licence`, which can be viewed [here](https://opendata.vancouver.ca/pages/licence/). Moreover, I acknowledge that this dataset is published under the copyright of `2024 City of Vancouver`.

## Columns of Interest

Within the trees dataset from `The City of Vancouver`, a few key columns are crucial for plotting our primary visualizations: neighbourhood_name, diameter, height_range_id, genus_name, plant_area, latitude, and longitude. Our main visualizations for the four research questions above rely heavily on these columns. Furthermore, columns such as curb, street_side_name, and plant_area drive the interactive plots and their widgets. Detailed information about the data in each column of interest is provided in the Analysis section.

## Introduction to Data Processing

In this report the data has been processed with a focus on our questions of interest. First, the data is grouped by neighbourhood_name and genus_name and filtered to the top three genus_name values for every neighbourhood_name, since this is the main theme underlying our visualizations. Because we are also interested in the plant_area column, rows with Null/NaN values in it are filtered out. Most of the remaining analysis aggregates and groups the data according to each question of interest, so that the resulting visualizations are as easy to understand as possible.
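The processing steps described above can be sketched on a tiny, made-up table. The toy values and the helper name `top_three_per_neighbourhood` are assumptions for illustration only; the column names follow the dataset. Ties in the counts are broken by sort order here, which may differ from other approaches.

```python
import pandas as pd


def top_three_per_neighbourhood(df):
    """Keep rows whose genus is among the three most frequent in its neighbourhood."""
    # Count trees per (neighbourhood, genus) pair.
    counts = (
        df.groupby(["neighbourhood_name", "genus_name"])
        .size()
        .reset_index(name="count")
    )
    # Sort so the largest counts come first, then keep three rows per neighbourhood.
    top3 = (
        counts.sort_values(["neighbourhood_name", "count"], ascending=[True, False])
        .groupby("neighbourhood_name")
        .head(3)
    )
    # An inner merge keeps only the trees that belong to a top-three genus.
    kept = df.merge(
        top3[["neighbourhood_name", "genus_name"]],
        on=["neighbourhood_name", "genus_name"],
    )
    # Drop rows with a missing plant_area, as described in the text.
    return kept.dropna(subset=["plant_area"])


# Hypothetical toy data: neighbourhood "A" has four genera, "B" has two.
toy = pd.DataFrame({
    "neighbourhood_name": ["A"] * 10 + ["B"] * 3,
    "genus_name": ["ACER"] * 4 + ["PRUNUS"] * 3 + ["MALUS"] * 2 + ["TILIA"]
                  + ["PRUNUS"] * 2 + ["ACER"],
    "plant_area": ["10"] * 12 + [None],
})

result = top_three_per_neighbourhood(toy)
print(result.groupby("neighbourhood_name")["genus_name"].unique())
```

In this toy table, TILIA is the fourth most common genus in neighbourhood "A" and is removed, and the single row with a missing plant_area is dropped afterwards.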

## Analysis

Let's begin by importing the packages required for our analysis.

In [1]:
import pandas as pd
import altair as alt
from vega_datasets import data
alt.data_transformers.enable("default", max_rows=None)
import json

In [2]:
trees_url = 'https://raw.githubusercontent.com/UBC-MDS/data_viz_wrangled/main/data/Trees_data_sets/small_unique_vancouver.csv'

treesdf = pd.read_csv(trees_url, parse_dates=['date_planted'])
treesdf.head()

Unnamed: 0.1,Unnamed: 0,std_street,on_street,species_name,neighbourhood_name,date_planted,diameter,street_side_name,genus_name,assigned,...,plant_area,curb,tree_id,common_name,height_range_id,on_street_block,cultivar_name,root_barrier,latitude,longitude
0,10747,W 20TH AV,W 20TH AV,PLATANOIDES,Riley Park,2000-02-23,28.5,EVEN,ACER,N,...,15,Y,21421,NORWAY MAPLE,4,0,,N,49.252711,-123.106323
1,12573,W 18TH AV,W 18TH AV,CALLERYANA,Arbutus-Ridge,1992-02-04,6.0,ODD,PYRUS,N,...,7,Y,129645,CHANTICLEER PEAR,2,2300,CHANTICLEER,N,49.25635,-123.158709
2,29676,ROSS ST,ROSS ST,NIGRA,Sunset,NaT,12.0,ODD,PINUS,N,...,7,Y,154675,AUSTRIAN PINE,4,7800,,N,49.213486,-123.083254
3,8856,DOMAN ST,DOMAN ST,AMERICANA,Killarney,1999-11-12,11.0,EVEN,FRAXINUS,N,...,7,Y,180803,AUTUMN APPLAUSE ASH,4,6900,AUTUMN APPLAUSE,N,49.220839,-123.036721
4,21098,EAST BOULEVARD,EAST BOULEVARD,HIPPOCASTANUM,Shaughnessy,NaT,15.5,ODD,AESCULUS,Y,...,N,Y,74364,COMMON HORSECHESTNUT,4,5200,,N,49.238514,-123.154958


In [3]:
drop = ['Unnamed: 0', 'std_street', 'assigned', 'civic_number', 'tree_id', 'on_street_block', 'cultivar_name', 'date_planted']

treesdf = treesdf.drop(columns=drop)
treesdf.head()

Unnamed: 0,on_street,species_name,neighbourhood_name,diameter,street_side_name,genus_name,plant_area,curb,common_name,height_range_id,root_barrier,latitude,longitude
0,W 20TH AV,PLATANOIDES,Riley Park,28.5,EVEN,ACER,15,Y,NORWAY MAPLE,4,N,49.252711,-123.106323
1,W 18TH AV,CALLERYANA,Arbutus-Ridge,6.0,ODD,PYRUS,7,Y,CHANTICLEER PEAR,2,N,49.25635,-123.158709
2,ROSS ST,NIGRA,Sunset,12.0,ODD,PINUS,7,Y,AUSTRIAN PINE,4,N,49.213486,-123.083254
3,DOMAN ST,AMERICANA,Killarney,11.0,EVEN,FRAXINUS,7,Y,AUTUMN APPLAUSE ASH,4,N,49.220839,-123.036721
4,EAST BOULEVARD,HIPPOCASTANUM,Shaughnessy,15.5,ODD,AESCULUS,N,Y,COMMON HORSECHESTNUT,4,N,49.238514,-123.154958


#### Columns Dropped:

* **Unnamed: 0:** This column appears to be an index and may not provide meaningful information for analysis.

* **std_street:** This column appears to duplicate on_street. on_street has 607 unique values versus 603 for std_street, so on_street retains slightly more detail and std_street can reasonably be dropped.

* **assigned, tree_id, on_street_block:** These columns are not relevant to our questions of interest and would not offer meaningful insights, so dropping them does not affect the analysis.

* **civic_number:** This column is not relevant to our analysis or visualizations, so it is dropped.

* **date_planted:** This visualization does not involve temporal analysis, so the date_planted column is not needed.
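One way to back up the decision to drop std_street is to compare the two columns directly. The sketch below does this on a made-up four-row table; `compare_columns` is a hypothetical helper that is not part of the notebook, and the street names are invented.

```python
import pandas as pd


def compare_columns(df, a, b):
    """Summarize how two supposedly redundant columns differ."""
    return {
        "unique_a": df[a].nunique(),
        "unique_b": df[b].nunique(),
        # Number of rows where the two columns disagree.
        "rows_differing": int((df[a] != df[b]).sum()),
    }


# Hypothetical toy data: one street is spelled differently in on_street.
toy = pd.DataFrame({
    "on_street": ["W 20TH AV", "ROSS ST", "CAMBIE ST", "CAMBIE STREET"],
    "std_street": ["W 20TH AV", "ROSS ST", "CAMBIE ST", "CAMBIE ST"],
})

summary = compare_columns(toy, "on_street", "std_street")
print(summary)
```

On the real data, the same call with `treesdf` (before the columns are dropped) would show how many rows actually differ between the two columns, rather than only the difference in unique counts.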

In [4]:
treesdf.info()
print('\n')
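# date_planted was dropped above, so no datetime handling is needed here;
# `datetime_is_numeric` is omitted because it no longer exists in pandas 2.x.
treesdf.describe(include='all')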

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 13 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   on_street           5000 non-null   object 
 1   species_name        5000 non-null   object 
 2   neighbourhood_name  5000 non-null   object 
 3   diameter            5000 non-null   float64
 4   street_side_name    5000 non-null   object 
 5   genus_name          5000 non-null   object 
 6   plant_area          4950 non-null   object 
 7   curb                5000 non-null   object 
 8   common_name         5000 non-null   object 
 9   height_range_id     5000 non-null   int64  
 10  root_barrier        5000 non-null   object 
 11  latitude            5000 non-null   float64
 12  longitude           5000 non-null   float64
dtypes: float64(3), int64(1), object(9)
memory usage: 507.9+ KB




Unnamed: 0,on_street,species_name,neighbourhood_name,diameter,street_side_name,genus_name,plant_area,curb,common_name,height_range_id,root_barrier,latitude,longitude
count,5000,5000,5000,5000.0,5000,5000,4950.0,5000,5000,5000.0,5000,5000.0,5000.0
unique,607,171,22,,4,67,38.0,2,361,,2,,
top,CAMBIE ST,SERRULATA,Renfrew-Collingwood,,ODD,ACER,10.0,Y,KWANZAN FLOWERING CHERRY,,N,,
freq,49,463,384,,2554,1218,736.0,4593,383,,4679,,
mean,,,,12.340888,,,,,,2.7344,,49.247349,-123.107128
std,,,,9.2666,,,,,,1.56957,,0.021251,0.049137
min,,,,0.0,,,,,,0.0,,49.202783,-123.22056
25%,,,,4.0,,,,,,2.0,,49.230152,-123.144178
50%,,,,10.0,,,,,,2.0,,49.247981,-123.105861
75%,,,,18.0,,,,,,4.0,,49.263275,-123.063484


Running `.info()` and `.describe()` on our `treesdf` dataframe gives high-level information about our columns of interest. `.info()` reports the Null/Non-Null counts and the dtype of each column, while `.describe()` reports the count, the number of unique values, and the most frequent value for each qualitative column, along with summary statistics for the quantitative ones.

### Description of the Columns of Interest

* **on_street:** The name of the street on which a tree is located. We do not use this column for any primary analysis, but it appears in the plot tooltips, where it helps pinpoint the location of specific trees. It would also be crucial for any future analysis of specific streets in Vancouver's neighbourhoods.

* **species_name:** The species name of an individual tree. The values are used in tooltips to show the species within a given genus. We do not build any primary visualization on this column, but seeing the species alongside the top three genera offers useful context and could open future research questions about species.

* **neighbourhood_name:** The name of the neighbourhood in which an individual tree is located. This column is crucial for our primary analysis, since the data is aggregated and grouped by neighbourhood. It can also drive a dropdown widget for a closer look at individual neighbourhoods.

* **diameter and height_range_id:** These quantitative columns let us visualize the characteristics of the tree genera and distinguish between the popular ones. diameter holds each tree's diameter at breast height (DBH, in inches), and height_range_id holds an ID for each tree's height range in feet (e.g., 0 = 0-10 ft, 1 = 10-20 ft).

* **street_side_name:** This column feeds the widget options that add depth to our visualizations. It records the side of the street on which a tree is planted (ODD, EVEN, etc.).

* **genus_name:** This is one of the most important columns for our analysis, since all of our research questions depend on it. The data is aggregated and grouped by this column throughout the analysis. It also supplies the legends that make the plots easy to read.

* **plant_area:** This column is used in our primary analysis to identify the planting areas where the tree genera are located and thrive. It has object dtype because it mixes numeric and categorical values: numbers give the boulevard width in feet, and letters name a dedicated type of space. Note that this column contains 50 Null values, which need to be filtered out.

* **curb:** This column records whether a tree is planted on a curb, which is useful for in-depth analysis in combination with plant_area. It is therefore exposed as a radio-button widget.

* **root_barrier:** This column is particularly useful for research question 4, as it provides information about trees in close proximity to urban infrastructure.

* **longitude and latitude:** These columns hold the coordinates at which trees are located. They are needed to place the trees on the GeoJSON map and to visualize the distribution of the top three tree genera.
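Since plant_area mixes boulevard widths with letter codes, it can help to see the two kinds of values side by side. The following is a minimal sketch, assuming that anything that parses as a number is a width in feet and everything else is a letter code; `split_plant_area` and the output column names are hypothetical and not used elsewhere in this report.

```python
import pandas as pd


def split_plant_area(series):
    """Split a mixed plant_area column into numeric widths and letter codes."""
    # Values that cannot be parsed as numbers (letter codes, missing values) become NaN.
    width = pd.to_numeric(series, errors="coerce")
    # Keep the original value only where no number was parsed, and normalize the case
    # (the data contains both upper- and lower-case codes, e.g. 'c').
    code = series.where(width.isna()).str.upper()
    return pd.DataFrame({"boulevard_width_ft": width, "area_code": code})


# Hypothetical sample of plant_area values, including one missing entry.
toy = pd.Series(["15", "7", "N", "c", None, "B"], name="plant_area")
parts = split_plant_area(toy)
print(parts)
```

Rows where both output columns are missing correspond to the Null values in plant_area that are filtered out later in the analysis.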

### Q1: What are the top three most popular tree genera in each neighbourhood of Vancouver, and is there diversity in the planting of these genera?

In [37]:
grouped = treesdf.groupby(['neighbourhood_name', 'genus_name']).size().reset_index(name='count') 
topgenus = grouped.groupby('neighbourhood_name').apply(lambda x: x.nlargest(3, 'count')).reset_index(drop=True) 

genusdf = pd.merge(treesdf, topgenus[['neighbourhood_name', 'genus_name']], on=['neighbourhood_name', 'genus_name']) 

genusdf = genusdf[['neighbourhood_name', 'on_street', 'genus_name', 'species_name', 'diameter', 'height_range_id', 
                 'street_side_name', 'plant_area', 'curb', 'root_barrier', 'latitude', 'longitude']]

genusdf = genusdf.dropna(subset=['plant_area'])
genusdf

Unnamed: 0,neighbourhood_name,on_street,genus_name,species_name,diameter,height_range_id,street_side_name,plant_area,curb,root_barrier,latitude,longitude
0,Riley Park,W 20TH AV,ACER,PLATANOIDES,28.50,4,EVEN,15,Y,N,49.252711,-123.106323
1,Riley Park,CAMBIE ST,ACER,FREEMANI X,6.00,1,EVEN,c,Y,N,49.254206,-123.115057
2,Riley Park,E WOODSTOCK AV,ACER,CAMPESTRE,3.00,2,ODD,10,Y,N,49.233623,-123.099265
3,Riley Park,SOPHIA ST,ACER,PSEUDOPLATANUS,11.00,4,EVEN,5,Y,N,49.250683,-123.098567
4,Riley Park,E 17TH AV,ACER,PLATANOIDES,5.00,2,EVEN,10,Y,N,49.255653,-123.102617
...,...,...,...,...,...,...,...,...,...,...,...,...
2669,Grandview-Woodland,GARDEN DRIVE,MALUS,ZUMI,8.75,1,EVEN,6,Y,N,49.273592,-123.057865
2670,Grandview-Woodland,E 3RD AV,MALUS,ZUMI,5.50,1,ODD,7,Y,Y,49.267795,-123.064083
2671,Grandview-Woodland,VICTORIA DRIVE,MALUS,FLORIBUNDA,12.00,2,ODD,B,Y,N,49.283158,-123.065682
2672,Grandview-Woodland,E 3RD AV,MALUS,ZUMI,5.00,2,ODD,8,Y,N,49.267832,-123.068793


In [38]:
# `datetime_is_numeric` is omitted: genusdf has no datetime columns and the
# argument no longer exists in pandas 2.x.
genusdf.describe(include='all')

Unnamed: 0,neighbourhood_name,on_street,genus_name,species_name,diameter,height_range_id,street_side_name,plant_area,curb,root_barrier,latitude,longitude
count,2650,2650,2650,2650,2650.0,2650.0,2650,2650.0,2650,2650,2650.0,2650.0
unique,22,485,12,66,,,3,32.0,2,2,,
top,Kensington-Cedar Cottage,CAMBIE ST,ACER,SERRULATA,,,ODD,10.0,Y,N,,
freq,190,30,1209,459,,,1364,421.0,2476,2502,,
mean,,,,,12.934072,2.68,,,,,49.247402,-123.109484
std,,,,,8.786497,1.416966,,,,,0.021143,0.04845
min,,,,,0.0,0.0,,,,,49.202986,-123.22036
25%,,,,,5.0,2.0,,,,,49.2306,-123.145269
50%,,,,,11.5,2.0,,,,,49.247416,-123.109441
75%,,,,,18.0,4.0,,,,,49.263311,-123.06776


**Analysis i:** The analysis above produces a new dataframe, `genusdf`, filtered from our main dataset. We aggregate the data by `neighbourhood_name` and `genus_name` and keep only the top three tree genera in each neighbourhood of Vancouver. The resulting dataframe acts as the base for our further visualizations.
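To put a number on the diversity part of Q1, one option is to tabulate how often each genus lands in first, second, or third place across neighbourhoods. The sketch below assumes a per-neighbourhood count table like `grouped`; the counts shown are made up and the variable name `rank_table` is hypothetical.

```python
import pandas as pd

# Hypothetical per-neighbourhood genus counts (already limited to the top three).
counts = pd.DataFrame({
    "neighbourhood_name": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "genus_name": ["ACER", "PRUNUS", "MALUS", "ACER", "PRUNUS", "TILIA",
                   "PRUNUS", "ACER", "QUERCUS"],
    "count": [50, 30, 10, 40, 35, 12, 45, 20, 8],
})

# Rank genera within each neighbourhood: 1 = most common.
counts["rank"] = (
    counts.groupby("neighbourhood_name")["count"]
    .rank(method="first", ascending=False)
    .astype(int)
)

# Rows are genera, columns are ranks, cells count neighbourhoods.
rank_table = pd.crosstab(counts["genus_name"], counts["rank"])
print(rank_table)
```

A genus that fills the first two columns in most neighbourhoods dominates the city, while many different genera appearing only in the third column indicates diversity in the third spot.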

In [43]:
genus_counts = genusdf['genus_name'].value_counts().reset_index()
genus_counts.columns = ['genus_name', 'count']

# Sort by count in ascending order
sorted_genus_counts = genus_counts.sort_values(by='count', ascending=True)

# Create selection
selector_genusname = alt.selection_single(fields=['genus_name'])

# Bar chart
plota1 = alt.Chart(sorted_genus_counts).mark_bar().encode(
    x=alt.X('count:Q', title='Count'),
    y=alt.Y('genus_name:N', title='Genus Name', sort=alt.EncodingSortField(field='count', order='descending')),
    color=alt.Color('genus_name:N', scale=alt.Scale(scheme='tableau20'), legend=None),
    opacity=alt.condition(selector_genusname, alt.value(0.8), alt.value(0.1)),
    tooltip=[alt.Tooltip('genus_name:N', title='Genus Name'), alt.Tooltip('count:Q', title='Count')]
).properties(
    width=350,
    height=350,
    title='Plot2: Total Counts for Top Three Tree Genera in Vancouver Neighbourhood'
).add_selection(selector_genusname).interactive()


url_geojson = 'https://raw.githubusercontent.com/UBC-MDS/exploratory-data-viz/main/data/local-area-boundary.geojson'
# Read the 'features' array of the GeoJSON file (the property and type arguments are separate)
data_geojson_remote = alt.Data(url=url_geojson, format=alt.DataFormat(property='features', type='json'))

genus_title = alt.TitleParams(
    'Plot1: The Top Three Tree Genera for Each Neighbourhood in Vancouver',
    subtitle=['Acer and Prunus dominate the top two spots.',
    'While every Neighbourhood has different tree Genera at third spot'])

vancitymap = alt.Chart(data_geojson_remote).mark_geoshape(
    color = 'lightgray', opacity= 0.5, stroke='black').encode().properties(width=500, height=350, title=genus_title
).project(type='identity', reflectY=True)

select_genus = alt.selection_single(fields=['genus_name'], bind='legend')

plot_a = alt.Chart(genusdf).mark_point(
    shape='arrow',
    size=8,
).encode(
    longitude='longitude:Q',
    latitude='latitude:Q',
    color=alt.Color('genus_name:N', title='Genus Names', scale=alt.Scale(scheme='tableau20')),
    opacity=alt.condition(selector_genusname, alt.value(0.8), alt.value(0.025)),
    tooltip=[
        alt.Tooltip('neighbourhood_name:N', title='Neighbourhood'),
        alt.Tooltip('on_street:N', title='Street Name'),
        alt.Tooltip('genus_name:N', title='Genus Name'),
        alt.Tooltip('species_name:N', title='Tree Species')]
).add_selection(selector_genusname)

final = alt.layer(vancitymap, plot_a)


final_plot1 = alt.vconcat(final, plota1, spacing=30)
final_plot1

### Q2: How do these popular tree genera in Vancouver compare in terms of diameter, height, and overall size, and do certain tree genera consistently have larger or smaller sizes?

In [41]:
average_height = genusdf.groupby('genus_name')['height_range_id'].mean().reset_index()
average_height.columns = ['genus_name', 'average_height_range_id']

plotb1 = alt.Chart(average_height).mark_bar().encode(
    y=alt.Y('genus_name:N', title='Genus Name', sort=alt.EncodingSortField(field='average_height_range_id', order='descending')),
    x=alt.X('average_height_range_id:Q', title='Avg. Height Range ID (e.g., 0 = 0-10 ft, 1 = 10-20 ft, ...)'),
    color=alt.Color('genus_name:N', scale=alt.Scale(scheme='tableau20'), legend=None),
    opacity=alt.condition(selector_genusname, alt.value(0.8), alt.value(0.05)),
    tooltip=[alt.Tooltip('genus_name:N', title='Genus Name'), alt.Tooltip('average_height_range_id:Q', title='Average Height Range ID')]
).properties(
    width=350,
    height=350,
    title='Plot2: Average Height Range of Tree Genera'
).add_selection(selector_genusname).interactive()

    
Characteristic_title = alt.TitleParams(
    'Plot3: Factors that Differentiate Tree Genera',
    subtitle='Quercus and Ulmus exhibit the largest sizes, while Malus and Crataegus have the smallest sizes')

plot_b = alt.Chart(genusdf).mark_circle().encode(
    x=alt.X('height_range_id:Q', title='Tree Height in feet (e.g., 0 = 0-10 ft, 1 = 10-20 ft, 2 = 20-30 ft, and 10 = 100+ ft)'),
    y=alt.Y('diameter:Q', title='Trees Diameter at Breast Height (DBH in inches)'),
    color=alt.Color('genus_name:N', title='Genus Name', scale=alt.Scale(scheme='tableau20')),
    size=alt.Size('combined_size:Q', scale=alt.Scale(domain=[0, 100], range=[30, 300]), legend=None),
    opacity=alt.condition(selector_genusname, alt.value(0.8), alt.value(0.00)),
    tooltip=[
        alt.Tooltip('neighbourhood_name', title='Neighbourhood'),
        alt.Tooltip('on_street', title='Street Name'), 
        alt.Tooltip('genus_name', title='Genus Name'), 
        alt.Tooltip('species_name', title='Tree Species'), 
        alt.Tooltip('diameter', title='Diameter (DBH in inches)'), 
        alt.Tooltip('height_range_id', title='Height Range in feet')]
).transform_calculate(
    combined_size='datum.diameter + datum.height_range_id'
).transform_filter(
    selector_genusname
).properties(width=500, height=350, title=Characteristic_title).add_selection(selector_genusname).interactive()

final_plot2 = alt.hconcat(plotb1, plot_b, spacing=30)
final_plot2
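Because height_range_id is an ID rather than a measurement, the values in these plots are easier to read with the range labels at hand. This is a minimal sketch, assuming the mapping given in the axis titles (0 = 0-10 ft, 1 = 10-20 ft, 2 = 20-30 ft, and 10 = 100+ ft) and assuming the IDs in between follow the same 10-foot pattern; `height_range_label` is a hypothetical helper.

```python
def height_range_label(range_id):
    """Translate a height_range_id into a human-readable range in feet."""
    if range_id < 0:
        raise ValueError("height_range_id must be non-negative")
    # Per the axis titles, ID 10 is the open-ended top bucket.
    if range_id >= 10:
        return "100+ ft"
    # Assumption: every other ID covers a 10-foot band starting at 10 * ID.
    low = range_id * 10
    return f"{low}-{low + 10} ft"


labels = [height_range_label(i) for i in (0, 1, 2, 4, 10)]
print(labels)
```

An average ID such as 2.68 sits between the 20-30 ft and 30-40 ft bands; it should be read as a position on the ID scale, not as a height in feet.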

### Q3: Which planting areas are popular for cultivating large and small tree genera? Are tree genera predominantly concentrated in specific planting areas, or are they evenly distributed across different plant area sizes?

**Analysis ii:** To obtain an impactful visualization of `plant_area`, we use `groupby` with an aggregation to compute the mean diameter and the tree count for each combination of plant_area and genus_name. These values feed a `heatmap` that visualizes the outcome.

In [46]:
area_title = alt.TitleParams(
    'Plot4: Plant Areas Where the Most Popular Tree Genera Thrive',
    subtitle=['10, 6, and 8-foot boulevard widths, along with areas marked N for no sidewalk', 
              'are the most common sites for these trees.', 
              '(The numbers inside the boxes represent the count of each tree genus within that specific plant area)'])
              
stats = genusdf.groupby(['plant_area', 'genus_name']).agg(
    average_diameter=('diameter', 'mean'),
    count=('genus_name', 'count')
).reset_index()

plotc = alt.Chart(stats).mark_rect().encode(
    x=alt.X('plant_area:N', title='Plant Area (B=behind sidewalk, C=cutout, G=in tree grate, L=lane, P=park. Numeric value=boulevard width in ft.)'),
    y=alt.Y('genus_name:N', title='Genus Name', sort=alt.EncodingSortField(field='average_diameter', order='descending')),
    color=alt.Color('average_diameter:Q', title='Average Diameter (DBH in inches)', scale=alt.Scale(scheme='viridis', reverse=True)),
    tooltip=[
        alt.Tooltip('genus_name:N', title='Genus Name'),
        alt.Tooltip('plant_area:N', title='Plant Area'),
        alt.Tooltip('average_diameter:Q', title='Average Diameter (DBH in inches)'),
        alt.Tooltip('count:Q', title='Count of Genus Name')]
).properties(
    width=550,
    height=400,
    title=area_title
).interactive()

textc = alt.Chart(stats).mark_text(
    align='center',
    baseline='middle',
    color='black',
    size=8
).encode(
    x=alt.X('plant_area:N'),
    y=alt.Y('genus_name:N'),
    text=alt.Text('count:Q', format='.0f'))

plotd = plotc + textc

plotd
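To check numerically whether a genus is concentrated in one planting area or spread evenly, one option is to compute the share of each genus's trees that falls in its most common plant_area. The sketch below uses made-up rows; `concentration` is a hypothetical name and this calculation is not part of the heatmap above.

```python
import pandas as pd

# Hypothetical toy data: ACER is mostly in 10-ft boulevards, MALUS mostly in 6-ft ones.
toy = pd.DataFrame({
    "genus_name": ["ACER", "ACER", "ACER", "ACER", "MALUS", "MALUS", "MALUS"],
    "plant_area": ["10", "10", "10", "N", "6", "6", "8"],
})

# Row-normalized crosstab: each row gives one genus's distribution over plant areas.
shares = pd.crosstab(toy["genus_name"], toy["plant_area"], normalize="index")

# For each genus, report its most common plant area and the share of trees found there.
concentration = pd.DataFrame({
    "plant_area": shares.idxmax(axis=1),
    "share": shares.max(axis=1),
})
print(concentration)
```

A share close to 1 means the genus is concentrated in a single planting area, while a low share means it is spread across many.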

### Q4: What are the typical sizes of popular tree genera that thrive in close proximity to urban infrastructure?

**Analysis iii:** Trees in close proximity to urban infrastructure can be identified by filtering for trees that have root barriers installed. We therefore keep only the rows where `root_barrier == Y`, which gives us all the tree genera with root barriers installed.

In [47]:
genusdf_new = genusdf[genusdf['root_barrier'] == 'Y']

root_title = alt.TitleParams(
    'Plot5: Tree Genera that are in Close Proximity to Urban Infrastructure',
    subtitle='Typical height for these trees is 10-30 feet and DBH less than 14 inches')

select_genus1 = alt.selection_single(fields=['genus_name'], bind='legend')

plote = alt.Chart(genusdf_new).mark_point(size=100, filled=True).encode(
    x=alt.X('height_range_id:Q', title='Tree Height in feet (e.g., 0 = 0-10 ft, 1 = 10-20 ft, 2 = 20-30 ft, and 10 = 100+ ft)'),
    y=alt.Y('diameter:Q', title='Trees Diameter at Breast Height (DBH in inches)'),
    color=alt.Color('genus_name:N', title='Genus Name', scale=alt.Scale(scheme='tableau20')),
    shape=alt.Shape('street_side_name:N', title='Street Side with Root Barriers', scale=alt.Scale(domain=['EVEN','MED', 'ODD'], range=['circle', 'triangle', 'square'])),
    opacity=alt.condition(select_genus1, alt.value(0.8), alt.value(0.00)),
    tooltip=[
        alt.Tooltip('neighbourhood_name:N', title='Neighbourhood'),
        alt.Tooltip('on_street:N', title='Street Name'),
        alt.Tooltip('genus_name:N', title='Genus Name'),
        alt.Tooltip('height_range_id:O', title='Height Range ID'),
        alt.Tooltip('diameter:Q', title='Diameter (inches)'),
        alt.Tooltip('street_side_name:N', title='Street Side Name')]
).properties(
    width=500,
    height=350,
    title=root_title
).add_selection(select_genus1).interactive()

plote
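The typical sizes read off the plot can be cross-checked with a small summary table. This is a sketch on made-up rows, assuming the median is a reasonable notion of "typical"; `typical_size` and the output column names are hypothetical.

```python
import pandas as pd


def typical_size(df):
    """Summarize diameter and height range for trees with a root barrier, per genus."""
    barrier = df[df["root_barrier"] == "Y"]
    return barrier.groupby("genus_name").agg(
        trees=("diameter", "size"),
        median_diameter=("diameter", "median"),
        median_height_range_id=("height_range_id", "median"),
    )


# Hypothetical toy data: one large ACER without a root barrier is excluded.
toy = pd.DataFrame({
    "genus_name": ["ACER", "ACER", "ACER", "ACER", "MALUS", "MALUS"],
    "root_barrier": ["Y", "Y", "Y", "N", "Y", "Y"],
    "diameter": [6.0, 10.0, 8.0, 30.0, 5.0, 7.0],
    "height_range_id": [1, 2, 2, 5, 1, 1],
})

sizes = typical_size(toy)
print(sizes)
```

Applied to `genusdf`, the same helper would give one row per genus, which can be compared against the ranges stated in the plot subtitle.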

### Interactive Dashboard with Widgets, Selection and Bindings

In [48]:
sorted_curb = sorted(genusdf['curb'].unique())
sorted_side = sorted(genusdf_new['street_side_name'].unique())
sorted_area = sorted(genusdf_new['plant_area'].unique())

radiobuttons_curb = alt.binding_radio(name='Curb: ', options=sorted_curb)
radiobuttons_side = alt.binding_radio(name='Street Side: ', options=sorted_side)
dropdown_area = alt.binding_select(name='Plant Area: ', options=sorted_area)

selection_curb = alt.selection_single(
    fields=['curb'],
    bind=radiobuttons_curb)

selection_side = alt.selection_single(
    fields=['street_side_name'],
    bind=radiobuttons_side)

selection_area = alt.selection_single(
    fields=['plant_area'],
    bind=dropdown_area)

selector_genusname = alt.selection_single(fields=['genus_name'])

plot_a1 = alt.Chart(genusdf).mark_point(
    shape='arrow',
    size=8,
).encode(
    longitude=alt.Longitude('longitude:Q'),
    latitude=alt.Latitude('latitude:Q'),
    color=alt.Color('genus_name:N', title='Genus Names', scale=alt.Scale(scheme='tableau20')),
    opacity=alt.condition(selector_genusname, alt.value(0.8), alt.value(0.025)),
    tooltip=[
        alt.Tooltip('neighbourhood_name:N', title='Neighbourhood'),
        alt.Tooltip('on_street:N', title='Street Name'),
        alt.Tooltip('genus_name:N', title='Genus Name'),
        alt.Tooltip('species_name:N', title='Tree Species')]
).add_selection(selector_genusname)

final1 = alt.layer(vancitymap, plot_a1)

plot_bar = alt.Chart(average_height).mark_bar().encode(
    y=alt.Y('genus_name:N', title='Genus Name'),
    x=alt.X('average_height_range_id:Q', title='Avg. Height Range ID (e.g., 0 = 0-10 ft, 1 = 10-20 ft, ...)'),
    color=alt.Color('genus_name:N', scale=alt.Scale(scheme='tableau20'), legend=None),
    opacity=alt.condition(selector_genusname, alt.value(0.8), alt.value(0.05)),
    tooltip=[
        alt.Tooltip('genus_name:N', title='Genus Name'),
        alt.Tooltip('average_height_range_id:Q', title='Average Height Range ID')]
).properties(
    width=300,
    height=750,
    title='Plot2: Average Height Range of Tree Genera'
).add_selection(selector_genusname).interactive()

plot_z = alt.Chart(genusdf).mark_circle().encode(
    x=alt.X('height_range_id:Q', title='Tree Height in feet (e.g., 0 = 0-10 ft, 1 = 10-20 ft, 2 = 20-30 ft, and 10 = 100+ ft)'),
    y=alt.Y('diameter:Q', title='Trees Diameter at Breast Height (DBH in inches)'),
    color=alt.Color('genus_name:N', title='Genus Name', scale=alt.Scale(scheme='tableau20'), legend=None),
    size=alt.Size('combined_size:Q', scale=alt.Scale(domain=[0, 100], range=[30, 300]), legend=None),
    opacity=alt.condition(selector_genusname, alt.value(0.8), alt.value(0.00)),
    tooltip=[
        alt.Tooltip('neighbourhood_name:N', title='Neighbourhood'),
        alt.Tooltip('on_street:N', title='Street Name'),
        alt.Tooltip('genus_name:N', title='Genus Name'),
        alt.Tooltip('species_name:N', title='Tree Species'),
        alt.Tooltip('diameter:Q', title='Diameter (DBH in inches)'),
        alt.Tooltip('height_range_id:Q', title='Height Range in feet')]
).transform_calculate(
    combined_size='datum.diameter + datum.height_range_id'
).transform_filter(selector_genusname).transform_filter(selection_curb
).properties(width=500, height=350, title=Characteristic_title
).add_selection(selector_genusname, selection_curb).interactive()


plote1 = alt.Chart(genusdf_new).mark_point(size=100, filled=True).encode(
    x=alt.X('height_range_id:Q', title='Tree Height in feet (e.g., 0 = 0-10 ft, 1 = 10-20 ft, 2 = 20-30 ft, and 10 = 100+ ft)'),
    y=alt.Y('diameter:Q', title='Trees Diameter at Breast Height (DBH in inches)'),
    color=alt.Color('genus_name:N', title='Genus Name', scale=alt.Scale(scheme='tableau20')),
    shape=alt.Shape('street_side_name:N', title='Street Side with Root Barriers', scale=alt.Scale(domain=['EVEN','MED', 'ODD'], range=['circle', 'triangle', 'square']), legend=None),
    opacity=alt.condition(select_genus1, alt.value(0.8), alt.value(0.00)),
    tooltip=[
        alt.Tooltip('neighbourhood_name:N', title='Neighbourhood'),
        alt.Tooltip('on_street:N', title='Street Name'),
        alt.Tooltip('genus_name:N', title='Genus Name'),
        alt.Tooltip('height_range_id:O', title='Height Range ID'),
        alt.Tooltip('diameter:Q', title='Diameter (inches)'),
        alt.Tooltip('street_side_name:N', title='Street Side Name')]
).transform_filter(selection_side & selection_area
).properties(
    width=500,
    height=350,
    title=root_title
).add_selection(select_genus1, selection_side, selection_area).interactive()


panel1 = alt.vconcat(final1, plot_z, spacing=30).resolve_scale(x='independent', y='independent')

panel2 = alt.hconcat(panel1, plot_bar, spacing=30).resolve_scale(x='independent', y='independent')

final_panel = alt.vconcat(panel2, plote1).resolve_scale(color='independent')
final_panel

**Note:** All plots here are interactive. Selecting a genus in one plot updates the linked plots, and the widgets filter the plots they are bound to.