### Business Request

In this module we’ll be looking at data from the New York City tree census:
https://data.cityofnewyork.us/Environment/2015-Street-Tree-Census-Tree-Data/uvpi-gqnh

This data is collected by volunteers across the city, and is meant to catalog information
about every single tree in the city.

Build a Dash app for an arborist studying the health of various tree species (as defined by the
variable `spc_common`) across each borough (defined by the variable `borough`). This
arborist would like to answer the following two questions for each species and in each
borough:

1. What proportion of trees are in good, fair, or poor health according to the `health`
variable?

2. Are stewards (steward activity measured by the `steward` variable) having an impact
on the health of trees?

Please see the accompanying notebook for an introduction and some notes on the Socrata
API.

*Deployment*: Dash deployment is more complicated than deploying Shiny apps, so
deployment in this case is *optional* (and will result in extra credit). You can read instructions on deploying a Dash app to Heroku here: https://dash.plot.ly/deployment

### Import Libraries

In [3]:
import pandas as pd
from dash import Dash, dcc, html
from dash.dependencies import Input, Output
import plotly.express as px


### Get the unique boroughs and unique species

These values populate the dropdown lists in each of the Dash apps. The unique boroughs are stored in the `distinct_boroughs` dataframe, and the distinct species in the `distinct_spc_common` dataframe.

In [4]:
## Get all the boroughs in the dataset

boro_url = ('https://data.cityofnewyork.us/resource/nwxe-4ae8.json?' +\
            '$query=select distinct boroname').replace(' ', '%20')
distinct_boroughs = pd.read_json(boro_url)

In [5]:
# Get all the common species in the dataset

spc_common_url = ('https://data.cityofnewyork.us/resource/nwxe-4ae8.json?' +\
            '$query=select distinct spc_common').replace(' ', '%20')
distinct_spc_common = pd.read_json(spc_common_url)

In [26]:
boro = distinct_boroughs['boroname'][4]
spc_common = distinct_spc_common['spc_common'][10]

soql_url = ('https://data.cityofnewyork.us/resource/nwxe-4ae8.json?' +\
        '$select=health,count(tree_id)' +\
        '&$where=boroname=\'{}\'&spc_common=\'{}\''.format(boro, spc_common) +\
        '&$group=health').replace(' ', '%20')

soql_trees = pd.read_json(soql_url)

soql_trees['proportion'] = round(soql_trees['count_tree_id']/soql_trees['count_tree_id'].sum(), 2)
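
The query above only escapes spaces, so a species name containing an apostrophe could produce a malformed request. Below is a minimal sketch of one way to build the same URL more defensively. The names `BASE_URL`, `soql_literal`, and `health_counts_url` are hypothetical helpers, not part of the notebook; the sketch also assumes that both conditions can be combined with `AND` inside `$where` and that SoQL string literals escape a single quote by doubling it. The species name in the final line is only an illustrative input.

```python
from urllib.parse import quote

BASE_URL = 'https://data.cityofnewyork.us/resource/nwxe-4ae8.json'


def soql_literal(value):
    """Quote a Python string as a SoQL string literal (assumed: quotes are doubled)."""
    return "'" + value.replace("'", "''") + "'"


def health_counts_url(boro, spc_common):
    """Build the per-health tree-count query for one borough and species."""
    where = 'boroname={} AND spc_common={}'.format(
        soql_literal(boro), soql_literal(spc_common))
    params = [
        ('$select', 'health,count(tree_id)'),
        ('$where', where),
        ('$group', 'health'),
    ]
    # quote() percent-encodes spaces, quotes, and '=' inside each value.
    query = '&'.join('{}={}'.format(key, quote(value, safe='(),'))
                     for key, value in params)
    return BASE_URL + '?' + query


print(health_counts_url('Bronx', "Schumard's oak"))
```

The resulting string can be passed to `pd.read_json` in the same way as `soql_url`.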

### Question 1 Dash App
Create a dropdown menu for `spc_common` and `borough`; each selection updates `soql_url` and thus the `soql_trees` dataframe. The proportion of trees in each health category is then plotted as a bar chart.

In [30]:
# Run this app with `python app.py` and
# visit http://127.0.0.1:8050/ in your web browser.

from dash import Dash, dcc, html
from dash.dependencies import Input, Output
import plotly.express as px

app = Dash(__name__)

colors = {
    'background': '#111111',
    'text': '#7FDBFF'
}

app.layout = html.Div(style={'backgroundColor': colors['background']}, children=[
    html.H1(
        style={
            'textAlign': 'center',
            'color': colors['text']
        }
    ),

    dcc.Dropdown(id='boro_selection', options=[
    {'label': x, 'value': x} for x in distinct_boroughs['boroname']
    ],
            value = 'Bronx'),
            
    dcc.Dropdown(id='spc_selection', options=[
    {'label': x, 'value': x} for x in distinct_spc_common['spc_common']
    ],
            value = 'black walnut'),

    dcc.Graph(
        id='example-graph'
    )
])

@app.callback(Output('example-graph', 'figure'),
              [Input('boro_selection', 'value'),
               Input('spc_selection', 'value')])
def update_figure(boro_selection, spc_selection):
    soql_url = ('https://data.cityofnewyork.us/resource/nwxe-4ae8.json?' +\
            '$select=health,count(tree_id)' +\
            '&$where=boroname=\'{}\'&spc_common=\'{}\''.format(boro_selection, spc_selection) +\
            '&$group=health').replace(' ', '%20')

    soql_trees = pd.read_json(soql_url)

    soql_trees['proportion'] = round(soql_trees['count_tree_id']/soql_trees['count_tree_id'].sum(), 2)
    soql_trees = soql_trees.sort_values('proportion')

    fig = px.bar(
        soql_trees, 
        x="health", 
        y="proportion",
        text_auto=True,
        title = 'Proportion of {} Trees In Good, Fair, or Poor Health for the {} Borough'.format(spc_selection, boro_selection)
        )

    fig.update_layout(
        plot_bgcolor=colors['background'],
        paper_bgcolor=colors['background'],
        font_color=colors['text']
    )

    return fig


if __name__ == '__main__':
    app.run_server()

Dash is running on http://127.0.0.1:8050/

 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on http://127.0.0.1:8050
Press CTRL+C to quit
127.0.0.1 - - [25/Mar/2023 01:03:05] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [25/Mar/2023 01:03:05] "GET /_dash-layout HTTP/1.1" 200 -
127.0.0.1 - - [25/Mar/2023 01:03:05] "GET /_dash-dependencies HTTP/1.1" 200 -
127.0.0.1 - - [25/Mar/2023 01:03:05] "GET /_dash-component-suites/dash/dcc/async-dropdown.js HTTP/1.1" 304 -
127.0.0.1 - - [25/Mar/2023 01:03:05] "GET /_dash-component-suites/dash/dcc/async-graph.js HTTP/1.1" 200 -
127.0.0.1 - - [25/Mar/2023 01:03:05] "GET /_dash-component-suites/dash/dcc/async-plotlyjs.js HTTP/1.1" 304 -
127.0.0.1 - - [25/Mar/2023 01:03:05] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Mar/2023 01:03:09] "POST /_dash-update-component HTTP/1.1" 200 -
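
The callback above assumes the API response always contains a `count_tree_id` column, which may not hold when a species has no trees in the selected borough. The following is a minimal sketch of a defensive version of the proportion step. `HEALTH_ORDER` and `health_proportions` are hypothetical names introduced here for illustration, and the input frame is assumed to have the same `health` and `count_tree_id` columns as `soql_trees`.

```python
import pandas as pd

HEALTH_ORDER = ['Good', 'Fair', 'Poor']


def health_proportions(counts):
    """Return one row per health category with its share of all counted trees."""
    if counts.empty or 'count_tree_id' not in counts.columns:
        # No matching trees: return zero proportions instead of raising.
        return pd.DataFrame({'health': HEALTH_ORDER,
                             'proportion': [0.0] * len(HEALTH_ORDER)})
    by_health = counts.set_index('health')['count_tree_id']
    # Fix the category order and fill categories missing from the response.
    by_health = by_health.reindex(pd.Index(HEALTH_ORDER, name='health'),
                                  fill_value=0)
    proportion = (by_health / by_health.sum()).round(2)
    return proportion.rename('proportion').reset_index()


# Toy counts standing in for an API response (made-up values).
example = pd.DataFrame({'health': ['Fair', 'Good'], 'count_tree_id': [1, 3]})
print(health_proportions(example))
```

Returning the categories in a fixed order also keeps the bars in the same position as the dropdown selections change, which differs from the `sort_values('proportion')` ordering used in the app.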


### Question 2 Dash App

### Let's Think About This

Stewardship activity is defined in the docs as follows:

Below is a short list of the most common examples of what counts as one
stewardship activity:

- Helpful tree guards that do not appear professionally installed

- Mulch or woodchips

- Intentionally-planted flowers or other plants

- Signs related to care of the tree or bed, other than those installed by parks

- Decorations (not including wires or lights added to the tree)

- Seating in the tree bed, usually as part of the tree guard

- Viewing someone performing a stewardship activity during the survey

We should probably consider only trees that received some stewardship activity. Therefore, we will ignore trees where `steward == None`. Since there are 3 other `steward` categories (`1or2`, `3or4`, `4orMore`) and 3 `health` categories (`Good`, `Fair`, `Poor`), plotting the `tree_id` count for each combination would produce 9 different bars. I feel that would be difficult to read and would make it hard to conclude whether stewards are having an impact on the health of trees.

Therefore, I propose that we do the following for each borough and species.

1. Ignore the observations where `steward == None`.
2. Group by `health`
3. Count the number of observations for each group.

Since each of the steward categories other than `None` indicates that at least one stewardship activity took place for a tree, we can now generate bar charts that include a maximum of just 3 bars.

- One bar showing the count of trees where at least 1 stewardship activity took place and that are in `Fair` health.
- Another bar showing the count of trees where at least 1 stewardship activity took place and that are in `Good` health.
- Another bar showing the count of trees where at least 1 stewardship activity took place and that are in `Poor` health.

Based on this, we can determine, for a given borough and species, the proportion of all trees with at least 1 stewardship activity that are:

- in `Fair` health
- in `Good` health
- in `Poor` health
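
The filter, group, and count steps proposed above can be sketched in pandas on a small toy table before wiring them into the app. The rows below are made-up values standing in for the API response, and the variable names `trees`, `stewarded`, `counts`, and `steward_prop` are chosen only for this illustration.

```python
import pandas as pd

# Toy rows standing in for the API response (made-up values).
trees = pd.DataFrame({
    'steward': ['None', '1or2', '1or2', '3or4', '4orMore', 'None', '1or2', '3or4'],
    'health': ['Good', 'Good', 'Fair', 'Good', 'Good', 'Poor', 'Poor', 'Good'],
})

# 1. Ignore observations with no stewardship activity.
stewarded = trees[trees['steward'] != 'None']

# 2-3. Group by health and count the observations in each group.
counts = stewarded.groupby('health').size()

# Share of stewarded trees that falls in each health category.
steward_prop = (counts / counts.sum()).round(2).rename('steward_prop').reset_index()
print(steward_prop)
```

With these toy rows, six trees have some stewardship activity: four in `Good` health, one in `Fair`, and one in `Poor`.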

In [75]:
# Run this app with `python app.py` and
# visit http://127.0.0.1:8050/ in your web browser.

app = Dash(__name__)

colors = {
    'background': '#111111',
    'text': '#7FDBFF'
}

app.layout = html.Div(style={'backgroundColor': colors['background']}, children=[
    html.H1(
        style={
            'textAlign': 'center',
            'color': colors['text']
        }
    ),

    dcc.Dropdown(id='boro_selection', options=[
    {'label': x, 'value': x} for x in distinct_boroughs['boroname']
    ],
            value = 'Bronx'),
            
    dcc.Dropdown(id='spc_selection', options=[
    {'label': x, 'value': x} for x in distinct_spc_common['spc_common']
    ],
            value = 'black walnut'),

    dcc.Graph(
        id='example-graph'
    )
])

@app.callback(Output('example-graph', 'figure'),
              [Input('boro_selection', 'value'),
               Input('spc_selection', 'value')])
def update_figure(boro_selection, spc_selection):
    # $limit raises the API's default cap of 1000 rows per request, which
    # would otherwise truncate the larger borough/species combinations.
    soql_url = ('https://data.cityofnewyork.us/resource/nwxe-4ae8.json?' +\
            '$select=steward, health' +\
            '&$where=boroname=\'{}\'&spc_common=\'{}\''.format(boro_selection, spc_selection) +\
            '&$limit=50000').replace(' ', '%20')

    soql_trees = pd.read_json(soql_url)

    # Keep only trees with at least one recorded stewardship activity,
    # then compute the share of those trees in each health category.
    stewarded_counts = soql_trees.query('steward != \'None\'').groupby('health').count()['steward']
    soql_trees = (stewarded_counts / stewarded_counts.sum()).to_frame().reset_index().rename(columns = {'steward':'steward_prop'})
    soql_trees['steward_prop'] = round(soql_trees['steward_prop'], 2)
    soql_trees = soql_trees.sort_values('steward_prop')

    fig = px.bar(
        soql_trees, 
        x="health", 
        y="steward_prop",
        text_auto=True,
        title = 'Proportion of {} Trees In Good, Fair, or Poor Health Where >= 1 Stewardship Activity Took Place for the {} Borough'.format(spc_selection, boro_selection)
        )

    fig.update_layout(
        plot_bgcolor=colors['background'],
        paper_bgcolor=colors['background'],
        font_color=colors['text']
    )

    return fig


if __name__ == '__main__':
    app.run_server()

Dash is running on http://127.0.0.1:8050/

 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on http://127.0.0.1:8050
Press CTRL+C to quit
127.0.0.1 - - [26/Mar/2023 19:40:48] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [26/Mar/2023 19:40:48] "GET /_dash-layout HTTP/1.1" 200 -
127.0.0.1 - - [26/Mar/2023 19:40:48] "GET /_dash-dependencies HTTP/1.1" 200 -
127.0.0.1 - - [26/Mar/2023 19:40:48] "GET /_dash-component-suites/dash/dcc/async-dropdown.js HTTP/1.1" 304 -
127.0.0.1 - - [26/Mar/2023 19:40:48] "GET /_dash-component-suites/dash/dcc/async-graph.js HTTP/1.1" 304 -
127.0.0.1 - - [26/Mar/2023 19:40:48] "GET /_dash-component-suites/dash/dcc/async-plotlyjs.js HTTP/1.1" 304 -
127.0.0.1 - - [26/Mar/2023 19:40:49] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [26/Mar/2023 19:40:51] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [26/Mar/2023 19:40:57] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [26/Mar/2023 19:40:58] "POST /_dash-update-component HTTP/1.1" 200 -
