# Final Project - Part 2
### by Anisha Raja

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import altair as alt

In [2]:
# Found it necessary to specify rows to include since bottom of the dataset has some explanatory text.
personnel = pd.read_csv('county_personnel.csv').head(159)

#Source: https://stackoverflow.com/questions/11346283/renaming-column-names-in-pandas
personnel = personnel.rename(columns={'2020-21 Administrators, Average Annual Salary, Dollars': 'Administrators', '2020-21 Support Personnel, Average Annual Salary, Dollars': 'Support Personnel', '2020-21 PK-12 Teachers, Average Annual Salary, Dollars': 'PK-12 Teachers'})
#personnel
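
As a possible alternative to `.head(159)`, `read_csv` accepts an `nrows` argument that limits how many data rows are parsed. The sketch below uses a hypothetical two-row file held in memory; the county names and salary figures are made up for illustration and are not taken from `county_personnel.csv`.

```python
import io
import pandas as pd

# Hypothetical miniature file: two data rows followed by explanatory footer text.
raw = io.StringIO(
    "County,PK-12 Teachers\n"
    "Appling,52000\n"
    "Atkinson,50000\n"
    "Note: salaries are averages.\n"
)

# nrows=2 keeps only the first two data rows, so the footer text
# is excluded from the resulting DataFrame.
sample = pd.read_csv(raw, nrows=2)
print(sample)
```

For the real file, the equivalent call would presumably be `pd.read_csv('county_personnel.csv', nrows=159)`, assuming the explanatory text starts right after the 159th data row.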

## Creation of Scatter Plot: "driver" chart

In [3]:
#single = alt.selection_single(clear=False)
single = alt.selection_single()

# Source: https://altair-viz.github.io/gallery/scatter_tooltips.html
# Source: https://altair-viz.github.io/altair-tutorial/notebooks/06-Selections.html
# Source: https://altair-viz.github.io/user_guide/generated/channels/altair.Y.html
# Source: https://github.com/altair-viz/altair/issues/1919
scatter_chart = alt.Chart(personnel).mark_circle(size=250).encode(
    alt.X("2019 Public School CCRPI Score:Q", scale=alt.Scale(domain=[50, 95])),
    alt.Y("PK-12 Teachers:Q", scale=alt.Scale(domain=[45000, 70000]), title="PK-12 Teachers (Average Annual Salary in USD)"),
    color = alt.condition(single, alt.value('green'), alt.value('lightgray')),
    tooltip=['County', '2019 Public School CCRPI Score', 'PK-12 Teachers']
).properties(
    height=300,
    width=600,
    title={
        "text": ["Average Annual Salary of PK-12 Teachers", "Compared to Public School CCRPI Score"]
    }
).add_selection(
    single
)

## Creation of Text that goes next to each point in the Scatter Plot

In [4]:
# Source: https://altair-viz.github.io/gallery/scatter_with_labels.html
# Source: https://stackoverflow.com/questions/68646813/how-to-add-text-on-interactive-scatter-on-altair
county_text = alt.Chart(personnel).mark_text(
    align='left',
    baseline='middle',
    dx=-20,
    dy=-15
).encode(
    alt.X("2019 Public School CCRPI Score:Q", scale=alt.Scale(domain=[50, 95])),
    alt.Y("PK-12 Teachers:Q", scale=alt.Scale(domain=[45000, 70000])),
    text='County:N'
).transform_filter(
    single
)

#scatter_chart + county_text
#scatter_chart

## Creation of Bar Chart: "driven" chart

In [5]:
# Source: https://stackoverflow.com/questions/72181211/grouped-bar-charts-in-altair-using-two-different-columns
# Source: https://altair-viz.github.io/gallery/grouped_bar_chart.html
# Source: https://stackoverflow.com/questions/53067796/altair-remove-or-suppress-automatically-generated-plot-legend
# Source: https://github.com/altair-viz/altair/issues/1919
bar_chart = alt.Chart(personnel).mark_bar().encode(
    alt.X("Personnel Category:N"),
    alt.Y("Average Annual Salary (USD):Q", scale=alt.Scale(domain=[40000, 140000])),
    color = alt.Color('Personnel Category:N', legend=None),
    tooltip=['Average Annual Salary (USD):Q']
).transform_filter(
    single
).transform_fold(
    as_=['Personnel Category', 'Average Annual Salary (USD)'],
    fold=['Administrators', 'Support Personnel', 'PK-12 Teachers']
).properties(
    title={
        "text": ["Comparing Salaries of", "Administrators, Support Staff,", "and PK-12 Teachers"]
    }
)
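
To see what the fold step produces, here is a rough pandas equivalent using `melt`. It is only a sketch: the salary numbers are hypothetical, and unlike `transform_fold`, `melt` drops the original wide columns instead of keeping them alongside the new ones.

```python
import pandas as pd

# Hypothetical salaries for one county; not values from county_personnel.csv.
wide = pd.DataFrame({
    "County": ["Jenkins"],
    "Administrators": [90000],
    "Support Personnel": [60000],
    "PK-12 Teachers": [50000],
})

# Turn the three salary columns into (category, salary) rows,
# which is the long shape the bar chart encodes on X and Y.
long = wide.melt(
    id_vars="County",
    value_vars=["Administrators", "Support Personnel", "PK-12 Teachers"],
    var_name="Personnel Category",
    value_name="Average Annual Salary (USD)",
)
print(long)
```

Each county row becomes three rows, one per personnel category, so a single selected county yields exactly three bars.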

## Dashboard

In [6]:
chart = (scatter_chart + county_text) | bar_chart.properties(width=100)
#chart = scatter_chart.properties(width=500) | bar_chart.properties(width=100)
chart

In [7]:
myJekyllDir = 'C:/anisha/UIUC/Academics/9 - Spring 2023/IS 445/anisharaja.github.io/assets/json/'

In [8]:
chart.save(myJekyllDir+'final_dashboard.json')

## Dashboard Write-up
The left side of the dashboard plots the 2019 Public School CCRPI score against the average annual salary of Pre-K to Grade 12 teachers for each county in the state of Georgia. Hovering over a circle shows the county, its 2019 Public School CCRPI score, and the average annual salary of its Pre-K to Grade 12 teachers.

The right side of the dashboard compares the average annual salaries of Administrators, Support Staff, and Pre-K to Grade 12 teachers in a particular county. Hovering over a bar displays the average annual salary associated with that bar.

The county shown on the right is determined by the circle selected on the left. For example, if Jenkins County is selected on the left, then the average annual salaries of Administrators, Support Staff, and Pre-K to Grade 12 teachers in Jenkins County are displayed on the right. If no circle is selected, the labels for all of the counties are shown. Once a county is selected, the dashboard resets itself and shows only the data for that county.

**My dataset is not too large, so I plan on hosting it through GitHub itself.**

## Contextual Dataset Writeup
- Title: Estimated average annual salary of teachers in public elementary and secondary schools, by state
- Link: https://nces.ed.gov/programs/digest/d22/tables/dt22_211.60.asp
- Usefulness: This dataset would be useful in my data story because it lets readers see how Georgia pays its teachers compared to other states, which provides context for my primary dataset.
- **This dataset is not too large, so I plan on hosting it through GitHub itself.**