# OPTIONAL Workbook for Homework #8

You are welcome to work on a local version of a notebook and upload it for this assignment.

This workspace is here if you'd rather not have to install all necessary packages locally.

You can download any JSON files to your local computer to add them to your Jekyll page.

To download a file, right-click it and choose Download. For example, with the following code:

```python
from vega_datasets import data
import altair as alt

source = data.cars()
source.rename(columns={"Miles_per_Gallon":"Miles per Gallon"}, 
              inplace=True)


chart = alt.Chart(source).mark_circle(size=60).encode(
    x='Horsepower',
    y='Miles per Gallon',
    color='Origin',
    tooltip=['Name', 'Origin', 'Horsepower', 'Miles per Gallon']
).interactive()

chart.properties(width='container').save("cars.json")
```

You can then download `cars.json` from the sidebar.

```python
import altair as alt
import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/UIUC-iSchool-DataViz/is445_data/main/building_inventory.csv")
# Lift Altair's default 5,000-row limit so the full dataset can be charted.
alt.data_transformers.disable_max_rows()

# Bin each year into the decade it falls in, e.g. 1987 -> 1980.
df['Year Acquired Decade'] = (df['Year Acquired'] // 10) * 10
df['Year Constructed Decade'] = (df['Year Constructed'] // 10) * 10

# Bar chart of buildings acquired per decade; rows without a positive year are filtered out.
acquired_chart = alt.Chart(df).mark_bar().encode(
    x=alt.X('Year Acquired Decade:O', title='Decade'),
    y=alt.Y('count()', title='Number of Buildings'),
    color=alt.Color('Year Acquired Decade:O', scale=alt.Scale(scheme='blues')),
    tooltip=['Year Acquired Decade', 'count()']
).transform_filter(
    alt.datum['Year Acquired'] > 0
).properties(title="Buildings Acquired Over Time")
```

This plot shows how building acquisitions and constructions are distributed over time. Grouping buildings by decade gives the number acquired and constructed in each period, which makes it easy to see which decades had the most activity.

The x-axis shows decades (e.g., 1970s, 1980s) as ordinal categories. The y-axis encodes the number of buildings acquired or constructed in each decade.

A sequential blue color scale is used for Year Acquired to emphasize progression over time, and a sequential orange scale is used for Year Constructed to distinguish it from acquisitions. Tooltips show the decade and the building count when a user hovers over a bar.

Binning: Both the Year Acquired and Year Constructed columns were binned into decades using integer division in Python: `(Year // 10) * 10`. This groups years into decade-wide intervals for easier comparison.

Filtering: Rows whose year is missing or not greater than zero are excluded with `transform_filter`, so invalid years do not distort the counts.
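
The binning and filtering steps can be sketched in plain Python, without pandas or Altair. The helper names `to_decade` and `is_valid_year` and the sample years are hypothetical and chosen only for illustration; the notebook applies the same arithmetic to whole DataFrame columns and does the filtering inside the chart.

```python
import math
from collections import Counter

def to_decade(year):
    """Floor a year to the start of its decade, e.g. 1987 -> 1980."""
    return (year // 10) * 10

def is_valid_year(year):
    """Keep only years that are present and greater than zero."""
    return year is not None and not math.isnan(year) and year > 0

# Hypothetical sample values, including a 0 and a missing entry (NaN).
years = [1987, 1990, 0, float("nan"), 2004, 1999]

decades = [to_decade(y) for y in years if is_valid_year(y)]
counts = Counter(decades)  # plays the role of count() per decade in the chart

print(decades)
print(counts)
```

Counting the binned values per decade is what the `count()` aggregate does for each bar.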

For interactivity, users can hover over any bar to see its decade and building count.

```python
# Bar chart of buildings constructed per decade; rows without a positive year are filtered out.
constructed_chart = alt.Chart(df).mark_bar().encode(
    x=alt.X('Year Constructed Decade:O', title='Decade'),
    y=alt.Y('count()', title='Number of Buildings'),
    color=alt.Color('Year Constructed Decade:O', scale=alt.Scale(scheme='oranges')),
    tooltip=['Year Constructed Decade', 'count()']
).transform_filter(
    alt.datum['Year Constructed'] > 0
).properties(title="Buildings Constructed Over Time")
```

This plot categorizes buildings by their usage descriptions and visualizes the distribution of their square footage. A boxplot was chosen to highlight the variability and typical size ranges for each usage category, making it easy to identify outliers and compare categories.

The x-axis represents Usage Description as categorical data, making each category clearly distinguishable. The y-axis encodes Square Footage using a logarithmic scale to handle the wide range of values, from small buildings to large facilities.

Tooltip encoding is included to display detailed statistics for each usage category.

Filtering: Rows with missing or zero values in the Square Footage column were removed to avoid misleading results.

Aggregation: Summary statistics for each usage category, such as median, quartiles, and outliers, were calculated automatically within the Altair boxplot framework.
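
To make the aggregation step concrete, here is a minimal sketch of the kind of summary a boxplot draws, using Python's standard `statistics` module. The square-footage values and the `box_summary` helper are made up for illustration. The 1.5 × IQR whisker rule is the conventional Tukey definition and is an assumption here, and the quantile interpolation Altair uses may differ slightly from `statistics.quantiles`.

```python
import statistics

def box_summary(values):
    """Return quartiles, whisker ends, and outliers using the 1.5 * IQR rule."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    # Whiskers extend to the most extreme points that are still within the limits.
    inside = [v for v in data if low_limit <= v <= high_limit]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_limit or v > high_limit],
    }

# Hypothetical square footages for one usage category; zeros are dropped first.
square_feet = [0, 1200, 1500, 1800, 2100, 2400, 50000]
summary = box_summary([v for v in square_feet if v > 0])
print(summary)
```

In this made-up sample the single very large building is reported as an outlier rather than stretching the whisker, which is the behavior that makes boxplots useful for skewed data such as square footage.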

For interactivity, a dropdown menu lets users filter the data by County, so they can explore how building usage and square footage distributions vary across regions. As with the previous plot, hovering over a boxplot shows detailed information, including the median square footage, interquartile range, and number of buildings for that category.

```python
# Place the two charts side by side.
chart = constructed_chart | acquired_chart
```

```python
# Local Jekyll assets folder that receives the exported chart specs.
myJekyllDir = '/Users/qiuboyuan/Desktop/IS445/assets/json/'

chart.save(myJekyllDir + 'chart.json')
constructed_chart.save(myJekyllDir + 'constructed_chart.json')
acquired_chart.save(myJekyllDir + 'acquired_chart.json')
```