# Final project: Presentation

## Gathering data

The source code from *Homework 4* collects and plots common socioeconomic
indicators for a single community area. To reuse it for the final project,
which requires data for all 77 community areas, we first need the census tract
IDs of each area. Note that a community area may contain multiple census tracts.

In [1]:
from colorama import Fore, Style
from us.states import IL
from _lib import get_census, load_areas

area_names = [area.area for area in load_areas()]
tracts = {i: [] for i in range(1, 78)}

print(f'{Fore.YELLOW}Fetching...{Style.RESET_ALL}')
result = \
    get_census().acs5.get(
        ['NAME'],
        geo={'for': 'tract:*', 'in': f'state:{IL.fips} county:031'},
        year=2023,
    )

print(f'{Fore.GREEN}Collected {len(result)} tracts:{Style.RESET_ALL}')
for item in result:
    tract = item['tract']
    if len(tract) < 2:
        continue
    try:
        ca_num = int(tract[:2])
        if 1 <= ca_num <= 77:
            tracts[ca_num].append(tract)
    except ValueError:
        pass
# Sort each area's tract list in place so the printed output is stable.
for tract_list in tracts.values():
    tract_list.sort()
for i, name in enumerate(area_names, start=1):
    print(f'{name:22s}: {Style.BRIGHT}{tracts[i]}{Style.RESET_ALL}')

Fetching...
Collected 1332 tracts:
Rogers Park           : ['010100', '010201', '010202', '010300', '010400', '010501', '010502', '010503', '010600', '010701', '010702']
West Ridge            : ['020100', '020200', '020301', '020302', '020400', '020500', '020601', '020602', '020701', '020702', '020801', '020802', '020901', '020902']
Uptown                : ['030101', '030102', '030103', '030104', '030200', '030300', '030400', '030500', '030601', '030603', '030604', '030701', '030702', '030703', '030706', '030800', '030900', '031000', '031100', '031200', '031300', '031400', '031501', '031502', '031700', '031800', '031900', '032100']
Lincoln Square        : ['040100', '040201', '040202', '040300', '040401', '040402', '040600', '040700', '040800', '040900']
North Center          : ['050100', '050200', '050300', '050500', '050600', '050700', '050800', '050900', '051000', '051100', '051200', '051300', '051400']
Lake View             
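
The grouping rule used in the cell above can be tried without calling the Census API. The following is a minimal sketch: `group_tracts_by_area` is a hypothetical helper name, and it assumes, as the cell does, that the first two digits of a tract code give the community area number. That prefix rule is a heuristic and is not checked here against an official tract-to-community-area crosswalk.

```python
from collections import defaultdict


def group_tracts_by_area(tract_codes, first=1, last=77):
    """Group six-digit tract codes by their two-digit prefix.

    Hypothetical helper; assumes the prefix equals the community area number.
    """
    grouped = defaultdict(list)
    for code in tract_codes:
        prefix = code[:2]
        # Skip codes whose prefix is not a number in the community area range.
        if not prefix.isdigit() or not first <= int(prefix) <= last:
            continue
        grouped[int(prefix)].append(code)
    # Sort the areas and each tract list so the result is stable across runs.
    return {number: sorted(codes) for number, codes in sorted(grouped.items())}


# Four codes copied from the output above, plus '980000' as an
# illustrative out-of-range code that should be dropped.
sample = ['020100', '010201', '010100', '980000', '020200']
print(group_tracts_by_area(sample))
```

Keeping the rule in a small function makes it easy to swap in a crosswalk lookup later if the prefix heuristic turns out to misassign tracts.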

I am dividing the data collection into four periods based on the availability of
ACS 5-year estimates: 2009 (2008 is not available), 2013, 2018 and 2023. Then,
using the mapping of "L" train stations to community areas from *Final project:
Progress*, I determine whether each station was open during each period. The
goal is to find community areas where a station opened partway through the
study, that is, a station that was open in some periods but not in all four.

In [2]:
from colorama import Fore, Style
from _lib import load_areas

entries = []
for area in load_areas():
    for station in area.stations:
        year_count = len(station.years) if hasattr(station, 'years') else 0
        if year_count in {0, 4}:
            continue
        years_str = ', '.join([str(year) for year in station.years])
        entries.append(
            f'{area.area:16s} ' +
            f'{station.station:23s} ' +
            f'{year_count} years: ' +
            f'[{Style.BRIGHT}{years_str}{Style.RESET_ALL}]',
        )

# Each entry describes one station, so report the count as stations.
print(f'{Fore.GREEN}Found {len(entries)} stations:{Style.RESET_ALL}')
for entry in entries:
    print(entry)

Found 3 stations:
Near West Side   Morgan                  3 years: [2013, 2018, 2023]
Loop             Washington/Wabash       2 years: [2018, 2023]
Near South Side  Cermak-McCormick Place  2 years: [2018, 2023]
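
The period assignment described above can be sketched as a small function. This is an illustration under stated assumptions, not the logic inside `_lib`: `open_periods` and `opened_mid_study` are hypothetical names, a station counts as open in a period when its opening year is on or before the period's end year, and the opening years in `OPENING_YEARS` are assumptions supplied for the example rather than values read from the project data.

```python
PERIODS = [2009, 2013, 2018, 2023]  # ACS 5-year end years used in this project


def open_periods(opening_year, periods=PERIODS):
    """Return the periods whose end year is on or after the opening year."""
    return [period for period in periods if opening_year <= period]


def opened_mid_study(opening_year, periods=PERIODS):
    """Return True when the station is open in some, but not all, periods."""
    return 0 < len(open_periods(opening_year, periods)) < len(periods)


# Assumed opening years, for illustration only.
OPENING_YEARS = {
    'Morgan': 2012,
    'Washington/Wabash': 2017,
    'Cermak-McCormick Place': 2015,
}

for station, year in OPENING_YEARS.items():
    print(f'{station:23s} {open_periods(year)} {opened_mid_study(year)}')
```

The second function mirrors the `year_count in {0, 4}` filter in the cell above: stations open in none or all of the periods offer no before-and-after contrast and are skipped.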


Having identified the community areas of interest, we can compare the data from
before and after each station opening (or reopening) in those areas. I am also
including **Mount Greenwood** as a control group, since it has never had an "L"
station and is the farthest away from one.

In [3]:
from colorama import Fore, Style
from pandas import DataFrame
from us.states import IL

from _lib import get_census, load_areas

SELECTED_YEARS = [2009, 2013, 2018, 2023]
SELECTED_AREAS = ['Near West Side', 'Loop', 'Near South Side', 'Mount Greenwood']
SELECTED_TABLE = {'B08301_002E': 'Population with personal vehicle'}

FRAME_COLUMNS = ['state', 'county', 'tract']


def fetch_tracts(tracts, title, call_client, year, columns):
    print(f'Fetching {title}... ', end='')
    result = \
        call_client(get_census()).get(
            list(columns.keys()),
            geo={'for': 'tract:*', 'in': f'state:{IL.fips} county:031'},
            year=year,
        )
    print('100%')
    frame = DataFrame(result).rename(columns=columns)
    return frame[frame['tract'].isin(tracts)] \
        [FRAME_COLUMNS + list(columns.values())].copy()


for area in [area for area in load_areas() if area.area in SELECTED_AREAS]:
    print(f'{Style.BRIGHT}Writing {area.area}:{Style.RESET_ALL}')
    # Build the file prefix up front; reusing the same quote type inside an
    # f-string expression only works on Python 3.12 and later.
    file_prefix = area.area.lower().replace(' ', '_')
    for year in SELECTED_YEARS:
        fetch_tracts(
            area.tracts,
            # A 5-year estimate ending in `year` starts in `year - 4`.
            f'{year - 4}\u2013{year} ACS 5-Year Estimates',
            lambda census: census.acs5,
            year,
            SELECTED_TABLE,
        ).to_csv(f'{file_prefix}_{year}.csv', index=False)
print(f'{Fore.GREEN}Done.{Style.RESET_ALL}')

Writing Near West Side:
Fetching 2005–2009 ACS 5-Year Estimates... 100%
Fetching 2009–2013 ACS 5-Year Estimates... 100%
Fetching 2014–2018 ACS 5-Year Estimates... 100%
Fetching 2019–2023 ACS 5-Year Estimates... 100%
Writing Loop:
Fetching 2005–2009 ACS 5-Year Estimates... 100%
Fetching 2009–2013 ACS 5-Year Estimates... 100%
Fetching 2014–2018 ACS 5-Year Estimates... 100%
Fetching 2019–2023 ACS 5-Year Estimates... 100%
Writing Near South Side:
Fetching 2005–2009 ACS 5-Year Estimates... 100%
Fetching 2009–2013 ACS 5-Year Estimates... 100%
Fetching 2014–2018 ACS 5-Year Estimates... 100%
Fetching 2019–2023 ACS 5-Year Estimates... 100%
Writing Mount Greenwood:
Fetching 2005–2009 ACS 5-Year Estimates... 100%
Fetching 2009–2013 ACS 5-Year Estimates... 100%
Fetching 2014–2018 ACS 5-Year Estimates... 100%
Fetching 2019–2023 ACS 5-Year Estimates... 100%
Done.
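
Each CSV written above holds one row per tract, so a before-and-after comparison first needs the tract values summed into an area total. The sketch below shows one way to do that with only the standard library. `area_total` and `percent_change` are hypothetical helper names, and the two CSV strings are made-up placeholder rows (including the tract IDs), not real ACS figures; in practice the text would be read from files such as `loop_2013.csv`.

```python
import csv
import io

COLUMN = 'Population with personal vehicle'


def area_total(csv_text, column=COLUMN):
    """Sum one column over all tract rows of a CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # CSV values are read as strings, so convert before adding.
    return sum(int(row[column]) for row in reader)


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) * 100 / before


# Placeholder data: made-up tract IDs and values, for illustration only.
BEFORE = (
    'state,county,tract,Population with personal vehicle\n'
    '17,031,000001,100\n'
    '17,031,000002,150\n'
)
AFTER = (
    'state,county,tract,Population with personal vehicle\n'
    '17,031,000001,125\n'
    '17,031,000002,175\n'
)

before_total = area_total(BEFORE)
after_total = area_total(AFTER)
print(before_total, after_total, percent_change(before_total, after_total))
```

Running the same two steps on Mount Greenwood gives the control figure to set against the three areas where a station opened.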


## Plotting graphs

Run `acs_analysis.py` and `vehicle_analysis.py` to generate plots for the
presentation slides.

<img
  width="640px"
  alt="Diagram 1"
  src="https://github.com/hanggrian/IIT-CS579/raw/assets/assignments/proj3/diagram1.svg"/>

<img
  width="640px"
  alt="Diagram 2"
  src="https://github.com/hanggrian/IIT-CS579/raw/assets/assignments/proj3/diagram2.svg"/>

<img
  width="640px"
  alt="Diagram 3"
  src="https://github.com/hanggrian/IIT-CS579/raw/assets/assignments/proj3/diagram3.svg"/>
  
<img
  width="640px"
  alt="Diagram 4"
  src="https://github.com/hanggrian/IIT-CS579/raw/assets/assignments/proj3/diagram4.svg"/>