## FCC and Redlining Features

As **input**, this notebook requires the paths to the census files and the FCC Form 477 file.

As **output**, this notebook requires the path where the output DataFrame should be saved.

Please set the **rerun** variable to **True** to regenerate existing files.

In [None]:
import glob

# Set Input Paths
directory = ""  # set beginning of path if needed
addresses = "data/address/original/xxx"
fcc_direct = "data/input/redlining/xxx"

# Set Output Paths
append_redline = "data/address/transformation1/xxx"

This notebook also takes as **input** a pandas DataFrame containing the following columns:

{...

CLASS `Address`: (contains `addressfull` methods)
- `self.line1 = firstline`
- `self.line2 = secondline`
- `self.city = city`
- `self.state = state`
- `self.zipcode = zipcode`
- `self.apt_type = apt_type`
- `self.apt_number = apt_number`

ADDITIONALLY:
- `latitude`
- `longitude`

...}

This notebook is meant to **save** and **output** the following features, appended to the input ".geojson":

{...
 
 REDLINING FEATURE:
  - `redlining_grade = redlining grade, merged from Mapping Inequality based on the lat and lon of the address`

  FCC FEATURES:
  - `n_providers = the number of other wired competitors in the address's Census block group. Sourced from FCC Form 477`
  
...}
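The FCC feature above can be sketched as a per-block-group count of distinct wired providers. This is a minimal illustration, not the notebook's actual implementation: the column names `BlockCode`, `ProviderName` and `TechCode`, the set of technology codes treated as "wired", and the helper name `count_wired_providers` are all assumptions. Whether the provider serving the address itself is excluded from the count (the "other" in the description) is also not shown here.

```python
import pandas as pd

# Assumption: Form 477 technology codes treated as wired (DSL, copper, cable, fiber).
WIRED_TECH_CODES = {10, 11, 12, 20, 30, 40, 41, 42, 43, 50}


def count_wired_providers(fcc: pd.DataFrame) -> pd.DataFrame:
    """Count distinct wired providers per Census block group (hypothetical helper)."""
    wired = fcc[fcc["TechCode"].isin(WIRED_TECH_CODES)].copy()
    # A Census block GEOID has 15 digits; its first 12 identify the block group.
    # zfill restores a leading zero if the code was read as an integer.
    wired["block_group"] = wired["BlockCode"].astype(str).str.zfill(15).str[:12]
    return (
        wired.groupby("block_group")["ProviderName"]
        .nunique()
        .rename("n_providers")
        .reset_index()
    )


# Toy rows standing in for the real Form 477 file.
fcc = pd.DataFrame(
    {
        "BlockCode": [
            "010010201001000",
            "010010201001001",
            "010010201001001",
            "010010202001000",
        ],
        "ProviderName": ["A", "A", "B", "C"],
        "TechCode": [50, 50, 40, 60],  # 60 is not in the assumed wired set
    }
)
counts = count_wired_providers(fcc)
```

The resulting table could then be merged onto the address DataFrame by block group, provided each address already carries a block group identifier.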


In [None]:
# Relevant Code
import json
from tqdm import tqdm
import pandas as pd
from shapely.geometry import Point, Polygon
from config import state2redlining


## Redlining
def get_holc_grade(row: dict, polygons: list) -> str:
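    """
    Converts the `lat` and `lon` values in `row` into a shapely Point,
    then iterates through a list of dictionaries, each containing the
    shapely Polygon and grade of one HOLC-graded area.
    Returns the grade of the first polygon that contains the point,
    or None if no polygon contains it.
    """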
    point = Point(float(row["lon"]), float(row["lat"]))
    for polygon in polygons:
        if polygon["shape"].contains(point):
            return polygon["grade"]
    return None


def check_redlining(df: pd.DataFrame) -> pd.DataFrame:
    """
    Get redlining grades for each address in "df".
    Note: we use city-level HOLC grades, but index on state.
    Thanks to the Mapping Inequality project for digitizing the maps,
    which are stored in `../data/input/redlining`.
    """
    data = []
    for state, _df in tqdm(df.groupby("state")):
        # read redlining maps for each city in `df`.
        files = state2redlining.get(state, [])
        polygons = []
        if files:
            for fn in files:
                with open(fn) as f:  # close the file handle after reading
                    geojson = json.load(f)
                for record in geojson["features"]:
                    shape = Polygon(record["geometry"]["coordinates"][0][0])
                    grade = record["properties"]["holc_grade"]
                    polygons.append({"shape": shape, "grade": grade})
            # Work on a copy so the assignment never targets a groupby slice.
            _df = _df.copy()
            _df["redlining_grade"] = _df.apply(
                get_holc_grade, polygons=polygons, axis=1
            )
        data.extend(_df.to_dict(orient="records"))
    return pd.DataFrame(data)
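
A small, self-contained check of the point-in-polygon lookup used above, run on made-up unit squares rather than real HOLC maps. The function `lookup_grade` mirrors the logic of `get_holc_grade`; the polygons, grades and coordinates are purely illustrative.

```python
from shapely.geometry import Point, Polygon

# Two adjacent toy squares standing in for HOLC-graded areas.
polygons = [
    {"shape": Polygon([(0, 0), (0, 1), (1, 1), (1, 0)]), "grade": "A"},
    {"shape": Polygon([(1, 0), (1, 1), (2, 1), (2, 0)]), "grade": "D"},
]


def lookup_grade(row, polygons):
    """Return the grade of the first polygon containing the row's point."""
    # shapely points are (x, y), so longitude comes first.
    point = Point(float(row["lon"]), float(row["lat"]))
    for polygon in polygons:
        if polygon["shape"].contains(point):
            return polygon["grade"]
    return None


inside_a = lookup_grade({"lon": "0.5", "lat": "0.5"}, polygons)  # strings are cast to float
inside_d = lookup_grade({"lon": 1.5, "lat": 0.5}, polygons)
outside = lookup_grade({"lon": 5, "lat": 5}, polygons)
# `contains` is False for points on a polygon's boundary, so a point on the
# shared edge of the two squares matches neither of them.
on_edge = lookup_grade({"lon": 1.0, "lat": 0.5}, polygons)
```

Because `contains` excludes the boundary, an address geocoded exactly onto the border between two graded areas gets no grade. If that matters, `covers` is a shapely predicate that includes the boundary.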