# State-County Table with Population Data

This notebook builds a table of U.S. counties and states, enriched with population data. County-state pairs come from the cleaned Bigfoot sightings dataset, and population figures come from the U.S. Census Bureau's ACS 5-year estimates (ACS5).

---

## Objectives
1. Extract unique county-state pairs from the Bigfoot sightings dataset.
2. Use the Census API to fetch population data for all U.S. counties.
3. Combine the datasets and save the resulting table for future use.


In [None]:
# Get dependencies
import requests
import pandas as pd
import os
from dotenv import load_dotenv

# Obtain environment variables
load_dotenv()
CENSUS_API_KEY = os.getenv('CENSUS_API_KEY')

# load bigfoot data
bigfoot_coords = pd.read_json('../data/bigfoot_coordinates_clean_cols.json')

### Extract Unique County-State Pairs
From the Bigfoot dataset, extract unique combinations of counties and states.


In [None]:
county_df = bigfoot_coords[['county', 'state']].drop_duplicates(subset=['county','state'])

### Fetching Population Data Using the Census API
Retrieve population data for all U.S. counties from the Census Bureau's ACS5 dataset. The response is converted into a DataFrame for further processing.

In [None]:
# API URL and params (B01003_001E is the ACS total population estimate)
api_url = "https://api.census.gov/data/2020/acs/acs5"
params = {
    "get": "NAME,B01003_001E",
    "for": "county:*",
    "key": CENSUS_API_KEY
}

# Make the API request (the timeout avoids hanging on a stalled connection)
response = requests.get(api_url, params=params, timeout=30)

# Fail loudly on a bad status instead of leaving county_pop_df undefined
response.raise_for_status()

# Parse the JSON response: the first row holds the column names
data = response.json()
columns = data[0]  # Column names
rows = data[1:]  # Data rows
county_pop_df = pd.DataFrame(rows, columns=columns)

# Cleanup: readable column names and a numeric population column
county_pop_df = county_pop_df.rename(columns={
    "NAME": "name",
    "B01003_001E": "population",
    "state": "state_no",
    "county": "county_no",
})
county_pop_df['population'] = pd.to_numeric(county_pop_df['population'], errors='coerce')

print(county_pop_df.head())
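
The conversion from the API response to a DataFrame can be tried offline. The sketch below assumes the response is a list of lists whose first row is the header, as the cell above treats it. The helper name `census_rows_to_frame` is hypothetical and the population values are placeholders, not real estimates.

```python
import pandas as pd

# Illustrative payload only: the population figures are placeholders, not real estimates.
sample_response = [
    ["NAME", "B01003_001E", "state", "county"],
    ["Autauga County, Alabama", "100", "01", "001"],
    ["Baldwin County, Alabama", "250", "01", "003"],
]


def census_rows_to_frame(payload):
    """Turn a header-plus-rows payload into a DataFrame with a numeric population column."""
    header, *rows = payload
    frame = pd.DataFrame(rows, columns=header)
    # The API returns every value as a string, so convert the population explicitly.
    frame["B01003_001E"] = pd.to_numeric(frame["B01003_001E"], errors="coerce")
    return frame


sample_df = census_rows_to_frame(sample_response)
print(sample_df.dtypes)
```

The `state` and `county` columns are left as strings here so that leading zeros in the codes are preserved.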


### Cleaning and Combining the Data

In [None]:
# NAME looks like "Autauga County, Alabama"; split on ", " so the state has no leading space
county_pop_df[['county', 'state']] = county_pop_df['name'].str.split(', ', n=1, expand=True)
county_pop_df = county_pop_df[['state', 'county', 'population']]
county_pop_df.to_json('../data/county_populations.json', orient='records')
county_pop_df.head()
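
One possible way to join the county-state pairs to the population table is sketched below. It is a minimal example under stated assumptions: the two small frames are hypothetical stand-ins for `county_df` and `county_pop_df`, the helper `add_join_keys` is not part of the notebook, and the Bigfoot county names are assumed to differ from the Census names only in case and surrounding whitespace. If they differ in other ways (for example a missing "County" suffix), more normalization would be needed.

```python
import pandas as pd

# Hypothetical stand-ins for county_df and county_pop_df; values are placeholders.
sightings_pairs = pd.DataFrame({
    "county": ["Autauga County", "baldwin county ", "Nowhere County"],
    "state": ["Alabama", "Alabama", "Alabama"],
})
populations = pd.DataFrame({
    "state": ["Alabama", "Alabama"],
    "county": ["Autauga County", "Baldwin County"],
    "population": [100, 250],
})


def add_join_keys(frame):
    """Add lower-cased, trimmed key columns so small formatting differences do not block the join."""
    keyed = frame.copy()
    keyed["county_key"] = keyed["county"].str.strip().str.lower()
    keyed["state_key"] = keyed["state"].str.strip().str.lower()
    return keyed


# A left join keeps every county-state pair, matched or not.
merged = add_join_keys(sightings_pairs).merge(
    add_join_keys(populations)[["county_key", "state_key", "population"]],
    on=["county_key", "state_key"],
    how="left",
    indicator=True,
)

# Rows flagged "left_only" had no population match and may need manual review.
unmatched = merged[merged["_merge"] == "left_only"]
print(merged[["county", "state", "population", "_merge"]])
```

The `indicator=True` flag makes unmatched pairs easy to list, which helps when checking how well the two sources line up.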

## Conclusion

This notebook builds a table of U.S. counties and states with 2020 ACS5 population estimates. The processed dataset is stored in `../data/county_populations.json` for further analysis and integration into other projects.
