# Econometric UNFCCC Green Cities Commitment Analysis: USA
## Data Preparation

#### Importing and Cleaning Datasets
Data includes the following:
1. Full list of US City, County and State data
2. UNFCCC data on cities' commitments, actions, etc
3. 2020 Election results by state
4. Land Temperature data
5. US Bureau of Labor Statistics State Unemployment Rate
6. Poverty Estimates for U.S., States, and Counties, 2021
7. Number of Natural Disasters by State since 1953

In [1]:
import pandas as pd
import numpy as np

##### 1. US Cities, Counties and State Data
simplemaps.com. US Cities Database | Simplemaps.com. [online] Available at: https://simplemaps.com/data/us-cities.

In [28]:
Cities = pd.read_csv('AllUSCities.csv')

In [38]:
Cities = Cities[Cities['population'] >= 1000]

In [40]:
Cities.head()

Unnamed: 0,id,city,state,state_name,fips,county,latitude,longitude,population
0,1840034016,New York City,NY,New York,36081,Queens,40.6943,-73.9249,18908608
1,1840020491,Los Angeles,CA,California,6037,Los Angeles,34.1141,-118.4068,11922389
2,1840000494,Chicago,IL,Illinois,17031,Cook,41.8375,-87.6866,8497759
3,1840015149,Miami,FL,Florida,12086,Miami-Dade,25.784,-80.2101,6080145
4,1840020925,Houston,TX,Texas,48201,Harris,29.786,-95.3885,5970127


##### 2. UNFCCC Data
Global Climate Action UNFCCC - Actor Tracking (2022). Available at: https://climateaction.unfccc.int/Actors.

In [31]:
UNFCCC = pd.read_csv('UNFCCC.csv')

In [32]:
UNFCCC = UNFCCC[(UNFCCC['country'] == 'United States of America') & (UNFCCC['organizationType'] == 'City') & (UNFCCC['actorProperties_population'] >= 1000)]  # Filter for cities in the US with populations over 1000 
UNFCCC['Date'] = pd.to_datetime(UNFCCC['Date'], format = '%d/%m/%Y')  # format date as datetime object
UNFCCC = UNFCCC.sort_values(by = 'Date', ascending = False)
UNFCCC = UNFCCC.drop_duplicates(subset = 'publicId', keep = 'first')  # keep only most recent observation of a city
UNFCCC.reset_index(inplace = True)  # Reset index
UNFCCC.drop(['index','Date','id','accountingYear','organizationType'], axis = 1, inplace = True)  # drop unneeded columns

In [33]:
# Generate City and State Columns from OrganisationName
UNFCCC['city'] = UNFCCC['organizationName'].apply(lambda x: x.split(',')[0].strip())
UNFCCC['state'] = UNFCCC['organizationName'].apply(lambda x: x.split(',')[1].strip() if len(x.split(',')) == 2 else np.NaN)   

# Clean City names for better joining
strings_to_remove = ['City of', 'City Of', 'Town of', 'Town Of', 'Township Of', 'City and County of', 'Borough Of', 'Village of', '(Town)']
for string in strings_to_remove:
    UNFCCC['city'] = UNFCCC['city'].str.replace(string, '', regex = False)

UNFCCC['city'] = UNFCCC['city'].apply(lambda x: x.strip())

UNFCCC = UNFCCC[['city','state','hasCommitments','hasActionsUndertaken',
                'hasEmissionInventory','hasInitiativeParticipations','hasImpact','hasMitigations',
                'hasAdaptations','hasRiskAssessments','hasClimateActionPlans','hasFinanceActions']]

Remove the following irregular / nonconforming observations:
1. Metropolitan Council, Twin Cities: covers bistate area and Minneapolis is already included
2. Chicago Metropolitan Mayors Caucus:  Chicago is already included
3. Metropolitan Washington Council of Governments (COG):  covers bistate area and District of Columbia is already included
4. Mid-America Regional Council:  covers bistate area and Kansas City is already included
5. San Francisco/Bay Area Air Quality Management District:  San Francisco is already included

In [34]:
cities_to_remove = ['Metropolitan Council', 'Chicago Metropolitan Mayors Caucus', 'Metropolitan Washington Council of Governments (COG)', 'Mid-America Regional Council', 'San Francisco/Bay Area Air Quality Management District’S']
UNFCCC = UNFCCC[~UNFCCC['city'].isin(cities_to_remove)]

Make the following alterations to irregular / nonconforming observations:
1. Metropolitan Government of Nashville and Davidson County, TN: rename to Nashville
2. Durham: add North Carolina as state

In [35]:
UNFCCC.loc[UNFCCC['city'] == 'Metropolitan Government of Nashville and Davidson County', 'city'] = 'Nashville'
UNFCCC.loc[UNFCCC['city'] == 'Durham', 'state'] = 'NC'

In [36]:
UNFCCC.head()

Unnamed: 0,city,state,hasCommitments,hasActionsUndertaken,hasEmissionInventory,hasInitiativeParticipations,hasImpact,hasMitigations,hasAdaptations,hasRiskAssessments,hasClimateActionPlans,hasFinanceActions
0,Park Forest,IL,True,True,False,False,False,True,True,True,True,False
1,Emeryville,CA,True,True,True,True,False,True,True,True,True,False
2,Grand Rapids,MI,True,True,False,True,False,True,True,True,False,False
3,Fremont,CA,True,True,True,True,False,True,True,True,True,False
4,Fort Worth,TX,False,True,False,False,False,False,True,False,False,False


##### 3. 2020 Election Results by State
Wikipedia Contributors (2019). 2020 United States presidential election. [online] Wikipedia. Available at: https://en.wikipedia.org/wiki/2020_United_States_presidential_election.

In [55]:
ElectionbyState = pd.read_csv('ElectionbyState.csv')

In [56]:
RedBlue = lambda row: False if row['Biden/Harris_Democratic_EV'] > row['Trump/Pence_Republican_EV'] else True  # Assign blue for a state if it had more democratic electoral college votes
ElectionbyState['redState'] = ElectionbyState.apply(RedBlue, axis = 1)
ElectionbyState = ElectionbyState[['state_abb','redState']]
ElectionbyState.rename(columns = {'state_abb':'state'}, inplace = True)

In [57]:
ElectionbyState.head()

Unnamed: 0,state,redState
0,AL,True
1,AK,True
2,AZ,False
3,AR,True
4,CA,False


##### 4. Land Temperatures from 1828 to 2013
www.kaggle.com. (Berkeley Earth). Climate Change: Earth Surface Temperature Data. [online] Available at: https://www.kaggle.com/datasets/berkeleyearth/climate-change-earth-surface-temperature-data.

In [82]:
LandTemp = pd.read_csv('GlobalLandTemperatures.csv')

In [83]:
LandTemp = LandTemp[LandTemp['Country'] == 'United States']  # filter for USA State Data
LandTemp['Date'] = pd.to_datetime(LandTemp['dt'], format = '%Y-%m-%d')  # create datetime column
LandTemp['Year'] = LandTemp['Date'].dt.year  # create year column
LandTemp['Month'] = LandTemp['Date'].dt.month_name()  # create month column
LandTemp = LandTemp[['Year','Month','State','AverageTemperature']]

In [84]:
LandTemp = LandTemp[LandTemp['Month'].isin(['January','July'])]  # keep only January and July observations
LandTemp = LandTemp.pivot(index = ['Year','State'], columns = 'Month', values = 'AverageTemperature').reset_index()  # Pivot to get widened form
LandTemp['janJulyDiff'] = LandTemp['July'] - LandTemp['January']  # calculate July to January temperature difference
LandTemp.reset_index(inplace = True)
LandTemp.dropna(inplace = True)

In [104]:
# large changes in temperature --> https://www.climate.gov/news-features/understanding-climate/climate-change-global-temperature
# Indicator: difference in averages of 1940-1960 and 2000-2020
LandTemp20002013 = LandTemp[LandTemp['Year'].between(2000,2003)].groupby('State')['janJulyDiff'].mean()  # mean of January July temperature differences 2000-2013
LandTemp19401960 = LandTemp[LandTemp['Year'].between(1940,1960)].groupby('State')['janJulyDiff'].mean()  # mean of January July temperature differences 1940-1960

tempDiff = LandTemp20002013 - LandTemp19401960
tempDiff = pd.DataFrame(tempDiff).reset_index()
tempDiff.rename(columns = {'State':'state','janJulyDiff':'tempDiff'}, inplace = True)

In [105]:
tempDiff.head()

Unnamed: 0,state,tempDiff
0,Alabama,1.67944
1,Alaska,-3.913524
2,Arizona,-0.513298
3,Arkansas,0.912119
4,California,-0.821321


##### 5. US Bureau of Labor Statistics State Unemployment Rate (February 2024)
Bls.gov. (2024). Table 1. Civilian labor force and unemployment by state and selected area, seasonally adjusted. [online] Available at: https://www.bls.gov/news.release/laus.t01.htm.

In [17]:
Unemp = pd.read_csv('StateEmployment.csv')

In [18]:
Unemp.head()

Unnamed: 0,state,unemploymentRate
0,AK,0.047
1,AL,0.03
2,AR,0.036
3,AZ,0.041
4,CA,0.053


##### 6. Poverty Estimates for U.S., States, and Counties, 2021
U.S. Department of Commerce, Bureau of the Census, Small Area Income and Poverty Estimates (SAIPE) Program

In [65]:
Poverty = pd.read_csv('PovertyEstimates.csv')

In [66]:
Poverty['povertyProp'] = Poverty['PCTPOVALL_2021'] / 100  # convert to decimal
Poverty = Poverty[['fips','povertyProp']]  # option to include: 'county','state',

In [67]:
Poverty.head()

Unnamed: 0,fips,povertyProp
0,0,0.128
1,1000,0.163
2,1001,0.107
3,1003,0.108
4,1005,0.23


##### 7. Number of Natural Disasters by State since 1953
worldpopulationreview.com. Natural Disasters by State [Updated May 2023]. [online] Available at: https://worldpopulationreview.com/state-rankings/natural-disasters-by-state.

In [24]:
NaturalDisasters = pd.read_csv('NaturalDisasters.csv')

In [25]:
NaturalDisasters.head()

Unnamed: 0,state,numDisasters
0,AL,82
1,AK,55
2,AZ,68
3,AR,71
4,CA,284


#### Joining Datasets

In [109]:
df = pd.merge(Cities, UNFCCC, how = 'left', on = ['state','city'])

bool_columns = ['hasCommitments','hasActionsUndertaken','hasEmissionInventory','hasInitiativeParticipations','hasImpact','hasMitigations','hasAdaptations', 'hasRiskAssessments', 'hasClimateActionPlans','hasFinanceActions']
df[bool_columns] = df[bool_columns].fillna(False).applymap(pd.to_numeric, errors='coerce').astype(bool)  # assign False to NaNs and ensure type bool

In [110]:
df = pd.merge(df, ElectionbyState, on = 'state', how = 'left')
df = pd.merge(df, tempDiff, on = 'state', how = 'left')
df = pd.merge(df, Unemp, on = 'state', how = 'left')
df = pd.merge(df, Poverty, on = 'fips', how = 'left')
df = pd.merge(df, NaturalDisasters, on = 'state', how = 'left')

df = df[['city','state','county','fips','latitude','longitude','population','redState','unemploymentRate','povertyProp','tempDiff','numDisasters','hasCommitments', 'hasActionsUndertaken',
       'hasEmissionInventory', 'hasInitiativeParticipations', 'hasImpact','hasMitigations', 'hasAdaptations', 'hasRiskAssessments','hasClimateActionPlans', 'hasFinanceActions']]

In [111]:
df.head()

Unnamed: 0,city,state,county,fips,latitude,longitude,population,redState,unemploymentRate,povertyProp,...,hasCommitments,hasActionsUndertaken,hasEmissionInventory,hasInitiativeParticipations,hasImpact,hasMitigations,hasAdaptations,hasRiskAssessments,hasClimateActionPlans,hasFinanceActions
0,New York City,NY,Queens,36081,40.6943,-73.9249,18908608,False,0.044,0.136,...,True,True,True,True,False,True,True,True,True,False
1,Los Angeles,CA,Los Angeles,6037,34.1141,-118.4068,11922389,False,0.053,0.141,...,True,True,True,True,False,True,True,True,True,True
2,Chicago,IL,Cook,17031,41.8375,-87.6866,8497759,False,0.048,0.138,...,True,True,True,True,False,True,True,True,True,True
3,Miami,FL,Miami-Dade,12086,25.784,-80.2101,6080145,True,0.031,0.152,...,True,True,True,True,False,True,True,True,True,False
4,Houston,TX,Harris,48201,29.786,-95.3885,5970127,True,0.039,0.164,...,True,True,True,True,False,True,True,True,True,False
