# Explore who owns TRI facilities

Several industry sectors must report to the Toxics Release Inventory (TRI), including mining, utilities, manufacturing, and hazardous waste. These are all industries one would expect to release toxic chemicals. The data also includes a field for each facility's parent company.

In this notebook I explore the parent companies that stand out by owning far more facilities than the rest.

In [1]:
# import packages
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Point
from shapely.geometry import Polygon
from shapely.geometry import mapping

import warnings
warnings.simplefilter(action='ignore')

In [2]:
# read tri geojson in
gdf = gpd.read_file('../data/map-data/tri-2018.geojson')
gdf

Unnamed: 0,TRIFD,FACILITY_NAME,LATITUDE,LONGITUDE,INDUSTRY_SECTOR,PARENT_CO_NAME,geometry
0,00608DCRBNRD3KM,IDI CARIBE INC,17.972778,-66.231944,Chemicals,,POINT (-66.23194 17.97278)
1,0060WHPNTRCARR1,HP INTERNATIONAL TRADING BV (PUERTO RICO BRANC...,18.456470,-67.136550,Chemicals,HP INC,POINT (-67.13655 18.45647)
2,00610BXTRHROAD4,EDWARDS LIFESCIENCES TECHNOLOGY SARL,18.293900,-67.136600,Miscellaneous Manufacturing,EDWARDS LIFESCIENCES LLC,POINT (-67.13660 18.29390)
3,00610CRBGNCARR4,GE INTERNATIONAL OF PR LLC,18.294021,-67.140643,Electrical Equipment,GENERAL ELECTRIC CO (GE CO),POINT (-67.14064 18.29402)
4,00612PRPCMPR681,PREPA-CAMBALACHE COMBUSTION TURBINE PLANT,18.471100,-66.699400,Electric Utilities,PUERTO RICO ELECTRIC POWER AUTHORITY,POINT (-66.69940 18.47110)
...,...,...,...,...,...,...,...
21598,99801CRLSK331CL,COEUR ALASKA INC KENSINGTON GOLD PROJECT,58.867490,-135.104756,Metal Mining,COEUR MINING INC.,POINT (-135.10476 58.86749)
21599,99801KNNCT13401,HECLA GREENS CREEK MINING CO,58.081802,-134.641206,Metal Mining,HECLA MINING CO,POINT (-134.64121 58.08180)
21600,9980WCRWLY176JA,CROWLEY JUNEAU BULK FUEL STORAGE FACILITY,58.289420,-134.395240,Petroleum Bulk Terminals,CROWLEY FUELS LLC,POINT (-134.39524 58.28942)
21601,99901SCSTG1300S,US COAST GUARD BASE KETCHIKAN,55.333730,-131.625330,Transportation Equipment,US DEPARTMENT OF HOMELAND SECURITY,POINT (-131.62533 55.33373)


In [3]:
# find value counts of parent companies to find top companies
comp = gdf['PARENT_CO_NAME'].value_counts().rename_axis('parentCompany').reset_index(name='count')

# set option to view all results in display window
pd.set_option('display.max_rows', None)

comp

Unnamed: 0,parentCompany,count
0,,5188
1,US DEPARTMENT OF DEFENSE,268
2,BERKSHIRE HATHAWAY INC,191
3,CEMEX INC,171
4,ARGOS USA CORP,164
5,KOCH INDUSTRIES INC,139
6,CRH AMERICAS INC,130
7,CLEAN HARBORS INC,115
8,TYSON FOODS INC,112
9,MARATHON PETROLEUM CORP,90


In [4]:
# drop the first row, which counts the facilities that didn't report a parent company
comp = comp.drop(comp.index[0]).reset_index(drop=True)
comp

Unnamed: 0,parentCompany,count
0,US DEPARTMENT OF DEFENSE,268
1,BERKSHIRE HATHAWAY INC,191
2,CEMEX INC,171
3,ARGOS USA CORP,164
4,KOCH INDUSTRIES INC,139
5,CRH AMERICAS INC,130
6,CLEAN HARBORS INC,115
7,TYSON FOODS INC,112
8,MARATHON PETROLEUM CORP,90
9,MARTIN MARIETTA MATERIALS INC,90
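
Dropping the first row by position works here because the blank parent name happens to be the most frequent value. A minimal sketch of a position-independent alternative follows, assuming missing parents appear as empty strings, whitespace, or missing values. The toy `facilities` frame and its company names are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the TRI table; values are illustrative only
facilities = pd.DataFrame({
    'PARENT_CO_NAME': ['ACME CORP', '', 'ACME CORP', None, 'BETA INC', '  '],
})

# Trim whitespace, then keep only rows that actually name a parent company
names = facilities['PARENT_CO_NAME'].str.strip()
has_parent = names.notna() & (names != '')

# Count facilities per parent company, most frequent first
comp = (
    names[has_parent]
    .value_counts()
    .rename_axis('parentCompany')
    .reset_index(name='count')
)
print(comp)
```

Filtering before counting means the result does not depend on where the blank entry lands in the sorted counts.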


There are over 4,000 parent companies listed here, but some stand out more than others. For the map, I keep only the top 10 parent companies and show them in a dropdown, letting the user explore the spatial relationship between each parent company and the facilities it owns. Each of the top 10 parent companies owns at least 90 facilities.

In [5]:
top10 = comp.head(10)
top10

Unnamed: 0,parentCompany,count
0,US DEPARTMENT OF DEFENSE,268
1,BERKSHIRE HATHAWAY INC,191
2,CEMEX INC,171
3,ARGOS USA CORP,164
4,KOCH INDUSTRIES INC,139
5,CRH AMERICAS INC,130
6,CLEAN HARBORS INC,115
7,TYSON FOODS INC,112
8,MARATHON PETROLEUM CORP,90
9,MARTIN MARIETTA MATERIALS INC,90
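
For the dropdown to be useful, the facility points themselves also need to be limited to the selected parent companies. One possible way to do that is sketched below on a toy table. The frame, the company names, and the cutoff of 2 are illustrative assumptions, not the notebook's data.

```python
import pandas as pd

# Toy facilities table; names and values are illustrative only
facilities = pd.DataFrame({
    'FACILITY_NAME': ['Plant A', 'Plant B', 'Plant C', 'Plant D', 'Plant E'],
    'PARENT_CO_NAME': ['ACME CORP', 'ACME CORP', 'BETA INC', 'GAMMA LLC', 'BETA INC'],
})

# Rank parent companies by facility count and keep the top 2 (the notebook keeps 10)
top = facilities['PARENT_CO_NAME'].value_counts().head(2)

# Keep only the facilities whose parent company is in the top list
top_facilities = facilities[facilities['PARENT_CO_NAME'].isin(top.index)]
print(top_facilities)
```

The same `isin` filter should apply to the TRI GeoDataFrame, since it is indexed like a regular pandas DataFrame.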


After finding the top 10 parent companies, I created a csv file containing:
- company name
- headquarters address
- number of facilities owned
- industry
- a brief description of each company

Addresses and company information were obtained from Google Maps and Bloomberg Company Profiles. I was unable to find the exact address for Cemex Inc, only that the company is headquartered in San Pedro Garza García, Mexico. A city-level address was used instead, so the geocoded point represents the municipality rather than the exact location of Cemex Inc's headquarters.

In [6]:
# read in company info
companies = pd.read_csv('../data/input-data/parent-companies.csv', encoding='latin-1')
companies

Unnamed: 0,name,parentCompany,count,address,industry,description
0,US Department of Defense,US DEPARTMENT OF DEFENSE,268,"100 S Washington Blvd, Arlington VA 22202",Military & Arms,The United States Department of Defense is an ...
1,Berkshire Hathaway Inc,BERKSHIRE HATHAWAY INC,191,"3555 Farnam Street Omaha, NE 68131",Multinational Conglomerate Holding Company,Berkshire Hathaway is an American multinationa...
2,Cemex Inc,CEMEX INC,171,"66220 San Pedro Garza García, Nuevo Leon, Mexico",Construction Materials-Cement & Aggregates,Cemex Inc. manufactures cement and ready-mixed...
3,Argos USA Corp,ARGOS USA CORP,164,"3015 Windward Plaza, Alpharetta, GA 30005 Unit...",Construction Materials-Cement & Aggregates,Argos USA LLC produces and distributes cements...
4,Koch Industries Inc,KOCH INDUSTRIES INC,139,"2256 Wichita, KS 67201 United States","Oil, Gas & Coal","Koch Industries, Inc. operates as a diversifie..."
5,CRH Americas Inc,CRH AMERICAS INC,128,"900 Ashwood Pkwy, Dunwoody, GA 30338",Construction Materials-Cement & Aggregates,"CRH America, Inc. provides construction materi..."
6,Clean Harbors Inc,CLEAN HARBORS INC,115,"42 Longwater Drive Norwell, MA 02061 United St...",Waste Management,"Clean Harbors Environmental Services, Inc. pro..."
7,Tyson Foods Inc,TYSON FOODS INC,112,"2200 West Don Tyson Parkway Springdale, AR 727...",Consumer Products-Packaged Food,"Tyson Foods, Inc. produces, distributes, and m..."
8,Marathon Petroleum Corp,MARATHON PETROLEUM CORP,90,"539 South Main Street Findlay, OH 45840 United...","Oil, Gas & Coal",Marathon Petroleum Corporation operates as a c...
9,Martin Marietta Materials Inc,MARTIN MARIETTA MATERIALS INC,90,"2710 Wycliff Road Raleigh, NC 27607 United States",Construction Materials-Cement & Aggregates,"Martin Marietta Materials, Inc. produces aggre..."
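
Because the `count` column in the csv was entered by hand, it can drift from the computed counts. In the output above, CRH Americas Inc is listed with 128 facilities, while the computed top 10 table shows 130. A small check like the following could flag such rows. The two toy frames stand in for the csv and for `top10`, and their values are illustrative.

```python
import pandas as pd

# Stand-in for the computed top-10 table
computed = pd.DataFrame({'parentCompany': ['ACME CORP', 'BETA INC'], 'count': [130, 90]})

# Stand-in for the hand-entered csv
entered = pd.DataFrame({'parentCompany': ['ACME CORP', 'BETA INC'], 'count': [128, 90]})

# Join on the shared company key and compare the two count columns
check = entered.merge(computed, on='parentCompany', suffixes=('_csv', '_computed'))
mismatches = check[check['count_csv'] != check['count_computed']]
print(mismatches)
```

An alternative design is to leave `count` out of the csv entirely and merge the computed value in, so there is only one source for the number.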


In [7]:
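# geocode addresses, assigning lat/long values to new columns
from geopy.geocoders import Nominatim

# recent geopy versions require an identifying user_agent for Nominatim;
# the name used here is a placeholder
geolocator = Nominatim(user_agent='tri-parent-companies')

# iterate over rows
for index, row in companies.iterrows():
    location = geolocator.geocode(row['address']) # geocode address
    if location is None: # geocode() returns None when no match is found
        continue
    companies.loc[index,'latitude'] = location.latitude # create latitude column
    companies.loc[index,'longitude'] = location.longitude # create longitude column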

companies

Unnamed: 0,name,parentCompany,count,address,industry,description,latitude,longitude
0,US Department of Defense,US DEPARTMENT OF DEFENSE,268,"100 S Washington Blvd, Arlington VA 22202",Military & Arms,The United States Department of Defense is an ...,38.865921,-77.073293
1,Berkshire Hathaway Inc,BERKSHIRE HATHAWAY INC,191,"3555 Farnam Street Omaha, NE 68131",Multinational Conglomerate Holding Company,Berkshire Hathaway is an American multinationa...,41.257407,-95.965389
2,Cemex Inc,CEMEX INC,171,"66220 San Pedro Garza García, Nuevo Leon, Mexico",Construction Materials-Cement & Aggregates,Cemex Inc. manufactures cement and ready-mixed...,25.657775,-100.367025
3,Argos USA Corp,ARGOS USA CORP,164,"3015 Windward Plaza, Alpharetta, GA 30005 Unit...",Construction Materials-Cement & Aggregates,Argos USA LLC produces and distributes cements...,34.093298,-84.239493
4,Koch Industries Inc,KOCH INDUSTRIES INC,139,"2256 Wichita, KS 67201 United States","Oil, Gas & Coal","Koch Industries, Inc. operates as a diversifie...",37.692236,-97.337545
5,CRH Americas Inc,CRH AMERICAS INC,128,"900 Ashwood Pkwy, Dunwoody, GA 30338",Construction Materials-Cement & Aggregates,"CRH America, Inc. provides construction materi...",33.932267,-84.340375
6,Clean Harbors Inc,CLEAN HARBORS INC,115,"42 Longwater Drive Norwell, MA 02061 United St...",Waste Management,"Clean Harbors Environmental Services, Inc. pro...",42.160652,-70.884074
7,Tyson Foods Inc,TYSON FOODS INC,112,"2200 West Don Tyson Parkway Springdale, AR 727...",Consumer Products-Packaged Food,"Tyson Foods, Inc. produces, distributes, and m...",36.154429,-94.154233
8,Marathon Petroleum Corp,MARATHON PETROLEUM CORP,90,"539 South Main Street Findlay, OH 45840 United...","Oil, Gas & Coal",Marathon Petroleum Corporation operates as a c...,41.036255,-83.650158
9,Martin Marietta Materials Inc,MARTIN MARIETTA MATERIALS INC,90,"2710 Wycliff Road Raleigh, NC 27607 United States",Construction Materials-Cement & Aggregates,"Martin Marietta Materials, Inc. produces aggre...",35.819505,-78.691141
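
The loop above assumes every address resolves. In geopy, a lookup with no match returns `None`, and the public Nominatim service asks for at most one request per second. The sketch below shows one way to handle both. `geocode_frame` and `fake_geocode` are hypothetical helpers written for this example, and the fake geocoder replaces the network call so the snippet runs offline.

```python
import time
from collections import namedtuple

import pandas as pd

# Minimal stand-in for the location object a geocoder returns
FakeLocation = namedtuple('FakeLocation', ['latitude', 'longitude'])

def fake_geocode(address):
    """Offline stand-in for a geocoder: returns a location or None."""
    known = {'3555 Farnam Street Omaha, NE 68131': FakeLocation(41.257407, -95.965389)}
    return known.get(address)

def geocode_frame(df, geocode, address_col='address', delay_seconds=1.0):
    """Return a copy of df with latitude/longitude filled in by `geocode`.

    `geocode` is any callable that takes an address string and returns an
    object with .latitude and .longitude, or None when nothing matches.
    """
    out = df.copy()
    out['latitude'] = float('nan')
    out['longitude'] = float('nan')
    for index, address in out[address_col].items():
        location = geocode(address)
        if location is not None:  # unmatched addresses stay NaN
            out.loc[index, 'latitude'] = location.latitude
            out.loc[index, 'longitude'] = location.longitude
        time.sleep(delay_seconds)  # pause between requests
    return out

demo = pd.DataFrame({'address': ['3555 Farnam Street Omaha, NE 68131', 'nowhere at all']})
result = geocode_frame(demo, fake_geocode, delay_seconds=0)
print(result)
```

With geopy, the `geocode` method of a `Nominatim` instance could be passed in place of `fake_geocode`. geopy also ships a `RateLimiter` wrapper in `geopy.extra.rate_limiter` that can take over the delay.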


In [8]:
# convert dataframe to geodataframe, declaring WGS 84 (EPSG:4326) as the crs;
# the older {'init': 'epsg:4326'} form is deprecated
gdf = gpd.GeoDataFrame(
    companies,
    geometry=gpd.points_from_xy(companies.longitude, companies.latitude),
    crs='EPSG:4326',
)

# write to file
gdf.to_file('../data/map-data/parent-companies.geojson', driver='GeoJSON')
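
To make the shape of the exported file concrete, the sketch below builds the same kind of Point features by hand with the standard library. It illustrates the GeoJSON structure and is not a description of what geopandas does internally. The `to_feature` helper is hypothetical, and the two rows reuse coordinates from the geocoded table above.

```python
import json

# Illustrative rows; coordinates are taken from the geocoded table above
rows = [
    {'name': 'Berkshire Hathaway Inc', 'latitude': 41.257407, 'longitude': -95.965389},
    {'name': 'Tyson Foods Inc', 'latitude': 36.154429, 'longitude': -94.154233},
]

def to_feature(row):
    """Build one GeoJSON Point feature from a row dict."""
    # Everything except the coordinates becomes a feature property
    props = {k: v for k, v in row.items() if k not in ('latitude', 'longitude')}
    return {
        'type': 'Feature',
        # GeoJSON orders coordinates as [longitude, latitude]
        'geometry': {'type': 'Point', 'coordinates': [row['longitude'], row['latitude']]},
        'properties': props,
    }

collection = {'type': 'FeatureCollection', 'features': [to_feature(r) for r in rows]}
print(json.dumps(collection, indent=2))
```

The coordinate order is the detail most likely to cause trouble in a web map: longitude comes first, which matches the `points_from_xy(longitude, latitude)` call in the cell above.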