In [27]:
import pandas as pd

---

## State / County Data

### CalRecycle

- Estimated composition of disposed residential waste, by material type and stream, for a specified California jurisdiction (here, Alameda County). Single-family tons plus multifamily tons give total residential tons. Percentages show the share each material contributes to each stream.
- Source: https://www2.calrecycle.ca.gov/WasteCharacterization/ResidentialStreams

In [28]:
alameda_waste = pd.read_csv('../data/calrecycle-alameda-waste-material.csv')
alameda_waste.head(5)

Unnamed: 0,Material Category,Material Type,Jurisdiction(s),Single Family Tons,Regional Single Family Composition,Multi Family Tons,Statewide Multi Family Composition,Total Residential Tons,Total Residential Composition,Unnamed: 9
0,Paper,Uncoated Corrugated Cardboard,Alameda (Countywide),1917,0.7%,4231,3.6%,6147,1.5%,
1,Paper,Paper Bags,Alameda (Countywide),898,0.3%,625,0.5%,1523,0.4%,
2,Paper,Newspaper,Alameda (Countywide),2078,0.7%,5506,4.6%,7583,1.9%,
3,Paper,White Ledger Paper,Alameda (Countywide),531,0.2%,627,0.5%,1158,0.3%,
4,Paper,Other Office Paper,Alameda (Countywide),1368,0.5%,698,0.6%,2067,0.5%,
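
The relationship described above (single-family plus multifamily tons give the total) can be checked directly. The sketch below rebuilds the five rows shown in the output as a small frame; the 1-ton tolerance is an assumption to absorb rounding in the published figures, and `sample`, `residual`, and `total_pct` are illustrative names, not part of the dataset.

```python
import pandas as pd

# Toy frame copied from the five rows displayed above.
sample = pd.DataFrame({
    "Material Type": [
        "Uncoated Corrugated Cardboard", "Paper Bags", "Newspaper",
        "White Ledger Paper", "Other Office Paper",
    ],
    "Single Family Tons": [1917, 898, 2078, 531, 1368],
    "Multi Family Tons": [4231, 625, 5506, 627, 698],
    "Total Residential Tons": [6147, 1523, 7583, 1158, 2067],
    "Total Residential Composition": ["1.5%", "0.4%", "1.9%", "0.3%", "0.5%"],
})

# Reported total minus the sum of the two residential streams.
residual = sample["Total Residential Tons"] - (
    sample["Single Family Tons"] + sample["Multi Family Tons"]
)

# Assumption: differences of at most 1 ton are rounding, not data errors.
totals_consistent = bool(residual.abs().le(1).all())

# Percent columns arrive as strings such as "1.5%"; convert them to floats.
sample["total_pct"] = (
    sample["Total Residential Composition"].str.rstrip("%").astype(float)
)

print(residual.tolist(), totals_consistent)
```

The same string-to-float conversion would apply to the other composition columns before any numeric work on them.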


In [38]:
alameda_waste.describe()

Unnamed: 0,Single Family Tons,Multi Family Tons,Total Residential Tons,Unnamed: 9
count,68.0,68.0,68.0,0.0
mean,4214.235294,1743.808824,5958.044118,
std,11018.468002,3968.001412,14837.659662,
min,0.0,0.0,0.0,
25%,294.25,100.75,533.0,
50%,1045.5,621.5,1820.0,
75%,2485.25,1305.0,3838.0,
max,79627.0,29374.0,109001.0,


- The Solid Waste Information System (SWIS) database contains information on solid waste facilities, operations, and disposal sites in California, including each site's latitude, longitude, and the waste types it processes.
- Source: https://www2.calrecycle.ca.gov/SolidWaste/Site/Search

In [37]:
cal_waste_facilities = pd.read_csv('../data/calrecycle-waste-facilities.csv')
cal_waste_facilities.head(3)

Unnamed: 0,SWIS Number,Site Name,Activity,Waste Type,Site Is Archived,Site Operational Status,Site Regulatory Status,Site Type,Latitude,Longitude,Point of Contact,Activity Is Archived,Activity Operational Status,Activity Regulatory Status,Activity Category,Activity Classification
0,27-AA-0114,"Return to Earth, LLC",Agricultural Material Composting Operation,Agricultural,Yes,Closed,Surrendered,Non-Disposal Only,36.63163,-121.5477,Eric Kiruja,Yes,Closed,Surrendered,Composting,Solid Waste Operation
1,27-AA-0114,"Return to Earth, LLC",Agricultural Material Composting Operation,Green Materials,Yes,Closed,Surrendered,Non-Disposal Only,36.63163,-121.5477,Eric Kiruja,Yes,Closed,Surrendered,Composting,Solid Waste Operation
2,15-AA-0392,Demler Enterprises-Delano,Agricultural Material Composting Operation,Manure,No,Active,Notification,Non-Disposal Only,35.7588,-119.33784,Christine Karl,No,Active,Notification,Composting,Solid Waste Operation
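
In the rows above, the same SWIS Number appears once per waste type, so counting rows would overcount sites. A minimal sketch of collapsing to one row per site follows; it assumes coordinates and site name are constant within a SWIS Number, which holds for the rows shown but is not verified for the full file. The frame and output column names are illustrative.

```python
import pandas as pd

# Toy frame with a subset of columns, copied from the three rows above.
facilities = pd.DataFrame({
    "SWIS Number": ["27-AA-0114", "27-AA-0114", "15-AA-0392"],
    "Site Name": [
        "Return to Earth, LLC", "Return to Earth, LLC",
        "Demler Enterprises-Delano",
    ],
    "Waste Type": ["Agricultural", "Green Materials", "Manure"],
    "Latitude": [36.63163, 36.63163, 35.7588],
    "Longitude": [-121.5477, -121.5477, -119.33784],
})

# One row per site: keep the first coordinates and join the distinct
# waste types into a single comma-separated string.
sites = (
    facilities
    .groupby("SWIS Number", sort=False)
    .agg(
        site_name=("Site Name", "first"),
        latitude=("Latitude", "first"),
        longitude=("Longitude", "first"),
        waste_types=("Waste Type", lambda s: ", ".join(sorted(s.unique()))),
    )
    .reset_index()
)

print(sites)
```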


---

## National Data

### EIA energy consumption

- Residential energy consumption
    - Totals and averages for each major end use: space heating, water heating, air-conditioning, refrigerators, and other (other appliances, electronics, and lighting)
    - Grouped by different categories like large census region, climate, housing type, etc.

- Source: https://www.eia.gov/consumption/residential/data/2015/index.php?view=consumption#by%20end%20uses
- Downloads as an Excel file. The index column uses indentation to show the hierarchy, so exporting it directly to CSV loses that organization. Rows with no observations only separate the different categories.

In [31]:
us_region_energy_usage = pd.read_csv('../data/eia-residential-energy-consumption.csv',index_col=0)
us_region_energy_usage.head(10)

Unnamed: 0,Total U.S.2 (Number of housing units (million)),Total Average energy expenditures \n(dollars per household using the end use),Space heating3,Water heating,Air condi-tioning,Refrig-erators,Other4
All homes,118.2,1856.0,543.0,296.0,265.0,103.0,714.0
Census region and division,,,,,,,
Northeast,21.0,2269.0,850.0,335.0,174.0,120.0,834.0
New England,5.6,2541.0,1046.0,379.0,127.0,124.0,926.0
Middle Atlantic,15.4,2169.0,779.0,318.0,188.0,118.0,800.0
Midwest,26.4,1760.0,604.0,246.0,148.0,99.0,681.0
East North Central,18.1,1762.0,612.0,247.0,140.0,101.0,676.0
West North Central,8.3,1757.0,587.0,243.0,166.0,96.0,691.0
South,44.4,1917.0,465.0,323.0,392.0,93.0,694.0
South Atlantic,23.5,1963.0,488.0,344.0,386.0,95.0,714.0
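
Because the empty rows act only as category separators, one way to recover the grouping is to turn each separator's label into a column and forward-fill it. This is a sketch on a toy frame with shortened, hypothetical column names; stripping the trailing footnote digit (as in "classification5") is an assumption about how the labels should be cleaned.

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the layout above: all-NaN rows are category headers.
table = pd.DataFrame(
    {
        "housing_units_million": [118.2, np.nan, 21.0, 5.6, np.nan, 94.7, 23.5],
        "space_heating": [543.0, np.nan, 850.0, 1046.0, np.nan, 495.0, 737.0],
    },
    index=[
        "All homes", "Census region and division", "Northeast", "New England",
        "Census urban/rural classification5", "Urban", "Rural",
    ],
)

# A row with no observations at all is treated as a category header.
is_header = table.isna().all(axis=1)

# Keep the index label only on header rows, then carry it down to the
# data rows beneath each header.
labels = table.index.to_series().where(is_header).ffill()

# Assumption: a trailing digit on a label is a footnote marker.
labels = labels.str.replace(r"\d+$", "", regex=True)

# Drop the header rows and attach the category to the remaining rows.
tidy = table.loc[~is_header].assign(category=labels[~is_header])

print(tidy)
```

Rows that precede the first header (here "All homes") get no category and stay `NaN`.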


In [40]:
us_region_energy_usage.iloc[17:33]

Unnamed: 0,Total U.S.2 (Number of housing units (million)),Total Average energy expenditures \n(dollars per household using the end use),Space heating3,Water heating,Air condi-tioning,Refrig-erators,Other4
Census urban/rural classification5,,,,,,,
Urban,94.7,1773.0,495.0,274.0,269.0,102.0,694.0
Urbanized area,82.2,1782.0,488.0,272.0,279.0,104.0,705.0
Urban cluster,12.5,1710.0,540.0,286.0,202.0,92.0,625.0
Rural,23.5,2190.0,737.0,382.0,253.0,105.0,791.0
Metropolitan or micropolitan statistical area,,,,,,,
In metropolitan statistical area,98.5,1840.0,520.0,287.0,274.0,105.0,723.0
In micropolitan statistical area,12.3,1861.0,626.0,325.0,195.0,91.0,657.0
Not in metropolitan or micropolitan statistical area,7.4,2050.0,714.0,368.0,270.0,96.0,688.0
Climate region6,,,,,,,


### [todo]  BLS demographic information, Census data

...

---
## Global Data

### World Bank

- Source: https://datacatalog.worldbank.org/dataset/what-waste-global-database

- Global data on 336 cities. The metrics included cover all steps from the waste management value chain, including waste generation, composition, collection, and disposal, as well as information on user fees and financing, the informal sector, administrative structures, public communication, and legal information.

In [21]:
world_bank_city_data = pd.read_csv('../data/worldbank-city-level-data.csv')
world_bank_city_data.sample(5)  # many columns are mostly NaN

Unnamed: 0,iso3c,region_id,country_name,income_id,city_name,additional_data_annual_budget_for_waste_management_year,additional_data_annual_solid_waste_budget_year,additional_data_annual_swm_budget_2017_year,additional_data_annual_swm_budget_year,additional_data_annual_waste_budget_year,...,waste_treatment_compost_percent,waste_treatment_controlled_landfill_percent,waste_treatment_incineration_percent,waste_treatment_landfill_unspecified_percent,waste_treatment_open_dump_percent,waste_treatment_other_percent,waste_treatment_recycling_percent,waste_treatment_sanitary_landfill_landfill_gas_system_percent,waste_treatment_unaccounted_for_percent,waste_treatment_waterways_marine_percent
298,RUS,ECS,Russian Federation,UMC,Kemerovo,,,,,,...,,,,,98.1,,1.9,,,
275,PAK,SAS,Pakistan,LMC,Lahore,,,,,,...,,61.53,6.15,,,,,32.31,0.01,
70,CRI,LCN,Costa Rica,UMC,San José,,,,,,...,,,,,,0.38,5.2,94.42,,
326,TUN,MEA,Tunisia,LMC,Sfax,,,,,,...,,,,,,,2.0,98.0,,
74,CYP,ECS,Cyprus,HIC,Nicosia,,,,,,...,2.87,,,87.23,,,9.89,,0.01,


In [24]:
world_bank_city_data.shape

(367, 113)

In [18]:
world_bank_city_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 367 entries, 0 to 366
Columns: 113 entries, iso3c to waste_treatment_waterways_marine_percent
dtypes: float64(70), object(43)
memory usage: 324.1+ KB


In [14]:
world_bank_city_data.describe()

Unnamed: 0,additional_data_annual_swm_budget_2017_year,additional_data_annual_swm_budget_year,additional_data_annual_waste_budget_year,additional_data_collection_ton,additional_data_number_of_scavengers_on_dumpsites_number,additional_data_total_annual_costs_to_collect_and_dispose_of_city_s_waste_year,additional_data_total_swm_expenditures_year,additional_data_total_waste_management_budget_year,composition_food_organic_waste_percent,composition_glass_percent,...,waste_treatment_compost_percent,waste_treatment_controlled_landfill_percent,waste_treatment_incineration_percent,waste_treatment_landfill_unspecified_percent,waste_treatment_open_dump_percent,waste_treatment_other_percent,waste_treatment_recycling_percent,waste_treatment_sanitary_landfill_landfill_gas_system_percent,waste_treatment_unaccounted_for_percent,waste_treatment_waterways_marine_percent
count,1.0,2.0,1.0,1.0,1.0,1.0,1.0,1.0,291.0,269.0,...,67.0,60.0,29.0,16.0,100.0,48.0,128.0,56.0,121.0,4.0
mean,1030281.0,2599229.0,4035882.0,0.73,200.0,5900000.0,36424.0,475000.0,45.424727,4.058437,...,19.352687,46.004,42.091724,57.646875,66.2032,21.750771,15.599063,66.106607,32.496331,20.775
std,,3506159.0,,,,,,,18.321482,3.959363,...,20.20761,30.798648,29.848297,35.336363,33.515912,21.412265,15.488925,31.469398,30.609756,14.125715
min,1030281.0,120000.0,4035882.0,0.73,200.0,5900000.0,36424.0,475000.0,0.93,0.0,...,0.0,0.0,0.23,9.0,0.0,0.0,0.0,0.0,0.0,5.0
25%,1030281.0,1359614.0,4035882.0,0.73,200.0,5900000.0,36424.0,475000.0,34.2,1.5,...,2.785,20.0,10.0,23.75,40.475,5.75,4.8075,42.025,5.0,11.0
50%,1030281.0,2599229.0,4035882.0,0.73,200.0,5900000.0,36424.0,475000.0,47.9,3.0,...,14.0,48.7,41.02,70.05,74.5,14.71,10.0,78.05,25.5,21.55
75%,1030281.0,3838844.0,4035882.0,0.73,200.0,5900000.0,36424.0,475000.0,57.35,5.0,...,28.5,67.025,71.1,87.7075,100.0,36.1825,21.0,90.5,56.0,31.325
max,1030281.0,5078458.0,4035882.0,0.73,200.0,5900000.0,36424.0,475000.0,88.8,32.58,...,82.0,100.0,90.33,100.0,100.0,87.56,72.0,100.0,99.85,35.0
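
The `count` row above shows that many columns have only a handful of observations. A rough way to decide which columns are usable is to compute the non-null share per column and filter on it. The sketch below uses a toy frame with values taken from the sampled rows; the 0.5 threshold and the names `coverage` and `usable` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame with values from four of the sampled cities above.
cities = pd.DataFrame({
    "city_name": ["Kemerovo", "Lahore", "Sfax", "Nicosia"],
    "waste_treatment_recycling_percent": [1.9, np.nan, 2.0, 9.89],
    "waste_treatment_open_dump_percent": [98.1, np.nan, np.nan, np.nan],
    "waste_treatment_compost_percent": [np.nan, np.nan, np.nan, 2.87],
})

# Share of non-null values per column, most complete first.
coverage = cities.notna().mean().sort_values(ascending=False)

# Assumed cutoff: keep columns that are populated for at least half the rows.
min_coverage = 0.5
usable = cities.loc[:, coverage[coverage >= min_coverage].index]

print(coverage)
print(list(usable.columns))
```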


### OECD (Organisation for Economic Co-operation and Development)

#### Municipal Waste Generation and Treatment by Country and Year 

- Source: https://stats.oecd.org/Index.aspx?DataSetCode=MUNW

In [20]:
oecd_municipal_waste = pd.read_csv('../data/oecd-waste-municipal.csv')
oecd_municipal_waste.head(5)

Unnamed: 0,COU,Country,VAR,Variable,YEA,Year,Unit Code,Unit,PowerCode Code,PowerCode,Reference Period Code,Reference Period,Value,Flag Codes,Flags
0,AUS,Australia,MUNICIPAL,Municipal waste generated,1992,1992,TONNE,Tonnes,3,Thousands,,,12000.0,E,Estimated value
1,AUS,Australia,MUNICIPAL,Municipal waste generated,2000,2000,TONNE,Tonnes,3,Thousands,,,13200.0,E,Estimated value
2,AUS,Australia,MUNICIPAL,Municipal waste generated,2007,2007,TONNE,Tonnes,3,Thousands,,,12873.0,,
3,AUS,Australia,MUNICIPAL,Municipal waste generated,2008,2008,TONNE,Tonnes,3,Thousands,,,13096.5,,
4,AUS,Australia,MUNICIPAL,Municipal waste generated,2009,2009,TONNE,Tonnes,3,Thousands,,,13320.0,,
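
The table is in long format, with `Value` expressed in the unit given by `PowerCode`. A possible reshaping step is sketched below on the five rows shown. It assumes `PowerCode Code` is a base-10 exponent (3 meaning thousands), which matches the `PowerCode` label here but is not confirmed for every row of the full file.

```python
import pandas as pd

# Toy frame copied from the five rows above (subset of columns).
oecd = pd.DataFrame({
    "Country": ["Australia"] * 5,
    "Variable": ["Municipal waste generated"] * 5,
    "Year": [1992, 2000, 2007, 2008, 2009],
    "PowerCode Code": [3] * 5,
    "Value": [12000.0, 13200.0, 12873.0, 13096.5, 13320.0],
})

# Assumption: PowerCode Code is a power of ten applied to Value.
oecd["tonnes"] = oecd["Value"] * 10.0 ** oecd["PowerCode Code"]

# Keep a single variable, then pivot to one row per country and one
# column per year.
generated = oecd[oecd["Variable"] == "Municipal waste generated"]
wide = generated.pivot_table(
    index="Country", columns="Year", values="tonnes", aggfunc="first"
)

print(wide)
```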


In [23]:
oecd_municipal_waste.shape

(20779, 15)

In [22]:
oecd_municipal_waste.Country.unique()

array(['Australia', 'Austria', 'Belgium', 'Canada', 'Czech Republic',
       'Denmark', 'Finland', 'France', 'Germany', 'Greece', 'Hungary',
       'Iceland', 'Ireland', 'Italy', 'Japan', 'Korea', 'Luxembourg',
       'Mexico', 'Netherlands', 'New Zealand', 'Norway', 'Poland',
       'Portugal', 'Slovak Republic', 'Spain', 'Sweden', 'Switzerland',
       'Turkey', 'United Kingdom', 'United States', 'Brazil', 'Chile',
       "China (People's Republic of)", 'Estonia', 'Indonesia', 'Israel',
       'Russia', 'Slovenia', 'OECD - Total', 'OECD - Europe', 'Colombia',
       'Latvia', 'Costa Rica', 'India', 'Lithuania', 'OECD Asia Oceania',
       'OECD America'], dtype=object)
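
The country list mixes individual countries with regional aggregates such as 'OECD - Total'. Summing across all rows would therefore double count. A minimal sketch of separating them, assuming every aggregate label starts with "OECD" (true for the values listed above):

```python
import pandas as pd

# A few labels taken from the unique() output above.
countries = pd.Series(
    ["Australia", "Korea", "OECD - Total", "OECD - Europe",
     "OECD Asia Oceania", "OECD America", "Costa Rica"],
    name="Country",
)

# Assumption: aggregate rows are exactly those whose label starts with "OECD".
is_aggregate = countries.str.startswith("OECD")
only_countries = countries[~is_aggregate]

print(only_countries.tolist())
```

On the full frame, the same boolean mask built from the `Country` column would select the rows to keep.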

#### Waste Generation by Sector

- Waste generated by each industry sector, by country and year.
- Source: https://stats.oecd.org/Index.aspx?DataSetCode=MUNW

In [26]:
oecd_sector_waste = pd.read_csv('../data/oecd-waste-sector.csv')
oecd_sector_waste.head(5)

Unnamed: 0,COU,Country,ISIC,Variable,YEA,Year,Unit Code,Unit,PowerCode Code,PowerCode,Reference Period Code,Reference Period,Value,Flag Codes,Flags
0,AUS,Australia,10-33,Manufacturing industries: Total,2009,2009,TONNE,Tonnes,3,Thousands,,,13116.0,,
1,AUS,Australia,10-33,Manufacturing industries: Total,2011,2011,TONNE,Tonnes,3,Thousands,,,3858.2,,
2,AUS,Australia,10-12,Manufacture of food products; beverages and to...,2011,2011,TONNE,Tonnes,3,Thousands,,,920.0,,
3,AUS,Australia,13-15,"Manufacture of textiles, wearing apparel, leat...",2011,2011,TONNE,Tonnes,3,Thousands,,,171.9,,
4,AUS,Australia,16,Manufacture of wood and of products of wood an...,2011,2011,TONNE,Tonnes,3,Thousands,,,164.3,,
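
Since ISIC code 10-33 is the manufacturing total, the sub-sector rows can be expressed as shares of it. The sketch below uses only the five rows shown. It assumes every other ISIC code in the frame is a manufacturing sub-sector, which is true here but would need a filter on the full dataset; `TOTAL_CODE` and the new column names are illustrative.

```python
import pandas as pd

# Toy frame copied from the five rows above (subset of columns).
sector = pd.DataFrame({
    "Country": ["Australia"] * 5,
    "ISIC": ["10-33", "10-33", "10-12", "13-15", "16"],
    "Year": [2009, 2011, 2011, 2011, 2011],
    "Value": [13116.0, 3858.2, 920.0, 171.9, 164.3],
})

TOTAL_CODE = "10-33"  # "Manufacturing industries: Total" in the output above

# Manufacturing total per country and year.
totals = (
    sector[sector["ISIC"] == TOTAL_CODE]
    .rename(columns={"Value": "manufacturing_total"})
    [["Country", "Year", "manufacturing_total"]]
)

# Attach the matching total to each sub-sector row and compute its share.
parts = sector[sector["ISIC"] != TOTAL_CODE].merge(
    totals, on=["Country", "Year"], how="left"
)
parts["share_of_manufacturing"] = parts["Value"] / parts["manufacturing_total"]

print(parts[["ISIC", "Year", "share_of_manufacturing"]])
```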
