# Predict Waste Production for its Reduction

## Context

According to the World Bank, cities generated 2.01 billion tons of solid waste in 2016, or
around 0.74 kg per person per day. With the rapid growth of cities, this number is only
expected to increase. As cities grow, there is an urgent need to optimize waste
processing and to provide more targeted public education on waste management and
separation. Finally, waste collection itself also has an impact on air pollution.

## Goal & Outcome

The goal of this challenge is to identify trends in waste production and to generate
insights into how to reduce waste and optimize its collection. The expected outcome is an
explainable model for predicting future waste production.
Finally, don't forget to propose an application (product) for the model and study its
impact.

## Data

The data is the Austin Resource Recovery (ARR) daily report, which provides waste collection information in the following fields:
- Report Date: The date collections information was recorded.
- Load Type: The specific type of load that is being collected on that day.
- Load Time: Date & Time of Loading
- Load Weight: The weight (in pounds) collected for each service on the day it was delivered to a diversion facility
- Dropoff Site: The location where each type of waste is delivered for disposal, recycling or reuse:
  - TDS Landfill: the Texas Disposal Systems landfill located at 12200 Carl Rd, Creedmoor, TX 78610.
  - Balcones Recycling: a recycling facility located at 9301 Johnny Morris Road, Austin, TX 78724.
  - MRF: a Materials Recycling Facility (such as Texas Disposal Systems or Balcones Recycling).
  - Hornsby Bend: located at 2210 FM 973, Austin, TX 78725; accepts food scraps, yard trimmings, food-soiled paper and other materials collected by ARR, which are combined with other waste to produce nutrient-rich Dillo Dirt, used for landscaping.
- Route Type: The general category of collection service provided by Austin Resource Recovery
- Route Number: The Austin Resource Recovery route followed by the truck that collected this load. Each route number consists of abbreviated letters indicating the service type (e.g. Bulk = "BU") and a number indicating the specific route.

This information is used to help ARR reach its goals to transform waste into resources while keeping our community clean. For more information, visit www.austintexas.gov/department/austin-resource-recovery
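The structure of the Route Number field described above (letters for the service type, then a route number) can be split with a regular expression. This is a minimal sketch: the helper name `split_route_number` and the assumption that every code is "letters followed by digits" are illustrative, and codes that do not fit the pattern are returned unchanged.

```python
import re

# Assumed pattern: leading letters (service prefix) followed by digits (route id).
ROUTE_PATTERN = re.compile(r"^([A-Za-z]+)(\d+)$")


def split_route_number(route):
    """Return (prefix, number) for codes like 'BU13', or (route, None) if no match."""
    match = ROUTE_PATTERN.match(route.strip())
    if match is None:
        return route, None
    return match.group(1), int(match.group(2))


# Route codes taken from the sample rows shown later in this notebook.
for code in ["BU13", "RTAU53", "DSS04", "OBT99"]:
    print(code, split_route_number(code))
```

Converting the numeric part to `int` drops leading zeros ("DSS04" becomes 4); keep it as a string if the original code must be reconstructed.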

# Development

In [None]:
import pandas as pd
import plotly.express as px
import json
import fiona

In [131]:
!apt install libspatialindex-dev
!pip install osmnx
!pip install osmium
!pip install contextily
!pip install osm-runner

E: Could not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)
E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), are you root?
Collecting osmnx
  Downloading osmnx-1.1.2-py2.py3-none-any.whl (95 kB)
Collecting Rtree>=0.9
  Downloading Rtree-0.9.7-cp38-cp38-manylinux2010_x86_64.whl (994 kB)
Collecting numpy>=1.21
  Downloading numpy-1.22.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.8 MB)
Collecting pyproj>=3.2
  Downloading pyproj-3.3.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.

Collecting rasterio
  Downloading rasterio-1.2.10-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (19.2 MB)
Collecting geopy
  Downloading geopy-2.2.0-py3-none-any.whl (118 kB)
Collecting geographiclib<2,>=1.49
  Downloading geographiclib-1.52-py3-none-any.whl (38 kB)
Collecting affine
  Downloading affine-2.3.0-py2.py3-none-any.whl (15 kB)
Collecting snuggs>=1.4.1
  Downloading snuggs-1.4.7-py3-none-any.whl (5.4 kB)
Installing collected packages: geographiclib, affine, xyzservices, snuggs, mercantile, geopy, rasterio, contextily
Successfully installed affine-2.3.0 contextily-1.2.0 geographiclib-1.52 geopy-2.2.0 mercantile-1.2.1 rasterio-1.2.10 snuggs-1.4.7 xyzservices-2022.2.0
You should consider upgrading via th

  Downloading jupyterlab_widgets-1.0.2-py3-none-any.whl (243 kB)
Collecting SecretStorage>=3.2
  Downloading SecretStorage-3.3.1-py3-none-any.whl (15 kB)
Collecting jeepney>=0.4.2
  Downloading jeepney-0.7.1-py3-none-any.whl (54 kB)
Collecting setuptools-scm
  Downloading setuptools_scm-6.4.2-py3-none-any.whl (37 kB)
Collecting gssapi
  Using cached gssapi-1.7.3.tar.gz (1.3 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [14 lines of output]
      /bin/sh: 1: krb5-config: Permission denied
      In distributed package, building from C fi

In [34]:
pd.set_option('display.float_format', '{:f}'.format)

In [35]:
data = pd.read_csv("data/waste_data.csv")

In [36]:
data.head()

Unnamed: 0,Report Date,Load Type,Load Time,Load Weight,Dropoff Site,Route Type,Route Number,Load ID
0,12/08/2020,BULK,12/08/2020 03:02:00 PM,5220.0,TDS LANDFILL,BULK,BU13,899097
1,12/08/2020,RECYCLING - SINGLE STREAM,12/08/2020 10:00:00 AM,11140.0,TDS - MRF,RECYCLING - SINGLE STREAM,RTAU53,899078
2,12/03/2020,RECYCLING - SINGLE STREAM,12/03/2020 10:34:00 AM,10060.0,BALCONES RECYCLING,RECYCLING - SINGLE STREAM,RHBU10,899082
3,12/07/2020,SWEEPING,12/07/2020 10:15:00 AM,7100.0,TDS LANDFILL,SWEEPER DUMPSITES,DSS04,899030
4,12/07/2020,RECYCLING - SINGLE STREAM,12/07/2020 04:00:00 PM,12000.0,TDS - MRF,RECYCLING - SINGLE STREAM,RMAU53,899048


In [37]:
data.tail()

Unnamed: 0,Report Date,Load Type,Load Time,Load Weight,Dropoff Site,Route Type,Route Number,Load ID
740868,04/09/2008,RECYCLING - PAPER,07/11/2021 07:00:39 AM,1080.0,MRF,RECYCLING,RW05,273708
740869,12/01/2015,BULK,07/11/2021 07:05:29 AM,9360.0,TDS LANDFILL,STORM,HAFLDBU15,676651
740870,04/25/2007,YARD TRIMMING,07/11/2021 07:01:56 AM,,HORNSBY BEND,YARD TRIMMINGS,YW04,224646
740871,04/09/2008,RECYCLING - COMINGLE,07/11/2021 07:00:39 AM,3960.0,MRF,RECYCLING,RW04,273706
740872,04/08/2008,RECYCLING - COMINGLE,07/11/2021 07:00:39 AM,5280.0,MRF,RECYCLING,RT24,273694


In [38]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 740873 entries, 0 to 740872
Data columns (total 8 columns):
 #   Column        Non-Null Count   Dtype  
---  ------        --------------   -----  
 0   Report Date   740873 non-null  object 
 1   Load Type     740873 non-null  object 
 2   Load Time     740873 non-null  object 
 3   Load Weight   668538 non-null  float64
 4   Dropoff Site  740873 non-null  object 
 5   Route Type    740873 non-null  object 
 6   Route Number  740873 non-null  object 
 7   Load ID       740873 non-null  int64  
dtypes: float64(1), int64(1), object(6)
memory usage: 45.2+ MB


In [39]:
data.describe()

Unnamed: 0,Load Weight,Load ID
count,668538.0,740873.0
mean,11763.477576,521353.123651
std,7554.855662,249972.621259
min,-4480.0,101223.0
25%,5740.0,289609.0
50%,11020.0,554862.0
75%,16520.0,741648.0
max,1562821.0,929006.0
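
The summary above shows a negative minimum (-4480 lb) and a maximum of 1,562,821 lb, both of which look like data-entry errors rather than real loads. A minimal sketch for flagging such rows follows; the 40,000 lb upper bound and the helper name `flag_implausible_weights` are assumptions for illustration, not ARR figures, and should be tuned against known truck capacities.

```python
import pandas as pd

# Hypothetical upper bound in pounds; 40,000 lb is an assumption, not an ARR figure.
MAX_PLAUSIBLE_LB = 40_000


def flag_implausible_weights(weights, upper=MAX_PLAUSIBLE_LB):
    """Return a boolean Series: True where a weight is negative or above `upper`."""
    return (weights < 0) | (weights > upper)


# Small sample including the minimum and maximum reported by describe().
sample = pd.Series([5220.0, -4480.0, 1562821.0, None, 11140.0], name="Load Weight")
mask = flag_implausible_weights(sample)
print(sample[mask])
```

Missing weights compare as `False` and are therefore not flagged; they need separate handling (for example `sample.isna()`).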


In [40]:
data["Report Date"] = pd.to_datetime(data["Report Date"])
data["Load Time"] = pd.to_datetime(data["Load Time"])
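
Parsing without an explicit format leaves pandas to infer it, which is slower and can hide malformed values. The sketch below passes the formats seen in the sample rows (`12/08/2020` and `12/08/2020 03:02:00 PM`) and uses `errors="coerce"` so unparseable strings become `NaT`. It assumes every row in the file uses these two formats, which has not been verified here.

```python
import pandas as pd

# Formats inferred from the rows printed by data.head(); assumed to hold file-wide.
REPORT_DATE_FORMAT = "%m/%d/%Y"
LOAD_TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"

raw = pd.DataFrame({
    "Report Date": ["12/08/2020", "12/03/2020"],
    "Load Time": ["12/08/2020 03:02:00 PM", "not a timestamp"],
})

parsed = pd.DataFrame({
    "Report Date": pd.to_datetime(
        raw["Report Date"], format=REPORT_DATE_FORMAT, errors="coerce"
    ),
    "Load Time": pd.to_datetime(
        raw["Load Time"], format=LOAD_TIME_FORMAT, errors="coerce"
    ),
})
print(parsed)
```

Counting `parsed["Load Time"].isna()` afterwards shows how many values failed to parse.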

In [41]:
data.head(100)

Unnamed: 0,Report Date,Load Type,Load Time,Load Weight,Dropoff Site,Route Type,Route Number,Load ID
0,2020-12-08,BULK,2020-12-08 15:02:00,5220.000000,TDS LANDFILL,BULK,BU13,899097
1,2020-12-08,RECYCLING - SINGLE STREAM,2020-12-08 10:00:00,11140.000000,TDS - MRF,RECYCLING - SINGLE STREAM,RTAU53,899078
2,2020-12-03,RECYCLING - SINGLE STREAM,2020-12-03 10:34:00,10060.000000,BALCONES RECYCLING,RECYCLING - SINGLE STREAM,RHBU10,899082
3,2020-12-07,SWEEPING,2020-12-07 10:15:00,7100.000000,TDS LANDFILL,SWEEPER DUMPSITES,DSS04,899030
4,2020-12-07,RECYCLING - SINGLE STREAM,2020-12-07 16:00:00,12000.000000,TDS - MRF,RECYCLING - SINGLE STREAM,RMAU53,899048
...,...,...,...,...,...,...,...,...
95,2020-12-08,ORGANICS,2020-12-08 13:43:00,14260.000000,ORGANICS BY GOSH,YARD TRIMMINGS-ORGANICS,OT25,899221
96,2020-12-09,BRUSH,2020-12-09 11:27:00,8200.000000,HORNSBY BEND,BRUSH,BR24,899245
97,2020-12-08,ORGANICS,2020-12-08 13:53:00,11660.000000,ORGANICS BY GOSH,YARD TRIMMINGS-ORGANICS,OBT99,899223
98,2020-12-08,ORGANICS,2020-12-08 14:53:00,12840.000000,ORGANICS BY GOSH,YARD TRIMMINGS-ORGANICS,OT10,899254


In [75]:
# Two rows have a Load Time year later than 2021: the year was entered
# incorrectly, while the day and month are right. Drop those rows here.
data = data[pd.DatetimeIndex(data["Load Time"]).year <= 2021] 

In [76]:
# Alternative to dropping (not run): correct the year on the two affected rows
# data.loc[354250, "Load Time"] = data.loc[354250, "Load Time"].replace(year=2021)
# data.loc[730958, "Load Time"] = data.loc[730958, "Load Time"].replace(year=2021)

In [77]:
data[pd.DatetimeIndex(data["Load Time"]).year > 2021] 

Unnamed: 0,Report Date,Load Type,Load Time,Load Weight,Dropoff Site,Route Type,Route Number,Load ID,Year,Month,Day,Hour


In [78]:
data.dtypes

Report Date     datetime64[ns]
Load Type               object
Load Time       datetime64[ns]
Load Weight            float64
Dropoff Site            object
Route Type              object
Route Number            object
Load ID                  int64
Year                     int64
Month                    int64
Day                      int64
Hour                     int64
dtype: object

In [79]:
data.sort_values(by = "Load Time", ascending = False)

Unnamed: 0,Report Date,Load Type,Load Time,Load Weight,Dropoff Site,Route Type,Route Number,Load ID,Year,Month,Day,Hour
717844,2020-12-21,RECYCLING - SINGLE STREAM,2021-12-21 12:41:00,6940.000000,TDS LANDFILL,RECYCLING - SINGLE STREAM,RMAU21,906125,2021,12,21,12
739696,2020-11-24,ORGANICS,2021-12-07 00:00:00,1340.000000,ORGANICS BY GOSH,YARD TRIMMINGS-ORGANICS,OBT99,927983,2021,12,7,0
740735,2021-06-28,MIXED LITTER,2021-07-11 07:07:45,3140.000000,TDS LANDFILL,KAB,KAB02,927260,2021,7,11,7
740751,2021-06-30,GARBAGE COLLECTIONS,2021-07-11 07:07:42,17200.000000,TDS LANDFILL,GARBAGE COLLECTION,PW30,928229,2021,7,11,7
740721,2020-09-23,GARBAGE COLLECTIONS,2021-07-11 07:07:30,0.000000,TDS LANDFILL,GARBAGE COLLECTION,PAW70,889455,2021,7,11,7
...,...,...,...,...,...,...,...,...,...,...,...,...
107125,2012-10-16,BULK,2001-10-16 15:28:00,8260.000000,TDS LANDFILL,BULK,BU16,545996,2001,10,16,15
322083,2012-10-16,BULK,2001-10-16 11:51:00,14080.000000,TDS LANDFILL,BULK,BU16,545997,2001,10,16,11
175739,2012-03-16,BULK,2001-03-16 13:33:00,4740.000000,TDS LANDFILL,BULK,BU05,522334,2001,3,16,13
550853,2012-03-16,BULK,2001-03-16 09:38:00,4240.000000,TDS LANDFILL,BULK,BU05,522335,2001,3,16,9
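
The sorted output still contains rows where Load Time and Report Date disagree by a year or more (for example Report Date 2020-12-21 with Load Time 2021-12-21, and 2012 reports with 2001 load times). A minimal sketch for flagging such rows compares the two columns; the 30-day tolerance and the helper name `flag_date_mismatch` are assumptions chosen for illustration.

```python
import pandas as pd

# Assumed tolerance: a load is normally recorded within a few weeks of its report date.
MAX_GAP = pd.Timedelta(days=30)


def flag_date_mismatch(frame, max_gap=MAX_GAP):
    """Return True where Load Time and Report Date differ by more than `max_gap`."""
    gap = (frame["Load Time"] - frame["Report Date"]).abs()
    return gap > max_gap


# Three date pairs copied from the sorted output above.
sample = pd.DataFrame({
    "Report Date": pd.to_datetime(["2020-12-21", "2012-10-16", "2021-06-30"]),
    "Load Time": pd.to_datetime(
        ["2021-12-21 12:41:00", "2001-10-16 15:28:00", "2021-07-11 07:07:42"]
    ),
})
sample["suspect"] = flag_date_mismatch(sample)
print(sample)
```

Flagged rows could then be dropped, or their Load Time year replaced with the Report Date year, depending on which column is trusted more.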


In [80]:
data["Load Type"].unique()

array(['BULK', 'RECYCLING - SINGLE STREAM', 'SWEEPING',
       'GARBAGE COLLECTIONS', 'YARD TRIMMING', 'BRUSH', 'ORGANICS',
       'MIXED LITTER', 'RECYCLED METAL', 'TIRES', 'DEAD ANIMAL', 'LITTER',
       'RECYCLING - COMINGLE', 'RECYCLING - PAPER', 'BAGGED LITTER',
       'MULCH', 'MATTRESS', 'RECYCLING - PLASTIC BAGS',
       'CONTAMINATED RECYCLING', 'CONTAMINATED YARD TRIMMINGS',
       'YARD TRIMMING - X-MAS TREES', 'CONTAMINATED ORGANICS'],
      dtype=object)

In [81]:
data["Load Type"].value_counts()

GARBAGE COLLECTIONS            258433
RECYCLING - SINGLE STREAM      147652
SWEEPING                        88563
YARD TRIMMING                   69571
BULK                            40120
BRUSH                           39164
RECYCLING - PAPER               32162
RECYCLING - COMINGLE            31125
ORGANICS                        17721
DEAD ANIMAL                      6860
TIRES                            3233
MIXED LITTER                     2177
LITTER                           1578
MULCH                            1344
RECYCLED METAL                   1049
BAGGED LITTER                      43
RECYCLING - PLASTIC BAGS           40
YARD TRIMMING - X-MAS TREES        17
MATTRESS                            9
CONTAMINATED RECYCLING              8
CONTAMINATED YARD TRIMMINGS         1
CONTAMINATED ORGANICS               1
Name: Load Type, dtype: int64

## By Load Type

In [82]:
weightsum_by_type = data[["Load Weight", "Load Type"]].groupby(by = "Load Type").sum()
weightmean_by_type = data[["Load Weight", "Load Type"]].groupby(by = "Load Type").mean()

In [83]:
weightsum_by_type.sort_values(by= "Load Weight", ascending= False)

Load Type,Load Weight
GARBAGE COLLECTIONS,4414309701.5394
RECYCLING - SINGLE STREAM,1460816242.0
YARD TRIMMING,789597964.0
BULK,300764817.0
BRUSH,234882597.0
ORGANICS,192513186.0
SWEEPING,190643057.0
RECYCLING - PAPER,141661620.0
RECYCLING - COMINGLE,101514879.0
MULCH,10710573.0


In [84]:
weightmean_by_type.sort_values(by= "Load Weight", ascending= False)

Load Type,Load Weight
CONTAMINATED ORGANICS,25380.0
GARBAGE COLLECTIONS,17083.572444
BAGGED LITTER,12707.465116
YARD TRIMMING - X-MAS TREES,11750.0
SWEEPING,11538.73968
YARD TRIMMING,11352.301291
ORGANICS,10873.37961
RECYCLING - SINGLE STREAM,9896.324432
MULCH,7969.176339
BULK,7497.191141
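
The sum and mean tables above can also be produced in a single pass with named aggregation, which additionally exposes the number of loads behind each mean (a mean over one load, such as CONTAMINATED ORGANICS, is not comparable to a mean over hundreds of thousands). This is a sketch on a toy frame; the output column names and the conversion to short tons (2,000 lb) are illustrative choices.

```python
import pandas as pd

LB_PER_SHORT_TON = 2000

# Toy data using the same column names as the notebook's DataFrame.
sample = pd.DataFrame({
    "Load Type": ["BULK", "BULK", "SWEEPING", "SWEEPING", "SWEEPING"],
    "Load Weight": [5220.0, 9360.0, 7100.0, None, 6900.0],
})

summary = (
    sample.groupby("Load Type")["Load Weight"]
    .agg(total_lb="sum", mean_lb="mean", loads="count")  # count ignores missing weights
    .sort_values("total_lb", ascending=False)
)
summary["total_tons"] = summary["total_lb"] / LB_PER_SHORT_TON
print(summary)
```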


https://routereadytrucks.com/blogs/know-4-major-types-garbage-trucks/

### Front Loader Garbage Trucks

Front loader garbage trucks collect waste from industrial and commercial properties, which requires massive containers. Their containers, often called dumpsters, are spacious enough for industrial waste, from slime and sludge to factory waste. The trucks have hydraulically controlled steel forks, which the operator uses to lift the waste and dump it into the container.

Most front loaders available in the US can lift containers weighing approximately 8,000 lbs and can hold up to 40 cubic yards of trash.

### Side Loader Garbage Trucks

Side loader garbage trucks are best suited to household waste and are loaded from the side. There are two variants: automated, with robotic arms that collect the garbage, and manual. Automated side loaders are slightly more expensive but require only one operator. A side loader can collect rubbish from almost 1,500 homes every day.

The size of a side loader determines how much waste it can carry. Most standard trucks can hold approximately 30,000 lbs of compacted garbage every day and up to 28 cubic yards of garbage. Some of these trucks are available second-hand at a budget-friendly price, and manual side loaders cost less than automated ones.

### Rear Loader Garbage Trucks

Rear loader garbage trucks serve both commercial and residential clients and are the most versatile type for trash collection. Their large rear opening allows massive quantities of waste to be collected in one go, including the bin bags that many residential clients use, whatever their size. The same opening also helps when dumping the contents.

Most rear loader garbage trucks can accommodate trash from as many as 800 to 850 homes. Some of the bigger variants can haul up to 18 tons of garbage. Their volume capacity ranges from 6 to 35 cubic yards depending on their size. Used trucks can be purchased at a very affordable price, but check the condition of the truck before buying.

### Roll Off Trucks

Roll off trucks are the most popular garbage trucks for mass-scale commercial trash removal and are common on demolition and construction sites. Their sturdy construction makes them a good fit for heavier materials, such as cardboard and steel. They carry massive roll off containers that can be dropped at specified locations and picked up again once the clients have loaded them with waste.

A roll off truck can pick up a loaded container without much effort and can carry approximately 20,000 lbs, which is equal to 10 tons. Its sturdy construction ensures that the truck is not damaged during pickup and drop off.

Each of these truck types is chosen for efficiency, based on the quantity of trash it can carry.

## By Dropoff Sites

In [85]:
weightsum_by_dropoff = data[["Load Weight", "Dropoff Site"]].groupby(by = "Dropoff Site").sum()
weightmean_by_dropoff = data[["Load Weight", "Dropoff Site"]].groupby(by = "Dropoff Site").mean()

In [86]:
weightmean_by_dropoff.sort_values(by= "Load Weight", ascending= False)

Dropoff Site,Load Weight
CLARKSON,18640.0
GREAT NORTHERN,15673.023256
ELMONT,15303.4
TDS LANDFILL,15095.92366
BFI LANDFILL,14168.027682
WESTFIELD,13804.705882
BURGER CENTER,12785.285714
BRAKER SITE,12385.538037
KRAMER,11660.0
BARTON SKYWAY,11480.0


In [87]:
weightsum_by_dropoff.sort_values(by= "Load Weight", ascending= False)

Dropoff Site,Load Weight
TDS LANDFILL,4917859434.5394
HORNSBY BEND,874901337.0
BALCONES RECYCLING,603553677.0
TDS - MRF,565151254.0
MRF,475345737.0
ORGANICS BY GOSH,276194353.0
STEINER LANDFILL,98683172.0
BRAKER SITE,31422110.0
ZILKER,8872712.0
BFI LANDFILL,4094560.0


## By Time

In [88]:
list_datetime_objs = ["Year", "Month", "Day", "Hour"]

In [89]:
data["Year"] = data["Load Time"].dt.year
data["Month"] = data["Load Time"].dt.month
data["Day"] = data["Load Time"].dt.day  # day of month (1-31)
data["Hour"] = data["Load Time"].dt.hour

### Year

In [90]:
weightsum_by_year = data[["Load Weight", "Year"]].groupby(by = "Year").sum()
weightmean_by_year = data[["Load Weight", "Year"]].groupby(by = "Year").mean()

In [91]:
weightsum_by_year.sort_values(by= "Year", ascending= False)

Year,Load Weight
2021,308022145.5394
2020,538346515.0
2019,498997378.0
2018,492915926.0
2017,499623063.0
2016,494224133.0
2015,490125736.0
2014,474482585.0
2013,463140246.0
2012,453635109.0


In [94]:
fig = px.bar(weightsum_by_year, x= weightsum_by_year.index, y="Load Weight", title='Load Weight by Year')
fig.show()
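
As a first explainable baseline for the yearly totals, a straight line can be fitted by ordinary least squares. The sketch below uses the 2012 to 2020 totals from the table above and leaves out 2021, whose total is far lower than the other years and appears to cover only part of the year. It is a naive trend under these assumptions, not a validated forecast, and `fit_line` is a hypothetical helper written for this example.

```python
# Yearly totals in pounds, copied from the table above (2021 excluded as incomplete).
yearly_total_lb = {
    2012: 453635109.0,
    2013: 463140246.0,
    2014: 474482585.0,
    2015: 490125736.0,
    2016: 494224133.0,
    2017: 499623063.0,
    2018: 492915926.0,
    2019: 498997378.0,
    2020: 538346515.0,
}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


years = sorted(yearly_total_lb)
totals = [yearly_total_lb[year] for year in years]
slope, intercept = fit_line(years, totals)
projection_2021 = slope * 2021 + intercept

print(f"trend: {slope:,.0f} lb per year")
print(f"naive full-year 2021 projection: {projection_2021:,.0f} lb")
```

The slope is directly interpretable as the average yearly change in collected weight, which is what makes this baseline easy to explain before moving to richer models.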

### Month

In [95]:
weightsum_by_month = data[["Load Weight", "Month"]].groupby(by = "Month").sum()
weightmean_by_month = data[["Load Weight", "Month"]].groupby(by = "Month").mean()

In [98]:
weightsum_by_month.sort_values(by= "Month", ascending= False)

Month,Load Weight
12,682193789.0
11,638176652.0
10,620107337.0
9,576343634.0
8,588799928.0
7,614474435.0
6,669048718.298
5,711709802.0
4,733726337.0
3,758917079.2414


In [97]:
fig = px.bar(weightsum_by_month, x= weightsum_by_month.index, y="Load Weight", title='Load Weight by Month')
fig.show()

### Day

In [102]:
weightsum_by_day = data[["Load Weight", "Day"]].groupby(by = "Day").sum()
weightmean_by_day = data[["Load Weight", "Day"]].groupby(by = "Day").mean()

In [104]:
weightsum_by_day.sort_values(by= "Day", ascending= False)

Day,Load Weight
31,152541111.0
30,245724235.0
29,250483390.0
28,263211774.0
27,260741394.0
26,255571887.0
25,240395552.0
24,251897718.0
23,254768983.1164
22,254331774.125


In [109]:
fig = px.bar(weightsum_by_day, x= weightsum_by_day.index, y="Load Weight", title='Load Weight by Day')
fig.show()
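
The total for day 31 is much lower than for the other days partly because only seven months have a 31st, so raw sums by day of month are not directly comparable. One way to correct for this, sketched below with standard-library code, is to count how often each day of month occurs in the covered period and divide the sums by those counts. The date range and the toy totals are assumptions for illustration only.

```python
from collections import Counter
from datetime import date, timedelta


def day_of_month_counts(start, end):
    """Count how often each day of month occurs from start to end, inclusive."""
    counts = Counter()
    current = start
    while current <= end:
        counts[current.day] += 1
        current += timedelta(days=1)
    return counts


# Example period: the calendar year 2020 (a leap year).
counts = day_of_month_counts(date(2020, 1, 1), date(2020, 12, 31))

# Toy totals (not real data) to show the normalisation step.
toy_sum_by_day = {30: 1100.0, 31: 700.0}
mean_per_occurrence = {
    day: total / counts[day] for day, total in toy_sum_by_day.items()
}
print(counts[30], counts[31], mean_per_occurrence)
```

In the toy example the raw sums differ, but the per-occurrence means are equal once each sum is divided by the number of times that day occurs.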

### Hour

In [106]:
weightsum_by_hour = data[["Load Weight", "Hour"]].groupby(by = "Hour").sum()
weightmean_by_hour = data[["Load Weight", "Hour"]].groupby(by = "Hour").mean()

In [108]:
weightsum_by_hour.sort_values(by= "Hour", ascending= False)

Hour,Load Weight
23,107280.0
22,116140.0
21,561480.0
20,4108730.0
19,23716905.0
18,80603777.0
17,202358728.0
16,407622490.0
15,857953551.298
14,1324166677.1164


In [110]:
fig = px.bar(weightsum_by_hour, x= weightsum_by_hour.index, y="Load Weight", title='Load Weight by Hour')
fig.show()

## Open Street Maps

The OSM dataset contains waste management locations listed in OpenStreetMap (OSM). Specifically, it includes OSM features tagged "amenity:recycling", "amenity:waste_transfer_station", "amenity:sanitary_dump_station", "amenity:waste_disposal", or "industrial:scrap_yard". Each feature has a poi_type, a poi_name, and all other OSM tags associated with the point (see https://taginfo.openstreetmap.org/tags).
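
The tag filter described above can be sketched in plain Python. When an OSM extract is converted with GDAL, tags without a dedicated column are typically stored in an `other_tags` field as `"key"=>"value"` pairs; the parser below assumes that format, and the helper names `parse_other_tags` and `is_waste_feature` are hypothetical.

```python
import re

# Matches "key"=>"value" pairs, allowing backslash-escaped characters in either part.
PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"=>"((?:[^"\\]|\\.)*)"')

# The five tag/value combinations listed in the text above.
WASTE_TAGS = {
    ("amenity", "recycling"),
    ("amenity", "waste_transfer_station"),
    ("amenity", "sanitary_dump_station"),
    ("amenity", "waste_disposal"),
    ("industrial", "scrap_yard"),
}


def parse_other_tags(text):
    """Turn an other_tags string into a dict; empty or missing input gives {}."""
    if not text:
        return {}
    return {key: value for key, value in PAIR.findall(text)}


def is_waste_feature(tags):
    """Return True if any of the waste-related tag/value pairs is present."""
    return any(tags.get(key) == value for key, value in WASTE_TAGS)


example = '"amenity"=>"recycling","recycling_type"=>"centre"'
tags = parse_other_tags(example)
print(tags, is_waste_feature(tags))
```

Applied to a GeoDataFrame, the same two functions could be used with `.apply` on the `other_tags` column, after also checking any dedicated `amenity` column the layer provides.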

In [114]:
%%bash
wget    https://download.bbbike.org/osm/extract/planet_-96.068,29.5_-94.852,30.146.osm.pbf \
    --quiet -O data/Houston.osm.pbf

In [124]:
!ogrinfo data/Houston.osm.pbf

INFO: Open of `data/Houston.osm.pbf'
      using driver `OSM' successful.
1: points (Point)
2: lines (Line String)
3: multilinestrings (Multi Line String)
4: multipolygons (Multi Polygon)
5: other_relations (Geometry Collection)


In [127]:
%%bash
ogr2ogr -f "GPKG" \
    data/houston_polygons.gpkg \
    data/Houston.osm.pbf \
    -nlt POLYGONS \
    -nln polygons

0...10...20...30...40...50...60...70...80...90...100 - done.




In [134]:
import json
import fiona
import geopandas as gpd
from shapely.geometry import shape

In [135]:
# Read every feature from the GeoPackage into memory (slow for a city-sized extract)
layer_file = "data/houston_polygons.gpkg"
with fiona.open(layer_file, 'r') as src:
    collection = list(src)
df1 = pd.DataFrame(collection)


# Check geometry: 1 if shapely can build a shape from the mapping, else 0
def isvalid(geom):
    try:
        shape(geom)
        return 1
    except Exception:
        return 0

df1['isvalid'] = df1['geometry'].apply(isvalid)
df1 = df1[df1['isvalid'] == 1]
collection = json.loads(df1.to_json(orient='records'))

# Convert to GeoDataFrame
gdf_houston_poly = gpd.GeoDataFrame.from_features(collection)

KeyboardInterrupt: 