# Scraping LADOT Volume Data from PDFs

## Background
At LADOT, we have a lot of historical (and relatively current) data on vehicle volume counts made at intersections throughout the city at various times. The problem is that the format the data are currently in - PDF - doesn't readily allow for the type of big data analyses that we would like to perform. So for this task I set out to develop a method for scraping these historical PDF counts and formatting them into usable data tables, using any Python package I could. I ended up settling on the pdfquery package, which is a lightweight wrapper around the better-known PDFMiner package.

The roughly 1,000 PDFs typically (though not always) look like this (converted to images for display here):

![Example Volume PDF](images/example.jpg?raw=TRUE)

##### Format Advantages
* One big advantage is that the (manual count) volume data sheets are generally in the same format.
* With very few exceptions, the PDFs were generated from another program (rather than being scanned images).

##### Format Challenges
There are a few minor challenges:
* There are multiple tables on each page, and each is formatted differently.
* The tables / text do not always appear in the exact same location on each page, which meant I needed a range of parameters to test for the bounding boxes.

## General Approach
My approach can be broken down into four key parts: (1) define bounding boxes, (2) search for text within the bounding boxes, (3) reformat the resulting text into multiple data tables, and (4) join the resulting tables to the ID established by the Bureau of Engineering.

##### Define Bounding Boxes
This was tricky. I initially defined bounding boxes using pixel measurements from a few sample pages. However, I quickly realized that this would not work because of the second challenge mentioned above: the tables sit in different locations from one PDF document to the next.

Instead, I decided to create bounding boxes on the fly for each document, positioned relative to certain keywords that appear on almost every PDF. Using pdfquery, I could search the document for these keywords and extract the x, y coordinates of each keyword's bounding box. By combining the coordinates of multiple keyword objects on the page, I could construct a set of bounding boxes that performed relatively well at capturing the data tables.
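
As a rough sketch of the keyword-anchored idea, the helper below builds a table box that spans from one keyword's top-left corner to another keyword's bottom-right corner. The function names, the padding parameter, and the sample coordinates are assumptions made for illustration, not the code used in this project. The selector string follows the `:in_bbox("x0, y0, x1, y1")` form shown in the pdfquery README. In PDFMiner's coordinate system the origin is the bottom-left corner of the page, so `y1` is the top edge of a box.

```python
def bbox_from_keywords(top_left_kw, bottom_right_kw, pad=2.0):
    """Return (x0, y0, x1, y1) spanning two keyword boxes, grown by `pad`."""
    x0 = top_left_kw["x0"] - pad      # left edge of the first keyword
    y1 = top_left_kw["y1"] + pad      # top edge of the first keyword
    x1 = bottom_right_kw["x1"] + pad  # right edge of the second keyword
    y0 = bottom_right_kw["y0"] - pad  # bottom edge of the second keyword
    return (x0, y0, x1, y1)


def in_bbox_selector(bbox, element="LTTextLineHorizontal"):
    """Format a pdfquery-style selector matching elements inside `bbox`."""
    return '%s:in_bbox("%s, %s, %s, %s")' % ((element,) + tuple(bbox))


# Hypothetical keyword boxes, as they might be read from the x0 / y0 / x1 / y1
# attributes of the elements returned by a keyword search.
header_kw = {"x0": 50, "y0": 690, "x1": 120, "y1": 700}
footer_kw = {"x0": 260, "y0": 500, "x1": 300, "y1": 510}

table_bbox = bbox_from_keywords(header_kw, footer_kw, pad=0)
selector = in_bbox_selector(table_bbox)
print(selector)
```

With a loaded `pdfquery.PDFQuery` object named `pdf`, a selector like this would be passed to `pdf.pq(selector)` to retrieve the matching elements.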

(create image of what this looked like)

##### Search for Text within Bounding Boxes
Once I had the coordinates of the bounding boxes, this part was quite easy: I used pdfquery to extract the text inside each box.
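
To make the containment test concrete, here is a plain-Python stand-in for what an in-bbox query does: keep the text elements whose boxes fall entirely inside the bounding box, then return them in reading order. This is not pdfquery's implementation, and the element dictionaries and coordinates are made up for the example.

```python
def inside(element, bbox):
    """True if the element's box lies entirely within bbox = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = bbox
    return (element["x0"] >= x0 and element["y0"] >= y0 and
            element["x1"] <= x1 and element["y1"] <= y1)


def text_in_bbox(elements, bbox):
    """Return the text of contained elements, top-to-bottom then left-to-right."""
    hits = [e for e in elements if inside(e, bbox)]
    # PDF y-coordinates grow upward, so a larger y1 means higher on the page.
    hits.sort(key=lambda e: (-e["y1"], e["x0"]))
    return [e["text"] for e in hits]


# Hypothetical text elements with made-up coordinates.
elements = [
    {"text": "81", "x0": 100, "y0": 600, "x1": 112, "y1": 610},
    {"text": "7-8", "x0": 60, "y0": 600, "x1": 80, "y1": 610},
    {"text": "6", "x0": 100, "y0": 585, "x1": 106, "y1": 595},
    {"text": "WEATHER", "x0": 400, "y0": 700, "x1": 450, "y1": 710},
]
cells = text_in_bbox(elements, (50, 500, 300, 700))
print(cells)
```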

(do i need to adjust any of the parameters in rapidminer??)

##### Reformat the Resulting Text into Data Tables
The next problem was taking the scraped text from the bounding boxes and reformatting it into usable data tables. I kept the relational database model in mind as I set the format for these tables. From the PDF image above, I decided on the following tables and attributes:

*tbl_manualcount_info:* This table contains the basic information about the manual count summary. Each count will have one tuple with the following information:
* street_ns: The North / South Street running through the intersection
* street_ew: The East / West street running through the intersection
* dayofweek: Day of the Week
* date: Date, in datetime format
* weather: Prevailing weather at the time of the count (Clear, Sunny, etc.)
* hours: the hours of the count (text)
* school_day: A Yes / No indication of whether the count occurred on a school day. This is important because it heavily affects the volume counts
* int_code: The "I/S Code" on the form corresponds to the CL_Node_ID on the BOE Centerline. This ID field makes it easy to join to the City's centerline network
* district: The DOT field district in which the count took place
* count_id: Unique identifier assigned to the summary

*tbl_manualcount_dualwheel:* This table contains count data for dual-wheeled (motorcycles), bikes, and buses. Each form will have 12 tuples with the following information:
* count_id: Unique identifier assigned to the summary in "tbl_manualcount_info"
* approach: Intersection approach being measured (N,S,E,W)
* type: Dual-Wheeled / Bikes / Buses
* volume: Count

*tbl_manualcount_peak:* This table contains the peak hour / 15 minute counts. Each form will have 16 tuples with the following information:
* count_id: Unique identifier assigned to the summary in "tbl_manualcount_info"
* approach: Intersection approach being measured (N,S,E,W)
* type: The type of count
    * am_15: The AM peak 15 minute count
    * am_hour: The AM peak hour count
    * pm_15: The PM peak 15 minute count
    * pm_hour: The PM peak hour count
* time: Time of each count (in datetime format)
* volume: Count

*tbl_manualcount_volumes:* This table contains the main volume counts for each approach at the intersection. The number of tuples for each form will vary depending on the number of hours surveyed. A 6-hour count will have 6 hours * 3 movements (left, right, through) * 4 approach directions = 72 tuples. Each tuple will have the following information:
* count_id: Unique identifier assigned to the summary in "tbl_manualcount_info"
* approach: Northbound (NB) / Southbound (SB) / Eastbound (EB) / Westbound (WB)
* movement: Right-Turn (Rt) / Through (Th) / Left-Turn (Lt)
* start_time: Start time of that count, in datetime format
* end_time: End time of that count, in datetime format
* volume: Count
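
The row layout of this table can be sketched by generating one empty row per hour, approach, and movement, which also makes the expected row count easy to check. The function name, the `count_id` value, and the choice to leave `volume` unset are assumptions for illustration. The start hours mirror the hours that appear in the sample output later in this notebook.

```python
from datetime import date, datetime, time, timedelta
from itertools import product

APPROACHES = ["NB", "SB", "EB", "WB"]
MOVEMENTS = ["Rt", "Th", "Lt"]


def empty_volume_rows(count_id, count_date, start_hours):
    """Build one row per (hour, approach, movement) with volume left unset."""
    rows = []
    for hour, approach, movement in product(start_hours, APPROACHES, MOVEMENTS):
        start = datetime.combine(count_date, time(hour))
        rows.append({
            "count_id": count_id,
            "approach": approach,
            "movement": movement,
            "start_time": start,
            "end_time": start + timedelta(hours=1),
            "volume": None,  # to be filled in from the scraped text
        })
    return rows


# Hypothetical 6-hour count: three morning hours and three afternoon hours.
rows = empty_volume_rows(count_id=1,
                         count_date=date(2015, 6, 11),
                         start_hours=[7, 8, 9, 15, 16, 17])
print(len(rows))  # 6 hours * 4 approaches * 3 movements = 72
```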

*tbl_manualcount_peds:* This table contains pedestrian and schoolchildren counts during the same time as the main volume counts, so the number of tuples will also be dependent on the number of hours the location was surveyed. Each tuple will have the following information:
* count_id: Unique identifier assigned to the summary in "tbl_manualcount_info"
* xing_leg: The leg of the intersection that is being crossed. South Leg (SL) / North Leg (NL) / West Leg (WL) / East Leg (EL)
* type: Ped / Sch
* start_time: Start time of that count, in datetime format
* end_time: End time of that count, in datetime format
* volume: Count

##### Join Tables to BOE Count IDs
The last step involves taking all the data generated by this process and relating it to both a GPS coordinate and the intersection ID of the BOE centerline.

To do this, I requested the data from BOE that powers NavigateLA. BOE provided two relevant tables. The first table, *dbo_dot_traffic_data_files*, relates the name of each PDF to a TrafficID, so I can use it to match the PDF names and get the corresponding TrafficIDs. The second table, *dot_traffic_data*, relates the TrafficID to the intersection ID (the same one that is usually on the front page of each traffic count summary), as well as the lat / lon of the location and the intersection name. The structure of the two tables is shown below:

*dbo_dot_traffic_data_files*
* ID: Unique Identifier for the count
* TrafficID: This seems to be an identifier for the location
* TrafficType: Manual Count (manual_count) / Automatic Count (automatic_count) / Survey Data (survey_data)
* DocName: Name of the PDF document
* UniqueDocName: Not exactly sure the purpose of this one, perhaps at one time the DocNames were not unique?
* UploadDT: Date the PDF was uploaded to NavLA

*dot_traffic_data*
* TrafficID: Identifier for the location
* IntersectionID: The ID for the intersection, corresponding to the CL_NODE_ID in the BOE Centerline file
* ext: ?
* lat: Latitude
* lon: Longitude
* intersection: Intersection Name (eg ISLAND AVE at L ST)
* Shape: geometry object
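
The two-step lookup described above can be sketched with a pair of pandas merges. The miniature tables below are stand-ins with made-up values, and the `IntersectionID` column name is taken from the loading code later in this notebook. Treat this as one possible way to do the join rather than the final method.

```python
import pandas as pd

# Tiny stand-ins for the two BOE tables; every value here is made up.
files_tbl = pd.DataFrame({
    "TrafficID": [101, 102],
    "DocName": ["example_a.pdf", "example_b.pdf"],
})
traffic_tbl = pd.DataFrame({
    "TrafficID": [101, 102],
    "IntersectionID": [2.0, 4.0],
    "lat": [34.0, 34.1],
    "lon": [-118.4, -118.3],
})

# Names of the PDFs that were scraped (hypothetical).
scraped = pd.DataFrame({"DocName": ["example_b.pdf"]})

# Step 1: PDF name -> TrafficID. Step 2: TrafficID -> intersection and location.
joined = (scraped
          .merge(files_tbl, on="DocName", how="left")
          .merge(traffic_tbl, on="TrafficID", how="left"))
print(joined)
```

Using `how="left"` keeps every scraped PDF in the result, so documents without a match show up with missing values instead of silently disappearing.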

## Getting Started
I'm actually going to start by loading and cleaning the tables provided by BOE (mentioned just above). 

In [47]:
import pandas as pd
import numpy as np

# Load traffic data files table
traffic_data_files_path = 'boe_tables/dbo_dot_traffic_data_files.csv'
dbo_dot_traffic_data_files = pd.read_csv(traffic_data_files_path, parse_dates=['UploadDT'])

# Drop rows where TrafficID is NaN, convert TrafficID to int type
dbo_dot_traffic_data_files = dbo_dot_traffic_data_files.dropna(axis=0, how='any',subset=['TrafficID'])
dbo_dot_traffic_data_files['TrafficID'] = dbo_dot_traffic_data_files['TrafficID'].astype(int)

# Subset out Survey Data and Automatic Counts
dbo_dot_traffic_data_files = dbo_dot_traffic_data_files[(dbo_dot_traffic_data_files['TrafficType'] == 'manual_count')]

# See traffic data files head
dbo_dot_traffic_data_files.head()

| Unnamed: 0 | ID | TrafficID | TrafficType | DocName | UniqueDocName | UploadDT |
|---|---|---|---|---|---|---|
| 0 | 1 | 1435 | manual_count | 2_GRAVDM93.pdf | 2_GRAVDM93.pdf | 2007-04-02 08:38:30 |
| 1 | 2 | 1436 | manual_count | 4_CULVIS95.pdf | 4_CULVIS95.pdf | 2008-02-20 09:15:12 |
| 2 | 3 | 1436 | manual_count | 4_MONCUL100928.pdf | 4_MONCUL100928.pdf | 2011-08-09 13:58:55 |
| 3 | 4 | 1437 | manual_count | 16_VISTA DEL MAR.WATERVIEW07.pdf | 16_VISTA DEL MAR.WATERVIEW07.pdf | 2007-11-28 13:01:46 |
| 4 | 5 | 1437 | manual_count | 16_visvis01.pdf | 16_visvis01.pdf | 2007-12-03 16:30:42 |


In [46]:
# Load traffic data table
traffic_data_path = 'boe_tables/dot_traffic_data.csv'
dot_traffic_data = pd.read_csv(traffic_data_path)

# Inspect the IntersectionID column, sorted ascending (NaN values sort last)
dot_traffic_data["IntersectionID"].sort_values()

1593      2.0
1594      4.0
1595     16.0
7105     19.0
4508     26.0
3869     73.0
7766     98.0
1596    104.0
6119    109.0
5870    110.0
5662    122.0
1597    147.0
3825    151.0
1598    168.0
1599    169.0
7765    176.0
8213    201.0
4688    206.0
5568    210.0
1934    219.0
4687    221.0
1935    222.0
4717    258.0
1600    263.0
3965    269.0
1601    271.0
1602    273.0
6579    275.0
1603    288.0
6386    291.0
        ...  
8563      NaN
8564      NaN
8566      NaN
8567      NaN
8573      NaN
8574      NaN
8579      NaN
8581      NaN
8582      NaN
8592      NaN
8595      NaN
8647      NaN
8667      NaN
8669      NaN
8671      NaN
8675      NaN
8733      NaN
8734      NaN
8735      NaN
8736      NaN
8737      NaN
8738      NaN
8740      NaN
8781      NaN
8817      NaN
8831      NaN
8841      NaN
8909      NaN
8928      NaN
8934      NaN
Name: IntersectionID, Length: 8936, dtype: float64
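
The column above is stored as `float64` because a plain integer column cannot hold the missing values. If integer IDs are wanted for the join, one option is pandas' nullable `Int64` dtype. The short series below is a made-up stand-in for the real column, and whether this conversion is appropriate for the BOE data is an assumption.

```python
import pandas as pd

# Stand-in for the IntersectionID column: two IDs and one missing value.
ids = pd.Series([2.0, 4.0, None], name="IntersectionID")
print(ids.dtype)  # float64, because the missing value forces a float column

# Convert to the nullable integer dtype; the missing value is preserved as <NA>.
ids_int = ids.astype("Int64")
print(ids_int)
```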

In [5]:
### Setup

# Import PDF Scrape Module
import VolumeCountSheets_V2

# Other Dependent Modules
import csv
import glob
from datetime import datetime, date, time
import pdfquery

# Grab all PDFs within the folder
files = glob.glob(r'TrafficCountData\Manual Counts-20170526T005814Z-001\Manual Counts\*.pdf')

### Create Dataframes for each table

# tbl_manualcount_info
info_columns = ['street_ns','street_ew','dayofweek','date','weather','hours','school_day','int_code','district','count_id']
tbl_manualcount_info = pd.DataFrame(columns=info_columns)

# tbl_manualcount_dualwheel
dualwheel_columns = ['count_id','approach','type','volume']
tbl_manualcount_dualwheel = pd.DataFrame(columns=dualwheel_columns)

# tbl_manualcount_peak
peak_columns = ['count_id','approach','type','time','volume']
tbl_manualcount_peak = pd.DataFrame(columns=peak_columns)

# tbl_manualcount_volumes
vol_columns = ['count_id','approach','movement','start_time','end_time','volume']
tbl_manualcount_volumes = pd.DataFrame(columns=vol_columns)

# tbl_manualcount_peds
ped_columns = ['count_id','xing_leg','type','start_time','end_time','volume']
tbl_manualcount_peds = pd.DataFrame(columns=ped_columns)

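### Loop through folder, parse with custom PDF Read Function, Append to dataframes

print(len(files))
success = 0
failures = 0

# Parse each file once, stopping after 10 successes (trial run).
# An `if` check is used here because a `while` loop would re-parse the
# same file repeatedly and never advance to the next one.
for pdf_path in files:
    if success >= 10:
        break
    print(success)
    print(pdf_path)
    try:
        Manual_TC = VolumeCountSheets_V2.pdf_extract(pdf_path)
        print(Manual_TC['Volume'])
        success = success + 1
    except Exception as exc:
        # Record the failure and move on to the next file
        print("failure: %s" % exc)
        failures = failures + 1
print("Success Count")
print(success)
print("Failure Count")
print(failures)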


1018
0
success!
[{'volume': '81', 'start_time': datetime.datetime(2015, 6, 11, 7, 0), 'approach': 'SB', 'end_time': datetime.datetime(2015, 6, 11, 8, 0), 'movement': 'Rt'}, {'volume': '6', 'start_time': datetime.datetime(2015, 6, 11, 8, 0), 'approach': 'SB', 'end_time': datetime.datetime(2015, 6, 11, 9, 0), 'movement': 'Rt'}, {'volume': '3', 'start_time': datetime.datetime(2015, 6, 11, 9, 0), 'approach': 'SB', 'end_time': datetime.datetime(2015, 6, 11, 10, 0), 'movement': 'Rt'}, {'volume': '2', 'start_time': datetime.datetime(2015, 6, 11, 15, 0), 'approach': 'SB', 'end_time': datetime.datetime(2015, 6, 11, 16, 0), 'movement': 'Rt'}, {'volume': '6', 'start_time': datetime.datetime(2015, 6, 11, 16, 0), 'approach': 'SB', 'end_time': datetime.datetime(2015, 6, 11, 17, 0), 'movement': 'Rt'}, {'volume': '4', 'start_time': datetime.datetime(2015, 6, 11, 17, 0), 'approach': 'SB', 'end_time': datetime.datetime(2015, 6, 11, 18, 0), 'movement': 'Rt'}, {'volume': '75', 'start_time': datetime.datet
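
The records printed above are dictionaries with string volumes and no `count_id`, so they need a small amount of cleanup before they fit the `tbl_manualcount_volumes` layout. The helper below is a minimal sketch of that step. The function name and the `count_id` value are hypothetical, and the single sample record is copied from the output above.

```python
from datetime import datetime


def tag_volume_records(records, count_id):
    """Copy each parsed record, add count_id, and convert volume to int."""
    tagged = []
    for rec in records:
        row = dict(rec)                     # copy so the parsed data is untouched
        row["count_id"] = count_id
        row["volume"] = int(row["volume"])  # '81' -> 81
        tagged.append(row)
    return tagged


# One record in the shape returned under the 'Volume' key.
sample = [{'volume': '81',
           'start_time': datetime(2015, 6, 11, 7, 0),
           'approach': 'SB',
           'end_time': datetime(2015, 6, 11, 8, 0),
           'movement': 'Rt'}]

all_rows = []
all_rows.extend(tag_volume_records(sample, count_id=1))
print(all_rows[0]["volume"])
```

Collecting rows in a list and building the table once at the end, for example with `pd.DataFrame(all_rows, columns=vol_columns)`, is usually simpler than appending to a DataFrame inside the loop.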
