Extract
This section summarizes the project's data sources and the Python script that downloads the files from Seattle Open Data to an Azure file share.
Paid Parking data for the city of Seattle is available as CSV files from 2012 to the present. Except for 2020 (the pandemic year), each yearly file is about 42 GB. Downloading the files is not straightforward because each file has a unique code associated with it. To fully automate the ingestion process, the code is extracted by a Python automation script that uses Selenium and a headless Chrome browser.
Method | Source | Feature/Key | Frequency | Description |
---|---|---|---|---|
Python/Selenium | Seattle Open Data | 2012 Year-to-Date Historic | Once | Entire Paid Parking records for the year 2012 |
Python/Selenium | Seattle Open Data | 2013 Year-to-Date Historic | Once | Entire Paid Parking records for the year 2013 |
Python/Selenium | Seattle Open Data | 2014 Year-to-Date Historic | Once | Entire Paid Parking records for the year 2014 |
Python/Selenium | Seattle Open Data | 2015 Year-to-Date Historic | Once | Entire Paid Parking records for the year 2015 |
Python/Selenium | Seattle Open Data | 2016 Year-to-Date Historic | Once | Entire Paid Parking records for the year 2016 |
Python/Selenium | Seattle Open Data | 2017 Year-to-Date Historic | Once | Entire Paid Parking records for the year 2017 |
Python/Selenium | Seattle Open Data | 2018 Year-to-Date Historic | Once | Entire Paid Parking records for the year 2018 |
Python/Selenium | Seattle Open Data | 2019 Year-to-Date Historic | Once | Entire Paid Parking records for the year 2019 |
Python/Selenium | Seattle Open Data | 2020 Year-to-Date Historic | Once | Entire Paid Parking records for the year 2020 |
Python/Selenium | Seattle Open Data | 2021 Year-to-Date Delta | Daily | Delta Paid Parking records for the year 2021 |
Python/Selenium | Blockface | Blockface Data | Daily | Blockface dataset, updated daily |
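
The schedule in the table above can be expressed as data. The following is a minimal sketch, not the project's actual scheduler: the function name `extraction_plan` and the dictionary keys are hypothetical, and it assumes that every year before the current one is a one-time historic load while the current year and the Blockface dataset are refreshed daily.

```python
def extraction_plan(first_year, current_year):
    """Return one entry per dataset, mirroring the Feature/Key and Frequency columns."""
    plan = []
    # Completed years are downloaded once as full historic files.
    for year in range(first_year, current_year):
        plan.append({"feature": f"{year} Year-to-Date Historic", "frequency": "Once"})
    # The current year is ingested as a daily delta.
    plan.append({"feature": f"{current_year} Year-to-Date Delta", "frequency": "Daily"})
    # The Blockface dataset is refreshed daily as well.
    plan.append({"feature": "Blockface Data", "frequency": "Daily"})
    return plan


if __name__ == "__main__":
    for entry in extraction_plan(2012, 2021):
        print(entry["feature"], "->", entry["frequency"])
```
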
Extraction Code:
    import time
    from selenium import webdriver

    # Body of the extraction method. Uses the Selenium 3 API
    # (find_element_by_xpath was removed in Selenium 4.3).
    driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=chromeDriver_Path)
    driver.get(self.seattle_open_data_url)
    time.sleep(2)

    # Enter the search '{Year} Paid Parking' in the search bar
    search_data = driver.find_element_by_xpath(self.search_dataByYear)
    if year != current_year:
        search_data.send_keys("{} Paid Parking".format(year))
    else:
        search_data.send_keys("Paid Parking Last 30 days")
    time.sleep(4)

    # Click on the search '{Year} Paid Parking' in the dropdown
    print(driver.find_element_by_xpath(self.parking_Occpn_Option).text)
    driver.find_element_by_xpath(self.parking_Occpn_Option).click()
    time.sleep(10)

    # Get the URL of Parking Occupancy Data by Year
    url = driver.find_element_by_xpath(self.parking_Occpn_Option_ByYear).get_attribute("href")
    global url_type, file_extn
    urls = url.split("/")
    # Archived years are served as .zip files, the latest data as .csv
    if "Archive" in url:
        url_type = "Archive"
        file_extn = ".zip"
    else:
        url_type = "Latest"
        file_extn = ".csv"
    # The unique dataset code is the sixth "/"-separated piece of the URL
    code = urls[5]
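
The last step of the snippet above (deriving the dataset code, URL type, and file extension from the link) can be isolated into a pure function that is testable without a browser. This is a sketch under assumptions: the function name `parse_dataset_url` and the example URLs are hypothetical placeholders, not real Seattle Open Data links. It only assumes, as the original code does, that the code is the sixth `/`-separated piece of the URL.

```python
def parse_dataset_url(url):
    """Split a dataset URL into (code, url_type, file_extn)."""
    parts = url.split("/")
    if len(parts) < 6:
        raise ValueError(f"unexpected URL shape: {url!r}")
    # Same rule as the extraction code: archived data is zipped, latest data is CSV.
    if "Archive" in url:
        url_type, file_extn = "Archive", ".zip"
    else:
        url_type, file_extn = "Latest", ".csv"
    return parts[5], url_type, file_extn


if __name__ == "__main__":
    # Hypothetical URL used only to illustrate the expected shape.
    print(parse_dataset_url("https://example.org/Transportation/2019-Paid-Parking-Archive/abcd-1234"))
```

Returning a tuple instead of setting `global` variables keeps the parsing step free of side effects, so it can be called once per year.
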
Paid Parking Dataset (2012-2017)
Columns of interest for cleaning and transformation. Historic data needs to be transformed into a common format.
Column | Description |
---|---|
occupancydatetime | The date and time (minute) of the transaction as recorded |
paidoccupancy | Number of vehicles with paid parking at this time. |
blockfacename | Street segment: name of the street with the "from street" and "to street". Example is "1ST AVE BETWEEN BELL ST AND BATTERY ST" |
sideofstreet | Options are: E, S, N, W, NE, SW, SE, NW |
sourceelementkey | Unique identifier for the city street segment where the pay station is located |
parkingtimelimitcategory | In minutes. Options are 120 (2-hour parking), 240 (4-hour parking), 30, or 600 (10-hour parking) |
available_spots | Number of paid spaces on the blockface at the given date and time. |
paidparkingarea | The primary name of a paid parking neighborhood. Example is Commercial Core. |
paidparkingsubarea | A subset of a paid parking area—not all paid parking areas have subareas. |
paidparkingrate | Parking rate charged at date and time |
parkingcategory | An overall description of the type of parking allowed on a blockface |
latitude | Latitude of a location |
longitude | Longitude of a location |
Paid Parking Dataset (2018-Present)
Column | Description |
---|---|
occupancydatetime | The date and time (minute) of the transaction as recorded |
paidoccupancy | Number of vehicles with paid parking at this time. |
blockfacename | Street segment: name of the street with the "from street" and "to street". Example is "1ST AVE BETWEEN BELL ST AND BATTERY ST" |
sideofstreet | Options are: E, S, N, W, NE, SW, SE, NW |
sourceelementkey | Unique identifier for the city street segment where the pay station is located |
parkingtimelimitcategory | In minutes. Options are 120 (2-hour parking), 240 (4-hour parking), 30, or 600 (10-hour parking) |
available_spots | Number of paid spaces on the blockface at the given date and time. |
paidparkingarea | The primary name of a paid parking neighborhood. Example is Commercial Core. |
paidparkingsubarea | A subset of a paid parking area—not all paid parking areas have subareas. |
paidparkingrate | Parking rate charged at date and time |
parkingcategory | An overall description of the type of parking allowed on a blockface |
Location | Calculated based on the known location of a pay station along the same blockface. |
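
The two schemas differ mainly in how position is stored: the 2012-2017 files carry `latitude` and `longitude`, while the 2018-Present files carry a single `Location` column. One way to bring both into a common format is sketched below. It is illustrative only: the function name `to_common_format` is hypothetical, and the `POINT (longitude latitude)` text format assumed for `Location` should be checked against the actual files.

```python
import re

# ASSUMPTION: Location is formatted as "POINT (<longitude> <latitude>)".
_POINT_RE = re.compile(r"POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)")


def to_common_format(row):
    """Return a copy of row with lowercase keys, float latitude/longitude, and no location key."""
    out = {key.lower(): value for key, value in row.items()}
    location = out.pop("location", None)
    if "latitude" in out and "longitude" in out:
        # 2012-2017 schema: coordinates are already separate columns.
        out["latitude"] = float(out["latitude"])
        out["longitude"] = float(out["longitude"])
    else:
        # 2018-Present schema: derive coordinates from the Location column.
        match = _POINT_RE.search(location or "")
        if match is None:
            raise ValueError("row has neither latitude/longitude nor a parsable location")
        out["longitude"] = float(match.group(1))
        out["latitude"] = float(match.group(2))
    return out


if __name__ == "__main__":
    # Made-up sample rows; real rows carry all the columns listed above.
    old_row = {"sourceelementkey": "1001", "latitude": "47.61", "longitude": "-122.34"}
    new_row = {"sourceelementkey": "1001", "Location": "POINT (-122.34 47.61)"}
    print(to_common_format(old_row) == to_common_format(new_row))
```
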
Blockface
Column | Description |
---|---|
station_id | Unique identifier for the city street segment where the pay station is located |
station_address | Station ID Address. |
side | Options are: E, S, N, W, NE, SW, SE, NW |
block_nbr | Blockface number |
parking_category | An overall description of the type of parking allowed on a blockface |
wkd_rate1 | Weekday rate from wkd_start1 to wkd_end1 |
wkd_start1 | Weekday start1, in minutes |
wkd_end1 | Weekday end1, in minutes |
wkd_rate2 | Weekday rate from wkd_start2 to wkd_end2 |
wkd_start2 | Weekday start2, in minutes |
wkd_end2 | Weekday end2, in minutes |
wkd_rate3 | Weekday rate from wkd_start3 to wkd_end3 |
wkd_start3 | Weekday start3, in minutes |
wkd_end3 | Weekday end3, in minutes |
sat_rate1 | Saturday rate from sat_start1 to sat_end1 |
sat_start1 | Saturday start1, in minutes |
sat_end1 | Saturday end1, in minutes |
sat_rate2 | Saturday rate from sat_start2 to sat_end2 |
sat_start2 | Saturday start2, in minutes |
sat_end2 | Saturday end2, in minutes |
sat_rate3 | Saturday rate from sat_start3 to sat_end3 |
sat_start3 | Saturday start3, in minutes |
sat_end3 | Saturday end3, in minutes |
parking_time_limit | In minutes. Options are 120 (2-hour parking), 240 (4-hour parking), 30, or 600 (10-hour parking) |
subarea | A subset of a paid parking area—not all paid parking areas have subareas. |
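
The three rate windows per day type can be used to look up the rate that applies at a given time. The sketch below is an illustration rather than the project's code: the function name `rate_for_minute` is hypothetical, the sample values are made up, and it assumes that start and end values are minutes after midnight with an inclusive end.

```python
def _is_blank(value):
    """Treat missing CSV cells (None or empty string) as blank."""
    return value is None or str(value).strip() == ""


def rate_for_minute(row, minute_of_day, prefix="wkd"):
    """Return the rate whose [start, end] window contains minute_of_day, else None."""
    for slot in (1, 2, 3):
        start = row.get(f"{prefix}_start{slot}")
        end = row.get(f"{prefix}_end{slot}")
        rate = row.get(f"{prefix}_rate{slot}")
        # Skip windows that are not defined for this blockface.
        if _is_blank(start) or _is_blank(end) or _is_blank(rate):
            continue
        if int(start) <= minute_of_day <= int(end):
            return float(rate)
    return None


if __name__ == "__main__":
    # Made-up sample row: 08:00-10:59 at 1.0, 11:00-17:59 at 2.5, third window unused.
    sample = {
        "wkd_rate1": "1.0", "wkd_start1": "480", "wkd_end1": "659",
        "wkd_rate2": "2.5", "wkd_start2": "660", "wkd_end2": "1079",
        "wkd_rate3": "", "wkd_start3": "", "wkd_end3": "",
    }
    print(rate_for_minute(sample, 10 * 60))
```

Passing `prefix="sat"` applies the same lookup to the Saturday columns.
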