Merge pull request #45 from cityofaustin/knack_api
Restructure and Knacpy Refactor
johnclary committed Sep 15, 2017
2 parents 3919ddb + 3ed8eaf commit 7dbb521
Showing 82 changed files with 3,840 additions and 2,794 deletions.
5 changes: 2 additions & 3 deletions .gitignore
@@ -4,7 +4,6 @@
__pycache__
*.zip
dist
secrets.py
schtask.txt
intersection-database
mapDrive.bat
@@ -13,5 +12,5 @@ to_test.py
tp.py
script_status.md
log
shell_scripts
*.log
*.log
secrets.py
51 changes: 33 additions & 18 deletions README.md
@@ -4,39 +4,54 @@ This repo houses ETL scripts for Austin Transportation's open data projects. The

## Quick Start

We use [Anacaonda](https://conda.io/) (actually, [Miniconda](https://conda.io/miniconda.html)) to manage Python environments. If you don't want to use Anaconda, [requirements.txt]() identifies all of the packages required to run the scripts in this repo.
We use [Miniconda](https://conda.io/miniconda.html) to manage Python environments. If you don't want to use Anaconda, [requirements.txt]() identifies all of the packages required to run the scripts in this repo.

1. Install [Miniconda](https://conda.io/miniconda.html). Check out the [test drive](https://conda.io/docs/test-drive.html#managing-environments) if you haven't used Anaconda before.

2. Clone this repository into your directory of choice: `git clone https://github.com/cityofaustin/transportation-data-publishing`

3. `cd` into the repo directory, and run `conda create --name datapub1 --file requirements.txt` to create the data publishing environment

4. Create your `secrets.py` file following the template in [fake-secrets.py](https://github.com/cityofaustin/transportation-data-publishing/blob/master/fake_secrets.py)
4. Create your `secrets.py` file following the template in [fake-py](https://github.com/cityofaustin/transportation-data-publishing/blob/master/fake_py)

## About the Scripts
## About the Repo Structure

#### [ArcGIS Online Helpers](https://github.com/cityofaustin/transportation-data-publishing/blob/master/agol_helpers.py)
Query, add, and delete features from an ArcGIS Online Feature Service
#### [bcycle]()

#### [Data Helpers](https://github.com/cityofaustin/transportation-data-publishing/blob/master/data_helpers.py)
Handy bits of code for common ETL tasks, mostly borrowed from Stack Overflow snippets.
These scripts load B-Cycle trip data from an Austin B-Cycle Dropbox folder to [data.austintexas.gov](http://data.austintexas.gov).

#### [Socrata Helpers](https://github.com/cityofaustin/transportation-data-publishing/blob/master/socrata_helpers.py)
Use the Socrata Open Data API to publish #opendata.
#### [config]()

#### [Knack Helpers](https://github.com/cityofaustin/transportation-data-publishing/blob/master/knack_helpers.py)
Scripts for accessing the [Knack API](http://knack.freshdesk.com/support/solutions/articles/5000444173-working-with-the-api).
Config holds configuration files needed for the various scripts. `secrets.py` belongs here -- see `fake_secrets.py` as a reference.
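
As a rough illustration, a `secrets.py` might look like the sketch below. The setting names are the ones the scripts in this commit import from `config.secrets`; every value is a placeholder, and the dict shapes of `SOCRATA_CREDENTIALS` and `AGOL_CREDENTIALS`, the list type of `ALERTS_DISTRIBUTION`, and the `missing_settings` helper are assumptions made for illustration, not part of the repo.

```python
# Hypothetical sketch of config/secrets.py. Every value is a placeholder.
# Names come from the scripts in this commit; the shapes are assumptions.
SOCRATA_CREDENTIALS = {"user": "you@example.com", "password": "changeme"}
AGOL_CREDENTIALS = {"user": "agol-user", "password": "changeme"}
DROPBOX_BCYCLE_TOKEN = "dropbox-access-token"
ALERTS_DISTRIBUTION = ["alerts@example.com"]
EMAIL = {"user": "sender@example.com", "password": "changeme"}

# Settings the bcycle scripts reference by name.
REQUIRED = (
    "SOCRATA_CREDENTIALS",
    "AGOL_CREDENTIALS",
    "DROPBOX_BCYCLE_TOKEN",
    "ALERTS_DISTRIBUTION",
    "EMAIL",
)


def missing_settings(namespace):
    """Return the required setting names that are absent from `namespace`."""
    return [name for name in REQUIRED if name not in namespace]
```

A check such as `missing_settings(vars(secrets))` after `from config import secrets` would report any setting that has not been filled in yet.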

#### [Email Helpers](https://github.com/cityofaustin/transportation-data-publishing/blob/master/email_helpers.py)
Helpers for sending emails with [yagmail](https://github.com/kootenpv/yagmail)
#### [data_tracker]()

#### [KITS Helpers](https://github.com/cityofaustin/transportation-data-publishing/blob/master/kits_helpers.py)
Scripts for accessing the KITS SQL database which supports Austin Transportation's Advanced Traffic Management System (ATMS).
These scripts modify data in our Data Tracker application, and support its integration with other applications.

#### [GitHub Helpers](https://github.com/cityofaustin/transportation-data-publishing/blob/master/github_helpers.py)
Helpers for commiting to GitHub with programmaticaly. Code borrowed from @luqmaan and @openaustin's [Construction Permits](https://github.com/open-austin/construction-permits) project.
#### [open_data]()

These scripts publish transportation data to [data.austintexas.gov](http://data.austintexas.gov) and the City's ArcGIS Online organization site.

#### [shell_scripts]()

This is where we maintain the various shell scripts that instruct our VMs to run our Python code.

#### [traffic_study]()

These are the dedicated files for publishing traffic study data, as described [in the wiki](https://github.com/cityofaustin/transportation-data-publishing/wiki/Traffic-Count-Data-Publishing).

#### [util]()

We maintain general-purpose util scripts in the `util` folder. They store useful routines such as connecting to databases, publishing to specific applications, or converting between date formats.
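
To give a sense of the kind of routine that lives there, here is a minimal sketch of a date-format conversion using only the standard library. The function names `iso_to_unix_ms` and `unix_ms_to_iso` are hypothetical and are not taken from the `util` folder; treating timestamps without an offset as UTC is also an assumption.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_unix_ms(iso_string):
    """Convert an ISO 8601 timestamp to milliseconds since the Unix epoch."""
    dt = datetime.fromisoformat(iso_string)
    if dt.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def unix_ms_to_iso(unix_ms):
    """Convert milliseconds since the Unix epoch to an ISO 8601 UTC string."""
    return (EPOCH + timedelta(milliseconds=unix_ms)).isoformat()
```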

## Contributing

Public contributions are welcome! Assign pull requests to [@johnclary](http://github.com/johnclary).
Public contributions are welcome! Assign pull requests to [@johnclary](http://github.com/johnclary).

## License

As a work of the City of Austin, this project is in the public domain within the United States.

Additionally, we waive copyright and related rights in the work worldwide through the [CC0 1.0 Universal public domain dedication](https://creativecommons.org/publicdomain/zero/1.0/).


87 changes: 0 additions & 87 deletions backup_objs.py

This file was deleted.

3 changes: 3 additions & 0 deletions bcycle/_setpath.py
@@ -0,0 +1,3 @@
# update current path to include util directory
import sys
sys.path.append('..')
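
Note that `sys.path.append('..')` is resolved against the current working directory, so it only finds `util` and `config` when the script is launched from inside `bcycle/`. A possible alternative, sketched here with a hypothetical helper name `add_parent_to_path`, resolves the parent directory from the script's own location instead:

```python
import os
import sys


def add_parent_to_path(script_file):
    """Append the parent of the script's directory to sys.path and return it."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    parent = os.path.dirname(script_dir)
    if parent not in sys.path:
        # Avoid adding the same entry twice on repeated imports.
        sys.path.append(parent)
    return parent
```

Inside `_setpath.py` this would be called as `add_parent_to_path(__file__)`.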
81 changes: 81 additions & 0 deletions bcycle/bcycle_kiosk_pub.py
@@ -0,0 +1,81 @@
'''
upload latest b-cycle kiosk data to socrata
'''
import csv
import pdb

import arrow
import dropbox

import _setpath
from config.secrets import *
from util import agolutil
from util import datautil
from util import emailutil
from util import socratautil


def get_dropbox_data(path, token):
    '''
    get dropbox csv and return as list of dicts
    '''
    print("get dropbox data")
    client = dropbox.client.DropboxClient(token)
    f, metadata = client.get_file_and_metadata(path)

    content = f.read()

    data_string = content.decode('utf-8')
    data = data_string.splitlines()
    # del(data[0]) # remove header row

    reader = csv.DictReader(data)

    return list(reader)


try:
    socrata_creds = SOCRATA_CREDENTIALS
    access_token = DROPBOX_BCYCLE_TOKEN

    fieldnames = ['kiosk_id', 'kiosk_name', 'kiosk_status', 'latitude', 'longitude']

    # AGOL CONFIG
    SERVICE_URL = 'http://services.arcgis.com/0L95CJ0VTaxqcmED/arcgis/rest/services/bcycle_kiosks/FeatureServer/0/'

    resource_id = 'qd73-bsdg'
    resource_id_pub_log = 'n5kp-f8k4'

    path = '/austinbcycletripdata/50StationPlusOld-LongLatInfo.csv'

    # get latest kiosk data from B-cycle dropbox
    data = get_dropbox_data(path, access_token)
    data = datautil.upper_case_keys(data)
    data = socratautil.create_location_fields(data)

    token = agolutil.get_token(AGOL_CREDENTIALS)
    data = datautil.replace_keys(data, {"STATUS" : "KIOSK_STATUS"} )
    data = datautil.filter_by_key_exists(data, 'LATITUDE')

    # replace arcgis online features
    agol_payload = agolutil.build_payload(data)
    del_response = agolutil.delete_features(SERVICE_URL, token)
    add_response = agolutil.add_features(SERVICE_URL, token, agol_payload)

    # reformat for socrata and upsert
    data = datautil.lower_case_keys(data)

    upsert_res = socratautil.upsert_data(socrata_creds, data, resource_id)

    # update publication log
    log_entry = socratautil.prep_pub_log(arrow.now(), 'bcycle_kiosk_update', upsert_res)

    socratautil.upsert_data(socrata_creds, log_entry, resource_id_pub_log)

    print(upsert_res)

except Exception as e:
    print('Failed to process bcycle kiosk data for {}'.format(arrow.now().format()))
    print(e)
    emailutil.send_email(ALERTS_DISTRIBUTION, 'BCycle Kiosk Update Failure', str(e), EMAIL['user'], EMAIL['password'])
    raise e
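
The key-handling steps in the script above can be pictured with the following sketch. The `util/datautil.py` source is not shown in this part of the diff, so these three functions are hypothetical stand-ins that only mirror what the names `upper_case_keys`, `replace_keys`, and `filter_by_key_exists` suggest; the sample rows are invented.

```python
def upper_case_keys(rows):
    """Return copies of the rows with every key upper-cased."""
    return [{key.upper(): value for key, value in row.items()} for row in rows]


def replace_keys(rows, lookup):
    """Rename keys found in `lookup`; leave all other keys unchanged."""
    return [
        {lookup.get(key, key): value for key, value in row.items()}
        for row in rows
    ]


def filter_by_key_exists(rows, key):
    """Keep only the rows that contain `key`."""
    return [row for row in rows if key in row]


# Invented sample rows standing in for the parsed kiosk CSV.
rows = [
    {"kiosk_id": "1", "status": "active", "latitude": "30.26"},
    {"kiosk_id": "2", "status": "closed"},
]
rows = upper_case_keys(rows)
rows = replace_keys(rows, {"STATUS": "KIOSK_STATUS"})
rows = filter_by_key_exists(rows, "LATITUDE")
```

Under these assumptions, the second sample row is dropped because it has no `LATITUDE` key, which is presumably why the script filters before building the ArcGIS Online payload.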
122 changes: 122 additions & 0 deletions bcycle/bcycle_trip_pub.py
@@ -0,0 +1,122 @@
'''
compare socrata and dropbox b-cycle data
upload latest data to socrata as needed
'''
import csv
import sys

import arrow
import dropbox
import requests

import _setpath
from config.secrets import *
from util import emailutil
from util import socratautil


def max_date_socrata(resource_id):
    url = 'https://data.austintexas.gov/resource/{}.json?$query=SELECT max(checkout_date) as date'.format(resource_id)

    try:
        res = requests.get(url, verify=False)
        # requests.get() does not raise on 4xx/5xx responses by itself
        res.raise_for_status()

    except requests.exceptions.HTTPError as e:
        raise e

    return res.json()[0]['date']


def data_exists_dropbox(path, token):
    '''
    Check if last month's trip data exists on dropbox
    '''
    client = dropbox.client.DropboxClient(token)

    try:
        client.metadata(path)
        return True

    except Exception:
        # any lookup failure is treated as "file not available yet"
        return False


def get_dropbox_data(path, token):
    '''
    get dropbox csv and return as list of dicts
    '''
    client = dropbox.client.DropboxClient(token)
    f, metadata = client.get_file_and_metadata(path)

    content = f.read()

    data_string = content.decode('utf-8')
    data = data_string.splitlines()
    del(data[0]) # remove header row

    reader = csv.DictReader(data, fieldnames=fieldnames)

    return list(reader)

try:
    socrata_creds = SOCRATA_CREDENTIALS
    access_token = DROPBOX_BCYCLE_TOKEN

    fieldnames = ('trip_id', 'membership_type', 'bicycle_id', 'checkout_date', 'checkout_time', 'checkout_kiosk_id', 'checkout_kiosk', 'return_kiosk_id', 'return_kiosk', 'trip_duration_minutes')

    one_month_ago = arrow.now().replace(months=-1)
    dropbox_year = one_month_ago.format('YYYY')
    dropbox_month = one_month_ago.format('MM')

    resource_id_query = 'cwi3-ckqi'
    resource_id_publish = 'tyfh-5r8s'
    resource_id_pub_log = 'n5kp-f8k4'

    socrata_dt = max_date_socrata(resource_id_query)
    socrata_month = arrow.get(socrata_dt).format('MM')


    if dropbox_month == socrata_month:
        # data is already up to date on socrata
        print("trip data already is up to date on socrata.")

        # update publication log
        upsert_res = { 'Errors' : 0, 'message' : 'No new trip data detected' , 'Rows Updated' : 0, 'Rows Created' : 0, 'Rows Deleted' : 0 }
        log_entry = socratautil.prep_pub_log(arrow.now(), 'bcycle_trip_update', upsert_res)
        socratautil.upsert_data(socrata_creds, log_entry, resource_id_pub_log)

        sys.exit()

    else:
        current_file = 'TripReport-{}{}.csv'.format(dropbox_month, dropbox_year)
        root = 'austinbcycletripdata' # note the lowercase-ness
        path = '/{}/{}/{}'.format(root, dropbox_year, current_file)

        if data_exists_dropbox(path, access_token):
            print("getting new data")
            data = get_dropbox_data(path, access_token)

            print("upserting new data to socrata")
            upsert_res = socratautil.upsert_data(socrata_creds, data, resource_id_publish)

            # update publication log
            log_entry = socratautil.prep_pub_log(arrow.now(), 'bcycle_trip_update', upsert_res)
            socratautil.upsert_data(socrata_creds, log_entry, resource_id_pub_log)
            print(upsert_res)

        else:
            # trip data for this month not yet available
            print("trip data for last month not yet available.")

            # update publication log
            upsert_res = { 'Errors' : 0, 'message' : 'No new trip data detected' , 'Rows Updated' : 0, 'Rows Created' : 0, 'Rows Deleted' : 0 }
            log_entry = socratautil.prep_pub_log(arrow.now(), 'bcycle_trip_update', upsert_res)
            socratautil.upsert_data(socrata_creds, log_entry, resource_id_pub_log)

            sys.exit()

except Exception as e:
    print('Failed to process bcycle trip data for {}'.format(arrow.now().format()))
    print(e)
    emailutil.send_email(ALERTS_DISTRIBUTION, 'BCycle Trip Update Failure', str(e), EMAIL['user'], EMAIL['password'])
    raise e
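
The previous-month file lookup in the script above can be sketched with the standard library alone. The path layout and the `TripReport-{MM}{YYYY}.csv` pattern are taken from the script; the function names `previous_month` and `trip_report_path` are hypothetical, and the sketch swaps `arrow` for `datetime.date` so that it runs without third-party packages.

```python
from datetime import date


def previous_month(today):
    """Return (year, month) for the calendar month before `today`."""
    if today.month == 1:
        # January rolls back to December of the prior year.
        return today.year - 1, 12
    return today.year, today.month - 1


def trip_report_path(today, root="austinbcycletripdata"):
    """Build the Dropbox path of last month's trip report."""
    year, month = previous_month(today)
    filename = "TripReport-{:02d}{}.csv".format(month, year)
    return "/{}/{}/{}".format(root, year, filename)
```

The script compares month numbers only. Comparing the `(year, month)` tuples returned by `previous_month` might be a safer test if the dataset ever falls a full year behind, since equal month numbers from different years would then no longer look up to date.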