The purpose of this issue is to translate the existing scripts from the dc_doh_hackathon repository to enable them to be run repeatedly on the restaurant inspection dataset as it is updated.
Two individuals appear to have provided solutions to this issue in the issue_20/ folder: pull_inspection.py and AG_dropbox_upload/datakind_sept2017.R. Examine both solutions and choose the one that works as expected (see the original GitHub issue text below) and is more efficient. Then modify that script so it can be run from the command line with three arguments:
A folder with input restaurant inspection data files (the script should concatenate and merge the files in the directory as appropriate)
The shapefile for census blocks
The output filename
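As a rough illustration of the three-argument interface described above, here is a minimal sketch using Python's standard `argparse` module. The argument names (`input_dir`, `census_shapefile`, `output_file`) are assumptions chosen for this example, not names required by the issue or taken from either existing script.

```python
import argparse


def build_parser():
    """Build a parser for the three positional arguments listed in the issue."""
    parser = argparse.ArgumentParser(
        description="Extract food service establishment features "
                    "from restaurant inspection data."
    )
    # Argument names below are hypothetical; only their meaning comes from the issue.
    parser.add_argument(
        "input_dir",
        help="folder with input restaurant inspection data files",
    )
    parser.add_argument(
        "census_shapefile",
        help="shapefile for census blocks",
    )
    parser.add_argument(
        "output_file",
        help="output filename",
    )
    return parser


def parse_args(argv=None):
    """Parse argv (defaults to sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)
```

With a parser like this, the script would be called with the input folder, the shapefile, and the output filename in that order, and `parse_args()` would expose them as `args.input_dir`, `args.census_shapefile`, and `args.output_file`.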
Please also provide a README.md that describes the script and how to run it.
You can model the solution after the files here or here
Place all of your files in the scripts/feature_engineering/extract_restaurant_inspection_features/ folder
For reference, here is the original issue description from the dc_doh_hackathon:
Start with the DC DOH Food Service Establishment Inspection report data in the /Data Sets/Restaurant Inspections/ folder in Dropbox.
Develop a script to extract the number of food establishments by type and risk category. More details on violations can be found here.
Note that this issue depends upon the geocoding results from Issue #13.
Input:
CSV files with inspection summary and violation details
Output:
A CSV file with
1 row for each establishment type and risk category, and each week, year, and census block
The following columns:
feature_id: The ID for the feature, in this case, "food_service_establishments"
feature_type: The establishment_type from the restaurant data set
feature_subtype: The risk_category from 1-5
year: The ISO-8601 year of the feature value
week: The ISO-8601 week number of the feature value
census_block_2010: The 2010 Census Block of the feature value
value: The value of the feature, i.e. the number of food service establishments with the given types and risk categories in the specified week, year, and census block.
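The output layout above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the logic of either existing script: the input column names `establishment_id`, `inspection_date`, `establishment_type`, `risk_category`, and `census_block_2010` are hypothetical, each row is assumed to be already geocoded to a census block, and an establishment is counted once per group even if it was inspected several times in the same week.

```python
from collections import defaultdict
from datetime import date


def count_establishments(rows):
    """Count distinct establishments per type, risk category, ISO year/week, and block.

    `rows` is an iterable of dicts with the hypothetical keys
    establishment_id, inspection_date (YYYY-MM-DD), establishment_type,
    risk_category, and census_block_2010.
    """
    seen = defaultdict(set)
    for row in rows:
        # isocalendar() yields (ISO year, ISO week, ISO weekday).
        iso = date.fromisoformat(row["inspection_date"]).isocalendar()
        key = (
            row["establishment_type"],
            row["risk_category"],
            iso[0],
            iso[1],
            row["census_block_2010"],
        )
        # A set keeps repeat inspections of one establishment from inflating the count.
        seen[key].add(row["establishment_id"])

    return [
        {
            "feature_id": "food_service_establishments",
            "feature_type": key[0],
            "feature_subtype": key[1],
            "year": key[2],
            "week": key[3],
            "census_block_2010": key[4],
            "value": len(ids),
        }
        for key, ids in sorted(seen.items())
    ]
```

The resulting list of dicts could be written out with `csv.DictWriter`. Note that the ISO-8601 year can differ from the calendar year near January 1, which is why both values are taken from `isocalendar()` rather than from the date's `year` attribute.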