The purpose of this issue is to continue the data cleaning that was started in the dc_doh_hackathon repository. The routine should then be modified so that it can be run repeatedly on the restaurant inspection dataset as it is updated.
From the `issue_19` folder:
Take a look at `Issue 19 Walkthrough.ipynb` for the data-related conclusions reached at the hackathon. Continue with this script or start a new one that develops a routine to identify inspections that resulted in closures (see the original full GitHub issue text below). Be sure that the final script can be run from the command line taking three arguments:
- The input restaurant inspection data file (`...inspection_summary_data.csv`)
- The file mapping `inspection_id` to census block (the output of issue Geocode Restaurant Inspections and Map to Census Blocks #6)

Please also provide a `README.md` that describes the script and how to run it. You can model the solution after the files here or here. Place all of your files in the `scripts/feature_engineering/extract_restaurant_inspection_features/` folder.
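As a rough starting point, a command-line skeleton along these lines might work. This is only a sketch: the script name and argument names are illustrative, and treating the output path as the third argument is an assumption rather than something this issue pins down.

```python
"""Sketch of a CLI for the closure-feature script.

Argument names are illustrative; treating the output path as the third
argument is an assumption, not something the issue specifies.
"""
import argparse


def parse_args():
    parser = argparse.ArgumentParser(
        description="Extract restaurant inspection closure features.")
    parser.add_argument("inspection_summary_csv",
                        help="input restaurant inspection data file "
                             "(...inspection_summary_data.csv)")
    parser.add_argument("census_block_map_csv",
                        help="file mapping inspection_id to 2010 census block "
                             "(output of the geocoding issue)")
    parser.add_argument("output_csv",
                        help="where to write the feature CSV (assumed third argument)")
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    # The feature extraction itself is sketched at the end of this issue.
```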
For reference, here is the original issue description from the dc_doh_hackathon (`issue_19`):
Start with the DC DOH Food Service Establishment Inspection report data in the `/Data Sets/Restaurant Inspections/` folder in Dropbox. Develop a script to extract the number of food establishment inspections that resulted in (temporary) closure of the establishment. More details on violations can be found here.
Input:

- CSV files with inspection summary and violation details

Output:

- A CSV file with:
  - 1 row for each establishment type and risk category, and each week, year, and census block
  - The following columns:
    - `feature_id`: The ID for the feature, in this case, "restaurant_inspection_closures"
    - `feature_type`: The `establishment_type` from the restaurant data set
    - `feature_subtype`: The `risk_category` from 1-5
    - `year`: The ISO-8601 year of the feature value
    - `week`: The ISO-8601 week number of the feature value
    - `census_block_2010`: The 2010 Census Block of the feature value
    - `value`: The value of the feature, i.e. the number of inspections that resulted in closure in establishments with the given types and risk categories in the specified week, year, and census block
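To make the spec concrete, here is a minimal pandas sketch of the whole routine under stated assumptions: the column names in the raw data (`inspection_result`, `inspection_date`, `establishment_type`, `risk_category`, `inspection_id`), the columns of the census block mapping file, and the rule for deciding that an inspection resulted in closure are all placeholders to be replaced with whatever the Issue 19 walkthrough actually found. Only the output columns come from the spec above.

```python
"""Hedged sketch of the closure-feature extraction; column names and the
closure rule are assumptions to be replaced with the walkthrough's findings."""
import pandas as pd


def extract_closure_features(inspection_summary_csv, census_block_map_csv, output_csv):
    inspections = pd.read_csv(inspection_summary_csv)

    # Assumed closure rule: an inspection counts as a closure if its result
    # text mentions closing/closure. The real rule comes from the walkthrough.
    result_text = inspections["inspection_result"].fillna("")
    closures = inspections[result_text.str.contains("clos", case=False)].copy()

    # Attach the 2010 census block via the inspection_id mapping file
    # (assumed to have columns inspection_id and census_block_2010).
    block_map = pd.read_csv(census_block_map_csv)
    closures = closures.merge(block_map, on="inspection_id", how="left")

    # ISO-8601 year and week of the inspection date (column name assumed).
    iso = pd.to_datetime(closures["inspection_date"]).dt.isocalendar()
    closures["year"] = iso["year"]
    closures["week"] = iso["week"]

    # One row per establishment type, risk category, year, week, and census
    # block, with the closure count as the feature value.
    features = (
        closures.groupby(["establishment_type", "risk_category",
                          "year", "week", "census_block_2010"])
        .size()
        .reset_index(name="value")
        .rename(columns={"establishment_type": "feature_type",
                         "risk_category": "feature_subtype"})
    )
    features.insert(0, "feature_id", "restaurant_inspection_closures")
    features.to_csv(output_csv, index=False)
```

Wired up to the argparse skeleton above, the script would then be run as, e.g., `python extract_restaurant_inspection_closures.py <summary.csv> <block_map.csv> <output.csv>` (script name illustrative), which the `README.md` should document.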