The purpose of this issue is to translate the existing scripts from the dc_doh_hackathon repository to enable them to be run repeatedly on the restaurant inspection dataset as it is updated.
From the issue_18/Code & Input Data folder:
Take the `rodents_or_trash_extraction.R` script and verify that it works as expected (see the original full GitHub issue text below). Then, modify it to be run from the command line taking three arguments:

- The input restaurant inspection data file (`...inspection_summary_data.csv`)
- The mapping of `inspection_id` to census block (the output of issue Geocode Restaurant Inspections and Map to Census Blocks #6)

Please also provide a `README.md` that describes the script and how to run it. You can model the solution after the files here or here.

Place all of your files in the `scripts/feature_engineering/extract_restaurant_inspection_features/` folder.

For reference, here is the original issue description from the dc_doh_hackathon:
issue_18
Start with the DC DOH Food Service Establishment Inspection report data in the `/Data Sets/Restaurant Inspections/` folder in Dropbox.

Develop a script to extract the number of food establishment inspections that found rodent or trash-related violations (violations 38 or 54). More details on violations can be found here
Input:
CSV files with inspection summary and violation details
Output:
A CSV file with

- 1 row for each establishment type and risk category, and each week, year, and census block
- The following columns:
  - `feature_id`: The ID for the feature, in this case, "restaurant_violations_rodent_or_trash"
  - `feature_type`: The `establishment_type` from the restaurant data set
  - `feature_subtype`: The `risk_category` from `1`-`5`
  - `year`: The ISO-8601 year of the feature value
  - `week`: The ISO-8601 week number of the feature value
  - `census_block_2010`: The 2010 Census Block of the feature value
  - `value`: The value of the feature, i.e. the number of inspections that found rodent or trash-related violations in establishments with the given types and risk categories in the specified week, year, and census block.
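
The aggregation described above could be sketched in base R roughly as follows. This is only an illustration under stated assumptions, not the contents of `rodents_or_trash_extraction.R`: the function name `extract_rodent_or_trash`, the three data frame arguments, and the input column names `inspection_date` and `violation_number` are hypothetical, and the real CSV files may be laid out differently. The output column names follow the list above.

```r
# Sketch: count inspections with a rodent- or trash-related violation
# (violation numbers 38 or 54) per establishment type, risk category,
# ISO-8601 year and week, and 2010 census block.
#
# ASSUMED inputs (hypothetical column layout):
#   inspections: inspection_id, inspection_date, establishment_type, risk_category
#   violations:  inspection_id, violation_number
#   blocks:      inspection_id, census_block_2010
extract_rodent_or_trash <- function(inspections, violations, blocks,
                                    target_violations = c(38, 54)) {
  columns <- c("feature_id", "feature_type", "feature_subtype",
               "year", "week", "census_block_2010", "value")

  # IDs of inspections that recorded at least one target violation;
  # unique() ensures an inspection with both 38 and 54 is counted once.
  flagged_ids <- unique(
    violations$inspection_id[violations$violation_number %in% target_violations]
  )
  flagged <- inspections[inspections$inspection_id %in% flagged_ids, , drop = FALSE]

  # Inner join: inspections without a census block are dropped.
  flagged <- merge(flagged, blocks, by = "inspection_id")

  # Nothing matched: return an empty table with the expected columns.
  if (nrow(flagged) == 0) {
    empty <- data.frame(matrix(nrow = 0, ncol = length(columns)))
    names(empty) <- columns
    return(empty)
  }

  # %G and %V give the ISO-8601 week-based year and week number.
  dates <- as.Date(flagged$inspection_date)
  flagged$year <- as.integer(format(dates, "%G"))
  flagged$week <- as.integer(format(dates, "%V"))
  flagged$value <- 1L

  counts <- aggregate(
    value ~ establishment_type + risk_category + year + week + census_block_2010,
    data = flagged, FUN = sum
  )

  data.frame(
    feature_id = rep("restaurant_violations_rodent_or_trash", nrow(counts)),
    feature_type = counts$establishment_type,
    feature_subtype = counts$risk_category,
    year = counts$year,
    week = counts$week,
    census_block_2010 = counts$census_block_2010,
    value = counts$value,
    stringsAsFactors = FALSE
  )
}

# Made-up example data, for illustration only.
inspections <- data.frame(
  inspection_id = 1:4,
  inspection_date = c("2018-12-31", "2019-01-02", "2019-01-02", "2018-01-03"),
  establishment_type = c("Restaurant", "Restaurant", "Restaurant", "Deli"),
  risk_category = c(3L, 3L, 3L, 2L),
  stringsAsFactors = FALSE
)
violations <- data.frame(
  inspection_id = c(1, 1, 2, 3, 4),
  violation_number = c(38, 54, 54, 12, 38)
)
blocks <- data.frame(
  inspection_id = 1:4,
  census_block_2010 = c("110010001001000", "110010001001000",
                        "110010001001000", "110010002002000"),
  stringsAsFactors = FALSE
)

features <- extract_rodent_or_trash(inspections, violations, blocks)
print(features)
```

In a command-line version, the file paths could be read with `commandArgs(trailingOnly = TRUE)`, loaded with `read.csv()`, and the result written with `write.csv(features, path, row.names = FALSE)`. Reading the census block column as character avoids any numeric reformatting of the 15-digit block IDs.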