Tom Schenk Jr edited this page Jan 13, 2016 · 20 revisions

Project Workflow

A kanban board is located at Waffle.io. The project contains the following high-level tasks:

  • Combine the raw data files of the lab tests for 2008 and beyond.
  • Create a variable indicating lab results above the acceptable threshold.
  • Clean up advisories from DrekBeach, removing advisories not caused by high predicted values of E. coli.
  • Merge the cleaned advisories with lab results to determine whether each advisory was correct.
  • Determine the baseline performance of the current model.
  • Add other data (e.g., weather and other predictors).
  • Create alternative models.
  • Use a train-test framework to compare the performance of the new models.
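The merge and baseline steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes advisories and lab results can be joined on beach and date (`Client.ID` and `Full_date`), that each advisory record carries a hypothetical boolean `advisory` field, and that an advisory counts as "correct" when the calculated geometric mean is at or above 235. The sample rows are invented purely to exercise the functions.

```python
THRESHOLD = 235.0  # same cutoff as elevated_levels_actual_calculated


def merge_advisories_with_lab(lab_rows, advisory_rows):
    """Inner-join lab results and advisories on (Client.ID, Full_date)."""
    # Index advisories by the assumed join key for constant-time lookup.
    advisories = {
        (a["Client.ID"], a["Full_date"]): a["advisory"] for a in advisory_rows
    }
    merged = []
    for lab in lab_rows:
        key = (lab["Client.ID"], lab["Full_date"])
        if key in advisories:  # drop lab rows with no matching advisory record
            merged.append({**lab, "advisory": advisories[key]})
    return merged


def confusion_counts(merged_rows, threshold=THRESHOLD):
    """Count how often an advisory agreed with the lab result."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for row in merged_rows:
        elevated = row["e_coli_geomean_actual_calculated"] >= threshold
        if row["advisory"] and elevated:
            counts["tp"] += 1  # advisory issued, water was elevated
        elif row["advisory"]:
            counts["fp"] += 1  # advisory issued, water was fine
        elif elevated:
            counts["fn"] += 1  # no advisory, water was elevated
        else:
            counts["tn"] += 1  # no advisory, water was fine
    return counts


# Invented example rows (not real data).
lab_rows = [
    {"Client.ID": "Beach A", "Full_date": "2015-07-01", "e_coli_geomean_actual_calculated": 300.0},
    {"Client.ID": "Beach A", "Full_date": "2015-07-02", "e_coli_geomean_actual_calculated": 100.0},
    {"Client.ID": "Beach B", "Full_date": "2015-07-01", "e_coli_geomean_actual_calculated": 500.0},
    {"Client.ID": "Beach B", "Full_date": "2015-07-02", "e_coli_geomean_actual_calculated": 50.0},
]
advisory_rows = [
    {"Client.ID": "Beach A", "Full_date": "2015-07-01", "advisory": True},
    {"Client.ID": "Beach A", "Full_date": "2015-07-02", "advisory": True},
    {"Client.ID": "Beach B", "Full_date": "2015-07-01", "advisory": False},
    {"Client.ID": "Beach B", "Full_date": "2015-07-02", "advisory": False},
    {"Client.ID": "Beach C", "Full_date": "2015-07-01", "advisory": True},  # no lab match
]

merged = merge_advisories_with_lab(lab_rows, advisory_rows)
baseline = confusion_counts(merged)
print(baseline)
```

The same `confusion_counts` function could be applied to the predictions of any alternative model on a held-out set, which would give a like-for-like comparison against the baseline.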

Variables

Variable name                       Description
Client.ID                           Beach name
Full_date                           POSIX date of laboratory reading and corresponding prediction
Year                                Year of laboratory reading and corresponding prediction
Date                                Month_date of laboratory reading and corresponding prediction
Laboratory.ID                       Unique identifier for the laboratory testing
Reading.1                           First laboratory testing results
Reading.2                           Second laboratory testing results
Escherichia.coli                    Calculated geometric mean of Reading.1 and Reading.2, provided by the lab (do not use)
Units                               Units for Reading.1, Reading.2, and Escherichia.coli. Always "MPN/100 ML"
Sample.Collection.Time
Weekday                             Day of week for laboratory reading and corresponding prediction
Month                               Month for laboratory reading and corresponding prediction
Day                                 Day of month for laboratory reading and corresponding prediction
Drek_Reading                        Actual reading values as scraped from Chicago Park District website via DrekBeach
Drek_Prediction                     Predicted values as scraped from Chicago Park District website via DrekBeach
Drek_Worst_Swim_Status              "Worst" swim status collected throughout the course of a day
e_coli_geomean_actual_calculated    Geometric mean of Reading.1 and Reading.2
elevated_levels_actual_calculated   Binary variable indicating whether e_coli_geomean_actual_calculated is >= 235
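As an illustration of the last two variables, the sketch below computes the geometric mean of `Reading.1` and `Reading.2` and flags values at or above 235. It assumes each record is a plain dictionary keyed by the variable names in the table and that both readings are positive numbers; the helper name `add_calculated_fields` and the sample record are hypothetical.

```python
import math

THRESHOLD = 235.0  # cutoff used by elevated_levels_actual_calculated


def geometric_mean(reading_1, reading_2):
    """Geometric mean of two positive readings: sqrt(r1 * r2)."""
    return math.sqrt(reading_1 * reading_2)


def add_calculated_fields(row):
    """Return a copy of the row with the two calculated variables added."""
    geomean = geometric_mean(row["Reading.1"], row["Reading.2"])
    out = dict(row)  # copy so the input record is left unchanged
    out["e_coli_geomean_actual_calculated"] = geomean
    out["elevated_levels_actual_calculated"] = int(geomean >= THRESHOLD)
    return out


# Hypothetical record: sqrt(100 * 400) = 200, which is below the cutoff.
sample = {"Client.ID": "Example Beach", "Reading.1": 100.0, "Reading.2": 400.0}
result = add_calculated_fields(sample)
print(result["e_coli_geomean_actual_calculated"], result["elevated_levels_actual_calculated"])
```

Note that the comparison is inclusive, so a geometric mean of exactly 235 is flagged as elevated.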

Replication

[intentionally left blank]