Skip to content
Compare data analyst tools by cleaning, exploring and reporting Seattle Collisions data using Python, R, SQL, Tableau and Excel
R
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
R
data
docs
.gitattributes
Collisions_OD.pdf
README.md
Seattle_Collisions_Data.rtf

README.md

Seattle-Collisions

This project uses Seattle Collisions data* to compare 5 data analyst methods of cleaning, exploring and reporting data using Python, R, SQL, Excel, and Tableau.

The motivation behind this project is two-fold:

  1. practice wrangling, exploring and storytelling with data
  2. identify commonalities, strengths and weaknesses of various toolsets used for wrangling, exploring and storytelling

example plot

*Seattle Collisions Data

The Seattle Collisions dataset is a compilation of over 200,000 collision reports created by Seattle Police Department (SPD) that were then recorded by Seattle Department of Transportation (SDOT), starting in 2004 and updated regularly.

Project Status

This project a work in progress. The R method is actively being refined and will act as the standard against which the Python, SQL, Excel, and Tableau modules will be developed.

How to Contribute

I encourage you to submit pull requests.

Please use the Udacity Git Commit Message Style Guide, and follow the "fork-and-pull" Git workflow:

  1. Fork the repo on GitHub
  2. Clone the project to your own computer
  3. Commit changes to your own branch
  4. Push your work back up to your fork
  5. Submit a Pull request so that I can review your changes

Note: Please take care to merge the latest from "upstream" before making a pull request.

To Do

* Download data programmatically

  • Improve documentation:
    • For better transparency, move data cleaning/change discussions outside the code.
    • Document the project as a whole
  • Improve R Studio version:
    • Improve visuals (for better storytelling)
    • Create functions for repeated visuals
  • Enhance R Studio version: add geo
  • Make R Studio version viewable
  • Add Python method
  • Add SQL method
  • Add Tableau method
  • Add Excel method
  • Write and publish formal project report

Files and Directories

R - R Studio notebooks for wrangling and analyzing data

Collisions_OD.pdf - Metadata (also found here: https://www.seattle.gov/Documents/Departments/SDOT/GIS/Collisions_OD.pdf)

You can’t perform that action at this time.