A full-day course for city employees on how to use R to analyze open data.

Summary

A full-day course covering the key concepts of using the R programming language to analyze open data. The course covers the basic R syntax needed for exploratory data analysis, and shows how to create effective charts, graphs, and other visualizations from NYC Open Data to support operational decision making.

Terminal Learning Objectives

  • Participants will understand what R is and why it's useful
  • Participants will understand how R structures data, and how that differs from Excel
  • Participants will open a dataset in R and shape it into a usable structure for analysis
  • Participants will create a visualization and calculate summary statistics of a dataset in R
  • Participants will be exposed to elementary programming concepts and supplementary programming libraries in R
  • Participants will apply skills to conduct a simple analysis of a dataset from the NYC Open Data Portal
  • Participants will model how R can be used to build a data-driven culture in their workplace

Key Audience

Analysts working in city government with basic programming knowledge and/or experience performing advanced analysis in Excel (nested formulas with conditionals, PivotTables, and macros)

Outline

  • Introduction (Richard)
    • Class Schedule and Expectations
    • What is R?
    • R vs Excel
    • Getting Started
    • Overview of Data Analysis
    • NOLA Example
  • Today's Question (Julia)
    • Understanding Noise Complaints in 311 Data
    • See Finished Example
  • Data Collection (Julia)
    • Open the Old Faithful Dataset in RStudio
  • Data Exploration (Julia)
    • Calculate Summary Statistics
    • Identify Columns, Levels, and Known Issues
    • Explore the RStudio Console
  • Working with R (Richard)
    • Explore Data Structures and Types
    • Learn Basic Syntax
  • Morning Break (15 mins)
  • Exercise: 311 Data (Julia)
  • Lunch (1 hour)
  • Working with Data (Richard)
    • Data Wrangling
    • Packages
    • Algorithms
    • Form Hypotheses
  • Data Manipulation Practical (Julia)
  • Debugging (Richard)
    • Understand the Difference Between Syntax and Semantic Errors
    • Review Pro Tips for Problem-Solving and Debugging
  • Code Review (Richard/Julia)
  • Wrap Up (Julia)
  • Resources (Julia)