# Datapolitan-Training/intro-stats

# Introduction to Statistical Analysis

## Summary

A one-day course covering the basics of descriptive statistics with open data, including basic statistical measures such as mean, median, standard deviation, and variance. The course also covers correlation and linear regression, and introduces decision modeling using open data.

## Target Audience

Employees of all levels who perform data analysis and communicate analytical findings in support of city operations.

## Course Overview

This course introduces participants to the use of statistics for understanding and communicating city data. Using Excel, participants will learn how to use measures like mean, median, mode, standard deviation, and variance to understand the content of city data and make operational decisions. Participants will also learn how to display statistical information in meaningful ways.

## Goals

• Learn common statistical measures, including mean, median, mode, standard deviation, and variance
• Calculate correlation coefficients for bivariate data and apply the technique of simple regression analysis
• Demonstrate techniques used for forecasting
• Communicate data meaningfully to a broad audience using charts and graphs in Microsoft Excel
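
As a rough illustration of the simple regression and forecasting goals above, the sketch below fits a least-squares line and extends it one step ahead. The course itself uses Excel (where `SLOPE`, `INTERCEPT`, and `FORECAST` cover the same ground); the function name `fit_line` and the monthly counts are made up for this example.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data: service requests received in each of five months.
months = [1, 2, 3, 4, 5]
requests = [12, 14, 16, 18, 20]

slope, intercept = fit_line(months, requests)
forecast_month_6 = slope * 6 + intercept  # extend the fitted line to month 6
print(slope, intercept, forecast_month_6)  # 2.0 10.0 22.0
```

A straight-line forecast like this is only as good as the assumption that the trend continues, which is worth stating whenever the result is communicated.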

## Key Takeaways

• Participants will be familiar with common statistical measures
• Participants will be able to calculate correlation coefficients for bivariate data and perform simple linear regression analysis
• Participants will be familiar with the basic techniques of forecasting
• Participants will be better able to communicate analysis using charts and graphs in Microsoft Excel

## Exercise Descriptions

### Exercise 1: Calculate simple descriptive statistics (measures of central tendency)

• In a small group, calculate the mean and median for your group and compare with the class as a whole
• Report your findings to the class
• Desired outcomes
  • Participants become familiar with calculating mean and median in Excel
  • Participants understand the value of statistics for comparison
  • Participants practice communicating statistics
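
The comparison in this exercise can be sketched outside Excel as well. The example below uses Python's standard `statistics` module; the values in `group` and `class_all` are invented, and in Excel the equivalent functions would be `AVERAGE` and `MEDIAN`.

```python
from statistics import mean, median

# Hypothetical values collected from one small group and from the whole class.
group = [2, 3, 5, 8, 40]
class_all = group + [1, 4, 6, 7, 9]

group_mean, group_median = mean(group), median(group)
class_mean, class_median = mean(class_all), median(class_all)

print(group_mean, group_median)  # 11.6 5
print(class_mean, class_median)  # 8.5 5.5
```

The single large value (40) pulls the group mean well above the group median, which is the kind of difference the exercise asks groups to report.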

### Exercise 2: Calculate measures of variability in small groups and compare to class as a whole

• In a small group, calculate the measures of variability for your group and compare with the class as a whole
• Report your findings to the class
• Desired outcomes
  • Participants become familiar with calculating measures of variability in Excel
  • Participants understand the value of statistics for comparison
  • Participants practice communicating statistics
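
A minimal sketch of the measures of variability, again using Python's `statistics` module on invented values. It assumes the data is treated as a full population (Excel's `VAR.P` and `STDEV.P`); for a sample, `statistics.variance` and `statistics.stdev` (Excel's `VAR.S` and `STDEV.S`) would be the counterparts.

```python
from statistics import pstdev, pvariance

# Hypothetical values for one group.
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)  # spread between the extremes
variance = pvariance(values)             # mean squared deviation from the mean
std_dev = pstdev(values)                 # square root of the variance

print(value_range, variance, std_dev)
```

The standard deviation is usually the easier figure to communicate because it is in the same units as the data.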

### Exercise 3: How long do noise complaints stay open in New York City?

• Prepare data in a guided exercise to calculate the time 311 Service Requests related to noise remain open
• Desired outcomes
  • Participants are guided through the steps necessary to calculate the time a service request remains open
  • Participants with no previous experience learn Excel functions and formulas
  • Participants practice calculating statistics on a larger dataset than in the previous exercises
  • Participants communicate findings from statistical analysis
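
The core calculation in this exercise is a subtraction of two timestamps. The sketch below shows one way to do it in Python; the column names `Created Date` and `Closed Date`, the date format, and the sample rows are assumptions made for illustration, not taken from the course data.

```python
from datetime import datetime
from statistics import mean, median

# Assumed timestamp layout, e.g. "01/01/2015 08:00:00 AM".
FMT = "%m/%d/%Y %I:%M:%S %p"

# Hypothetical service requests; one is still open (no closed date).
rows = [
    {"Created Date": "01/01/2015 08:00:00 AM", "Closed Date": "01/01/2015 10:00:00 AM"},
    {"Created Date": "01/01/2015 09:00:00 PM", "Closed Date": "01/02/2015 03:00:00 AM"},
    {"Created Date": "01/02/2015 12:00:00 PM", "Closed Date": ""},
    {"Created Date": "01/03/2015 01:00:00 PM", "Closed Date": "01/03/2015 02:00:00 PM"},
]

def hours_open(row):
    """Return hours between creation and closure, or None if still open."""
    if not row["Closed Date"]:
        return None
    created = datetime.strptime(row["Created Date"], FMT)
    closed = datetime.strptime(row["Closed Date"], FMT)
    return (closed - created).total_seconds() / 3600

# Keep only closed requests before computing statistics.
durations = [h for h in map(hours_open, rows) if h is not None]
print(durations)                           # [2.0, 6.0, 1.0]
print(mean(durations), median(durations))  # 3.0 2.0
```

Requests without a closed date are skipped rather than counted as zero, since treating them as closed would understate how long requests stay open.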

### Exercise 4: How long do pothole complaints stay open in New York City?

• Prepare the data in another guided exercise to calculate the time 311 Service Requests related to pothole complaints remain open
• Filter and clean the data as necessary to obtain reliable results
• Compare the results of this analysis with the results from the previous exercise
• Desired outcomes

### Exercise 5: Calculate the correlation between median income and recycling rate in New York City

• Calculate the correlation between median income and the recycling rate in New York City Community Districts
• Interpret the result based on the calculated coefficient of correlation
• Desired outcomes
  • Participants practice calculating the coefficient of correlation on NYC data
  • Participants practice communicating statistics
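
For reference, the coefficient of correlation (Pearson's r, which Excel exposes as `CORREL`) can be written as

$$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}$$

The sketch below implements that formula directly. The function name `pearson_r` and the income and recycling figures are hypothetical and stand in for the Community District data used in class.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical districts: median income (thousands of dollars) and recycling rate (%).
income = [35, 48, 52, 70, 95]
recycling = [11, 15, 14, 20, 24]

r = pearson_r(income, recycling)
print(round(r, 2))  # close to +1 means a strong positive linear relationship
```

A value near +1 or -1 indicates a strong linear relationship and a value near 0 a weak one; in neither case does the coefficient by itself show that one variable causes the other.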