coursera-get-clean-data

Final project for the Coursera course "Getting and Cleaning Data"

Summary

This repository contains this README.md file, along with four key items of interest:

1. An R script used to read and tidy the input data (run_analysis.R)
2. The full tidy data set (tidy_data.csv)
3. A summarized, tidy data set (tidy_data_summary.csv)
4. A code book (codebook.md) describing the tidy data noted in items 2 and 3

Source Data and Information

The original data for this project can be found here, and a description of both how the data was obtained and why it could be useful can be found here. The description below assumes that you have read and are familiar with the information referenced above.

Project Requirements

This project set out to fulfill, and does fulfill, the following requirements:

The submitted data set is tidy.
The GitHub repo contains the required scripts.
GitHub contains a code book that modifies and updates the available codebooks with the data to indicate all the variables and summaries calculated, along with units, and any other relevant information.
The README that explains the analysis files is clear and understandable.
The work submitted for this project is the work of the student who submitted it.

Tidy Data

The key to this assignment was ensuring that the data was transformed so that it meets the three criteria of "tidy data" as defined by Hadley Wickham:

Each variable forms a column.
Each observation forms a row.
Each type of observational unit forms a table.

The transformations required to get our data in a tidy form were of three types:

Bind columns, to pull descriptive variables into the same table
Merge tables, to associate descriptive variable names in place of relational numbers
Reshape the table's 561 measurement columns into many rows, with two columns representing that information
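
As a rough illustration of the first two transformation types, the sketch below uses base R on a tiny made-up data set. The data frames, column names, and values are all hypothetical stand-ins chosen for this example; the actual code in run_analysis.R may be organized differently.

```r
# Toy stand-ins for the real inputs; every name and value here is hypothetical.
subjects <- data.frame(subject = c(1, 1, 2))
activity_ids <- data.frame(activity_id = c(1, 2, 1))
measurements <- data.frame(V1 = c(0.28, 0.27, 0.29),
                           V2 = c(-0.02, -0.01, -0.03))

# 1. Bind columns: place the descriptive variables next to the measurements.
combined <- cbind(subjects, activity_ids, measurements)

# 2. Merge tables: swap the numeric activity code for its descriptive label.
activity_labels <- data.frame(activity_id = c(1, 2),
                              activity = c("WALKING", "SITTING"))
labelled <- merge(combined, activity_labels, by = "activity_id")
labelled$activity_id <- NULL  # the relational number is no longer needed
```

Note that merge() sorts its result by the key column by default, so the row order of labelled can differ from that of combined.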

The first two transformations are trivial, but the third requires the gather() and extract_numeric() functions from the tidyr package, which is well suited to this exact purpose.
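
A minimal sketch of the reshaping step on a toy table with two measurement columns instead of 561 is shown here. The data frame and its column names are hypothetical, and the script in this repository may differ in detail. Because recent tidyr releases mark extract_numeric() as deprecated, the sketch substitutes a base R gsub() call to pull the number out of each column name; that substitution is an assumption made for this example, not the approach described above.

```r
library(tidyr)

# Hypothetical wide table: one column per measured feature.
wide <- data.frame(subject = c(1, 2),
                   activity = c("WALKING", "SITTING"),
                   V1 = c(0.28, 0.29),
                   V2 = c(-0.02, -0.03))

# Collapse the measurement columns into two columns:
# "feature" holds the old column name, "value" holds the measurement.
long <- gather(wide, key = "feature", value = "value", V1, V2)

# Recover the numeric feature index from the old column name
# (base R stand-in for tidyr's extract_numeric()).
long$feature_id <- as.numeric(gsub("[^0-9]", "", long$feature))
```

Each subject and activity pair now spans one row per feature, so the table grows longer and narrower while every measurement keeps its own row.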
