GitHub - pmchirco/gettingCleaningDataCourseProject: Coursera class Getting and Cleaning Data Course Project

# Coursera Course Project - Getting and Cleaning Data #

## Overview ##

The course project assignment was to download a set of files containing smartphone accelerometer data and, from these files, create one dataset containing only the measurements that are a mean or a standard deviation. Once this was complete, the next step was to average each variable for each test subject and create an independent tidy data set from the result.

This was accomplished through R scripting, using both the dplyr and tidyr packages.

## Details ##

The first step was to read in the data and merge the two data sets. I read the X_test.txt and X_train.txt files into independent data frames and combined them using rbind. Unlike merge(), rbind does not alter the order of the rows.

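
The read-and-stack step can be sketched as follows. This is a minimal illustration, not the code from run_analysis.R: the two temporary files and their values are hypothetical stand-ins for X_test.txt and X_train.txt so that the snippet runs without the real dataset.

```r
# Hypothetical stand-in files; the real inputs would be X_test.txt and X_train.txt.
test_path  <- tempfile(fileext = ".txt")
train_path <- tempfile(fileext = ".txt")
writeLines(c("0.1 0.2", "0.3 0.4"), test_path)
writeLines(c("0.5 0.6", "0.7 0.8", "0.9 1.0"), train_path)

# read.table() splits on whitespace by default and names the columns V1, V2, ...
x_test  <- read.table(test_path)
x_train <- read.table(train_path)

# rbind() stacks rows in the order given: all test rows first, then all train rows.
x_all <- rbind(x_test, x_train)
```

Because rbind only appends rows, any other file that is stacked in the same test-then-train order stays aligned with `x_all` row for row.
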
Once the datasets were combined, I pulled the field names from features.txt. These names were not syntactically valid R column names, so I used the make.names function, which replaces invalid characters with the '.' (dot) character. After cleaning up the cases where there were several dots in a row (e.g., an empty () would be converted to ..), I had a unique set of names for my dataset. I preserved the existing camelCase of the column names for readability.

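
A small sketch of the name cleanup, under some assumptions: the three sample names imitate the style of features.txt, and the use of `unique = TRUE`, `gsub`, and `sub` to collapse and trim the dots is one plausible way to do it, not necessarily what run_analysis.R does.

```r
# Sample names in the style of features.txt (illustrative, not read from the file).
raw_names <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-Y", "tBodyAccMag-mean()")

# make.names() replaces each invalid character ('-', '(', ')') with a dot.
# unique = TRUE (an assumption here) would also disambiguate repeated names.
valid_names <- make.names(raw_names, unique = TRUE)
# valid_names[1] is "tBodyAcc.mean...X"

# Collapse runs of dots into a single dot, then drop any trailing dot.
clean_names <- gsub("\\.+", ".", valid_names)
clean_names <- sub("\\.$", "", clean_names)
# clean_names is "tBodyAcc.mean.X" "tBodyAcc.std.Y" "tBodyAccMag.mean"
```

The camelCase prefix (`tBodyAcc`) is untouched because make.names only rewrites invalid characters.
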
Using select(x, contains()), I was able to create a new dataset containing only the columns where the measurement was a mean or a standard deviation. I actually created two datasets, one for each measurement type, and then used cbind to put them together.

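
The column selection might look like the sketch below. The data frame and its values are made up for illustration, and the match strings "mean" and "std" are assumptions about what was passed to contains().

```r
library(dplyr)

# Tiny stand-in for the combined measurement data (values are illustrative).
x_all <- data.frame(
  tBodyAcc.mean.X = c(0.28, 0.27),
  tBodyAcc.std.X  = c(-0.99, -0.98),
  tBodyAcc.mad.X  = c(-0.97, -0.96)
)

# contains() matches a literal substring and ignores case by default,
# so "mean" would also pick up names containing "Mean".
means <- select(x_all, contains("mean"))
stds  <- select(x_all, contains("std"))

# Put the two subsets side by side; the rows stay in their original order.
mean_std <- cbind(means, stds)
```

Note that a case-insensitive match on "mean" is broad: if columns such as `meanFreq` are not wanted, a stricter pattern would be needed.
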
To keep the ID of the subject in each trial, I created a vector with the data from the subject_test.txt and subject_train.txt files. Because I used rbind in the same order as in the previous steps, the subject IDs are in exactly the same order as the measurement data. I then used cbind to add this column to the beginning of the data frame and labeled it 'subject'.
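
Attaching the subject IDs can be sketched as follows. The small data frames are hypothetical stand-ins for the subject files and the measurement data; only the test-then-train stacking order and the 'subject' label come from the description above.

```r
# Hypothetical stand-ins for the subject files (one ID per measurement row).
subject_test  <- data.frame(V1 = c(2L, 2L))
subject_train <- data.frame(V1 = c(1L, 1L, 3L))

# Stack in the same test-then-train order used for the measurements.
subjects <- rbind(subject_test, subject_train)
names(subjects) <- "subject"

# Stand-in for the mean/std measurement columns, already stacked test-then-train.
measurements <- data.frame(tBodyAcc.mean.X = c(0.1, 0.3, 0.5, 0.7, 0.9))

# Guard against a silent misalignment before binding columns.
stopifnot(nrow(subjects) == nrow(measurements))

# Putting subjects first makes 'subject' the leading column.
labelled <- cbind(subjects, measurements)
```

The row-count check is an extra safeguard added for this sketch: cbind matches rows by position only, so a mismatch in order or length would otherwise go unnoticed.
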
