Week 3 Course Project for Coursera's Getting and Cleaning Data within the Data Analysis in R specialization.
run_analysis.R merges the testing and training data sets, together with their associated subjects and activity labels, from within a zip file. It is the primary file in the repo and requires the data to be downloaded separately. Put the data in the ./data subdirectory for the R script to work.
Location of data: https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
Description of data: http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
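One way to fetch the archive into ./data is sketched below. This is only an illustration and is not part of run_analysis.R: the helper name fetch_har_zip and the local file name UCI_HAR_Dataset.zip are assumptions, so rename the file to whatever the script expects.

```r
# Hypothetical helper (not part of run_analysis.R): downloads the archive
# into dest_dir unless a file with the assumed name is already there.
fetch_har_zip <- function(
    url = "https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip",
    dest_dir = "./data",
    dest_file = "UCI_HAR_Dataset.zip") {  # assumed local file name
  # Create the target directory if it does not exist yet.
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  dest_path <- file.path(dest_dir, dest_file)
  # Skip the download when the archive is already present.
  if (!file.exists(dest_path)) {
    # mode = "wb" keeps the zip intact on Windows.
    download.file(url, destfile = dest_path, mode = "wb")
  }
  dest_path
}
```

Calling fetch_har_zip() once from the repo root should leave the archive under ./data; later calls return the same path without downloading again.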
README_rawdata.txt provides information on the raw telemetry data.
- The R script extracts the time-summarized data from the archive. This includes time means, standard deviations, minima, maxima, etc.
- Pulls the subject and activity labels and merges them with the testing and training data sets.
- Appends the testing data to the training data.
- Relabels the data frame columns according to the archived "features.txt" list, and removes "()" from each name.
- Creates a factor variable with informative labels for "activity".
- Keeps the columns corresponding to the mean and standard deviation.
- Creates a new ensemble mean of each remaining variable for each subject (1 to 30) and each activity (1 to 6). Outputs this data set.
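
The steps above can be sketched in base R on a toy data set. This is a minimal illustration, not the code in run_analysis.R: the small data frames and their values are made up to stand in for the files in the archive, and the "mean|std" pattern is an assumed way of choosing columns (on the real feature list it would also match the meanFreq variables).

```r
# Toy stand-ins for the archive's contents (values invented for illustration).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")
activity_labels <- c("WALKING", "SITTING")

# Subject and activity labels already merged with the measurements.
train <- data.frame(subject = c(1, 1, 2), activity = c(1, 1, 2),
                    V1 = c(0.2, 0.4, 0.1), V2 = c(0.5, 0.7, 0.3),
                    V3 = c(0.9, 0.8, 0.7))
test <- data.frame(subject = c(2, 3), activity = c(2, 1),
                   V1 = c(0.3, 0.6), V2 = c(0.1, 0.2),
                   V3 = c(0.5, 0.4))

# Append the testing rows to the training rows.
combined <- rbind(train, test)

# Relabel the measurement columns from the feature list, dropping "()".
names(combined)[-(1:2)] <- gsub("()", "", features, fixed = TRUE)

# Turn the activity codes into a factor with informative labels.
combined$activity <- factor(combined$activity,
                            levels = seq_along(activity_labels),
                            labels = activity_labels)

# Keep subject, activity, and the mean / standard deviation columns.
keep <- grepl("mean|std", names(combined))
keep[1:2] <- TRUE
selected <- combined[, keep, drop = FALSE]

# Average every remaining variable for each subject and activity.
tidy <- aggregate(selected[, -(1:2), drop = FALSE],
                  by = list(subject = selected$subject,
                            activity = selected$activity),
                  FUN = mean)
print(tidy)
```

The result has one row per subject and activity pair that occurs in the data, which is the shape of the final output described in the last step.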