- Unzipped UCI dataset (a full description is available on the main website)
- R (tested only with R 3.1.2)
- dplyr package
- Put `run_analysis.R` and the unzipped UCI dataset folder in the same working directory
- In R, enter the following commands:
> source('run_analysis.R')
> result <- make_tidy()
- To view the dataset of interest, enter the following command:
> result$d2
`make_tidy(dir = "UCI HAR Dataset/")` assumes the dataset directory name (`dir`) is unchanged. If an absolute path is provided, the function can be run from anywhere on the filesystem.

`make_tidy` combines the training and test set X values (measurements), y values (activity labels), and subject identifiers into a single table, keeping only those X values that correspond to a mean or standard deviation. The y values are replaced with their character representations from `activity_labels.txt`. This data frame is stored in `result$d1`.

For each valid (activity label, subject id) pair, `make_tidy` computes the mean of each measurement still in `result$d1`. The names of these columns are prefixed with `MEAN_` to indicate that they correspond to means. The result of this computation is stored in `result$d2`.
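The two steps above can be illustrated on toy data. This is only a sketch under stated assumptions, not the code in `run_analysis.R`: it uses base R (`grepl`, `aggregate`) so that it runs without the dataset or dplyr, and every value and the `train`/`test` list layout are made up for illustration.

```r
# Toy stand-ins for the UCI files; all values and the list layout are hypothetical.
activity_labels <- data.frame(id = 1:2, name = c("WALKING", "SITTING"),
                              stringsAsFactors = FALSE)
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-mad()-X")

# Each set holds X (measurements), y (activity ids), and subject ids.
train <- list(X = matrix(c(1, 3, 5, 7,  2, 2, 4, 4,  9, 9, 9, 9), ncol = 3),
              y = c(1, 1, 2, 2), subject = c(1, 1, 1, 1))
test  <- list(X = matrix(c(10, 20,  1, 3,  0, 0), ncol = 3),
              y = c(1, 1), subject = c(2, 2))

# Step 1: stack both sets, keep only mean/std measurements,
# and replace the numeric activity ids with their labels.
X <- rbind(train$X, test$X)
colnames(X) <- features
keep <- grepl("mean|std", features, ignore.case = TRUE)
d1 <- data.frame(
  activity = activity_labels$name[match(c(train$y, test$y), activity_labels$id)],
  subject  = c(train$subject, test$subject),
  X[, keep, drop = FALSE],
  check.names = FALSE,        # keep names such as "tBodyAcc-mean()-X" unchanged
  stringsAsFactors = FALSE
)

# Step 2: mean of every remaining measurement per (activity, subject) pair,
# then prefix the measurement columns with MEAN_.
d2 <- aggregate(d1[, features[keep], drop = FALSE],
                by = list(activity = d1$activity, subject = d1$subject),
                FUN = mean)
names(d2)[-(1:2)] <- paste0("MEAN_", names(d2)[-(1:2)])
print(d2)
```

In this toy input the pair (SITTING, subject 2) never occurs, so `d2` has three rows rather than four; only pairs that are actually present in `d1` get a row.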
- `activity` corresponds to the verbal description of the activity label
- `subject` corresponds to the subject responsible for producing the data in a given row
- `MEAN_*` corresponds to the mean of measurements from the original UCI dataset that corresponded to means or standard deviations (see `UCI HAR Dataset/features.txt` for the features whose names contain the case-insensitive strings `mean` or `std`)
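To see which features a case-insensitive match on `mean` or `std` picks up, the following sketch may help. It runs on a handful of names written in the style of `features.txt` rather than on the file itself, and it assumes the `MEAN_` prefix is applied to the feature names unchanged, which this README does not confirm.

```r
# A few names in the style of UCI HAR Dataset/features.txt (not the full file).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-mad()-X",
              "fBodyAcc-meanFreq()-X", "angle(tBodyAccMean,gravity)")

# ignore.case = TRUE lets "Mean" match as well as "mean".
keep <- grepl("mean|std", features, ignore.case = TRUE)

# Hypothetical MEAN_* column names, assuming feature names are kept as-is.
mean_cols <- paste0("MEAN_", features[keep])
print(mean_cols)
```

Because the match is case insensitive and not anchored, names such as `meanFreq()` and the `angle(...)` features with `Mean` in their arguments are selected too, while `mad()` is not.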