- run_analysis.R - used to reproduce the transformations and analysis performed on the original dataset.
- README.md
- Codebook.txt
- Load the dplyr library
- Assign the file names for the test data, training data, activity list, variable names and subjects to variables.
- Load the x test data using read.table with header=FALSE
- Load the x train data using read.table with header=FALSE
- Load the y test data using read.table with header=FALSE
- Load the y train data using read.table with header=FALSE
- Load variable names (features) using read.table with header=FALSE, stringsAsFactors=FALSE
- Load activities with read.table with header=FALSE
- Load test subject with read.table with header=FALSE
- Load train subject with read.table with header=FALSE
- Add column names to activities data frame
- Add the column names to the x test and x train data frames
- Add a column name, called subjects, to the test subject and train subject data frames
- Use rbind to merge the x train and x test data frames
- Use rbind to merge the y train and y test data frames
- Make the column names in the merged x data frame unique.
- Use the following regular expression to select only the columns with mean and standard deviation data "\.(mean|std)\." with ignore.case=TRUE
- Add a column name, called activity, to the merged y data
- Use rbind to merge the train subject and test subjects
- Use cbind to merge the subjects, y data and x data
- Use mutate to convert the activity variable to a factor: "activity = as.factor(activity)"
- Set the levels for the activity factor
- Tidy up the variable names by replacing a duplicated Body, e.g. "BodyBody", with "Body"
- Use a custom function "tidy_columnnames" to tidy the variable names
- Use a regex to replace .. or ... with a single period
- Use a regex to remove a period at the end of a variable name
- Create a new data frame by grouping by subject and activity, then use summarise_all to calculate the mean of every remaining variable.
- Write the summarised data frame from the previous step to a file called "tidy_ds.txt" with row.names=FALSE
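
The steps above can be sketched on a small toy dataset. This is a minimal illustration, not the contents of run_analysis.R: the toy values, the intermediate object names (`x_all`, `y_all`, `subj_all`, `merged`) and the body of `tidy_columnnames` are assumptions made for the example. It also assumes that `make.names(..., unique = TRUE)` is what makes the column names unique, which would explain why the selection regex looks for periods around `mean` and `std`. The output file is written to a temporary directory here rather than the working directory.

```r
library(dplyr)

# Toy stand-ins for the files loaded with read.table (hypothetical values)
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
              "fBodyBodyGyroMag-mean()", "angle(X,gravityMean)")
activity_labels <- c("WALKING", "SITTING")

x_train <- data.frame(matrix(1:16, nrow = 4))
x_test  <- data.frame(matrix(17:32, nrow = 4))
names(x_train) <- features
names(x_test)  <- features

# Merge training and test rows with rbind
x_all    <- rbind(x_train, x_test)
y_all    <- rbind(data.frame(V1 = c(1, 1, 2, 2)), data.frame(V1 = c(1, 1, 2, 2)))
subj_all <- rbind(data.frame(subjects = rep(1, 4)), data.frame(subjects = rep(2, 4)))
names(y_all) <- "activity"

# Assumption: make.names() makes the names unique; it also turns
# "-", "(" and ")" into periods, e.g. "tBodyAcc.mean...X"
names(x_all) <- make.names(names(x_all), unique = TRUE)

# Keep only the mean and standard deviation columns
keep  <- grep("\\.(mean|std)\\.", names(x_all), ignore.case = TRUE)
x_sel <- x_all[, keep]

# Combine subjects, activities and measurements, then label the activities
merged <- cbind(subj_all, y_all, x_sel)
merged <- mutate(merged, activity = as.factor(activity))
levels(merged$activity) <- activity_labels

# Hypothetical body for tidy_columnnames: collapse runs of two or three
# periods into one, then drop a trailing period
tidy_columnnames <- function(nms) {
  nms <- gsub("\\.{2,3}", ".", nms)
  sub("\\.$", "", nms)
}
names(merged) <- gsub("BodyBody", "Body", names(merged))
names(merged) <- tidy_columnnames(names(merged))

# Mean of every variable for each subject and activity
tidy_ds <- merged %>%
  group_by(subjects, activity) %>%
  summarise_all(mean)

# Written to a temporary directory in this sketch
out_file <- file.path(tempdir(), "tidy_ds.txt")
write.table(tidy_ds, out_file, row.names = FALSE)
```

In this sketch `angle(X,gravityMean)` is dropped by the selection step, because after `make.names` its `Mean` is not preceded by a period.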