This repo contains my solution for the course "Getting and Cleaning Data". The goal of the project is to generate a tidy data set based on the original data from the "Human Activity Recognition Using Smartphones" study.
The study provides a train dataset and a test dataset, each with the following details:
- Subject - a number that identifies an individual in the research (1 to 30)
- Activity - an activity performed by the Subject (Walking, Standing, etc.)
- Measurements - a list of measured parameters for each activity and subject (see Codebook.md for details)
In this project, we're only interested in the Mean and Standard Deviation of the measurements.
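As a rough illustration of that filter, here is a minimal sketch in R. The feature names below are hypothetical examples in the style of the original feature list, and the choice to match only `mean()` and `std()` literally (which leaves out names such as `meanFreq()`) is an assumption, not necessarily what run_analysis.R does.

```r
# Hypothetical feature names, in the style of the original feature list
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
              "tBodyAcc-mad()-X", "fBodyAcc-meanFreq()-X")

# Keep only names that contain "mean()" or "std()" literally.
# Assumption: meanFreq() is not treated as a Mean measurement here.
keep <- grepl("mean\\(\\)|std\\(\\)", features)
selected <- features[keep]

# Convert the kept names into R-friendly column names
clean <- gsub("[()]", "", selected)  # drop the parentheses
clean <- gsub("-", ".", clean)       # replace dashes with dots

print(clean)
```

The same logical vector `keep` could then be used to select the matching columns of the measurement table.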
The R script run_analysis.R parses the original data files and generates a summarized version of the data in a tidy format. It performs the following steps, in order:
- Initialization (loading packages, etc.)
- Function definitions
- Load feature list for measurements
- Convert the feature list into R-friendly column names
- Load activity ids and labels
- Load Measurements, Subjects, and Activity for Train and Test data
- Extract only columns that are Mean and Standard Deviation
- Merge Measurements, Subjects, and Activity together
- Merge Train and Test data
- Summarize
- Generate the output file
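
The merge and summarize steps above can be sketched on toy data as follows. This is only an illustration in base R, not the actual contents of run_analysis.R: the helper `build_set`, the made-up values, the output file name, and the assumption that "Summarize" means averaging each measurement per Subject and Activity are all mine.

```r
# Toy activity ids and labels (the real ones come from the original data files)
activity_labels <- data.frame(id = 1:2, label = c("Walking", "Standing"))

# Hypothetical helper: combine Subjects, Activity, and Measurements into one table
build_set <- function(subjects, activity_ids, measurements) {
  activity <- factor(activity_ids,
                     levels = activity_labels$id,
                     labels = activity_labels$label)
  data.frame(Subject = subjects, Activity = activity, measurements)
}

# Made-up Train and Test data with a single measurement column
train <- build_set(c(1, 1, 2), c(1, 1, 2),
                   data.frame(tBodyAcc.mean.X = c(0.2, 0.4, 0.1)))
test <- build_set(c(3, 3), c(1, 1),
                  data.frame(tBodyAcc.mean.X = c(0.5, 0.7)))

# Merge Train and Test data by stacking rows (both have the same columns)
merged <- rbind(train, test)

# Summarize: assumed here to be the average per Subject and Activity
tidy <- aggregate(. ~ Subject + Activity, data = merged, FUN = mean)

# Generate the output file (placeholder name, written to a temp directory)
out_file <- file.path(tempdir(), "tidy_data.txt")
write.table(tidy, out_file, row.names = FALSE)

print(tidy)
```

Each row of `tidy` holds one Subject and Activity combination, with one column per retained measurement.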