This repository contains a script "run_analysis.R", output of the script "TidyData.txt" and a code book "CodeBook.md" to analyze the human activity data collected from Samsung Galaxy S II smartphone.
- Checkout this project and copy the "run_analysis.R" file into the working directory of R.
- This script uses a library "reshape2". Make sure that you install this package using the command "install.packages("reshape2")" if its not already installed in your local R.
- This script uses the dataset downloaded from https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip.
- Extract the zip file to the working directory and copy the run_analysis.R to the same working directory of R.
<Working directory>
./run_analysis.R
./UCI HAR Dataset/
./UCI HAR Dataset/features.txt
./UCI HAR Dataset/activity_labels.txt
./UCI HAR Dataset/test/subject_test.txt
./UCI HAR Dataset/test/X_test.txt
./UCI HAR Dataset/test/y_test.txt
./UCI HAR Dataset/train/subject_train.txt
./UCI HAR Dataset/train/X_train.txt
./UCI HAR Dataset/train/y_train.txt
- After setting up the project, run the following command at R command prompt "source('./run_analysis.R')" in the R working directory.
- The script will load the information from the features.txt, activity_info.txt and the files from test and train folders in the dataset
- The feature names in features.txt is not in a user friendly format. So the script will clean the names using more descriptive column names
- The script will the load all activity names into a variable
- Test data is then loaded and combined to prepare a summary_test object
- Train data is loaded and combined to a summary_train object
- The script will combine both the test data and training data
- Script will merge the data with the activity names so that we can use activity names instead of activity ids
- After the full merged data is prepared, script will then extract all mean and std columns to another dataset
- A tidy data is then prepared with the mean of each of the observations per subject and per activity name
- The tidy data is then saved to an external file called "TidyData.txt"
- Script will print the first 5 lines of the tidy data as a sample.
- A file "TidyData.txt" is saved in the working directory with the tidy data
- The first 5 lines of tidy data are displayed at the command prompt
Use of this dataset in publications must be acknowledged by referencing the following publication [1]
[1] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz. Human Activity Recognition on Smartphones using a Multiclass Hardware-Friendly Support Vector Machine. International Workshop of Ambient Assisted Living (IWAAL 2012). Vitoria-Gasteiz, Spain. Dec 2012
This dataset is distributed AS-IS and no responsibility implied or explicit can be addressed to the authors or their institutions for its use or misuse. Any commercial use is prohibited.
Jorge L. Reyes-Ortiz, Alessandro Ghio, Luca Oneto, Davide Anguita. November 2012.