The main folder UCI HAR Dataset contains:
- a train data folder of
  - X_train.txt: Train feature data set, consisting of 561 measurements/features from accelerometer and gyroscope
  - y_train.txt: Train activity data set (identified by activity_id)
  - subject_train.txt: Subject train data set (identified by subject_id)
- a test data folder of
  - X_test.txt: Test feature data set, consisting of 561 measurements/features from accelerometer and gyroscope
  - y_test.txt: Test activity data set (identified by activity_id)
  - subject_test.txt: Subject test data set (identified by subject_id)
- activity_labels.txt: Data set with the Activity_id and Activity_Label relationship (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING)
- features.txt: All of the features obtained, namely the combination of the following signals (1 marks a domain in which the signal is available, 0 one in which it is not)

  | Name | Time | Freq. |
  | --- | --- | --- |
  | Body Linear Acceleration | 1 | 1 |
  | Gravity Linear Acceleration | 1 | 0 |
  | Body Linear Jerk | 1 | 1 |
  | Body Angular Velocity | 1 | 1 |
  | Body Angular Acceleration | 1 | 0 |
  | Body Linear Acceleration Magnitude | 1 | 1 |
  | Gravity Linear Acceleration Magnitude | 1 | 0 |
  | Body Linear Jerk Magnitude | 1 | 1 |
  | Body Angular Velocity Magnitude | 1 | 1 |
  | Body Angular Acceleration Magnitude | 1 | 1 |

  with the following functions

  | Function | Description |
  | --- | --- |
  | mean | Mean value |
  | std | Standard deviation |
  | mad | Median absolute value |
  | max | Largest value in array |
  | min | Smallest value in array |
  | sma | Signal magnitude area |
  | energy | Average sum of the squares |
  | iqr | Interquartile range |
  | entropy | Signal entropy |
  | arCoeff | Autoregression coefficients |
  | correlation | Correlation coefficient |
  | maxFreqInd | Largest frequency component |
  | meanFreq | Frequency signal weighted average |
  | skewness | Frequency signal skewness |
  | kurtosis | Frequency signal kurtosis |
  | energyBand | Energy of a frequency interval |
  | angle | Angle between two vectors |

  Description of functions from: https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2013-84.pdf
- features_info.txt: A description of the base feature list
- README.txt: A general readme file
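
The file layout above can be read with base R. The following is a minimal sketch, not the code in run_analysis.R: it builds a tiny stand-in folder in a temporary directory (3 features instead of 561) so that it runs without the real data, and the helper name `read_split` and the column names are assumptions made for illustration.

```r
# Build a tiny stand-in for the "UCI HAR Dataset" folder in a temp directory
# (2 rows, 3 features) so the sketch runs without the real data.
root <- file.path(tempdir(), "UCI HAR Dataset")
dir.create(file.path(root, "train"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("1 tBodyAcc-mean()-X", "2 tBodyAcc-std()-X", "3 tBodyAcc-mad()-X"),
           file.path(root, "features.txt"))
writeLines(c("1 WALKING", "2 WALKING_UPSTAIRS"),
           file.path(root, "activity_labels.txt"))
writeLines(c("0.1 0.2 0.3", "0.4 0.5 0.6"),
           file.path(root, "train", "X_train.txt"))
writeLines(c("1", "2"), file.path(root, "train", "y_train.txt"))
writeLines(c("7", "7"), file.path(root, "train", "subject_train.txt"))

# Lookup tables: feature names and activity labels.
features <- read.table(file.path(root, "features.txt"),
                       col.names = c("feature_id", "feature_name"),
                       stringsAsFactors = FALSE)
activity_labels <- read.table(file.path(root, "activity_labels.txt"),
                              col.names = c("activity_id", "activity_label"),
                              stringsAsFactors = FALSE)

# Hypothetical helper: read one split ("train" or "test") and bind the
# subject, activity and measurement columns side by side.
read_split <- function(root, split, feature_names) {
  x <- read.table(file.path(root, split, paste0("X_", split, ".txt")),
                  col.names = feature_names, check.names = FALSE)
  y <- read.table(file.path(root, split, paste0("y_", split, ".txt")),
                  col.names = "activity_id")
  subject <- read.table(file.path(root, split,
                                  paste0("subject_", split, ".txt")),
                        col.names = "subject_id")
  cbind(subject, y, x)
}

train <- read_split(root, "train", features$feature_name)
```

Passing `check.names = FALSE` keeps the original feature names (with their parentheses and dashes) instead of letting R rewrite them into syntactic names.
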

The run_analysis.R script performs the following steps:

1. Merge the training and the test sets to create one data set.
2. Extract only the measurements on the mean and standard deviation for each measurement.
3. Use descriptive activity names to name the activities in the data set.
4. Appropriately label the data set with descriptive variable names.
5. From the data set in step 4, create a second, independent tidy data set with the average of each variable for each activity and each subject.

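
The five steps can be sketched in base R on a toy data set. This is an illustration under stated assumptions rather than the actual contents of run_analysis.R: the toy values, the regular expression used to pick mean and standard deviation columns, and the renaming scheme (for example `TimeBodyAccMeanX`) are all hypothetical.

```r
# Toy stand-ins for the train and test sets (column names mimic features.txt).
train <- data.frame(subject_id = c(1, 1), activity_id = c(1, 2),
                    "tBodyAcc-mean()-X" = c(0.2, 0.4),
                    "tBodyAcc-std()-X"  = c(0.1, 0.3),
                    "tBodyAcc-mad()-X"  = c(0.5, 0.6),
                    check.names = FALSE)
test <- data.frame(subject_id = c(1, 2), activity_id = c(1, 1),
                   "tBodyAcc-mean()-X" = c(0.6, 1.0),
                   "tBodyAcc-std()-X"  = c(0.3, 0.5),
                   "tBodyAcc-mad()-X"  = c(0.7, 0.8),
                   check.names = FALSE)
activity_labels <- data.frame(activity_id = 1:2,
                              activity_label = c("WALKING", "WALKING_UPSTAIRS"))

# 1. Merge the training and test sets.
merged <- rbind(train, test)

# 2. Keep the id columns plus the mean() and std() measurements.
is_id <- names(merged) %in% c("subject_id", "activity_id")
is_mean_or_std <- grepl("mean\\(\\)|std\\(\\)", names(merged))
selected <- merged[, is_id | is_mean_or_std]

# 3. Replace activity ids with descriptive activity names.
selected$activity <- factor(selected$activity_id,
                            levels = activity_labels$activity_id,
                            labels = activity_labels$activity_label)
selected$activity_id <- NULL

# 4. Give the variables descriptive names (assumed naming scheme).
names(selected) <- gsub("^t", "Time", names(selected))
names(selected) <- gsub("-mean\\(\\)", "Mean", names(selected))
names(selected) <- gsub("-std\\(\\)", "Std", names(selected))
names(selected) <- gsub("-X$", "X", names(selected))

# 5. Average each variable for each subject and each activity.
tidy <- aggregate(. ~ subject_id + activity, data = selected, FUN = mean)
```

Each row of `tidy` holds one subject and activity pair, and each remaining column holds the average of one retained measurement.
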
- README.md: General readme file for project.
- CodeBook.md: Code Book describing the variables, the data, and the transformations performed to clean up the data.
- run_analysis.R: R script that performs the procedure described above to tidy the data.
- tidy_data.txt: Tidy data set file, output as a txt file.
- tidy_data.xls: Tidy data set file, output as an xls file for those who find data easier to read in that format :)
- The output files are not uploaded to the repository.

Download the script to the home directory ("~/").

Execute the following commands. Required libraries are installed if they are missing, and the zipped data file is downloaded and extracted if it is not already present. Curl must be properly set up on the system for the script to fetch the zipped data file into the working directory; otherwise, download and extract the zipped file manually into the working directory ("~/").

    source("run_analysis.R")
    run_analysis()

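
The "use if present, otherwise download or install" behaviour described above could look roughly like the sketch below. The helper names `fetch_data` and `ensure_package`, the zip file name, and the `data_url` argument are assumptions for illustration; the actual logic lives in run_analysis.R and may differ.

```r
# Hypothetical helper: download and extract the data only when the
# "UCI HAR Dataset" folder is not already in the working directory.
# `data_url` is a placeholder for the address of the zipped data set.
fetch_data <- function(data_url, work_dir = "~", zip_name = "dataset.zip",
                       data_dir = "UCI HAR Dataset") {
  work_dir <- path.expand(work_dir)
  target <- file.path(work_dir, data_dir)
  if (dir.exists(target)) {
    return(invisible(target))            # already extracted, nothing to do
  }
  zip_path <- file.path(work_dir, zip_name)
  if (!file.exists(zip_path)) {
    # method = "curl" requires the curl command-line tool to be installed.
    download.file(data_url, destfile = zip_path, method = "curl", mode = "wb")
  }
  unzip(zip_path, exdir = work_dir)
  invisible(target)
}

# Hypothetical helper: install a package only if it is not yet available.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(TRUE)
}
```

Checking for the extracted folder first means a manually downloaded and extracted copy is used as-is, so curl is only needed when nothing is present locally.
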

To view the text file in a readable way, run the following (tidy_data.txt must be in the current working directory):

    tidydata <- read.table("tidy_data.txt", header = TRUE)
    View(tidydata)

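
For reference, the round trip between the script's text output and `read.table` can be sketched as follows. It assumes the file is written with `write.table` and `row.names = FALSE`, which is an assumption about run_analysis.R, and it uses a toy data frame and a temporary file instead of the real output. `str()` and `head()` are shown as non-interactive alternatives to `View()`.

```r
# Toy tidy data set standing in for the real output.
tidy <- data.frame(subject_id = c(1, 2),
                   activity = c("WALKING", "SITTING"),
                   TimeBodyAccMeanX = c(0.4, 1.0),
                   stringsAsFactors = FALSE)

# Write it as a plain text table with a header line and no row names.
out_file <- file.path(tempdir(), "tidy_data.txt")
write.table(tidy, out_file, row.names = FALSE)

# Read it back; header = TRUE restores the column names.
tidydata <- read.table(out_file, header = TRUE, stringsAsFactors = FALSE)

# Inspect the result without the interactive viewer.
str(tidydata)
head(tidydata)
```
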
Read CodeBook.md for a description of the transformations used as well as the variables and data.