pml-training.csv #103

@Alifianajwa

pml-training.csv

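  1. Preprocessing Data
    • Load Packages and Data:
      library(caret)  # createDataPartition, train, confusionMatrix
      library(dplyr)  # %>% and select
      # Assumption: blanks and "#DIV/0!" entries are read as NA; adjust na.strings to match your file
      data <- read.csv("pml-training.csv", na.strings = c("NA", "", "#DIV/0!"))

    • Missing Values:
      na_percentage <- sapply(data, function(x) mean(is.na(x)))  # share of NA per column
      data_clean <- data[, na_percentage < 0.9]  # Drop columns with 90% or more NA

    • Delete Non-Predictive Columns:
      data_clean <- data_clean %>%
        select(-c(X, user_name, raw_timestamp_part_1, raw_timestamp_part_2,
                  cvtd_timestamp, new_window, num_window))

    • Variable Conversion:
      data_clean$classe <- as.factor(data_clean$classe)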
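To see what the missing-value filter does at the threshold, here is a minimal sketch on a toy data frame. The column names `kurtosis_demo` and `max_demo` are hypothetical and do not come from pml-training.csv. A column with exactly 90% NA fails the `< 0.9` test and is dropped, so the filter removes columns with 90% or more missing values.

```r
# Toy data frame; kurtosis_demo and max_demo are hypothetical columns
toy <- data.frame(
  roll_belt     = c(1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3),
  kurtosis_demo = c(rep(NA, 9), 0.5),   # exactly 90% NA
  max_demo      = rep(NA_real_, 10),    # 100% NA
  classe        = rep(c("A", "B"), 5)
)

# Share of missing values per column
na_share <- sapply(toy, function(x) mean(is.na(x)))

# Keep only columns whose NA share is strictly below 0.9
toy_clean <- toy[, na_share < 0.9]

print(na_share)
print(names(toy_clean))  # "roll_belt" "classe"
```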

  2. Feature Selection and Data Partitioning
    • Data Partition (70% training, 30% testing), stratified by classe:
      set.seed(123)
      trainIndex <- createDataPartition(data_clean$classe, p = 0.7, list = FALSE)
      training <- data_clean[trainIndex, ]
      testing <- data_clean[-trainIndex, ]
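For a factor outcome, createDataPartition samples within each class so that class proportions are preserved in both subsets. The base-R sketch below illustrates that idea on hypothetical labels; it is an approximation for illustration and caret's own implementation may differ in detail.

```r
set.seed(123)

# Hypothetical labels with unequal class sizes
classe <- factor(rep(c("A", "B", "C"), times = c(50, 30, 20)))

# Draw 70% of the row indices separately within each class
train_idx <- unlist(lapply(split(seq_along(classe), classe), function(idx) {
  idx[sample.int(length(idx), size = round(0.7 * length(idx)))]
}))

train_labels <- classe[train_idx]
test_labels  <- classe[-train_idx]

# Class proportions are the same in the full data and in the training subset
print(prop.table(table(classe)))
print(prop.table(table(train_labels)))
print(c(train = length(train_labels), test = length(test_labels)))
```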

  3. Model Building with Random Forest
    • Model Training with 5-fold Cross Validation:
      trControl <- trainControl(method = "cv", number = 5)
      model_rf <- train(classe ~ .,
                        data = training,
                        method = "rf",
                        trControl = trControl,
                        verbose = FALSE)

    • Model Results (5-fold cross-validation on the training set):
      Accuracy: 0.9923
      Out-of-Sample Error Estimate: 0.77%
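The cross-validated accuracy reported by train is the mean of the per-fold accuracies on the held-out folds. The sketch below shows the mechanics in base R on the built-in iris data, under two stated assumptions: folds are assigned by plain random shuffling (caret may build its folds differently), and a simple nearest-centroid classifier, a hypothetical stand-in and not a random forest, keeps the example free of extra packages.

```r
set.seed(123)
k <- 5
x <- iris[, 1:4]
y <- iris$Species

# Randomly assign each row to one of k equally sized folds
fold <- sample(rep(seq_len(k), length.out = nrow(x)))

# Hypothetical stand-in classifier: predict the class with the nearest centroid
nearest_centroid <- function(train_x, train_y, new_x) {
  # centroids: one column of feature means per class
  centroids <- sapply(levels(train_y), function(cl) {
    colMeans(train_x[train_y == cl, , drop = FALSE])
  })
  # d: squared Euclidean distance from each new row to each class centroid
  d <- apply(centroids, 2, function(cen) {
    rowSums(sweep(as.matrix(new_x), 2, cen)^2)
  })
  factor(levels(train_y)[max.col(-d, ties.method = "first")],
         levels = levels(train_y))
}

# Fit on k - 1 folds, score on the held-out fold, repeat for every fold
fold_acc <- sapply(seq_len(k), function(i) {
  held_out <- fold == i
  pred <- nearest_centroid(x[!held_out, ], y[!held_out], x[held_out, ])
  mean(pred == y[held_out])
})

cv_accuracy <- mean(fold_acc)
print(fold_acc)
print(cv_accuracy)
print(1 - cv_accuracy)  # cross-validated estimate of the out-of-sample error
```

Because every row is scored only by a model that never saw it during fitting, the averaged accuracy serves as an estimate of performance on new data.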

  4. Model Validation and Evaluation
    • Prediction on Testing Set:
      predictions <- predict(model_rf, newdata = testing)
      confusionMatrix(predictions, testing$classe)

    • Evaluation Results:
      Accuracy : 0.993
      95% CI : (0.991, 0.994)
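The accuracy line in a confusion matrix summary is the number of correct predictions divided by the total, and caret documents that it computes the accompanying 95% interval with binom.test. The sketch below reproduces both quantities in base R from a hypothetical 3-class confusion matrix; the counts are invented for illustration and are not the results of this model.

```r
# Hypothetical confusion matrix (rows = predicted class, columns = actual class)
cm <- matrix(c(48,  1,  0,
                2, 45,  1,
                0,  4, 49),
             nrow = 3, byrow = TRUE,
             dimnames = list(Prediction = c("A", "B", "C"),
                             Reference  = c("A", "B", "C")))

correct  <- sum(diag(cm))  # correct predictions lie on the diagonal
total    <- sum(cm)
accuracy <- correct / total

# Exact binomial 95% confidence interval for the accuracy
ci <- binom.test(correct, total)$conf.int

print(accuracy)
print(ci)
```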

  5. Out-of-Sample Error Estimation
    • Estimated using cross-validation during model training:
      model_rf$results$Accuracy  # Cross-validated accuracy, one value per mtry candidate tried

    • Out-of-Sample Error = 1 - Accuracy = 1 - 0.9923 = 0.0077 (0.77%)

    • Conclusion:
      Model Built: Random Forest with 99.23% cross-validated accuracy on the training set and 99.3% accuracy on the held-out testing set.
      Error Estimation: Out-of-sample error estimated at 0.77% using 5-fold cross-validation; the held-out testing set gives a consistent 1 - 0.993 = 0.007 (0.7%).
      Prediction Quality: The model classifies the 5 activity types (A-E) with high accuracy on data it was not trained on.
