In [1]:
# ================================
# 09 MODEL EVALUATION (TEST SET)
# ================================

library(readr)
library(dplyr)

setwd("C:/Users/Graf David/R/FinalProject")

df <- read_csv("dataset/train.csv", show_col_types = FALSE)

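# Drop the `v.id` identifier column before modelling
df_model <- df %>% select(-`v.id`)

# Fix the seed so the train/test split is reproducible
set.seed(42)

# 75% of the rows go to training, the remaining 25% to testing
sample_size <- floor(0.75 * nrow(df_model))
train_index <- sample(seq_len(nrow(df_model)), size = sample_size)

train_data <- df_model[train_index, ]
test_data  <- df_model[-train_index, ]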



Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union



In [2]:
reduced_model <- lm(
  `current price` ~ 
    `on road old` + 
    `on road now` + 
    km + 
    condition + 
    years,
  data = train_data
)


In [3]:
test_predictions <- predict(reduced_model, newdata = test_data)
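
Point predictions alone do not show how far an individual price may fall from the fitted value. As a minimal sketch, `predict()` on an `lm` object can also return prediction intervals. The example below uses the built-in `mtcars` data and a hypothetical model (`demo_model`) so that it runs without `dataset/train.csv`; it is an illustration, not the project's model.

```r
# Illustration only: built-in mtcars data, not the project's dataset.
set.seed(42)
idx <- sample(seq_len(nrow(mtcars)), size = floor(0.75 * nrow(mtcars)))
demo_train <- mtcars[idx, ]
demo_test  <- mtcars[-idx, ]

# Hypothetical model, chosen only to have something to predict from
demo_model <- lm(mpg ~ wt + hp, data = demo_train)

# interval = "prediction" returns a matrix with columns fit, lwr, upr
demo_pred <- predict(demo_model, newdata = demo_test,
                     interval = "prediction", level = 0.95)

# Share of test observations that fall inside their own 95% interval
coverage <- mean(demo_test$mpg >= demo_pred[, "lwr"] &
                 demo_test$mpg <= demo_pred[, "upr"])

print(head(demo_pred))
cat("Empirical coverage:", coverage, "\n")
```

The same call pattern would presumably apply to `reduced_model` and `test_data`, with the interval width giving a per-car sense of uncertainty.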


In [4]:
# Actual values
actual <- test_data$`current price`

# Prediction errors on the test set
errors <- actual - test_predictions

# Mean absolute error, root mean squared error, and out-of-sample R²
MAE  <- mean(abs(errors))
RMSE <- sqrt(mean(errors^2))
R2_test <- 1 - sum(errors^2) / sum((actual - mean(actual))^2)

cat("MAE :", MAE, "\n")
cat("RMSE:", RMSE, "\n")
cat("R2 on test:", R2_test, "\n")


MAE : 7609.46 
RMSE: 9011.59 
R2 on test: 0.995078 
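
MAE and RMSE are expressed in the units of `current price`, so they are easier to judge next to a naive baseline that always predicts the training mean. The sketch below shows that comparison on simulated data; the helper `regression_metrics()` and the data frame `sim` are hypothetical names introduced for illustration and are not part of the project.

```r
# Hypothetical helper: same formulas as in the cell above.
regression_metrics <- function(actual, predicted) {
  errors <- actual - predicted
  c(
    MAE  = mean(abs(errors)),
    RMSE = sqrt(mean(errors^2)),
    R2   = 1 - sum(errors^2) / sum((actual - mean(actual))^2)
  )
}

# Simulated data, for illustration only
set.seed(1)
n <- 200
x <- runif(n, 0, 10)
y <- 3 + 2 * x + rnorm(n, sd = 1)
sim <- data.frame(x = x, y = y)

# 75/25 split, mirroring the split used in this notebook
train_rows <- sample(seq_len(n), size = floor(0.75 * n))
sim_train <- sim[train_rows, ]
sim_test  <- sim[-train_rows, ]

fit <- lm(y ~ x, data = sim_train)

# Metrics for the fitted model and for a constant training-mean prediction
model_metrics <- regression_metrics(
  sim_test$y, predict(fit, newdata = sim_test)
)
baseline_metrics <- regression_metrics(
  sim_test$y, rep(mean(sim_train$y), nrow(sim_test))
)

print(rbind(model = model_metrics, baseline = baseline_metrics))
```

A model is only useful to the extent that its MAE and RMSE sit clearly below the baseline row.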


In [5]:
# ================================
# FULL SUMMARY — STEP 09
# ================================

cat("\n==============================\n")
cat("FINAL MODEL EVALUATION SUMMARY\n")
cat("==============================\n")

cat("\n--- MODEL FORMULA ---\n")
print(formula(reduced_model))

cat("\n--- TRAIN QUALITY ---\n")
cat("Train R²:", summary(reduced_model)$r.squared, "\n")
cat("Train Adj R²:", summary(reduced_model)$adj.r.squared, "\n")

cat("\n--- TEST QUALITY ---\n")
cat("MAE:", MAE, "\n")
cat("RMSE:", RMSE, "\n")
cat("Test R²:", R2_test, "\n")

cat("\n--- INTERPRETATION ---\n")
cat("The closer MAE and RMSE are to 0, and R² is to 1, the better the model predicts unseen data.\n")
cat("Very high test R² indicates strong generalization ability.\n")

cat("\n==============================\n")
cat("END OF STEP 09 — PROJECT FINISHED\n")
cat("==============================\n")



==============================
FINAL MODEL EVALUATION SUMMARY
==============================

--- MODEL FORMULA ---
`current price` ~ `on road old` + `on road now` + km + condition + 
    years

--- TRAIN QUALITY ---
Train R²: 0.9952688 
Train Adj R²: 0.995237 

--- TEST QUALITY ---
MAE: 7609.46 
RMSE: 9011.59 
Test R²: 0.995078 

--- INTERPRETATION ---
The closer MAE and RMSE are to 0, and R² is to 1, the better the model predicts unseen data.
Very high test R² indicates strong generalization ability.

==============================
END OF STEP 09 — PROJECT FINISHED
==============================
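
A single 75/25 split yields one estimate of test error, and that estimate depends on which rows happened to land in the test set. As a hedged sketch of how the estimate could be cross-checked, the code below runs 5-fold cross-validation in base R on the built-in `mtcars` data. The formula and the fold count are assumptions chosen for illustration, not the project's setup.

```r
# Illustration only: built-in mtcars data, not the project's dataset.
set.seed(42)
k <- 5
n <- nrow(mtcars)

# Randomly assign each row to one of k folds of near-equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

cv_rmse <- numeric(k)
for (i in seq_len(k)) {
  held_out <- mtcars[fold_id == i, ]   # fold used for evaluation
  kept     <- mtcars[fold_id != i, ]   # remaining folds used for fitting

  fit_i <- lm(mpg ~ wt + hp, data = kept)
  err_i <- held_out$mpg - predict(fit_i, newdata = held_out)
  cv_rmse[i] <- sqrt(mean(err_i^2))
}

cat("RMSE per fold  :", round(cv_rmse, 3), "\n")
cat("Mean CV RMSE   :", mean(cv_rmse), "\n")
cat("SD of fold RMSE:", sd(cv_rmse), "\n")
```

If the fold-to-fold spread is small relative to the mean, the single-split figures are less likely to be an artifact of one lucky partition.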
