![Imgur](https://i.imgur.com/SB15XHN.png)

## ①　Introduction
This dataset describes the weather and management conditions of a golf course.

🎯　Your goal is to make a prediction of the daily spending of that golf course.

You are free to add or remove sections and make any modifications to this notebook. Only your final submission will be graded; this notebook itself will not be.

In this notebook, you will mainly work on Section ⑤ (Preprocessing) and Section ⑦ (Model Building).

## ②　Setting Up the Environment
These are all the libraries used in the lecture.

In [1]:
# Basic Libraries (L1)
import pandas as pd
pd.set_option('display.max_columns', None)
import numpy  as np
import warnings
warnings.filterwarnings('ignore')

# Data Preprocessing (L2)
from sklearn.preprocessing import StandardScaler, RobustScaler, MinMaxScaler, OneHotEncoder

# Data Exploration (L3)
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import randint, norm, lognorm, expon, uniform, gamma
from scipy.stats import probplot, chi2_contingency

# Basic Classifiers & Regressors (L4-5)
from sklearn.linear_model import LogisticRegression,     LinearRegression
from sklearn.linear_model import Lasso, LassoCV,         ElasticNet, ElasticNetCV
from sklearn.naive_bayes  import BernoulliNB,            GaussianNB
from sklearn.neighbors    import KNeighborsClassifier,   KNeighborsRegressor
from sklearn.dummy        import DummyClassifier,        DummyRegressor
from sklearn.tree         import DecisionTreeClassifier, DecisionTreeRegressor, plot_tree
from sklearn.svm          import SVC,                    SVR

# Ensemble Classifiers & Regressors (L6)
from sklearn.ensemble     import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.ensemble     import RandomForestClassifier,     RandomForestRegressor
from sklearn.ensemble     import StackingClassifier,         StackingRegressor
from sklearn.ensemble     import AdaBoostClassifier,         AdaBoostRegressor
from sklearn.ensemble     import BaggingClassifier,          BaggingRegressor
from sklearn.ensemble     import VotingClassifier,           VotingRegressor
# from catboost             import CatBoostClassifier,         CatBoostRegressor
# import lightgbm as lgb
# import xgboost  as xgb

# Classification & Regression Metrics (L7)
from sklearn.metrics      import accuracy_score, precision_score, recall_score, f1_score
from sklearn.metrics      import confusion_matrix
from sklearn.metrics      import roc_curve, roc_auc_score

from sklearn.metrics      import mean_absolute_error, mean_absolute_percentage_error
from sklearn.metrics      import mean_squared_error, mean_squared_log_error
from sklearn.metrics      import r2_score

# Model Calibration (L7)
from sklearn.calibration  import calibration_curve
from sklearn.metrics      import brier_score_loss, log_loss

# Model Selection & Validation (L7)
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score, cross_val_predict
from sklearn.model_selection import KFold, StratifiedKFold, LeaveOneOut

# Hyperparameter Optimization (L7)
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Multiclass Classification (L8)
from sklearn.multiclass      import OneVsOneClassifier,    OneVsRestClassifier
from sklearn.metrics         import classification_report, precision_recall_fscore_support

# Model Weighting (L8)
from sklearn.utils.class_weight import compute_sample_weight

# Resampling Techniques (L8)
from imblearn.over_sampling  import RandomOverSampler,  SMOTE,    ADASYN
from imblearn.under_sampling import RandomUnderSampler, NearMiss, TomekLinks
from imblearn.combine        import SMOTEENN

# Pipeline (L9)
from imblearn.pipeline       import make_pipeline, Pipeline
from sklearn.compose         import ColumnTransformer

# Feature Selection (L9)
from sklearn.feature_selection import SelectKBest, SelectPercentile, SelectFromModel
from sklearn.feature_selection import f_classif, f_regression, chi2
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression
from sklearn.feature_selection import RFE, RFECV, SequentialFeatureSelector

# Dimensionality Reduction (L9)
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.decomposition         import PCA, KernelPCA
from sklearn.manifold              import TSNE

## ③　Loading Training Dataset

In [2]:
# Load the train dataset
train_url = "/kaggle/input/prediction-of-golf-course-spending/train.csv"
df_train = pd.read_csv(train_url, index_col=0)

target_col = 'Spending'

X_train, y_train = df_train.drop(target_col, axis=1), df_train[target_col]

df_train

Date,Season,DayLength,Weekend,Temperature,ChanceOfRain,AtmosphericPressure,WindSpeed,Evaporation,WindDirection,Humidity,CloudCover,Precipitation,Visibility,DewPoint,UVIndex,WindGusts,LightningStrikes,AirQualityIndex,SolarRadiation,SoilMoisture,BarometricTrend,GrassGrowth,FrostOccurrence,LeafWetnessDuration,SolarNoonTime,AirTempVariation,GroundTemperature,RelativeHumidity,AtmosphericStability,TotalRainfall,MaxWindSpeed,SoilAcidity,SunshineDuration,WaterTableDepth,PollutionIndex,SpecialEvents,DailyPlayers,CourseCondition,EquipmentUsage,MaintenanceLevel,GreenSpeed,StaffingLevel,ProShopSales,FoodBeverageSales,CustomerSatisfaction,WaterUsage,IrrigationEfficiency,PestControl,GrassHealth,DrivingRangeUsage,GolfCartMaintenance,ClubhouseTraffic,PracticeGreenUsage,LockerRoomUsage,MerchandiseSales,OnlineBookingRate,LandscapingQuality,TournamentPreparations,GolfCourseVisitorCount,TeeTimeUtilization,CourseWetnessLevel,GolfLessonAttendance,MerchandiseInventoryTurnover,ParkingLotUsage,CaddieServiceUsage,RestaurantBarRevenue,InflationRate2011,Spending
2011-01-01,Winter,8.99,1,-1.22,29.77,1006.61,11.06,3.16,287.48,44.51,36.76,4.78,1.08,6.32,2,24.57,1,4.98,2.69,59.05,-0.38,2.49,1,0.57,6.62,0.80,-4.74,21.85,-0.22,4.61,9.43,6.92,4.06,2.75,7.65,0,42,0.35,0.47,3,9.77,13,619.79,758.79,2.34,65.88,0.34,0,0.15,0.47,3,58,0.64,0.58,360.05,0.90,0.82,0,90,83.23,0.39,25,1.82,76.72,0.67,1287.99,-0.000094,1037.49
2011-01-02,Winter,9.01,1,-1.44,92.61,996.12,18.78,2.37,146.61,53.34,51.17,9.46,0.67,7.87,1,41.48,1,1.90,1.84,74.05,0.27,4.09,1,0.87,6.67,0.84,-6.76,24.75,0.20,9.51,12.78,7.39,4.76,4.72,7.07,0,24,0.03,0.17,3,10.64,8,441.47,527.83,1.07,58.75,0.36,0,0.34,0.18,1,47,0.29,0.24,271.14,0.63,0.86,0,76,47.40,0.61,18,0.92,40.23,0.29,874.09,0.000055,1355.80
2011-01-03,Winter,9.02,0,-1.46,0.50,1012.61,8.06,2.96,52.59,34.90,38.91,3.66,1.43,4.55,2,17.04,1,8.24,4.46,47.66,-0.46,2.78,1,0.26,6.70,0.94,-1.91,17.76,-0.62,3.44,6.95,6.71,5.30,2.19,25.53,0,32,0.57,0.30,2,9.10,10,532.06,636.99,3.32,69.61,0.21,0,0.16,0.31,1,52,0.43,0.36,314.04,0.76,0.39,1,83,65.73,0.35,20,1.38,60.42,0.45,832.14,0.001808,757.05
2011-01-04,Winter,9.04,0,-0.40,0.00,1016.08,6.76,5.02,286.07,30.28,10.44,1.08,4.74,3.03,3,12.32,1,9.96,3.76,19.26,-0.46,0.42,1,0.17,6.80,0.96,-2.30,17.14,-0.47,0.98,10.76,5.82,2.94,1.06,24.02,0,47,0.72,0.54,2,7.73,14,629.55,813.31,3.82,100.73,0.39,2,0.01,0.57,3,61,0.73,0.63,368.63,0.93,0.65,3,94,88.14,0.18,28,1.88,84.24,0.73,976.23,0.000199,362.70
2011-01-05,Winter,9.06,0,-1.24,0.00,1019.57,4.31,2.67,326.05,23.60,38.84,2.92,2.03,2.67,2,9.11,0,13.66,7.94,35.63,-1.33,3.61,1,0.15,6.89,0.86,5.01,14.33,-1.06,2.24,6.30,6.42,6.13,1.41,1.26,0,39,0.82,0.42,2,8.39,12,597.07,728.76,4.26,54.14,0.15,0,0.29,0.43,0,55,0.56,0.52,351.24,0.86,0.31,1,87,78.65,0.32,24,1.75,70.01,0.61,1116.69,0.002617,561.60
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2021-12-27,Winter,8.94,0,-1.61,45.08,1003.67,12.31,2.19,82.66,43.98,37.26,5.40,1.03,6.02,2,29.88,1,4.46,2.03,61.55,-0.71,2.55,1,0.53,6.38,0.62,-5.99,20.11,-0.45,5.08,8.36,7.00,3.70,3.01,9.48,0,36,0.22,0.36,2,9.85,12,559.92,697.45,1.89,40.06,0.36,1,0.13,0.37,0,54,0.52,0.46,328.02,0.84,0.44,0,85,74.61,0.46,23,1.51,63.01,0.51,927.78,0.199259,660.51
2021-12-28,Winter,8.95,0,-1.13,0.85,1009.78,9.17,3.06,349.48,38.00,31.44,3.39,1.36,5.09,2,20.79,1,6.55,3.34,50.18,-0.65,2.04,1,0.33,6.41,0.71,-3.15,18.38,-0.74,3.19,5.92,6.70,4.11,2.26,5.84,0,32,0.47,0.31,2,9.21,10,525.18,627.06,2.84,65.72,0.32,1,0.07,0.30,3,51,0.43,0.34,313.25,0.78,0.51,2,81,64.52,0.33,21,1.34,53.24,0.43,924.05,0.200693,774.02
2021-12-29,Winter,8.96,0,0.18,0.00,1016.18,4.74,3.07,268.00,33.44,38.80,1.76,2.10,3.84,2,12.84,1,14.00,7.19,32.82,-2.05,3.33,0,0.23,6.48,0.95,5.57,19.14,-2.18,1.60,1.46,6.16,6.03,1.10,17.20,0,40,0.95,0.46,2,8.35,13,596.82,736.18,4.80,62.88,0.35,0,0.22,0.43,2,55,0.59,0.52,349.78,0.88,0.59,1,87,79.07,0.26,24,1.74,71.36,0.59,663.30,0.202556,884.20
2021-12-30,Winter,8.97,0,0.08,38.83,1003.65,12.28,3.46,22.64,47.48,31.47,4.70,1.04,7.25,2,30.43,1,4.44,1.68,60.48,-0.48,1.99,0,0.66,6.52,0.97,-5.40,23.89,-0.49,4.51,8.66,6.94,2.84,2.80,8.43,0,31,0.23,0.28,2,9.76,10,519.75,630.10,1.84,73.52,0.45,1,0.09,0.29,0,51,0.41,0.35,312.21,0.76,0.60,1,81,66.13,0.39,21,1.39,51.64,0.41,830.41,0.199159,642.89


## ④　Exploratory Data Analysis

In [3]:
# Perform EDA here if you want.
df_train.info()  # info() prints its summary itself and returns None, so print() is not needed
df_train.describe()

<class 'pandas.core.frame.DataFrame'>
Index: 4018 entries, 2011-01-01 to 2021-12-31
Data columns (total 68 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   Season                        4018 non-null   object 
 1   DayLength                     4018 non-null   float64
 2   Weekend                       4018 non-null   int64  
 3   Temperature                   4018 non-null   float64
 4   ChanceOfRain                  4018 non-null   float64
 5   AtmosphericPressure           4018 non-null   float64
 6   WindSpeed                     4018 non-null   float64
 7   Evaporation                   4018 non-null   float64
 8   WindDirection                 4018 non-null   float64
 9   Humidity                      4018 non-null   float64
 10  CloudCover                    4018 non-null   float64
 11  Precipitation                 4018 non-null   float64
 12  Visibility                    4018 non-null   float64

Statistic,DayLength,Weekend,Temperature,ChanceOfRain,AtmosphericPressure,WindSpeed,Evaporation,WindDirection,Humidity,CloudCover,Precipitation,Visibility,DewPoint,UVIndex,WindGusts,LightningStrikes,AirQualityIndex,SolarRadiation,SoilMoisture,BarometricTrend,GrassGrowth,FrostOccurrence,LeafWetnessDuration,SolarNoonTime,AirTempVariation,GroundTemperature,RelativeHumidity,AtmosphericStability,TotalRainfall,MaxWindSpeed,SoilAcidity,SunshineDuration,WaterTableDepth,PollutionIndex,SpecialEvents,DailyPlayers,CourseCondition,EquipmentUsage,MaintenanceLevel,GreenSpeed,StaffingLevel,ProShopSales,FoodBeverageSales,CustomerSatisfaction,WaterUsage,IrrigationEfficiency,PestControl,GrassHealth,DrivingRangeUsage,GolfCartMaintenance,ClubhouseTraffic,PracticeGreenUsage,LockerRoomUsage,MerchandiseSales,OnlineBookingRate,LandscapingQuality,TournamentPreparations,GolfCourseVisitorCount,TeeTimeUtilization,CourseWetnessLevel,GolfLessonAttendance,MerchandiseInventoryTurnover,ParkingLotUsage,CaddieServiceUsage,RestaurantBarRevenue,InflationRate2011,Spending
count,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0,4018.0
mean,12.209821,0.285714,9.724793,23.876799,1013.222698,8.958484,9.89556,181.759878,60.088823,50.105007,6.004619,1.143835,9.982524,2.961424,17.917008,2.981085,9.399826,6.098681,62.586565,0.014587,5.972934,0.214286,2.005398,11.952454,2.871018,9.692827,39.805052,0.012359,5.882972,13.254281,7.000102,6.009734,3.079572,9.358564,0.028123,29.598805,0.501575,0.283848,2.001244,9.994652,9.475859,499.841979,604.847982,3.012688,200.43773,0.49953,1.29119,0.496299,0.283771,1.025884,49.836984,0.397529,0.338444,301.155244,0.715438,0.498524,1.001244,80.104032,60.169617,0.284002,20.084121,1.242867,50.346274,0.401078,804.856807,0.10053,1238.975772
std,2.234768,0.45181,9.419895,32.491773,9.860262,4.617397,7.054592,104.305728,20.176069,22.085891,4.183233,0.622029,5.085232,1.686614,12.249006,1.727702,7.397384,4.317412,16.105858,0.981916,3.417313,0.410377,2.032099,3.503898,1.60706,9.606616,19.899321,0.983785,4.223931,6.83584,0.493492,2.31601,1.615904,7.414271,0.165346,11.159616,0.289911,0.160739,0.773438,0.998671,2.843882,102.075562,152.74036,1.159374,141.23657,0.288933,1.101214,0.287042,0.159312,1.005252,7.033265,0.202773,0.178505,49.930111,0.157714,0.288928,0.999252,9.05564,19.87066,0.161077,4.508216,0.43351,22.488098,0.202807,391.558151,0.057941,677.517564
min,8.92,0.0,-7.43,0.0,976.06,0.27,0.09,0.26,3.65,1.33,0.04,0.14,-7.22,0.0,0.11,0.0,0.53,0.03,6.16,-3.3,0.31,0.0,0.0,6.0,0.05,-10.61,0.56,-3.56,0.06,0.09,5.24,2.01,0.43,0.49,0.0,0.0,0.0,0.0,1.0,6.72,5.0,132.12,16.62,1.0,2.52,0.0,0.0,0.0,0.0,0.0,27.0,0.0,0.0,113.14,0.12,0.0,0.0,53.0,3.16,0.0,7.0,0.5,0.35,0.0,0.0,-0.000616,178.82
25%,10.05,0.0,0.7325,0.0,1006.5225,5.47,4.8625,90.4575,45.595,33.035,2.9,0.71,6.57,2.0,8.47,2.0,4.57,2.96,51.425,-0.68,3.42,0.0,0.58,8.88,1.63,0.7925,24.2425,-0.66,2.78,8.02,6.66,4.0,1.94,4.5725,0.0,22.0,0.25,0.16,1.0,9.32,7.0,428.785,503.0025,2.0,96.77,0.25,0.0,0.25,0.16,0.0,45.0,0.23,0.2,266.8025,0.61,0.25,0.0,74.0,46.02,0.16,17.0,0.87,33.4325,0.25,537.0525,0.048462,766.3325
50%,12.235,0.0,9.865,0.91,1013.225,8.45,8.31,182.255,61.39,50.365,5.09,1.0,9.98,3.0,15.595,3.0,7.37,5.14,63.44,0.04,5.31,0.0,1.41,12.0,2.64,9.48,37.97,0.02,4.91,12.495,7.0,6.04,2.74,7.4,0.0,30.0,0.5,0.26,2.0,10.0,9.0,500.295,603.725,3.04,168.23,0.5,1.0,0.49,0.26,1.0,50.0,0.38,0.32,301.3,0.74,0.5,1.0,80.0,61.75,0.26,20.0,1.22,50.635,0.39,806.995,0.100271,1087.105
75%,14.38,1.0,18.4675,49.175,1020.01,11.87,13.1575,271.5025,75.9075,67.6375,8.07,1.42,13.29,4.0,24.845,4.0,11.76,8.1875,74.7375,0.6975,7.84,0.0,2.76,15.03,3.92,18.2475,53.9275,0.68,7.84,17.61,7.33,8.0,3.83,11.68,0.0,37.0,0.76,0.39,3.0,10.67,12.0,570.8125,707.8175,4.01,267.37,0.75,2.0,0.74,0.39,2.0,54.0,0.54,0.46,335.18,0.84,0.75,2.0,86.0,75.6975,0.39,23.0,1.62,68.015,0.54,1062.605,0.152783,1533.6075
max,15.43,1.0,29.39,99.84,1047.68,32.18,58.26,360.0,99.27,98.58,32.37,5.73,28.73,12.0,79.61,11.0,131.88,34.73,97.35,3.51,22.27,1.0,22.91,18.0,8.59,32.08,97.97,3.86,27.71,47.13,8.82,10.0,15.99,85.83,1.0,70.0,1.0,0.87,3.0,14.22,14.0,868.62,1156.77,5.0,1084.49,1.0,3.0,1.0,0.9,3.0,74.0,0.96,0.92,491.66,0.99,1.0,3.0,115.0,98.85,0.9,40.0,2.0,99.84,0.97,2222.54,0.202587,7494.29
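
One possible EDA step, sketched under assumptions: ranking numeric features by their absolute correlation with the target. The frame below is a small synthetic stand-in so the snippet runs on its own; the column names mirror a few from the dataset, but every value, and the relationship between `DailyPlayers` and `Spending`, is invented for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for df_train (values are invented, not taken from the dataset)
toy = pd.DataFrame({
    "Temperature": rng.normal(10, 9, n),
    "DailyPlayers": rng.integers(0, 70, n),
    "WindSpeed": rng.normal(9, 4, n),
})
# Hypothetical target: driven mostly by DailyPlayers, plus noise
toy["Spending"] = 20 * toy["DailyPlayers"] + rng.normal(0, 50, n)

# Absolute Pearson correlation of each feature with the target, strongest first
corr_with_target = (
    toy.corr()["Spending"]
       .drop("Spending")
       .abs()
       .sort_values(ascending=False)
)
print(corr_with_target)
```

On the real data, the same idea would apply to the numeric columns of `df_train` only, since `Season` is a string column. Correlation captures only linear relationships, so a low value does not prove a feature is useless.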


## ⑤　Preprocessing Pipeline

In [4]:
numerical_cols = X_train.select_dtypes(include=['int64', 'float64']).columns
categorical_cols = X_train.select_dtypes(include=['object', 'category']).columns

preprocessor = ColumnTransformer([
    ('std', StandardScaler(), numerical_cols),
    ('ohe', OneHotEncoder(), categorical_cols)
    ], remainder='passthrough')
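
To see what a preprocessor of this shape produces, here is a minimal sketch on a toy frame. The data is made up, the `toy_*` names are hypothetical, and `handle_unknown="ignore"` is an optional addition (not used in the cell above) that encodes categories unseen during fitting as all zeros instead of raising an error.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy frame with two numeric columns and one categorical column (made-up values)
toy_X = pd.DataFrame({
    "Temperature": [-1.2, 9.8, 18.4, 29.3],
    "DailyPlayers": [42, 24, 32, 47],
    "Season": ["Winter", "Spring", "Summer", "Autumn"],
})

# Split columns by dtype: numeric columns are scaled, the rest are one-hot encoded
num_cols = toy_X.select_dtypes(include="number").columns
cat_cols = toy_X.select_dtypes(exclude="number").columns

toy_preprocessor = ColumnTransformer([
    ("std", StandardScaler(), num_cols),
    # Assumption: ignoring unknown categories is acceptable for this task
    ("ohe", OneHotEncoder(handle_unknown="ignore"), cat_cols),
])

Xt = toy_preprocessor.fit_transform(toy_X)
print(Xt.shape)  # 2 scaled numeric columns + 4 one-hot columns
```

Because the transformer sits inside the pipeline, its scaling statistics and category list are re-fitted on each training fold during cross-validation, which avoids leaking information from the validation fold.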


## ⑥　Training Pipeline

In [5]:
# ② Pipeline: Preprocess, VotingRegressor
training_pipeline = make_pipeline(
    preprocessor,
    VotingRegressor(estimators=[
        ('rf', RandomForestRegressor()),
        ('adaboost', AdaBoostRegressor()),
        ('lr', LinearRegression()),
        ('knn', KNeighborsRegressor())
        # Add more regression models if needed
    ])
)
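
The parameter names used when tuning a pipeline like this follow from how `make_pipeline` names its steps. The sketch below uses `sklearn.pipeline.make_pipeline` and a reduced two-model ensemble for brevity; the imblearn variant imported in this notebook follows the same naming convention.

```python
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

pipe = make_pipeline(
    StandardScaler(),
    VotingRegressor(estimators=[
        ("rf", RandomForestRegressor()),
        ("lr", LinearRegression()),
    ]),
)

# make_pipeline names each step after its lowercased class name
step_names = [name for name, _ in pipe.steps]
print(step_names)

# Nested parameters are addressed as <step>__<estimator name>__<parameter>
rf_params = sorted(p for p in pipe.get_params() if p.startswith("votingregressor__rf__"))
print(rf_params[:5])
```

Calling `get_params()` on the actual pipeline is a quick way to check a key before putting it into a search space; a misspelled key makes the search fail with an invalid-parameter error.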

## ⑦　Hyperparameter Tuning

In [6]:
# ③ Define the parameter space
param_grid = {
    'votingregressor__rf__n_estimators': [10, 50, 100],
    'votingregressor__rf__max_depth': [None, 10, 20],
    'votingregressor__adaboost__n_estimators': [50, 100, 200],
    'votingregressor__adaboost__learning_rate': [0.01, 0.1, 0.5, 1.0],
    'votingregressor__lr__fit_intercept': [True, False],
    'votingregressor__knn__n_neighbors': [3, 5, 10],
    'votingregressor__knn__weights': ['uniform', 'distance']
    # Add other hyperparameters for the base models
}

# ④ Use RandomizedSearchCV for hyperparameter tuning
random_search = RandomizedSearchCV(training_pipeline,
                                   param_distributions=param_grid,
                                   n_iter=12,
                                   scoring='r2',
                                   cv=KFold(n_splits=15),
                                   n_jobs=-1
                                   )

# Perform the search
random_search.fit(X_train, y_train)

# Tuning Results
print("Best parameters:\n", random_search.best_params_)
# scoring='r2' is already higher-is-better, so the score is reported without negation
print("\nBest score:\n", random_search.best_score_)


Best parameters:
 {'votingregressor__rf__n_estimators': 100, 'votingregressor__rf__max_depth': 10, 'votingregressor__lr__fit_intercept': True, 'votingregressor__knn__weights': 'uniform', 'votingregressor__knn__n_neighbors': 10, 'votingregressor__adaboost__n_estimators': 100, 'votingregressor__adaboost__learning_rate': 0.1}

Best score:
 0.877391304193
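
scikit-learn scorers follow a higher-is-better convention: `'r2'` is returned unchanged, while error metrics are exposed under `neg_*` names and must be negated to recover the usual positive value. The sketch below illustrates this on synthetic data with a plain `LinearRegression`; the data and model are placeholders, not the notebook's pipeline.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
# Synthetic target, shifted away from zero so percentage errors stay well defined
y = 10.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=120)

cv = KFold(n_splits=5)
model = LinearRegression()

r2 = cross_val_score(model, X, y, cv=cv, scoring="r2")
neg_mape = cross_val_score(model, X, y, cv=cv,
                           scoring="neg_mean_absolute_percentage_error")

print("R2 (report as-is):", r2.mean())
print("MAPE (negate the neg_ score):", -neg_mape.mean())
```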


## ⑧　Cross-Validation with Best Parameters

In [7]:
# Extract best parameters and update the pipeline
training_pipeline.set_params(**random_search.best_params_)

# Perform cross-validation to check performance
kf = KFold(n_splits=5)
scores = cross_val_score(training_pipeline, X_train, y_train, cv=kf, scoring='neg_mean_absolute_percentage_error')

print(f"MAPE: {-scores.mean().round(4)} ± {2*scores.std().round(4)}")


MAPE: 0.1516 ± 0.0154
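
For reference, MAPE is the mean of the absolute errors divided by the absolute true values, so a value of 0.15 means predictions are off by about 15% on average. A minimal check of that formula against scikit-learn, using invented numbers:

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

# Made-up daily spending values and predictions, for illustration only
y_true = np.array([1000.0, 500.0, 2000.0, 800.0])
y_hat = np.array([900.0, 550.0, 2200.0, 800.0])

# MAPE = mean(|y - y_hat| / |y|)
manual_mape = np.mean(np.abs(y_true - y_hat) / np.abs(y_true))
sk_mape = mean_absolute_percentage_error(y_true, y_hat)

print(manual_mape, sk_mape)  # both are approximately 0.075
```

Because each error is divided by the true value, days with low spending weigh more heavily in MAPE than the same absolute error on a high-spending day.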


## ⑨　Making Predictions on Test Data 

In [8]:
# Load the test dataset
test_url  = "/kaggle/input/prediction-of-golf-course-spending/test.csv"
df_test = pd.read_csv(test_url, index_col=0)

X_test = df_test.copy()

# Fit the tuned pipeline on the full training set
training_pipeline.fit(X_train, y_train)

# Predict on the unseen test data
y_pred = training_pipeline.predict(X_test)

## ⑩　Submitting the Prediction

In [9]:
# Make DataFrame for Submission
submission = pd.DataFrame(y_pred, index=df_test.index, columns=[target_col])

# Make a CSV file for Submission
submission.to_csv('submission.csv')

submission

Date,Spending
2022-01-01,1391.412026
2022-01-02,1240.595902
2022-01-03,769.807973
2022-01-04,1071.211100
2022-01-05,1085.861287
...,...
2023-12-27,688.874917
2023-12-28,737.215902
2023-12-29,663.440538
2023-12-30,1321.142973
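
Before uploading, it can help to confirm that the file has the expected layout. The sketch below round-trips a toy submission through CSV in memory; the dates and values are placeholders, and the checks (a `Date` index, a single `Spending` column, no missing values) are assumptions about the expected format based on the output above.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical predictions for three test dates (values invented)
dates = pd.Index(["2022-01-01", "2022-01-02", "2022-01-03"], name="Date")
preds = np.array([1391.4, 1240.6, 769.8])

toy_submission = pd.DataFrame(preds, index=dates, columns=["Spending"])

# Round-trip through CSV in memory to inspect what would be written to disk
buffer = io.StringIO()
toy_submission.to_csv(buffer)
buffer.seek(0)
reloaded = pd.read_csv(buffer, index_col=0)

# Basic layout checks on the reloaded file
assert list(reloaded.columns) == ["Spending"]
assert reloaded.index.name == "Date"
assert not reloaded["Spending"].isna().any()
print(reloaded)
```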
