## Instructions {-}

1. This is the template you may use to submit your code and report for the prediction problems on Kaggle.

2. You may modify the template if you deem fit, but it should have the information asked below.

In [4]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, r2_score, roc_curve, auc, confusion_matrix, accuracy_score
from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV, ParameterGrid, RandomizedSearchCV, cross_val_score, train_test_split, KFold
from sklearn.ensemble import StackingRegressor, VotingRegressor, BaggingRegressor, BaggingClassifier, RandomForestRegressor, RandomForestClassifier, AdaBoostRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression, Lasso, LassoCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import MinMaxScaler, FunctionTransformer, StandardScaler
from sklearn.impute import KNNImputer
from sklearn.decomposition import PCA
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import ElasticNetCV
import itertools as it
import xgboost as xgb
import lightgbm as lgb
from lightgbm import LGBMRegressor
from catboost import CatBoostRegressor
from pyearth import Earth
import time


## A.1) Data cleaning

Mention the data cleaning steps taken to prepare your data for developing the model. This may include imputing missing values, dealing with outliers, combining levels of categorical variable(s), etc.

* Put your data cleaning/preparation code with comments here
* The code should begin from reading the train data
* The code should end when you obtain the data used to develop the model in A.4

In [2]:
# Read train and test data
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')

# Pull id of test rows to be used later
test_id = test.id

# Drop id column
train = train.drop(columns = 'id')
test = test.drop(columns = 'id')

# Define X, y, and X_test
X = train.drop(columns = 'y')
y = train['y']
X_test = test 

# Standardize train and test data
sc = StandardScaler()
sc.fit(X)
X_train = sc.transform(X)
X_test = sc.transform(X_test)

# Impute missing values: fit the imputer on the training data only,
# then apply the same fitted imputer to the test data
imputer = KNNImputer(n_neighbors=10)
X_train = pd.DataFrame(imputer.fit_transform(X_train),columns = X.columns)
X_test = pd.DataFrame(imputer.transform(X_test),columns = test.columns)

# Log y
log_y = np.log(y)
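
One way to keep the preprocessing consistent between train and test is to chain the scaler and the imputer in a single pipeline, fit it once on the training features, and only transform the test features. The following is a minimal sketch on toy data, not the code used for the submission; `X_toy`, `X_toy_test`, and `prep` are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.impute import KNNImputer

# Toy data with a few missing values (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X_toy = pd.DataFrame(rng.normal(size=(50, 4)), columns=['a', 'b', 'c', 'd'])
X_toy.iloc[::7, 1] = np.nan
X_toy_test = pd.DataFrame(rng.normal(size=(10, 4)), columns=X_toy.columns)
X_toy_test.iloc[0, 2] = np.nan

# Scale, then impute; both steps learn their statistics from the training data
prep = make_pipeline(StandardScaler(), KNNImputer(n_neighbors=10))
X_toy_train_prep = pd.DataFrame(prep.fit_transform(X_toy), columns=X_toy.columns)

# The test data is only transformed, never used for fitting
X_toy_test_prep = pd.DataFrame(prep.transform(X_toy_test), columns=X_toy.columns)
```

With this layout the test rows are imputed from neighbours found in the training data, so the test set does not influence any fitted statistic.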

## A.2) Exploratory data analysis

Mention any major insights you obtained from the data, which you used to develop the model. Please put your code or visualizations here if needed.

- Insight 1
- Insight 2
- .........

I performed extensive EDA, exploring null values, outliers, multicollinearity, and PCA. None of it turned out to be especially helpful for building the best model. I initially expected that reducing the dimensionality of the dataset would be useful, but in the end the gains came from trying different combinations of models, as the rest of this report shows.

## A.3) Feature selection/reduction

Mention the steps for feature selection/reduction. Please put your code or visualizations here if needed.

- step 1
- step 2
- .........

### CatBoost (for feature reduction)

Step 1: train default CatBoost model on entire dataset 

In [5]:
cat_base = CatBoostRegressor()
cat_base.fit(X_train, log_y, verbose=False)

<catboost.core.CatBoostRegressor at 0x159338cda30>

Step 2: remove features from dataset that had 0 importance in CatBoost model

In [6]:
# Get the feature importances from the CatBoost model
cat_feature_importances = cat_base.feature_importances_

# Create a DataFrame to store the feature names and importances
cat_feature_df = pd.DataFrame({'Feature': X_train.columns, 'Importance': cat_feature_importances})

# Sort the DataFrame by feature importances in descending order
cat_sorted_features = cat_feature_df.sort_values(by='Importance', ascending=False)

In [7]:
# Check feature importances
cat_sorted_features[:649]

| | Feature | Importance |
|---|---|---|
| 701 | x702 | 5.472942 |
| 104 | x105 | 4.259369 |
| 566 | x567 | 4.101641 |
| 145 | x146 | 2.046978 |
| 117 | x118 | 1.718619 |
| ... | ... | ... |
| 169 | x170 | 0.003143 |
| 311 | x312 | 0.003062 |
| 763 | x764 | 0.002961 |
| 438 | x439 | 0.002726 |


In [8]:
# Keep the top 649 features (per Step 2, the ones with non-zero importance)
cat_top_features = cat_sorted_features.head(649)

# Get the feature names as a list
cat_top_feature_names = cat_top_features['Feature'].tolist()

# Filter the train and test data using the top feature names
cat_X_train_filtered = X_train[cat_top_feature_names]
cat_X_test_filtered = X_test.loc[:,cat_X_train_filtered.columns]

Step 3: train default model on filtered dataset

In [9]:
cat = CatBoostRegressor()
cat.fit(cat_X_train_filtered, log_y, verbose=False)

<catboost.core.CatBoostRegressor at 0x1594332bfa0>

Step 4: remove non-important features once again

In [10]:
# Get the feature importances from the CatBoost model
cat_feature_importances_2 = cat.feature_importances_

# Create a DataFrame to store the feature names and importances
cat_feature_df_2 = pd.DataFrame({'Feature': cat_X_train_filtered.columns, 'Importance': cat_feature_importances_2})

# Sort the DataFrame by feature importances in descending order
cat_sorted_features_2 = cat_feature_df_2.sort_values(by='Importance', ascending=False)

In [11]:
cat_sorted_features_2[:607]

| | Feature | Importance |
|---|---|---|
| 0 | x702 | 5.292004 |
| 1 | x105 | 4.598000 |
| 2 | x567 | 4.248394 |
| 9 | x102 | 2.511596 |
| 6 | x670 | 1.965405 |
| ... | ... | ... |
| 553 | x254 | 0.004488 |
| 488 | x237 | 0.003348 |
| 634 | x069 | 0.003284 |
| 538 | x313 | 0.002926 |


In [12]:
# Keep the top 607 features from the second round
cat_top_features_2 = cat_sorted_features_2.head(607)

# Get the feature names as a list
cat_top_feature_names_2 = cat_top_features_2['Feature'].tolist()

# Filter the training data using the top feature names
cat_X_train_filtered_2 = cat_X_train_filtered[cat_top_feature_names_2]

cat_X_test_filtered_2 = cat_X_test_filtered.loc[:,cat_X_train_filtered_2.columns]
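
The two rounds above follow the same pattern: fit a model, drop the features it assigns zero importance, and refit on what remains. That pattern could be written as a small loop that derives the cutoff from the importances instead of a hard-coded row count. This is only a sketch: `prune_zero_importance` is a hypothetical helper, and a scikit-learn `RandomForestRegressor` on toy data stands in for CatBoost so the example stays self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def prune_zero_importance(model, X, y, max_rounds=2):
    """Refit `model` and drop zero-importance columns, up to `max_rounds` times."""
    cols = list(X.columns)
    for _ in range(max_rounds):
        model.fit(X[cols], y)
        keep = [c for c, imp in zip(cols, model.feature_importances_) if imp > 0]
        # Stop when nothing would be pruned, or when pruning would leave no features
        if not keep or len(keep) == len(cols):
            break
        cols = keep
    return cols

# Toy data (hypothetical): two informative features, three noise features,
# and one constant column that a tree can never split on
rng = np.random.default_rng(0)
X_toy = pd.DataFrame(rng.normal(size=(200, 5)), columns=[f'f{i}' for i in range(5)])
X_toy['const'] = 1.0
y_toy = 3 * X_toy['f0'] - 2 * X_toy['f1'] + rng.normal(scale=0.1, size=200)

kept_cols = prune_zero_importance(
    RandomForestRegressor(n_estimators=50, random_state=0), X_toy, y_toy
)
```

Any estimator exposing `feature_importances_` after `fit` could be passed in place of the random forest.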

### MARS (for feature reduction)

Step 1: train MARS on entire train dataset

In [13]:
mars = Earth(max_degree = 1, feature_importance_type = 'rss')
mars.fit(X_train, log_y)



Earth(feature_importance_type='rss', max_degree=1)

Step 2: Check most important features of MARS model

In [14]:
# Retrieve the feature importances
mars_feature_importances = mars.feature_importances_

# Retrieve the feature names 
feature_names = X_train.columns

# Sort the feature names based on importances
mars_sorted_indices = np.argsort(mars_feature_importances)[::-1]
mars_sorted_feature_names = [feature_names[i] for i in mars_sorted_indices]
mars_sorted_importances = mars_feature_importances[mars_sorted_indices]

# Create an array with feature names and importances in order
mars_feature_importances_array = np.column_stack((mars_sorted_feature_names, mars_sorted_importances))

In [15]:
mars_feature_importances_array[:21]

array([['x102', '0.3688501777512344'],
       ['x118', '0.26081660159946174'],
       ['x174', '0.0950616792231861'],
       ['x253', '0.08579451113222046'],
       ['x337', '0.03754954121775787'],
       ['x567', '0.030405954505762188'],
       ['x355', '0.02414998260861697'],
       ['x105', '0.017977164800277192'],
       ['x218', '0.0168931035702206'],
       ['x092', '0.012819683876068367'],
       ['x265', '0.011986861462570298'],
       ['x286', '0.009458921602312494'],
       ['x420', '0.00905861059359245'],
       ['x111', '0.007339808440261188'],
       ['x321', '0.004961874311978705'],
       ['x667', '0.0018224337499960486'],
       ['x733', '0.0016584939187161709'],
       ['x527', '0.0016222828346693404'],
       ['x106', '0.0012873259351471695'],
       ['x154', '0.0004849868659502246'],
       ['x211', '0.0']], dtype='<U32')

Step 3: Define new training and test datasets containing only the 21 highest-ranked features from the MARS model

In [16]:
X_train_mars = X_train[mars_sorted_feature_names[:21]]
X_test_mars = X_test.loc[:,X_train_mars.columns]
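
Note that `np.column_stack` on names and numbers produces a string array (the `dtype='<U32'` in the output above), so the importances can no longer be compared numerically. A sketch of an alternative that keeps the importances as floats in a DataFrame is given below. The names and values are made-up stand-ins for `X_train.columns` and `mars.feature_importances_`.

```python
import numpy as np
import pandas as pd

# Hypothetical names and importances, for illustration only
names = np.array(['x102', 'x118', 'x211', 'x174'])
importances = np.array([0.37, 0.26, 0.0, 0.10])

# One row per feature, sorted from most to least important
mars_df = (pd.DataFrame({'Feature': names, 'Importance': importances})
           .sort_values(by='Importance', ascending=False)
           .reset_index(drop=True))

# Importance is still numeric, so a threshold can be applied directly
selected = mars_df.loc[mars_df['Importance'] > 0, 'Feature'].tolist()
```

Selecting by `Importance > 0` would also exclude features that MARS did not use at all, whereas a fixed top-N slice may include some of them.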

## A.4) Developing the model

Mention the logical sequence of steps taken to obtain the final model. 

- Model 1
- Model 2
- .........

### CatBoost

Train the tuned CatBoost model on the twice-filtered dataset. The hyperparameters were tuned with RandomizedSearchCV on the original training dataset.

In [17]:
cat = CatBoostRegressor(iterations = 750,
                        depth = 6,
                        l2_leaf_reg = 4,
                        border_count = 256,
                        learning_rate = 0.08,
                        random_state = 1)
cat.fit(cat_X_train_filtered_2, log_y, verbose=False)

<catboost.core.CatBoostRegressor at 0x15939635d00>

### LightGBM

Train the tuned LightGBM model on the MARS feature importance dataset. The hyperparameters were tuned with RandomizedSearchCV on that same MARS training dataset.

In [18]:
lgbm = LGBMRegressor(
    objective = 'regression',
    n_estimators = 1200,
    num_leaves=15,
    learning_rate=0.01,
    colsample_bytree=0.6,
    subsample = 1.0,
    reg_lambda = 0,
    reg_alpha = 0,
    max_depth = 9
)

# Train the LGBMRegressor model
lgbm.fit(X_train_mars, log_y)

LGBMRegressor(colsample_bytree=0.6, learning_rate=0.01, max_depth=9,
              n_estimators=1200, num_leaves=15, objective='regression',
              reg_alpha=0, reg_lambda=0)
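
The tuning runs themselves are not shown in this report. As a rough illustration of what a `RandomizedSearchCV` run looks like, here is a sketch on synthetic data. The search space is hypothetical and is not the one used for the models above, and scikit-learn's `GradientBoostingRegressor` stands in for CatBoost and LightGBM so the example runs without extra packages.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV, KFold

# Synthetic regression data (hypothetical, for illustration only)
X_toy, y_toy = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=1)

# Hypothetical search space; not the values used for the report's models
param_distributions = {
    'n_estimators': [50, 100],
    'learning_rate': [0.01, 0.05, 0.1],
    'max_depth': [2, 3, 4],
    'subsample': [0.6, 0.8, 1.0],
}

# Sample 5 random parameter combinations and score each with 3-fold CV
search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=1),
    param_distributions=param_distributions,
    n_iter=5,
    scoring='neg_root_mean_squared_error',
    cv=KFold(n_splits=3, shuffle=True, random_state=1),
    random_state=1,
)
search.fit(X_toy, y_toy)
best_params = search.best_params_
```

`search.best_params_` holds the best sampled combination, and `search.best_estimator_` is that model refit on all of the data passed to `fit`.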

### Ensemble

Ensemble the CatBoost and LightGBM models. The stacked model is fit on the entire original training dataset (`X_train`), so both base models are refit on all features.

In [25]:
# Stacking using ElasticNetCV as the meta-model
en = StackingRegressor(estimators = [('cat', cat), ('lgbm', lgbm)],
                       final_estimator = ElasticNetCV(),
                       cv = KFold(n_splits = 5, shuffle = True))
en.fit(X_train, log_y)
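
`StackingRegressor` passes the same feature matrix to every base estimator. If each base model should instead see only its own feature subset, one option is to wrap each model in a pipeline that selects its columns first. The sketch below illustrates that idea on toy data and is not the approach used for the submission: `on_columns` is a hypothetical helper, the column names are made up, and `RandomForestRegressor` and `Ridge` stand in for the CatBoost and LightGBM models.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import StackingRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNetCV, Ridge
from sklearn.model_selection import KFold
from sklearn.pipeline import make_pipeline

# Toy data (hypothetical, for illustration only)
rng = np.random.default_rng(1)
X_toy = pd.DataFrame(rng.normal(size=(150, 6)), columns=[f'f{i}' for i in range(6)])
y_toy = 2 * X_toy['f0'] + X_toy['f3'] + rng.normal(scale=0.1, size=150)

def on_columns(cols, model):
    """Pipeline that keeps only `cols` and then fits `model` on them."""
    selector = ColumnTransformer([('keep', 'passthrough', cols)], remainder='drop')
    return make_pipeline(selector, model)

# Each base model gets its own column subset; the meta-model combines their predictions
stack = StackingRegressor(
    estimators=[
        ('rf', on_columns(['f0', 'f1', 'f2', 'f3'],
                          RandomForestRegressor(n_estimators=30, random_state=1))),
        ('ridge', on_columns(['f0', 'f3'], Ridge())),
    ],
    final_estimator=ElasticNetCV(),
    cv=KFold(n_splits=5, shuffle=True, random_state=1),
)
stack.fit(X_toy, y_toy)
stack_pred = stack.predict(X_toy)
```

The stacked model is still fit and called with the full DataFrame; the column selection happens inside each base pipeline.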

108:	learn: 0.6262067	total: 8.88s	remaining: 52.2s
109:	learn: 0.6244891	total: 8.95s	remaining: 52s
110:	learn: 0.6230040	total: 9.02s	remaining: 51.9s
111:	learn: 0.6219789	total: 9.09s	remaining: 51.8s
112:	learn: 0.6211712	total: 9.17s	remaining: 51.7s
113:	learn: 0.6200896	total: 9.26s	remaining: 51.6s
114:	learn: 0.6191313	total: 9.35s	remaining: 51.6s
115:	learn: 0.6175697	total: 9.44s	remaining: 51.6s
116:	learn: 0.6164765	total: 9.52s	remaining: 51.5s
117:	learn: 0.6154228	total: 9.61s	remaining: 51.5s
118:	learn: 0.6141259	total: 9.68s	remaining: 51.3s
119:	learn: 0.6125897	total: 9.77s	remaining: 51.3s
120:	learn: 0.6115365	total: 9.85s	remaining: 51.2s
121:	learn: 0.6101256	total: 9.94s	remaining: 51.2s
122:	learn: 0.6

264:	learn: 0.4765964	total: 21s	remaining: 38.3s
265:	learn: 0.4760024	total: 21s	remaining: 38.2s
266:	learn: 0.4753213	total: 21.1s	remaining: 38.2s
267:	learn: 0.4744499	total: 21.2s	remaining: 38.1s
268:	learn: 0.4735999	total: 21.3s	remaining: 38s
269:	learn: 0.4729275	total: 21.3s	remaining: 37.9s
270:	learn: 0.4724705	total: 21.4s	remaining: 37.9s
271:	learn: 0.4717062	total: 21.5s	remaining: 37.8s
272:	learn: 0.4710852	total: 21.6s	remaining: 37.7s
273:	learn: 0.4700083	total: 21.6s	remaining: 37.6s
274:	learn: 0.4690275	total: 21.7s	remaining: 37.5s
275:	learn: 0.4683164	total: 21.8s	remaining: 37.4s
276:	learn: 0.4676311	total: 21.9s	remaining: 37.4s
277:	learn: 0.4673733	total: 21.9s	remaining: 37.3s
278:	learn: 0.4668028	total: 22s	remaining: 37.2s
279:	learn: 0.4661042	total: 22.1s	remaining: 37.1s
280:	learn: 0.4651629	total: 22.2s	remaining: 37s
281:	learn: 0.4640082	total: 22.3s	remaining: 36.9s
282:	learn: 0.4634499	total: 22.3s	remaining: 36.8s
283:	learn: 0.4629729	

424:	learn: 0.3752306	total: 32.5s	remaining: 24.8s
425:	learn: 0.3745760	total: 32.6s	remaining: 24.8s
426:	learn: 0.3737668	total: 32.6s	remaining: 24.7s
427:	learn: 0.3733674	total: 32.7s	remaining: 24.6s
428:	learn: 0.3729274	total: 32.8s	remaining: 24.5s
429:	learn: 0.3723785	total: 32.8s	remaining: 24.4s
430:	learn: 0.3721413	total: 32.9s	remaining: 24.4s
431:	learn: 0.3715910	total: 33s	remaining: 24.3s
432:	learn: 0.3708611	total: 33.1s	remaining: 24.2s
433:	learn: 0.3703658	total: 33.1s	remaining: 24.1s
434:	learn: 0.3697180	total: 33.2s	remaining: 24.1s
435:	learn: 0.3694799	total: 33.3s	remaining: 24s
436:	learn: 0.3689628	total: 33.4s	remaining: 23.9s
437:	learn: 0.3687653	total: 33.4s	remaining: 23.8s
438:	learn: 0.3683763	total: 33.5s	remaining: 23.7s
439:	learn: 0.3678190	total: 33.6s	remaining: 23.7s
440:	learn: 0.3674527	total: 33.6s	remaining: 23.6s
441:	learn: 0.3669006	total: 33.7s	remaining: 23.5s
442:	learn: 0.3664216	total: 33.8s	remaining: 23.4s
443:	learn: 0.36

585:	learn: 0.2960102	total: 44.2s	remaining: 12.4s
586:	learn: 0.2955462	total: 44.3s	remaining: 12.3s
587:	learn: 0.2950018	total: 44.4s	remaining: 12.2s
588:	learn: 0.2945361	total: 44.4s	remaining: 12.1s
589:	learn: 0.2940521	total: 44.5s	remaining: 12.1s
590:	learn: 0.2936788	total: 44.6s	remaining: 12s
591:	learn: 0.2933304	total: 44.7s	remaining: 11.9s
592:	learn: 0.2930864	total: 44.7s	remaining: 11.8s
593:	learn: 0.2926365	total: 44.8s	remaining: 11.8s
594:	learn: 0.2923042	total: 44.9s	remaining: 11.7s
595:	learn: 0.2918961	total: 44.9s	remaining: 11.6s
596:	learn: 0.2914728	total: 45s	remaining: 11.5s
597:	learn: 0.2911413	total: 45.1s	remaining: 11.5s
598:	learn: 0.2906523	total: 45.2s	remaining: 11.4s
599:	learn: 0.2903724	total: 45.2s	remaining: 11.3s
600:	learn: 0.2903032	total: 45.3s	remaining: 11.2s
601:	learn: 0.2899268	total: 45.4s	remaining: 11.2s
602:	learn: 0.2894429	total: 45.4s	remaining: 11.1s
603:	learn: 0.2891620	total: 45.5s	remaining: 11s
604:	learn: 0.2887

744:	learn: 0.2379674	total: 56s	remaining: 376ms
745:	learn: 0.2375097	total: 56s	remaining: 300ms
746:	learn: 0.2372921	total: 56.1s	remaining: 225ms
747:	learn: 0.2372706	total: 56.1s	remaining: 150ms
748:	learn: 0.2371519	total: 56.2s	remaining: 75.1ms
749:	learn: 0.2367815	total: 56.3s	remaining: 0us


StackingRegressor(cv=KFold(n_splits=5, random_state=RandomState(MT19937) at 0x1593962A540,
   shuffle=True),
                  estimators=[('cat',
                               <catboost.core.CatBoostRegressor object at 0x0000015939635D00>),
                              ('lgbm',
                               LGBMRegressor(colsample_bytree=0.6,
                                             learning_rate=0.01, max_depth=9,
                                             n_estimators=1200, num_leaves=15,
                                             objective='regression',
                                             reg_alpha=0, reg_lambda=0))],
                  final_estimator=ElasticNetCV())

In [26]:
en.final_estimator_.coef_

array([0.48790138, 0.55635312])

### MARS (fitted on residuals)

Calculate the ensemble model's residuals on the training data and fit a MARS model to them. Then add the predicted residuals to the CatBoost (not ensemble) predictions.

In [27]:
# Calculate the residuals for training data
resid = log_y - en.predict(X_train)

# Fit a MARS model to the residuals
mars_resid = Earth()
mars_resid.fit(X_train, resid)

# Use the MARS model to predict the residuals for training data
mars_resid_pred = mars_resid.predict(X_train)

# Combine the CatBoost predictions with the MARS predictions for training data
# combined_pred_train = cat.predict(X_train) + mars_resid_pred

# Use the combined predictions on test data (since we don't have true y_test)
combined_pred_test = cat.predict(X_test) + mars_resid.predict(X_test)



### Predictions

In [51]:
# Use the combined CatBoost + MARS-residual predictions as the final prediction
y_pred = combined_pred_test

# Exponentiate logged y
y_pred = np.exp(y_pred)

# Scale up predictions by 1.30
y_pred = y_pred*1.30

# Clip predictions between 0 and 100
y_pred = np.clip(y_pred,0,100)

# Export to CSV
predictions_regression = pd.DataFrame({"id" : test_id, "y": y_pred})
predictions_regression.to_csv("Predictions_Regression.csv",index = False)
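
The post-processing in the cell above (exponentiate, scale, clip) can be sketched as a single helper, which makes the three steps easier to test on known inputs. This is a minimal sketch: the function name `postprocess_log_predictions` and the demo inputs are hypothetical, while the scale factor 1.30 and the clip range 0 to 100 are taken from the cell above.

```python
import numpy as np

def postprocess_log_predictions(log_pred, scale=1.30, lower=0.0, upper=100.0):
    """Map log-scale predictions back to the original scale, rescale, and clip."""
    pred = np.exp(np.asarray(log_pred, dtype=float))  # undo the log transform
    pred = pred * scale                               # multiplicative adjustment
    return np.clip(pred, lower, upper)                # keep values in [lower, upper]

# Hypothetical log-scale inputs: log(1), log(10), log(200)
example = postprocess_log_predictions([0.0, np.log(10.0), np.log(200.0)])
print(example)
```

The last input shows the effect of clipping: 200 scaled by 1.30 exceeds the upper bound, so it is capped at 100.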

## A.5) Discussion

Please provide details of the models/approaches you attempted but encountered challenges or unfavorable outcomes. If feasible, kindly explain the reasons behind their ineffectiveness or lack of success. Additionally, highlight the significant challenges or issues you encountered during the process.

We attempted to create models using the following methods: Lasso, neural networks, XGBoost, AdaBoost, and MARS, as well as various ensembles that combined them.

We had some initial breakthroughs ensembling Lasso with CatBoost and LightGBM. However, our best eventual models did not include Lasso. We were somewhat surprised by the lack of success of neural networks and XGBoost. The neural network results were especially frustrating because we learned about this method in STAT-362 this quarter. We tried various tools to make the network more complex, given that neural networks have proven to be an industry standard. However, none of our neural network attempts achieved an RMSE below 10. The same was true of XGBoost. We found it surprising that LightGBM was more effective alongside CatBoost, considering that LightGBM is much less computationally expensive than XGBoost.

Another issue we encountered was deciding which features to keep. Originally, we attempted to reduce dimensionality by eliminating features that were highly correlated with each other, thinking this would address multicollinearity. However, this turned out to be less productive than training our models (at least initially) on the entire dataset. We also thought PCA, another concept we learned in STAT-362, might be worth our time. However, we again found that it limited the success of our models.
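
A correlation-based filter of the kind described above can be sketched as follows. This is an illustration under assumptions, not the code we ran: the helper name `drop_highly_correlated`, the 0.9 threshold, and the synthetic demo columns are all hypothetical.

```python
import numpy as np
import pandas as pd

def drop_highly_correlated(X, threshold=0.9):
    """Drop one column from each pair whose absolute correlation exceeds threshold."""
    corr = X.corr().abs()
    # Keep only the strict upper triangle so each pair is considered once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop), to_drop

# Synthetic demo data (hypothetical): "a_copy" nearly duplicates "a"
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({
    "a": a,
    "a_copy": a + rng.normal(scale=0.01, size=200),
    "b": rng.normal(size=200),  # independent feature
})
reduced, dropped = drop_highly_correlated(demo, threshold=0.9)
print(dropped)
```

A filter like this keeps the first column of each correlated pair and drops the later one, so the result depends on column order. Tree-based models are also generally less sensitive to correlated inputs than linear models, which may be part of why the filter did not help here.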

One last issue that we should point out is the reproducibility of our code. When going through various iterations, it was difficult to keep track of which model iterations were more successful. While we did our best to keep track of our progress, we made some mistakes in setting the seed for our model and data preparation at a few stages. As a result, some of our best-performing models were built on models that did not have a random_state set. When running the code in this report, you might get slightly different results from ours on Kaggle. This is due to the random_state issue. We can assure you that the code presented is exactly how we created our best model(s), and that the variance in results comes purely from the random_state (or lack thereof) at the various steps of model creation.
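
One way to avoid this kind of seed drift is to pass a single integer seed to every component that uses randomness, including the cross-validation splitter. The sketch below uses scikit-learn stand-ins (`GradientBoostingRegressor`, `Ridge`) and synthetic data rather than our actual CatBoost and LightGBM models; the value of `SEED` and the helper `fit_and_predict` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import ElasticNetCV, Ridge
from sklearn.model_selection import KFold

SEED = 42  # hypothetical value; any fixed integer works

# Synthetic data standing in for the real training set
X, y = make_regression(n_samples=200, n_features=8, noise=5.0, random_state=SEED)

def fit_and_predict(seed):
    """Fit a small stacked model in which every random component shares one seed."""
    cv = KFold(n_splits=5, shuffle=True, random_state=seed)  # integer, not a RandomState object
    stack = StackingRegressor(
        estimators=[
            ("gbr", GradientBoostingRegressor(n_estimators=50, subsample=0.8,
                                              random_state=seed)),
            ("ridge", Ridge()),
        ],
        final_estimator=ElasticNetCV(cv=3),
        cv=cv,
    )
    stack.fit(X, y)
    return stack.predict(X[:5])

run_1 = fit_and_predict(SEED)
run_2 = fit_and_predict(SEED)
print(np.allclose(run_1, run_2))
```

An integer seed in `KFold` yields the same folds on every call, whereas a shared `RandomState` object advances each time it is used, so repeated runs in one session can differ. CatBoost and LightGBM expose their own seed parameters, which would need to be set in the same way.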

## A.6) Conclusion

* Do you feel that you gained valuable experience, skills, and/or knowledge? If yes, please explain what they were. If no, please explain.
* What are things you liked/disliked about the project and/or work on the project?

We thought that this was an incredible experience, as we were exposed to various types of models and forced to think critically in order to improve our performance. Collaborating within the group was also a great experience, considering that most engineering jobs require strong collaboration skills. It was also incredibly rewarding to see our submissions reach the top 5 best scores, which included beating Krish's score.

However, there were certainly some frustrating aspects of the project. A lot of the time, we felt like we were spinning our wheels, unable to improve our scores, which were towards the bottom of the leaderboard at the time. It was also incredibly difficult to keep track of which iterations of models and feature selection we had attempted. Finally, it was frustrating that the two models that worked best for us weren't ones we had learned in any of our classes. We figured that we would have at least been exposed to the models that would provide the best shot at a good score.

## Please make sure your GitHub repo has all the code, and ensure that your code is capable of reproducing the outcomes you have submitted. It is important to avoid any form of academic misconduct or cheating by using your peer's submission file.