# <center> Bike Sharing Demand Prediction using SVM </center>

___

The objective of an SVM is to find the maximum-margin classifier: the decision boundary that lies as far as possible from the nearest training points of each class. Maximising the margin shrinks the hypothesis space and reduces the effect of high dimensionality and the computation required.

The points that the margin touches are called support vectors. These vectors alone determine the decision boundary, so they are enough to classify all other points.
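
To make the idea of a margin concrete, here is a minimal sketch in plain Python. It does not train an SVM: the hyperplane `w`, `b` and the four toy points are assumptions chosen for illustration, and the code only measures each point's distance to that hyperplane to show which points the margin touches.

```python
import math

# Toy 2-D points with labels +1 / -1 (illustrative, not from the bike dataset).
points = [((2.0, 0.0), 1), ((3.0, 1.0), 1), ((-2.0, 0.0), -1), ((-4.0, 1.0), -1)]

# Assumed separating hyperplane w.x + b = 0; here the vertical line x1 = 0.
w, b = (1.0, 0.0), 0.0


def signed_distance(x, label):
    """Distance of x from the hyperplane, positive when x lies on its label's side."""
    norm = math.hypot(w[0], w[1])
    return label * (w[0] * x[0] + w[1] * x[1] + b) / norm


distances = [signed_distance(x, label) for x, label in points]
margin = min(distances)

# The points that sit exactly on the margin are the support vectors.
support_vectors = [x for (x, _), d in zip(points, distances) if math.isclose(d, margin)]

print("margin:", margin)
print("support vectors:", support_vectors)
```

Only the two points closest to the boundary define the margin here; moving the other two further away would leave the boundary unchanged.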

<img src="svm.png" alt="https://becominghuman.ai/ensemble-learning-bagging-and-boosting-d20f38be9b1e" width="700" height="700">

___

## Bike Sharing Demand Data

We will use [Bike Sharing Demand Dataset](https://www.kaggle.com/c/bike-sharing-demand/data). We are provided hourly rental data spanning two years. The training set is comprised of the first 19 days of each month, while the test set is the 20th to the end of the month. 

| Feature | Description |
| --- | --- |
| datetime | hourly date + timestamp |
| season |  1 = spring, 2 = summer, 3 = fall, 4 = winter |
| holiday | whether the day is considered a holiday |
| workingday | whether the day is neither a weekend nor holiday |
| weather | 1 = Clear, Few clouds, Partly cloudy |
|  | 2 = Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist |
|  | 3 = Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds |
|  | 4 = Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog |
| temp | temperature in Celsius |
| atemp | "feels like" temperature in Celsius |
| humidity | relative humidity |
| windspeed | wind speed |
| casual | number of non-registered user rentals initiated |
| registered | number of registered user rentals initiated |
| count | number of total rentals |

**The goal is to predict the total count of bikes rented during each hour covered by the test set, using only information available prior to the rental period.**
___

## Load the libraries

In [1]:
import pandas as pd

# skip warnings
import warnings
warnings.filterwarnings('ignore')

## Load Data

In [3]:
### Parse the "datetime" column as dates while reading the data (see parse_dates in pd.read_csv).
### Parsing reads each datetime string and stores it as a datetime64 value.

bike_sharing = pd.read_csv('bike-sharing-demand.csv',
                           parse_dates = ["datetime"])

bike_sharing.shape

(10886, 12)

There are a total of 10886 observations in the data set and 12 columns (11 features plus the target `count`).

## Exploratory Data Analysis

Let us take a look at a portion of the data.

In [4]:
bike_sharing.tail()

,datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count
10881,2012-12-19 19:00:00,4,0,1,1,15.58,19.695,50,26.0027,7,329,336
10882,2012-12-19 20:00:00,4,0,1,1,14.76,17.425,57,15.0013,10,231,241
10883,2012-12-19 21:00:00,4,0,1,1,13.94,15.91,61,15.0013,4,164,168
10884,2012-12-19 22:00:00,4,0,1,1,13.94,17.425,61,6.0032,12,117,129
10885,2012-12-19 23:00:00,4,0,1,1,13.12,16.665,66,8.9981,4,84,88


Let us check if all columns are in appropriate data format.

In [5]:
bike_sharing.dtypes

datetime      datetime64[ns]
season                 int64
holiday                int64
workingday             int64
weather                int64
temp                 float64
atemp                float64
humidity               int64
windspeed            float64
casual                 int64
registered             int64
count                  int64
dtype: object

All features have the appropriate data types.

We will extract all time features from datetime field.

In [6]:
### Separate and create new variables for Year, Month, Day, Day Of Week and Hour from the datetime column.

bike_sharing['Year'] = bike_sharing['datetime'].dt.year
bike_sharing['Month'] = bike_sharing['datetime'].dt.month
bike_sharing['Day'] = bike_sharing['datetime'].dt.day
bike_sharing['Day Of Week'] = bike_sharing['datetime'].dt.dayofweek
bike_sharing['Hour'] = bike_sharing['datetime'].dt.hour

bike_sharing.head()

,datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count,Year,Month,Day,Day Of Week,Hour
0,2011-01-01 00:00:00,1,0,0,1,9.84,14.395,81,0.0,3,13,16,2011,1,1,5,0
1,2011-01-01 01:00:00,1,0,0,1,9.02,13.635,80,0.0,8,32,40,2011,1,1,5,1
2,2011-01-01 02:00:00,1,0,0,1,9.02,13.635,80,0.0,5,27,32,2011,1,1,5,2
3,2011-01-01 03:00:00,1,0,0,1,9.84,14.395,75,0.0,3,10,13,2011,1,1,5,3
4,2011-01-01 04:00:00,1,0,0,1,9.84,14.395,75,0.0,0,1,1,2011,1,1,5,4
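
As a cross-check of these features, the same fields can be read off with Python's standard `datetime` module. This is a small sketch for one timestamp taken from the output above, not a replacement for the pandas `.dt` accessors. It also confirms that `Day Of Week` counts from Monday = 0, so 5 is a Saturday.

```python
from datetime import datetime

# First timestamp shown in the notebook output.
ts = datetime.strptime("2011-01-01 00:00:00", "%Y-%m-%d %H:%M:%S")

features = {
    "Year": ts.year,
    "Month": ts.month,
    "Day": ts.day,
    # weekday() uses Monday = 0 ... Sunday = 6, the same convention as pandas dayofweek.
    "Day Of Week": ts.weekday(),
    "Hour": ts.hour,
}

print(features)
```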


We will set the datetime column as the index.

In [7]:
bike_sharing.columns

Index(['datetime', 'season', 'holiday', 'workingday', 'weather', 'temp',
       'atemp', 'humidity', 'windspeed', 'casual', 'registered', 'count',
       'Year', 'Month', 'Day', 'Day Of Week', 'Hour'],
      dtype='object')

In [8]:
## Set datetime column to index

bike_sharing.set_index('datetime', inplace = True)

In [9]:
bike_sharing.columns

Index(['season', 'holiday', 'workingday', 'weather', 'temp', 'atemp',
       'humidity', 'windspeed', 'casual', 'registered', 'count', 'Year',
       'Month', 'Day', 'Day Of Week', 'Hour'],
      dtype='object')

Let us separate our numerical and categorical attributes.

In [10]:
# create a list for all categorical column names
cat_cols = ['season', 'holiday', 'workingday', 'weather']

# create a list for all numerical column names
num_cols = bike_sharing.columns[~bike_sharing.columns.isin(cat_cols)]

# remove target column('count')
num_cols = num_cols.drop('count')

num_cols

Index(['temp', 'atemp', 'humidity', 'windspeed', 'casual', 'registered',
       'Year', 'Month', 'Day', 'Day Of Week', 'Hour'],
      dtype='object')

We will convert all categorical variables to the category type.

In [11]:
### Apply Type conversion 


bike_sharing[cat_cols] = bike_sharing[cat_cols].apply(lambda x : x.astype('category'))

We will check there are no invalid values in any of the categorical variables.

In [12]:
### Check unique values of each categorical column

for x in cat_cols :
    print(x, '----->>', bike_sharing[x].unique(), '\n')

season ----->> [1, 2, 3, 4]
Categories (4, int64): [1, 2, 3, 4] 

holiday ----->> [0, 1]
Categories (2, int64): [0, 1] 

workingday ----->> [0, 1]
Categories (2, int64): [0, 1] 

weather ----->> [1, 2, 3, 4]
Categories (4, int64): [1, 2, 3, 4] 



There are no invalid values.

Let us check numerical data as well.

In [13]:
#numeric data

bike_sharing[num_cols].describe()

,temp,atemp,humidity,windspeed,casual,registered,Year,Month,Day,Day Of Week,Hour
count,10886.0,10886.0,10886.0,10886.0,10886.0,10886.0,10886.0,10886.0,10886.0,10886.0,10886.0
mean,20.23086,23.655084,61.88646,12.799395,36.021955,155.552177,2011.501929,6.521495,9.992559,3.013963,11.541613
std,7.79159,8.474601,19.245033,8.164537,49.960477,151.039033,0.500019,3.444373,5.476608,2.004585,6.915838
min,0.82,0.76,0.0,0.0,0.0,0.0,2011.0,1.0,1.0,0.0,0.0
25%,13.94,16.665,47.0,7.0015,4.0,36.0,2011.0,4.0,5.0,1.0,6.0
50%,20.5,24.24,62.0,12.998,17.0,118.0,2012.0,7.0,10.0,3.0,12.0
75%,26.24,31.06,77.0,16.9979,49.0,222.0,2012.0,10.0,15.0,5.0,18.0
max,41.0,45.455,100.0,56.9969,367.0,886.0,2012.0,12.0,19.0,6.0,23.0


In [14]:
## Check missing values

bike_sharing.isnull().sum()

season         0
holiday        0
workingday     0
weather        0
temp           0
atemp          0
humidity       0
windspeed      0
casual         0
registered     0
count          0
Year           0
Month          0
Day            0
Day Of Week    0
Hour           0
dtype: int64

> The data looks clean. There are no missing values: the null count is 0 for every column, and the `describe()` count above equals the number of rows (10886) for every numeric column.

# Model Building

___

We will separate our independent variables and target variable.

In [15]:
# independent variables
X = bike_sharing.drop('count', axis = 1)

# dependent variable
y = bike_sharing['count']

### Function to prepare data

We will write a function that prepares the data for the steps that follow.

- Train and test split
- Fit the numeric scaler on the training split only

In [16]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

num_scaler = StandardScaler()

def prepare_data(X, y, split_size = 0.3) :
    
    
    ## train test split
    X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                        test_size = split_size)  
    
    ## fit num_scaler
    num_scaler.fit(X_train[num_cols])
    
    print(X_train.shape)
    print(X_test.shape)
    
    return X_train, X_test, y_train, y_test

### Function to pre-process data

We will write a function that pre-processes the data. It will be called to transform both the train and test datasets.

- Scale the numeric features
- Dummify the categorical features

In [17]:
def preprocess_data(data, scale = False) :
    
    # work on a copy so the caller's DataFrame is not modified in place
    data = data.copy()
    
    # scale numeric features
    if scale :
        # transform numeric data using the already fitted num_scaler
        data[num_cols] = num_scaler.transform(data[num_cols])
    
    # dummify categorical features
    data = pd.get_dummies(data, drop_first = False)

    return data
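
The scaling step relies on statistics learned from the training split only. The sketch below spells out that fit-then-transform idea in plain Python. The temperature values are made up, the helper `standardize` is hypothetical, and the population standard deviation is used because that is what `StandardScaler` computes.

```python
import statistics

# Illustrative temperature values (made up for this example).
train_temp = [10.0, 20.0, 30.0]
test_temp = [20.0, 40.0]

# "fit": learn the mean and population standard deviation from the training split only.
mean = statistics.mean(train_temp)
std = statistics.pstdev(train_temp)


def standardize(values):
    # "transform": apply the training statistics to any split.
    return [(v - mean) / std for v in values]


train_scaled = standardize(train_temp)
test_scaled = standardize(test_temp)

print(train_scaled)
print(test_scaled)
```

The test values are scaled with the training mean and deviation, so no information from the test split leaks into the transformation.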

### Function for Model Fit & Predict

We will write a function that performs the following steps.

- Fit the model on train data
- Perform cross-validation when needed
- Predict on train and test data

In [18]:
from sklearn.model_selection import GridSearchCV

# k must be at least 2 for GridSearchCV, so the default is a valid fold count
def model_building(X, y, test, model, params = None, k = 5) :
    
    if params is None :
        
        ## Fit model 
        model.fit(X, y)
        
        # return fitted model & train-test predictions
        return (model, model.predict(X), model.predict(test))
    
    else :
        
        model_cv = GridSearchCV(model, param_grid = params, cv = k)
        
        ## Fit model_cv using k-fold cross-validation
        model_cv.fit(X, y)
        
        ## check best estimator 
        model = model_cv.best_estimator_
        
        print(model_cv.best_estimator_)
        
        # return an extra object (the fitted GridSearchCV) for all cross validation operations
        
        return (model_cv, model, model.predict(X), model.predict(test))
    

### Function to Evaluate Model
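
For reference, both metrics can be written out by hand. This is a minimal plain-Python sketch with hypothetical helper names `mae` and `mse` and made-up predictions, intended only to show what scikit-learn's `mean_absolute_error` and `mean_squared_error` compute.

```python
def mae(y_true, y_pred):
    """Mean absolute error: the average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: the average squared error, which punishes large misses more."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [16, 40, 32, 13]  # first four hourly counts in the data
y_pred = [20, 30, 32, 23]  # made-up predictions for illustration

print("MAE:", mae(y_true, y_pred))
print("MSE:", mse(y_true, y_pred))
```

MAE is in the same unit as the target (rentals per hour), while MSE is in squared units, which is why the two values differ so much in scale.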

In [19]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

def model_evaluation(y_train, pred_train, y_test, pred_test) :
    
    print('''
            =========================================
               MAE and MSE FOR TRAIN DATA
            =========================================''')
    print("Mean Absolute Error : ", mean_absolute_error(y_train, pred_train), 
          "\nMean Squared Error : ", mean_squared_error(y_train, pred_train))
    
    print('''
            =========================================
               MAE and MSE FOR TEST DATA
            =========================================''')
    print("Mean Absolute Error : ", mean_absolute_error(y_test, pred_test), 
          "\nMean Squared Error : ", mean_squared_error(y_test, pred_test))

# Model Building

___


### Let us build a [Support Vector Regressor](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html)

Call the train test split function

In [20]:
## Use prepare_data and get X_train, X_test, y_train, y_test where split size = 0.2

X_train, X_test, y_train, y_test = prepare_data(X, y, 0.2)



(8708, 15)
(2178, 15)


Build SVM Model without scaling

In [21]:
from sklearn.svm import SVR


# Call the pre-process function for both train and test data.
X_train_nsc = preprocess_data(X_train)
X_test_nsc = preprocess_data(X_test)

# Call the model building function
model, pred_train, pred_test = model_building(X_train_nsc, y_train,
                                              X_test_nsc, SVR())

# Call the model evaluation function for both train and test data to view model performance
model_evaluation(y_train, pred_train, y_test, pred_test)


               MAE and MSE FOR TRAIN DATA
Mean Absolute Error :  81.82331204672703 
Mean Squared Error :  12593.867470420071

               MAE and MSE FOR TEST DATA
Mean Absolute Error :  82.82599371633397 
Mean Squared Error :  12806.120227457975


Build SVM Model with scaling

In [22]:
# Call the pre-process function for both train and test data.
X_train = preprocess_data(X_train,scale= True)
X_test = preprocess_data(X_test,scale= True)

# Call the model building function
model, pred_train, pred_test = model_building(X_train, y_train,
                                              X_test, SVR())

# Call the model evaluation function for both train and test data to view model performance
model_evaluation(y_train, pred_train, y_test, pred_test)


               MAE and MSE FOR TRAIN DATA
Mean Absolute Error :  30.636437775236402 
Mean Squared Error :  3868.603004733329

               MAE and MSE FOR TEST DATA
Mean Absolute Error :  31.144750564339866 
Mean Squared Error :  3883.0530287593747


Standardising the numerical data improves the model drastically: the test MAE drops from about 82.8 to about 31.1.
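
One way to see why scaling matters is that the RBF kernel depends on the squared Euclidean distance between rows, so features with a large spread dominate that distance. The sketch below is a plain-Python illustration under stated assumptions: the standard deviations are the rounded values from the `describe()` output, and the two rows being compared are hypothetical, differing by exactly one standard deviation in each feature.

```python
# Standard deviations taken (rounded) from the describe() output.
std = {"temp": 7.79, "registered": 151.04}

# Hypothetical pair of rows that differ by one standard deviation in each feature.
diff_raw = dict(std)
diff_scaled = {name: diff_raw[name] / std[name] for name in std}


def share_of_squared_distance(diff, feature):
    """Fraction of the squared Euclidean distance contributed by one feature."""
    total = sum(d ** 2 for d in diff.values())
    return diff[feature] ** 2 / total


raw_share = share_of_squared_distance(diff_raw, "temp")
scaled_share = share_of_squared_distance(diff_scaled, "temp")

print("temp share before scaling:", raw_share)
print("temp share after scaling:", scaled_share)
```

Before scaling, `temp` contributes well under 1% of the distance and is effectively ignored; after scaling, both features contribute equally.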

Let us now tune the kernel and its parameters. Note that `SVR()` uses the RBF kernel by default, so the two models above are already non-linear.

#### Parameter Tuning in SVM

**C parameter** is a regularization parameter that controls the trade-off between achieving a low training error and a low testing error, that is, the ability of the model to generalize to unseen data.

The C parameter trades off the error on training examples against the simplicity of the fitted function. A low C makes the fitted function smooth, while a high C aims at fitting all training examples closely by giving the model freedom to select more samples as support vectors.
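
For regression, this trade-off can be sketched with the standard linear SVR objective, which adds a flatness term to `C` times the epsilon-insensitive training loss. The code below is a plain-Python illustration, not scikit-learn's solver: the weights and predictions are hypothetical, and `epsilon = 0.1` matches the `SVR` default.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Errors inside the epsilon tube cost nothing; larger errors cost their excess."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


def svr_objective(w, y_true, y_pred, C, epsilon=0.1):
    """0.5 * ||w||^2 + C * sum of epsilon-insensitive losses."""
    flatness = 0.5 * sum(wi ** 2 for wi in w)
    penalty = C * sum(epsilon_insensitive_loss(y_true, y_pred, epsilon))
    return flatness + penalty


w = [2.0, -1.0]                # hypothetical model weights
y_true = [10.0, 20.0, 30.0]
y_pred = [10.05, 18.0, 33.0]   # hypothetical predictions

for C in (0.1, 10.0):
    print(C, svr_objective(w, y_true, y_pred, C))
```

With a small `C` the flatness term dominates the objective, so the optimiser prefers a smooth function; with a large `C` the training errors dominate, so it is pushed to fit the training points closely.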



**Gamma** is the parameter of the RBF kernel (used to handle non-linear relationships).

> $k(x_n, x_m) = \exp\left(-\gamma \lVert x_n - x_m \rVert^2\right)$


> $\gamma = \dfrac{1}{2\sigma^2}$


Intuitively, the gamma parameter defines how far the influence of a single training example reaches, with low values meaning ‘far’ and high values meaning ‘close’. The gamma parameter can be seen as the inverse of the radius of influence of samples selected by the model as support vectors.
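
The effect of gamma can be checked directly from the kernel formula. This is a minimal plain-Python sketch; the helper names `rbf_kernel` and `gamma_from_sigma` and the two sample points are assumptions made for illustration.

```python
import math


def rbf_kernel(x_n, x_m, gamma):
    """k(x_n, x_m) = exp(-gamma * ||x_n - x_m||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_n, x_m))
    return math.exp(-gamma * sq_dist)


def gamma_from_sigma(sigma):
    """gamma = 1 / (2 * sigma^2)."""
    return 1.0 / (2.0 * sigma ** 2)


x_n, x_m = (0.0, 0.0), (1.0, 1.0)  # squared distance between the points is 2

for gamma in (0.01, 1.0, 100.0):
    print(gamma, rbf_kernel(x_n, x_m, gamma))
```

With a low gamma the two points still look similar to the kernel (value close to 1), so each example influences a wide region. With a high gamma the similarity collapses towards 0 and each example only influences its immediate neighbourhood.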

In [23]:
parameters = {'C' : [0.01,0.1], 
              'degree' : [2, 3],
              'kernel' : ['poly']}

# Call the model building function
model_cv, model, pred_train, pred_test = model_building(X_train, y_train, X_test, SVR(), parameters, 10)

# Call the model evaluation function for both train and test data to view model performance
model_evaluation(y_train, pred_train, y_test, pred_test)

SVR(C=0.1, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma='scale',
    kernel='poly', max_iter=-1, shrinking=True, tol=0.001, verbose=False)

               MAE and MSE FOR TRAIN DATA
Mean Absolute Error :  100.38608331916744 
Mean Squared Error :  18552.62865022805

               MAE and MSE FOR TEST DATA
Mean Absolute Error :  102.28883427672325 
Mean Squared Error :  19355.336189968562


In [24]:
parameters = {'C' : [0.01, 0.1], 
              'gamma' : [0.01], 
              'kernel' : ['rbf']}

# Call the model building function
model_cv, model, pred_train, pred_test = model_building(X_train, y_train, X_test, SVR(), parameters, 10)

# Call the model evaluation function for both train and test data to view model performance
model_evaluation(y_train, pred_train, y_test, pred_test)

SVR(C=0.1, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma=0.01,
    kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)

               MAE and MSE FOR TRAIN DATA
Mean Absolute Error :  118.85946103552257 
Mean Squared Error :  27641.34767899835

               MAE and MSE FOR TEST DATA
Mean Absolute Error :  120.88827563275152 
Mean Squared Error :  28348.932585372015


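Model performance deteriorates with these settings. However, both grids only try `C` values of 0.01 and 0.1, well below the default `C = 1.0` used by the scaled `SVR()` above, which is itself an RBF model. The drop therefore most likely reflects the heavier regularization rather than showing that the data is linearly distributed.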

# Stacking

___

Let us build some more models and stack their outputs.

- Linear Regression
- KNN
- Decision Tree

In [25]:
bike_sharing.head()

datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count,Year,Month,Day,Day Of Week,Hour
2011-01-01 00:00:00,1,0,0,1,9.84,14.395,81,0.0,3,13,16,2011,1,1,5,0
2011-01-01 01:00:00,1,0,0,1,9.02,13.635,80,0.0,8,32,40,2011,1,1,5,1
2011-01-01 02:00:00,1,0,0,1,9.02,13.635,80,0.0,5,27,32,2011,1,1,5,2
2011-01-01 03:00:00,1,0,0,1,9.84,14.395,75,0.0,3,10,13,2011,1,1,5,3
2011-01-01 04:00:00,1,0,0,1,9.84,14.395,75,0.0,0,1,1,2011,1,1,5,4


In [26]:
from sklearn.linear_model import LinearRegression, ElasticNet
from sklearn.neighbors import KNeighborsRegressor


In case you have not noticed yet, this dataset contains the casual and registered user counts, which add up to our target feature `count`.  
`count = casual + registered`
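
This identity is easy to verify on the rows already printed by `tail()`. The snippet below is a small plain-Python check on those five rows only; it is a sketch, not a test of the whole dataset.

```python
# (casual, registered, count) copied from the tail() output.
rows = [
    (7, 329, 336),
    (10, 231, 241),
    (4, 164, 168),
    (12, 117, 129),
    (4, 84, 88),
]

all_match = all(casual + registered == count for casual, registered, count in rows)

print(all_match)
```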

We can build a simple linear model with only these two features and see the results.

In [27]:
# linear model
X_train_cr = X_train_nsc[['casual', 'registered']]
X_test_cr = X_test_nsc[['casual', 'registered']]

model_cr, pred_train_cr, pred_test_cr = model_building(X_train_cr, y_train, X_test_cr,
                                                       LinearRegression())

model_evaluation(y_train, pred_train_cr, y_test, pred_test_cr)


               MAE and MSE FOR TRAIN DATA
Mean Absolute Error :  3.4323536714317443e-14 
Mean Squared Error :  3.246857249261681e-27

               MAE and MSE FOR TEST DATA
Mean Absolute Error :  3.650401071104984e-14 
Mean Squared Error :  3.704599378895989e-27


In [28]:
model_cr.coef_

array([1., 1.])

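As we can see, both coefficients are 1. This means `1*casual + 1*registered = count`, which is what we expected.

We will remove one of these features and train various models. Note that `registered` remains in the feature set and is still a component of the target, so the errors below are optimistic compared with a setting where neither column is available, as in the Kaggle test set.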

In [29]:
X_ncr = X.drop(['casual'], axis=1)

X_train_ncr, X_test_ncr, y_train, y_test = train_test_split(X_ncr, y, test_size = 0.4)

X_train_ncr = preprocess_data(X_train_ncr)
X_test_ncr = preprocess_data(X_test_ncr)

In [30]:
# linear model
model_lr, pred_train_lr, pred_test_lr = model_building(X_train_ncr, y_train, X_test_ncr,
                                                       LinearRegression())

model_evaluation(y_train, pred_train_lr, y_test, pred_test_lr)


               MAE and MSE FOR TRAIN DATA
Mean Absolute Error :  23.049990663086493 
Mean Squared Error :  1115.1933149054435

               MAE and MSE FOR TEST DATA
Mean Absolute Error :  22.36155336819624 
Mean Squared Error :  1091.162388753879


### Elastic Net

> $$\min_w \; \frac{1}{2\,n_{\text{samples}}} \lVert y - Xw \rVert_2^2 + \alpha \cdot \text{l1\_ratio} \cdot \lVert w \rVert_1 + \frac{1}{2}\,\alpha\,(1 - \text{l1\_ratio})\,\lVert w \rVert_2^2$$



`l1_ratio` is the mixing ratio between the L1 and L2 penalties, ranging from 0 (ridge) to 1 (lasso).



alpha :
Constant that multiplies the penalty terms. Defaults to 1.0. See the notes for the exact mathematical meaning of this parameter. `alpha = 0` is equivalent to ordinary least squares, solved by the LinearRegression object. For numerical reasons, using `alpha = 0` with the Lasso object is not advised; use the LinearRegression object instead.

l1_ratio : 
The ElasticNet mixing parameter, with 0 <= l1_ratio <= 1. For l1_ratio = 0 the penalty is an L2 penalty. For l1_ratio = 1 it is an L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.
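
The roles of `alpha` and `l1_ratio` can be seen by evaluating the objective by hand. The sketch below is plain Python with a hypothetical helper `elastic_net_objective`; the residuals and weights are made up, and nothing is being optimised here.

```python
def elastic_net_objective(residuals, w, alpha, l1_ratio):
    """Data-fit term plus the mixed L1/L2 penalty of the Elastic Net objective."""
    n_samples = len(residuals)
    data_fit = sum(r ** 2 for r in residuals) / (2 * n_samples)
    l1 = sum(abs(wi) for wi in w)
    l2_squared = sum(wi ** 2 for wi in w)
    return data_fit + alpha * l1_ratio * l1 + 0.5 * alpha * (1 - l1_ratio) * l2_squared


residuals = [1.0, -1.0, 2.0, 0.0]  # hypothetical values of y - Xw
w = [3.0, -4.0]                    # hypothetical coefficients

for l1_ratio in (0, 0.5, 1):
    print(l1_ratio, elastic_net_objective(residuals, w, alpha=0.2, l1_ratio=l1_ratio))
```

With `l1_ratio = 1` only the L1 term is active (lasso), with `l1_ratio = 0` only the squared L2 term is active (ridge), and values in between blend the two.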

In [31]:
# ElasticNet model
parameters={'alpha' : [0.2, 0.5],'l1_ratio' : [0, 0.5, 1]}

model_elr_cv, model_elr, pred_train_elr, pred_test_elr = model_building(X_train_ncr, y_train,
                                                                        X_test_ncr, 
                                                                        ElasticNet(), 
                                                                        params=parameters,k=10)

model_evaluation(y_train, pred_train_elr, 
                 y_test, pred_test_elr)

ElasticNet(alpha=0.2, copy_X=True, fit_intercept=True, l1_ratio=1,
           max_iter=1000, normalize=False, positive=False, precompute=False,
           random_state=None, selection='cyclic', tol=0.0001, warm_start=False)

               MAE and MSE FOR TRAIN DATA
Mean Absolute Error :  22.962134113567807 
Mean Squared Error :  1118.5610506612582

               MAE and MSE FOR TEST DATA
Mean Absolute Error :  22.269895910128398 
Mean Squared Error :  1094.2662953347515


### KNN model

**p** :
    - Power parameter for the Minkowski metric. When p = 1, this is
      equivalent to using manhattan_distance (l1), and euclidean_distance
      (l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.


**weights** :
    - 'uniform' : uniform weights.  All points in each neighborhood
      are weighted equally.
    - 'distance' : weight points by the inverse of their distance.
      in this case, closer neighbors of a query point will have a
      greater influence than neighbors which are further away.
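
Both parameters can be illustrated with a tiny hand-written regressor. This is a minimal plain-Python sketch, not scikit-learn's implementation: the helpers `minkowski` and `knn_predict` and the toy data are assumptions, and a query that coincides with a training point is given that point's target, which mirrors the usual convention for inverse-distance weights.

```python
def minkowski(a, b, p):
    """Minkowski distance: p = 1 is Manhattan, p = 2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def knn_predict(query, X, y, k, p=2, weights="uniform"):
    # Keep the k training points closest to the query as (distance, target) pairs.
    neighbours = sorted((minkowski(query, x, p), t) for x, t in zip(X, y))[:k]
    if weights == "uniform":
        return sum(t for _, t in neighbours) / k
    # weights == "distance": an exact match decides the prediction on its own.
    exact = [t for d, t in neighbours if d == 0]
    if exact:
        return sum(exact) / len(exact)
    inverse = [(1 / d, t) for d, t in neighbours]
    return sum(wt * t for wt, t in inverse) / sum(wt for wt, _ in inverse)


X = [(0, 0), (1, 0), (0, 2)]  # toy feature rows
y = [10, 20, 40]              # toy targets

print(minkowski((0, 0), (3, 4), 1), minkowski((0, 0), (3, 4), 2))
print(knn_predict((0, 0), X, y, k=3, weights="uniform"))
print(knn_predict((0, 0), X, y, k=3, weights="distance"))
```

Predicting a training point with `weights="distance"` returns that point's own target, which is why a distance-weighted KNN model can show zero error on its training data.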

In [32]:
##model building

model_knn_cv, model_knn, pred_train_knn, pred_test_knn = model_building(X_train_ncr, y_train,
                                                                        X_test_ncr, 
                                                                        KNeighborsRegressor(), 
                                                                        {'n_neighbors' : [4, 5, 6], 
                                                                        'weights' : ['uniform', 
                                                                                     'distance'],
                                                                        'p' : [1, 2]}, 10)

## Evaluation

model_evaluation(y_train, pred_train_knn,
                 y_test, pred_test_knn)

KNeighborsRegressor(algorithm='auto', leaf_size=30, metric='minkowski',
                    metric_params=None, n_jobs=None, n_neighbors=6, p=1,
                    weights='distance')

               MAE and MSE FOR TRAIN DATA
Mean Absolute Error :  0.0 
Mean Squared Error :  0.0

               MAE and MSE FOR TEST DATA
Mean Absolute Error :  14.37229114509106 
Mean Squared Error :  654.7279348104759


In [33]:
## Decision Tree
from sklearn.tree import DecisionTreeRegressor


## Model 
model_dt_cv, model_dt, pred_train_dt, pred_test_dt = model_building(X_train_ncr, y_train,
                                                                    X_test_ncr, 
                                                       DecisionTreeRegressor(),
                                                       {'max_depth' : [7, 8, 9, 10, 12]}, 10)

## Evaluation

model_evaluation(y_train, pred_train_dt, y_test, pred_test_dt)

DecisionTreeRegressor(ccp_alpha=0.0, criterion='mse', max_depth=9,
                      max_features=None, max_leaf_nodes=None,
                      min_impurity_decrease=0.0, min_impurity_split=None,
                      min_samples_leaf=1, min_samples_split=2,
                      min_weight_fraction_leaf=0.0, presort='deprecated',
                      random_state=None, splitter='best')

               MAE and MSE FOR TRAIN DATA
Mean Absolute Error :  9.323044793552373 
Mean Squared Error :  245.9314390674715

               MAE and MSE FOR TEST DATA
Mean Absolute Error :  13.637142645796814 
Mean Squared Error :  547.5675109994015


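Although KNN and the decision tree look overfitted, because their train error is far below their test error, the grid search selected these settings by cross-validation: the other candidates in each grid, including shallower trees, scored worse on the held-out folds. For KNN, the zero train error is expected, because with `weights='distance'` each training point is its own nearest neighbour at distance zero. Among the candidates tried, these are the best models for this data.

Let us average these regressors' predictions to build a simple stacked model.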

In [34]:
pred_test_stack = (pred_test_lr + pred_test_elr + pred_test_knn + pred_test_dt)/4

print("Mean Absolute Error : ", mean_absolute_error(y_test, pred_test_stack), 
      "\nMean Squared Error : ", mean_squared_error(y_test, pred_test_stack))

Mean Absolute Error :  15.539996400253484 
Mean Squared Error :  598.5034886909848


As the linear models did not give good results, we will use only the KNN and decision tree predictions.

In [35]:
pred_test_stack = (pred_test_knn + pred_test_dt)/2

print("Mean Absolute Error : ", mean_absolute_error(y_test, pred_test_stack), 
      "\nMean Squared Error : ", mean_squared_error(y_test, pred_test_stack))

Mean Absolute Error :  12.206868722590444 
Mean Squared Error :  449.40331875272864


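The model that averages KNN and the decision tree reduces the test error considerably: MAE falls to about 12.2, compared with about 13.6 for the decision tree and 14.4 for KNN on their own.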
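
Averaging helps when the individual models make errors that partly cancel out. The sketch below shows this with made-up numbers in plain Python; the two prediction lists are hypothetical and the helper `mae` is defined locally, so the values say nothing about the bike data itself.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [100, 200, 300]
pred_a = [110, 190, 320]  # hypothetical model A
pred_b = [95, 215, 290]   # hypothetical model B

# Simple averaging of the two models' predictions.
pred_avg = [(a + b) / 2 for a, b in zip(pred_a, pred_b)]

print("model A :", mae(y_true, pred_a))
print("model B :", mae(y_true, pred_b))
print("average :", mae(y_true, pred_avg))
```

In this toy case the two models err in opposite directions on every row, so the average beats both. When the models make similar mistakes, averaging brings little or no gain.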

___

In [36]:
import pickle

# Save to file in the current working directory
pkl_filename = "pickle_model_DT.pkl"


with open(pkl_filename, 'wb') as file:
    pickle.dump(model_dt, file)

# Load from file
with open(pkl_filename, 'rb') as file:
    pickle_model = pickle.load(file)
    

pickle_model

DecisionTreeRegressor(ccp_alpha=0.0, criterion='mse', max_depth=9,
                      max_features=None, max_leaf_nodes=None,
                      min_impurity_decrease=0.0, min_impurity_split=None,
                      min_samples_leaf=1, min_samples_split=2,
                      min_weight_fraction_leaf=0.0, presort='deprecated',
                      random_state=None, splitter='best')

In [37]:
test_preds_dt=pickle_model.predict(X_test_ncr)



In [38]:
test_preds_dt.shape

(4355,)

In [39]:
y_test.shape

(4355,)