# Task 10: Benchmark Top ML Algorithms

This task tests your ability to use different ML algorithms when solving a specific problem.


### Dataset
Predict Loan Eligibility for Dream Housing Finance company

Dream Housing Finance company deals in all kinds of home loans and has a presence across urban, semi-urban, and rural areas. A customer first applies for a home loan, and the company then validates the customer's eligibility.

The company wants to automate the loan eligibility process in real time, based on the details customers provide when filling in the online application form. These details include Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History, and others. To automate this process, the company has provided a dataset for identifying the customer segments that are eligible for a loan, so that it can specifically target these customers.

Train: https://raw.githubusercontent.com/subashgandyer/datasets/main/loan_train.csv

Test: https://raw.githubusercontent.com/subashgandyer/datasets/main/loan_test.csv

## Task Requirements
### Build classification models using the following ML algorithms
- Decision Tree
- KNN
- Logistic Regression
- SVM
- Random Forest
- Any other algorithm of your choice

### Use GridSearchCV to find the best model with the best hyperparameters

- Build models
- Create a parameter grid
- Run GridSearchCV
- Choose the best model with the best hyperparameters
- Report the best accuracy
- Also benchmark the best accuracy you could get for every classification algorithm listed above
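
The steps above can be sketched end to end as follows. This is a minimal illustration rather than the assignment's solution: it assumes synthetic data from `make_classification` in place of the loan data, and the two small parameter grids and the names `candidates`, `results`, and `best_name` are arbitrary choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data (assumption, for illustration only)
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=0)

# Build models and create one parameter grid per model
candidates = {
    "DT": (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4]}),
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [3, 7]}),
}

# Run GridSearchCV for each model and record its best score and hyperparameters
results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, param_grid=grid, cv=5, scoring="accuracy")
    search.fit(X_demo, y_demo)
    results[name] = (search.best_score_, search.best_params_)

# Choose the best model overall by cross-validated accuracy
best_name = max(results, key=lambda k: results[k][0])
print(best_name, results[best_name])
```

The `results` dictionary corresponds to Table 1 (one entry per algorithm) and `best_name` to Table 2.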

#### Your final output will be something like this:
- Best algorithm accuracy
- Best hyperparameter accuracy for every algorithm

**Table 1 (Algorithm wise best model with best hyperparameter)**

| Algorithm | Accuracy | Hyperparameters |
|-----------|----------|-----------------|
| DT        |          |                 |
| KNN       |          |                 |
| LR        |          |                 |
| SVM       |          |                 |
| RF        |          |                 |
| Any other |          |                 |

**Table 2 (Best overall)**

| Algorithm | Accuracy | Hyperparameters |
|-----------|----------|-----------------|



### Submission
- Submit the notebook containing all executed code with its outputs saved
- Submit a document containing the two tables above

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as matplot
import seaborn as sns
%matplotlib inline

In [3]:
train=pd.read_csv("loan_train.csv")
train.head()

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001002,Male,No,0,Graduate,No,5849,0.0,,360.0,1.0,Urban,Y
1,LP001003,Male,Yes,1,Graduate,No,4583,1508.0,128.0,360.0,1.0,Rural,N
2,LP001005,Male,Yes,0,Graduate,Yes,3000,0.0,66.0,360.0,1.0,Urban,Y
3,LP001006,Male,Yes,0,Not Graduate,No,2583,2358.0,120.0,360.0,1.0,Urban,Y
4,LP001008,Male,No,0,Graduate,No,6000,0.0,141.0,360.0,1.0,Urban,Y


In [4]:
test=pd.read_csv("loan_test.csv")
test.head()

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area
0,LP001015,Male,Yes,0,Graduate,No,5720,0,110.0,360.0,1.0,Urban
1,LP001022,Male,Yes,1,Graduate,No,3076,1500,126.0,360.0,1.0,Urban
2,LP001031,Male,Yes,2,Graduate,No,5000,1800,208.0,360.0,1.0,Urban
3,LP001035,Male,Yes,2,Graduate,No,2340,2546,100.0,360.0,,Urban
4,LP001051,Male,No,0,Not Graduate,No,3276,0,78.0,360.0,1.0,Urban


In [5]:
train_temp=train.copy()
test_temp=test.copy()

In [6]:
train.columns

Index(['Loan_ID', 'Gender', 'Married', 'Dependents', 'Education',
       'Self_Employed', 'ApplicantIncome', 'CoapplicantIncome', 'LoanAmount',
       'Loan_Amount_Term', 'Credit_History', 'Property_Area', 'Loan_Status'],
      dtype='object')

In [7]:
test.columns

Index(['Loan_ID', 'Gender', 'Married', 'Dependents', 'Education',
       'Self_Employed', 'ApplicantIncome', 'CoapplicantIncome', 'LoanAmount',
       'Loan_Amount_Term', 'Credit_History', 'Property_Area'],
      dtype='object')

We are predicting `Loan_Status`, the only column that appears in the training set but not in the test set.

In [8]:
train.isna().sum()

Loan_ID               0
Gender               13
Married               3
Dependents           15
Education             0
Self_Employed        32
ApplicantIncome       0
CoapplicantIncome     0
LoanAmount           22
Loan_Amount_Term     14
Credit_History       50
Property_Area         0
Loan_Status           0
dtype: int64

In [9]:
test.isna().sum()

Loan_ID               0
Gender               11
Married               0
Dependents           10
Education             0
Self_Employed        23
ApplicantIncome       0
CoapplicantIncome     0
LoanAmount            5
Loan_Amount_Term      6
Credit_History       29
Property_Area         0
dtype: int64

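Impute the missing values: use the mode for categorical and discrete columns and the median for `LoanAmount`. The test set is filled with statistics computed on the training set.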

In [10]:
# Fill categorical/discrete gaps with the training-set mode.
# Assigning the result back avoids inplace=True on a column selection.
for col in ['Gender', 'Married', 'Dependents', 'Self_Employed', 'Credit_History']:
    train[col] = train[col].fillna(train[col].mode()[0])

# Test: reuse the training-set statistics so both sets are imputed the same way
for col in ['Gender', 'Dependents', 'Self_Employed', 'Credit_History', 'Loan_Amount_Term']:
    test[col] = test[col].fillna(train[col].mode()[0])
test['LoanAmount'] = test['LoanAmount'].fillna(train['LoanAmount'].median())

In [11]:
train['Loan_Amount_Term'].value_counts()

Loan_Amount_Term
360.0    512
180.0     44
480.0     15
300.0     13
240.0      4
84.0       4
120.0      3
60.0       2
36.0       2
12.0       1
Name: count, dtype: int64

In [12]:
# 360 months is by far the most common term, so fill gaps with the mode
train['Loan_Amount_Term'] = train['Loan_Amount_Term'].fillna(train['Loan_Amount_Term'].mode()[0])

In [13]:
# LoanAmount is continuous, so fill gaps with the median
train['LoanAmount'] = train['LoanAmount'].fillna(train['LoanAmount'].median())
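
The imputation cells above all follow one rule: compute the fill value on the training set and apply it to both sets. A minimal sketch of that rule as a single function is shown below. The helper name `impute_from_train` and the toy frames are hypothetical and are not part of the notebook.

```python
import numpy as np
import pandas as pd

def impute_from_train(train_df, test_df, mode_cols, median_cols):
    """Hypothetical helper: fill gaps in both frames using train_df statistics only."""
    fill_values = {col: train_df[col].mode()[0] for col in mode_cols}
    fill_values.update({col: train_df[col].median() for col in median_cols})
    # DataFrame.fillna accepts a {column: value} mapping and returns new frames
    return train_df.fillna(fill_values), test_df.fillna(fill_values)

# Toy frames standing in for train/test (assumption, not the loan data)
toy_train = pd.DataFrame({"Gender": ["Male", "Male", np.nan, "Female"],
                          "LoanAmount": [100.0, np.nan, 120.0, 200.0]})
toy_test = pd.DataFrame({"Gender": [np.nan, "Female"],
                         "LoanAmount": [np.nan, 90.0]})

toy_train_filled, toy_test_filled = impute_from_train(
    toy_train, toy_test, mode_cols=["Gender"], median_cols=["LoanAmount"])
print(toy_test_filled)
```

Computing the statistics on the training frame only keeps information from the test set out of the preprocessing step.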

In [14]:
train.isnull().sum()

Loan_ID              0
Gender               0
Married              0
Dependents           0
Education            0
Self_Employed        0
ApplicantIncome      0
CoapplicantIncome    0
LoanAmount           0
Loan_Amount_Term     0
Credit_History       0
Property_Area        0
Loan_Status          0
dtype: int64

In [15]:
train = train.drop('Loan_ID',axis=1)
test = test.drop('Loan_ID',axis=1)

In [16]:

# Encode the target as integers: Y -> 1 (approved), N -> 0 (rejected)
train['Loan_Status'] = train['Loan_Status'].map({'Y': 1, 'N': 0})

In [17]:
X = train.drop('Loan_Status',axis=1)
y = train['Loan_Status']

In [18]:
y.head()

0    1
1    0
2    1
3    1
4    1
Name: Loan_Status, dtype: int64

In [19]:
X = pd.get_dummies(X)
train = pd.get_dummies(train)
test = pd.get_dummies(test)
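
Because `X` and `test` are one-hot encoded separately, their columns only match if every category occurs in both frames. A hedged sketch of one way to guard against a mismatch is shown below, using `DataFrame.reindex` on toy frames; the frame names and values are made up for illustration.

```python
import pandas as pd

# Toy frames (assumption): the test frame lacks two categories seen in training
toy_X = pd.DataFrame({"Property_Area": ["Urban", "Rural", "Semiurban"],
                      "ApplicantIncome": [5849, 4583, 3000]})
toy_test = pd.DataFrame({"Property_Area": ["Urban", "Urban"],
                         "ApplicantIncome": [5720, 3076]})

toy_X_enc = pd.get_dummies(toy_X)
toy_test_enc = pd.get_dummies(toy_test)

# Give the test frame exactly the training columns, in the same order;
# categories absent from the test set become all-zero columns
toy_test_aligned = toy_test_enc.reindex(columns=toy_X_enc.columns, fill_value=0)
print(list(toy_test_aligned.columns))
```

A model fitted on `toy_X_enc` can then predict on `toy_test_aligned` without a feature-name or feature-count mismatch.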

In [20]:
from sklearn.model_selection import train_test_split
x_train, x_cv, y_train, y_cv = train_test_split(X,y, test_size =0.3)
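
The split above draws a different random partition on every run, so the accuracies that follow are not exactly reproducible. As a sketch only, the example below shows the `random_state` and `stratify` arguments of `train_test_split` on synthetic data; the seed value 42 and the 70/30 class balance are arbitrary assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (assumption): 70 positives, 30 negatives
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(100, 3))
y_toy = np.array([1] * 70 + [0] * 30)

# random_state fixes the partition; stratify keeps the class ratio in both parts
X_tr, X_val, y_tr, y_val = train_test_split(
    X_toy, y_toy, test_size=0.3, random_state=42, stratify=y_toy)
print(y_tr.mean(), y_val.mean())
```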

In [21]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

LR = LogisticRegression()
LR.fit(x_train,y_train)

In [22]:
pred_cv = LR.predict(x_cv)

accuracy_score(y_cv,pred_cv)

0.8054054054054054

In [23]:
pred_test = LR.predict(test)

Next, tune each algorithm listed in the task requirements with GridSearchCV.

In [24]:
from sklearn.model_selection import RepeatedStratifiedKFold
rskf = RepeatedStratifiedKFold(n_splits=5, random_state=1)
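
`RepeatedStratifiedKFold` repeats the 5-fold split several times, and `n_repeats` defaults to 10 when it is not passed. The short check below shows how many fits each hyperparameter candidate therefore costs; the variable names are chosen for this example.

```python
from sklearn.model_selection import RepeatedStratifiedKFold

# Same settings as above; n_repeats is left at its default of 10
rskf_demo = RepeatedStratifiedKFold(n_splits=5, random_state=1)

# Number of train/validation partitions evaluated per candidate
n_fits_per_candidate = rskf_demo.get_n_splits()
print(n_fits_per_candidate)  # 5 splits x 10 repeats = 50
```

With 50 fits per candidate, a grid of 90 combinations requires 4500 fits, which matches the total reported in the decision tree log further down.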

In [25]:
from sklearn.model_selection import GridSearchCV

In [26]:
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier(random_state=1)


In [155]:
paramgrid = {'max_depth': list(range(1, 20, 2)), 'n_estimators': list(range(1, 200, 20))}
grid_search=GridSearchCV(rfc,cv=rskf,param_grid = paramgrid)

In [156]:
grid_search.fit(x_train,y_train)

In [157]:
rfc_df = pd.concat([pd.DataFrame(grid_search.cv_results_["params"]),pd.DataFrame(grid_search.cv_results_["mean_test_score"], columns=["Accuracy"])],axis=1)

In [158]:
best_rfc_df = rfc_df.sort_values(by=['Accuracy'], ascending=False)
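
Sorting `cv_results_` gives the full ranking, but the top entry is also available directly on the fitted search object. The sketch below illustrates `best_params_`, `best_score_`, and `best_estimator_` on synthetic data with a deliberately tiny grid; both are assumptions made to keep the example fast.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data and a small grid (assumptions for illustration)
X_demo, y_demo = make_classification(n_samples=150, n_features=6, random_state=0)
demo_search = GridSearchCV(
    RandomForestClassifier(random_state=1),
    param_grid={"max_depth": [1, 3], "n_estimators": [5, 25]},
    cv=3,
)
demo_search.fit(X_demo, y_demo)

print(demo_search.best_params_)            # hyperparameters of the top-ranked candidate
print(demo_search.best_score_)             # its mean cross-validated accuracy
best_model = demo_search.best_estimator_   # refit on all data passed to fit()
```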

In [141]:
paramgrid = {
    'penalty': ['l1', 'l2'],
    'C': [0.1, 1, 10, 100]
}
grid_search2=GridSearchCV(LR,cv=rskf,param_grid = paramgrid)

In [142]:
grid_search2.fit(x_train,y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
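
The warning above suggests two remedies: raise `max_iter` or scale the data. A minimal sketch that applies both inside a `Pipeline`, so that the scaler is refit within every cross-validation fold, is shown below. The synthetic data, the inflated feature, and `max_iter=1000` are assumptions for illustration, not values taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=0)
# Inflate one feature's scale to mimic income columns sitting next to 0/1 dummies
X_demo[:, 0] *= 10_000

scaled_lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# make_pipeline names each step after its lowercased class,
# hence the "logisticregression__" prefix on the parameter name
lr_search = GridSearchCV(scaled_lr, {"logisticregression__C": [0.1, 1, 10]}, cv=5)
lr_search.fit(X_demo, y_demo)
print(lr_search.best_params_, lr_search.best_score_)
```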

In [145]:
LR_df = pd.concat([pd.DataFrame(grid_search2.cv_results_["params"]),pd.DataFrame(grid_search2.cv_results_["mean_test_score"], columns=["Accuracy"])],axis=1)


In [146]:
best_lr_df = LR_df.sort_values(by=['Accuracy'], ascending=False)

In [147]:
best_lr_df

Unnamed: 0,C,penalty,Accuracy
5,10.0,l2,0.802813
3,1.0,l2,0.79957
7,100.0,l2,0.796525
1,0.1,l2,0.770854
0,0.1,l1,
2,1.0,l1,
4,10.0,l1,
6,100.0,l1,
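
The empty `Accuracy` cells for `l1` are consistent with every `l1` fit failing: `LogisticRegression()` uses the `lbfgs` solver by default, and that solver does not support the `l1` penalty. One possible fix, sketched here on synthetic data, is to pass a list of grids so that each penalty is paired with a solver that supports it. The choice of `liblinear` for `l1` and `max_iter=1000` are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=0)

# Each dict is searched separately, so l1 is never combined with lbfgs
solver_aware_grid = [
    {"penalty": ["l1"], "solver": ["liblinear"], "C": [0.1, 1, 10, 100]},
    {"penalty": ["l2"], "solver": ["lbfgs"], "C": [0.1, 1, 10, 100]},
]
l1_l2_search = GridSearchCV(LogisticRegression(max_iter=1000), solver_aware_grid, cv=5)
l1_l2_search.fit(X_demo, y_demo)

scores = l1_l2_search.cv_results_["mean_test_score"]
print(np.isnan(scores).any())
```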


In [39]:
from sklearn.tree import DecisionTreeClassifier
dt = DecisionTreeClassifier()
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier()
from sklearn.svm import SVC
svc = SVC()

In [40]:
paramgrid = {'max_features': ['auto', 'sqrt', 'log2'],
              'ccp_alpha': [0.1, .01, .001],
              'max_depth' : [5, 6, 7, 8, 9],
              'criterion' :['gini', 'entropy']
             }
grid_search3=GridSearchCV(dt,cv=rskf,param_grid = paramgrid)

In [41]:
grid_search3.fit(x_train,y_train)

1500 fits failed out of a total of 4500.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
1500 fits failed with the following error:
Traceback (most recent call last):
  File "c:\Users\chunl\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\model_selection\_validation.py", line 732, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "c:\Users\chunl\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\base.py", line 1144, in wrapper
    estimator._validate_params()
  File "c:\Users\chunl\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\base.py", line 637, in _validate_params
    validate_parameter_constraints(
  File "c:\Users\chunl\AppData\Local\Programs\Python\Python31

In [42]:
dt_df = pd.concat([pd.DataFrame(grid_search3.cv_results_["params"]),pd.DataFrame(grid_search3.cv_results_["mean_test_score"], columns=["Accuracy"])],axis=1)

best_dt_df = dt_df.sort_values(by=['Accuracy'], ascending=False)

In [43]:
best_dt_df

Unnamed: 0,ccp_alpha,criterion,max_depth,max_features,Accuracy
41,0.010,gini,8,log2,0.772358
43,0.010,gini,9,sqrt,0.770192
56,0.010,entropy,8,log2,0.769746
49,0.010,entropy,6,sqrt,0.769250
37,0.010,gini,7,sqrt,0.767198
...,...,...,...,...,...
75,0.001,entropy,5,auto,
78,0.001,entropy,6,auto,
81,0.001,entropy,7,auto,
84,0.001,entropy,8,auto,
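
The failure count lines up with the grid: 90 combinations times 50 folds gives 4500 fits, and the 30 combinations that use `max_features='auto'` account for the 1500 failures and the empty `Accuracy` cells. This is consistent with `'auto'` being rejected by parameter validation in the installed scikit-learn version, as the `validate_parameter_constraints` frame in the traceback suggests. A sketch of a grid restricted to values that are accepted follows; the synthetic data, the reduced `max_depth` list, and `cv=3` are assumptions to keep it fast.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=0)

valid_dt_grid = {
    "max_features": [None, "sqrt", "log2"],  # None means "consider all features"
    "ccp_alpha": [0.1, 0.01, 0.001],
    "max_depth": [5, 7, 9],
    "criterion": ["gini", "entropy"],
}
dt_search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    valid_dt_grid,
    cv=3,
    error_score="raise",  # fail loudly instead of silently recording NaN
)
dt_search.fit(X_demo, y_demo)
dt_scores = dt_search.cv_results_["mean_test_score"]
print(len(dt_scores), np.isnan(dt_scores).any())
```

Setting `error_score="raise"` is the debugging option that the log message itself recommends.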


In [67]:

grid_search4=GridSearchCV(knn,cv=rskf,param_grid = {'n_neighbors': [1 , 5, 10],
                                                    'weights':['uniform', 'distance'],
                                                    'algorithm':['auto', 'ball_tree', 'kd_tree', 'brute']
                                                            })

In [68]:
# Note: this search is fit on the full X, y, whereas the other searches use x_train, y_train
grid_search4.fit(X,y)

Traceback (most recent call last):
  File "c:\Users\chunl\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\model_selection\_validation.py", line 813, in _score
    scores = scorer(estimator, X_test, y_test)
  File "c:\Users\chunl\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\metrics\_scorer.py", line 527, in __call__
    return estimator.score(*args, **kwargs)
  File "c:\Users\chunl\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\base.py", line 705, in score
    return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
  File "c:\Users\chunl\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\neighbors\_classification.py", line 246, in predict
    if self._fit_method == "brute" and ArgKminClassMode.is_usable_for(
  File "c:\Users\chunl\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\metrics\_pairwise_distances_reduction\_dispatcher.py", line 471, in is_usable_for
    ArgKmin.is_usable_fo

In [69]:
knn_df = pd.concat([pd.DataFrame(grid_search4.cv_results_["params"]),pd.DataFrame(grid_search4.cv_results_["mean_test_score"], columns=["Accuracy"])],axis=1)

best_knn_df = knn_df.sort_values(by=['Accuracy'], ascending=False)

In [70]:
best_knn_df

Unnamed: 0,algorithm,n_neighbors,weights,Accuracy
23,brute,10,distance,0.634388
5,auto,10,distance,0.634388
11,ball_tree,10,distance,0.634388
17,kd_tree,10,distance,0.634388
16,kd_tree,10,uniform,0.632591
10,ball_tree,10,uniform,0.632591
8,ball_tree,5,uniform,0.627534
14,kd_tree,5,uniform,0.627534
21,brute,5,distance,0.612576
9,ball_tree,5,distance,0.612576
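
In the table above, rows that differ only in `algorithm` have identical accuracy. That is expected, because `algorithm` selects how the nearest neighbors are searched for, not which neighbors count. The sketch below checks this on synthetic data (an assumption) by comparing predictions across three search methods; the name `agreement` is introduced only for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X_demo, y_demo = make_classification(n_samples=300, n_features=6, random_state=0)
X_fit, y_fit, X_new = X_demo[:200], y_demo[:200], X_demo[200:]

# Same k and weights, different neighbor-search strategies
preds = {
    algo: KNeighborsClassifier(n_neighbors=5, algorithm=algo).fit(X_fit, y_fit).predict(X_new)
    for algo in ["brute", "ball_tree", "kd_tree"]
}

# Fraction of new points on which brute force and the KD-tree agree
agreement = np.mean(preds["brute"] == preds["kd_tree"])
print(agreement)
```

If accuracy is the only concern, `algorithm` can probably be dropped from the grid, which cuts the number of candidates from 24 to 6.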


In [74]:
grid_search5=GridSearchCV(svc,cv=rskf,param_grid = {'C': [ 0.5, 1, 10], 
         'kernel': ['rbf'],
           'class_weight': [None, 'balanced']})

In [75]:
grid_search5.fit(x_train,y_train)

In [76]:
svc_df = pd.concat([pd.DataFrame(grid_search5.cv_results_["params"]),pd.DataFrame(grid_search5.cv_results_["mean_test_score"], columns=["Accuracy"])],axis=1)

best_svc_df = svc_df.sort_values(by=['Accuracy'], ascending=False)

In [86]:
best_dt_df.head()


Unnamed: 0,ccp_alpha,criterion,max_depth,max_features,Accuracy
41,0.01,gini,8,log2,0.772358
43,0.01,gini,9,sqrt,0.770192
56,0.01,entropy,8,log2,0.769746
49,0.01,entropy,6,sqrt,0.76925
37,0.01,gini,7,sqrt,0.767198


In [97]:
dt_para = best_dt_df

In [99]:
dt_para.info()

<class 'pandas.core.frame.DataFrame'>
Index: 90 entries, 41 to 87
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   ccp_alpha     90 non-null     float64
 1   criterion     90 non-null     object 
 2   max_depth     90 non-null     int64  
 3   max_features  90 non-null     object 
 4   Accuracy      60 non-null     float64
dtypes: float64(2), int64(1), object(2)
memory usage: 4.2+ KB


In [102]:
DAcc = dt_para[['Accuracy']]
dt_para = dt_para.drop(columns=['Accuracy'])

In [117]:
DAcc

Unnamed: 0,Accuracy
41,0.772358
43,0.770192
56,0.769746
49,0.769250
37,0.767198
...,...
75,
78,
81,
84,


In [107]:
dt_para =dt_para.astype(str)

In [108]:
dt_para['Parameters'] = dt_para.apply(lambda x: ''.join(x), axis=1)

In [112]:
# Keep only the combined Parameters column
dt_para = dt_para.drop(columns=['ccp_alpha', 'criterion', 'max_depth', 'max_features'])


In [118]:
best_dt = pd.concat([dt_para,DAcc],axis=1)

In [119]:
best_dt

Unnamed: 0,Parameters,Accuracy
41,0.01gini8log2,0.772358
43,0.01gini9sqrt,0.770192
56,0.01entropy8log2,0.769746
49,0.01entropy6sqrt,0.769250
37,0.01gini7sqrt,0.767198
...,...,...
75,0.001entropy5auto,
78,0.001entropy6auto,
81,0.001entropy7auto,
84,0.001entropy8auto,
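
Joining the values with an empty string produces labels such as `0.01gini8log2`, where the parameter names are lost and adjacent numbers can run together. A possible alternative is sketched below: a hypothetical helper, `summarize_search`, that builds a `name=value` label for each candidate and returns the same three columns used in the final tables. The synthetic data and the small grid are assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

def summarize_search(search, algo_name):
    """Hypothetical helper: one row per candidate, best accuracy first."""
    labels = [
        ", ".join(f"{key}={value}" for key, value in params.items())
        for params in search.cv_results_["params"]
    ]
    summary_df = pd.DataFrame({
        "Parameters": labels,
        "Accuracy": search.cv_results_["mean_test_score"],
        "algor": algo_name,
    })
    return summary_df.sort_values("Accuracy", ascending=False)

X_demo, y_demo = make_classification(n_samples=150, n_features=6, random_state=0)
demo = GridSearchCV(DecisionTreeClassifier(random_state=0),
                    {"max_depth": [2, 4], "criterion": ["gini", "entropy"]}, cv=3)
demo.fit(X_demo, y_demo)

summary = summarize_search(demo, "DecisionTree")
print(summary.head())
```

Calling the same helper once per fitted search would replace the repeated split, cast, join, and drop cells used for each algorithm.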


In [121]:
DAcc = best_knn_df[['Accuracy']]
best_knn_df = best_knn_df.drop(columns=['Accuracy'])

In [125]:
best_knn_df =best_knn_df.astype(str)

In [126]:
best_knn_df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 24 entries, 23 to 22
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   algorithm    24 non-null     object
 1   n_neighbors  24 non-null     object
 2   weights      24 non-null     object
dtypes: object(3)
memory usage: 768.0+ bytes


In [127]:
best_knn_df['Parameters'] = best_knn_df.apply(lambda x: ''.join(x), axis=1)

In [128]:
best_knn_df.head()

Unnamed: 0,algorithm,n_neighbors,weights,Parameters
23,brute,10,distance,brute10distance
5,auto,10,distance,auto10distance
11,ball_tree,10,distance,ball_tree10distance
17,kd_tree,10,distance,kd_tree10distance
16,kd_tree,10,uniform,kd_tree10uniform


In [129]:
# Keep only the combined Parameters column
best_knn_df = best_knn_df.drop(columns=['algorithm', 'n_neighbors', 'weights'])


In [130]:
best_knn = pd.concat([best_knn_df,DAcc],axis=1)

In [132]:
best_knn.head()

Unnamed: 0,Parameters,Accuracy
23,brute10distance,0.634388
5,auto10distance,0.634388
11,ball_tree10distance,0.634388
17,kd_tree10distance,0.634388
16,kd_tree10uniform,0.632591


In [148]:
DAcc = best_lr_df[['Accuracy']]
best_lr_df = best_lr_df.drop(columns=['Accuracy'])

In [149]:
best_lr_df = best_lr_df.astype(str)

In [150]:
best_lr_df['Parameters'] = best_lr_df.apply(lambda x: ''.join(x), axis=1)

In [151]:
# Keep only the combined Parameters column
best_lr_df = best_lr_df.drop(columns=['C', 'penalty'])


In [152]:
best_lr = pd.concat([best_lr_df,DAcc],axis=1)

In [159]:
DAcc = best_rfc_df[['Accuracy']]
best_rfc_df = best_rfc_df.drop(columns=['Accuracy'])

In [160]:
best_rfc_df = best_rfc_df.astype(str)

In [161]:
best_rfc_df['Parameters'] = best_rfc_df.apply(lambda x: ''.join(x), axis=1)

In [164]:
# Keep only the combined Parameters column
best_rfc_df = best_rfc_df.drop(columns=['max_depth', 'n_estimators'])

In [165]:
best_rfc = pd.concat([best_rfc_df,DAcc],axis=1)

In [None]:

best_svc_df

In [166]:
DAcc = best_svc_df[['Accuracy']]
best_svc_df = best_svc_df.drop(columns=['Accuracy'])

In [167]:
best_svc_df2 = best_svc_df.astype(str)

In [168]:
best_svc_df2['Parameters'] = best_svc_df2.apply(lambda x: ''.join(x), axis=1)

In [170]:
# Keep only the combined Parameters column
best_svc_df2 = best_svc_df2.drop(columns=['class_weight', 'kernel', 'C'])

In [172]:
best_svc = pd.concat([best_svc_df2,DAcc],axis=1)

In [173]:
best_svc

Unnamed: 0,Parameters,Accuracy
2,1.0Nonerbf,0.700944
4,10.0Nonerbf,0.700473
0,0.5Nonerbf,0.699316
1,0.5balancedrbf,0.63571
3,1.0balancedrbf,0.626386
5,10.0balancedrbf,0.601412


In [174]:
best_svc['algor'] = 'SVC'

In [175]:
best_svc

Unnamed: 0,Parameters,Accuracy,algor
2,1.0Nonerbf,0.700944,SVC
4,10.0Nonerbf,0.700473,SVC
0,0.5Nonerbf,0.699316,SVC
1,0.5balancedrbf,0.63571,SVC
3,1.0balancedrbf,0.626386,SVC
5,10.0balancedrbf,0.601412,SVC


In [176]:
best_dt['algor']='decistionTree'
best_knn['algor']='KNN'
best_lr['algor']="Log Regression"
best_rfc['algor']="RandomForest"

In [177]:
table = pd.concat([best_dt,best_rfc,best_knn,best_lr,best_svc])

In [178]:
table

Unnamed: 0,Parameters,Accuracy,algor
41,0.01gini8log2,0.772358,decistionTree
43,0.01gini9sqrt,0.770192,decistionTree
56,0.01entropy8log2,0.769746,decistionTree
49,0.01entropy6sqrt,0.769250,decistionTree
37,0.01gini7sqrt,0.767198,decistionTree
...,...,...,...
4,10.0Nonerbf,0.700473,SVC
0,0.5Nonerbf,0.699316,SVC
1,0.5balancedrbf,0.635710,SVC
3,1.0balancedrbf,0.626386,SVC


In [179]:
table2 = table.sort_values(by=['Accuracy'], ascending=False)



In [209]:
pd.set_option("display.max_rows", None)
table2.head(228)

Unnamed: 0,Parameters,Accuracy,algor
17,3141,0.810306,RandomForest
18,3161,0.810301,RandomForest
19,3181,0.810066,RandomForest
16,3121,0.809127,RandomForest
27,5141,0.807729,RandomForest
26,5121,0.807491,RandomForest
29,5181,0.807261,RandomForest
28,5161,0.807029,RandomForest
25,5101,0.806561,RandomForest
12,341,0.80655,RandomForest


In [None]:
table

In [194]:
# Start the summary with the top-ranked decision tree row
best_df = best_dt.iloc[[0]].reset_index(drop=True)


In [197]:
row1= best_rfc.iloc[0]
row2 = best_svc.iloc[0]
row3 = best_lr.iloc[0]
row4 = best_knn.iloc[0]

In [195]:
best_df

Unnamed: 0,Parameters,Accuracy,algor
0,0.01gini8log2,0.772358,decistionTree


In [198]:
best_df = pd.concat([best_df, pd.DataFrame([row1])], ignore_index=True)

In [199]:
best_df

Unnamed: 0,Parameters,Accuracy,algor
0,0.01gini8log2,0.772358,decistionTree
1,3141,0.810306,RandomForest


In [200]:
best_df = pd.concat([best_df, pd.DataFrame([row2])], ignore_index=True)

In [201]:
best_df = pd.concat([best_df, pd.DataFrame([row3])], ignore_index=True)

In [202]:
best_df = pd.concat([best_df, pd.DataFrame([row4])], ignore_index=True)

In [203]:
best_df

Unnamed: 0,Parameters,Accuracy,algor
0,0.01gini8log2,0.772358,decistionTree
1,3141,0.810306,RandomForest
2,1.0Nonerbf,0.700944,SVC
3,10.0l2,0.802813,Log Regression
4,brute10distance,0.634388,KNN


In [204]:
table1 = best_df.copy()

In [205]:
table1

Unnamed: 0,Parameters,Accuracy,algor
0,0.01gini8log2,0.772358,decistionTree
1,3141,0.810306,RandomForest
2,1.0Nonerbf,0.700944,SVC
3,10.0l2,0.802813,Log Regression
4,brute10distance,0.634388,KNN
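
Table 2 (best overall) is the row of Table 1 with the highest accuracy. The sketch below rebuilds Table 1 from the values printed above, under the hypothetical name `table1_demo`, and selects that row. Given the random forest grid, the label `3141` reads as `max_depth=3` with `n_estimators=141`.

```python
import pandas as pd

# Values copied from the table1 output above
table1_demo = pd.DataFrame({
    "Parameters": ["0.01gini8log2", "3141", "1.0Nonerbf", "10.0l2", "brute10distance"],
    "Accuracy": [0.772358, 0.810306, 0.700944, 0.802813, 0.634388],
    "algor": ["decistionTree", "RandomForest", "SVC", "Log Regression", "KNN"],
})

# Table 2: the single row with the highest cross-validated accuracy
table2_demo = table1_demo.loc[[table1_demo["Accuracy"].idxmax()]]
print(table2_demo)
```

In the notebook itself, the same selection could be applied to `table1` directly.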
