**Load Libraries**

In [6]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.datasets import load_digits

**Load Digits Dataset**

In [8]:
digits = load_digits()
digits.data

array([[ 0.,  0.,  5., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ..., 10.,  0.,  0.],
       [ 0.,  0.,  0., ..., 16.,  9.,  0.],
       ...,
       [ 0.,  0.,  1., ...,  6.,  0.,  0.],
       [ 0.,  0.,  2., ..., 12.,  0.,  0.],
       [ 0.,  0., 10., ..., 12.,  1.,  0.]])

In [9]:
# convert digits dataset to pandas dataframe
digits_df = pd.DataFrame(digits.data, columns=digits.feature_names)

In [10]:
# Append target column to dataframe
digits_df['target'] = digits.target
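As an aside, the same table can be built in one step. This is a minimal sketch, assuming scikit-learn 0.23 or later (where `load_digits` accepts `as_frame=True`); the name `digits_frame` is only illustrative:

```python
from sklearn.datasets import load_digits

# as_frame=True returns the data as pandas objects; .frame holds the
# 64 pixel columns plus a 'target' column, like digits_df above.
digits_frame = load_digits(as_frame=True).frame

print(digits_frame.shape)
print(digits_frame.columns[-1])
```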

In [11]:
print(digits_df.head())

   pixel_0_0  pixel_0_1  pixel_0_2  pixel_0_3  pixel_0_4  pixel_0_5  \
0        0.0        0.0        5.0       13.0        9.0        1.0   
1        0.0        0.0        0.0       12.0       13.0        5.0   
2        0.0        0.0        0.0        4.0       15.0       12.0   
3        0.0        0.0        7.0       15.0       13.0        1.0   
4        0.0        0.0        0.0        1.0       11.0        0.0   

   pixel_0_6  pixel_0_7  pixel_1_0  pixel_1_1  ...  pixel_6_7  pixel_7_0  \
0        0.0        0.0        0.0        0.0  ...        0.0        0.0   
1        0.0        0.0        0.0        0.0  ...        0.0        0.0   
2        0.0        0.0        0.0        0.0  ...        0.0        0.0   
3        0.0        0.0        0.0        8.0  ...        0.0        0.0   
4        0.0        0.0        0.0        0.0  ...        0.0        0.0   

   pixel_7_1  pixel_7_2  pixel_7_3  pixel_7_4  pixel_7_5  pixel_7_6  \
0        0.0        6.0       13.0       10.0

**Data Exploration**

In [13]:
digits_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1797 entries, 0 to 1796
Data columns (total 65 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   pixel_0_0  1797 non-null   float64
 1   pixel_0_1  1797 non-null   float64
 2   pixel_0_2  1797 non-null   float64
 3   pixel_0_3  1797 non-null   float64
 4   pixel_0_4  1797 non-null   float64
 5   pixel_0_5  1797 non-null   float64
 6   pixel_0_6  1797 non-null   float64
 7   pixel_0_7  1797 non-null   float64
 8   pixel_1_0  1797 non-null   float64
 9   pixel_1_1  1797 non-null   float64
 10  pixel_1_2  1797 non-null   float64
 11  pixel_1_3  1797 non-null   float64
 12  pixel_1_4  1797 non-null   float64
 13  pixel_1_5  1797 non-null   float64
 14  pixel_1_6  1797 non-null   float64
 15  pixel_1_7  1797 non-null   float64
 16  pixel_2_0  1797 non-null   float64
 17  pixel_2_1  1797 non-null   float64
 18  pixel_2_2  1797 non-null   float64
 19  pixel_2_3  1797 non-null   float64
 20  pixel_2_

The digits dataset has 1797 samples and 65 columns: 64 feature columns of float data type and 1 
target column of int data type. There are no missing values in any column. The dataset takes up about 905.6 KB of memory.
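
The figures quoted above can be checked directly rather than read off the `.info()` printout. The sketch below rebuilds the DataFrame so that it runs on its own; `n_missing` and `size_kb` are illustrative names, and the exact memory figure may vary with the platform (for example, with the integer width of the target column):

```python
import pandas as pd
from sklearn.datasets import load_digits

digits = load_digits()
digits_df = pd.DataFrame(digits.data, columns=digits.feature_names)
digits_df['target'] = digits.target

# Rows and columns: 1797 samples, 64 pixel features + 1 target.
print(digits_df.shape)

# Total number of missing values across all columns.
n_missing = int(digits_df.isna().sum().sum())
print(n_missing)

# Memory footprint in KB (index included), comparable to the .info() figure.
size_kb = digits_df.memory_usage().sum() / 1024
print(f'{size_kb:.1f} KB')
```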

Every feature in the digits dataset is a **pixel intensity from an 8x8 image**. For this kind of dataset, the summary statistics produced by the `.describe()`
function say little about the digits themselves, so running it **is of no use here**. There is also no need to standardize the features, since they already share the same scale.
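
To see why each row is better read as an image than as a table of numbers, a row can be reshaped into its 8x8 grid. This is a minimal sketch, assuming the flattened row is stored row by row (which is what the `pixel_<row>_<col>` column names suggest); `first_image` is an illustrative name:

```python
from sklearn.datasets import load_digits

digits = load_digits()

# Each sample is a flattened 8x8 image: 64 values reshaped row by row.
first_image = digits.data[0].reshape(8, 8)
print(first_image.astype(int))
print('label:', digits.target[0])

# All features share one intensity range, hence the uniform scale.
print('min:', digits.data.min(), 'max:', digits.data.max())
```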

In [16]:
# digits_df.describe()

In [17]:
# The dataset features have slightly varied scales across pixel intensity values. We need to normalize them for consistent scaling across the images before feeding our dataset to
# the machine learning models. For this, we use StandardScaler in sklearn.

# **Standardization**

In [18]:
from sklearn.preprocessing import StandardScaler

We should split our dataset into features and target. For this we have:

In [20]:
X = digits_df[digits.feature_names].copy()
y = digits_df['target'].copy()

In [21]:
# Now we create our StandardScaler instance, fit it on the features of our dataset, and scale the feature **values** of the dataset

In [22]:
# scaler = StandardScaler()
# scaler.fit(X)
# X_scaled = scaler.transform(X)  # pass the same DataFrame used in fit to avoid a feature-names warning

In [23]:
# print(X_scaled[0])

In [24]:
# The output reveals that the features of the digit dataset are standardized and ready for modeling.

Now we split our dataset into training and test sets using the train_test_split function in sklearn.model_selection. We use a 70% train_size and set our 
random_state to 20.

**Data Modelling**

- **Model Training**
- **Model Testing** (**Predict** y)

In [26]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=.7, random_state=20)

Let's confirm that the split matches the requested proportions

In [28]:
print(f'Train Size: {round(len(X_train) / len(X) * 100)}% \nTest Size: {round(len(X_test) / len(X) * 100)}%')

Train Size: 70% 
Test Size: 30%
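
Percentages hide the actual counts, and they say nothing about whether every digit is represented in both parts. The sketch below repeats the same split and inspects both; it reloads the data so that it runs on its own, and `train_counts` and `test_counts` are illustrative names:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=.7, random_state=20)

# Absolute sizes of the two parts.
print(len(X_train), len(X_test))

# Samples per digit (index = digit 0..9) in each part.
train_counts = np.bincount(y_train, minlength=10)
test_counts = np.bincount(y_test, minlength=10)
print(train_counts)
print(test_counts)
```

If the split is reproduced exactly, the test counts should line up with the `support` column of the classification reports further down.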


Now, for our modelling, we choose the same three models that we used for our wine dataset, so that we can validate those results on a second dataset, and train each one with our train dataset:

- **LogisticRegression**
- **Decision Tree**
- **Support Vector Machine**

For this, we import each algorithm from sklearn

In [30]:
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

Beginning with the first model, we instantiate it as

In [32]:
logistic_regression = LogisticRegression()

Now we train it

In [34]:
logistic_regression.fit(X_train, y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
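
The message above is a convergence warning: the solver hit its iteration limit before converging. It names two remedies, raising `max_iter` or scaling the data. One way to try both is sketched below; the pipeline, the value `max_iter=1000` and the name `scaled_log_reg` are assumptions for illustration, not the setup used for the results in this notebook:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=.7, random_state=20)

# The scaler is fitted on the training data only, then the same
# transformation is applied whenever the pipeline predicts.
scaled_log_reg = make_pipeline(StandardScaler(),
                               LogisticRegression(max_iter=1000))
scaled_log_reg.fit(X_train, y_train)

print(f'Test accuracy: {scaled_log_reg.score(X_test, y_test):.2f}')
```

Wrapping the scaler in a pipeline keeps the test data out of the scaling statistics, which is the usual reason to prefer it over scaling the whole dataset up front.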


Now we use our trained model to predict y using the test dataset

In [36]:
log_reg_pred_y = logistic_regression.predict(X_test)
print(log_reg_pred_y)

[0 7 9 5 8 1 3 3 7 0 9 4 7 4 0 3 1 8 1 3 7 8 4 6 1 0 1 0 5 4 7 1 6 7 8 4 3
 7 4 0 5 9 0 4 8 7 4 3 6 3 9 2 2 5 7 3 7 8 3 3 6 6 8 6 8 5 0 5 3 5 0 7 3 2
 9 9 3 0 2 5 5 9 2 4 5 9 7 7 2 3 0 4 6 1 9 7 1 9 8 3 1 6 7 8 1 8 4 0 1 3 6
 9 5 5 1 6 0 6 2 8 9 4 1 3 4 0 6 7 7 5 8 7 8 2 4 2 5 2 3 8 7 9 8 0 0 6 2 6
 9 0 9 0 0 8 7 5 3 4 0 5 6 2 6 0 4 8 7 9 2 4 3 6 4 4 5 2 8 0 7 7 3 2 2 9 0
 7 2 1 6 7 9 1 5 1 6 4 6 1 3 6 1 0 8 6 5 8 8 9 1 5 1 2 6 7 5 0 1 2 4 7 0 9
 6 4 7 6 5 1 2 5 5 4 6 1 7 6 1 8 9 6 2 8 5 8 3 3 9 0 3 7 9 9 1 7 0 0 5 7 3
 6 3 8 6 3 6 9 8 3 3 4 4 0 7 1 5 6 1 4 1 4 5 6 3 7 4 5 2 5 6 4 0 4 2 2 3 3
 2 9 0 9 7 4 9 8 6 3 2 4 9 4 2 4 4 8 7 0 0 1 6 3 6 8 5 6 1 3 8 3 6 3 8 6 6
 7 7 6 4 6 4 4 1 7 9 0 1 7 4 8 5 5 2 5 3 1 6 3 7 0 0 8 0 8 8 7 1 9 4 9 2 1
 5 8 1 0 6 2 5 5 9 5 7 9 1 1 8 8 1 4 3 4 6 2 9 6 5 0 1 0 9 2 5 5 7 4 0 0 3
 4 7 5 1 3 9 3 8 6 6 9 1 1 5 2 7 0 4 4 1 8 9 1 5 4 1 6 5 4 6 2 9 9 9 0 1 6
 6 1 5 5 3 9 9 9 3 8 3 8 8 1 9 1 3 1 3 9 1 7 7 1 8 4 3 1 9 8 3 4 6 2 7 2 3
 3 3 3 0 3 7 7 3 2 3 3 5 

Moving on to the second model, we instantiate it

In [38]:
svm = SVC()

We train it with our train dataset

In [40]:
svm.fit(X_train, y_train)

Now we use our trained model to predict y using the test dataset

In [42]:
svm_pred_y = svm.predict(X_test)
print(svm_pred_y)

[0 7 9 5 8 1 3 3 7 0 9 4 7 4 0 1 1 8 1 3 7 8 4 6 1 0 1 0 5 4 7 1 6 7 8 4 3
 7 4 0 5 9 0 4 8 7 4 3 6 3 9 2 2 5 7 3 7 8 3 8 6 6 8 6 8 5 0 5 3 5 0 7 3 2
 9 9 3 0 2 5 5 9 2 4 5 9 7 7 2 3 0 4 6 1 9 7 1 9 8 3 4 6 7 8 1 8 4 0 1 3 6
 9 5 5 1 6 0 6 2 8 9 4 1 3 4 0 6 7 7 9 8 7 8 2 4 2 5 2 3 8 8 9 8 0 0 6 2 6
 9 0 9 0 0 9 7 5 3 4 0 5 6 2 6 0 4 8 7 9 2 4 3 6 4 4 5 2 8 0 7 7 3 2 2 9 0
 7 2 1 6 7 9 1 5 1 6 4 6 1 3 6 1 0 8 6 5 8 8 9 1 5 1 2 6 7 5 0 1 2 4 7 0 9
 6 4 7 6 5 1 2 5 5 4 6 1 7 6 1 8 9 6 2 8 5 8 3 3 9 0 3 7 9 9 1 7 0 0 5 7 3
 6 3 8 6 3 6 9 8 3 3 4 4 0 7 1 5 6 1 4 1 4 5 6 3 7 4 5 2 5 6 4 0 4 2 2 3 3
 2 9 0 9 7 4 9 8 6 3 2 4 9 4 2 4 4 8 7 0 0 1 6 3 6 8 5 6 1 3 8 3 6 3 8 6 6
 7 7 6 4 6 8 4 1 7 9 0 1 7 4 8 5 5 2 5 3 4 6 3 7 0 0 8 0 8 8 7 1 9 4 9 2 1
 5 8 1 0 6 2 5 5 7 5 7 9 1 1 8 8 1 4 3 4 6 2 9 6 5 0 1 0 9 2 5 5 7 4 0 0 3
 4 7 5 4 3 9 3 8 6 6 9 1 1 5 2 7 0 4 4 1 5 9 1 5 4 1 6 5 4 6 2 9 9 9 0 1 6
 6 1 5 5 3 9 9 9 3 8 3 8 8 1 9 1 3 1 3 9 1 7 7 1 8 4 7 1 9 8 3 4 6 2 7 2 3
 3 3 3 0 3 7 7 3 2 3 3 5 

We instantiate our third model as

In [44]:
tree = DecisionTreeClassifier()

Next we fit our model to our train dataset as

In [46]:
tree.fit(X_train, y_train)

Now we use our trained model to predict y using the test dataset

In [49]:
tree_pred_y = tree.predict(X_test)
print(tree_pred_y)

[0 7 9 5 8 0 9 3 7 0 9 4 7 4 0 3 1 8 1 3 7 8 4 6 9 0 1 0 5 4 7 1 6 7 8 4 3
 7 4 0 5 9 0 4 8 2 4 3 6 3 3 8 2 5 7 3 7 6 3 8 6 6 0 6 5 5 0 2 3 5 0 4 3 2
 9 9 3 0 2 5 3 9 2 4 5 9 7 9 2 8 0 4 6 9 1 7 1 9 9 3 4 6 7 8 1 8 4 0 1 3 6
 9 5 5 1 4 2 6 2 2 9 1 1 3 4 0 6 7 3 9 5 7 8 2 4 2 5 2 3 8 7 9 8 0 0 6 2 6
 9 0 9 0 0 9 4 5 3 4 0 5 6 2 2 0 4 8 7 9 2 4 3 6 4 4 9 2 8 0 7 7 3 2 2 9 0
 7 2 1 6 7 9 1 5 1 6 4 6 2 7 6 1 0 8 6 5 8 8 9 1 5 1 2 6 7 5 0 1 2 1 7 4 9
 7 4 7 6 5 1 2 5 5 8 6 4 7 6 1 8 9 6 3 8 5 8 3 3 9 0 3 7 5 9 1 7 0 0 5 7 3
 6 3 8 1 3 2 9 8 3 7 4 4 0 7 1 3 6 1 4 1 4 9 4 3 7 8 5 2 5 6 4 0 4 8 8 3 5
 2 9 0 5 7 4 9 8 2 3 2 4 9 4 2 8 4 9 7 0 0 1 6 3 6 8 5 6 1 8 8 3 6 3 8 6 6
 7 7 6 4 6 7 4 1 7 9 0 1 7 4 8 5 5 2 9 3 4 6 3 7 0 0 8 0 8 0 7 1 9 4 3 1 1
 9 8 1 0 6 1 5 5 8 5 7 9 1 8 8 9 9 4 3 4 6 3 9 6 5 0 1 0 9 2 5 5 7 4 0 0 9
 4 7 5 1 8 8 3 8 6 6 9 1 1 4 1 7 9 4 4 1 9 9 7 5 4 1 6 1 4 6 2 9 9 9 0 1 6
 6 1 5 5 3 1 7 9 3 8 3 8 8 1 9 1 3 1 3 9 1 4 7 1 8 5 8 1 9 8 3 4 6 7 4 2 3
 3 3 3 0 8 7 7 3 2 3 3 5 

**Model Evaluation**

Next we compare the three models' performance in predicting y, using metrics such as accuracy, precision, recall and F1-score. This is achieved with classification_report 
in sklearn.metrics as 

In [51]:
from sklearn.metrics import classification_report

For the comparison, we collect the three models' predictions (the three iterables) in a dictionary as

In [53]:
model_preds_y = {'Logistic Regression': log_reg_pred_y,
                 'Support Vector Classifier': svm_pred_y,
                 'Decision Tree Classifier': tree_pred_y}

Now we can print out our classification_report as

In [55]:
for model, predictions_y in model_preds_y.items():
    print(f'Report for {model}:\n')
    print(classification_report(y_test, predictions_y))

Report for Logistic Regression:

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        50
           1       0.93      0.98      0.96        56
           2       1.00      1.00      1.00        44
           3       0.97      0.97      0.97        63
           4       0.98      0.95      0.97        60
           5       0.94      0.98      0.96        51
           6       1.00      0.95      0.97        59
           7       0.98      0.96      0.97        53
           8       0.94      0.96      0.95        52
           9       0.94      0.94      0.94        52

    accuracy                           0.97       540
   macro avg       0.97      0.97      0.97       540
weighted avg       0.97      0.97      0.97       540

Report for Support Vector Classifier:

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        50
           1       0.98      1.00      0.99        56
     

The support vector classifier reported the highest accuracy at 98%, followed by the logistic regression model at 97%. This is in line with our earlier results on the wine dataset, where SVC reached a perfect 100% accuracy, which raised a suspicion that it was over-fitting. 
Having validated the models on this second dataset, where SVC again performs best, we recommend SVC as the model of choice for our datasets.
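
The comparison above relies on a single train/test split per dataset. A complementary check is k-fold cross-validation on the digits data itself. The sketch below is only an illustration: the 5 folds, `max_iter=5000` for logistic regression and `random_state=0` for the tree are choices made here, not settings from this notebook, so its scores will not match the reports above exactly:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

models = {'Logistic Regression': LogisticRegression(max_iter=5000),
          'Support Vector Classifier': SVC(),
          'Decision Tree Classifier': DecisionTreeClassifier(random_state=0)}

# Mean and spread of accuracy over 5 folds for each model.
cv_scores = {}
for name, model in models.items():
    cv_scores[name] = cross_val_score(model, X, y, cv=5)
    print(f'{name}: {cv_scores[name].mean():.3f} '
          f'(+/- {cv_scores[name].std():.3f})')
```

A model whose ranking holds across folds is a safer recommendation than one that wins on a single split.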