# "Vehicles."

### _"Recognizing vehicle type from its silhouette" (Classification task)._

## Table of Contents


## Part 0: Introduction

### Overview
The dataset contains 19 columns and 846 entries describing vehicle silhouettes: 18 numeric shape features plus the target column `Class`.
    
**Metadata:**
    
* **Class (target)** 

* **COMPACTNESS**

* **CIRCULARITY** 

* **DISTANCE_CIRCULARITY** 

* **RADIUS_RATIO** 

* **PR.AXIS_ASPECT_RATIO** 

* **MAX.LENGTH_ASPECT_RATIO**

* **SCATTER_RATIO** 

* **ELONGATEDNESS** 

* **PR.AXIS_RECTANGULARITY** 

* **MAX.LENGTH_RECTANGULARITY** 

* **SCALED_VARIANCE_MAJOR** 

* **SCALED_VARIANCE_MINOR** 

* **SCALED_RADIUS_OF_GYRATION** 

* **SKEWNESS_ABOUT_MAJOR** 

* **SKEWNESS_ABOUT_MINOR** 

* **KURTOSIS_ABOUT_MAJOR** 

* **KURTOSIS_ABOUT_MINOR** 

* **HOLLOWS_RATIO** 


### Questions:
    
Determine the vehicle class from a dataset describing geometric features of vehicle silhouettes extracted from photographs for image recognition (use multi-class classification; check the balance of classes; calculate predictions).


## [Part 1: Import, Load Data](#Part-1:-Import,-Load-Data.)
* ### Import libraries, Read data from ‘.csv’ file

## [Part 2: Exploratory Data Analysis](#Part-2:-Exploratory-Data-Analysis.)
* ### Info, Head, Describe
* ### 'Class' attribute value counts and visualisation
* ### Label encoder for 'Class' attribute
* ### Visualisation of all attributes
* ### Correlation list and plot of each attribute
* ### Drop column 'Class'

## [Part 3: Data Wrangling and Transformation](#Part-3:-Data-Wrangling-and-Transformation.)
* ### StandardScaler
* ### Creating datasets for ML part
* ### 'Train\Test' splitting method

## [Part 4: Machine Learning](#Part-4:-Machine-Learning.)
* ### Build, train and evaluate model
    * #### SVC 
    * #### Classification report
    * #### Confusion Matrix
    * #### Misclassification plot
    * #### Comparison table between Actual 'Class' and Predicted 'Class'

## [Conclusion](#Conclusion.)

## Part 1: Import, Load Data.

* ### Import libraries

In [15]:
# import standard libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from scipy.stats import norm
%matplotlib inline
sns.set()

import sklearn.metrics as metrics
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report 
from sklearn.model_selection import train_test_split 
from sklearn.preprocessing import LabelEncoder, StandardScaler

from sklearn.svm import SVC


from vehicles_helper import *

import warnings
warnings.filterwarnings('ignore')

* ### Read data from ‘.csv’ file

In [2]:
# read data from '.csv' file
dataset = pd.read_csv('vehicles.csv') 

# initialisation of target
target = dataset['Class']

## Part 2: Exploratory Data Analysis.

* ### Info

In [3]:
# print the full summary of the dataset  
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 846 entries, 0 to 845
Data columns (total 19 columns):
 #   Column                     Non-Null Count  Dtype 
---  ------                     --------------  ----- 
 0   COMPACTNESS                846 non-null    int64 
 1   CIRCULARITY                846 non-null    int64 
 2   DISTANCE_CIRCULARITY       846 non-null    int64 
 3   RADIUS_RATIO               846 non-null    int64 
 4   PR.AXIS_ASPECT_RATIO       846 non-null    int64 
 5   MAX.LENGTH_ASPECT_RATIO    846 non-null    int64 
 6   SCATTER_RATIO              846 non-null    int64 
 7   ELONGATEDNESS              846 non-null    int64 
 8   PR.AXIS_RECTANGULARITY     846 non-null    int64 
 9   MAX.LENGTH_RECTANGULARITY  846 non-null    int64 
 10  SCALED_VARIANCE_MAJOR      846 non-null    int64 
 11  SCALED_VARIANCE_MINOR      846 non-null    int64 
 12  SCALED_RADIUS_OF_GYRATION  846 non-null    int64 
 13  SKEWNESS_ABOUT_MAJOR       846 non-null    int64 
 14  SKEWNESS_A

* ### Head

In [4]:
# preview of the first 5 lines of the loaded data 
dataset.head()

| | COMPACTNESS | CIRCULARITY | DISTANCE_CIRCULARITY | RADIUS_RATIO | PR.AXIS_ASPECT_RATIO | MAX.LENGTH_ASPECT_RATIO | SCATTER_RATIO | ELONGATEDNESS | PR.AXIS_RECTANGULARITY | MAX.LENGTH_RECTANGULARITY | SCALED_VARIANCE_MAJOR | SCALED_VARIANCE_MINOR | SCALED_RADIUS_OF_GYRATION | SKEWNESS_ABOUT_MAJOR | SKEWNESS_ABOUT_MINOR | KURTOSIS_ABOUT_MAJOR | KURTOSIS_ABOUT_MINOR | HOLLOWS_RATIO | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 95 | 48 | 83 | 178 | 72 | 10 | 162 | 42 | 20 | 159 | 176 | 379 | 184 | 70 | 6 | 16 | 187 | 197 | van |
| 1 | 91 | 41 | 84 | 141 | 57 | 9 | 149 | 45 | 19 | 143 | 170 | 330 | 158 | 72 | 9 | 14 | 189 | 199 | van |
| 2 | 104 | 50 | 106 | 209 | 66 | 10 | 207 | 32 | 23 | 158 | 223 | 635 | 220 | 73 | 14 | 9 | 188 | 196 | saab |
| 3 | 93 | 41 | 82 | 159 | 63 | 9 | 144 | 46 | 19 | 143 | 160 | 309 | 127 | 63 | 6 | 10 | 199 | 207 | van |
| 4 | 85 | 44 | 70 | 205 | 103 | 52 | 149 | 45 | 19 | 144 | 241 | 325 | 188 | 127 | 9 | 11 | 180 | 183 | bus |


* ### Describe

In [6]:
dataset.describe().round(2)

| | COMPACTNESS | CIRCULARITY | DISTANCE_CIRCULARITY | RADIUS_RATIO | PR.AXIS_ASPECT_RATIO | MAX.LENGTH_ASPECT_RATIO | SCATTER_RATIO | ELONGATEDNESS | PR.AXIS_RECTANGULARITY | MAX.LENGTH_RECTANGULARITY | SCALED_VARIANCE_MAJOR | SCALED_VARIANCE_MINOR | SCALED_RADIUS_OF_GYRATION | SKEWNESS_ABOUT_MAJOR | SKEWNESS_ABOUT_MINOR | KURTOSIS_ABOUT_MAJOR | KURTOSIS_ABOUT_MINOR | HOLLOWS_RATIO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 | 846.0 |
| mean | 93.68 | 44.86 | 82.09 | 168.94 | 61.69 | 8.57 | 168.84 | 40.93 | 20.58 | 148.0 | 188.63 | 439.91 | 174.7 | 72.46 | 6.38 | 12.6 | 188.93 | 195.63 |
| std | 8.23 | 6.17 | 15.77 | 33.47 | 7.89 | 4.6 | 33.24 | 7.81 | 2.59 | 14.52 | 31.39 | 176.69 | 32.55 | 7.49 | 4.92 | 8.93 | 6.16 | 7.44 |
| min | 73.0 | 33.0 | 40.0 | 104.0 | 47.0 | 2.0 | 112.0 | 26.0 | 17.0 | 118.0 | 130.0 | 184.0 | 109.0 | 59.0 | 0.0 | 0.0 | 176.0 | 181.0 |
| 25% | 87.0 | 40.0 | 70.0 | 141.0 | 57.0 | 7.0 | 146.25 | 33.0 | 19.0 | 137.0 | 167.0 | 318.25 | 149.0 | 67.0 | 2.0 | 5.0 | 184.0 | 190.25 |
| 50% | 93.0 | 44.0 | 80.0 | 167.0 | 61.0 | 8.0 | 157.0 | 43.0 | 20.0 | 146.0 | 178.5 | 364.0 | 173.0 | 71.5 | 6.0 | 11.0 | 188.0 | 197.0 |
| 75% | 100.0 | 49.0 | 98.0 | 195.0 | 65.0 | 10.0 | 198.0 | 46.0 | 23.0 | 159.0 | 217.0 | 587.0 | 198.0 | 75.0 | 9.0 | 19.0 | 193.0 | 201.0 |
| max | 119.0 | 59.0 | 112.0 | 333.0 | 138.0 | 55.0 | 265.0 | 61.0 | 29.0 | 188.0 | 320.0 | 1018.0 | 268.0 | 135.0 | 22.0 | 41.0 | 206.0 | 211.0 |


* ### 'Class' attribute value counts and visualisation

In [16]:
# target attribute value counts and visualisation
target_attributes_counts(dataset)

NameError: name 'target_attributes_counts' is not defined

Our dataset is balanced.
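
`target_attributes_counts` comes from `vehicles_helper`, whose source is not shown, so its exact behaviour is unknown. A plain-pandas stand-in for the counting part might look like the sketch below. It runs on the five `Class` values from the `head()` preview only; the variable names are illustrative, and the real notebook would use `dataset['Class']`.

```python
import pandas as pd

# The five 'Class' values from the head() preview; illustrative only.
# In the notebook this would be dataset['Class'] with all 846 rows.
sample_classes = pd.Series(['van', 'van', 'saab', 'van', 'bus'], name='Class')

# Absolute number of rows per class, most frequent first.
class_counts = sample_classes.value_counts()

# Relative frequencies are easier to compare than raw counts.
class_share = sample_classes.value_counts(normalize=True).round(2)

print(class_counts)
print(class_share)
```

On the full dataset, shares that are close to each other across classes would support the balance claim; five rows are far too few to judge it.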

* ### Label encoder for 'Class' attribute

In [23]:
# label encoder for 'Class' attribute
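
One possible way to fill this cell is scikit-learn's `LabelEncoder`, which is already imported above. The sketch below encodes the five labels from the `head()` preview; in the notebook the input would presumably be `dataset['Class']`.

```python
from sklearn.preprocessing import LabelEncoder

# Labels taken from the head() preview; illustrative only.
classes = ['van', 'van', 'saab', 'van', 'bus']

encoder = LabelEncoder()
# fit_transform learns the sorted set of labels and maps each to an integer.
encoded = encoder.fit_transform(classes)

# classes_ holds the labels in the order of their integer codes.
print(list(encoder.classes_))
print(encoded.tolist())

# inverse_transform maps the integers back to the original strings,
# which is useful when reporting predictions later on.
decoded = encoder.inverse_transform(encoded)
```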


* ### Visualisation of all attributes

In [24]:
# visualisation of all attributes


* ###  Correlation list and plot of each attribute

In [25]:
# correlation list and plot of each attribute
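
A correlation list can be built from `DataFrame.corr()` by keeping one triangle of the matrix and sorting the pairs by absolute value. The sketch below uses four columns and the five rows from the `head()` preview, so the numbers only illustrate the mechanics; the names `sample` and `pairs` are assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Four feature columns copied from the head() preview; illustrative only.
sample = pd.DataFrame({
    'COMPACTNESS': [95, 91, 104, 93, 85],
    'CIRCULARITY': [48, 41, 50, 41, 44],
    'SCATTER_RATIO': [162, 149, 207, 144, 149],
    'ELONGATEDNESS': [42, 45, 32, 46, 45],
})

# Pearson correlation between every pair of columns.
corr = sample.corr()

# Keep only the strict upper triangle so each pair appears once
# and the trivial self-correlations on the diagonal are dropped.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = (
    corr.where(upper)
    .stack()
    .dropna()
    .sort_values(key=abs, ascending=False)
)

print(pairs)
```

Sorting by absolute value puts strongly negative pairs next to strongly positive ones, which helps when looking for redundant features.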


* ### Drop column 'Class'

## Part 3: Data Wrangling and Transformation.

* ### StandardScaler

In [26]:
# StandardScaler 
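
`StandardScaler` rescales each column to zero mean and unit variance, which matters for an SVC because the `describe()` output shows features on very different scales (for example `SCALED_VARIANCE_MINOR` versus `SKEWNESS_ABOUT_MINOR`). A minimal sketch on three columns from the `head()` preview, with illustrative variable names:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# COMPACTNESS, CIRCULARITY, DISTANCE_CIRCULARITY from the head() preview.
features = np.array([
    [95, 48, 83],
    [91, 41, 84],
    [104, 50, 106],
    [93, 41, 82],
    [85, 44, 70],
], dtype=float)

scaler = StandardScaler()
# fit_transform computes per-column mean and standard deviation,
# then returns (x - mean) / std for every value.
scaled = scaler.fit_transform(features)

print(scaled.mean(axis=0).round(6))
print(scaled.std(axis=0).round(6))
```

A common variation is to fit the scaler on the training split only and reuse it on the test split, so that test statistics do not leak into training.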


* ### Creating datasets for ML part

In [27]:
# set 'X' for the features and 'y' for the target ('Class')
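
Separating features from the target usually amounts to dropping the `Class` column for `X` and selecting it for `y`. The sketch below shows this on a small frame built from the `head()` preview; the frame name `sample` is an assumption standing in for `dataset`.

```python
import pandas as pd

# Three feature columns plus the target, copied from the head() preview.
sample = pd.DataFrame({
    'COMPACTNESS': [95, 91, 104, 93, 85],
    'CIRCULARITY': [48, 41, 50, 41, 44],
    'DISTANCE_CIRCULARITY': [83, 84, 106, 82, 70],
    'Class': ['van', 'van', 'saab', 'van', 'bus'],
})

# drop() returns a new frame, so 'sample' itself keeps the 'Class' column.
X = sample.drop(columns=['Class'])
y = sample['Class']

print(X.shape, y.shape)
```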


* ### 'Train\Test' split

In [28]:
# apply 'Train\Test' splitting method
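
A hedged sketch of the split, using `train_test_split` on synthetic data shaped like this dataset (846 rows, 18 features). The data comes from `make_blobs` and is not the vehicle data; the 80/20 ratio, the fixed `random_state`, and the three classes (matching the labels visible in the preview) are assumptions.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as the vehicle features.
X, y = make_blobs(n_samples=846, n_features=18, centers=3, random_state=0)

# stratify=y keeps the class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

Stratifying is a cheap safeguard even when classes look balanced, since it removes split-to-split variation in class proportions.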


In [29]:
# print shape of X_train and y_train


In [30]:
# print shape of X_test and y_test


## Part 4: Machine Learning.

* ### Build, train and evaluate model

* SVC model
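
A minimal sketch of building, training, and scoring an SVC, again on synthetic `make_blobs` data rather than the vehicle data. The RBF kernel and `C=1.0` are scikit-learn defaults written out explicitly, not tuned values, and wrapping the scaler in a pipeline is a design choice that differs from scaling the whole dataset up front.

```python
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 846 rows, 18 features, 3 classes (assumed).
X, y = make_blobs(n_samples=846, n_features=18, centers=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The pipeline fits the scaler on the training split only and
# applies the same transformation before every prediction.
model = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f'Test accuracy: {accuracy:.3f}')
```

`SVC` handles multi-class targets internally with a one-vs-one scheme, so no extra wrapper is needed for more than two vehicle classes.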

* ### Classification report
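
`classification_report` summarises precision, recall, and F1 per class. The labels below are hypothetical and hand-written to keep the example small; in the notebook they would be `y_test` and the SVC predictions.

```python
from sklearn.metrics import classification_report

# Hypothetical true and predicted labels, for illustration only.
y_true = ['van', 'van', 'saab', 'van', 'bus', 'bus', 'saab', 'saab']
y_pred = ['van', 'van', 'saab', 'saab', 'bus', 'bus', 'saab', 'van']

# Text form for reading in the notebook.
print(classification_report(y_true, y_pred))

# Dictionary form for programmatic access to individual metrics.
report = classification_report(y_true, y_pred, output_dict=True)
```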

* ### Confusion matrix

In [31]:
# confusion matrix of SVC model
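
A confusion matrix shows which classes get mistaken for which. The sketch below uses hypothetical hand-written labels rather than real model output, and wraps the result in a DataFrame so rows (actual) and columns (predicted) carry class names.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels, for illustration only.
y_true = ['van', 'van', 'saab', 'van', 'bus', 'bus', 'saab', 'saab']
y_pred = ['van', 'van', 'saab', 'saab', 'bus', 'bus', 'saab', 'van']

# Passing labels fixes the row and column order explicitly.
labels = ['bus', 'saab', 'van']
cm = confusion_matrix(y_true, y_pred, labels=labels)

# Rows are actual classes, columns are predicted classes.
cm_table = pd.DataFrame(cm, index=labels, columns=labels)
print(cm_table)
```

Off-diagonal cells are the misclassifications; the diagonal sum divided by the total gives the accuracy.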


* ### Misclassification plot

In [32]:
# misclassification vehicle plot 


* ### Comparison table between Actual 'Class' and Predicted 'Class'

In [33]:
# comparison table between Actual 'Class' and Predicted 'Class'
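
One simple form for this table is a two-column DataFrame with a match flag. The labels are hypothetical; in the notebook they would be the test targets and the SVC predictions, decoded back to class names if a label encoder was used.

```python
import pandas as pd

# Hypothetical actual and predicted labels, for illustration only.
actual = ['van', 'van', 'saab', 'van', 'bus', 'bus', 'saab', 'saab']
predicted = ['van', 'van', 'saab', 'saab', 'bus', 'bus', 'saab', 'van']

comparison = pd.DataFrame({'Actual': actual, 'Predicted': predicted})
# Flag each row as correct or not so errors are easy to filter.
comparison['Match'] = comparison['Actual'] == comparison['Predicted']

# Rows where the prediction disagrees with the actual class.
mismatches = comparison[~comparison['Match']]

print(comparison)
print(mismatches)
```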



##  Conclusion.

In [34]:
# submission of .csv file with test predictions
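
A sketch of writing predictions to CSV with pandas. The column names `Id` and `Class`, the file name, and the predictions themselves are assumptions; the required submission layout is not specified in this notebook.

```python
import pandas as pd

# Hypothetical predictions, for illustration only.
predicted = ['van', 'saab', 'bus']

# 'Id' and 'Class' are assumed column names.
submission = pd.DataFrame({'Id': range(len(predicted)), 'Class': predicted})

# Without a path, to_csv returns the CSV content as a string.
csv_text = submission.to_csv(index=False)
print(csv_text)

# To write a file instead (hypothetical file name):
# submission.to_csv('submission.csv', index=False)
```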
