## Classification Model using Decision Trees with Python

In this exercise, I use the car ratings dataset to build a classification model using Decision Trees with Python. The data contains six attributes that are used to classify cars into different levels of acceptability.

### Steps:

- Separate the data into training and testing. <br>
- Plot the generated decision tree. <br>
- Calculate the classification accuracy obtained by the generated model. <br>

The dataset, whose attributes and possible values are described below, is available at:

https://raw.githubusercontent.com/higoramario/univesp-com410-aprendizado-de-maquinas/main/carros-avaliacao.csv

The dataset for this exercise is based on the Car Evaluation dataset, available at:
https://archive.ics.uci.edu/ml/datasets/Car+Evaluation


### Attributes:
Portuguese - English
- preco - price: price of the car (muitoalto (very high), alto (high), medio (medium), baixo (low)) <br>
- manutencao - maintenance: maintenance price (muitoalto (very high), alto (high), medio (medium), baixo (low)) <br>
- portas - doors: number of doors (2, 3, 4, 5mais (5 or more)) <br>
- pessoas - people: number of people the car carries (2, 4, 5mais (5 or more)) <br>
- bagageiro - luggage compartment: luggage compartment size (grande (large), medio (medium), pequeno (small)) <br>
- seguranca - security: car security (alta (high), media (medium), baixa (low)) <br>
- aceitabilidade (atributo-alvo) - acceptability (target attribute): level of acceptability (inaceitavel (unacceptable), aceitavel (acceptable), bom (good), muito bom (very good))

### Libraries 
Pandas <br>
Matplotlib <br>
Scikit-Learn

In [None]:
import pandas as pd # load and manipulate the dataset

import matplotlib.pyplot as plt

from sklearn.tree import DecisionTreeClassifier, plot_tree # decision tree classifier and tree plotting

from sklearn.model_selection import train_test_split # split training and testing

from sklearn.metrics import accuracy_score # check the accuracy

import warnings

warnings.simplefilter('ignore') # suppress all warnings

### Data

In [None]:
url = 'https://raw.githubusercontent.com/higoramario/univesp-com410-aprendizado-de-maquinas/main/carros-avaliacao.csv'

cars = pd.read_csv(url)

cars.head()

### Transforming attributes into numbers for use in the Decision Tree.

In [None]:
cars['preco'] = cars['preco'].map({'muitoalto':3,'alto':2,'medio':1,'baixo':0})

cars['manutencao'] = cars['manutencao'].map({'muitoalto':3,'alto':2,'medio':1,'baixo':0})

cars['portas'] = cars['portas'].map({'2':2,'3':3,'4':4,'5mais':5})

cars['pessoas'] = cars['pessoas'].map({'2':2,'4':4,'5mais':5})

cars['bagageiro'] = cars['bagageiro'].map({'grande':2,'medio':1,'pequeno':0})

cars['seguranca'] = cars['seguranca'].map({'alta':2,'media':1,'baixa':0})

cars.head(10)
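One thing worth checking after this step: `Series.map` with a dictionary returns `NaN` for any value that is not one of the dictionary keys, so a misspelled label disappears without raising an error. The sketch below shows one way to detect that. It uses a hypothetical toy frame (not the real dataset) with a deliberately wrong label, `'alta'`, and the names `sample`, `price_map` and `unmapped` are illustrative only.

```python
import pandas as pd

# Hypothetical toy frame; 'alta' is deliberately absent from the mapping below.
sample = pd.DataFrame({'preco': ['muitoalto', 'alto', 'alta', 'baixo']})

price_map = {'muitoalto': 3, 'alto': 2, 'medio': 1, 'baixo': 0}
encoded = sample['preco'].map(price_map)

# Series.map returns NaN for any value that is not a key of the dictionary,
# so the NaN positions point back to the labels that were not translated.
unmapped = sample.loc[encoded.isna(), 'preco'].unique()
print('Unmapped labels:', list(unmapped))
```

On the real data, running `cars[attribute].isna().sum()` for each mapped column should give a similar check; an empty result suggests every label was covered by its dictionary.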

### Separating attributes and classes for training and testing.

In [None]:
attribute_names = ['preco','manutencao','portas','pessoas','bagageiro','seguranca']

attributes = cars[attribute_names]

classes = cars['aceitabilidade'] # target attribute

### Separating the dataset for training and testing.

In [None]:
training_attributes, test_attributes, training_classes, test_classes = train_test_split(attributes, classes, test_size=0.1, random_state=10)
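Because only 10% of the rows go to the test set, rare classes can end up under-represented there. As an optional variation (not part of the original exercise), `train_test_split` accepts a `stratify` argument that keeps the class proportions similar in both parts. The sketch below illustrates this on hypothetical, artificially imbalanced toy data; the names `X`, `y` and the 90/10 class ratio are assumptions made for the example.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced toy data: 90 rows of one class, 10 of another.
X = pd.DataFrame({'feature': range(100)})
y = pd.Series(['inaceitavel'] * 90 + ['aceitavel'] * 10)

# stratify=y asks for the same class proportions in the train and test parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=10, stratify=y
)

print(X_train.shape, X_test.shape)
print(y_test.value_counts())
```

Whether stratification is worthwhile here depends on how unevenly the acceptability classes are distributed in the actual file.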

### Creating the model

In [None]:
tree = DecisionTreeClassifier()

tree = tree.fit(training_attributes,training_classes)

### Plotting the decision tree

The tree is plotted in full size so that it can be inspected in detail. In the Python notebook, click on the picture to toggle between the enlarged and reduced view.

In [None]:
plt.figure(figsize=(300,160))

plot_tree(tree, filled=True, rounded=True, feature_names=attribute_names)

plt.show()
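A very large figure can be slow to render and hard to navigate. As a lighter complement to the plot, scikit-learn's `export_text` prints the learned rules as indented text. The sketch below is self-contained and uses hypothetical toy rows with only two of the encoded attributes, so the names `model`, `rules` and the data are assumptions rather than the tree trained above.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data using two of the encoded attributes.
feature_names = ['pessoas', 'seguranca']
X = [[2, 0], [2, 2], [4, 0], [4, 2], [5, 0], [5, 2]]
y = ['inaceitavel', 'inaceitavel', 'inaceitavel',
     'aceitavel', 'inaceitavel', 'aceitavel']

model = DecisionTreeClassifier(random_state=0).fit(X, y)

# export_text returns the decision rules as a plain, indented string.
rules = export_text(model, feature_names=feature_names)
print(rules)
```

In the notebook, the equivalent call would presumably be `export_text(tree, feature_names=attribute_names)` on the fitted model.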

### Testing the perfect case where all values are as good as possible.

In [None]:
print(tree.predict([[0,0,5,5,2,2]])) #very good

### If security is low, it is classified as unacceptable.

In [None]:
print(tree.predict([[0,0,5,5,2,0]]))
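Passing a bare list to `predict` relies on remembering the column order, and scikit-learn warns about missing feature names when the model was fitted on a DataFrame (the warning is hidden here because warnings are suppressed). One alternative is to build the query as a one-row DataFrame. The sketch below shows the idea on a hypothetical four-row training set; `train`, `labels`, `model` and `query` are illustrative names, and the rows are not taken from the real dataset.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

attribute_names = ['preco', 'manutencao', 'portas', 'pessoas', 'bagageiro', 'seguranca']

# Hypothetical toy training rows (not taken from the real dataset).
train = pd.DataFrame(
    [[0, 0, 5, 5, 2, 2], [0, 0, 5, 5, 2, 0], [3, 3, 2, 2, 0, 0], [1, 1, 4, 4, 1, 2]],
    columns=attribute_names,
)
labels = ['aceitavel', 'inaceitavel', 'inaceitavel', 'aceitavel']

model = DecisionTreeClassifier(random_state=0).fit(train, labels)

# Build the query as a one-row DataFrame so each value is tied to its column name,
# then reorder the columns to match the order used during training.
query = pd.DataFrame([{'preco': 0, 'manutencao': 0, 'portas': 5,
                       'pessoas': 5, 'bagageiro': 2, 'seguranca': 2}])[attribute_names]
print(model.predict(query))
```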

### Checking classification accuracy.

In [None]:
classes_predict = tree.predict(test_attributes)

accuracy = accuracy_score(test_classes,classes_predict)

print('Classification accuracy: {:.1f}%'.format(accuracy * 100)) # format directly to avoid floating-point artifacts from round(...)*100
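Accuracy is a single number and does not show which classes get confused with each other. A confusion matrix can complement it. The sketch below uses hypothetical true and predicted labels for six cars (not results from the model above) to show how `confusion_matrix` is read; the `labels` argument fixes the row and column order.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels for six test cars.
y_true = ['inaceitavel', 'inaceitavel', 'inaceitavel', 'aceitavel', 'aceitavel', 'bom']
y_pred = ['inaceitavel', 'inaceitavel', 'aceitavel', 'aceitavel', 'aceitavel', 'aceitavel']

# Rows are the true classes and columns the predicted classes, in this order.
labels = ['inaceitavel', 'aceitavel', 'bom']
matrix = confusion_matrix(y_true, y_pred, labels=labels)
print(matrix)
print('Accuracy: {:.1f}%'.format(accuracy_score(y_true, y_pred) * 100))
```

In the notebook, the same call would presumably take `test_classes` and `classes_predict` as its two arguments.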