## Exercise 1 - sklearn Digits Dataset

Link: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits

# 1) Import the "sklearn.datasets" package and "load_digits"
# 2) Load the dataset with the function: load_digits()
# 3) Inspect the dataset keys using the "keys" method
# 4) The "data" key holds the features and the "target" key holds y. Split them into 2 separate variables
# 5) Check the dimensionality of the features through the shape attribute
# 6) Split the dataset into training and test sets using the function: "train_test_split"
# 7) Train an MLP (2 different topologies)
# 8) Train a Decision Tree (with Entropy and with Gini)
# 9) Train a Decision Tree with max_depth = 2
# 10) Train KNN (with two different configurations)
# 11) Show the accuracy of all models

In [1]:
dataset_link = "https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits"

In [4]:
from sklearn.datasets import load_digits

In [5]:
digits_dataset = load_digits()

In [6]:
digits_dataset.keys()

dict_keys(['data', 'target', 'frame', 'feature_names', 'target_names', 'images', 'DESCR'])

In [7]:
X, y = digits_dataset.data, digits_dataset.target

In [8]:
X.shape, y.shape

((1797, 64), (1797,))
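
The 64 features per sample correspond to the 8x8 pixel grid exposed under the "images" key listed above. A minimal sketch to check that relationship, assuming a fresh load of the dataset (the names `digits`, `flattened` and `same` are illustrative only):

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# Each sample is stored twice: as an 8x8 image and as a flat row of 64 values
print(digits.images.shape, digits.data.shape)

# Flattening every image row by row should reproduce the "data" matrix
flattened = digits.images.reshape(len(digits.images), -1)
same = np.array_equal(flattened, digits.data)
print(same)
```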

In [9]:
from sklearn.model_selection import train_test_split

In [10]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
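
The split above draws a different random partition on every run, so the scores reported further down will vary between executions. A hedged sketch of a reproducible variant follows; the seed value 42 and the use of `stratify` are assumptions for illustration, not part of the original notebook:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# random_state fixes the shuffle so the same rows land in each subset every run;
# stratify=y keeps the proportion of each digit similar in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```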

In [13]:
# import MLPRegressor
from sklearn.neural_network import MLPRegressor
# import DecisionTreeRegressor
from sklearn.tree import DecisionTreeRegressor
# import KNeighborsRegressor
from sklearn.neighbors import KNeighborsRegressor
# import LinearRegression
from sklearn.linear_model import LinearRegression

In [14]:
# create MLPRegressor object
mlp_relu = MLPRegressor(activation='relu', hidden_layer_sizes=(100, 100), max_iter=1000, solver='adam')
mlp_logistic = MLPRegressor(activation='logistic', hidden_layer_sizes=(200, 200), max_iter=1500, solver='sgd')
# create DecisionTreeRegressor object
dt_depth_5 = DecisionTreeRegressor(max_depth=5)
dt_depth_2 = DecisionTreeRegressor(max_depth=2)
# create KNeighborsRegressor object
knn = KNeighborsRegressor(n_neighbors=5)
knn_distance_weights = KNeighborsRegressor(n_neighbors=5, weights='distance')
# create LinearRegression object
lr = LinearRegression()

In [15]:
# fit the model
mlp_relu.fit(X_train, y_train)
mlp_logistic.fit(X_train, y_train)
dt_depth_5.fit(X_train, y_train)
dt_depth_2.fit(X_train, y_train)
knn.fit(X_train, y_train)
knn_distance_weights.fit(X_train, y_train)
lr.fit(X_train, y_train)

In [16]:
# predict the model
mlp_relu_pred = mlp_relu.predict(X_test)
mlp_logistic_pred = mlp_logistic.predict(X_test)
dt_depth_5_pred = dt_depth_5.predict(X_test)
dt_depth_2_pred = dt_depth_2.predict(X_test)
knn_pred = knn.predict(X_test)
knn_distance_weights_pred = knn_distance_weights.predict(X_test)
lr_pred = lr.predict(X_test)

In [17]:
# import mean_squared_error
from sklearn.metrics import mean_squared_error

In [18]:
# calculate the mean squared error
mlp_relu_mse = mean_squared_error(y_test, mlp_relu_pred)
mlp_logistic_mse = mean_squared_error(y_test, mlp_logistic_pred)
dt_depth_5_mse = mean_squared_error(y_test, dt_depth_5_pred)
dt_depth_2_mse = mean_squared_error(y_test, dt_depth_2_pred)
knn_mse = mean_squared_error(y_test, knn_pred)
knn_distance_weights_mse = mean_squared_error(y_test, knn_distance_weights_pred)
lr_mse = mean_squared_error(y_test, lr_pred)

In [19]:
# print the mean squared error
print("MLPRegressor with relu activation function and 2 hidden layers: ", mlp_relu_mse)
print("MLPRegressor with logistic activation function and 2 hidden layers: ", mlp_logistic_mse)
print("DecisionTreeRegressor with max_depth=5: ", dt_depth_5_mse)
print("DecisionTreeRegressor with max_depth=2: ", dt_depth_2_mse)
print("KNeighborsRegressor: ", knn_mse)
print("KNeighborsRegressor with distance weights: ", knn_distance_weights_mse)
print("LinearRegression: ", lr_mse)

MLPRegressor with relu activation function and 2 hidden layers:  1.041402049011317
MLPRegressor with logistic activation function and 2 hidden layers:  0.5292079059360502
DecisionTreeRegressor with max_depth=5:  4.1029786319530075
DecisionTreeRegressor with max_depth=2:  6.440709447421173
KNeighborsRegressor:  0.35033333333333333
KNeighborsRegressor with distance weights:  0.3362684529035735
LinearRegression:  3.744336842561966
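
The task list for this exercise asks for classifiers and an accuracy score, whereas the cells above fit regressors and report MSE. One possible classification variant is sketched below. The hidden-layer sizes, the seed 0, `k=5` and the dictionary names `models` and `accuracies` are all assumptions chosen for illustration, and the resulting scores will differ from the MSE values printed above.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# One entry per item in the task list (steps 7 to 10)
models = {
    "MLP relu (100, 100)": MLPClassifier(
        hidden_layer_sizes=(100, 100), activation="relu", max_iter=1000, random_state=0
    ),
    "MLP logistic (50,)": MLPClassifier(
        hidden_layer_sizes=(50,), activation="logistic", max_iter=1000, random_state=0
    ),
    "Tree entropy": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "Tree gini": DecisionTreeClassifier(criterion="gini", random_state=0),
    "Tree max_depth=2": DecisionTreeClassifier(max_depth=2, random_state=0),
    "KNN k=5 uniform": KNeighborsClassifier(n_neighbors=5),
    "KNN k=5 distance": KNeighborsClassifier(n_neighbors=5, weights="distance"),
}

# Step 11: fit each model and report its accuracy on the held-out test set
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {accuracies[name]:.3f}")
```

A tree limited to `max_depth=2` has at most four leaves, so it can predict at most four of the ten digits; its accuracy should be expected to sit well below the other models.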


In [None]:
## Exercise 2 - sklearn Diabetes Dataset (Regression Dataset)

# Link: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html#sklearn.datasets.load_diabetes

# 1) Import the "sklearn.datasets" package and "load_diabetes"
# 2) Load the dataset with the function: load_diabetes()
# 3) Inspect the dataset keys using the "keys" method
# 4) The "data" key holds the features and the "target" key holds y. Split them into 2 separate variables
# 5) Check the dimensionality of the features through the shape attribute
# 6) Split the dataset into training and test sets using the function: "train_test_split"
# 7) Train an MLP (2 different topologies) - MLPRegressor()
# 8) Train a Decision Tree - DecisionTreeRegressor()
# 9) Train a Decision Tree with max_depth = 2 - DecisionTreeRegressor()
# 10) Train KNN (with two different configurations) - KNeighborsRegressor()
# 11) Train the Linear Regression model - LinearRegression()
# 12) Use the MSE (Mean Squared Error) to evaluate the models:
# Example: print("MSE: %.2f" % mean_squared_error(y_test, prediction))
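
The steps listed for this exercise could be sketched roughly as follows. This is only one possible layout: the hidden-layer sizes, `max_iter`, the seed 0, `k=5` and the names `models` and `mse` are assumptions for illustration, and no scores are quoted because they depend on the split and the seed.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

# Steps 1 to 5: load the dataset, inspect its keys, split features from target
diabetes = load_diabetes()
print(diabetes.keys())
X, y = diabetes.data, diabetes.target
print(X.shape, y.shape)

# Step 6: hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Steps 7 to 11: one entry per requested model
models = {
    "MLP relu (100, 100)": MLPRegressor(
        hidden_layer_sizes=(100, 100), activation="relu", max_iter=2000, random_state=0
    ),
    "MLP logistic (50,)": MLPRegressor(
        hidden_layer_sizes=(50,), activation="logistic", max_iter=2000, random_state=0
    ),
    "Tree": DecisionTreeRegressor(random_state=0),
    "Tree max_depth=2": DecisionTreeRegressor(max_depth=2, random_state=0),
    "KNN k=5 uniform": KNeighborsRegressor(n_neighbors=5),
    "KNN k=5 distance": KNeighborsRegressor(n_neighbors=5, weights="distance"),
    "LinearRegression": LinearRegression(),
}

# Step 12: evaluate every model with the mean squared error on the test set
mse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mse[name] = mean_squared_error(y_test, model.predict(X_test))
    print("%s MSE: %.2f" % (name, mse[name]))
```

The MLP models may emit a convergence warning on this dataset; raising `max_iter` or scaling the target are common remedies, though neither is required by the task list.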