The analyses below build on those in the notebook TCC_clasificacao_imagens_baselines, where dozens of preliminary models were created; those models are refined in this file.

The chosen models were:

* KNeighborsClassifier => weights="distance", algorithm="kd_tree" and "brute", p=1
* MLPClassifier => solver='lbfgs', learning_rate="constant" and "invscaling"
* LogisticRegression => solver="liblinear" and "lbfgs", penalty="l1" and "l2"
* RandomForestClassifier => criterion="gini", class_weight="balanced"
* SGDClassifier => loss="squared_hinge" and "hinge", penalty="elasticnet" and "l2"

In [2]:
import skimage
import numpy as np
import pandas as pd
import os
from sklearn.model_selection import train_test_split, StratifiedKFold, GridSearchCV

# since StratifiedKFold gave worse results, an alternative is StratifiedShuffleSplit, because it keeps the number of observations
# https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedShuffleSplit.html#sklearn.model_selection.StratifiedShuffleSplit
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn import preprocessing
from sklearn.metrics import classification_report
import time
from sklearn import svm
from sklearn.linear_model import SGDClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

scikit-learn user guide on how to tune hyperparameters:

https://scikit-learn.org/stable/modules/grid_search.html#grid-search

## Load the dataset and split into train and test

In [3]:
inicioGeral = time.time()
bases_prontas_path = os.path.join("D:\\","FIA","TCC","BASES","")
print(bases_prontas_path)

D:\FIA\TCC\BASES\


In [4]:
df = pd.read_csv(bases_prontas_path+'mask_dataset_vgg16_preprocess_input_224_224_3_feature_extracted.csv')
X, y = df.drop(['im_path', 'class'], axis=1), df['class'].values

label_transformer = preprocessing.LabelEncoder()
label_transformer.fit(y)
y = label_transformer.transform(y)

# define the cross-validator
#https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedKFold.html#sklearn.model_selection.StratifiedKFold
# cv = StratifiedKFold(2)
cv = StratifiedShuffleSplit(n_splits=7, test_size=0.3, random_state=42)

# define the metrics
#https://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter
#https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score
score_list = ["accuracy","f1_weighted"]
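
To make the cross-validation setup concrete, the sketch below applies the same `StratifiedShuffleSplit(n_splits=7, test_size=0.3, random_state=42)` to a small, imbalanced label vector. The toy arrays `X_toy` and `y_toy` are assumptions for illustration, not the mask dataset. It checks that each of the 7 splits keeps the train/test sizes and the class proportions.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Toy stand-in for the real labels (assumption): 80 samples of class 0, 20 of class 1.
y_toy = np.array([0] * 80 + [1] * 20)
X_toy = np.arange(len(y_toy)).reshape(-1, 1)

cv_toy = StratifiedShuffleSplit(n_splits=7, test_size=0.3, random_state=42)

split_sizes = []
test_class1_share = []
for train_idx, test_idx in cv_toy.split(X_toy, y_toy):
    # Each split draws a fresh random 70/30 partition of the full data.
    split_sizes.append((len(train_idx), len(test_idx)))
    # Share of class 1 in the test part; stratification keeps it near 20%.
    test_class1_share.append(float(y_toy[test_idx].mean()))

print(split_sizes)
print(test_class1_share)
```

Unlike k-fold, the test parts of different splits may overlap, but every split uses the same number of observations.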

# Models

In [5]:
def treinar_modelos(modelo, parametros):
    # Set up the grid search: every candidate is scored on accuracy and
    # weighted F1, and the best weighted-F1 candidate is refit on all of X, y.
    models_fit = GridSearchCV(modelo, parametros, cv=cv, scoring=score_list, refit="f1_weighted", verbose=1)

    # Run the grid search
    models_fit.fit(X, y)

    # Print the details of the best model
    print(f"The best model was:\n{models_fit.best_estimator_}\nscore: {models_fit.best_score_}\nParameters:\n{models_fit.best_params_}")
    return models_fit
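
As a rough illustration of how multi-metric scoring and `refit` interact in `GridSearchCV`, the sketch below runs a tiny search on synthetic data. `X_demo`, `y_demo`, and the two-value `alpha` grid are assumptions, not the notebook's data or grid. It shows that `best_score_` is the mean test score of the metric named in `refit`, while the other metric stays available in `cv_results_`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit

# Synthetic stand-in for the extracted image features (assumption).
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=42)

cv_demo = StratifiedShuffleSplit(n_splits=3, test_size=0.3, random_state=42)
search = GridSearchCV(
    SGDClassifier(random_state=42),
    {"alpha": [1e-4, 1e-2]},
    cv=cv_demo,
    scoring=["accuracy", "f1_weighted"],
    refit="f1_weighted",  # metric used to pick and refit the best candidate
)
search.fit(X_demo, y_demo)

# cv_results_ holds one group of columns per metric; best_index_ points at
# the candidate with the highest mean weighted F1.
best_idx = search.best_index_
best_accuracy = search.cv_results_["mean_test_accuracy"][best_idx]
best_f1 = search.cv_results_["mean_test_f1_weighted"][best_idx]
print(search.best_params_, best_accuracy, best_f1)
```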

## Stochastic Gradient Descent

In [23]:
sgd_clf1 = SGDClassifier(random_state=42, tol=1e-3, loss="hinge", penalty = "l2")

# define the parameters and the values to test
#https://numpy.org/doc/stable/reference/generated/numpy.logspace.html
# alphas = np.logspace(-4, -0.5, 30)
alphas = np.logspace(-6, -0.5, 80)
parameters={'alpha': alphas,
            'tol':[1e-6,1e-5,1e-4,1e-3,1e-2],
            "max_iter":[100,1000,10000]}

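The size of this grid can be checked before launching the search. The sketch below rebuilds the same parameter grid with `ParameterGrid` and counts candidates and fits, assuming the 7-split cross-validator defined earlier. It also prints the endpoints of the `alphas` range and the constant ratio between neighbouring values.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

alphas = np.logspace(-6, -0.5, 80)  # 80 values from 1e-6 up to 10**-0.5
grid = ParameterGrid({
    "alpha": alphas,
    "tol": [1e-6, 1e-5, 1e-4, 1e-3, 1e-2],
    "max_iter": [100, 1000, 10000],
})

n_candidates = len(grid)   # 80 * 5 * 3 combinations
n_fits = n_candidates * 7  # one fit per candidate per split
step_ratio = alphas[1] / alphas[0]  # log-spaced values share a constant ratio

print(alphas[0], alphas[-1], step_ratio)
print(n_candidates, n_fits)
```

The largest value in the range, 10**-0.5 ≈ 0.316, is the upper edge of the search space.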
In [24]:
inicioGeral_modelo = time.time()
treinar_modelos(sgd_clf1,parameters)
fimGeral = time.time()
tempo_Total = (fimGeral-inicioGeral_modelo)/60
print(f"time to run the whole notebook is approximately {int(tempo_Total/60)} hours and {round(tempo_Total%60,1)} minutes")

"""
The best model was:
SGDClassifier(alpha=0.31622776601683794, max_iter=100, random_state=42,
              tol=0.0001)
score: 0.8102045534826345



The best model was:
SGDClassifier(alpha=0.31622776601683794, max_iter=100, random_state=42,
              tol=1e-06)
score: 0.8357270988509956
Parameters:
{'alpha': 0.31622776601683794, 'max_iter': 100, 'tol': 1e-06}
time to run the whole notebook is approximately 13 hours and 16.1 minutes
"""

Fitting 7 folds for each of 1200 candidates, totalling 8400 fits


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.

[Parallel(n_jobs=1)]: Done 8400 out of 8400 | elapsed: 796.0min finished


The best model was:
SGDClassifier(alpha=0.31622776601683794, max_iter=100, random_state=42,
              tol=1e-06)
score: 0.8357270988509956
Parameters:
{'alpha': 0.31622776601683794, 'max_iter': 100, 'tol': 1e-06}
time to run the whole notebook is approximately 13 hours and 16.1 minutes
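
The hours-and-minutes conversion used in the timing cells can be sketched as a small helper. The name `format_elapsed` is hypothetical and not part of the notebook. It takes a duration in seconds, as returned by subtracting two `time.time()` values, and splits it with `divmod`.

```python
def format_elapsed(seconds):
    """Split a duration in seconds into whole hours and remaining minutes."""
    total_minutes = seconds / 60
    hours, minutes = divmod(total_minutes, 60)
    return int(hours), round(minutes, 1)


# 796.0 minutes is the elapsed time reported in the Parallel log line.
print(format_elapsed(796.0 * 60))  # → (13, 16.0)
```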


## k-nearest neighbors algorithm

In [6]:
KNN2 = KNeighborsClassifier(n_neighbors=3, weights="distance", algorithm= "brute", p= 2, metric = "braycurtis", n_jobs=4)

# define the parameters and the values to test
parameters = {'n_neighbors': [3, 6, 9]
             }
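
For reference, the Bray-Curtis distance between two vectors is the sum of absolute differences divided by the sum of absolute sums. The sketch below checks that formula against `scipy.spatial.distance.braycurtis` and then fits a brute-force `KNeighborsClassifier` with that metric on a tiny toy set. `X_toy` and `y_toy` are assumptions for illustration. As far as the scikit-learn documentation indicates, `p` is the power parameter of the Minkowski metric, so it should have no effect once `metric="braycurtis"` is set.

```python
import numpy as np
from scipy.spatial.distance import braycurtis
from sklearn.neighbors import KNeighborsClassifier

u = np.array([1.0, 1.0, 0.0])
v = np.array([0.0, 1.0, 0.0])

# Bray-Curtis distance: sum(|u - v|) / sum(|u + v|)
manual = np.abs(u - v).sum() / np.abs(u + v).sum()
print(manual, braycurtis(u, v))

# Toy data (assumption): two well-separated groups of non-negative features.
X_toy = np.array([[1.0, 0.1], [0.9, 0.2], [0.8, 0.1],
                  [0.1, 1.0], [0.2, 0.9], [0.1, 0.8]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3, weights="distance",
                           algorithm="brute", metric="braycurtis")
knn.fit(X_toy, y_toy)
predictions = knn.predict(np.array([[0.95, 0.15], [0.15, 0.95]]))
print(predictions)
```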

In [7]:
inicioGeral_modelo = time.time()
treinar_modelos(KNN2,parameters)
fimGeral = time.time()
tempo_Total = (fimGeral-inicioGeral_modelo)/60
print(f"time to run the whole notebook is approximately {int(tempo_Total/60)} hours and {round(tempo_Total%60,1)} minutes")

"""
The best model was:
KNeighborsClassifier(algorithm='brute', metric='braycurtis', n_jobs=4,
                     n_neighbors=6, weights='distance')
score: 0.8779086211845684
Parameters:
{'n_neighbors': 6}
time to run the whole notebook is approximately 0 hours and 7.0 minutes

"""

Fitting 7 folds for each of 3 candidates, totalling 21 fits


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


The best model was:
KNeighborsClassifier(algorithm='brute', metric='braycurtis', n_jobs=4,
                     n_neighbors=6, weights='distance')
score: 0.8779086211845684
Parameters:
{'n_neighbors': 6}
time to run the whole notebook is approximately 0 hours and 7.0 minutes


[Parallel(n_jobs=1)]: Done  21 out of  21 | elapsed:  7.0min finished


## LogisticRegression

#### [Documentation](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression)

In [28]:
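RegLog = LogisticRegression(random_state=42)
# Note: the next line builds a second estimator that is never assigned or used;
# the grid search runs on RegLog, which keeps the default penalty ("l2").
LogisticRegression(random_state=42, solver = "liblinear", penalty= "l1", class_weight = "balanced")
# Note: l1_ratio only takes effect when penalty="elasticnet", so with the default
# penalty these grid values are ignored and scikit-learn emits a warning.
parameters={"C": [1,50,200,2000],
            "l1_ratio": [0,1,0.25,0.5,0.75],
            "max_iter":[100,10000]}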

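To illustrate when `l1_ratio` actually changes the model, the sketch below fits `LogisticRegression` with `penalty="elasticnet"` and `solver="saga"` on synthetic data and counts the coefficients driven exactly to zero. The data, the value `C=0.05`, and the helper `count_zero_coefs` are assumptions. The exact arguments may differ across scikit-learn versions, so treat this as a sketch rather than the notebook's configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic, standardized stand-in for the real features (assumption).
X_demo, y_demo = make_classification(n_samples=200, n_features=20,
                                     n_informative=5, random_state=42)
X_demo = StandardScaler().fit_transform(X_demo)


def count_zero_coefs(l1_ratio):
    """Fit an elastic-net logistic regression and count exactly-zero weights."""
    clf = LogisticRegression(penalty="elasticnet", solver="saga",
                             l1_ratio=l1_ratio, C=0.05,
                             max_iter=5000, random_state=42)
    clf.fit(X_demo, y_demo)
    return int(np.sum(clf.coef_ == 0))


zeros_l2 = count_zero_coefs(0.0)  # l1_ratio=0 behaves like a pure L2 penalty
zeros_l1 = count_zero_coefs(1.0)  # l1_ratio=1 behaves like a pure L1 penalty
print(zeros_l2, zeros_l1)
```

With the L1 end of the range, the strong regularization zeroes out weights that the L2 end only shrinks.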

In [29]:
inicioGeral_modelo = time.time()
treinar_modelos(RegLog,parameters)
fimGeral = time.time()
tempo_Total = (fimGeral-inicioGeral_modelo)/60
print(f"time to run the whole notebook is approximately {int(tempo_Total/60)} hours and {round(tempo_Total%60,1)} minutes")

"""
The best model was:
LogisticRegression(C=200, l1_ratio=0, multi_class='multinomial',
                   random_state=42)
score: 0.8028218564443405



The best model was:
LogisticRegression(C=50, l1_ratio=0, max_iter=10000, random_state=42)
score: 0.8353560779807866
Parameters:
{'C': 50, 'l1_ratio': 0, 'max_iter': 10000}
time to run the whole notebook is approximately 13 hours and 49.4 minutes
"""

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


Fitting 7 folds for each of 40 candidates, totalling 280 fits


  "(penalty={})".format(self.penalty))
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression

The best model was:
LogisticRegression(C=50, l1_ratio=0, max_iter=10000, random_state=42)
score: 0.8353560779807866
Parameters:
{'C': 50, 'l1_ratio': 0, 'max_iter': 10000}
time to run the whole notebook is approximately 13 hours and 49.4 minutes
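
The convergence warning in this log suggests scaling the data. One possible way to do that inside the grid search, sketched here on synthetic data, is to wrap a `StandardScaler` and the classifier in a pipeline so that the scaler is fitted only on each training split. The data, the reduced `C` grid, and the 3-split cross-validator are assumptions; whether scaling helps on the real features was not tested in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real features (assumption).
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=42)

# The scaler is refitted inside every training split, so test parts stay unseen.
pipe = make_pipeline(StandardScaler(),
                     LogisticRegression(random_state=42, max_iter=1000))

# make_pipeline names each step after its lowercased class name, and grid
# keys address a step's parameter as "<step>__<parameter>".
param_grid = {"logisticregression__C": [1, 50, 200]}
cv_demo = StratifiedShuffleSplit(n_splits=3, test_size=0.3, random_state=42)

search = GridSearchCV(pipe, param_grid, cv=cv_demo, scoring="f1_weighted")
search.fit(X_demo, y_demo)
print(search.best_params_, search.best_score_)
```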


## Neural network

In [30]:
from sklearn.neural_network import MLPClassifier

#### [Documentation](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier)

In [None]:
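clf_MLP = MLPClassifier(solver='lbfgs', 
                        random_state=42, 
                        max_iter=1000,
                        learning_rate="invscaling")
alphas = np.logspace(-6, -0.5, 50)
# Note: per the scikit-learn docs, learning_rate applies only to solver='sgd' and
# learning_rate_init only to 'sgd' or 'adam', so neither affects this lbfgs model.
parameters={'alpha': alphas,
            'hidden_layer_sizes':[(50, 3),(300, 3)],
            'tol':[1e-2,1e-6],#[1e-5,1e-4,1e-3,1e-2,1e-6],
            "learning_rate_init":[0.5,0.01,0.0001]}#[0.5,0.1,0.01,0.001,0.0001]}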

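Each tuple in `hidden_layer_sizes` lists the number of units per hidden layer, so `(50, 3)` describes two hidden layers with 50 and 3 units. The sketch below fits such a network on synthetic binary data and inspects the shapes of the weight matrices. The data and the reduced `max_iter` are assumptions, and the fit may stop before converging.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic binary problem with 20 input features (assumption).
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=42)

mlp = MLPClassifier(solver="lbfgs", hidden_layer_sizes=(50, 3),
                    max_iter=200, random_state=42)
mlp.fit(X_demo, y_demo)

# One weight matrix per layer transition: input -> 50, 50 -> 3, 3 -> output.
layer_shapes = [w.shape for w in mlp.coefs_]
print(layer_shapes)  # → [(20, 50), (50, 3), (3, 1)]
```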

In [None]:
inicioGeral_modelo = time.time()
treinar_modelos(clf_MLP,parameters)
fimGeral = time.time()
tempo_Total = (fimGeral-inicioGeral_modelo)/60
print(f"time to run the whole notebook is approximately {int(tempo_Total/60)} hours and {round(tempo_Total%60,1)} minutes")

## Ensemble methods

In [36]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
#https://www.analyticsvidhya.com/blog/2020/03/beginners-guide-random-forest-hyperparameter-tuning/

In [47]:
Ensemble_methods1 = RandomForestClassifier(max_depth=1000, 
                                           random_state=42, 
                                           criterion= "gini", 
                                           class_weight = "balanced")


# Candidate ccp_alpha values come from the cost-complexity pruning path of a
# single decision tree fitted on a 75% training split.
X_train , _, y_train, _  = train_test_split(X, y, test_size=0.25, random_state=42)
clf = DecisionTreeClassifier(random_state=42, max_depth=2500, criterion = "entropy", max_features = None)
path = clf.cost_complexity_pruning_path(X_train, y_train)
ccp_alphas, impurities = path.ccp_alphas, path.impurities

# Every alpha on the path is tested with both forest sizes.
parameters={'ccp_alpha':ccp_alphas,
            'n_estimators':[100,1000]}

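Because every alpha on the pruning path becomes a grid value, the number of candidates is the path length times the number of `n_estimators` values. The sketch below shows this on the small iris dataset, which is a stand-in assumption and not the notebook's data. For the real features, the 2252 candidates reported in the log imply a path of 1126 alphas.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import ParameterGrid
from sklearn.tree import DecisionTreeClassifier

X_demo, y_demo = load_iris(return_X_y=True)  # small stand-in dataset (assumption)

tree = DecisionTreeClassifier(random_state=42, criterion="entropy")
path = tree.cost_complexity_pruning_path(X_demo, y_demo)
ccp_alphas, impurities = path.ccp_alphas, path.impurities

# The path pairs each effective alpha with the total leaf impurity of the
# tree pruned at that alpha.
grid = ParameterGrid({"ccp_alpha": ccp_alphas, "n_estimators": [100, 1000]})
print(len(ccp_alphas), len(grid))
```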
In [None]:
inicioGeral_modelo = time.time()
model = treinar_modelos(Ensemble_methods1, parameters)
fimGeral = time.time()
tempo_Total = (fimGeral-inicioGeral_modelo)/60
print(f"time to run the whole notebook is approximately {int(tempo_Total/60)} hours and {round(tempo_Total%60,1)} minutes")

Fitting 7 folds for each of 2252 candidates, totalling 15764 fits


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


In [None]:
model.best_estimator_

In [None]:
model.best_score_

In [None]:
# https://kapernikov.com/tutorial-image-classification-with-scikit-learn/

In [None]:
# https://gogul.dev/software/image-classification-python

In [18]:
alphas = np.logspace(-6, -0.5, 80)
alphas

array([1.00000000e-06, 1.17387067e-06, 1.37797236e-06, 1.61756134e-06,
       1.89880782e-06, 2.22895482e-06, 2.61650470e-06, 3.07143813e-06,
       3.60547115e-06, 4.23235685e-06, 4.96823959e-06, 5.83207076e-06,
       6.84609684e-06, 8.03643231e-06, 9.43373222e-06, 1.10739816e-05,
       1.29994222e-05, 1.52596406e-05, 1.79128445e-05, 2.10273629e-05,
       2.46834047e-05, 2.89751249e-05, 3.40130494e-05, 3.99269212e-05,
       4.68690419e-05, 5.50181938e-05, 6.45842443e-05, 7.58135504e-05,
       8.89953035e-05, 1.04468977e-04, 1.22633068e-04, 1.43955363e-04,
       1.68984979e-04, 1.98366511e-04, 2.32856630e-04, 2.73343569e-04,
       3.20870000e-04, 3.76659883e-04, 4.42149991e-04, 5.19026908e-04,
       6.09270466e-04, 7.15204733e-04, 8.39557862e-04, 9.85532354e-04,
       1.15688753e-03, 1.35803634e-03, 1.59415904e-03, 1.87133654e-03,
       2.19670709e-03, 2.57865003e-03, 3.02700165e-03, 3.55330847e-03,
       4.17112461e-03, 4.89636086e-03, 5.74769442e-03, 6.74704993e-03,
      