# Chapter 5. Model evaluation and enhancement.
# Part 3. Quality metrics.
R^2 for regression and accuracy for classifiers are not always the metrics that are really needed, so there are others.

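There are 2 types of errors in binary classification:

1) False positive: a negative sample is predicted as positive

2) False negative: a positive sample is predicted as negative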

Accuracy is NOT an adequate metric for evaluating the predictive ability of a model fitted on an unbalanced dataset: a model that always predicts the majority class can still score high.
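A minimal sketch of why accuracy misleads on unbalanced data. The label vector (90 negatives, 10 positives) and the always-negative "model" are made up for illustration and are not taken from the dataset used in this chapter:

```python
import numpy as np

# Hypothetical labels: 90 negatives and 10 positives (illustrative only)
y_true = np.array([0] * 90 + [1] * 10)

# A "model" that ignores its input and always predicts the negative class
y_pred = np.zeros_like(y_true)

# Accuracy is the share of predictions that match the true labels
accuracy = float(np.mean(y_true == y_pred))

# Count how many of the actual positives were found
found_positives = int(((y_pred == 1) & (y_true == 1)).sum())

print("Accuracy: {:.2f}".format(accuracy))
print("Positives found: {} of {}".format(found_positives, int(y_true.sum())))
```

The accuracy comes out at 0.90 even though not a single positive sample is detected, which is exactly the situation accuracy alone cannot reveal.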

## - Confusion Matrices
One of the most informative ways to evaluate a model's predictive ability.

In [7]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

#-----setting up
#--loading dataset
digits = load_digits()
#'== 9' turns the task into a binary one: 'nine' (True) vs. 'not nine' (False), so roughly 10% of the samples are positive
y = digits.target == 9
X_train, X_test, y_train, y_test = train_test_split(digits.data, y, random_state=0)

#initializing, fitting and applying logreg
logreg = LogisticRegression(C=0.2).fit(X_train, y_train)
pred_logreg = logreg.predict(X_test)

#-----computing the confusion matrix
confusion = confusion_matrix(y_test, pred_logreg)
print('Confusion matrix: \n{}'.format(confusion))

Confusion matrix: 
[[402   1]
 [  6  41]]



^ The resulting array reads as follows:

Rows (2): the 2 actual classes, one row per class ("not nine" first, "nine" second)

Columns (2): the 2 predicted classes, one column per class, in the same order

Main diagonal: the numbers of correctly classified samples (402 true negatives and 41 true positives)

Off-diagonal elements: the numbers of mistakes (1 false positive in the top right, 6 false negatives in the bottom left)
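The layout described above can be sketched in code. The array below is simply the logreg confusion matrix copied from the output above as a literal, and the variable names `tn`, `fp`, `fn`, `tp` are chosen here for illustration:

```python
import numpy as np

# Confusion matrix reported above for logreg
# (rows: actual class, columns: predicted class; "not nine" first, "nine" second)
confusion = np.array([[402, 1],
                      [6, 41]])

# For a binary problem, flattening row by row yields TN, FP, FN, TP
tn, fp, fn, tp = confusion.ravel()

# Accuracy is the share of the main diagonal in the total
accuracy = (tn + tp) / confusion.sum()

print("TN={} FP={} FN={} TP={}".format(tn, fp, fn, tp))
print("Accuracy: {:.3f}".format(accuracy))
```

With these numbers the accuracy is about 0.984, while the two off-diagonal counts show which kind of error the model makes more often.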

#### Some examples for alternative models:
These models reach a seemingly good accuracy, but given the unbalanced dataset they were trained on AND the fact that some of them ignore the features altogether, another metric such as the confusion matrix is needed to check their actual predictive ability.

In [9]:
from sklearn.dummy import DummyClassifier
from sklearn.tree import DecisionTreeClassifier

#-----setting up models
#strategy - always predict the most frequent class
dummy_majority = DummyClassifier(strategy='most_frequent').fit(X_train, y_train)
pred_most_frequent = dummy_majority.predict(X_test)

#shallow decision tree (depth limited to 2)
tree = DecisionTreeClassifier(max_depth=2).fit(X_train, y_train)
pred_tree = tree.predict(X_test)

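#DummyClassifier with the default strategy ('prior' since scikit-learn 0.24, which also always predicts the most frequent class)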
dummy = DummyClassifier().fit(X_train, y_train)
pred_dummy = dummy.predict(X_test)

#checking the models' quality
print("<Most frequent> strategy:")
print(confusion_matrix(y_test, pred_most_frequent))
print("\n<Dummy model> strategy:")
print(confusion_matrix(y_test, pred_dummy))
print("\n<Tree model> strategy:")
print(confusion_matrix(y_test, pred_tree))

<Most frequent> strategy:
[[403   0]
 [ 47   0]]

<Dummy model> strategy:
[[403   0]
 [ 47   0]]

<Tree model> strategy:
[[390  13]
 [ 24  23]]


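^ Among the given alternatives only the tree model shows decent-ish results: both dummy models never predict "nine" and miss all 47 positive samples. Still, logreg was better (7 mistakes in total against 37 for the tree).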
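To put numbers on the comparison, here is a small sketch that recomputes accuracy from the four confusion matrices printed in this section. The matrices are copied from the outputs as literals, and the dictionary keys are labels chosen here for illustration:

```python
import numpy as np

# Confusion matrices copied from the outputs above
# (rows: actual class, columns: predicted class)
matrices = {
    "logreg": np.array([[402, 1], [6, 41]]),
    "most_frequent": np.array([[403, 0], [47, 0]]),
    "dummy": np.array([[403, 0], [47, 0]]),
    "tree": np.array([[390, 13], [24, 23]]),
}

summary = {}
for name, cm in matrices.items():
    # Correct predictions sit on the main diagonal
    accuracy = float(np.trace(cm) / cm.sum())
    # Bottom-right cell: actual "nine" samples predicted as "nine"
    nines_found = int(cm[1, 1])
    # Bottom row total: all actual "nine" samples
    nines_total = int(cm[1].sum())
    summary[name] = (accuracy, nines_found, nines_total)
    print("{:<14} accuracy={:.3f}  nines found: {}/{}".format(
        name, accuracy, nines_found, nines_total))
```

The dummy models land at roughly 0.896 accuracy while finding none of the 47 nines, and the tree at roughly 0.918 while finding 23, so the accuracy gap between them looks far smaller than the difference in what they actually detect.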