
# Solution to the following programming question: 


Using sklearn.model_selection.permutation_test_score, compute the p-value indicating if the score obtained by an instance of sklearn.dummy.DummyClassifier on the dataset sklearn.datasets.load_iris is obtained by chance.

Repeat this analysis using sklearn.ensemble.HistGradientBoostingClassifier instead of sklearn.dummy.DummyClassifier.

In [11]:
# Author:  Irene Markelic <irene@markelic.de>

## Dataset

As asked, we will use the iris dataset (loaded with `load_iris`), which consists of
sepal and petal measurements taken from 3 types of irises.

In [12]:
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
y = iris.target

## Apply the two different models to permutation_test_score and compute the corresponding p-values


In [13]:
from sklearn.model_selection import StratifiedKFold, permutation_test_score
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import HistGradientBoostingClassifier

# Baseline classifier that ignores the input features
dummy_clf = DummyClassifier()
grad_boosting_clf = HistGradientBoostingClassifier(max_bins=255, max_iter=100)

# 2-fold stratified cross-validation, shuffled with a fixed seed
cv = StratifiedKFold(2, shuffle=True, random_state=0)

# Each call returns the score on the true labels, the scores obtained on the
# permuted labels, and the resulting p-value
score_dummy, perm_scores_dummy, pvalue_dummy = permutation_test_score(
    dummy_clf, X, y, scoring="accuracy", cv=cv, n_permutations=100
)

score_boost, perm_scores_boost, pvalue_boost = permutation_test_score(
    grad_boosting_clf, X, y, scoring="accuracy", cv=cv, n_permutations=100
)
print('this is pvalue_dummy: ', pvalue_dummy)
print('this is pvalue_boost: ', pvalue_boost)


this is pvalue_dummy:  1.0
this is pvalue_boost:  0.009900990099009901


# Answers to questions: 

Q1: What can you conclude about the existence of a significant statistical association between the iris type and the input features (petal and sepal width and length)? 

Q2: What can you conclude about the ability of each kind of estimator to assess or not such a statistical association between features and target variable?

The null hypothesis is that the features and the target are independent, so that the score obtained on the
real labels is no better than a score obtained by chance. If the p-value is very small, e.g. smaller than
0.025, we reject the null hypothesis and have reason to believe that the model score was not obtained
by random guessing.
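
To make the p-value concrete, here is a minimal sketch of the formula given in the scikit-learn documentation for `permutation_test_score`: (C + 1) / (n_permutations + 1), where C is the number of permutations whose score is at least as good as the score on the true labels. The helper name `permutation_p_value` and the score values below are hypothetical and only serve as an illustration; they are not taken from the run above.

```python
def permutation_p_value(score, perm_scores):
    """Share of permutations scoring at least as well as the true labels,
    with the +1 correction in numerator and denominator."""
    n_at_least_as_good = sum(s >= score for s in perm_scores)
    return (n_at_least_as_good + 1) / (len(perm_scores) + 1)


# Hypothetical scores for 100 permutations, all close to chance level.
perm_scores = [0.33] * 100

# A model that beats every permutation gets the smallest possible p-value.
p_strong = permutation_p_value(0.95, perm_scores)  # 1/101, about 0.0099

# A model that never beats the permutations gets a p-value of 1.0.
p_chance = permutation_p_value(0.33, perm_scores)

print(p_strong, p_chance)
```

With 100 permutations the p-value can therefore never fall below 1/101, no matter how good the model is; a smaller p-value would require more permutations.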

The p-value for the dummy classifier is: 1.0
The p-value for the HistGradientBoostingClassifier is: 0.0099 (i.e. 1/101, the smallest value attainable with 100 permutations)

Thus, for the dummy classifier we cannot reject the null hypothesis: its score does not differ (significantly) from random guessing. For the boosting-based classifier, however, we reject the null hypothesis and conclude that its score is very unlikely to have been obtained by chance. This shows that there is a significant statistical association between the iris type and the input features (Q1), and that the boosting-based classifier captures it (Q2).

The dummy classifier, by contrast, ignores the input features, so its score is essentially unchanged when the labels are permuted. Its large p-value does not mean that no association exists; it means that this kind of estimator cannot detect one, unlike the boosting-based classifier (Q2).
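
The behaviour of the dummy classifier can be illustrated with a small pure-Python sketch. It is a simplified stand-in, not `DummyClassifier` itself: the hypothetical helper `majority_accuracy` always predicts the most frequent training label, there is no cross-validation, and the labels are a toy list balanced like iris (3 classes, 50 samples each). Under these assumptions, shuffling the labels cannot change the score.

```python
import random
from collections import Counter


def majority_accuracy(y_train, y_test):
    """Accuracy of always predicting the most frequent training label."""
    majority_label = Counter(y_train).most_common(1)[0][0]
    return sum(label == majority_label for label in y_test) / len(y_test)


# Toy labels balanced like iris (assumption: 3 classes, 50 samples each).
y = [0] * 50 + [1] * 50 + [2] * 50

rng = random.Random(0)
score_true = majority_accuracy(y, y)

perm_scores = []
for _ in range(100):
    y_perm = y[:]
    rng.shuffle(y_perm)  # break any link between samples and labels
    perm_scores.append(majority_accuracy(y_perm, y_perm))

# Every permutation ties with the true score, so C = 100 and p = 101/101.
p_value = (sum(s >= score_true for s in perm_scores) + 1) / (len(perm_scores) + 1)
print(score_true, p_value)
```

Because a feature-blind predictor only sees the label counts, which a permutation leaves untouched, every permuted score ties with the original one and the p-value is 1.0.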