Different cross-validation techniques, applied to the Iris dataset, trade off speed, reliability, and how much of the data each model is trained on. The notes below describe the four methods compared and what can be inferred from each:

1. *Hold Out Cross-Validation*:
    - *Method*: The dataset is split into two parts: training and testing. A common split is 80/20 or 70/30.
    - *Inference*: This method is simple and quick but can be subject to variability depending on the split. If the training and testing sets are not representative, the model performance might be misleading. It provides a snapshot of model performance but doesn't utilize the entire dataset for training.

2. *K-Fold Cross-Validation*:
    - *Method*: The dataset is divided into k equal parts. The model is trained on k-1 parts and tested on the remaining part. This process is repeated k times, with each part used exactly once as the testing set.
    - *Inference*: K-Fold cross-validation provides a more reliable estimate of model performance compared to hold-out validation. By training and testing on different subsets of the data, it reduces the variance in performance estimation. It ensures that every data point is used for both training and testing, providing a more comprehensive assessment.

3. *Stratified K-Fold Cross-Validation*:
    - *Method*: Similar to K-Fold, but the data is split in such a way that each fold has approximately the same proportion of classes as the entire dataset.
    - *Inference*: Stratified K-Fold is especially useful for imbalanced datasets. The Iris dataset has balanced classes (50 samples each), but its samples are stored in class order, so a plain K-Fold without shuffling produces folds whose class mix differs from that of the full dataset. Stratification keeps every fold representative of the overall class distribution, which typically leads to more stable and reliable performance metrics than regular K-Fold.

4. *Leave One Out Cross-Validation (LOOCV)*:
    - *Method*: Each data point is used once as a test set while the remaining data points are used as the training set. This process is repeated for every data point in the dataset.
    - *Inference*: LOOCV uses the maximum amount of data for training in each iteration, providing a very thorough assessment. However, it is computationally expensive and can be impractical for large datasets. For the Iris dataset, with only 150 samples, LOOCV is feasible and gives a per-sample view of model performance, but the resulting estimate can have high variance because each test set is a single sample and the training sets overlap almost entirely.
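
To make the four descriptions above concrete, the following sketch counts how many train/test rounds each method produces on Iris and how large each test set is. The helper name `describe_splits` and the choice of 5 folds and a 70/30 hold-out split are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import (
    KFold,
    LeaveOneOut,
    StratifiedKFold,
    train_test_split,
)

X, y = load_iris(return_X_y=True)


def describe_splits():
    """Return {method: (number of rounds, test-set size in the first round)}."""
    summary = {}

    # Hold-out: a single 70/30 split (illustrative choice)
    _, x_test, _, _ = train_test_split(X, y, test_size=0.3, random_state=42)
    summary["hold-out"] = (1, len(x_test))

    # The three splitters share the same split(X, y) interface
    splitters = [
        ("k-fold", KFold(n_splits=5)),
        ("stratified k-fold", StratifiedKFold(n_splits=5)),
        ("loocv", LeaveOneOut()),
    ]
    for name, splitter in splitters:
        test_sizes = [len(test_idx) for _, test_idx in splitter.split(X, y)]
        summary[name] = (len(test_sizes), test_sizes[0])
    return summary


summary = describe_splits()
for name, (rounds, test_size) in summary.items():
    print("{}: {} round(s), {} test sample(s) per round".format(name, rounds, test_size))
```

The number of rounds is also the number of model fits, which is why LOOCV costs far more than 5-fold cross-validation on the same data.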

*Overall Comparison*:
- *Hold Out*: Quick and simple but less reliable due to potential data split variability.
- *K-Fold*: Balances computational efficiency and reliable performance estimation.
- *Stratified K-Fold*: Enhances K-Fold by ensuring class distribution consistency across folds, providing more stable results.
- *LOOCV*: Trains on nearly all the data in every round, giving the most detailed and least biased performance estimate, but it is computationally intensive and the estimate can have high variance.

For the Iris dataset, which is relatively small and balanced, *Stratified K-Fold* cross-validation often strikes the best balance between reliability and computational efficiency, ensuring that each fold is representative of the overall class distribution.
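
The claim about representative folds can be checked directly by counting the class labels in each test fold. This is a minimal sketch: the helper `fold_class_counts` is a name introduced here, and both splitters are used without shuffling so that the effect of the class-ordered Iris rows is visible.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, StratifiedKFold

X, y = load_iris(return_X_y=True)


def fold_class_counts(splitter):
    """Return, for each test fold, the number of samples of classes 0, 1 and 2."""
    return [
        np.bincount(y[test_idx], minlength=3).tolist()
        for _, test_idx in splitter.split(X, y)
    ]


plain_counts = fold_class_counts(KFold(n_splits=5))
stratified_counts = fold_class_counts(StratifiedKFold(n_splits=5))

print("K-Fold test folds:           ", plain_counts)
print("Stratified K-Fold test folds:", stratified_counts)
```

With plain K-Fold, some test folds contain a single class, whereas every stratified fold holds 10 samples of each class.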

In [2]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the 150-sample Iris dataset (features X, class labels Y)
iris = load_iris()
X = iris.data
Y = iris.target
print("Size of Dataset {}".format(len(X)))

# Hold out 30% of the data for testing; random_state fixes the split
logreg = LogisticRegression()
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=42)
logreg.fit(x_train, y_train)
predict = logreg.predict(x_test)

# accuracy_score expects (y_true, y_pred)
print("Accuracy score on training set is {}".format(accuracy_score(y_train, logreg.predict(x_train))))
print("Accuracy score on test set is {}".format(accuracy_score(y_test, predict)))

Size of Dataset 150
Accuracy score on training set is 0.9619047619047619
Accuracy score on test set is 1.0
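
The hold-out result above comes from a single split, and its test accuracy (1.0) is higher than its training accuracy (about 0.962), which suggests that this particular split may be a favourable one. The sketch below repeats the same 70/30 split with a few different seeds to show how much the test accuracy can move. The seeds, the helper name `holdout_accuracy`, and `max_iter=1000` are assumptions for illustration and are not part of the original cell.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)


def holdout_accuracy(seed):
    """Test accuracy of one 70/30 hold-out split drawn with the given seed."""
    x_train, x_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    # max_iter=1000 is an assumption, chosen only to avoid convergence warnings
    model = LogisticRegression(max_iter=1000).fit(x_train, y_train)
    return accuracy_score(y_test, model.predict(x_test))


seeds = [0, 1, 2, 3, 4]  # arbitrary illustrative seeds
accuracies = [holdout_accuracy(seed) for seed in seeds]
for seed, acc in zip(seeds, accuracies):
    print("random_state={}: test accuracy {:.3f}".format(seed, acc))
print("Spread (max - min): {:.3f}".format(max(accuracies) - min(accuracies)))
```

If the spread is noticeable, a single hold-out score should be read as one sample from a range of possible scores.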


In [3]:
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, KFold
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X = iris.data
Y = iris.target
logreg = LogisticRegression()

# 5 contiguous folds; KFold does not shuffle by default, and Iris rows are
# ordered by class, so the class mix differs from fold to fold
kf = KFold(n_splits=5)

# Fit and score the model once per fold
score = cross_val_score(logreg, X, Y, cv=kf)
print("Cross Validation Scores are {}".format(score))
print("Average Cross Validation score :{}".format(score.mean()))

Cross Validation Scores are [1.         1.         0.86666667 0.93333333 0.83333333]
Average Cross Validation score :0.9266666666666665
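
The notes describe Stratified K-Fold, but no cell runs it. A minimal sketch of what such a cell could look like is given below, using the same model and 5 folds as the K-Fold cell. The settings `shuffle=True`, `random_state=42`, and `max_iter=1000` are assumptions added here, so the scores it prints are not results from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)

# max_iter=1000 is an assumption, chosen only to avoid convergence warnings
logreg = LogisticRegression(max_iter=1000)

# Each fold keeps the 1:1:1 class ratio of the full dataset;
# shuffling with a fixed seed makes the folds reproducible
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

scores = cross_val_score(logreg, X, y, cv=skf)
print("Stratified Cross Validation Scores are {}".format(scores))
print("Average Stratified Cross Validation score: {}".format(scores.mean()))
```

Comparing the per-fold scores here with the unshuffled K-Fold scores above gives a rough sense of how much of the fold-to-fold variation comes from unrepresentative folds.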


In [4]:
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

iris = load_iris()
X = iris.data
Y = iris.target

# One split per sample: 150 rounds, each testing on a single held-out point
loo = LeaveOneOut()

# No random_state is set, so the exact scores can vary between runs
forest = RandomForestClassifier(n_estimators=10, max_depth=5, n_jobs=-1)
score = cross_val_score(forest, X, Y, cv=loo)
print("Cross Validation Scores are {}".format(score))
print("Average Cross Validation score :{}".format(score.mean()))

Cross Validation Scores are [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 1.
 1. 1. 1. 1. 1. 0. 1. 1. 1. 1. 1. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 1. 1. 1. 0. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1.]
Average Cross Validation score :0.9533333333333334
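
Each LOOCV score is either 0 or 1, so the average is simply the fraction of samples classified correctly; the 0.9533 above corresponds to 143 of 150 samples. The sketch below reports that count directly instead of printing all 150 scores. Fixing `random_state=0` is an assumption added for reproducibility, so the number of errors it reports may differ from the run shown above.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = load_iris(return_X_y=True)

# random_state=0 is an assumption; the original cell leaves the forest unseeded
forest = RandomForestClassifier(n_estimators=10, max_depth=5, random_state=0)
scores = cross_val_score(forest, X, y, cv=LeaveOneOut())

# A score of 0 means the single held-out sample was misclassified
n_errors = int((scores == 0).sum())
print("Rounds: {}".format(len(scores)))
print("Misclassified samples: {}".format(n_errors))
print("LOOCV accuracy: {:.4f}".format(scores.mean()))
```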
