# [scikit-learn](https://scikit-learn.org/stable/)

## User Guide

1. [Supervised learning](https://scikit-learn.org/stable/supervised_learning.html)
    1. [Linear Models](https://scikit-learn.org/stable/modules/linear_model.html)
    2. [Linear and Quadratic Discriminant Analysis](https://scikit-learn.org/stable/modules/lda_qda.html)
    3. [Kernel ridge regression](https://scikit-learn.org/stable/modules/kernel_ridge.html)
    4. [Support Vector Machines](https://scikit-learn.org/stable/modules/svm.html)
    5. [Stochastic Gradient Descent](https://scikit-learn.org/stable/modules/sgd.html)
    6. [Nearest Neighbors](https://scikit-learn.org/stable/modules/neighbors.html)
    7. [Gaussian Processes](https://scikit-learn.org/stable/modules/gaussian_process.html)
    8. [Cross decomposition](https://scikit-learn.org/stable/modules/cross_decomposition.html)
    9. [Naive Bayes](https://scikit-learn.org/stable/modules/naive_bayes.html)
    10. [Decision Trees](https://scikit-learn.org/stable/modules/tree.html)
    11. [Ensemble methods](https://scikit-learn.org/stable/modules/ensemble.html)
    12. [Multiclass and multioutput algorithms](https://scikit-learn.org/stable/modules/multiclass.html)
    13. [Feature selection](https://scikit-learn.org/stable/modules/feature_selection.html)
    14. [Semi-supervised learning](https://scikit-learn.org/stable/modules/semi_supervised.html)
    15. [Isotonic regression](https://scikit-learn.org/stable/modules/isotonic.html)
    16. [Probability calibration](https://scikit-learn.org/stable/modules/calibration.html)
    17. [Neural network models (supervised)](https://scikit-learn.org/stable/modules/neural_networks_supervised.html)
    
2. [Unsupervised learning](https://scikit-learn.org/stable/unsupervised_learning.html)
    1. [Gaussian mixture models](https://scikit-learn.org/stable/modules/mixture.html)
    2. [Manifold learning](https://scikit-learn.org/stable/modules/manifold.html)
    3. [Clustering](https://scikit-learn.org/stable/modules/clustering.html)
    4. [Biclustering](https://scikit-learn.org/stable/modules/biclustering.html)
    5. [Decomposing signals in components (matrix factorization problems)](https://scikit-learn.org/stable/modules/decomposition.html)
    6. [Covariance estimation](https://scikit-learn.org/stable/modules/covariance.html)
    7. [Novelty and Outlier Detection](https://scikit-learn.org/stable/modules/outlier_detection.html)
    8. [Density Estimation](https://scikit-learn.org/stable/modules/density.html)
    9. [Neural network models (unsupervised)](https://scikit-learn.org/stable/modules/neural_networks_unsupervised.html)
3. [Model selection and evaluation](https://scikit-learn.org/stable/model_selection.html)
    1. [Cross-validation: evaluating estimator performance](https://scikit-learn.org/stable/modules/cross_validation.html)
    2. [Tuning the hyper-parameters of an estimator](https://scikit-learn.org/stable/modules/grid_search.html)
    3. [Metrics and scoring: quantifying the quality of predictions](https://scikit-learn.org/stable/modules/model_evaluation.html)
    4. [Validation curves: plotting scores to evaluate models](https://scikit-learn.org/stable/modules/learning_curve.html)
4. [Inspection](https://scikit-learn.org/stable/inspection.html)
    1. [Partial Dependence and Individual Conditional Expectation plots](https://scikit-learn.org/stable/modules/partial_dependence.html)
    2. [Permutation feature importance](https://scikit-learn.org/stable/modules/permutation_importance.html)
5. [Visualizations](https://scikit-learn.org/stable/visualizations.html)
    1. Available Plotting Utilities
6. [Dataset transformations](https://scikit-learn.org/stable/data_transforms.html)
    1. [Pipelines and composite estimators](https://scikit-learn.org/stable/modules/compose.html)
    2. [Feature extraction](https://scikit-learn.org/stable/modules/feature_extraction.html)
    3. [Preprocessing data](https://scikit-learn.org/stable/modules/preprocessing.html)
    4. [Imputation of missing values](https://scikit-learn.org/stable/modules/impute.html)
    5. [Unsupervised dimensionality reduction](https://scikit-learn.org/stable/modules/unsupervised_reduction.html)
    6. [Random Projection](https://scikit-learn.org/stable/modules/random_projection.html)
    7. [Kernel Approximation](https://scikit-learn.org/stable/modules/kernel_approximation.html)
    8. [Pairwise metrics, Affinities and Kernels](https://scikit-learn.org/stable/modules/metrics.html)
    9. [Transforming the prediction target (y)](https://scikit-learn.org/stable/modules/preprocessing_targets.html)
7. Dataset loading utilities
    1. [Toy datasets](https://scikit-learn.org/stable/datasets/toy_dataset.html)
    2. [Real world datasets](https://scikit-learn.org/stable/datasets/real_world.html)
    3. [Generated datasets](https://scikit-learn.org/stable/datasets/sample_generators.html)
    4. [Loading other datasets](https://scikit-learn.org/stable/datasets/loading_other_datasets.html)
8. Computing with scikit-learn
    1. [Strategies to scale computationally: bigger data](https://scikit-learn.org/stable/computing/scaling_strategies.html)
    2. [Computational Performance](https://scikit-learn.org/stable/computing/computational_performance.html)
    3. [Parallelism, resource management, and configuration](https://scikit-learn.org/stable/computing/parallelism.html)
9. [Model persistence](https://scikit-learn.org/stable/modules/model_persistence.html)
    1. Python specific serialization
    2. Interoperable formats
10. [Common pitfalls and recommended practices](https://scikit-learn.org/stable/common_pitfalls.html)
    1. Inconsistent preprocessing
    2. Data leakage
    3. Controlling randomness

In [28]:
import numpy as np
import sklearn as sk 
sk.__version__


'0.24.1'

# Dataset

In [None]:
## Dataset Function

#1. Load and return the boston house-prices dataset (regression).
datasets.load_boston(*[, return_X_y])

#2. Load and return the iris dataset (classification).
datasets.load_iris(*[, return_X_y, as_frame])

#3. Load and return the diabetes dataset (regression).
datasets.load_diabetes(*[, return_X_y, as_frame])

#4. Load and return the digits dataset (classification).
datasets.load_digits(*[, n_class, return_X_y, as_frame])

#5. Load and return the physical exercise linnerud dataset.
datasets.load_linnerud(*[, return_X_y, as_frame])

#6. Load and return the wine dataset (classification).
datasets.load_wine(*[, return_X_y, as_frame])

#7. Load and return the breast cancer wisconsin dataset (classification).
datasets.load_breast_cancer(*[, return_X_y, as_frame])

#8. Load the Olivetti faces dataset from AT&T (classification).
datasets.fetch_olivetti_faces(*[, data_home, …])

#9. Load the filenames and data from the 20 newsgroups dataset (classification).
datasets.fetch_20newsgroups(*[, data_home, subset, …])

#10. Load and vectorize the 20 newsgroups dataset (classification).
datasets.fetch_20newsgroups_vectorized(*[, subset, …])

#11. Load the Labeled Faces in the Wild (LFW) people dataset (classification).
datasets.fetch_lfw_people(*[, data_home, funneled, …])

#12. Load the Labeled Faces in the Wild (LFW) pairs dataset (classification).
datasets.fetch_lfw_pairs(*[, subset, data_home, …])

#13. Load the covertype dataset (classification).
datasets.fetch_covtype(*[, data_home, …])

#14. Load the RCV1 multilabel dataset (classification).
datasets.fetch_rcv1(*[, data_home, subset, …])

#15. Load the kddcup99 dataset (classification).
datasets.fetch_kddcup99(*[, subset, data_home, …])

#16. Load the California housing dataset (regression).
datasets.fetch_california_housing(*[, data_home, …])
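
Most of the loaders listed above accept the keyword-only parameter `return_X_y`. As a minimal sketch, assuming the bundled iris dataset (no download needed), `return_X_y=True` returns a `(data, target)` tuple instead of a Bunch:

```python
from sklearn import datasets

# Default call: a Bunch holding data, target and metadata.
iris = datasets.load_iris()

# return_X_y=True: only the feature matrix and the target vector.
X, y = datasets.load_iris(return_X_y=True)

print(X.shape, y.shape)
# Both calls expose the same underlying values.
print((X == iris.data).all(), (y == iris.target).all())
```

Where a loader also lists `as_frame`, passing `as_frame=True` returns the data as pandas objects instead of NumPy arrays, which requires pandas to be installed.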

In [31]:
#from sklearn.datasets import load_iris
from sklearn import datasets
type(datasets.load_iris())

sklearn.utils.Bunch
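
A `Bunch` is a dictionary subclass whose keys can also be read as attributes. The sketch below, again assuming the iris dataset, shows both access styles and the fields most often used:

```python
from sklearn import datasets

iris = datasets.load_iris()

# Bunch subclasses dict, so key access and attribute access are equivalent.
print(isinstance(iris, dict))
print(sorted(iris.keys()))
print(iris["data"] is iris.data)

# Fields commonly used with the bundled classification datasets.
print(iris.feature_names)
print(iris.target_names)
print(iris.data.shape, iris.target.shape)
```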

In [15]:
#Load and return the boston house-prices dataset (regression).
type(datasets.load_boston())

sklearn.utils.Bunch
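
Note that `load_boston` was deprecated in scikit-learn 1.0 and removed in 1.2, so the cell above only runs on older releases such as the 0.24.1 used here. The following is a hedged sketch of a guard for newer installations; the fallback to `load_diabetes` is only an illustrative choice of another bundled regression dataset, not an equivalent replacement:

```python
import warnings

try:
    # Available up to scikit-learn 1.1; removed in 1.2.
    from sklearn.datasets import load_boston

    with warnings.catch_warnings():
        # Releases 1.0 and 1.1 emit a FutureWarning on use.
        warnings.simplefilter("ignore", FutureWarning)
        X, y = load_boston(return_X_y=True)
    name = "boston"
except (ImportError, AttributeError):
    # Illustrative fallback: another regression dataset shipped with the library.
    from sklearn.datasets import load_diabetes

    X, y = load_diabetes(return_X_y=True)
    name = "diabetes"

print(name, X.shape, y.shape)
```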

In [13]:
#Load and return the iris dataset (classification).
datasets.load_iris()

{'data': array([[5.1, 3.5, 1.4, 0.2],
        [4.9, 3. , 1.4, 0.2],
        [4.7, 3.2, 1.3, 0.2],
        [4.6, 3.1, 1.5, 0.2],
        [5. , 3.6, 1.4, 0.2],
        [5.4, 3.9, 1.7, 0.4],
        [4.6, 3.4, 1.4, 0.3],
        [5. , 3.4, 1.5, 0.2],
        [4.4, 2.9, 1.4, 0.2],
        [4.9, 3.1, 1.5, 0.1],
        [5.4, 3.7, 1.5, 0.2],
        [4.8, 3.4, 1.6, 0.2],
        [4.8, 3. , 1.4, 0.1],
        [4.3, 3. , 1.1, 0.1],
        [5.8, 4. , 1.2, 0.2],
        [5.7, 4.4, 1.5, 0.4],
        [5.4, 3.9, 1.3, 0.4],
        [5.1, 3.5, 1.4, 0.3],
        [5.7, 3.8, 1.7, 0.3],
        [5.1, 3.8, 1.5, 0.3],
        [5.4, 3.4, 1.7, 0.2],
        [5.1, 3.7, 1.5, 0.4],
        [4.6, 3.6, 1. , 0.2],
        [5.1, 3.3, 1.7, 0.5],
        [4.8, 3.4, 1.9, 0.2],
        [5. , 3. , 1.6, 0.2],
        [5. , 3.4, 1.6, 0.4],
        [5.2, 3.5, 1.5, 0.2],
        [5.2, 3.4, 1.4, 0.2],
        [4.7, 3.2, 1.6, 0.2],
        [4.8, 3.1, 1.6, 0.2],
        [5.4, 3.4, 1.5, 0.4],
        [5.2, 4.1, 1.5, 0.1],
  

In [None]:
#Load and return the diabetes dataset (regression).
datasets.load_diabetes()
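
The cell above was not executed, so a small sketch of how the returned Bunch might be inspected follows. It relies only on the standard `data`, `target` and `feature_names` fields:

```python
from sklearn import datasets

diabetes = datasets.load_diabetes()

# 442 samples, 10 numeric features, one continuous target per sample.
print(diabetes.data.shape, diabetes.target.shape)
print(diabetes.feature_names)

# First few target values (disease progression measure).
print(diabetes.target[:5])
```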

In [None]:
#Load and return the digits dataset (classification).
#n_class is keyword-only: the number of classes to return, between 0 and 10 (default 10).
datasets.load_digits(n_class=10)
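
To illustrate the `n_class` parameter, here is a minimal sketch that loads the full digits dataset and then a two-class subset. The choice of `n_class=2` is arbitrary and only for demonstration:

```python
import numpy as np
from sklearn import datasets

# All ten digit classes (the default, n_class=10).
digits = datasets.load_digits()
print(digits.data.shape, digits.images.shape)

# Keep only the first two classes (digits 0 and 1).
X, y = datasets.load_digits(n_class=2, return_X_y=True)
print(X.shape, np.unique(y))
```

Each row of `data` is a flattened 8x8 image, and the same pixels are available unflattened in `images`.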

In [11]:
#Load and return the linnerud dataset (multivariate regression).
datasets.load_linnerud()

{'data': array([[  5., 162.,  60.],
        [  2., 110.,  60.],
        [ 12., 101., 101.],
        [ 12., 105.,  37.],
        [ 13., 155.,  58.],
        [  4., 101.,  42.],
        [  8., 101.,  38.],
        [  6., 125.,  40.],
        [ 15., 200.,  40.],
        [ 17., 251., 250.],
        [ 17., 120.,  38.],
        [ 13., 210., 115.],
        [ 14., 215., 105.],
        [  1.,  50.,  50.],
        [  6.,  70.,  31.],
        [ 12., 210., 120.],
        [  4.,  60.,  25.],
        [ 11., 230.,  80.],
        [ 15., 225.,  73.],
        [  2., 110.,  43.]]),
 'feature_names': ['Chins', 'Situps', 'Jumps'],
 'target': array([[191.,  36.,  50.],
        [189.,  37.,  52.],
        [193.,  38.,  58.],
        [162.,  35.,  62.],
        [189.,  35.,  46.],
        [182.,  36.,  56.],
        [211.,  38.,  56.],
        [167.,  34.,  60.],
        [176.,  31.,  74.],
        [154.,  33.,  56.],
        [169.,  34.,  50.],
        [166.,  33.,  52.],
        [154.,  34.,  64.],
        