# 03. A Tour of Machine Learning Classifiers Using scikit-learn

In [1]:
%run -i  'watermark.py'

2020-01-05 18:08:41
----------------------
python		3.6.7
----------------------
numpy		1.16.2
scipy		1.1.0
pandas		0.25.1
matplotlib	3.1.1
imageio		2.5.0
----------------------
ipython		7.8.0
----------------------
sklearn		0.20.4
tensorflow	1.13.1
nltk		3.2.4


## 03.01. Choosing a classification algorithm
## 03.02. First steps with scikit-learn: training a perceptron

In [2]:
from sklearn import datasets
import numpy as np

iris = datasets.load_iris()
# Use only petal length and petal width (columns 2 and 3) as features
X = iris.data[:, [2, 3]]
y = iris.target

print('Class labels:', np.unique(y))

Class labels: [0 1 2]


In [3]:
from sklearn.model_selection import train_test_split

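# Hold out 30% of the data for testing; stratify=y keeps the class proportions equal in both splits
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state=1, stratify=y)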

In [4]:
print('Labels counts in y:', np.bincount(y))

print('Labels counts in y_train:', np.bincount(y_train))

print('Labels counts in y_test:', np.bincount(y_test))

Labels counts in y: [50 50 50]
Labels counts in y_train: [35 35 35]
Labels counts in y_test: [15 15 15]
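
The counts above suggest that the stratified split kept the class balance intact. As a quick sanity check, the following sketch recomputes the per-class shares from the printed counts using only the standard library. The dictionary `counts` and the helper `class_shares` are illustrative names, not part of the notebook or of scikit-learn.

```python
from fractions import Fraction

# Label counts copied from the output of the cell above
counts = {
    "y": [50, 50, 50],
    "y_train": [35, 35, 35],
    "y_test": [15, 15, 15],
}


def class_shares(class_counts):
    """Return each class's share of the total as an exact fraction."""
    total = sum(class_counts)
    return [Fraction(n, total) for n in class_counts]


shares = {name: class_shares(c) for name, c in counts.items()}

for name, s in shares.items():
    print(name, [str(f) for f in s])

# Fraction of all examples that ended up in the test split
test_fraction = Fraction(sum(counts["y_test"]), sum(counts["y"]))
print("test fraction:", test_fraction)
```

Each class makes up one third of the full dataset, the training split, and the test split, and the test split holds 3/10 of the examples, matching `test_size=.3`.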


In [5]:
from sklearn.preprocessing import StandardScaler

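sc = StandardScaler()
# Estimate the mean and standard deviation from the training data only
sc.fit(X_train)
# Apply the same training-set parameters to both splits
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)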
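
To make the standardization step more concrete, here is a minimal standard-library sketch of the same idea for a single feature: estimate the mean and the population standard deviation on the training values, then reuse those two numbers for the test values. The lists `train` and `test` hold made-up values for illustration only; they are not taken from the Iris data, and this is not how `StandardScaler` is implemented internally.

```python
import statistics

# Hypothetical single-feature values, for illustration only
train = [1.4, 4.7, 5.1, 1.3, 6.0]
test = [1.5, 4.5]

# Parameters are estimated from the training values only
mu = statistics.mean(train)
sigma = statistics.pstdev(train)  # population standard deviation (ddof=0)

# The same mu and sigma are applied to both splits
train_std = [(x - mu) / sigma for x in train]
test_std = [(x - mu) / sigma for x in test]

print("train mean after scaling:", round(statistics.mean(train_std), 6))
print("train std after scaling:", round(statistics.pstdev(train_std), 6))
print("test values after scaling:", [round(z, 3) for z in test_std])
```

After scaling, the training values have mean 0 and standard deviation 1. The test values generally do not, because they are scaled with the training parameters so that both splits stay comparable.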

In [6]:
from sklearn.linear_model import Perceptron

ppn = Perceptron(eta0=.1, random_state=1)
ppn.fit(X_train_std, y_train)



Perceptron(alpha=0.0001, class_weight=None, early_stopping=False, eta0=0.1,
      fit_intercept=True, max_iter=None, n_iter=None, n_iter_no_change=5,
      n_jobs=None, penalty=None, random_state=1, shuffle=True, tol=None,
      validation_fraction=0.1, verbose=0, warm_start=False)

In [8]:
y_pred = ppn.predict(X_test_std)

print('Misclassified examples: %d' % (y_test != y_pred).sum())

Misclassified examples: 11


In [9]:
from sklearn.metrics import accuracy_score

print('Accuracy: %.3f' % accuracy_score(y_test, y_pred))

Accuracy: 0.756


In [10]:
print('Accuracy: %.3f' % ppn.score(X_test_std, y_test))

Accuracy: 0.756
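
The accuracy reported by both calls follows directly from the misclassification count printed earlier. The sketch below redoes that arithmetic by hand, taking the test-set size (15 + 15 + 15 = 45) and the 11 misclassified examples from the outputs above. The variable names are illustrative.

```python
# Numbers taken from the notebook outputs above
n_test = 45            # 15 examples per class in y_test
n_misclassified = 11   # from (y_test != y_pred).sum()

# Classification error and accuracy are complements of each other
error = n_misclassified / n_test
accuracy = 1 - error

print('Error: %.3f' % error)
print('Accuracy: %.3f' % accuracy)
```

The result agrees with the 0.756 returned by `accuracy_score` and `ppn.score`.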


## 03.03. Modeling class probabilities via logistic regression
### 03.03.01. Logistic regression intuition & conditional probabilities
### 03.03.02. Learning the weights of the logistic cost function
### 03.03.03. Converting an Adaline implementation into an algorithm for logistic regression
### 03.03.04. Training a logistic regression model with scikit-learn
### 03.03.05. Tackling overfitting via regularization
## 03.04. Maximum margin classification with support vector machines
### 03.04.01. Maximum margin intuition
### 03.04.02. Dealing with the nonlinearly separable case using slack variables
### 03.04.03. Alternative implementations in scikit-learn
## 03.05. Solving non-linear problems using a kernel SVM
### 03.05.01. Kernel methods for linearly inseparable data
### 03.05.02. Using the kernel trick to find separating hyperplanes in higher dimensional space
## 03.06. Decision tree learning
### 03.06.01. Maximizing information gain: getting the most bang for the buck
### 03.06.02. Building a decision tree
### 03.06.03. Combining weak to strong learners via random forests
## 03.07. K-nearest neighbors: a lazy learning algorithm
## 03.08. Summary

- [03.01. Choosing a classification algorithm][0301]
- [03.02. First steps with scikit-learn: training a perceptron][0302]
- [03.03. Modeling class probabilities via logistic regression][0303]
    - [03.03.01. Logistic regression intuition & conditional probabilities][030301]
    - [03.03.02. Learning the weights of the logistic cost function][030302]
    - [03.03.03. Converting an Adaline implementation into an algorithm for logistic regression][030303]
    - [03.03.04. Training a logistic regression model with scikit-learn][030304]
    - [03.03.05. Tackling overfitting via regularization][030305]
- [03.04. Maximum margin classification with support vector machines][0304]
    - [03.04.01. Maximum margin intuition][030401]
    - [03.04.02. Dealing with the nonlinearly separable case using slack variables][030402]
    - [03.04.03. Alternative implementations in scikit-learn][030403]
- [03.05. Solving non-linear problems using a kernel SVM][0305]
    - [03.05.01. Kernel methods for linearly inseparable data][030501]
    - [03.05.02. Using the kernel trick to find separating hyperplanes in higher dimensional space][030502]
- [03.06. Decision tree learning][0306]
    - [03.06.01. Maximizing information gain: getting the most bang for the buck][030601]
    - [03.06.02. Building a decision tree][030602]
    - [03.06.03. Combining weak to strong learners via random forests][030603]
- [03.07. K-nearest neighbors: a lazy learning algorithm][0307]
- [03.08. Summary][0308]