# Practicum in human intelligent information processing
**Prof. Tomohiro Shibata**

Kyushu Institute of Technology

## Content
* numpy
* matplotlib
* scipy

In [144]:
%doctest_mode 
%matplotlib inline

Exception reporting mode: Plain
Doctest mode is: ON


In [145]:
import numpy as np
import matplotlib.pyplot as plt
 

**Example 1: creating a NumPy array with a built-in function**

In [146]:
a=np.arange(15).reshape(3,5)

In [147]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [148]:
a.shape

(3, 5)

In [149]:
a.ndim

2

In [150]:
a.dtype.name

'int32'

In [151]:
a.size

15

In [152]:
a.itemsize

4

In [153]:
type(a)

<class 'numpy.ndarray'>

**NumPy array declaration and common errors**

In [154]:
#a = np.array(1, 2, 3, 4)   # WRONG: the values must be passed as a single sequence
a = np.array([1, 2, 3, 4])  # RIGHT
b = np.array([(1.5, 2, 3), (4, 5, 6)])


In [155]:
a
b

array([[ 1.5,  2. ,  3. ],
       [ 4. ,  5. ,  6. ]])

**The type of the array can also be explicitly specified at creation time**

In [156]:
c = np.array( [ [1,2], [3,4] ], dtype=complex )


In [157]:
c

array([[ 1.+0.j,  2.+0.j],
       [ 3.+0.j,  4.+0.j]])

In [158]:
c.dtype

dtype('complex128')

**Some useful arrays**

In [159]:
q = np.zeros((3, 4))
w = np.ones((2, 3, 4), dtype=np.int16)  # dtype can also be specified
x = np.empty((2, 3))  # np.empty does not initialize its entries, so the output may vary

In [160]:
q
w

array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)

In [161]:
x

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

### To create a _sequence_ of numbers, **NumPy** provides a function analogous to _range_ that returns an _array_ instead of a list

In [162]:
np.arange(10,30,5)

array([10, 15, 20, 25])

In [163]:
np.arange(0,2,0.3) # it can accept float arguments

array([ 0. ,  0.3,  0.6,  0.9,  1.2,  1.5,  1.8])

### When **arange** is used with floating point arguments, it is generally not possible to predict the number of elements obtained, due to the **finite floating point precision**. For this reason, it is usually better to use the function **linspace**, which receives as an **_argument_** the number of elements that we want instead of the step:
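
The point about element counts can be checked directly. The following is a minimal sketch; the interval 1.0 to 1.3 and the variable names `stepped` and `spaced` are chosen here for illustration and are not part of the original notebook.

```python
import numpy as np

# With a float step, the element count follows from rounded floating point
# arithmetic and may differ from what the decimal notation suggests.
stepped = np.arange(1.0, 1.3, 0.1)
print(len(stepped), stepped)

# linspace takes the number of points directly; both endpoints are included.
spaced = np.linspace(1.0, 1.3, 4)
print(len(spaced), spaced)
```

Because `linspace` is told how many points to produce, its length never depends on rounding.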

In [164]:
np.linspace(0,2,9) # 9 numbers from 0 to 2

array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ,  1.25,  1.5 ,  1.75,  2.  ])

In [165]:
from numpy import pi

In [166]:
x = np.linspace( 0, 2*pi, 100 )        # useful to evaluate function at lots of points
f = np.sin(x)


In [167]:
f

array([  0.00000000e+00,   6.34239197e-02,   1.26592454e-01,
         1.89251244e-01,   2.51147987e-01,   3.12033446e-01,
         3.71662456e-01,   4.29794912e-01,   4.86196736e-01,
         5.40640817e-01,   5.92907929e-01,   6.42787610e-01,
         6.90079011e-01,   7.34591709e-01,   7.76146464e-01,
         8.14575952e-01,   8.49725430e-01,   8.81453363e-01,
         9.09631995e-01,   9.34147860e-01,   9.54902241e-01,
         9.71811568e-01,   9.84807753e-01,   9.93838464e-01,
         9.98867339e-01,   9.99874128e-01,   9.96854776e-01,
         9.89821442e-01,   9.78802446e-01,   9.63842159e-01,
         9.45000819e-01,   9.22354294e-01,   8.95993774e-01,
         8.66025404e-01,   8.32569855e-01,   7.95761841e-01,
         7.55749574e-01,   7.12694171e-01,   6.66769001e-01,
         6.18158986e-01,   5.67059864e-01,   5.13677392e-01,
         4.58226522e-01,   4.00930535e-01,   3.42020143e-01,
         2.81732557e-01,   2.20310533e-01,   1.58001396e-01,
         9.50560433e-02,

### **Printing arrays**

In [168]:
a = np.arange(6)                         # 1d array
print(a)
a

[0 1 2 3 4 5]


array([0, 1, 2, 3, 4, 5])

In [169]:
b = np.arange(12).reshape(4,3)           # 2d array
print(b)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]


In [170]:
c = np.arange(24).reshape(2,3,4)         # 3d array
print(c)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]


### **If an array is too large to be printed, NumPy automatically skips the central part of the array and only prints the corners:**

In [171]:
print(np.arange(10000))

[   0    1    2 ..., 9997 9998 9999]


In [172]:
print(np.arange(10000).reshape(100,100))

[[   0    1    2 ...,   97   98   99]
 [ 100  101  102 ...,  197  198  199]
 [ 200  201  202 ...,  297  298  299]
 ..., 
 [9700 9701 9702 ..., 9797 9798 9799]
 [9800 9801 9802 ..., 9897 9898 9899]
 [9900 9901 9902 ..., 9997 9998 9999]]


### **To disable this behaviour and force NumPy to print the entire array, you can change the printing options using set_printoptions.**

In [173]:
#np.set_printoptions(threshold=sys.maxsize)  # needs "import sys"; threshold must be a number, not the string 'nan'
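
As a hedged illustration of the printing threshold, the sketch below compares the summarized and the full text form of an array. It assumes a NumPy version that provides the `np.printoptions` context manager (1.15 or later); the names `arr`, `summarized`, and `full` are illustrative only.

```python
import sys
import numpy as np

arr = np.arange(2000)

# Default options: arrays above the threshold are summarized with an ellipsis.
summarized = str(arr)

# Temporarily raise the threshold so that every element is printed.
with np.printoptions(threshold=sys.maxsize):
    full = str(arr)

print(len(summarized), len(full))
```

Using the context manager restores the previous options on exit, whereas `np.set_printoptions` changes them for the rest of the session.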

#### **Basic operations**

In [174]:
a = np.array( [20,30,40,50] )
b = np.arange( 4 )
b

array([0, 1, 2, 3])

In [175]:
c = a-b
c

array([20, 29, 38, 47])

In [176]:
b**2

array([0, 1, 4, 9])

In [177]:
10*np.sin(a)

array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])

In [178]:
a<35

array([ True,  True, False, False], dtype=bool)

#### **Unlike in many matrix languages, the product operator * operates elementwise on NumPy arrays. The matrix product can be performed using the dot function or method:**
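
A compact way to see the difference is to compute both products on the same pair of matrices. This sketch also uses the `@` operator, which is not mentioned in the notebook; it assumes Python 3.5+ and a NumPy version that supports it.

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])

elementwise = A * B   # multiplies matching entries
matrix = A @ B        # matrix product, same result as A.dot(B)

print(elementwise)
print(matrix)
```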

In [179]:
A = np.array( [[1,1],
            [0,1]] )
B = np.array( [[2,0],
            [3,4]] )
A*B                         # elementwise product

array([[2, 0],
       [0, 4]])

In [180]:
A.dot(B)                    # matrix product

array([[5, 4],
       [3, 4]])

In [181]:
np.dot(A, B)                # another matrix product

array([[5, 4],
       [3, 4]])

#### Some operations, such as += and *=, act in place to modify an existing array rather than create a new one.

In [182]:
a = np.ones((2,3), dtype=int)
a

array([[1, 1, 1],
       [1, 1, 1]])

In [183]:
b = np.random.random((2,3))
a *= 3
a

array([[3, 3, 3],
       [3, 3, 3]])

In [184]:
b

array([[ 0.04766559,  0.32544976,  0.39777714],
       [ 0.63928175,  0.66161016,  0.62626311]])

In [185]:
b += a
b

array([[ 3.04766559,  3.32544976,  3.39777714],
       [ 3.63928175,  3.66161016,  3.62626311]])

In [186]:
#a += b                  # b is not automatically converted to integer type

**When operating with arrays of different types, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).**
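
Upcasting, and the reason the commented-out `a += b` fails, can be sketched as follows. The array values and the flag `in_place_ok` are illustrative; the sketch assumes a NumPy version in which an unsafe in-place cast raises an error derived from `TypeError`.

```python
import numpy as np

a = np.ones(3, dtype=np.int32)
b = np.linspace(0, np.pi, 3)      # float64

c = a + b                         # int32 + float64 is upcast to float64
d = np.exp(c * 1j)                # float64 * complex is upcast to complex128
print(c.dtype.name, d.dtype.name)

# In-place addition would have to store floats in an integer array,
# so NumPy refuses instead of silently truncating.
try:
    a += b
    in_place_ok = True
except TypeError:
    in_place_ok = False
print(in_place_ok)
```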

# An introduction to machine learning with scikit-learn

### In this section, we introduce the machine learning vocabulary used throughout scikit-learn and give a simple learning example.

# **Machine learning: the problem setting**

In general, a learning problem considers a set of n samples of data and then tries to predict properties of unknown data. If each sample is more than a single number, for instance a multi-dimensional entry (also known as multivariate data), it is said to have several attributes or features.

We can separate learning problems into a few large categories:


* **_supervised learning_**, in which the data comes with additional attributes that we want to predict. This problem can be either:

    * classification: samples belong to two or more classes and we want to learn from already labeled data how to predict the class of unlabeled data. An example of a classification problem is handwritten digit recognition, in which the aim is to assign each input vector to one of a finite number of discrete categories. Another way to think of classification is as a discrete (as opposed to continuous) form of supervised learning where one has a limited number of categories and, for each of the n samples provided, one tries to label them with the correct category or class.
    * regression: if the desired output consists of one or more continuous variables, then the task is called regression. An example of a regression problem would be the prediction of the length of a salmon as a function of its age and weight.

* **_unsupervised learning_**, in which the training data consists of a set of input vectors x without any corresponding target values. The goal in such problems may be to discover groups of similar examples within the data, where it is called clustering, or to determine the distribution of data within the input space, known as density estimation, or to project the data from a high-dimensional space down to two or three dimensions for the purpose of visualization.
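
The practical difference between the two main categories shows up in what the estimator is given. The sketch below contrasts a classifier with a clustering algorithm on the iris data; the choice of `KMeans` with three clusters is an assumption made here for illustration and does not appear in the original notebook.

```python
from sklearn import datasets, svm
from sklearn.cluster import KMeans

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Supervised: the estimator is fitted on both the inputs X and the labels y.
clf = svm.SVC()
clf.fit(X, y)
predicted_classes = clf.predict(X[:5])

# Unsupervised: the estimator sees only X and groups similar samples.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = km.fit_predict(X)

print(predicted_classes)
print(cluster_ids[:5])
```

Note that cluster ids are arbitrary integers; they are not guaranteed to coincide with the class labels in `y`.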

### Loading an example dataset

scikit-learn comes with a few standard datasets, for instance the **iris** and **digits** datasets for **classification** and the **Boston house prices dataset** for **regression**.

In [187]:
from sklearn import datasets
iris= datasets.load_iris()
digits= datasets.load_digits()

A dataset is a **dictionary**-like object that holds all the data and some metadata about the data. The data is stored in the **.data** member, which is an array of shape **(n_samples, n_features)**. In the case of a supervised problem, one or more response variables are stored in the **.target** member.
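
To make the `.data` and `.target` layout concrete, the shapes of both bundled datasets can be inspected. This is a small sketch using only the loaders already shown in the notebook.

```python
from sklearn import datasets

iris = datasets.load_iris()
digits = datasets.load_digits()

# .data has one row per sample and one column per feature;
# .target has one entry per sample.
print(iris.data.shape, iris.target.shape)
print(digits.data.shape, digits.target.shape)

# Dictionary-style access to the available members.
print(list(iris.keys()))
```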

In [188]:
print(digits.data)

[[  0.   0.   5. ...,   0.   0.   0.]
 [  0.   0.   0. ...,  10.   0.   0.]
 [  0.   0.   0. ...,  16.   9.   0.]
 ..., 
 [  0.   0.   1. ...,   6.   0.   0.]
 [  0.   0.   2. ...,  12.   0.   0.]
 [  0.   0.  10. ...,  12.   1.   0.]]


In [189]:
print(digits.target)

[0 1 2 ..., 8 9 8]


### Shape of the data arrays
The data is always a 2D array of **shape (n_samples, n_features)**, although the original data may have had a different shape. In the case of the digits, each original sample is an image of shape (8, 8) and can be accessed using:

In [190]:
digits.images[0]

array([[  0.,   0.,   5.,  13.,   9.,   1.,   0.,   0.],
       [  0.,   0.,  13.,  15.,  10.,  15.,   5.,   0.],
       [  0.,   3.,  15.,   2.,   0.,  11.,   8.,   0.],
       [  0.,   4.,  12.,   0.,   0.,   8.,   8.,   0.],
       [  0.,   5.,   8.,   0.,   0.,   9.,   8.,   0.],
       [  0.,   4.,  11.,   0.,   1.,  12.,   7.,   0.],
       [  0.,   2.,  14.,   5.,  10.,  12.,   0.,   0.],
       [  0.,   0.,   6.,  13.,  10.,   0.,   0.,   0.]])
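
The relation between the 8x8 images and the flat feature rows can be verified with a reshape. A minimal sketch, with `flat` and `same` as illustrative names:

```python
import numpy as np
from sklearn import datasets

digits = datasets.load_digits()
n_samples = len(digits.images)

# Flatten each 8x8 image into a row of 64 features.
flat = digits.images.reshape((n_samples, -1))

# The flattened images carry the same values as digits.data.
same = np.array_equal(flat, digits.data)
print(digits.images.shape, flat.shape, same)
```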

## Learning and predicting

In the case of the digits dataset, the task is to predict, given an image, which digit it represents. We are given samples of each of the 10 possible classes (the digits zero through nine) on which we fit an estimator to be able to predict the classes to which unseen samples belong.
In scikit-learn, an estimator for classification is a Python object that implements the methods fit(X, y) and predict(T).

An example of an estimator is the class sklearn.svm.SVC, which implements support vector classification. The constructor of an estimator takes as arguments the parameters of the model, but for the time being, we will consider the estimator as a black box:

In [191]:
from sklearn import svm
clf = svm.SVC(gamma=0.001,C=100)

In [192]:
clf.fit(digits.data[:-1],digits.target[:-1])

SVC(C=100, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma=0.001, kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

In [193]:
print(clf.predict(digits.data[-1:]))

[8]


In [194]:
#digits.target_names[]  # error: an empty subscript raises the SyntaxError shown below

SyntaxError: invalid syntax (<ipython-input-194-f15e3022bf7f>, line 1)

In [195]:
digits.target_names

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [196]:

#plt.imshow(digits.data[::])  # the whole (n_samples, 64) matrix as one image
#plt.imshow(digits.images[-1], cmap=plt.cm.gray_r, interpolation='nearest')  # the last digit as an 8x8 image

Note: if no figure appears after enabling the code above, turn doctest_mode OFF  
(re-running the %doctest_mode cell at the top toggles it off).

## Model Persistence
It is possible to save a scikit-learn model by using Python's built-in persistence model, namely **pickle**:

In [197]:
from sklearn import svm
from sklearn import datasets

In [198]:
clf=svm.SVC()

In [199]:
iris=datasets.load_iris()

In [200]:
x,y=iris.data, iris.target

In [201]:
clf.fit(x,y)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

In [202]:
import pickle

In [203]:
#pickle.dump(clf, "model.pkl")  # error: dump expects an open file object, not a file name

In [204]:
with open('model.pkl', mode='wb') as f:
    pickle.dump(clf, f)  # dump writes to f and returns None
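
Saving is only half of persistence; the model also has to be restored. The sketch below does an in-memory round trip with `pickle.dumps` and `pickle.loads` so that no file is needed; the names `blob` and `clf2` are illustrative. Reading back a file written as above would use `pickle.load` on a file opened with mode `'rb'`.

```python
import pickle
from sklearn import datasets, svm

iris = datasets.load_iris()
X, y = iris.data, iris.target

clf = svm.SVC()
clf.fit(X, y)

# Serialize the fitted estimator to bytes and restore it again.
blob = pickle.dumps(clf)
clf2 = pickle.loads(blob)

# The restored model predicts exactly like the original one.
print(clf2.predict(X[:1]), y[0])
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.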

# **Some Machine Learning Examples Using Python**

In [205]:
# Standard scientific Python imports
import matplotlib.pyplot as plt

In [206]:
# Import datasets, classifiers and performance metrics
from sklearn import datasets, svm, metrics

In [207]:
# The digits dataset
digits = datasets.load_digits()


# The data that we are interested in is made of 8x8 images of digits. Let's
# have a look at the first 8 images, stored in the `images` attribute of the
# dataset. If we were working from image files, we could load them using
# pylab.imread. Note that each image must have the same size. For these
# images, we know which digit they represent: it is given in the 'target' of
# the dataset.

In [208]:
images_and_labels = list(zip(digits.images, digits.target))

In [209]:
for index, (image, label) in enumerate(images_and_labels[:8]):
    plt.subplot(2, 4, index + 1)
    plt.axis('off')
    plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    plt.title('Training: %i' % label)
plt.show()

<matplotlib.figure.Figure object at 0x08281110>

# To apply a classifier to this data, we need to flatten each image, turning the data into a (samples, features) matrix:

In [210]:
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

# Create a classifier: a support vector classifier

In [211]:
classifier = svm.SVC(gamma=0.001)

# We learn the digits on the first half of the dataset

In [212]:
classifier.fit(data[:n_samples // 2], digits.target[:n_samples // 2])  # // keeps the slice index an integer

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma=0.001, kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

# Now predict the value of the digit on the second half:

In [213]:
expected = digits.target[n_samples // 2:]
predicted = classifier.predict(data[n_samples // 2:])

In [214]:
print("Classification report for classifier %s:\n%s\n"
      % (classifier, metrics.classification_report(expected, predicted)))

Classification report for classifier SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma=0.001, kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False):
             precision    recall  f1-score   support

          0       1.00      0.99      0.99        88
          1       0.99      0.97      0.98        91
          2       0.99      0.99      0.99        86
          3       0.98      0.87      0.92        91
          4       0.99      0.96      0.97        92
          5       0.95      0.97      0.96        91
          6       0.99      0.99      0.99        91
          7       0.96      0.99      0.97        89
          8       0.94      1.00      0.97        88
          9       0.93      0.98      0.95        92

avg / total       0.97      0.97      0.97       899




In [215]:
print("Confusion matrix:\n%s" % metrics.confusion_matrix(expected, predicted))

Confusion matrix:
[[87  0  0  0  1  0  0  0  0  0]
 [ 0 88  1  0  0  0  0  0  1  1]
 [ 0  0 85  1  0  0  0  0  0  0]
 [ 0  0  0 79  0  3  0  4  5  0]
 [ 0  0  0  0 88  0  0  0  0  4]
 [ 0  0  0  0  0 88  1  0  0  2]
 [ 0  1  0  0  0  0 90  0  0  0]
 [ 0  0  0  0  0  1  0 88  0  0]
 [ 0  0  0  0  0  0  0  0 88  0]
 [ 0  0  0  1  0  1  0  0  0 90]]
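
The confusion matrix can be condensed into a single accuracy figure: correct predictions lie on the diagonal, so accuracy is the trace divided by the total count. The sketch below reruns the experiment in self-contained form; the names `half`, `cm`, and `accuracy` are introduced here for illustration.

```python
import numpy as np
from sklearn import datasets, metrics, svm

digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
half = n_samples // 2

# Train on the first half, predict the second half.
classifier = svm.SVC(gamma=0.001)
classifier.fit(data[:half], digits.target[:half])
expected = digits.target[half:]
predicted = classifier.predict(data[half:])

# Rows are true classes, columns are predicted classes.
cm = metrics.confusion_matrix(expected, predicted)
accuracy = np.trace(cm) / cm.sum()
print(accuracy)
```

For the matrix printed above, the diagonal sums to 871 out of 899 samples, which is about 0.969 and agrees with the averages in the classification report.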


In [216]:
images_and_predictions = list(zip(digits.images[n_samples // 2:], predicted))


In [217]:
for index, (image, prediction) in enumerate(images_and_predictions[:4]):
    plt.subplot(2, 4, index + 5)
    plt.axis('off')
    plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    plt.title('Prediction: %i' % prediction)

<matplotlib.figure.Figure object at 0x08884A50>

In [218]:
plt.show()

In [219]:
import pickle


In [220]:
from sklearn import datasets
from sklearn.svm import SVC
iris = datasets.load_iris()
clf = SVC()
clf.fit(iris.data, iris.target)  


SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

In [221]:
iris.target

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [222]:
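list(clf.predict(iris.data[:3]))

# Refit on the class names so that the predictions are strings
clf.fit(iris.data, iris.target_names[iris.target])

#list(clf.predict(iris.data[]))  # error: an empty subscript is a SyntaxError
list(clf.predict(iris.data))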

['setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'virginica', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'virginica', 'versicolor', 'versicolor

In [228]:
iris.keys()  # Python 3; viewkeys() is the Python 2 dictionary method

dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names'])

In [224]:
iris.data

array([[ 5.1,  3.5,  1.4,  0.2],
       [ 4.9,  3. ,  1.4,  0.2],
       [ 4.7,  3.2,  1.3,  0.2],
       [ 4.6,  3.1,  1.5,  0.2],
       [ 5. ,  3.6,  1.4,  0.2],
       [ 5.4,  3.9,  1.7,  0.4],
       [ 4.6,  3.4,  1.4,  0.3],
       [ 5. ,  3.4,  1.5,  0.2],
       [ 4.4,  2.9,  1.4,  0.2],
       [ 4.9,  3.1,  1.5,  0.1],
       [ 5.4,  3.7,  1.5,  0.2],
       [ 4.8,  3.4,  1.6,  0.2],
       [ 4.8,  3. ,  1.4,  0.1],
       [ 4.3,  3. ,  1.1,  0.1],
       [ 5.8,  4. ,  1.2,  0.2],
       [ 5.7,  4.4,  1.5,  0.4],
       [ 5.4,  3.9,  1.3,  0.4],
       [ 5.1,  3.5,  1.4,  0.3],
       [ 5.7,  3.8,  1.7,  0.3],
       [ 5.1,  3.8,  1.5,  0.3],
       [ 5.4,  3.4,  1.7,  0.2],
       [ 5.1,  3.7,  1.5,  0.4],
       [ 4.6,  3.6,  1. ,  0.2],
       [ 5.1,  3.3,  1.7,  0.5],
       [ 4.8,  3.4,  1.9,  0.2],
       [ 5. ,  3. ,  1.6,  0.2],
       [ 5. ,  3.4,  1.6,  0.4],
       [ 5.2,  3.5,  1.5,  0.2],
       [ 5.2,  3.4,  1.4,  0.2],
       [ 4.7,  3.2,  1.6,  0.2],
       [ 4