# Principal Component Analysis using the Iris Dataset

[![Open in Layer](https://development.layer.co/assets/badge.svg)](https://app.layer.ai/douglas_mcilwraith/iris-pca/) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/layerai/examples/blob/main/pca-iris/iris-pca.ipynb) [![Layer Examples Github](https://badgen.net/badge/icon/github?icon=github&label)](https://github.com/layerai/examples/tree/main/pca_iris)

We perform Principal Component Analysis (PCA) on the iris dataset and use the first two principal components to plot the data in two dimensions, down from the original four. We use `layer.log()` to attach the resulting plot to the resources for this model, where it can be viewed on the [model page](https://app.layer.ai/douglas_mcilwraith/iris-pca/models/iris-pca).

Some code used with permission from: Douglas McIlwraith, Haralambos Marmanis, and Dmitry Babenko. 2016. Algorithms of the Intelligent Web (2nd. ed.). Manning Publications Co., USA.
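
To make the reduction from four dimensions to two more concrete, here is a minimal sketch of PCA built on NumPy's singular value decomposition. It is not the code this notebook uses (the cells below rely on scikit-learn's `decomposition.PCA`), and the helper name `pca_project` is hypothetical, introduced only for illustration.

```python
import numpy as np
from sklearn import datasets


def pca_project(X, n_components=2):
    """Project X onto its first n_components principal components.

    Hypothetical helper for illustration; not part of this notebook.
    """
    # PCA operates on mean-centered data
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance along each direction, and the share kept by the retained ones
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, ratio


X = datasets.load_iris().data
X_2d, ratio = pca_project(X)
print(X_2d.shape)  # (150, 2)
print(ratio)
```

The projected coordinates may differ in sign from those returned by scikit-learn, since the direction of each principal axis is only defined up to a sign flip.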

In [None]:
!pip install layer -U

In [None]:
import layer
layer.login()

In [None]:
layer.init("iris-pca")

In [None]:
import numpy as np
import pandas as pd

from layer.decorators import model, dataset

from sklearn import decomposition
from sklearn import datasets

from itertools import cycle
import matplotlib.pyplot as pl

In [None]:
iris = datasets.load_iris()
iris_df = pd.DataFrame(data=np.c_[iris['data'], iris['target']], columns=iris['feature_names'] + ['target'])

@dataset("iris-data")
def build():
    return iris_df

layer.run([build])

In [None]:
df_iris = layer.get_dataset("iris-data").to_pandas()
df_iris_X = df_iris.drop(columns=['target'])
df_iris_Y = df_iris['target']

In [None]:
@model("iris-pca")
def train():
    pca = decomposition.PCA(n_components=2)
    pca.fit(df_iris_X)
    
    targets = range(len(list(df_iris_Y.unique())))
    colors = cycle('rgb')
    markers = cycle('^+o')

    X = pca.transform(df_iris_X)

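    # Plot each class with its own color and marker
    for target,color,marker in zip(targets,colors,markers):
        pl.scatter(X[df_iris_Y==target,0],X[df_iris_Y==target,1],label=targets[target],c=color,marker=marker)
    # Draw the legend once, after all classes have been plotted
    pl.legend()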
        
    layer.log({"PCA on iris":pl})
    
    return pca

layer.run([train])
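
A natural follow-up is to check how much of the original variance the two retained components capture. The sketch below fits the same kind of scikit-learn `PCA` locally, without Layer, and prints the retained variance along with each feature's loading on the two components. It assumes scikit-learn is installed and is meant as an illustration, not as part of the Layer pipeline above.

```python
from sklearn import datasets, decomposition

iris = datasets.load_iris()
pca = decomposition.PCA(n_components=2).fit(iris.data)

# Fraction of total variance kept by the two components
retained = pca.explained_variance_ratio_.sum()
print(f"Variance retained by two components: {retained:.1%}")

# components_ has shape (n_components, n_features); each row is one principal axis
for name, loadings in zip(iris.feature_names, pca.components_.T):
    print(f"{name}: PC1={loadings[0]:+.3f}, PC2={loadings[1]:+.3f}")
```

A high retained fraction suggests that the two-dimensional scatter plot is a reasonable summary of the four original measurements.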