# Principal Component Analysis using the Iris Dataset

[![Open in Layer](https://development.layer.co/assets/badge.svg)](https://app.layer.ai/douglas_mcilwraith/iris-pca/) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/layerai/examples/blob/main/pca-iris/iris-pca.ipynb) [![Layer Examples Github](https://badgen.net/badge/icon/github?icon=github&label)](https://github.com/layerai/examples/tree/main/pca_iris)

We perform Principal Component Analysis (PCA) on the iris dataset and use the first two components to plot the data in two dimensions, down from the original four. We use `layer.log()` to log the resulting plot under the resources associated with this model, which can be found [here](https://app.layer.ai/douglas_mcilwraith/iris-pca/models/iris-pca).
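
As a rough illustration of what reducing four dimensions to two involves, the sketch below projects the iris measurements onto their first two principal axes using only NumPy's SVD. The helper name `pca_project` is hypothetical and is not part of Layer or scikit-learn; the notebook itself relies on `sklearn.decomposition.PCA`, which should give the same coordinates up to the sign of each axis.

```python
import numpy as np
from sklearn import datasets


def pca_project(X, n_components=2):
    """Project the rows of X onto its first n_components principal axes.

    Hypothetical helper for illustration only.
    """
    # PCA is defined on mean-centred data, so subtract each feature's mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Coordinates of every sample along the leading axes.
    return X_centered @ Vt[:n_components].T


iris = datasets.load_iris()
X_2d = pca_project(iris.data, n_components=2)

print(X_2d.shape)  # one two-dimensional point per flower
```

The sign of each axis is arbitrary, so a plot of `X_2d` may appear mirrored relative to one produced with scikit-learn.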

Some code used with permission from: Douglas McIlwraith, Haralambos Marmanis, and Dmitry Babenko. 2016. Algorithms of the Intelligent Web (2nd. ed.). Manning Publications Co., USA.

In [2]:
!pip install layer -U

Collecting layer
  Downloading layer-0.9.357766-py3-none-any.whl (438 kB)
Installing collected packages: layer
  Attempting uninstall: layer
    Found existing installation: layer 0.9.356767
    Uninstalling layer-0.9.356767:
      Successfully uninstalled layer-0.9.356767
Successfully installed layer-0.9.357766

In [3]:
import layer
layer.login()

In [4]:
layer.init("iris-pca")

Project(name='iris-pca', raw_datasets=[], derived_datasets=[], models=[], path=PosixPath('.'), project_files_hash='', readme='', account=Account(id=UUID('93bceff2-c8d2-484c-99cb-8bc5c8b6962a'), name='douglas_mcilwraith'), _id=UUID('72ba452e-f9c4-4ce3-8d48-80a996832f10'), functions=[])

In [5]:
import numpy as np
import pandas as pd

from layer.decorators import model, dataset

from sklearn import decomposition
from sklearn import datasets

from itertools import cycle
import matplotlib.pyplot as pl

In [6]:
iris = datasets.load_iris()
iris_df = pd.DataFrame(
    data=np.c_[iris['data'], iris['target']],
    columns=iris['feature_names'] + ['target'],
)

@dataset("iris-data")
def build():
    return iris_df

layer.run([build])

Run(project_name='iris-pca')

In [7]:
df_iris = layer.get_dataset("iris-data").to_pandas()
df_iris_X = df_iris.drop(columns=['target'])
df_iris_Y = df_iris['target']

In [8]:
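@model("iris-pca")
def train():
    # Fit PCA and keep the first two principal components.
    pca = decomposition.PCA(n_components=2)
    pca.fit(df_iris_X)

    # Project the four-dimensional measurements onto the two components.
    X = pca.transform(df_iris_X)

    # One color and marker per class label (0, 1, 2).
    targets = range(df_iris_Y.nunique())
    colors = cycle('rgb')
    markers = cycle('^+o')

    for target, color, marker in zip(targets, colors, markers):
        mask = (df_iris_Y == target).to_numpy()
        pl.scatter(X[mask, 0], X[mask, 1], label=target, c=color, marker=marker)
    # Draw the legend once, after all classes have been plotted.
    pl.legend()

    # Log the plot to Layer under this model's resources.
    layer.log({"PCA on iris": pl})

    return pca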

layer.run([train])

Run(project_name='iris-pca')
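
To judge how faithful the two-dimensional plot is, one can inspect the `explained_variance_ratio_` and `components_` attributes of a fitted `PCA` object. The sketch below refits PCA locally on scikit-learn's copy of the data rather than fetching the trained model from Layer, so it is only assumed to match the model produced by the run above.

```python
from sklearn import datasets, decomposition

iris = datasets.load_iris()
pca = decomposition.PCA(n_components=2).fit(iris.data)

# Fraction of the total variance captured by each retained component.
for i, ratio in enumerate(pca.explained_variance_ratio_, start=1):
    print(f"PC{i}: {ratio:.1%} of variance")

# Each row of components_ holds one component's weights on the four features,
# so the transpose gives one row of weights per feature.
for name, weights in zip(iris.feature_names, pca.components_.T):
    print(f"{name:>18}: {weights.round(2)}")
```

A high combined ratio for the two components suggests that little of the variation between flowers is lost by plotting in two dimensions.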