# PCA (principal component analysis)

**PCA** derives an orthogonal projection that converts a given set of observations into linearly uncorrelated variables.

The projected variables produced by **PCA** are called principal components.
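
As a compact reference, the projection and its approximate inverse can be written as follows. This is a common formulation, not a quotation from any library; the symbols are assumptions introduced here: $\mathbf{x}$ is one observation, $\boldsymbol{\mu}$ is the mean of the training observations, and $P$ is the matrix whose orthonormal columns are the retained principal directions.

$$
\mathbf{y} = P^{\top}(\mathbf{x} - \boldsymbol{\mu}), \qquad \tilde{\mathbf{x}} = P\,\mathbf{y} + \boldsymbol{\mu}
$$

Here $\mathbf{y}$ holds the principal components of $\mathbf{x}$, and $\tilde{\mathbf{x}}$ is the reconstruction, which equals $\mathbf{x}$ only when no dimensions are dropped.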

## Import required modules for this tutorial

In [None]:
import Plots
import MultivariateStats
import Clustering
import RDatasets
Plots.plotly()  # use the interactive Plotly backend

## Load iris dataset

In [None]:
iris = RDatasets.dataset("datasets", "iris");

## Split half of the data into a training set

In [None]:
# training set
Xtr        = Array(iris[1:2:end,1:4])';
Xtr_labels = Array(iris[1:2:end,5]);

In [None]:
Xtr

## Split the other half into a testing set

In [None]:
Xte = Array(iris[2:2:end,1:4])';
Xte_labels = Array(iris[2:2:end,5]);
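
The odd/even row split and the transpose used above can be checked on a tiny stand-in matrix. This is only a sketch: `data`, `train`, and `test` are hypothetical names, and `permutedims` is used here in place of `'` to get a plain matrix with observations along the columns.

```julia
# Stand-in for a table with 6 observations (rows) and 2 features (columns).
data = reshape(collect(1:12), 6, 2)

# Odd rows go to the training half, even rows to the testing half.
# permutedims flips each half so that every column is one observation.
train = permutedims(data[1:2:end, :])
test  = permutedims(data[2:2:end, :])

println(size(train))  # (features, observations) of the training half
println(size(test))   # (features, observations) of the testing half
```

Because the two index ranges `1:2:end` and `2:2:end` never overlap, no observation appears in both halves.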

## PCA

Suppose `Xtr` and `Xte` are the training and testing data matrices, with each observation stored as a column.

Train a PCA model, allowing up to 4 dimensions:

In [None]:
M = MultivariateStats.fit(MultivariateStats.PCA, Xtr; maxoutdim=4)

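The maximum output dimension is 4, but the fitted **PCA** model keeps only 3 principal components, since 3 are enough to preserve the default share of the variance (`pratio`, 0.99 by default).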
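
One way to see how the number of retained components can end up below `maxoutdim` is to rank the eigenvalues of the covariance matrix and keep the smallest number of components whose cumulative share of the variance reaches a threshold. The sketch below uses only the standard library and synthetic data; the 0.99 threshold and all names are assumptions for illustration, and it is not MultivariateStats' implementation.

```julia
using LinearAlgebra, Statistics

# Synthetic data: 3 features x 200 observations (observations along columns).
# The third feature is nearly a copy of the first, so it adds almost no new variance.
t = range(0, 2pi; length=200)
X = vcat(sin.(t)', cos.(2 .* t)', (sin.(t) .+ 0.001 .* cos.(5 .* t))')

# Covariance between features (dims=2: each column is one observation).
C = cov(X; dims=2)

# Eigenvalues of the covariance matrix are the variances along the principal directions.
lambda = sort(eigvals(Symmetric(C)); rev=true)

# Cumulative share of the total variance explained by the first k components.
ratio = cumsum(lambda) ./ sum(lambda)

# Smallest k whose cumulative share reaches the assumed threshold.
threshold = 0.99
k = findfirst(r -> r >= threshold, ratio)

println("variance shares: ", ratio)
println("components kept: ", k)
```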

## Apply PCA model to testing set

In [None]:
Yte = MultivariateStats.transform(M, Xte)

## Reconstruct testing observations (approximately)

In [None]:
Xr = MultivariateStats.reconstruct(M, Yte)
using Statistics
r2 = sum((Xte .- Xr).^2)  # sum of squared reconstruction errors (a total, not a mean)

In [None]:
sqrt(r2)
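
The expression `sum((Xte .- Xr).^2)` is a sum of squared errors, and its square root is the Frobenius norm of the error matrix. Dividing by the number of entries gives the mean squared error instead. The sketch below shows the three quantities side by side on a small made-up pair of matrices; `X`, `Xhat`, and the other names are hypothetical.

```julia
using Statistics

# A small "true" matrix and a made-up approximate reconstruction of it.
X    = [1.0 2.0; 3.0 4.0]
Xhat = [1.0 2.5; 2.0 4.0]

err = X .- Xhat

sse  = sum(abs2, err)    # sum of squared errors
mse  = mean(abs2, err)   # mean squared error: sse divided by the number of entries
rmse = sqrt(mse)         # root mean squared error
frob = sqrt(sse)         # Frobenius norm of the error matrix

println((sse = sse, mse = mse, rmse = rmse, frob = frob))
```

The sum and the Frobenius norm grow with the size of the test set, so the mean-based values are easier to compare across data sets of different sizes.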

## Group results by testing set labels for color coding

In [None]:
setosa     = Yte[:,Xte_labels.=="setosa"]
versicolor = Yte[:,Xte_labels.=="versicolor"]
virginica  = Yte[:,Xte_labels.=="virginica"]

In [None]:
Xte_labels.=="versicolor"
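
The broadcast comparison above yields a Boolean mask with one entry per observation, and indexing the columns with that mask keeps only the matching observations. A minimal sketch on made-up data (`Y`, `labels`, `mask`, and `Ya` are hypothetical names):

```julia
# 2 components x 4 observations, with one label per column.
Y = [1.0 2.0 3.0 4.0;
     5.0 6.0 7.0 8.0]
labels = ["a", "b", "a", "b"]

# Element-wise comparison: true where the label matches.
mask = labels .== "a"

# Keep all rows, but only the columns where the mask is true.
Ya = Y[:, mask]

println(mask)
println(Ya)
```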

## Visualize the first 3 principal components in an interactive 3D plot

In [None]:
# visualize the first 3 principal components in an interactive 3D plot
p = Plots.scatter(setosa[1,:],setosa[2,:],setosa[3,:],marker=:circle,linewidth=0)
Plots.scatter!(versicolor[1,:],versicolor[2,:],versicolor[3,:],marker=:circle,linewidth=0)
Plots.scatter!(virginica[1,:],virginica[2,:],virginica[3,:],marker=:circle,linewidth=0)
Plots.plot!(p,xlabel="PC1",ylabel="PC2",zlabel="PC3")