# Coping with Dimensionality

# Topics

- The curse of dimensionality
- Principal Component Analysis
- Singular Value Decomposition
- Latent Dirichlet Allocation

## Where are we?

![is there a 4th dimension?](assets/linear-regression/machine-learning-cheet-sheet.png)

(image: [sas.com](https://www.sas.com/en_us/insights/analytics/machine-learning.html))

# Visualizing data

Humans cannot directly visualize data in more than three dimensions.

# The curse of dimensionality

- As the number of dimensions increases, exponentially more data is needed to train a model that generalizes

- With $d$ dimensions and $v$ possible values per dimension, covering every combination takes $O(v^d)$ examples
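
To get a feel for how fast $O(v^d)$ grows, here is a minimal sketch. It assumes that $v$ counts the distinct values each dimension can take and that one example per combination is wanted; the function name `examples_needed` is hypothetical, chosen only for illustration.

```python
def examples_needed(values_per_dimension: int, dimensions: int) -> int:
    """Number of distinct value combinations: v raised to the power d."""
    return values_per_dimension ** dimensions


# With v = 10 values per dimension, the count grows tenfold per added dimension
for d in (1, 2, 3, 10, 20):
    print(f"d = {d:>2}: {examples_needed(10, d):,} examples")
```

Even a modest number of dimensions pushes the count far beyond the size of any realistic training set.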

# Dimension Reduction

### Objective
"Project" data from a high-dimensional space into a lower-dimensional one.

Some information is lost in the projection, but the loss should stay within acceptable limits.
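
The idea of projecting and then measuring what was lost can be sketched with NumPy. Everything here is an illustrative assumption: the data is synthetic, the kept subspace is simply the x-y plane, and "loss" is taken to mean relative squared reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)

# 200 synthetic 3-D points: wide spread in x and y, very little in z
X = rng.normal(size=(200, 3)) * np.array([5.0, 3.0, 0.1])

# Orthonormal basis of the 2-D subspace to keep (here, the x-y plane)
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

Z = X @ W          # projected data, shape (200, 2)
X_hat = Z @ W.T    # mapped back into 3-D; the z coordinate becomes 0

# Fraction of the total squared magnitude that the projection discarded
relative_loss = np.sum((X - X_hat) ** 2) / np.sum(X ** 2)
print(Z.shape, f"relative loss: {relative_loss:.4%}")
```

Because almost all of the spread lies in the two kept directions, dropping the third costs very little. Techniques such as PCA choose the kept directions from the data instead of fixing them by hand.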

# Techniques to reduce dimensions

- Principal Component Analysis (PCA)
- Singular Value Decomposition (SVD)
- Latent Dirichlet Allocation


# Principal Component Analysis

Reference: [scikit-learn `sklearn.decomposition.PCA`](http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html)
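
A minimal usage sketch of scikit-learn's `PCA` follows. The data is synthetic and purely an assumption for illustration: 10 observed features generated from 2 hidden factors plus a little noise, so two components should capture nearly all of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data: 300 samples, 10 features driven by 2 hidden factors
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 10))

# Reduce 10 dimensions to 2
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                        # (300, 2)
print(pca.explained_variance_ratio_.sum())    # share of variance retained
```

`explained_variance_ratio_` is a convenient way to check whether the loss is within acceptable limits for a chosen number of components.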

# Singular Value Decomposition
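
As a rough illustration, SVD factors a matrix $X$ into $U \Sigma V^T$, and keeping only the $k$ largest singular values gives a rank-$k$ approximation. The sketch below uses NumPy on a small random matrix; the data and the choice $k = 2$ are assumptions made only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

# Thin SVD: U is 6x4, s holds 4 singular values (descending), Vt is 4x4
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep the k largest singular values to build a rank-k approximation
k = 2
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The Frobenius-norm error equals the root sum of squares of the dropped values
error = np.linalg.norm(X - X_k)
expected = np.sqrt(np.sum(s[k:] ** 2))
print(error, expected)
```

The discarded singular values quantify exactly how much information the lower-rank representation loses.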

# Latent Dirichlet Allocation

Dataset: [Motion Capture Hand Postures (UCI Machine Learning Repository)](https://archive.ics.uci.edu/ml/datasets/Motion+Capture+Hand+Postures)

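```python
import pandas as pd

# Load the hand-posture data (local path from the original notebook)
df = pd.read_csv('C:/Users/issohl/Downloads/Postures/Postures.csv')
# Coerce non-numeric entries to NaN, then drop every row containing a NaN
df = df.apply(pd.to_numeric, errors='coerce').dropna()
df.head(10)
```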

| Unnamed: 0 | Class | User | X0 | Y0 | Z0 | X1 | Y1 | Z1 | X2 | Y2 | ... | Z8 | X9 | Y9 | Z9 | X10 | Y10 | Z10 | X11 | Y11 | Z11 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 15303 | 2 | 2 | 27.418588 | 148.21612 | 14.770587 | 49.129027 | 99.723134 | 0.883278 | 92.301344 | 43.900549 | ... | 7.015468 | 100.841332 | 65.337892 | -54.651373 | 77.932709 | 16.487276 | -60.048174 | 16.113686 | -42.471264 | -3.140685 |
| 15304 | 2 | 2 | 99.817462 | 65.68667 | -55.457478 | 57.653306 | 152.188853 | 6.644935 | 92.276726 | 43.845759 | ... | 14.037669 | -1.211737 | 88.672706 | 10.187398 | 16.482357 | -42.641627 | -2.050853 | 2.582169 | 127.94549 | 11.415229 |
| 36818 | 2 | 9 | 71.944991 | 135.565802 | -27.78271 | 86.536958 | 66.145021 | -39.206386 | 30.328132 | 102.824 | ... | 2.384311 | 16.416635 | 127.7438 | -19.398255 | 6.713694 | 90.193203 | -13.508942 | 84.104979 | 44.247865 | -45.390036 |
| 36819 | 2 | 9 | 86.346749 | 66.590273 | -38.629277 | 71.496851 | 90.724686 | -28.165732 | 51.26388 | 99.415833 | ... | -12.430796 | 78.02252 | 15.748048 | -54.007325 | 4.626282 | 26.21987 | 2.40999 | 84.350011 | 44.798182 | -44.880864 |
| 36820 | 2 | 9 | 72.541023 | 91.001656 | -26.329067 | 87.507036 | 67.629524 | -36.272075 | 71.195968 | 135.058671 | ... | -54.063723 | 40.442153 | 154.436985 | -6.757292 | 83.578826 | 44.363539 | -45.550582 | 5.494588 | 26.766967 | 3.806395 |
| 36821 | 2 | 9 | 85.896607 | 66.583822 | -37.913902 | 74.132929 | 136.8945 | -22.653122 | 52.224073 | 99.803024 | ... | 2.864121 | 79.14904 | 17.479158 | -52.50795 | 8.873037 | 91.569135 | -9.106976 | 84.683328 | 45.480138 | -44.102954 |
| 58312 | 2 | 12 | 44.570107 | -57.266639 | -7.931448 | -81.50369 | 24.453014 | 12.449578 | -73.74616 | -34.949689 | ... | -7.079423 | -99.231688 | -8.725178 | -45.804943 | -64.375721 | 30.053873 | 12.782347 | -49.18656 | 40.596278 | 10.331652 |
| 58313 | 2 | 12 | 46.229579 | -56.123369 | -9.292431 | 51.743311 | -34.453451 | -4.715155 | 25.15646 | -63.466324 | ... | -12.780852 | -71.795603 | 52.427854 | 15.856993 | -63.184713 | 30.249366 | 11.912254 | -48.593433 | 40.528996 | 10.383901 |
| 58314 | 2 | 12 | -98.473308 | -8.978293 | -45.625495 | -7.327416 | -22.373579 | -5.692506 | -22.435115 | -40.226382 | ... | -3.752423 | 45.647405 | -56.624884 | -9.143981 | -48.050911 | 40.569113 | 10.226653 | -63.042637 | 29.858839 | 12.452322 |
