# Dimensionality
The term "dimensionality" describes the quantity of features or variables used in the research. 
It can be difficult to visualize and interpret the relationships between variables when dealing with high-dimensional data, such as datasets with numerous variables.

## PCA
The Principal Component Analysis is a popular unsupervised learning technique for reducing the dimensionality of data. 
It increases interpretability yet, at the same time, it minimizes information loss. It helps to find the most significant features in a dataset and makes the data easy for plotting in 2D and 3D. 
PCA helps in finding a sequence of linear combinations of variables.
While reducing the number of variables in the dataset, dimensionality reduction methods like PCA are used to preserve the most crucial data. The original variables are converted into a new set of variables called principal components, which are linear combinations of the original variables, by PCA in order to accomplish this. The dataset's reduced dimensionality depends on how many principal components are used in the study.

In [12]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

In [13]:
dataset = pd.read_csv("Wine.csv")
X = dataset.iloc[:,:-1].values
Y = dataset.iloc[:,-1].values

In [14]:
from sklearn.model_selection import train_test_split

X_train,X_test,Y_train,Y_test = train_test_split(X,Y,test_size=0.2)

In [15]:
from sklearn.preprocessing import StandardScaler

st = StandardScaler()
X_train = st.fit_transform(X_train)
X_test = st.fit_transform(X_test)
print(X_train)
print(X_test)

[[-1.68391493 -0.34752154 -0.2801616  ...  0.93373078  0.54838329
  -1.29101567]
 [ 0.25381538  2.49623474 -0.13692867 ... -1.49958742 -1.5292845
  -0.05152302]
 [-0.48377229 -0.96959322 -0.9247098  ... -1.58649164 -1.43101643
  -0.32877795]
 ...
 [-1.88393871  1.20765768 -1.92734034 ... -0.84780576  0.33780885
  -0.58320013]
 [-0.54627972  2.78061037  1.0089348  ... -0.54364098 -1.23448029
  -0.7365058 ]
 [ 1.11641791 -0.91627279 -0.31596983 ...  0.28194912  1.37664275
   0.99226027]]
[[ 1.49597606e+00  1.36191162e+00 -4.82132350e-01 -1.21801081e+00
   4.60832748e-01  1.30312102e+00  8.94935221e-01 -5.86648789e-01
   1.05580356e+00 -8.94670160e-02  1.59770740e-01  9.94834417e-01
   8.62974182e-01]
 [ 4.27993152e-01  1.36191162e+00 -8.46260279e-01 -9.36931396e-02
  -8.97411140e-01 -1.51542455e+00 -1.96436952e+00  2.02171794e+00
  -1.73834324e+00  2.88895695e-01 -9.24574940e-01 -1.44190406e+00
  -5.60370728e-01]
 [ 4.15993344e-01 -3.33335013e-01  4.38302137e-02  2.27540482e-01
   2.6679

In [16]:
from sklearn.decomposition import PCA
pca = PCA(n_components = 2)
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.fit_transform(X_test)
print(X_train_pca)

[[ 1.00902475e+00  2.21176888e+00]
 [-3.38846969e+00 -1.86554265e+00]
 [-2.61411756e+00 -4.31575696e-01]
 [-4.65601695e-01  2.51793593e+00]
 [ 2.72109792e+00 -6.84782550e-01]
 [-2.25998229e+00  3.90229291e-01]
 [ 3.26479395e+00 -6.39025661e-01]
 [ 2.90150600e+00 -1.33364628e+00]
 [ 1.58260856e+00 -4.19721493e-01]
 [-1.43800804e+00 -2.51076274e+00]
 [ 2.71291948e+00 -1.63961966e+00]
 [-2.17614542e+00 -5.22040579e-01]
 [ 3.23296128e+00 -1.11658786e+00]
 [ 5.02155351e-01  3.85900878e+00]
 [-1.47285862e+00  1.79105184e+00]
 [ 2.40333864e+00 -2.25585236e+00]
 [-2.45029790e+00 -5.77225629e-01]
 [-3.53103350e+00 -9.40131044e-01]
 [ 9.86562157e-01  7.86596691e-01]
 [ 8.82937050e-01  1.46538593e+00]
 [-1.41203901e+00  1.37613140e+00]
 [-2.77961374e+00 -3.44525641e-01]
 [ 1.94771274e+00 -6.91546771e-01]
 [ 2.50237825e+00 -1.57927833e+00]
 [-2.33953598e+00  1.21050414e+00]
 [-1.20048997e+00 -1.12414569e-01]
 [-3.33802718e+00 -9.41169266e-01]
 [-2.23685540e+00 -3.86134636e-01]
 [-4.01324187e-01  4

## Linear dimensianality Analysis
In this other than finding the componenent axis we also try to find the maximum separation between the classes

In [17]:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis(n_components = 2)
X_train_lda = lda.fit_transform(X_train,Y_train)
X_test_lda = lda.transform(X_test)
print(X_train_lda)

[[ 0.43158005 -2.69251551]
 [ 4.55085412  2.1998394 ]
 [ 4.19867174 -0.00806676]
 [-0.25348519 -2.74024415]
 [-3.03751947  0.0805162 ]
 [ 3.17991987 -0.07044178]
 [-3.67303217  0.29601806]
 [-3.15537006  1.58895935]
 [-2.79315189 -3.13490889]
 [ 5.02828354  2.32324121]
 [-3.37917801  2.37827086]
 [ 2.50588355  1.35232145]
 [-4.87949993  2.41702701]
 [-0.16880623 -5.84739537]
 [ 1.47300578 -2.59159133]
 [-3.43245813  2.49899354]
 [ 2.68710666  1.13832438]
 [ 5.26881167  1.47541087]
 [-0.80969981 -1.54240234]
 [-0.47950499 -1.78204247]
 [ 0.85070616 -1.36054219]
 [ 3.85183212  0.10676976]
 [-4.11169375  2.09144845]
 [-3.35233139  2.67647134]
 [ 1.59723537 -0.73549564]
 [ 2.2219894   0.03953427]
 [ 4.09375192  1.27225988]
 [ 3.12778262  0.6124516 ]
 [ 2.17487632 -0.70922511]
 [-3.20185371  1.76547371]
 [-4.40490047  1.05282983]
 [ 1.19114592 -3.1066846 ]
 [ 3.58661125  0.64968885]
 [-1.41947855 -2.20904124]
 [-3.69303826  2.85525305]
 [ 4.34223126  1.63810174]
 [ 1.18238714 -0.97532549]
 