# Eigenvalues and Eigenvectors Demonstration in Python

## Background:
This analysis combines three lifestyle indicators (daily steps, resting heart rate, and sleep duration) to understand overall wellness patterns. Since these variables may be correlated, dimensionality reduction helps identify the strongest underlying trend. Eigenvalues and eigenvectors of the correlation matrix reveal the primary direction of variation in the standardized data.

## Objective:
To compute eigenvalues/eigenvectors for the dataset and create a new composite measure (PC1) by projecting the original features onto the first eigenvector. This provides a single, meaningful score summarizing the dominant lifestyle pattern across individuals.

## Dataset Description:


| **Feature**                  | **Description**                                      |
| ---------------------------- | ---------------------------------------------------- |
| **Daily Steps**              | Daily physical activity measured in number of steps. |
| **Resting Heart Rate (bpm)** | Baseline heart rate in beats per minute.             |
| **Sleep (hours)**            | Total hours of sleep per day.                        |

---


### Import Libraries

In [2]:
import pandas as pd
import numpy as np

### Create Data

In [3]:
data = {
    "Person": [1, 2, 3, 4, 5, 6, 7, 8],
    "Daily Steps":       [4200, 5100, 7000, 6500, 8300, 9000, 7600, 5800],
    "Resting HR (bpm)": [78,   75,   73,   72,   70,   68,   74,   76],
    "Sleep (hours)":     [5.8,  6.1,  6.9,  7.2,  7.5,  7.8,  6.4,  6.0]
}

df = pd.DataFrame(data)
# "Person" is only an identifier, so drop it before computing correlations
df = df.drop(columns=["Person"])

df

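|     | Daily Steps | Resting HR (bpm) | Sleep (hours) |
| --- | ----------- | ---------------- | ------------- |
| 0   | 4200        | 78               | 5.8           |
| 1   | 5100        | 75               | 6.1           |
| 2   | 7000        | 73               | 6.9           |
| 3   | 6500        | 72               | 7.2           |
| 4   | 8300        | 70               | 7.5           |
| 5   | 9000        | 68               | 7.8           |
| 6   | 7600        | 74               | 6.4           |
| 7   | 5800        | 76               | 6.0           |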


### Correlation Matrix

In [4]:
corr_matrix = np.corrcoef(df.T)
corr_matrix

array([[ 1.        , -0.91276757,  0.86766654],
       [-0.91276757,  1.        , -0.97744603],
       [ 0.86766654, -0.97744603,  1.        ]])
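The correlation matrix can be read as the covariance matrix of the z-scored features, which is why the eigenvectors computed from it describe standardized data rather than raw values. The following is a minimal sketch of that relationship; the array `X` simply repeats the dataset above, and the use of the sample standard deviation (`ddof=1`) is an assumption chosen to match the default of `np.cov`.

```python
import numpy as np

# Same three features as in the data cell: steps, resting HR, sleep.
X = np.array([
    [4200, 78, 5.8],
    [5100, 75, 6.1],
    [7000, 73, 6.9],
    [6500, 72, 7.2],
    [8300, 70, 7.5],
    [9000, 68, 7.8],
    [7600, 74, 6.4],
    [5800, 76, 6.0],
])

# Z-score each column: subtract the mean, divide by the sample std (ddof=1).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

cov_of_z = np.cov(Z, rowvar=False)    # covariance of the standardized columns
corr = np.corrcoef(X, rowvar=False)   # correlation of the raw columns

# The two matrices agree up to floating-point error.
print(np.allclose(cov_of_z, corr))
```

Passing `rowvar=False` tells NumPy that columns are variables, which is equivalent to the `np.corrcoef(df.T)` call used in this notebook.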

### Eigenvalues and Eigenvectors

In [5]:
eigenvalues, eigenvectors = np.linalg.eig(corr_matrix)

print("Correlation Matrix:\n", corr_matrix)
print("\nEigenvalues:\n", eigenvalues)
print("\nEigenvectors:\n", eigenvectors)

Correlation Matrix:
 [[ 1.         -0.91276757  0.86766654]
 [-0.91276757  1.         -0.97744603]
 [ 0.86766654 -0.97744603  1.        ]]

Eigenvalues:
 [2.83932814 0.14393315 0.01673871]

Eigenvectors:
 [[ 0.5648913  -0.80569154  0.17821045]
 [-0.58798488 -0.24149775  0.77197968]
 [ 0.57894007  0.54086965  0.61015442]]


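#### Note:
**NumPy stores eigenvectors in columns, not rows.** Column `i` of `eigenvectors` pairs with `eigenvalues[i]`. `np.linalg.eig` does not guarantee any particular ordering of the eigenvalues; here the largest one happens to come first.

**The first eigenvector is [0.565, -0.588, 0.579].**

The linear combination 0.565 * Z1 - 0.588 * Z2 + 0.579 * Z3 is called the first principal component and is commonly used in data reduction. Because the eigenvectors come from the correlation matrix, Z1, Z2, and Z3 denote the standardized (z-scored) versions of Daily Steps, Resting HR, and Sleep. The signs match the correlations: steps and sleep move together, while resting heart rate moves in the opposite direction.

The idea is that the first principal component is sufficient to summarize the data if it explains a large proportion of the variation.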
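To make the column convention concrete, the sketch below checks the defining relation A v = λ v for the first eigenpair. It swaps in `np.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order, instead of the `np.linalg.eig` call used in the notebook. The explicit sort and the sign flip are illustrative choices, since an eigenvector is only defined up to sign.

```python
import numpy as np

# Correlation matrix copied from the output above (rounded to 8 decimals).
corr = np.array([
    [ 1.        , -0.91276757,  0.86766654],
    [-0.91276757,  1.        , -0.97744603],
    [ 0.86766654, -0.97744603,  1.        ],
])

# eigh assumes a symmetric matrix and returns real eigenvalues, ascending.
vals, vecs = np.linalg.eigh(corr)

# Reorder so the largest eigenvalue (and its eigenvector column) comes first.
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# Eigenvectors are the COLUMNS of vecs; take the first column.
v1 = vecs[:, 0]

# Fix the sign so the first entry is positive (v and -v are both valid).
if v1[0] < 0:
    v1 = -v1

# Check the eigenvalue equation: corr @ v1 should equal vals[0] * v1.
print(np.allclose(corr @ v1, vals[0] * v1))
```

Taking a row such as `vecs[0]` instead of the column `vecs[:, 0]` would generally fail this check, which is a quick way to catch the row/column mix-up.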

### Total Variance Explained


In [6]:
total = np.sum(eigenvalues)

var1 = eigenvalues[0] / total
var2 = eigenvalues[1] / total
var3 = eigenvalues[2] / total

print(var1, var2, var3)

0.9464427130064016 0.047977717875939664 0.0055795691176587755


Here, the first component explains approximately 94.6% of the variation in the standardized data.
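The same proportions can be computed in one vectorized step, together with the cumulative share. This is a minimal sketch that hard-codes the eigenvalues printed above; the descending sort is a defensive assumption for cases where the eigenvalues do not already arrive in that order.

```python
import numpy as np

# Eigenvalues copied from the output above.
eigenvalues = np.array([2.83932814, 0.14393315, 0.01673871])

# Sort in descending order so PC1 is the largest.
eigenvalues = np.sort(eigenvalues)[::-1]

# Share of total variance per component, and the running total.
explained = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained)

for i, (e, c) in enumerate(zip(explained, cumulative), start=1):
    print(f"PC{i}: {e:.1%} of variance, {c:.1%} cumulative")
```

For a correlation matrix the eigenvalues sum to the number of features (3 here), so each share is simply the eigenvalue divided by 3.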

### Get Linear Combination

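#### Note:
`X @ v1` is a matrix-vector product: each row of X is dotted with v1.

X → your data matrix with shape (n_rows, 3)

v1 → the first eigenvector with shape (3,)

The result has shape (n_rows,). Each entry is a weighted sum of that row's three feature values, with the entries of v1 as weights. Equivalently, the result is a linear combination of the columns of X.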
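As a small illustration of what the `@` operator does here, the sketch below compares it with the same sum written out by hand. It uses only the first two rows of the dataset, and the variable names `scores`, `manual`, and `rowwise` are introduced purely for this example.

```python
import numpy as np

# First two rows of the dataset (steps, resting HR, sleep).
X = np.array([
    [4200, 78, 5.8],
    [5100, 75, 6.1],
])
v1 = np.array([0.5648913, -0.58798488, 0.57894007])

# Matrix-vector product: one score per row of X.
scores = X @ v1

# The same thing as a weighted sum of the columns of X ...
manual = X[:, 0] * v1[0] + X[:, 1] * v1[1] + X[:, 2] * v1[2]

# ... or as a dot product taken row by row.
rowwise = np.array([np.dot(row, v1) for row in X])

print(scores.shape)                                   # one value per row
print(np.allclose(scores, manual), np.allclose(scores, rowwise))
```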

In [8]:
# First eigenvector: first column of the eigenvectors matrix, copied from the output above
v1 = np.array([0.5648913, -0.58798488, 0.57894007])

# Linear combination of the raw (unstandardized) columns, weighted by the first eigenvector
df["PC1"] = df @ v1

df

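|     | Daily Steps | Resting HR (bpm) | Sleep (hours) | PC1         |
| --- | ----------- | ---------------- | ------------- | ----------- |
| 0   | 4200        | 78               | 5.8           | 2330.038492 |
| 1   | 5100        | 75               | 6.1           | 2840.378298 |
| 2   | 7000        | 73               | 6.9           | 3915.31089  |
| 3   | 6500        | 72               | 7.2           | 3633.626907 |
| 4   | 8300        | 70               | 7.5           | 4651.780899 |
| 5   | 9000        | 68               | 7.8           | 5048.554461 |
| 6   | 7600        | 74               | 6.4           | 4253.368215 |
| 7   | 5800        | 76               | 6.0           | 3235.15633  |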


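- PC1 (Principal Component 1) is a single composite score built from the three original variables: Daily Steps, Resting Heart Rate, and Sleep Duration.
- The eigenvector comes from the correlation matrix, so the roughly 95% figure applies to a projection of standardized (z-scored) features. The PC1 column above is computed from raw values, so it is dominated by Daily Steps, whose scale is far larger than that of the other two features.
- With standardized inputs, one can analyze PC1 alone, instead of all three variables separately, to understand the overall wellness/fitness trend.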
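Because the eigenvectors were derived from the correlation matrix, one variation on the projection step is to z-score the features first. The sketch below is one way to do that, not the notebook's own method. It assumes the sample standard deviation (`ddof=1`), uses `np.linalg.eigh` in place of `np.linalg.eig`, and introduces the names `features`, `Z`, `pc1`, and `raw_score` for illustration.

```python
import numpy as np
import pandas as pd

features = pd.DataFrame({
    "Daily Steps":      [4200, 5100, 7000, 6500, 8300, 9000, 7600, 5800],
    "Resting HR (bpm)": [78, 75, 73, 72, 70, 68, 74, 76],
    "Sleep (hours)":    [5.8, 6.1, 6.9, 7.2, 7.5, 7.8, 6.4, 6.0],
})

# Standardize each column (pandas .std() uses ddof=1 by default).
Z = (features - features.mean()) / features.std(ddof=1)

# Eigen-decomposition of the correlation matrix; eigh sorts ascending,
# so the last column belongs to the largest eigenvalue.
corr = np.corrcoef(features.to_numpy(), rowvar=False)
vals, vecs = np.linalg.eigh(corr)
v1 = vecs[:, -1]
if v1[0] < 0:          # sign is arbitrary; make the steps loading positive
    v1 = -v1

# PC1 scores on standardized data.
pc1 = Z.to_numpy() @ v1

# The sample variance of these scores equals the largest eigenvalue.
print(np.isclose(pc1.var(ddof=1), vals[-1]))

# For comparison: the same weights applied to raw values track Daily Steps
# almost perfectly, because steps are on a much larger scale.
raw_score = features.to_numpy() @ v1
steps_corr = np.corrcoef(raw_score, features["Daily Steps"])[0, 1]
print(steps_corr > 0.999)
```

With standardized inputs the scores are centered on zero, and a higher score corresponds to more steps, more sleep, and a lower resting heart rate.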