<a href="https://colab.research.google.com/github/anjha1/Data-Science/blob/main/Non-Negative%20Matrix%20Factorization%20(NMF)/(NMF).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Creating the content as a string
content = """To determine the order (dimensions) of matrices W and H in Non-Negative Matrix Factorization (NMF), follow these steps:

### Step 1: Understanding the Dimensions
- Suppose the original matrix Meas has the dimensions:  
  Meas : m × n
  where:
  - m = Number of rows (samples)
  - n = Number of columns (features)
- You choose the number of components k for factorization using NMF.

### Step 2: Matrix Dimensions After Factorization
NMF factorizes Meas into two non-negative matrices:

1. Matrix W (Basis Matrix):  
W : m × k
- m = Same number of rows as Meas
- k = Number of components (you choose it)
  
2. Matrix H (Coefficient Matrix):  
H : k × n
- k = Number of components
- n = Same number of columns as Meas

### Step 3: Formula for Reconstruction
The relationship between these matrices is given by:  
Meas ≈ W · H
Where:
- The inner dimension k cancels in the product, so W · H is an m × n matrix, the same size as the original matrix.

---

### Example 1: Short Matrix

- Meas is a 3 × 4 matrix.
- You choose k = 2.
  
Matrix W:  
W : 3 × 2

Matrix H:  
H : 2 × 4

Multiplying them:  
(3 × 2) · (2 × 4) → 3 × 4

---

### Example 2: Fisher Iris Data

- Meas is a 150 × 4 matrix (150 samples, 4 features).
- k = 2 components.

Matrix W:  
W : 150 × 2

Matrix H:  
H : 2 × 4

Multiplying them:  
(150 × 2) · (2 × 4) → 150 × 4

The reconstructed matrix W · H should approximately match the original data.
"""
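
The shape rule from the notes above can be checked numerically before running any factorization. The following is a minimal sketch, assuming NumPy is available. `factor_shapes` is a hypothetical helper introduced only for illustration, and the random matrices are placeholders with the right shapes, not real NMF factors.

```python
import numpy as np

def factor_shapes(m, n, k):
    # Shapes of W and H for an m x n matrix factorized with k components
    return (m, k), (k, n)

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch

# The two cases from the notes: the 3 x 4 short matrix and the 150 x 4 iris data
for m, n, k in [(3, 4, 2), (150, 4, 2)]:
    w_shape, h_shape = factor_shapes(m, n, k)
    # Non-negative placeholders with the shapes W and H would have
    W_demo = rng.random(w_shape)
    H_demo = rng.random(h_shape)
    product = W_demo @ H_demo
    print(f"W {w_shape} . H {h_shape} -> W.H {product.shape}")
```

The product always comes out as m × n because the shared dimension k is summed over in the matrix multiplication.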



In [None]:
import numpy as np
from sklearn.decomposition import NMF

In [None]:
# 6(a) Short Matrix
Meas = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2]
])
Meas

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2]])

In [None]:
# Applying NMF with 2 components
nmf_model = NMF(n_components=2, init='random', random_state=42)
W = nmf_model.fit_transform(Meas)
H = nmf_model.components_
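
A quick sanity check on the fitted factors is to confirm their shapes and that no entry is negative. The sketch below repeats the setup from the cells above so that it runs on its own; the check is an illustration, and it assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.decomposition import NMF

Meas = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2]
])

# Same settings as the notebook cell above
nmf_model = NMF(n_components=2, init='random', random_state=42)
W = nmf_model.fit_transform(Meas)
H = nmf_model.components_

print("W shape:", W.shape)  # expected to be (m, k)
print("H shape:", H.shape)  # expected to be (k, n)
print("All entries non-negative:", bool((W >= 0).all() and (H >= 0).all()))
```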

In [None]:
# Reconstructed matrix
L = np.dot(W, H)

In [None]:
print("Original Matrix (Meas):\n", Meas)

Original Matrix (Meas):
 [[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]]


In [None]:
print("W Matrix:\n", W)

W Matrix:
 [[0.40014092 1.02674731]
 [0.621223   0.47772346]
 [0.38597493 0.90983001]]


In [None]:
print("H Matrix:\n", H)

H Matrix:
 [[5.8094732  3.15278669 1.71843251 0.23932044]
 [2.70230392 2.17991891 0.69641181 0.10874246]]


In [None]:
print("Reconstructed Matrix (L):\n", L)

Reconstructed Matrix (L):
 [[5.09919126 3.49978485 1.40265412 0.20741293]
 [4.89993234 2.999982   1.40022205 0.20062018]
 [4.7009482  3.20025225 1.29688822 0.19130884]]
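
How close the reconstruction is to the original can be summarized in a few numbers. The following is a minimal sketch that uses the W and H values printed above (rounded to eight decimals) and NumPy only. Using the Frobenius norm as the error measure is an assumption made here for illustration.

```python
import numpy as np

Meas = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2]
])

# Factors copied from the printed output above
W = np.array([
    [0.40014092, 1.02674731],
    [0.621223,   0.47772346],
    [0.38597493, 0.90983001]
])
H = np.array([
    [5.8094732,  3.15278669, 1.71843251, 0.23932044],
    [2.70230392, 2.17991891, 0.69641181, 0.10874246]
])

L = W @ H                                  # reconstructed 3 x 4 matrix
abs_err = np.linalg.norm(Meas - L)         # Frobenius norm for 2-D input
rel_err = abs_err / np.linalg.norm(Meas)   # error relative to the data's own norm
max_dev = np.abs(Meas - L).max()           # worst single-entry deviation

print("Frobenius error:", abs_err)
print("Relative error:", rel_err)
print("Largest absolute deviation:", max_dev)
```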


In [None]:
# 6(b) Long Matrix using Fisher Iris dataset
from sklearn import datasets
iris = datasets.load_iris()
meas = iris.data

In [None]:
# Applying NMF with 2 components
nmf_model_iris = NMF(n_components=2, init='random', random_state=42)
W_iris = nmf_model_iris.fit_transform(meas)
H_iris = nmf_model_iris.components_

In [None]:
# Reconstructed matrix
L_iris = np.dot(W_iris, H_iris)

In [None]:
print("\nOriginal Iris Data Matrix (meas):\n", meas[:5])


Original Iris Data Matrix (meas):
 [[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]


In [None]:
print("W Matrix for Iris Data:\n", W_iris[:5])

W Matrix for Iris Data:
 [[1.52995118 0.14186618]
 [1.39288444 0.20725808]
 [1.40311201 0.14017235]
 [1.34566372 0.22203649]
 [1.5304409  0.12583602]]


In [None]:
print("H Matrix for Iris Data:\n", H_iris)

H Matrix for Iris Data:
 [[3.20334813 2.28667568 0.69632093 0.04327065]
 [1.3708567  0.0541275  2.36939905 0.95290157]]


In [None]:
print("Reconstructed Matrix (L_iris):\n", L_iris[:5])

Reconstructed Matrix (L_iris):
 [[5.09544444 3.50618101 1.40147461 0.20138649]
 [4.74601489 3.19629334 1.46097168 0.25776757]
 [4.68681242 3.21604928 1.30914048 0.19428403]
 [4.61500955 3.08911477 1.46310685 0.26980667]
 [5.07503814 3.50643317 1.36383376 0.18613252]]


In [None]:
meas[:5]

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2]])

In [None]:
# Calculate the element-wise difference
matrix_difference = meas - L_iris
matrix_difference[:5]  # Display the first 5 rows for a quick view

array([[ 0.00455556, -0.00618101, -0.00147461, -0.00138649],
       [ 0.15398511, -0.19629334, -0.06097168, -0.05776757],
       [ 0.01318758, -0.01604928, -0.00914048,  0.00571597],
       [-0.01500955,  0.01088523,  0.03689315, -0.06980667],
       [-0.07503814,  0.09356683,  0.03616624,  0.01386748]])
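
The element-wise difference above covers only the first five rows. To judge the fit over all 150 samples, the difference can be reduced to a single error figure. The sketch below is one way to do that, assuming scikit-learn and NumPy are installed; it repeats the fit so that it runs on its own, and the choice of the Frobenius norm is an assumption for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import NMF

meas = datasets.load_iris().data

# Same settings as the notebook cells above
nmf_model_iris = NMF(n_components=2, init='random', random_state=42)
W_iris = nmf_model_iris.fit_transform(meas)
H_iris = nmf_model_iris.components_

matrix_difference = meas - W_iris @ H_iris
fro_err = np.linalg.norm(matrix_difference)   # Frobenius norm over all 150 x 4 entries
rel_err = fro_err / np.linalg.norm(meas)      # error relative to the data's own norm

print("Frobenius error:", fro_err)
print("Relative error:", rel_err)
print("Largest absolute deviation:", np.abs(matrix_difference).max())

# With the default Frobenius loss, the fitted model stores a comparable figure
print("reconstruction_err_:", nmf_model_iris.reconstruction_err_)
```

A small relative error indicates that two components capture most of the structure in the four features. Refitting with a different `n_components` and comparing this figure is one simple way to weigh the choice of k.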