In [4]:
import pandas as pd
import numpy as np
from sklearn.decomposition import NMF

In [5]:
# Sample non-negative matrix
V = np.array([[1, 0, 0, 1, 0, 0],
              [0, 1, 0, 1, 1, 0],
              [0, 0, 1, 1, 0, 1]
              ])
print("Original Matrix V: \n", V)

Original Matrix V: 
 [[1 0 0 1 0 0]
 [0 1 0 1 1 0]
 [0 0 1 1 0 1]]


In [7]:
# Set n_components to 2 to request a rank-2 factorization.
# Create an NMF object.
# init='random' initializes W and H with random non-negative values.
# random_state=42 fixes the random seed so the result is reproducible.

n_components = 2
model = NMF(n_components=n_components, init='random', random_state=42)
W = model.fit_transform(V)
H = model.components_

print("\nMatrix W (Document-Component matrix):\n", W)
print("\nMatrix H (Component-Term matrix):\n", H)

V_constructed = np.dot(W, H)
print('Reconstructed V Matrix: \n', V_constructed)


Matrix W (Document-Component matrix):
 [[0.00000000e+00 1.01271248e+00]
 [1.39363280e+00 2.19581252e-04]
 [1.39410467e+00 0.00000000e+00]]

Matrix H (Component-Term matrix):
 [[0.00000000e+00 3.58653097e-01 3.58774543e-01 7.17349885e-01
  3.58653097e-01 3.58774543e-01]
 [9.87447056e-01 1.07087784e-04 0.00000000e+00 9.87447115e-01
  1.07087784e-04 0.00000000e+00]]
Reconstructed V Matrix: 
 [[9.99999953e-01 1.08449135e-04 0.00000000e+00 1.00000001e+00
  1.08449135e-04 0.00000000e+00]
 [2.16824861e-04 4.99830743e-01 4.99999971e-01 9.99939154e-01
  4.99830743e-01 4.99999971e-01]
 [0.00000000e+00 4.99999957e-01 5.00169266e-01 1.00006082e+00
  4.99999957e-01 5.00169266e-01]]
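
The reconstruction above is only approximate: a rank-2 factorization cannot represent all three rows exactly, so the second and third rows are blended into values near 0.5. One way to quantify this is the Frobenius norm of the residual `V - WH`. The following is a minimal sketch that recomputes the factorization and compares a manual residual norm with the fitted model's `reconstruction_err_` attribute; the variable name `manual_err` is introduced here for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

# Same sample matrix as above
V = np.array([[1, 0, 0, 1, 0, 0],
              [0, 1, 0, 1, 1, 0],
              [0, 0, 1, 1, 0, 1]])

model = NMF(n_components=2, init='random', random_state=42)
W = model.fit_transform(V)
H = model.components_

# Frobenius norm of the residual, computed by hand
manual_err = np.linalg.norm(V - W @ H, ord='fro')

print("Manual Frobenius error:   ", manual_err)
print("model.reconstruction_err_:", model.reconstruction_err_)
```

If your run reproduces the output printed above, both values should come out close to √2 ≈ 1.41, since the second and third rows each contribute a squared error of roughly 1. The exact number can differ across scikit-learn versions and random seeds.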


- **`n_components`**: Sets the number of hidden components (the rank k) to extract from the data. W then has k columns and H has k rows.
    
- **`init`**: Specifies how the matrices W and H are initialized before the iterative process begins. The default is `None`, which in scikit-learn 1.1 and later selects `'nndsvda'` when `n_components <= min(n_samples, n_features)` and `'random'` otherwise.
    
    - `'random'`: Initializes W and H with random non-negative numbers.
    - `'nndsvd'`: Non-negative Double Singular Value Decomposition. This SVD-based initialization often converges faster than random initialization and preserves zeros, so it suits cases where a sparse factorization is wanted.
    - `'nndsvda'`: NNDSVD with the zeros filled in with the average of the data. Prefer it when sparsity is not desired.
    - `'nndsvdar'`: NNDSVD with the zeros filled in with small random values. It is generally a faster but less accurate alternative to `'nndsvda'`.
- **`solver`**: Selects the optimization algorithm used to minimize the objective function. The options are:
    
    - `'mu'`: Multiplicative Update rules, which we discussed earlier. It is the only solver that supports losses other than Frobenius.
    - `'cd'` (default): Coordinate Descent. This iterative algorithm updates one element of W or H at a time and supports only the Frobenius loss. It can be faster than `'mu'` on some datasets.
- **`beta_loss`**: Chooses the objective function, expressed as a β-divergence. Any value other than the default squared Euclidean distance requires `solver='mu'`.
    
    - `'frobenius'` (default): The squared Euclidean distance, $\|V - WH\|_F^2$ (β = 2).
    - `'kullback-leibler'`: The Kullback-Leibler (KL) divergence, generalized to allow zero entries in V (β = 1).
    - You can also pass `'itakura-saito'` (β = 0, which requires V to contain no zeros) or a float β for other divergences, but Frobenius and KL are the most common for NMF.
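
To make the interaction between `init`, `solver`, and `beta_loss` concrete, here is a minimal sketch that fits the same sample matrix under three configurations and then shows that the coordinate-descent solver rejects a non-Frobenius loss. The dictionary `configs`, the `results` container, and the flag `cd_kl_rejected` are names introduced for this illustration, and `max_iter=1000` is an arbitrary choice to give the multiplicative-update solver room to converge.

```python
import numpy as np
from sklearn.decomposition import NMF

V = np.array([[1, 0, 0, 1, 0, 0],
              [0, 1, 0, 1, 1, 0],
              [0, 0, 1, 1, 0, 1]], dtype=float)

# Hypothetical set of configurations to compare
configs = {
    "cd + frobenius": dict(solver="cd", beta_loss="frobenius"),
    "mu + frobenius": dict(solver="mu", beta_loss="frobenius"),
    "mu + KL": dict(solver="mu", beta_loss="kullback-leibler"),
}

results = {}
for name, params in configs.items():
    model = NMF(n_components=2, init="nndsvda", max_iter=1000,
                random_state=0, **params)
    W = model.fit_transform(V)
    H = model.components_
    results[name] = (W, H, model.reconstruction_err_)
    print(f"{name}: reconstruction_err_={model.reconstruction_err_:.4f}, "
          f"n_iter_={model.n_iter_}")

# The 'cd' solver only handles the Frobenius loss, so this fit should fail.
cd_kl_rejected = False
try:
    NMF(n_components=2, solver="cd", beta_loss="kullback-leibler").fit(V)
except ValueError as exc:
    cd_kl_rejected = True
    print("cd + KL rejected:", exc)
```

Note that `reconstruction_err_` reports the Frobenius norm of the residual for the Frobenius loss and the β-divergence otherwise, so the KL value is not directly comparable with the other two.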
