Principal Component Analysis (PCA) Implementation
Write a Python function that performs Principal Component Analysis (PCA) from scratch. The function takes two inputs: a 2D NumPy array in which each row is a data sample and each column is a feature, and an integer k giving the number of principal components to return. It should standardize the dataset, compute the covariance matrix of the features, find the eigenvalues and eigenvectors of that matrix, and return the principal components, that is, the k eigenvectors with the largest eigenvalues, one per column.

Example:
Input:
data = np.array([[1, 2], [3, 4], [5, 6]]), k = 1
Output:
[[0.7071], [0.7071]]
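
The expected output can be checked by hand: the two features in the example are perfectly correlated, so all of the variance lies along the direction (1, 1), which has unit-length form (0.7071, 0.7071). The sketch below verifies this numerically. It assumes the population standard deviation (ddof=0) for scaling and NumPy's default sample covariance (ddof=1); the problem statement does not fix either choice, and neither affects the direction of the eigenvector.

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)

# Standardize each feature: zero mean, unit variance (population std, ddof=0: an assumption).
z = (data - data.mean(axis=0)) / data.std(axis=0)

# Sample covariance of the standardized features (np.cov defaults to ddof=1).
cov = np.cov(z, rowvar=False)

# Candidate direction from the example, normalized to unit length.
v = np.array([1.0, 1.0]) / np.sqrt(2.0)

# If v is an eigenvector, cov @ v is a scalar multiple of v.
eigenvalue = v @ cov @ v
is_eigenvector = np.allclose(cov @ v, eigenvalue * v)
print(np.round(v, 4), is_eigenvector)
```

An eigenvector is only defined up to sign, so [[-0.7071], [-0.7071]] is an equally valid answer; which one an eigensolver returns can depend on the underlying linear algebra library.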

import numpy as np

def pca(data: np.ndarray, k: int) -> np.ndarray:
    """
    Perform Principal Component Analysis (PCA) from scratch.

    :param data: 2D NumPy array where each row is a data sample and each column is a feature.
    :param k: Number of principal components to return.
    :return: Principal components (eigenvectors corresponding to the largest eigenvalues).
    """

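    # Step 1: Standardize the dataset (zero mean and unit variance per feature)
    mean = np.mean(data, axis=0)
    std = np.std(data, axis=0)
    std[std == 0] = 1.0  # leave constant features unscaled to avoid division by zero
    standardized_data = (data - mean) / std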

    # Step 2: Compute the covariance matrix
    covariance_matrix = np.cov(standardized_data, rowvar=False)  # rowvar=False makes it compute feature covariance

    # Step 3: Compute eigenvalues and eigenvectors
    eigenvalues, eigenvectors = np.linalg.eigh(covariance_matrix)  # eigh is used for symmetric matrices (covariance matrix)

    # Step 4: Sort eigenvectors by decreasing eigenvalues
    sorted_indices = np.argsort(eigenvalues)[::-1]  # Sort in descending order
    top_k_indices = sorted_indices[:k]  # Indices of the k largest eigenvalues

    # Step 5: Select the top k eigenvectors
    principal_components = eigenvectors[:, top_k_indices]

    return np.round(principal_components, 4)

# Example usage
data = np.array([[1, 2], [3, 4], [5, 6]])
k = 1
print(pca(data, k))
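
The principal components are usually used to project the samples onto a lower-dimensional space. The sketch below shows that step under stated assumptions: `standardize`, `top_components`, and the sign convention are hypothetical additions for illustration and are not part of the function above. The sign convention (largest-magnitude entry of each component made positive) is just one possible way to make the output reproducible.

```python
import numpy as np

def standardize(data: np.ndarray) -> np.ndarray:
    """Zero mean and unit variance per feature (population std; an assumption)."""
    std = data.std(axis=0)
    std[std == 0] = 1.0  # constant features are left unscaled
    return (data - data.mean(axis=0)) / std

def top_components(z: np.ndarray, k: int) -> np.ndarray:
    """Return the k leading eigenvectors of the feature covariance, as columns."""
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(z, rowvar=False))
    order = np.argsort(eigenvalues)[::-1][:k]
    components = eigenvectors[:, order]
    # Hypothetical sign convention: make the largest-magnitude entry
    # of each column positive so repeated runs agree.
    pivot = np.argmax(np.abs(components), axis=0)
    signs = np.sign(components[pivot, np.arange(components.shape[1])])
    return components * signs

data = np.array([[1, 2], [3, 4], [5, 6]])
z = standardize(data)
components = top_components(z, k=1)

# Project each standardized sample onto the selected components.
scores = z @ components
print(np.round(components, 4))
print(np.round(scores, 4))
```

Here `scores` has one row per sample and one column per retained component. This sketch assumes at least two features, since `np.cov` returns a scalar rather than a matrix for a single feature.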