## ML - Linear Algebra

### Calculating Mean and Standard Deviation: Math Formulas and Python Implementation

#### Mean: Mathematical Formula

The **mean** (or arithmetic average) of a set of numbers is their sum divided by their count. It measures the central value of the data.

Given a set of numbers $ x_1, x_2, \ldots, x_n $, the formula for the mean is:

$$
\text{mean} = \frac{1}{n} \sum_{i=1}^n x_i
$$

Where:
- $ n $: Total number of elements.
- $ x_i $: Each individual value in the dataset.

##### Example
For the dataset $ [1, 2, 3, 4, 5] $:
$$
\text{mean} = \frac{1}{5} (1 + 2 + 3 + 4 + 5) = 3
$$
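
The formula translates almost directly into Python. The following is a minimal sketch; the helper name `mean` is an assumption chosen for illustration and is not defined elsewhere in this notebook.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        # The formula divides by n, so an empty input has no defined mean.
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


print(mean([1, 2, 3, 4, 5]))  # 3.0
```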

---

#### Standard Deviation: Mathematical Formula

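The **standard deviation** measures how far the values in a dataset typically lie from its mean.

For a dataset $ x_1, x_2, \ldots, x_n $, the formula for the population standard deviation, which divides by $ n $ (the sample version divides by $ n - 1 $ instead), is: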

$$
\text{std} = \sqrt{\frac{1}{n} \sum_{i=1}^n (x_i - \text{mean})^2}
$$

Where:
- $ n $: Total number of elements.
- $ x_i $: Each individual value in the dataset.
- $ \text{mean} $: The mean of the dataset.

##### Example
For the dataset $ [1, 2, 3, 4, 5] $:
1. Compute the mean:
   $$
   \text{mean} = 3
   $$
2. Compute squared differences:
   $$
   (1 - 3)^2 = 4, \, (2 - 3)^2 = 1, \, (3 - 3)^2 = 0, \, (4 - 3)^2 = 1, \, (5 - 3)^2 = 4
   $$
3. Compute the average of the squared differences (the variance):
   $$
   \frac{4 + 1 + 0 + 1 + 4}{5} = 2
   $$
4. Take the square root:
   $$
   \text{std} = \sqrt{2} \approx 1.414
   $$
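
The four steps above can be sketched in Python as follows. The function name `population_std` is a hypothetical helper introduced for illustration; it divides by $ n $, matching the formula in this section.

```python
import math


def population_std(values):
    """Population standard deviation, following the four steps above."""
    n = len(values)
    if n == 0:
        raise ValueError("population_std requires at least one value")
    mean = sum(values) / n                              # step 1: mean
    squared_diffs = [(x - mean) ** 2 for x in values]   # step 2: squared differences
    variance = sum(squared_diffs) / n                   # step 3: average of squared differences
    return math.sqrt(variance)                          # step 4: square root


print(round(population_std([1, 2, 3, 4, 5]), 3))  # 1.414
```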

---

```python
def calculate_matrix_statistics(matrix, mode):
    """
    Calculate the mean and population standard deviation of the rows or columns of a matrix.

    Args:
        matrix (list of lists): A non-empty 2D list representing a rectangular matrix.
        mode (str): 'row' to calculate statistics for rows, 'column' for columns.

    Returns:
        dict: A dictionary with 'means' and 'stds' as keys and their respective values as lists.

    Raises:
        ValueError: If mode is neither 'row' nor 'column'.
    """
    rows, cols = len(matrix), len(matrix[0])
    means = []
    stds = []

    if mode == 'row':
        # Calculate row means and stds (each row has `cols` values)
        for r in range(rows):
            row_values = matrix[r]
            row_mean = sum(row_values) / cols
            row_std = (sum((x - row_mean) ** 2 for x in row_values) / cols) ** 0.5
            means.append(row_mean)
            stds.append(row_std)

    elif mode == 'column':
        # Calculate column means and stds (each column has `rows` values)
        for c in range(cols):
            col_values = [matrix[r][c] for r in range(rows)]
            col_mean = sum(col_values) / rows
            col_std = (sum((x - col_mean) ** 2 for x in col_values) / rows) ** 0.5
            means.append(col_mean)
            stds.append(col_std)

    else:
        raise ValueError("Mode must be either 'row' or 'column'.")

    return {'means': means, 'stds': stds}

# Example usage
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

# Calculate column statistics
mode = 'column'
column_stats = calculate_matrix_statistics(matrix, mode)
print("Column Statistics:")
print("Means:", column_stats['means'])
print("Standard Deviations:", column_stats['stds'])

# Calculate row statistics
mode = 'row'
row_stats = calculate_matrix_statistics(matrix, mode)
print("\nRow Statistics:")
print("Means:", row_stats['means'])
print("Standard Deviations:", row_stats['stds'])
```


Output:

```
Column Statistics:
Means: [4.0, 5.0, 6.0]
Standard Deviations: [2.449489742783178, 2.449489742783178, 2.449489742783178]

Row Statistics:
Means: [2.0, 5.0, 8.0]
Standard Deviations: [0.816496580927726, 0.816496580927726, 0.816496580927726]
```
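
As a cross-check, the same statistics can be obtained from Python's standard `statistics` module. This is a sketch under stated assumptions rather than a replacement for `calculate_matrix_statistics`: the name `stats_with_stdlib` is hypothetical, and it uses `statistics.pstdev` (population standard deviation, dividing by $ n $) to match the formula above, not `statistics.stdev`, which divides by $ n - 1 $.

```python
import statistics


def stats_with_stdlib(matrix, mode):
    """Return {'means': [...], 'stds': [...]} for the rows or columns of a matrix."""
    if mode == 'row':
        groups = matrix
    elif mode == 'column':
        # zip(*matrix) transposes the matrix, yielding one tuple per column.
        groups = list(zip(*matrix))
    else:
        raise ValueError("Mode must be either 'row' or 'column'.")

    return {
        'means': [statistics.fmean(g) for g in groups],
        'stds': [statistics.pstdev(g) for g in groups],  # population std (divides by n)
    }


matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

print(stats_with_stdlib(matrix, 'column')['means'])  # [4.0, 5.0, 6.0]
print(stats_with_stdlib(matrix, 'row')['means'])     # [2.0, 5.0, 8.0]
```

The standard deviations returned by this sketch should agree with the values printed above up to floating-point rounding.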
