# Instructions:

You have to submit the Notebook before the deadline:

*   Before you start, go to the menu in the upper-left corner and click File -> Save a copy in Drive.
*   Rename the notebook to **firstname_lastname_NumPy_Assignment.ipynb**, using your own name.

In [None]:
import numpy as np

# Exercise 1:

1. Create 3 arrays $X$, $w$ and $y$ with shapes (3,4), (2,1) and (6,1) respectively such that:

  * $X$ is a sample from the standard normal distribution
  * $w$ and $y$ are filled with random floats

2. Perform the following operations:

  * Find the minimum value in $X$.
  * Find the maximum value in each row of $X$.
  * Reshape $X$ so that the product $Xw$ has shape (6,1).
  * Find the squared L2 norm $||Xw-y||^2$.

In [None]:
## WRITE YOUR CODE HERE ##
X = np.random.randn(3,4)
w = np.random.random((2,1))
y = np.random.random((6,1))

#minimum value of X
print(f"Minimum value of X: {np.min(X)}")

#maximum value of each row of X
print(f"The maximum value for each row: {np.max(X,axis=1)}")

#Reshape X so that the product Xw has shape (6,1)
X = X.reshape(6,2)
Xw = X@w
print(f"Array shape of Xw after reshaping X: {Xw.shape}")

#L2 Norm (np.linalg.norm returns ||Xw - y||; square it to get ||Xw - y||^2)
print(f"L2 Norm: {np.linalg.norm(Xw - y)}")


Minimum value of X: -1.1078052354752475
The maximum value for each row: [1.21633873 1.59657953 1.03280389]
Array shape of Xw after reshaping X: (6, 1)
L2 Norm: 1.7554738203031324
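
The exercise asks for $||Xw-y||^2$, while `np.linalg.norm` returns the unsquared norm. The following is a minimal sketch of both quantities; the seeded generator `rng` and the seed value are assumptions added here for reproducibility and are not part of the original solution.

```python
import numpy as np

# Seeded generator (assumption: seed 0 is arbitrary, used only for reproducibility)
rng = np.random.default_rng(0)

X = rng.standard_normal((3, 4)).reshape(6, 2)  # standard normal sample, reshaped for Xw
w = rng.random((2, 1))                         # random floats in [0, 1)
y = rng.random((6, 1))

residual = X @ w - y                   # shape (6, 1)
l2 = np.linalg.norm(residual)          # ||Xw - y||
l2_squared = np.sum(residual ** 2)     # ||Xw - y||^2

print(f"L2 norm: {l2}")
print(f"Squared L2 norm: {l2_squared}")
```

Squaring the result of `np.linalg.norm` gives the same value as summing the squared residuals.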


# Exercise 2

Replace all values greater than or equal to 0.5 with 1, and leave the other values unchanged.


In [None]:
a = np.array([0, 0.6, 0.5, 0, 0, 0.9, 1, 0,0,0.8])

In [None]:
## WRITE YOUR CODE HERE ##
print(np.where((a >= 0.5),1,a))

[0. 1. 1. 0. 0. 1. 1. 0. 0. 1.]
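
As an alternative to `np.where`, the same replacement can be sketched with boolean-mask assignment. The copy `b` is a name introduced here for illustration, so that the original array `a` stays untouched.

```python
import numpy as np

a = np.array([0, 0.6, 0.5, 0, 0, 0.9, 1, 0, 0, 0.8])

b = a.copy()      # work on a copy so that a is not modified
b[b >= 0.5] = 1   # assign 1 wherever the mask is True
print(b)
```

`np.where` builds a new array, whereas mask assignment modifies an existing one in place, which is why the copy is made first.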


# Exercise 3:

Create a (4x4) array with 1 on the border and 0 inside.

In [None]:
## WRITE YOUR CODE HERE ##
result = np.ones((4,4)) #4 x 4 with all 1s
result[1:-1, 1:-1] = 0 #set inner elements to 0
print(result)

[[1. 1. 1. 1.]
 [1. 0. 0. 1.]
 [1. 0. 0. 1.]
 [1. 1. 1. 1.]]
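
Another possible approach, shown here only as a sketch, is to start from the inner block of zeros and pad it with a border of ones using `np.pad`. The names `n` and `border` are introduced for illustration.

```python
import numpy as np

n = 4  # side length of the final array

# Inner (n-2) x (n-2) block of zeros, padded by one cell of ones on every side
border = np.pad(np.zeros((n - 2, n - 2)), pad_width=1,
                mode="constant", constant_values=1)
print(border)
```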


# Exercise 4:

Remove all rows with nan values in the given array.

Hint: use `np.isnan()`

Expected output:

        array([[37.5, 24.1,  2.2],
               [18.1,  4.8, 32. ]])

In [None]:
a = np.array([[37.5, 24.1, 2.2],
             [25.4, np.nan, 38],
             [18.1, 4.8, 32],
             [np.nan, np.nan, 28.9]
                ])


In [None]:
## WRITE YOUR CODE HERE ##
print(a[~np.isnan(a).any(axis=1),:])

[[37.5 24.1  2.2]
 [18.1  4.8 32. ]]
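
The one-line solution above can be unpacked into its intermediate steps. This is a sketch using the same array; the names `nan_mask`, `rows_with_nan` and `clean` are introduced here for illustration.

```python
import numpy as np

a = np.array([[37.5, 24.1, 2.2],
              [25.4, np.nan, 38],
              [18.1, 4.8, 32],
              [np.nan, np.nan, 28.9]])

nan_mask = np.isnan(a)                 # elementwise mask, True where a value is NaN
rows_with_nan = nan_mask.any(axis=1)   # one flag per row: True if the row has any NaN
clean = a[~rows_with_nan]              # keep only the rows without NaN

print(rows_with_nan)
print(clean)
```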


# Exercise 5:
Sort the following NumPy array:
 - by the 2nd row (reorder the columns). Expected output:

        array([[ 6, 45, 34],
               [16, 21, 67],
               [32, 40, 17]])
 - by the 2nd column (reorder the rows). Expected output:

        array([[45,  6, 34],
               [21, 16, 67],
               [40, 32, 17]])

In [None]:
a = np.array([[45,6,34],[21,16,67],[40,32,17]])

In [None]:
## WRITE YOUR CODE HERE ##
print("original matrix:\n")
print(a,"\n")
print("Sorted by the 2nd row:\n")
sort_r2_indx = np.argsort(a[1])
sorted_by_row2 = a[:, sort_r2_indx]
print(sorted_by_row2,"\n")

print("Sorted by the 2nd column:\n")
sort_c2_indx = np.argsort(a[:, 1])
# Sorting by a column reorders the rows, so index the first axis
sorted_by_col2 = a[sort_c2_indx]
print(sorted_by_col2,"\n")


original matrix:

[[45  6 34]
 [21 16 67]
 [40 32 17]] 

Sorted by the 2nd row:

[[ 6 45 34]
 [16 21 67]
 [32 40 17]] 

Sorted by the 2nd column:

[[45  6 34]
 [21 16 67]
 [40 32 17]] 
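
The 2nd column of the given array is already in ascending order, so sorting by it returns the array unchanged and does not show whether rows or columns are being reordered. The sketch below uses a hypothetical array `b` (not part of the assignment) whose 2nd column is unsorted, to make the difference visible.

```python
import numpy as np

# Hypothetical array whose 2nd column (32, 6, 16) is not sorted
b = np.array([[45, 32, 34],
              [21,  6, 67],
              [40, 16, 17]])

order = np.argsort(b[:, 1])   # row order that sorts the 2nd column
sorted_rows = b[order]        # reorders the rows: sorted by the 2nd column
shuffled_cols = b[:, order]   # reorders the columns instead: not the intended result

print(sorted_rows)
print(shuffled_cols)
```

Only `b[order]` leaves each row intact while putting the 2nd column in ascending order.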



# Exercise 6:

## Problem Description:

In this exercise, you are required to implement a function that performs Principal Component Analysis (PCA) on a given dataset, stored as a 2D array, using Singular Value Decomposition (SVD).


### Instructions:

A. Write a function named **pca_with_svd()** that takes a 2D array X as input, where each row represents a data point and each column represents a feature.

Inside the function:

1. Center X by subtracting the column mean from each column.
2. Decompose the centered X using SVD (`np.linalg.svd()`) to obtain the matrices $U$, $Σ$, $V^T$

  where:
    * $U$ is a 2D array whose columns are the left singular vectors.
    * $Σ$ is a 1D array containing the singular values.
    * $V^T$ is a 2D array whose rows are the right singular vectors.

3. Extract the principal components (PCs).

  Hint: the PCs are the rows of $V^T$ (equivalently, the columns of $V$).

4. The function should return a tuple of  $X\_centered$, $PCs$, $U$ and $Σ$


B. Write a second function named **get_projected_data()** that:

1. Projects the array X onto the subspace defined by $V$.

  **Hint**: $projected\_X = XV = US$, where $S$ is the 2D diagonal matrix of singular values.
2. Returns a 2D array representing the projected data.
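
The identity in the hint follows from the SVD of the centered $X$ and the orthogonality of $V$. As a short derivation, writing $S$ for the diagonal matrix of singular values:

$$
X = U S V^T \quad\Longrightarrow\quad XV = U S V^T V = U S, \qquad \text{since } V^T V = I.
$$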



In [None]:
X = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

In [None]:
##  COMPLETE THE CODE ##

def pca_with_svd(x):

    # Step 1: Compute the mean-centered X matrix
    mean_centred_matrix = x - np.mean(x,axis=0)

    # Step 2: Use SVD to obtain U, Sigma, and V^T
    u, sigma, v_transpose = np.linalg.svd(mean_centred_matrix)

    # Step 3: Extract the principal components as the columns of V
    pcs = v_transpose.T

    # Step 4: Return the result as a tuple
    return mean_centred_matrix, pcs, u, sigma


def get_projected_data(x,v):

    # Project the data onto the subspace defined by V
    projected_x = x@v

    # Return the projected data
    return projected_x

In [None]:
X_centered,principal_components,U, singular_values = pca_with_svd(X)

projected_data = get_projected_data(X_centered,principal_components)

# Display the results
print("\nPrincipal Components:\n", principal_components)
print("\nSingular Values:", singular_values)
print("\nProjected Data:\n", projected_data)


Principal Components:
 [[ 0.57735027 -0.81649658  0.        ]
 [ 0.57735027  0.40824829 -0.70710678]
 [ 0.57735027  0.40824829  0.70710678]]

Singular Values: [7.34846923e+00 3.62597321e-16 0.00000000e+00]

Projected Data:
 [[-5.19615242e+00 -1.33226763e-15 -3.33066907e-16]
 [ 0.00000000e+00  0.00000000e+00  0.00000000e+00]
 [ 5.19615242e+00  1.33226763e-15  3.33066907e-16]]
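
The hint $XV = US$ can be checked numerically. The sketch below repeats the centering and SVD steps on the same `X`; the variable names, the use of `full_matrices=False`, and the tolerance `1e-10` are choices made here for illustration.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=float)
X_centered = X - X.mean(axis=0)

U, sigma, Vt = np.linalg.svd(X_centered, full_matrices=False)

proj_xv = X_centered @ Vt.T   # projection computed as XV
proj_us = U * sigma           # broadcasting scales column j of U by sigma[j], i.e. U @ diag(sigma)

print(np.allclose(proj_xv, proj_us))

# Count the singular values that are not numerically zero (assumed tolerance: 1e-10)
n_nonzero = int(np.sum(sigma > 1e-10))
print(n_nonzero)
```

All centered rows of this `X` are multiples of (1, 1, 1), so only one singular value is non-negligible, which is consistent with the projected data above varying only along the first component.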
