## Task List 04: Numpy 

In [1]:
import numpy as np

**01.** Using `np.array()` create
- A *vector* (i.e. one-dimensional array) of length 5
- A *matrix* (i.e. two-dimensional array) of shape 2x3

In [2]:
a_vector = np.array([1,2,3,4,5])
print(len(a_vector))

5


In [3]:
a_matrix = np.arange(6).reshape(2,3)
print(a_matrix)

[[0 1 2]
 [3 4 5]]


**02.** `.reshape()` array method is most commonly used to reshape vectors into matrices of desired (allowable) shape. Let's see it in action:

In [4]:
v_2 = np.ones(6, dtype=int)
v_2.shape

(6,)

`v_2` is a vector of ones of length 6. Run the cells below to see what happens when we reshape it.

In [5]:
v_2a=v_2.reshape(2,3)
v_2a

array([[1, 1, 1],
       [1, 1, 1]])

And what happens if we do this?

In [6]:
v_2b = v_2a.reshape(3,2)
print(v_2b)

[[1 1]
 [1 1]
 [1 1]]


And this? 

In [67]:
v_2c= v_2.reshape(4,2)

# ValueError: cannot reshape array of size 6 into shape (4,2)


ValueError: cannot reshape array of size 6 into shape (4,2)
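
The rule behind the error above: a reshape is allowed only when the product of the target dimensions equals the number of elements in the array. Here is a minimal sketch of that check; `fits` is a hypothetical helper name made up for this illustration, not a NumPy function.

```python
import numpy as np

def fits(arr, shape):
    """Hypothetical helper: True if `arr` can be reshaped to `shape` (no -1 placeholders)."""
    return arr.size == int(np.prod(shape))

v = np.ones(6, dtype=int)

# 2*3 = 6 and 3*2 = 6 match the size, 4*2 = 8 does not.
checks = {shape: fits(v, shape) for shape in [(2, 3), (3, 2), (4, 2)]}
print(checks)

# NumPy itself reports the mismatch as a ValueError.
try:
    v.reshape(4, 2)
except ValueError as err:
    print("reshape failed:", err)
```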

Now, create a vector of zeros of length 12 and reshape it to some matrix.

In [None]:
v_zero = np.zeros(12)
v_matrix = v_zero.reshape(3,4)
print(v_matrix)

**03.** -1 is a very special number when used with reshape. Let's make a vector of ones of length 20:

In [None]:
v_3 = np.ones(20).reshape(5, -1)
print(v_3)

# The output has 5 rows and 4 columns: the 5 fixes the number of rows,
# and -1 tells NumPy to work out the number of columns (20 / 5 = 4).
# With (-1, 5) it would be the other way round: 5 columns and 4 inferred rows.

### How -1 Works in reshape:
#### (The behavior of -1 in reshape is consistent, regardless of whether it's the first or second number in the shape tuple)

-   The -1 in reshape is a placeholder that tells Numpy to automatically calculate the appropriate dimension based on the size of the original array and the specified dimensions.

-   Numpy ensures that the total number of elements in the reshaped array is equal to the total number of elements in the original array.



> Reshape to (5, -1):

5 specifies the number of rows. 

-1 tells Numpy to calculate the number of columns automatically. Since the total number of elements must stay the same (20), Numpy determines:

Number of columns = total elements (20) / number of rows (5) = 4
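
The same arithmetic can be checked directly. This is a small sketch, assuming a fresh vector `v` of 20 ones (a name chosen here so that `v_3` above is left alone); it also shows the two ways the placeholder can fail.

```python
import numpy as np

v = np.ones(20)

# NumPy fills in the dimension marked -1 so that the element count stays 20.
print(v.reshape(5, -1).shape)   # (5, 4)
print(v.reshape(-1, 5).shape)   # (4, 5)
print(v.reshape(4, -1).shape)   # (4, 5)

# Only one dimension may be left for NumPy to infer.
try:
    v.reshape(-1, -1)
except ValueError as err:
    print("reshape failed:", err)

# The known dimension must divide the size evenly (20 / 3 is not a whole number).
try:
    v.reshape(3, -1)
except ValueError as err:
    print("reshape failed:", err)
```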

***


What do you think will happen if we use `.reshape((-1, 10))` on `v_3`?

In [None]:
v_3.reshape((-1,10))
# expected output - two rows , 10 columns - Correct

Later on in the course, when implementing Machine Learning, you'll sometimes be required to reshape a vector into a *row matrix* or a *column matrix*. What are those? See for yourself, using the following vector as an example:

In [None]:
v = np.ones(5)
print(v)
v.shape

- Row matrix

In [None]:
v_row = v.reshape(1,-1)
print(v_row)
print(v_row.shape)

> `v.reshape(1, -1)` turns the __1D array__ into a __2D matrix__ with a single row. The placeholder -1 tells NumPy to calculate the number of columns automatically so that all 5 elements fit, which gives the shape (1, 5).

- Column matrix

In [None]:
v_columns = v_row.reshape(-1,1)
print(v_columns)
print(v_columns.shape)
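
Why does the distinction matter? A sketch of how the three shapes behave differently in matrix products and transposes (the names `v_col` and the products below are just for this illustration):

```python
import numpy as np

v = np.ones(5)
v_row = v.reshape(1, -1)     # shape (1, 5)
v_col = v.reshape(-1, 1)     # shape (5, 1)

print(v.ndim, v_row.ndim, v_col.ndim)  # 1 2 2

# Row times column gives a 1x1 matrix; column times row gives a 5x5 matrix.
print((v_row @ v_col).shape)  # (1, 1)
print((v_col @ v_row).shape)  # (5, 5)

# Transposing a 1D vector does nothing, but it swaps the axes of a 2D one.
print(v.T.shape, v_row.T.shape)  # (5,) (5, 1)
```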

**04.** Let

In [None]:
x = np.array([-1,0,1])
y = np.array([1,0,0])

A = np.eye(3) # This creates a 3x3 identity matrix
B = np.diag((2, 1, -2)) #this creates a diagonal matrix with the given elements on the main diagonal


print('x = ', x)
print('-------------')
print('y = ', y)
print('-------------')
print('A = \n', A)
print('-------------')
print('B = \n', B)


Compute 

- $x\cdot y$ (the inner product)
- $Ay$
- $AB$
- $xBy$
- $B^TA$
- $AB - 4BA$

In [None]:
x@y 

In [None]:
A@y

In [None]:
A@B

In [None]:
x@B@y

In [None]:
B.T@A

In [None]:
A@B - 4*B@A

Notice the outputs of the operations above? Depending on the operands involved, you can obtain either a vector, a matrix or a *scalar*, i.e. a number. Knowing what kind of output to expect for given inputs is very useful when doing *Linear Algebra*.
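
One way to see this without reading the printed values is to ask NumPy for the number of dimensions of each result. A minimal sketch, re-creating the operands from this task so that it runs on its own:

```python
import numpy as np

x = np.array([-1, 0, 1])
y = np.array([1, 0, 0])
A = np.eye(3)
B = np.diag((2, 1, -2))

results = {
    "x @ y": x @ y,          # vector @ vector -> scalar
    "A @ y": A @ y,          # matrix @ vector -> vector
    "A @ B": A @ B,          # matrix @ matrix -> matrix
    "x @ B @ y": x @ B @ y,  # vector @ matrix @ vector -> scalar
}

# ndim 0 means scalar, 1 means vector, 2 means matrix.
for name, value in results.items():
    print(f"{name}: ndim = {np.ndim(value)}, shape = {np.shape(value)}")
```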

But why all this? Why Linear Algebra? Well, because

$$ {\rm No\ Linear\ Algebra} \Rightarrow {\rm No\ Machine\ Learning}. $$

But, don't worry - we aren't going to do any fancy Linear Algebra in this course. We'll just cover the most basic concepts needed to understand the mathematics behind Machine Learning and to use ML in Python. And should you decide to venture into *Deep Learning*, the area of ML dealing with *Neural Networks* (used for Computer Vision, Chatbots and Natural Language Understanding/Generation), be prepared to do quite a bit of Linear Algebra :)

**05.** Using indexing and index slicing extract from matrix `A_5` the following:

- an element in its third row and second column
- an element in its last column and second row
- its first row
- its third column
- its last two columns
- its first and third row
- all its elements in the intersection of its last two rows and first two columns
- all its elements in the intersection of its two middle rows and two middle columns

In [None]:
A_5 = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
A_5

In [None]:
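# element in the third row and second column
# (integer indices return the element itself; slices like 2:3 would return a 1x1 matrix):
A_5[2, 1]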

In [None]:
# element in the last column and second row:
A_5[1, -1]

In [None]:
# its first row:
A_5[0, :]

In [None]:
# its third column:
A_5[:, 2]

In [None]:
# its last two columns:
A_5[:, 2:]

In [None]:
# its first and third row (a list of row indices, all columns):
A_5[[0, 2], :]

# Fancy indexing

In [None]:
# all its elements in the intersection of its last two rows and first two columns

A_5[-2:,:2]


In [None]:
# all its elements in the intersection of its two middle rows and two middle columns
A_5[1:3,1:3]

#### Fancy Indexing:

-   [[0, 2]] specifies the indices of the rows you want to extract (1st row = index 0, 3rd row = index 2).

-   : selects all columns for the specified rows.


Use a list of row indices inside `[]` to extract several rows in one go. Use `:` to keep all columns, or give a range of columns if needed (e.g. `A_5[[0, 2], 1:3]` to extract only the 2nd and 3rd columns of those rows).


Core Concepts of Fancy Indexing:

Fancy indexing allows you to access elements, rows, or columns in a non-sequential or customized manner by using arrays or lists of indices. It’s more flexible than traditional slicing.

### 1. Fancy Indexing for Rows and Columns

-   You use lists or arrays of indices to select specific rows, specific columns, or both.

1.1 Example:

In [None]:
import numpy as np

# Example array
arr = np.array([[10, 20, 30], 
                [40, 50, 60], 
                [70, 80, 90]])

# Select the 1st and 3rd rows
print(arr[[0, 2], :])  # Output: [[10 20 30]
                       #         [70 80 90]]

# Select the 2nd and 3rd columns
print(arr[:, [1, 2]])  # Output: [[20 30]
                       #         [50 60]
                       #         [80 90]]


### 2. Fancy Indexing for Specific Elements

-   Combine row indices and column indices to pick specific elements.

1.2 Example:



In [None]:
# Example array
arr = np.array([[10, 20, 30], 
                [40, 50, 60], 
                [70, 80, 90]])

# Select elements at (0, 1), (1, 2), and (2, 0)
print(arr[[0, 1, 2], [1, 2, 0]])  # Output: [20 60 70]
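
One thing that is easy to trip over: two index lists are matched up pair by pair, so they do not select a block of rows and columns. A short sketch of the difference, using NumPy's `np.ix_` helper for the block selection (the names `rows` and `cols` are just for this illustration):

```python
import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60],
                [70, 80, 90]])

rows, cols = [0, 2], [1, 2]

# Two index lists are paired element by element: (0, 1) and (2, 2).
print(arr[rows, cols])          # [20 90]

# np.ix_ builds an open mesh, so every row is combined with every column.
print(arr[np.ix_(rows, cols)])  # [[20 30]
                                #  [80 90]]

# Equivalent two-step version: pick the rows first, then the columns.
print(arr[rows][:, cols])
```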


Exercises

Exercise 1: Extract Specific Rows and Columns:

1) Extract the 1st and 4th rows (0-indexed).
2) Extract the 2nd and 3rd columns for the 1st and 4th rows.
3) Select specific elements: (0, 1), (2, 3), and (3, 0).

* Bonus: Write a function `extract_rows_columns` that takes

1a) a 2D array,
2a) a list of row indices,
3a) a list of column indices,

and returns the submatrix containing only those rows and columns.

In [None]:
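# The exercise needs at least 4 rows and 4 columns, so the 4x4 matrix A_5
# from task 05 is used here (an assumption; the 3x3 `arr` above is too small).

# 1. 1st and 4th rows
print(A_5[[0, 3], :])

# 2. 2nd and 3rd columns of the 1st and 4th rows
#    (np.ix_ combines every listed row with every listed column)
print(A_5[np.ix_([0, 3], [1, 2])])

# 3. elements at (0, 1), (2, 3) and (3, 0)
print(A_5[[0, 2, 3], [1, 3, 0]])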

In [None]:
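# Bonus

def extract_rows_columns(arr, rows, columns):
    # np.ix_ combines every row index with every column index, which gives
    # the submatrix; arr[rows, columns] would pick paired elements instead.
    return arr[np.ix_(rows, columns)]

matrix_design = np.arange(50).reshape(5, -1)
extract_rows_columns(matrix_design, [1, 3], [0, 2])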



**06.** Using conditional indexing, find all the elements of matrix `A_6` which are:

- strictly greater than 1
- not equal to zero
- strictly less than -1 or greater than 2
- in the [0, 3] segment (endpoints included) 

In [None]:
A_6 = np.array([[0, -1, 3, 4, 2], [0, 2, 0, -4, 1], [0, 0, -5, 2, 1], [5, -1, 2, 0, -2]])
   
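print(A_6[A_6 > 1])                    # strictly greater than 1
print(A_6[A_6 != 0])                   # not equal to zero
print(A_6[(A_6 < -1) | (A_6 > 2)])     # strictly less than -1 or greater than 2
print(A_6[(A_6 >= 0) & (A_6 <= 3)])    # in the segment [0, 3], endpoints included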
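
What actually goes inside the square brackets is a boolean array (a *mask*) of the same shape as the matrix. A small sketch that looks at the mask itself, and at why `&` and `|` are needed instead of `and` and `or` (the name `mask` is just for this illustration):

```python
import numpy as np

A_6 = np.array([[0, -1, 3, 4, 2], [0, 2, 0, -4, 1],
                [0, 0, -5, 2, 1], [5, -1, 2, 0, -2]])

mask = (A_6 >= 0) & (A_6 <= 3)   # boolean array with the same shape as A_6
print(mask.dtype, mask.shape)    # bool (4, 5)
print(mask.sum())                # number of elements that satisfy the condition
print(A_6[mask])                 # the selected elements as a 1D array

# `and` / `or` do not work element-wise on arrays; use & and | with parentheses.
try:
    A_6[(A_6 >= 0) and (A_6 <= 3)]
except ValueError as err:
    print("ValueError:", err)
```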

**07.** Generate a random 3x7 matrix of floats, and then select all its entries greater than or equal to 0.5.

In [None]:
# A new name is used so that A_6 from the previous task is not overwritten.
A_7 = np.random.random(size=(3, 7))
print(A_7)
print(A_7[A_7 >= 0.5])

**08.** `np.linspace()` creates a vector of equally spaced points between two given points. Similar to this is `np.arange()`, which also produces a vector of equally spaced points between two given points. Wait, what's the difference then?

 - for `np.linspace()` you specify the <u>number of points</u> between the two given points
 - for `np.arange()` you specify the <u>stepsize</u> between the numbers in the interval between the given two points
 
To check this for yourself, do the following:

 - Make an array of 17 equally distributed points between 0 and 2
 - Make an array of points between 0 and 2 with stepsize 0.25

In [None]:
A = np.linspace(0, 2, 17)
print(A)

B = np.arange(0, 2, 0.25)
print(B)


-   np.arange ( ) - Purpose: Creates a sequence of numbers within a specified range, separated by a given step size. __The stop value is not included!__



-   np.linspace ( ) - Purpose: Creates a sequence of numbers between a start and end range, with a specified number of equally spaced points.
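
A quick sketch of both points, using the numbers from this task. `retstep=True` makes `np.linspace()` also return the step it computed; the variable names are just for this illustration.

```python
import numpy as np

# linspace: fixed number of points, endpoint included by default.
points, step = np.linspace(0, 2, 17, retstep=True)
print(len(points), step, points[-1])   # 17 0.125 2.0

# arange: fixed step, stop value excluded.
stepped = np.arange(0, 2, 0.25)
print(len(stepped), stepped[-1])       # 8 1.75

# To include 2 with arange, push the stop value one step past it.
stepped_incl = np.arange(0, 2 + 0.25, 0.25)
print(stepped_incl[-1])                # 2.0
```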

**09.** Create the same array using both `np.linspace()` and `np.arange()`.

In [None]:
arr1 = np.linspace(1,10,7)
arr2 = np.arange(1,11,1.5)
print(arr1,arr2)
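
Rather than comparing the two printouts by eye, the match can be checked in code. A minimal sketch, using `np.allclose()` because the two functions may differ by tiny floating-point rounding:

```python
import numpy as np

arr1 = np.linspace(1, 10, 7)
arr2 = np.arange(1, 11, 1.5)

# Same length and (up to rounding) the same values.
same = arr1.shape == arr2.shape and np.allclose(arr1, arr2)
print(same)
```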

**10.** `np.concatenate()` lets you stack two (or more) vectors/matrices next to one another or on top of each other. Given two vectors `v_10a` and `v_10b` use it to:
   - produce a new vector by stacking them one next to another (horizontal stacking)
   - produce a new matrix by stacking them on top of each other (vertical stacking) 

In [None]:
v_10a = np.array([1, 2, 3])
v_10b = np.array([4, 5, 6])
print(v_10a,v_10b)

In [None]:
np.concatenate((v_10a, v_10b), axis=0)

# Expected output 1,2,3,4,5,6

With this approach, we created a new vector that has `v_10a` and `v_10b` stacked horizontally. __Axis = 0__ is the only axis that 1D arrays have.

But why do I get an __Axis__ error when I try to build a new matrix by stacking these two arrays vertically with `axis=1`? AxisError: axis 1 is out of bounds for array of dimension 1

-   The error occurs because np.concatenate() requires that the input arrays have the same number of dimensions, and the axis you specify must exist for those arrays. Concatenating along axis 0 works because it is the only valid axis for 1D arrays; axis 1 simply does not exist for them.

-   If you want to stack the arrays vertically and create a 2D matrix, you first need to convert the 1D arrays into 2D arrays. This can be done using np.reshape() or np.newaxis.
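
Here is a sketch of the `np.newaxis` route, together with the failing call for comparison. The error is caught as `ValueError`/`IndexError` because NumPy's `AxisError` is a subclass of both; the names `rows` and `stacked` are just for this illustration.

```python
import numpy as np

v_10a = np.array([1, 2, 3])
v_10b = np.array([4, 5, 6])

# axis=1 does not exist for 1D inputs, so this raises an AxisError
# (a subclass of both ValueError and IndexError).
try:
    np.concatenate((v_10a, v_10b), axis=1)
except (ValueError, IndexError) as err:
    print("concatenate failed:", err)

# np.newaxis inserts a length-1 axis, turning each vector into a (1, 3) row matrix.
rows = (v_10a[np.newaxis, :], v_10b[np.newaxis, :])
stacked = np.concatenate(rows, axis=0)
print(stacked.shape)  # (2, 3)
print(stacked)
```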


In [None]:
v_10a = np.array([1, 2, 3])
v_10b = np.array([4, 5, 6])

b = np.vstack((v_10a, v_10b))
print(b)
b.shape

# np.vstack() can be used to build a 2D matrix directly from 1D vectors.


v_10a_2d = v_10a.reshape(1, -1)  # Shape: (1, 3)
v_10b_2d = v_10b.reshape(1, -1)  # Shape: (1, 3)

print(v_10a_2d.shape)
print(v_10b_2d.shape)


result = np.concatenate((v_10a_2d, v_10b_2d), axis=0)
print(result)



### Keypoints:

-   A 1D array has only axis 0, so axis=1 is invalid.

-   Convert 1D arrays to 2D using reshape() or use vstack() if you want to stack them vertically.

-   __np.concatenate ( ) requires the arrays to have the same number of dimensions.__

Now, do the same for matrices `A_10a` and `A_10b`.

In [None]:
A_10a = np.array([[1, 2], [3, 4]]) # 2D
A_10a

In [None]:
A_10b = np.array([[-3, -2], [1, 0]]) #2D
A_10b

In [None]:
c = np.concatenate((A_10a,A_10b),axis=1)
print(c)

# Here both axes are valid: axis=1 puts the matrices side by side, axis=0 puts them on top of each other.

c = np.concatenate((A_10a,A_10b),axis=0)
print(c)


c = np.vstack((A_10a,A_10b))
print(c)

**11**. Given a 2x3 matrix `A_11` use

 - `np.vstack()` to expand it by two rows with some matrix `A_11a` you define
 - `np.hstack()` to expand it by two columns with some matrix `A_11b` you define
 
*Hint: Watch for the shapes!*

In [None]:
A_11 = np.array([[1, 7, 1], [7, 1, 7]])
A_11

In [None]:
A_11.shape #2,3
A_11a = np.array([[5,5,5],[9,0,9]])
print(A_11a)
A_11a.shape
np.vstack((A_11,A_11a))

In [None]:
A_11b = np.arange(4).reshape(2, 2)  # 2 rows to match A_11, and exactly 2 new columns
print(A_11b)

np.hstack((A_11, A_11b))

**12.** Find a way to turn matrix `A_12` into a vector containing its elements. 

*Hint: Try listening to Bolero.*

In [None]:
A_12 = np.array([[4, 7], [7, 4], [4, 7]])
A_12

In [None]:
A_12V = A_12.reshape(-1)  # a single -1 flattens the matrix into a 1D vector of shape (6,)
print(A_12V.shape)
print(A_12V)

In [None]:
A_12.ravel()

### np.ravel ( )

-   The np.ravel() function in NumPy is used to flatten a multi-dimensional array into a 1D array. It works similarly to reshape(-1) but is more focused on flattening.

-   np.ravel() does not change the shape of the original array. It just returns a flattened version.
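
A short sketch of these two points, plus one detail worth knowing: for an ordinary contiguous array like this one, `ravel()` returns a *view*, so the flattened vector shares its data with the matrix. The name `flat` is just for this illustration.

```python
import numpy as np

A_12 = np.array([[4, 7], [7, 4], [4, 7]])
flat = A_12.ravel()

print(flat)            # [4 7 7 4 4 7]
print(flat.shape)      # (6,)
print(A_12.shape)      # (3, 2)  (the original keeps its shape)
print(np.array_equal(flat, A_12.reshape(-1)))  # True

# For a contiguous array like this one, ravel() returns a view,
# so writing to `flat` also changes A_12.
print(np.shares_memory(flat, A_12))  # True
flat[0] = 99
print(A_12[0, 0])  # 99
```

If an independent copy is needed, calling `.copy()` on the result avoids this sharing.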

**13.** For the matrix `A_13` find 

- mean of all of its elements
- mean of all of its elements in the second column
- mean of all of its elements for every row

In [9]:
A_13 = np.array([[1, 2, 3, 4], [-1, -2, -3, -4], [.1, .2, .3, .4]])
A_13

array([[ 1. ,  2. ,  3. ,  4. ],
       [-1. , -2. , -3. , -4. ],
       [ 0.1,  0.2,  0.3,  0.4]])

In [None]:
np.mean(A_13)

In [10]:
np.mean(A_13[:,1:2])

0.06666666666666667

In [12]:
np.mean(A_13, axis=1)

array([ 2.5 , -2.5 ,  0.25])

**14.** For matrix `A_14` find

- standard deviation of all of its elements
- standard deviation of all of its elements for every column

In [14]:
A_14 = np.array([[1, 2, np.nan, 4], [-1, np.nan, np.nan, -4], [np.nan, .2, .3, .4]])
A_14

array([[ 1. ,  2. ,  nan,  4. ],
       [-1. ,  nan,  nan, -4. ],
       [ nan,  0.2,  0.3,  0.4]])

-   Remember `np.nanmean()`? `np.nanstd()` works the same way: it ignores the `nan` entries.

In [16]:
np.nanstd(A_14)

2.1575086905966336

In [17]:
np.nanstd(A_14, axis=0)

array([1.        , 0.9       , 0.        , 3.27142511])

### __std( )__

-    The `std()` method in NumPy calculates the standard deviation of an array, which is a measure of the amount of variation or dispersion in a dataset.


-   Standard deviation (SD) quantifies how spread out the numbers in a dataset are relative to their mean. A low standard deviation indicates that the values are close to the mean, while a high standard deviation indicates that the values are spread out over a larger range.
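
To make the definition concrete, here is a sketch that computes the standard deviation by hand and compares it with NumPy. The numbers in `data` are made up for this illustration. By default NumPy divides by the number of elements (the population formula); `ddof=1` switches to the sample formula, which divides by one less.

```python
import numpy as np

data = np.array([1.0, 2.0, 4.0, 7.0])  # made-up example values

# Population standard deviation: square root of the mean squared distance from the mean.
mean = data.mean()
manual_std = np.sqrt(np.mean((data - mean) ** 2))
print(manual_std, np.std(data))

# Sample standard deviation: divide by n - 1 instead of n.
sample_std = np.std(data, ddof=1)
print(sample_std)
```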

**15.** `np.argmin()` and `np.argmax()` are two very useful functions for finding an <u>index (position)</u> of the minimal/maximal value of an array. Given matrix `A_15` find

- position of its biggest element 
- indices of every smallest element from its columns
- position of its biggest negative element

In [52]:
A_15 = np.array([[1, -1, 3, 4], [2, -6, -3, 0], [7, -4, 5, -5], [8, 9, 6, -7]])
A_15

array([[ 1, -1,  3,  4],
       [ 2, -6, -3,  0],
       [ 7, -4,  5, -5],
       [ 8,  9,  6, -7]])

In [53]:
np.argmax(A_15)

13
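
Without an `axis` argument, `np.argmax()` counts positions in the flattened matrix, which is why the answer is the single number 13. A sketch of turning that flat index back into a (row, column) pair with `np.unravel_index()`:

```python
import numpy as np

A_15 = np.array([[1, -1, 3, 4], [2, -6, -3, 0], [7, -4, 5, -5], [8, 9, 6, -7]])

flat_index = np.argmax(A_15)   # 13
row, col = np.unravel_index(flat_index, A_15.shape)
print(row, col)                # 3 1
print(A_15[row, col])          # 9
```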

In [54]:
np.argmin(A_15, axis=0)

array([0, 1, 1, 3], dtype=int64)

In [55]:
# position of its biggest negative element

# 1. Filter the negative elements: use a condition to keep only the negative entries of the matrix.

# 2. Use np.max() on the filtered negatives to identify the largest negative value.

# 3. Get the position of the largest negative element: use np.where() to find its indices.


# filter negatives and find the largest negative in the matrix

largest_neg = np.max(A_15[A_15<0])




In [57]:
# find the position of the element

position = np.where(A_15 == largest_neg)

print(position)  # row 0, column 1: that is where the largest negative value (-1) sits

(array([0], dtype=int64), array([1], dtype=int64))


**16.** `np.where()` is a function which can build a new array by choosing elements based on some condition, in a single line of code. For example, given a matrix `A_16`

In [58]:
A_16 = np.copy(A_15)
A_16

array([[ 1, -1,  3,  4],
       [ 2, -6, -3,  0],
       [ 7, -4,  5, -5],
       [ 8,  9,  6, -7]])

In [59]:
np.where(A_16 > 0, A_16**2, 2*A_16)

array([[  1,  -2,   9,  16],
       [  4, -12,  -6,   0],
       [ 49,  -8,  25, -10],
       [ 64,  81,  36, -14]])

squares all its positive values and doubles all the others (the original `A_16` itself is left unchanged).

Now, using `np.where()` divide all even values of matrix A_16 by two, while leaving odd values the same. 

In [66]:
np.where(A_16 % 2 ==0, A_16//2, A_16)

array([[ 1, -1,  3,  2],
       [ 1, -3, -3,  0],
       [ 7, -2,  5, -5],
       [ 4,  9,  3, -7]])

np.where() is a powerful NumPy function used for conditional indexing and searching. It allows you to return the indices of elements in an array that meet a given condition or to create a new array based on conditions.

#### Usage: 

1. Find Indices of Elements Satisfying a Condition

   If only the condition is provided, np.where() returns the indices where the condition is `True`.

In [60]:
arr = np.array([10, 15, 20, 25, 30])

#Find indices where elements are greater than 20

indices = np.where(arr > 20)
print(indices)  # Output: (array([3, 4]),)

(array([3, 4], dtype=int64),)


2. Conditional Selection:

   If x and y are provided, np.where() chooses elements from x where the condition is True, and from y where the condition is False.

In [61]:
arr = np.array([10, 15, 20, 25, 30])

# Replace elements greater than 20 with 0, others with 1
result = np.where(arr > 20, 0, 1)
print(result)  # Output: [1 1 1 0 0]


[1 1 1 0 0]
