In [62]:
import numpy as np
import pandas as pd

## Exercise 1: Creating Arrays
**Objective:** Learn to create and manipulate NumPy arrays.

1. Create a 1D array:

In [17]:
arr = np.array([1, 2, 3])
arr

array([1, 2, 3])

2. Create a 2D array:

In [19]:
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])
arr_2d

array([[1, 2, 3],
       [4, 5, 6]])

3. Create arrays with built-in functions:

In [15]:
zeros = np.zeros((3, 3))
ones = np.ones((2, 4))
range_arr = np.arange(10)

print(zeros)
print(ones)
print(range_arr)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
[0 1 2 3 4 5 6 7 8 9]


The most important attributes of an `ndarray` object:

- **ndim**: number of dimensions (axes) of the array (e.g. 3 for a 3D array)
- **shape**: tuple giving the size of the array along each dimension, e.g. `(n, m)` for a matrix with n rows and m columns
- **size**: total number of elements in the array
- **dtype**: object describing the type of the elements; see the [dtype documentation](https://numpy.org/doc/stable/reference/arrays.dtypes.html#arrays-dtypes) for its attributes
- **itemsize**: size in bytes of each element
- **data**: buffer containing the actual elements of the array

In [46]:
a = np.arange(27).reshape(3, 3, 3)

print(a)

print(f"Matrix of {a.shape} elements")
print(f"This is a {a.ndim}D matrix")
print(f"Elements are of type: {a.dtype.name}")
print(f"Each element has a size of: {a.itemsize} bytes")
print(f"The matrix has {a.size} elements")

print(type(a))

[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]]

 [[18 19 20]
  [21 22 23]
  [24 25 26]]]
Matrix of (3, 3, 3) elements
This is a 3D matrix
Elements are of type: int64
Each element has a size of: 8 bytes
The matrix has 27 elements
<class 'numpy.ndarray'>


## Exercise 2: Array Operations
**Objective**: Perform basic arithmetic and manipulation on arrays.

1. Element-wise operations:

In [52]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(f"Addition: {a + b}")
print(f"Multiplication: {a * b}")
print(f"Square each element: {a ** 2}")

Addition: [5 7 9]
Multiplication: [ 4 10 18]
Square each element: [1 4 9]


2. Dot product:<br/>
<i>NB: For two 1D arrays, `np.dot` returns the inner product (a scalar). For 2D arrays it performs matrix multiplication instead.</i>

In [55]:
dot_product = np.dot(a, b)

print(f"Dot product: {dot_product}")

Dot product: 32
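
To make the note on `np.dot` concrete, here is a minimal sketch comparing its behaviour on 1D and 2D inputs. The names `v`, `w`, `m`, and `n` are illustrative and not part of the exercise.

```python
import numpy as np

# 1D inputs: np.dot returns the inner product as a scalar.
v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
inner = np.dot(v, w)  # 1*4 + 2*5 + 3*6

# 2D inputs: np.dot performs matrix multiplication.
m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])
product = np.dot(m, n)

print(f"Inner product: {inner}")
print(f"Matrix product:\n{product}")
```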


3. Reshape an array:

In [61]:
arr = np.arange(12)
reshaped = arr.reshape(3, 4)

print(reshaped)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
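
As a small extension of the reshape example, the sketch below shows two details that are easy to miss: passing `-1` lets NumPy infer one dimension, and a shape whose total size does not match raises a `ValueError`. The names `base` and `inferred` are illustrative.

```python
import numpy as np

base = np.arange(12)

# -1 asks NumPy to infer this dimension from the total size (12 / 3 = 4).
inferred = base.reshape(3, -1)
print(inferred.shape)

# For a contiguous array like this one, reshape returns a view,
# so both arrays share the same underlying data.
print(np.shares_memory(base, inferred))

# A shape with a different total size (5 * 3 = 15) is rejected.
try:
    base.reshape(5, 3)
except ValueError as e:
    print(f"Error: {e}")
```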


## Exercise 3: Indexing and Slicing
**Objective:** Access and modify array elements.

**One-dimensional** arrays can be indexed, sliced and iterated over, much like lists and other Python sequences.

In [102]:
a = np.arange(10)**3
print(a)

# indexing
print(f"Element at index 2 (3rd pos): {a[2]}")

# slicing
print(f"From index 2 inclusive (3rd pos) to index 5 exclusive (6th pos) {a[2:5]}")
a[:6:2] = 1000
print("From start to index 6 (7th pos), exclusive, set every 2nd element to 1000")

# iterating
for i in a:
    print(i)

[  0   1   8  27  64 125 216 343 512 729]
Element at index 2 (3rd pos): 8
From index 2 inclusive (3rd pos) to index 5 exclusive (6th pos) [ 8 27 64]
From start to index 6 (7th pos), exclusive, set every 2nd element to 1000
1000
1
1000
27
1000
125
216
343
512
729


**Multidimensional** arrays can have one index per axis. These indices are given in a tuple separated by commas:

In [100]:
def f(x, y):
    return 10 * x + y

b = np.fromfunction(f, (5, 4), dtype=int)
print(b)

# indexing
print(f"Element at row index 2 and col index 3: {b[2, 3]}")

# slicing
# equivalent to b[0:5, 1]
print(f"Each row in the second column {b[:, 1]}")
print(f"Missing indices are default to complete slice: {b[-1]}")

# iterating
print("Iterating over row:")
for row in b:
    print(row)

print("Iterating over each element:")
for el in b.flat:
    print(el)

[[ 0  1  2  3]
 [10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]
Element at row index 2 and col index 3: 23
Each row in the second column [ 1 11 21 31 41]
Missing indices are default to complete slice: [40 41 42 43]
Iterating over row:
[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]
Iterating over each element:
0
1
2
3
10
11
12
13
20
21
22
23
30
31
32
33
40
41
42
43


## Exercise 4: Array Aggregation
**Objective**: Use NumPy to compute statistics.

1. Sum, mean, and standard deviation:

In [111]:
arr = np.array([[1, 2, 3], 
                [4, 5, 6]])

print(f"Sum of all elements: {np.sum(arr)}")
print(f"Mean of each column (axis 0): {np.mean(arr, axis=0)}")
print(f"Standard deviation: {np.std(arr)}")

Sum of all elements: 21
Mean of each column (axis 0): [2.5 3.5 4.5]
Standard deviation: 1.707825127659933
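
Note that `np.std` computes the population standard deviation by default (`ddof=0`). The sketch below, using the same values under the illustrative name `values`, shows how `ddof=1` gives the sample standard deviation instead.

```python
import numpy as np

values = np.array([[1, 2, 3],
                   [4, 5, 6]])

# Default: divide the sum of squared deviations by N (population).
population_std = np.std(values)

# ddof=1: divide by N - 1 (sample), which is also the pandas default.
sample_std = np.std(values, ddof=1)

print(f"Population std (ddof=0): {population_std}")
print(f"Sample std (ddof=1): {sample_std}")
```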


2. Min and max:

In [118]:
print(f"Min of the matrix: {np.min(arr)}")
print(f"Max of each row (axis 1): {np.max(arr, axis=1)}")

Min of the matrix: 1
Max of each row (axis 1): [3 6]


## Exercise 5: Broadcasting
**Objective**: Understand how NumPy handles operations on arrays of different shapes.

1. Add a scalar to an array:

In [8]:
a = np.array([1, 2, 3])
print(a + 5)

[6 7 8]


2. Add arrays with different shapes:

In [21]:
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([10, 20, 30])
print(a + b)

[[11 22 33]
 [14 25 36]]


### Example of Broadcasting Rules
Here we try to add `a`, with shape (3, 4), to `b`, with shape (3, 3). The operation fails because the shapes are not compatible:

In [12]:
# Array 1: Shape (3, 4)
a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

# Array 2: Shape (3, 3) - Incompatible with Array 1
b = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Attempt to add the arrays
try:
    result = a + b
except ValueError as e:
    print(f"Error: {e}")

Error: operands could not be broadcast together with shapes (3,4) (3,3) 


### Explanation of the Error
- **Shape of** `a`: (3, 4)
- **Shape of** `b`: (3, 3)
<br/>

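NumPy compares shapes dimension by dimension, starting from the last one. Two dimensions are compatible when they are equal or when one of them is 1. Here the shapes are incompatible because:
- The second dimension of `a` (size 4) **does not match the second dimension** of `b` (size 3).
- **Neither array has a size of 1** in the mismatched dimension, so broadcasting cannot occur.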
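
The rule can be checked without building the arrays. This sketch assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available, and shows one way to get a compatible shape: a column vector of shape (3, 1). The names `x`, `col`, and `result` are illustrative.

```python
import numpy as np

# Compatible: the size-1 dimension is stretched to match size 4.
print(np.broadcast_shapes((3, 4), (3, 1)))

# Incompatible: 4 and 3 differ and neither is 1.
try:
    np.broadcast_shapes((3, 4), (3, 3))
except ValueError as e:
    print(f"Error: {e}")

x = np.arange(1, 13).reshape(3, 4)
# np.newaxis turns the 1D array of shape (3,) into a column of shape (3, 1).
col = np.array([10, 20, 30])[:, np.newaxis]
result = x + col
print(result)
```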


## Exercise 6: Random Numbers
**Objective:** Generate and use random numbers.

1. Generate random numbers:

In [45]:
random_arr = np.random.rand(3, 3)
print(random_arr)

[[0.40806233 0.30552331 0.23931116]
 [0.00543178 0.82666763 0.97354975]
 [0.70812338 0.78356729 0.16341049]]


2. Generate random integers:

In [48]:
random_ints = np.random.randint(0, 10, size=(2, 4))
print(random_ints)

[[6 7 7 4]
 [0 5 0 5]]
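
The outputs above change on every run. If reproducible results are needed, one option is a seeded `Generator` created with `np.random.default_rng`, sketched below. The seed value 42 and the names `rng`, `floats`, and `ints` are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator produces the same sequence on every run.
rng = np.random.default_rng(seed=42)

# Uniform floats in [0, 1), similar to np.random.rand(3, 3).
floats = rng.random((3, 3))

# Integers in [0, 10), similar to np.random.randint(0, 10, size=(2, 4)).
ints = rng.integers(0, 10, size=(2, 4))

print(floats)
print(ints)
```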


## Exercise 7: Advanced Operations
**Objective**: Explore advanced NumPy features.

1. Matrix multiplication:

In [52]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
print(np.matmul(a, b))

[[19 22]
 [43 50]]
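
A common pitfall is to confuse matrix multiplication with the element-wise `*` operator. The sketch below contrasts the two, and uses the `@` operator as a shorthand for `np.matmul`. The names `p` and `q` are illustrative.

```python
import numpy as np

p = np.array([[1, 2], [3, 4]])
q = np.array([[5, 6], [7, 8]])

# @ is the matrix multiplication operator, equivalent to np.matmul here.
matrix_product = p @ q

# * multiplies matching elements and is not a matrix product.
elementwise_product = p * q

print(matrix_product)
print(elementwise_product)
```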


2. Transpose a matrix:

In [55]:
print(a.T)

[[1 3]
 [2 4]]


3. Find unique elements:

In [58]:
arr = np.array([1, 2, 2, 3, 4, 4, 4])
print(np.unique(arr))

[1 2 3 4]
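
`np.unique` can also report how often each value occurs through its `return_counts` parameter. A minimal sketch, with the illustrative names `data`, `uniques`, and `counts`:

```python
import numpy as np

data = np.array([1, 2, 2, 3, 4, 4, 4])

# return_counts=True adds a second array with the number of occurrences.
uniques, counts = np.unique(data, return_counts=True)

for value, count in zip(uniques, counts):
    print(f"{value} appears {count} time(s)")
```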


## Exercise 8: Real-World Application
**Objective**: Apply NumPy to a real-world dataset.

1. Load a dataset:

In [64]:
df = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")
arr = df[["Age", "Fare"]].to_numpy()
print(arr)

[[22.      7.25  ]
 [38.     71.2833]
 [26.      7.925 ]
 ...
 [    nan 23.45  ]
 [26.     30.    ]
 [32.      7.75  ]]


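2. Compute summary statistics similar to the pandas `describe` method. Note that these values are computed over both columns combined, and that `arr.size` counts every entry, including NaN: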

In [118]:
print(f"Count: {arr.size}")
print(f"Mean: {np.nanmean(arr).round(4)}")
print(f"Std: {np.nanstd(arr).round(4)}")
print(f"Min: {np.nanmin(arr)}")
print(f"25%: {np.nanpercentile(arr, 25).round(4)}")
print(f"50%: {np.nanpercentile(arr, 50).round(4)}")
print(f"75%: {np.nanpercentile(arr, 75).round(4)}")
print(f"Max: {np.nanmax(arr)}")
print(f"Type: {arr.dtype.name}")

Count: 1782
Mean: 31.0898
Std: 38.2706
Min: 0.0
25%: 10.4625
50%: 24.0
75%: 36.0
Max: 512.3292
Type: float64
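
To get closer to what `describe` reports, the statistics can be computed per column with `axis=0`, counting only non-NaN entries. The sketch below uses a small hypothetical array (`sample`) in place of the Titanic data so that it runs without a download. Its values are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the [Age, Fare] columns, with one missing age.
sample = np.array([[22.0, 7.25],
                   [38.0, 71.2833],
                   [np.nan, 8.05],
                   [26.0, 7.925]])

# Count only the entries that are not NaN, column by column.
count = np.count_nonzero(~np.isnan(sample), axis=0)

# axis=0 aggregates down the rows, giving one value per column.
mean = np.nanmean(sample, axis=0)

# ddof=1 gives the sample standard deviation, matching the pandas default.
std = np.nanstd(sample, axis=0, ddof=1)

print(f"Count per column: {count}")
print(f"Mean per column: {mean.round(4)}")
print(f"Std per column: {std.round(4)}")
```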
