In [3]:
import numpy as np

# 1. Creating NumPy Arrays

There are numerous ways you can create NumPy arrays. For example, you can create an array from a Python list as follows:

In [4]:
a = [1, 2, 3]
arr = np.array(a)
print(arr)

[1 2 3]


NumPy arrays can be multidimensional. For example, we can create an array `arr` storing the matrix $\begin{pmatrix}1 & 2\\4 & 5\end{pmatrix}$ as follows:

In [5]:
arr = np.array([[1, 2], [4, 5]])
print(arr)

[[1 2]
 [4 5]]


To create an array containing only zeros, we use `np.zeros()`. The `shape` argument expects a tuple determining the shape of the zero array.

In [6]:
arr1 = np.zeros(2) # or np.zeros(shape=(2,))
arr2 = np.zeros(shape=(3, 2)) # 3x2 matrix
arr3 = np.zeros(shape=(2, 3, 2)) # A 3-dimensional array

print(f"arr1 = \n{arr1}\n")
print(f"arr2 = \n{arr2}\n")
print(f"arr3 = \n{arr3}\n")

arr1 = 
[0. 0.]

arr2 = 
[[0. 0.]
 [0. 0.]
 [0. 0.]]

arr3 = 
[[[0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]]]



Similarly, we can create a new array containing only ones using `np.ones()`.

In [7]:
arr2 = np.ones(shape=(3, 2))
print(arr2)

[[1. 1.]
 [1. 1.]
 [1. 1.]]


The function `np.empty()` creates a new NumPy array and is very fast. But be aware that we do not know which values it will contain, as it just allocates some memory for your array. It is useful if we are going to fill in the values ourselves later.

In [8]:
arr = np.empty(10)

for i in range(10):
    arr[i] = 2 * i

print(arr)

[ 0.  2.  4.  6.  8. 10. 12. 14. 16. 18.]
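
For a pattern as regular as this one, the explicit loop can usually be avoided. The following is a minimal sketch (the names `looped` and `vectorised` are illustrative, not part of the original notebook) that builds the same array with a single vectorised expression and checks that both approaches agree.

```python
import numpy as np

# Loop-based fill, as in the cell above
looped = np.empty(10)
for i in range(10):
    looped[i] = 2 * i

# Vectorised alternative: scale an index array in one step
vectorised = 2.0 * np.arange(10)

# Both approaches should produce identical values
print(np.array_equal(looped, vectorised))
```

The loop version is still a reasonable choice when each value depends on a computation that cannot easily be expressed on whole arrays.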


There is also a dedicated function `np.eye(n)` returning the $n\times n$ identity matrix (ones on the diagonal, zeros everywhere else).

In [9]:
arr = np.eye(4) # 4x4 identity matrix
print(arr)

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


NumPy's `np.arange()` is similar to Python's `range()` function but returns a NumPy array instead of a `range` object.

In [10]:
arr1 = np.arange(10)
arr2 = np.arange(2, 7)
arr3 = np.arange(1, 11, 2)
arr4 = np.arange(5, 0, -1)

print(f"arr1 = {arr1}")
print(f"arr2 = {arr2}")
print(f"arr3 = {arr3}")
print(f"arr4 = {arr4}")

arr1 = [0 1 2 3 4 5 6 7 8 9]
arr2 = [2 3 4 5 6]
arr3 = [1 3 5 7 9]
arr4 = [5 4 3 2 1]


To create an evenly spaced 1-dimensional grid, we can use `np.linspace()`. For example, if we want a grid with $11$ points on the interval $[-1, 1]$, we can do as follows:

In [11]:
arr = np.linspace(-1, 1, 11)
print(arr)

[-1.  -0.8 -0.6 -0.4 -0.2  0.   0.2  0.4  0.6  0.8  1. ]
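
Two optional arguments of `np.linspace()` are worth knowing about. The sketch below (variable names are illustrative) uses `retstep=True` to also get the spacing between grid points and `endpoint=False` to leave out the right end of the interval.

```python
import numpy as np

# retstep=True additionally returns the spacing between neighbouring points
grid, step = np.linspace(-1, 1, 11, retstep=True)
print(grid.shape)
print(step)  # (1 - (-1)) / (11 - 1), i.e. approximately 0.2

# endpoint=False excludes the right end of the interval
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)  # five points starting at 0 with spacing 0.2
```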


### Shape and data types of NumPy Arrays

Two important properties of every NumPy array are its `dtype`, which describes the data type of its elements, and its `shape`, which describes the size of each dimension.

In [12]:
arr = np.array([1, 2, 3])

print(arr)
print(f"dtype: {arr.dtype}")
print(f"shape: {arr.shape}")

[1 2 3]
dtype: int64
shape: (3,)


Here is another example where we use `np.random.uniform()` to create a random NumPy array with shape $(2, 4, 3)$ sampled uniformly from the interval $[0, 1)$. You can read more about NumPy's random module [here (link)](https://numpy.org/doc/stable/reference/random/index.html).

In [13]:
arr = np.random.uniform(size=(2, 4, 3))
print(arr)
print(f"dtype: {arr.dtype}")
print(f"shape: {arr.shape}")

[[[0.29246383 0.8847958  0.3466321 ]
  [0.46188101 0.17661155 0.27870535]
  [0.67559919 0.91185079 0.49854295]
  [0.05408619 0.97659697 0.16194039]]

 [[0.69893787 0.45325332 0.02668294]
  [0.45429636 0.06017868 0.32256033]
  [0.84560915 0.06352706 0.93623699]
  [0.06247186 0.73380964 0.65298109]]]
dtype: float64
shape: (2, 4, 3)
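
Because the array above is random, its values change on every run. If reproducible numbers are needed, one option is NumPy's `Generator` interface, created with `np.random.default_rng()`. The sketch below assumes an arbitrary seed of `42`; the names `rng`, `sample` and `sample_again` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the seed value is arbitrary
sample = rng.uniform(low=0.0, high=1.0, size=(2, 4, 3))

# A second generator with the same seed reproduces the same numbers
rng_again = np.random.default_rng(seed=42)
sample_again = rng_again.uniform(low=0.0, high=1.0, size=(2, 4, 3))

print(sample.shape)
print(np.array_equal(sample, sample_again))
```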


NumPy arrays can also store boolean and string values.

In [14]:
arr1 = np.array([[True, False], [False, False]])
arr2 = np.array([["Hel"], ["lo"], ["wo"], ["rld"]])

print("arr1:")
print(arr1)
print(f"dtype: {arr1.dtype}")
print(f"shape: {arr1.shape}")

print("\narr2:")
print(arr2)
print(f"dtype: {arr2.dtype}")
print(f"shape: {arr2.shape}")

arr1:
[[ True False]
 [False False]]
dtype: bool
shape: (2, 2)

arr2:
[['Hel']
 ['lo']
 ['wo']
 ['rld']]
dtype: <U3
shape: (4, 1)


By calling `len()` on a NumPy array, we get the size of the first dimension, so `len(arr)` is equivalent to `arr.shape[0]`.

In [15]:
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
print(arr.shape)
print(arr.shape[0])
print(len(arr))

(4, 2)
4
4


For more details about NumPy data types, see [here (link)](https://numpy.org/doc/stable/user/basics.types.html). To convert an array from one data type to another, one can use `np.ndarray.astype()`. 

In [16]:
arr = np.array(["2.1", "3.6"])
print(arr)

arr = arr.astype(np.float64)
print(arr)

arr = arr.astype(np.int64)
print(arr)

['2.1' '3.6']
[2.1 3.6]
[2 3]
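
Note that converting floats to integers with `astype()` discards the fractional part instead of rounding, which is why `3.6` became `3` above. The following sketch (illustrative variable names) contrasts plain truncation with rounding first via `np.round()`.

```python
import numpy as np

values = np.array([-2.7, -0.5, 0.5, 2.7])

# astype() simply drops the fractional part (truncation towards zero)
truncated = values.astype(np.int64)

# Rounding first gives the nearest integer (ties go to the even neighbour)
rounded = np.round(values).astype(np.int64)

print(truncated)
print(rounded)
```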


## Exercises

### 1. CSV Data to NumPy Array

The variable `csv_content` contains comma-separated data. Convert it to a NumPy array `arr` of shape `(3, 4)` with `dtype=np.float64`.

In [17]:
csv_content = "1.5,2.2,7.5,0.1\n1.2,7.0,8.9,7.5\n5.5,9.9,9.5,3.4"

# Your code here...
arr = ...

# Solution
arr = np.array([row.split(",") for row in csv_content.split("\n")]).astype(np.float64)

# Automatic tests:
assert (arr == np.array([[1.5, 2.2, 7.5, 0.1], [1.2, 7.,  8.9, 7.5], [5.5, 9.9, 9.5, 3.4]])).all()
assert arr.shape == (3, 4)
assert arr.dtype == np.float64
print("All test passed!")

All test passed!


### 2. Function Values on a Grid

Use `np.linspace()` to create a grid `X` on $[0, 2\pi]$ with $8$ points (NumPy provides $\pi$ as a constant `np.pi`).

Then use `np.cos` on `X` and store the result in a variable `y`. Calling `np.cos(X)` will compute `cos(x)` element-wise on `X` and return an array of the same shape as `X`.

In [18]:
# Your code here
X = ...
y = ...

# Solution
X = np.linspace(0, 2 * np.pi, 8)
y = np.cos(X)

# Automatic tests:
assert np.allclose(X, np.array([0.0, 0.8975979010256552, 1.7951958020513104, 2.6927937030769655, 3.5903916041026207, 4.487989505128276, 5.385587406153931, 6.283185307179586]))
assert np.allclose(y, np.array([1., 0.6234898, -0.22252093, -0.90096887, -0.90096887, -0.22252093, 0.6234898, 1.]))
assert X.shape == y.shape == (8,)
assert X.dtype == y.dtype == np.float64  # np.float_ was removed in NumPy 2.0
print("All test passed!")

All test passed!


### 3. Saving and Loading NumPy Arrays

NumPy comes with the functions `np.save()` ([documentation](https://numpy.org/doc/stable/reference/generated/numpy.save.html)) and `np.load()` ([documentation](https://numpy.org/doc/stable/reference/generated/numpy.load.html)).

Create a NumPy array of shape `(4, 2, 2)` containing only ones using `np.ones()` and save it to a file named `ones_array.npy` using `np.save()`.

In [19]:
# Your code here

# Solution
arr = np.ones(shape=(4, 2, 2))
np.save("ones_array.npy", arr)

# Automatic test:
arr = np.load("ones_array.npy")
assert np.allclose(arr, np.array([[[1., 1.], [1., 1.]], [[1., 1.], [1., 1.]], [[1., 1.], [1., 1.]], [[1., 1.], [1., 1.]]]))
print("All test passed!")

All test passed!


# 2. Indexing NumPy Arrays

NumPy arrays can be indexed in the same way we index Python lists.

In [20]:
arr = np.array([1, 2, 3, 4])

print(arr[0])  # First value
print(arr[1])  # Second value
print(arr[-1]) # Last value

1
2
4


We can also use slicing to index a subarray with `arr[start:stop]`, which selects the elements from index `start` up to, but not including, index `stop`. You can choose the step size by using `arr[start:stop:step]` as well.

In [21]:
arr = np.array([1, 2, 3, 4, 5, 6])

print(arr[0:2])  # First two values
print(arr[:2])   # Does the same as above
print(arr[2:-1]) # From index 2 to the second last value
print(arr[2:])   # Everything starting from index 2
print(arr[::-1]) # Reverse array

[1 2]
[1 2]
[3 4 5]
[3 4 5 6]
[6 5 4 3 2 1]
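
One difference from Python lists is that slicing a NumPy array returns a *view* of the original data, not a copy. The sketch below (illustrative variable names) shows that writing to a slice changes the original array, and that `.copy()` can be used when an independent array is wanted.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# A slice is a view: it shares memory with the original array
view = arr[1:4]
view[0] = 100
print(arr)  # the value at index 1 has changed in arr as well

# An explicit copy is independent of the original
independent = arr[1:4].copy()
independent[0] = -1
print(arr)  # unchanged by the assignment above
```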


Indexing also works on multidimensional NumPy arrays. The syntax is `arr[i_0, i_1, ..., i_n]` where `i_j` is the index for dimension `j`. Slicing works here as well.

In [22]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr)           # Original array
print(arr[1, 1])     # The middle value at index (1, 1)
print(arr[0])        # First row
print(arr[:, 0])     # First column
print(arr[1:, 1:])   # Bottom right 2x2 square submatrix
print(arr[::2, ::2]) # Everything but the middle "cross"

[[1 2 3]
 [4 5 6]
 [7 8 9]]
5
[1 2 3]
[1 4 7]
[[5 6]
 [8 9]]
[[1 3]
 [7 9]]


#### Conditional Indexing

We can also index using boolean arrays. One useful application is conditional indexing. For example, the code cell below builds a boolean array marking all entries in `arr` greater than or equal to `16` and then uses this boolean array to extract those entries.

In [23]:
arr = np.array([[1, 2, 4], [8, 16, 32], [64, 128, 256]])
idxs = (arr >= 16)
print(arr)
print(idxs)
print(arr[idxs])

[[  1   2   4]
 [  8  16  32]
 [ 64 128 256]]
[[False False False]
 [False  True  True]
 [ True  True  True]]
[ 16  32  64 128 256]
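
Several conditions can be combined with the element-wise operators `&` (and), `|` (or) and `~` (not), and a boolean array can also be used to assign values. The following is a small sketch with illustrative names (`mask`, `selected`, `clipped`).

```python
import numpy as np

arr = np.array([[1, 2, 4], [8, 16, 32], [64, 128, 256]])

# Parentheses are needed because & binds more tightly than the comparisons
mask = (arr >= 4) & (arr < 100)
selected = arr[mask]

# Boolean indexing also works on the left-hand side of an assignment
clipped = arr.copy()
clipped[clipped > 100] = 100

print(selected)
print(clipped)
```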


### Bonus: Useful Methods of Boolean Arrays

**Using `arr.all()` and `arr.any()` on boolean arrays**: When you have a boolean NumPy array `arr`, you can use the methods `.all()` and `.any()` to reduce the array to a single boolean value. The method `.all()` returns the logical AND of all the values in the array, whereas the method `.any()` returns the logical OR. We can also specify the axis along which to take the AND/OR (see the example below).

In [24]:
# Over all dimensions
arr = np.array([[True, False], [True, False]])
print(arr)
print(f"any() : {arr.any()}")
print(f"all() : {arr.all()}")

arr = np.array([[True, True], [True, True]])
print(arr)
print(f"any() : {arr.any()}")
print(f"all() : {arr.all()}")

# Along a given dimension
arr = np.array([[True, False], [True, False]])
print(arr)
print(f"any(axis=0) : {arr.any(axis=0)}")
print(f"any(axis=1) : {arr.any(axis=1)}")
print(f"all(axis=0) : {arr.all(axis=0)}")
print(f"all(axis=1) : {arr.all(axis=1)}")

# Combination of all() and any()
# Check if there exists at least one column with all values True
arr = np.array([[True, True], [False, False]])
print(arr)
print(f"all(axis=0).any() : {arr.all(axis=0).any()}")

arr = np.array([[True, False], [True, False]])
print(arr)
print(f"all(axis=0).any() : {arr.all(axis=0).any()}")

[[ True False]
 [ True False]]
any() : True
all() : False
[[ True  True]
 [ True  True]]
any() : True
all() : True
[[ True False]
 [ True False]]
any(axis=0) : [ True False]
any(axis=1) : [ True  True]
all(axis=0) : [ True False]
all(axis=1) : [False False]
[[ True  True]
 [False False]]
all(axis=0).any() : False
[[ True False]
 [ True False]]
all(axis=0).any() : True


## Exercises

### 1. Indexing and Slicing a Matrix (2D-array)

First, create a NumPy array named `arr` storing the matrix $A=\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}$.

Then use indexing and slicing to perform the following tasks:

1. Store the centre value of $A$, i.e., the value at index $(1,1)$ to a variable named `centre_value`.
2. Store the second row of $A$, i.e., the array `[4, 5, 6]`, in a variable named `second_row`.
3. Store the last column of $A$, i.e., the array `[3, 6, 9]`, in a variable named `last_column`.
4. Store the bottom left $2\times2$ sub-matrix of $A$, i.e., the array `[[4, 5], [7, 8]]` in a variable named `bottom_left_submatrix`.

In [25]:
# Your code here
arr = ...
centre_value = ...
second_row = ...
last_column = ...
bottom_left_submatrix = ...

# Solution:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
centre_value = arr[1,1]
second_row = arr[1]
last_column = arr[:, -1]
bottom_left_submatrix = arr[1:,:2]

# Automatic tests:
assert (arr == np.arange(1, 10).reshape(3, 3)).all()
assert centre_value == 5
assert (second_row == np.array([4, 5, 6])).all()
assert (last_column == np.array([3, 6, 9])).all()
assert (bottom_left_submatrix == np.array([[4, 5], [7, 8]])).all()
print("All test passed!")

All test passed!


### 2. Find All Values Over a Given Threshold

Write a function `return_values_over_threshold(arr, threshold)` that takes in a NumPy array `arr` and a float `threshold`, and returns a NumPy array containing those values in `arr` *greater than or equal to* `threshold`. 

**NB! Do not use any loops for this task.** Vectorizing our code speeds up things and is one of the main reasons we use NumPy.

In [26]:
def return_values_over_threshold(arr: np.ndarray, threshold: float):
    ...

# Solution:
def return_values_over_threshold(arr: np.ndarray, threshold: float):
    idxs = (arr >= threshold)
    return arr[idxs]

# Automatic tests
arr = np.array([-3, 4, -1, 5, 7, 12, 0, -8, 4, -3, 1])
threshold = -0.5
result = return_values_over_threshold(arr, threshold)
assert (result == np.array([4, 5, 7, 12, 0, 4, 1])).all()

arr = np.array([[17, 16, 38], [14, 1, 20], [43, 11, 23], [31, 15, 18]])
threshold = 16
result = return_values_over_threshold(arr, threshold)
assert (result == np.array([17, 16, 38, 20, 43, 23, 31, 18])).all()

print("All test passed!")

All test passed!


# 3. Reshaping, Transposing and Concatenating Arrays

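Every so often, we want to reshape our NumPy arrays. Reshaping only changes how the existing values are arranged. Whenever possible, NumPy returns a view of the same data instead of copying it, which makes reshaping a very fast operation.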

For example, if we have a matrix $A=\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}$ stored as a NumPy array `arr` and we want to "flatten" it to a 1-dimensional array of length $9$, we can simply write `arr.reshape(9)`. We can also use `-1` to let NumPy infer the size when possible. Furthermore, we can reshape the array back into a 2-dimensional one.

In [27]:
# Create a 3x3 array
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr)
print(f"shape: {arr.shape}")

# Flatten the array
arr = arr.reshape(9) # or arr.reshape(-1)
print(arr)
print(f"shape: {arr.shape}")

# Reshape it back to a 3x3 array
arr = arr.reshape(-1, 3) # or arr.reshape(3, 3) or arr.reshape(3, -1)
print(arr)
print(f"shape: {arr.shape}")

[[1 2 3]
 [4 5 6]
 [7 8 9]]
shape: (3, 3)
[1 2 3 4 5 6 7 8 9]
shape: (9,)
[[1 2 3]
 [4 5 6]
 [7 8 9]]
shape: (3, 3)
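
Whether `reshape()` can reuse the existing memory depends on how the array is laid out. As a rough sketch (variable names are illustrative), `np.shares_memory()` can be used to check whether a reshaped array is a view of the original or a fresh copy.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Flattening a freshly created (contiguous) array gives a view
flat = arr.reshape(-1)
is_view = np.shares_memory(arr, flat)

# Flattening the transpose requires reordering the values, so NumPy copies
flat_t = arr.T.reshape(-1)
is_view_t = np.shares_memory(arr, flat_t)

print(is_view, is_view_t)
print(flat_t)
```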


Here are some more examples of reshaping.

In [28]:
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr)
print(f"shape: {arr.shape}")

arr = arr.reshape(-1, 4)
print(arr)
print(f"shape: {arr.shape}")

arr = arr.reshape(1, 8)
print(arr)
print(f"shape: {arr.shape}")

arr = arr.reshape(8, 1)
print(arr)
print(f"shape: {arr.shape}")

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
shape: (2, 2, 2)
[[1 2 3 4]
 [5 6 7 8]]
shape: (2, 4)
[[1 2 3 4 5 6 7 8]]
shape: (1, 8)
[[1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]]
shape: (8, 1)


We can also transpose a matrix `arr` by using `arr.transpose()` or simply `arr.T`. For "transposing" higher-dimensional arrays, see `np.swapaxes()` ([documentation](https://numpy.org/doc/stable/reference/generated/numpy.swapaxes.html)).

In [29]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr)

arr_transposed = arr.T # or arr.transpose()
print(arr_transposed)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[1 4 7]
 [2 5 8]
 [3 6 9]]
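
For arrays with more than two dimensions, the sketch below (illustrative names) shows how the shape changes under `np.swapaxes()`, under `transpose()` with an explicit axis order, and under `.T`, which reverses the order of all axes.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

swapped = np.swapaxes(arr, 0, 2)    # exchange the first and last axes
permuted = arr.transpose(1, 0, 2)   # explicit new order of the axes
reversed_axes = arr.T               # reverses the order of all axes

print(swapped.shape)
print(permuted.shape)
print(reversed_axes.shape)
```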


Often, we would like to join multiple NumPy arrays together. NumPy offers many functions that help us achieve this.

The two main ones are `np.concatenate()` ([documentation](https://numpy.org/doc/stable/reference/generated/numpy.concatenate.html)) and `np.stack()` ([documentation](https://numpy.org/doc/stable/reference/generated/numpy.stack.html)).

Stacking with `np.stack()` adds a new dimension while stacking arrays, whereas `np.concatenate()` joins arrays along an existing axis without adding a new dimension.

Here is a simple example.

In [72]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Stack arrays along a new axis (in dimension 0, i.e., stack as rows)
stacked_arr = np.stack((arr1, arr2), axis=0)
print("Stacked in new dimension 0")
print(stacked_arr)
print(f"shape: {stacked_arr.shape}")

# Stack arrays along another new axis
stacked_arr = np.stack((arr1, arr2), axis=1)
print("Stacked in new dimension 1")
print(stacked_arr)
print(f"shape: {stacked_arr.shape}")

# Concatenate in existing dimension 0
concatenated_arr = np.concatenate((arr1, arr2), axis=0)
print("Concatenated in existing dimension")
print(concatenated_arr)
print(f"shape: {concatenated_arr.shape}")

Stacked in new dimension 0
[[1 2 3]
 [4 5 6]]
shape: (2, 3)
Stacked in new dimension 1
[[1 4]
 [2 5]
 [3 6]]
shape: (3, 2)
Concatenated in existing dimension
[1 2 3 4 5 6]
shape: (6,)


Let us see how we can concatenate two 2-dimensional arrays as well.

In [79]:
arr3 = np.array([[1, 2, 3], [4, 5, 6]])
arr4 = np.array([[7, 8, 9], [10, 11, 12]])

print("Original arrays")
print(f"arr3 =\n{arr3} (shape: {arr3.shape})\n")
print(f"arr4 =\n{arr4} (shape: {arr4.shape})\n")

# Concatenate arrays along axis=0
concatenated_arr = np.concatenate((arr3, arr4), axis=0)
print("Stacked along dimension 0 (rows)")
print(concatenated_arr)
print(f"new shape: {concatenated_arr.shape}")

# Concatenate arrays along axis=1
concatenated_arr = np.concatenate((arr3, arr4), axis=1)
print("Stacked along dimension 1 (columns)")
print(concatenated_arr)
print(f"new shape: {concatenated_arr.shape}")

Original arrays
arr3 =
[[1 2 3]
 [4 5 6]] (shape: (2, 3))

arr4 =
[[ 7  8  9]
 [10 11 12]] (shape: (2, 3))

Stacked along dimension 0 (rows)
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
new shape: (4, 3)
Stacked along dimension 1 (columns)
[[ 1  2  3  7  8  9]
 [ 4  5  6 10 11 12]]
new shape: (2, 6)


Here is a list of commonly used functions for stacking and concatenating arrays in NumPy. In practice, `np.concatenate()` and `np.stack()` are sufficient for most tasks.

**Concatenation Functions:**
- `np.concatenate()`
- `np.vstack()`
- `np.hstack()`
- `np.dstack()`
- `np.column_stack()`
- `np.row_stack()` (an alias of `np.vstack()`, deprecated since NumPy 2.0)

**Stacking Functions:**
- `np.stack()`
- `np.block()`
- `np.tile()`
- `np.repeat()`

**Shorthand Utilities:**
- `np.c_[]`
- `np.r_[]`
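
Several of the helpers listed above are convenience wrappers around the two main functions. As a small sketch for 1-dimensional inputs (illustrative variable names), the following compares `np.vstack()`, `np.hstack()` and `np.column_stack()` with the corresponding `np.stack()` and `np.concatenate()` calls.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# For 1D inputs, vstack stacks the arrays as rows
rows = np.vstack((a, b))
same_as_stack0 = np.array_equal(rows, np.stack((a, b), axis=0))

# For 1D inputs, hstack joins the arrays end to end
joined = np.hstack((a, b))
same_as_concat = np.array_equal(joined, np.concatenate((a, b), axis=0))

# For 1D inputs, column_stack stacks the arrays as columns
cols = np.column_stack((a, b))
same_as_stack1 = np.array_equal(cols, np.stack((a, b), axis=1))

print(rows.shape, joined.shape, cols.shape)
print(same_as_stack0, same_as_concat, same_as_stack1)
```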

## Exercises

### 1. Matrix from `np.arange()`

Create a NumPy array `arr` storing the matrix $\begin{pmatrix}1&4&7\\10&13&16\\19&22&25\end{pmatrix}$ using only `np.arange()` and `arr.reshape()`.

In [30]:
# Your code here
arr = ...

# Solution:
arr = np.arange(1, 26, 3).reshape(-1, 3)

# Automatic test
assert (arr == np.array([[1, 4, 7], [10, 13, 16], [19, 22, 25]])).all()
print("Test passed!")

Test passed!


### 2. 2D to 3D

Reshape the 2D array/matrix `arr` of shape `(2, 9)` into a 3D array of shape `(2, 3, 3)`. Print the array before and after reshaping to better understand what is going on.

In [31]:
arr = np.array([[3, 6, 1, 9, 2, 4, 8, 3, 2], [0, 1, 6, 3, 7, 4, 3, 6, 1]])

# Your code here
...

# Solution
print(arr)
arr = arr.reshape(2, 3, 3)
print(arr)

# Automatic test
assert (arr == np.array([[[3, 6, 1], [9, 2, 4], [8, 3, 2]], [[0, 1, 6], [3, 7, 4], [3, 6, 1]]])).all()
print("Test passed!")

[[3 6 1 9 2 4 8 3 2]
 [0 1 6 3 7 4 3 6 1]]
[[[3 6 1]
  [9 2 4]
  [8 3 2]]

 [[0 1 6]
  [3 7 4]
  [3 6 1]]]
Test passed!


### 3. Reshape, Transpose and Reshape

Create a 2D array `arr` with shape `(3, 4)` containing the numbers 0 through 11. Transpose it and then reshape it to a 3D array with shape `(2, 2, 3)`.

**Note:** You only need to use `np.arange()`, `arr.reshape()` and `arr.T` (or `arr.transpose()`) to solve this task. Again, do not use loops.

In [32]:
# Your code here
arr = ...

# Solution
arr = np.arange(12).reshape(3, 4)
arr = arr.T
arr = arr.reshape(2, 2, 3)

# Automatic test
assert (arr == np.array([[[0, 4, 8], [1, 5, 9]], [[2, 6, 10], [3, 7, 11]]])).all()
print("Test passed!")

Test passed!


### 4. Stacking Grades

You are given three 1D arrays representing the scores of students in three different subjects: Math, Science, and English.

1. Use `np.stack()` to combine the arrays along a new axis so that each row in the resulting array represents the scores of a student across all subjects.
2. Print the resulting array.
3. Verify the shape of the resulting array to ensure it has the correct dimensions `(4, 3)`.

The output should be as follows:

```python
[[85 88 87]
 [90 94 85]
 [78 80 90]
 [92 86 88]]
Shape: (4, 3)
```

In [81]:
math_scores = np.array([85, 90, 78, 92])
science_scores = np.array([88, 94, 80, 86])
english_scores = np.array([87, 85, 90, 88])

# Your code here
stacked_scores = ...

# Solution
stacked_scores = np.stack((math_scores, science_scores, english_scores), axis=1)
# Print the resulting array
print("Stacked Scores:")
print(stacked_scores)
# Verify the shape
print("Shape:", stacked_scores.shape)

Stacked Scores:
[[85 88 87]
 [90 94 85]
 [78 80 90]
 [92 86 88]]
Shape: (4, 3)


### 5. °Concatenate

You are given two 2D arrays representing the daily temperatures (in Celsius) recorded over a week in two different cities. Each array has rows representing days and columns representing the morning and evening temperatures. Use `np.concatenate()` to combine these arrays along the vertical axis (axis 0) to get a complete record of temperatures for both cities.

That is, your task is to

1. Use `np.concatenate()` to combine the arrays along the vertical axis (axis 0).
2. Print the resulting array.
3. Verify the shape of the resulting array to ensure it has the correct dimensions `(14, 2)`.

In [83]:
city1_temperatures = np.array([[15, 22], 
                               [16, 23], 
                               [15, 21], 
                               [14, 20], 
                               [13, 19], 
                               [12, 18], 
                               [14, 21]])

city2_temperatures = np.array([[18, 25], 
                               [17, 24], 
                               [19, 26], 
                               [16, 23], 
                               [15, 22], 
                               [14, 21], 
                               [17, 24]])

# Your code here
concatenated_temperatures = ...

# Solution
concatenated_temperatures = np.concatenate((city1_temperatures, city2_temperatures), axis=0)
# Print the resulting array
print("Concatenated Temperatures:")
print(concatenated_temperatures)
# Verify the shape
print("Shape:", concatenated_temperatures.shape)

Concatenated Temperatures:
[[15 22]
 [16 23]
 [15 21]
 [14 20]
 [13 19]
 [12 18]
 [14 21]
 [18 25]
 [17 24]
 [19 26]
 [16 23]
 [15 22]
 [14 21]
 [17 24]]
Shape: (14, 2)


# 4. Basic Array Operations and Broadcasting

Performing mathematical operations between a NumPy array and a scalar is straightforward.

In [33]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])

print(f"Original:\n {arr}")
print(f"Multiply by 2:\n {2.0 * arr}") # Multiply all values by 2
print(f"Divide by 10:\n {arr / 10.0}") # Divide all values by 10
print(f"Subtract 3:\n {arr - 3.0}")    # Subtract 3 from all values in arr
print(f"Add 4:\n {arr + 4.0}")         # Add 4 to all values in arr
print(f"Square:\n {arr ** 2}")         # Square all values in arr

Original:
 [[1. 2. 3.]
 [4. 5. 6.]]
Multiply by 2:
 [[ 2.  4.  6.]
 [ 8. 10. 12.]]
Divide by 10:
 [[0.1 0.2 0.3]
 [0.4 0.5 0.6]]
Subtract 3:
 [[-2. -1.  0.]
 [ 1.  2.  3.]]
Add 4:
 [[ 5.  6.  7.]
 [ 8.  9. 10.]]
Square:
 [[ 1.  4.  9.]
 [16. 25. 36.]]


If two NumPy arrays are of the **same shape**, we can just as easily perform element-wise operations as follows:

In [34]:
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([[2, 1,-1], [3,.5, 2]])

print(f"arr1 =\n{arr1}")
print(f"arr2 =\n{arr2}")

print(f"arr1 + arr2 =\n{arr1 + arr2}")  # Element-wise addition
print(f"arr1 - arr2 =\n{arr1 - arr2}")  # Element-wise subtraction
print(f"arr1 * arr2 =\n{arr1 * arr2}")  # Element-wise multiplication
print(f"arr1 / arr2 =\n{arr1 / arr2}")  # Element-wise division
print(f"arr1 ** arr2 =\n{arr1 ** arr2}") # Element-wise exponentiation

arr1 =
[[1 2 3]
 [4 5 6]]
arr2 =
[[ 2.   1.  -1. ]
 [ 3.   0.5  2. ]]
arr1 + arr2 =
[[3.  3.  2. ]
 [7.  5.5 8. ]]
arr1 - arr2 =
[[-1.   1.   4. ]
 [ 1.   4.5  4. ]]
arr1 * arr2 =
[[ 2.   2.  -3. ]
 [12.   2.5 12. ]]
arr1 / arr2 =
[[ 0.5         2.         -3.        ]
 [ 1.33333333 10.          3.        ]]
arr1 ** arr2 =
[[ 1.          2.          0.33333333]
 [64.          2.23606798 36.        ]]


Broadcasting allows NumPy to work with arrays of **different shapes** when performing arithmetic operations. The smaller array is "broadcast" across the larger array so that they have compatible shapes. This makes many operations much more efficient.

**Broadcasting Rules**
- Shapes are compared dimension by dimension, starting from the trailing (rightmost) one. Two dimensions are compatible if they are equal or if one of them is 1.
- If the arrays do not have the same number of dimensions, prepend ones to the shape of the array with fewer dimensions until both have the same number of dimensions.
- If any dimension does not match and is not 1, then broadcasting will not work.

Operations involving a NumPy array and a scalar are a special case of broadcasting where the scalar (which we can think of as a NumPy array of shape `(1,)`) is broadcast to the same shape as the NumPy array.

Here is a more interesting example: If you have a 1D array and a 2D array where the 1D array's shape is compatible with the trailing dimensions of the 2D array, broadcasting will occur.

In [35]:
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_1d = np.array([10, 20, 30])

result = arr_2d + arr_1d # This will broadcast arr_1d into [[10, 20, 30], [10, 20, 30]] and then do addition element-wise!
print(result)

[[11 22 33]
 [14 25 36]]


Here is another example of broadcasting where we want to multiply the first row of `arr_2d` by $1$ and the second row by $2$.

In [36]:
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_1d = np.array([1, 2])

# Reshape arr_1d to (2, 1) to make it compatible
arr_1d = arr_1d.reshape(2, 1)

result = arr_2d * arr_1d
print(result)

[[ 1  2  3]
 [ 8 10 12]]


When broadcasting is not possible, NumPy will raise an error. You will probably encounter this type of error message many times, so here is an example to help you get to know each other.

In [37]:
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([1, 2, 3, 4])

try:
    result = arr1 + arr2
except ValueError as e:
    print(f"Error: {e}")

Error: operands could not be broadcast together with shapes (2,3) (4,) 
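
The broadcasting rules can also be checked on shapes alone, without creating any arrays. The sketch below assumes NumPy 1.20 or newer, where `np.broadcast_shapes()` is available; the variable names are illustrative.

```python
import numpy as np

# Trailing dimensions match: (3,) is treated as (1, 3)
shape_a = np.broadcast_shapes((2, 3), (3,))

# A dimension of size 1 is stretched to match the other shape
shape_b = np.broadcast_shapes((2, 3), (2, 1))

# Missing leading dimensions are filled in with ones
shape_c = np.broadcast_shapes((2, 2, 3), (3,))

# Incompatible shapes raise the same kind of error as the addition above
try:
    np.broadcast_shapes((2, 3), (4,))
    compatible = True
except ValueError:
    compatible = False

print(shape_a, shape_b, shape_c, compatible)
```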


**Summary:** Broadcasting makes it easy to perform operations on arrays of different shapes without having to manually resize them. By following the broadcasting rules, NumPy automatically handles the necessary shape transformations to enable efficient computation.

## Exercises

### 1. Adding Scalars to Arrays

Create a 1D array `arr` with values `[1, 2, 3, 4, 5]`. Add a scalar value `10` to `arr` and print the result.

In [38]:
# Your code here
arr = ...

# Solution
arr = np.array([1, 2, 3, 4, 5])
arr = arr + 10
print(arr)

[11 12 13 14 15]


### 2. Multiply 1D Arrays of Same Shape

Create two 1D arrays `arr1` with values `[1, 2, 3]` and `arr2` with values `[10, 20, 30]`. Multiply `arr1` and `arr2` element-wise and print the result.

In [39]:
# Your code here
arr1 = ...
arr2 = ...

# Solution
arr1 = np.array([1, 2, 3])
arr2 = np.array([10, 20, 30])
result = arr1 * arr2
print(result)

[10 40 90]


### 3. Broadcasting with Different Shapes

Create a 2D array `arr1` with values `[[1, 2, 3], [4, 5, 6]]` and a 1D array `arr2` with values `[1, 2, 3]`. Add `arr1` and `arr2` and print the result. Try to understand how `arr2` was broadcasted by NumPy.

In [40]:
# Your code here
arr1 = ...
arr2 = ...

# Solution
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([1, 2, 3]) # Broadcasted to [[1, 2, 3], [1, 2, 3]] of shape (2, 3)
result = arr1 + arr2
print(result)

[[2 4 6]
 [5 7 9]]


### 4. Broadcasting with Different Dimensions

Create a 3D array `arr1` with shape `(2, 2, 3)` containing values from 1 to 12 and a 1D array `arr2` with values `[1, 2, 3]`. Add `arr1` and `arr2` and print the result. Try to understand how `arr2` was broadcasted by NumPy before the addition took place.

In [41]:
# Your code here
arr1 = ...
arr2 = ...

# Solution
arr1 = np.arange(1, 13).reshape(2, 2, 3)
arr2 = np.array([1, 2, 3])
result = arr1 + arr2
print(result)

[[[ 2  4  6]
  [ 5  7  9]]

 [[ 8 10 12]
  [11 13 15]]]


### 5. Reshaping for Broadcasting

Create a 2D array `arr1` with shape `(3, 4)` containing values from 0 to 11 and a 1D array `arr2` with values `[1, 2, 3]`. Reshape `arr2` to be compatible with `arr1` and then add them together. Print the result and try to understand what is going on.

In [42]:
# Your code here
arr1 = ...
arr2 = ...

# Solution
arr1 = np.arange(12).reshape(3, 4)
arr2 = np.array([1, 2, 3])
arr2 = arr2.reshape(3, 1)
result = arr1 + arr2
print(result)

[[ 1  2  3  4]
 [ 6  7  8  9]
 [11 12 13 14]]


### 6. Broadcasting with Higher Dimensions

Create a 3D array `arr1` with shape `(2, 3, 4)` containing values from 0 to 23 and a 1D array `arr2` with values `[1, 2, 3, 4]`. Add `arr1` and `arr2` and print the result. How did NumPy broadcast `arr2`?

In [43]:
# Your code here
arr1 = ...
arr2 = ...

# Solution
arr1 = np.arange(24).reshape(2, 3, 4)
arr2 = np.array([1, 2, 3, 4])
result = arr1 + arr2
print(result)

[[[ 1  3  5  7]
  [ 5  7  9 11]
  [ 9 11 13 15]]

 [[13 15 17 19]
  [17 19 21 23]
  [21 23 25 27]]]


# 5. Matrix and Vector Algebra in NumPy

NumPy can also perform matrix multiplication and dot products instead of the element-wise operations we have seen so far.

In this example, we first create two matrices $A=\begin{pmatrix}1&2&3\\4&5&6\end{pmatrix}$ and $B=\begin{pmatrix}2&4\\6&8\\10&12\end{pmatrix}$ as NumPy arrays and then compute their [matrix product](https://en.wikipedia.org/wiki/Matrix_multiplication) $AB$. You can either use the shorthand syntax `A @ B` or `np.matmul(A, B)` as these are the same.

For 2D arrays, we can also compute the matrix product by `np.dot(A, B)`.

In [44]:
A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[2, 4], [6, 8], [10, 12]])

print(A)
print(B)

# These are all equivalent for 2D-arrays:
print(A @ B)
print(np.matmul(A, B))
print(np.dot(A, B))

[[1 2 3]
 [4 5 6]]
[[ 2  4]
 [ 6  8]
 [10 12]]
[[ 44  56]
 [ 98 128]]
[[ 44  56]
 [ 98 128]]
[[ 44  56]
 [ 98 128]]


We can also do matrix-vector multiplication $Ax$ for a vector $x$. This is just a special case of matrix multiplication, really. 

Notice that if we try to use `*` instead of the `@`, the vector `x` is broadcast, and we end up with a new matrix with the same shape as `A`.

We can also use `np.dot()` to compute the dot-product $x\cdot y$ between two vectors. Alternatively, we can use matrix multiplication for computing the dot product as well since $x\cdot y = x^T y$.

In [50]:
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
x = np.array([1, -1, 2])

print(A @ x)           # Compute Ax matrix-vector product
print(A * x)           # NB! This is not the same as the previous line
print(np.matmul(A, x)) # But this computes Ax ...
print(np.dot(A, x))    # ...and this too.

# Compute dot-product of x and y
y = np.array([-1, 3, 0])
print(np.dot(x, y))
print(x.T @ y) # Equivalent way of computing the dot product

[ 5 11 17]
[[ 1 -2  6]
 [ 4 -5 12]
 [ 7 -8 18]]
[ 5 11 17]
[ 5 11 17]
-4
-4
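
A detail worth keeping in mind is that transposing a 1-dimensional array has no effect, so `x.T @ y` works only because `@` treats two 1D arrays as a dot product. If explicit column and row vectors are needed, they can be created with `reshape()`. The sketch below uses illustrative names (`col`, `row`, `outer`).

```python
import numpy as np

x = np.array([1, -1, 2])
y = np.array([-1, 3, 0])

# Transposing a 1D array leaves its shape unchanged
print(x.T.shape == x.shape)

# Explicit column and row vectors as 2D arrays
col = x.reshape(-1, 1)  # shape (3, 1)
row = y.reshape(1, -1)  # shape (1, 3)

# (3, 1) @ (1, 3) gives the 3x3 outer product of x and y
outer = col @ row
print(np.array_equal(outer, np.outer(x, y)))
```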


For more advanced use of `np.matmul()` on higher-dimensional arrays see the [documentation](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html). Many high-dimensional operations can also be implemented using NumPy's `einsum()` function ([documentation](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html)).
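
As a small illustration of the higher-dimensional case, the sketch below multiplies two stacks of matrices. It assumes randomly generated inputs with an arbitrary seed, and the names `batched` and `via_einsum` are illustrative. For arrays with more than two dimensions, `@` multiplies the matrices stored in the last two axes, and the same computation can be written with `np.einsum()`.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed for reproducibility
A = rng.uniform(size=(2, 3, 4))      # a stack of two 3x4 matrices
B = rng.uniform(size=(2, 4, 5))      # a stack of two 4x5 matrices

# Matrix product applied to each pair of matrices in the stack
batched = A @ B

# The same computation written as an explicit index expression
via_einsum = np.einsum("bij,bjk->bik", A, B)

print(batched.shape)
print(np.allclose(batched, via_einsum))
```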

Here are a few more matrix related functions provided by NumPy.

In [54]:
A = np.array([[1, -2, 3], [4, 5, -6], [-7, 8, 9]])

# Get the diagonal of A
print(A.diagonal()) # or np.diag(A)

# Compute trace (sum of diagonal)
print(A.trace())    # or np.trace(A) or A.diagonal().sum()

# Compute the inverse of the matrix A
print(np.linalg.inv(A))

# Print the determinant of A
print(np.linalg.det(A))

[1 5 9]
15
[[ 0.32978723  0.14893617 -0.0106383 ]
 [ 0.0212766   0.10638298  0.06382979]
 [ 0.23758865  0.0212766   0.04609929]]
282.00000000000006


## Exercises

### 1. Element-wise Operations vs. Matrix Multiplication

Perform element-wise multiplication and matrix multiplication for given matrices A and B. Print the results and see how the two operations differ.

In [55]:
A = np.array([[1, 2], 
              [3, 4]])
B = np.array([[5, 6], 
              [7, 8]])

# Your code here
elementwise_product = ...
matrix_product = ...

# Solution
elementwise_product = A * B
matrix_product = np.dot(A, B)

print(elementwise_product)
print(matrix_product)

[[ 5 12]
 [21 32]]
[[19 22]
 [43 50]]


### 2. Properties of Matrix Multiplication

Recall that matrix multiplication is *associative*. This simply means that $A(BC)=(AB)C$. In other words, multiplying $B$ and $C$ first and then multiplying the result by $A$ from the left is the same as first multiplying $A$ and $B$, and then multiplying the result by $C$ from the right.

Also recall that matrix multiplication is **not** *commutative*. That is, it is not always true that $AB = BA$.

1. Compute $A(BC)$ and $(AB)C$ using NumPy and check that the resulting matrices are equal.
2. Compute $AB$ and $BA$ using NumPy and verify that the two products are different.

In [56]:
A = np.array([[1, 2], 
              [3, 4]])
B = np.array([[5, 6], 
              [7, 8]])
C = np.array([[9, 10], 
              [11, 12]])

# Your code here
ABC1 = ...
ABC2 = ...

AB = ...
BA = ...

# Solution
ABC1 = A @ (B @ C)
ABC2 = (A @ B) @ C
print((ABC1 == ABC2).all())

AB = A @ B
BA = B @ A
print((AB == BA).all())

True
False


### 3. Solve Linear System of Equations

Solve the equation $Ax=b$ where $A=\begin{pmatrix}1&4&-1\\2&-5&0\\5&-1&-2\end{pmatrix}$ and $b=\begin{pmatrix}8\\2\\4\end{pmatrix}$ using NumPy.

**Hint:** Use `np.linalg.inv()` to find the inverse $A^{-1}$ of $A$. Then use matrix-vector multiplication to compute $x=A^{-1}Ax=A^{-1}b$ to solve for $x$.

You can compare your answer to the one you get using `np.linalg.solve(A, b)`.

In [66]:
A = np.array([[1,  4, -1],
              [2, -5,  0],
              [5, -1, -2]])

b = np.array([8, 2, 4])

# Your code here
A_inv = ...
x = ...

# Solution
A_inv = np.linalg.inv(A)
x = A_inv @ b
print(x)
x_2 = np.linalg.solve(A, b)
print(np.allclose(x_2, x)) # Check that the solutions are approximately equal

[26. 10. 58.]
True
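
Another way to check a solution of $Ax=b$, sketched below, is to plug `x` back into the system and look at the residual $Ax-b$, which should be (very close to) zero. Here we use `np.linalg.solve()` directly, which avoids forming the inverse explicitly.

```python
import numpy as np

A = np.array([[1,  4, -1],
              [2, -5,  0],
              [5, -1, -2]])
b = np.array([8, 2, 4])

# Solve the system without computing the inverse explicitly
x = np.linalg.solve(A, b)

# The residual A x - b should vanish up to rounding errors
residual = A @ x - b
print(np.allclose(residual, 0))  # True
```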


# 6. Reduction Operations in NumPy

For this part, let us create a dummy dataset consisting of $200$ rows and $5$ columns by sampling from the uniform distribution $\mathcal{U}(-10, 10)$.

In [112]:
dataset = np.random.uniform(-10, 10, (200, 5))
print(dataset[:10]) # Print the first 10 rows of our dataset

[[ 3.44157465 -0.10578764 -1.82770155 -2.88859143  8.07098534]
 [ 8.8439462  -3.79143762 -8.81061441 -4.24613652 -8.16639211]
 [-8.00199045 -7.95015806  8.4792619   9.35918909  6.21232087]
 [ 1.46677918  8.24212796 -1.83511693 -5.21368799 -9.57276995]
 [-4.64061016 -0.05065548  6.76410011 -0.8349231  -3.16507557]
 [ 1.89995065  2.61241636 -3.31683267 -0.97481748 -9.5595314 ]
 [-3.85347578 -9.68592591 -3.6391086  -7.98968209 -9.63762665]
 [ 1.21963159  5.87235349 -9.15116778  5.14038166 -1.72111261]
 [ 0.45346451 -3.58572502 -3.11931895  4.64064119  7.1129274 ]
 [-4.55021207  7.19440627 -7.30196755 -5.28522218  6.39951544]]


To compute the mean of all values in the array, we can use `np.mean()` or, equivalently, the array method `dataset.mean()`.

In [116]:
mean = np.mean(dataset) # or dataset.mean()
print(mean)

0.15667504348180866


Sometimes, we want to compute the mean for each column or row. We can do this by specifying which dimension we wish to compute the mean along.

In [123]:
# Compute the mean along axis 1. This averages over the columns within each row,
# resulting in an array of shape (200,)
row_means = np.mean(dataset, axis=1) # or dataset.mean(axis=1)
print(row_means.shape) # One mean per row, so the shape is (200,)
print(row_means[:10])  # Print the mean for the first 10 rows

# Similarly, axis 0 averages over the rows within each column, giving shape (5,)
col_means = np.mean(dataset, axis=0) # or dataset.mean(axis=0)
print(col_means.shape)
print(col_means)

(200,)
[-1.79815425 -3.37033707 -2.2520202   1.04569135  0.1255295  -1.77977453
  1.52102308 -1.46294807 -0.36619767  3.12376951]
(5,)
[ 0.24133725  0.6617432   0.09774087  0.0716813  -0.2891274 ]
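
A reduction removes the axis it is computed along. If we want to combine the result with the original array via broadcasting, the `keepdims=True` argument can help, since it keeps the reduced axis with length 1. The small array `data` below is made up for this sketch.

```python
import numpy as np

# Hypothetical example data: 4 rows, 3 columns
data = np.arange(12, dtype=float).reshape(4, 3)

row_means = data.mean(axis=1)                      # shape (4,)
row_means_kept = data.mean(axis=1, keepdims=True)  # shape (4, 1)
print(row_means.shape, row_means_kept.shape)       # (4,) (4, 1)

# (4, 3) - (4, 1) broadcasts, so each row has its own mean subtracted
centered = data - row_means_kept
print(centered.mean(axis=1))                       # [0. 0. 0. 0.]
```

Without `keepdims=True`, `data - row_means` would try to broadcast shapes `(4, 3)` and `(4,)`, which raises an error here because the trailing dimensions do not match.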


There are similar functions for computing the sum, minimum, maximum and standard deviation of an array. These functions also accept the `axis=` argument to specify which axis to compute along.

In [131]:
print(np.sum(dataset, axis=0))      # The sum of each column (compute along row-dimension, axis 0)
print(np.max(dataset))              # The largest value in dataset
print(np.min(dataset, axis=1)[:10]) # The smallest value in each of the first 10 rows
print(np.std(dataset, axis=0))      # Standard deviation for each column

[ 48.26744952 132.3486391   19.54817436  14.33625955 -57.82547904]
9.996411693778128
[-9.59862926 -5.32404305 -8.17597457 -9.55260763 -8.98155301 -5.63642914
 -6.98296083 -9.38544457 -3.22375828 -5.6153777 ]
[6.05973596 5.83368762 5.87821233 6.00109297 5.53084512]
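
Note that `np.std()` by default divides by $N$ (the population standard deviation). If the sample standard deviation with divisor $N-1$ is wanted, the `ddof=1` argument can be passed. The array `values` below is a made-up example.

```python
import numpy as np

# Hypothetical example data with mean 5
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

population_std = np.std(values)       # divides by N
sample_std = np.std(values, ddof=1)   # divides by N - 1

print(population_std)  # 2.0
print(sample_std)      # slightly larger than the population value
```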


If we want to find the index of the smallest or largest value, we can use `np.argmin()` and `np.argmax()`, respectively. See the following example where we find the index (and value) of the smallest and largest value in the first column of `dataset`.

In [155]:
first_column = dataset[:, 0]      # Select only the first column

min_idx = np.argmin(first_column) # Find the index of the smallest value
min_val = first_column[min_idx]   # Get the value at index min_idx

print(f"The smallest value in the first column is {min_val:.4f} at index {min_idx}")

# Do the same with argmax
max_idx = np.argmax(first_column) # Find the index of the largest value
max_val = first_column[max_idx]   # Get the value at index max_idx

print(f"The largest value in the first column is {max_val:.4f} at index {max_idx}")

The smallest value in the first column is -9.6672 at index 177
The largest value in the first column is 9.6946 at index 166
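
When `np.argmax()` or `np.argmin()` is applied to a multidimensional array without `axis=`, the returned index refers to the flattened array. As a small sketch, `np.unravel_index()` can convert it back to a (row, column) position. The array `grid` is made up for this example.

```python
import numpy as np

# Hypothetical example data
grid = np.array([[3, 8, 1],
                 [9, 2, 7]])

flat_idx = np.argmax(grid)                         # index into the flattened array
row, col = np.unravel_index(flat_idx, grid.shape)  # convert to 2D position
print(flat_idx, row, col)                          # 3 1 0
print(grid[row, col])                              # 9

# With axis=0 we get the row index of the largest value in each column
col_argmax = np.argmax(grid, axis=0)
print(col_argmax)                                  # [1 0 1]
```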


## Exercises

### 1. Computing the Mean, Sum and Maximum

You are given an array `A` in the code cell below. 

Compute
1. The mean of all elements in `A` using `np.mean()`,
2. The sum of each row, using `np.sum()`, and
3. The largest value in each column using `np.max()`.

Print the result to verify that it works as you intended. 

You need to use the `axis=` argument for 2. and 3.

In [202]:
A = np.array([[1, 2, 3], 
              [4, 5, 6], 
              [7, 8, 9]])

# Your code here
total_mean = ...
row_sums = ...
column_maxes = ...

# Solution
total_mean = np.mean(A)
row_sums = np.sum(A, axis=1)
column_maxes = np.max(A, axis=0)

print(total_mean)
print(row_sums)
print(column_maxes)

5.0
[ 6 15 24]
[7 8 9]


### 2. Standardize Each Column of a Dataset

You are given a dataset of shape `(200, 5)` stored in the variable `X`. Think of this as 200 observations of 5 different measurements/variables.

We now want to [standardize](https://en.wikipedia.org/wiki/Standard_score#Calculation) each column $X_i$ by applying the transform $\hat{X}_i = \frac{X_i - \hat{\mu}_i}{\hat{\sigma}_i}$ where $\hat{\mu}_i$ and $\hat{\sigma}_i$ are the mean and standard deviation of column $i$, respectively.

Compute the means and standard deviations for each column and standardize `X`. Store the standardized version of `X` in a new variable `X_hat`.

Verify that the mean and standard deviation of each column of `X_hat` are (very close to) 0 and 1, respectively.

**Hints:**

1. Compute the means and standard deviations for each column by using `np.mean()` and `np.std()` with the argument `axis=0` ("compute along rows").
2. Taking advantage of broadcasting, all columns can be standardized simultaneously by simply writing `X_hat = (X - mean) / std`.
3. Use `np.mean(X_hat, axis=0)` and `np.std(X_hat, axis=0)` to verify that each column now has mean $\approx 0$ and standard deviation $\approx 1$.

In [157]:
X = np.random.uniform(-10, 10, (200, 5))

# Your code here
mean = ...
std = ...
X_hat = ...

# Solution
mean = np.mean(X, axis=0)
std = np.std(X, axis=0)

X_hat = (X - mean) / std
print(X_hat.mean(axis=0)) # Each column mean should be (very close to) 0
print(X_hat.std(axis=0))  # Each column standard deviation should be (very close to) 1


### 3. Different Ways of Doing Things

Below are three cases where two different expressions give the same result.

Try to understand why they give (approximately) the same answers.

In [186]:
X = np.random.uniform(-10, 10, (200, 5))

# Case 1
print(X.sum(axis=0) / X.shape[0])
print(X.mean(axis=0))

# Case 2
print(X.std())
print(np.mean((X - X.mean())**2) ** 0.5)

# Case 3
print(X.sum())
print((X.mean(0) * len(X)).sum())

[ 0.24052826 -0.14532175  0.19644168 -0.52014782  0.4249213 ]
[ 0.24052826 -0.14532175  0.19644168 -0.52014782  0.4249213 ]
5.807513183334541
5.807513183334541
39.2843364569759
39.28433645697582


### 4. Count With Boolean Arrays

You are given an array `X` with 100000 random integers from 0 to 99 in the code cell below (the upper bound 100 passed to `np.random.randint()` is exclusive).

1. Create a boolean array `ge_50_mask` by using `(X > 50)`. This will create an array of the same shape as `X` containing `True` or `False` depending on whether the corresponding value in `X` is greater than 50 or not.
2. Print the sum and mean of `ge_50_mask`.

What do the sum and mean of `ge_50_mask` tell us? And why are they approximately 50000 and 0.5, respectively?

**Note:** When you do `np.sum()` or `arr.sum()` on a boolean array, NumPy will treat `False` as 0 and `True` as 1 and sum these. The same is true for mean and standard deviation.

In [274]:
X = np.random.randint(0, 100, 100000)

# Your code here
ge_50_mask = ...

# Solution
ge_50_mask = (X > 50)
print(ge_50_mask.sum())
print(ge_50_mask.mean())

# It means that approx. half of the random integers are greater than 50.
# Strictly, 49 of the 100 possible values (51, ..., 99) exceed 50, so the expected mean is 0.49.

48997
0.48997
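
The counting idea can be checked by hand on a tiny array. The sketch below uses made-up `values` and also shows `np.count_nonzero()`, which counts the `True` entries directly.

```python
import numpy as np

# Hypothetical example data
values = np.array([10, 50, 51, 75, 3, 99])

mask = values > 50              # [False False  True  True False  True]
print(mask.sum())               # 3 values are greater than 50
print(mask.mean())              # 0.5, i.e. 3 out of 6
print(np.count_nonzero(mask))   # 3, same as mask.sum()
```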


# 7. Other Useful NumPy Functions

In this part, we will look at some other useful NumPy functions.

The function `np.where()` returns elements chosen from two arrays based on a condition. This is maybe best demonstrated with an example.

In [87]:
arr = np.array([10, 20, 30, 40, 50])

condition = arr > 30
x = np.array([-1, -2, -3, -4, -5])
y = np.array([1, 2, 3, 4, 5])
result = np.where(condition, x, y)

print(f"Original Array: {arr}")
print(f"Condition (array > 30): {condition}")
print(f"x array (selected when True): {x}")
print(f"y array (selected when False): {y}")
print(f"Result of np.where: {result}")

Original Array: [10 20 30 40 50]
Condition (array > 30): [False False False  True  True]
x array (selected when True): [-1 -2 -3 -4 -5]
y array (selected when False): [1 2 3 4 5]
Result of np.where: [ 1  2  3 -4 -5]
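
The arguments `x` and `y` do not have to be full arrays. Thanks to broadcasting, scalars work too, and when only the condition is passed, `np.where()` returns the indices where it is `True`. A short sketch of both uses:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Replace every value greater than 30 by 30 and keep the rest unchanged
capped = np.where(arr > 30, 30, arr)
print(capped)    # [10 20 30 30 30]

# With only a condition, np.where returns a tuple of index arrays
indices = np.where(arr > 30)[0]
print(indices)   # [3 4]
```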


The function `np.unique()` finds unique elements of an array. If we pass `return_counts=True`, the function will also return the count of each unique element (how many times it appeared in the original array). We can also take unique along a given axis.

In [102]:
arr = np.array([1, 2, 2, 3, 1, 5, 6, 5])
unique_elements = np.unique(arr)

print(f"Original Array: {arr}")
print(f"Unique Elements: {unique_elements}")

# With counting
unique_elements, counts = np.unique(arr, return_counts=True)
print(f"Unique Elements: {unique_elements}")
print(f"Counts: {counts}")

Original Array: [1 2 2 3 1 5 6 5]
Unique Elements: [1 2 3 5 6]
Unique Elements: [1 2 3 5 6]
Counts: [2 2 1 2 1]
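
Besides `return_counts`, `np.unique()` also accepts `return_inverse=True`. As a small sketch, the returned `inverse` array gives, for each original element, its position in the array of unique values, so the original array can be reconstructed from the two.

```python
import numpy as np

arr = np.array([1, 2, 2, 3, 1, 5, 6, 5])

unique_elements, inverse = np.unique(arr, return_inverse=True)
print(unique_elements)   # [1 2 3 5 6]
print(inverse)           # [0 1 1 2 0 3 4 3]

# Indexing the unique values with the inverse recovers the original array
reconstructed = unique_elements[inverse]
print(np.array_equal(reconstructed, arr))   # True
```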


In [103]:
# Unique values along an axis (unique rows in this case)
arr = np.array([[1, 0],
                [3, 1],
                [1, 1],
                [1, 0],
                [6, 2],
                [3, 1],
                [1, 3],
                [1, 0]])

unique_elements = np.unique(arr, axis=0)
print(f"Original Array:\n{arr}")
print(f"Unique Elements:\n{unique_elements}")

Original Array:
[[1 0]
 [3 1]
 [1 1]
 [1 0]
 [6 2]
 [3 1]
 [1 3]
 [1 0]]
Unique Elements:
[[1 0]
 [1 1]
 [1 3]
 [3 1]
 [6 2]]


There are two main sorting functions provided by NumPy:
- `np.sort()` - sorts elements of an array.
- `np.argsort()` - returns indices of sorted elements.

In [106]:
array = np.array([3, 1, 2, 5, 4])

sorted_array = np.sort(array)
sorted_indices = np.argsort(array)

print(f"Original Array: {array}")
print(f"Sorted Array: {sorted_array}")
print(f"Indices of Sorted Array: {sorted_indices}")

Original Array: [3 1 2 5 4]
Sorted Array: [1 2 3 4 5]
Indices of Sorted Array: [1 2 0 4 3]
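
A common use of `np.argsort()` is to reorder a second array in the same way, or to sort in descending order by reversing the indices. The arrays `scores` and `labels` below are made up for this sketch.

```python
import numpy as np

# Hypothetical example data: labels[i] belongs to scores[i]
scores = np.array([3, 1, 2, 5, 4])
labels = np.array(["c", "a", "b", "e", "d"])

order = np.argsort(scores)    # indices that sort scores in ascending order
print(labels[order])          # ['a' 'b' 'c' 'd' 'e']

# Reverse the index array to get descending order
descending = scores[order[::-1]]
print(descending)             # [5 4 3 2 1]
```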


These functions also support the `axis=` argument.

In [107]:
array_2d = np.array([[3, 1, 4],
                     [1, 5, 9],
                     [2, 6, 5]])

# Sort along axis 0 (each column is sorted independently)
sorted_axis0 = np.sort(array_2d, axis=0)
sorted_indices_axis0 = np.argsort(array_2d, axis=0)

# Sort along axis 1 (each row is sorted independently)
sorted_axis1 = np.sort(array_2d, axis=1)
sorted_indices_axis1 = np.argsort(array_2d, axis=1)

print("Original 2D Array:")
print(array_2d)
print()

print("Sorted along Axis 0 (columns):")
print("Sorted Array:")
print(sorted_axis0)
print("Indices of Sorted Array:")
print(sorted_indices_axis0)
print()

print("Sorted along Axis 1 (rows):")
print("Sorted Array:")
print(sorted_axis1)
print("Indices of Sorted Array:")
print(sorted_indices_axis1)

Original 2D Array:
[[3 1 4]
 [1 5 9]
 [2 6 5]]

Sorted along Axis 0 (columns):
Sorted Array:
[[1 1 4]
 [2 5 5]
 [3 6 9]]
Indices of Sorted Array:
[[1 0 0]
 [2 1 2]
 [0 2 1]]

Sorted along Axis 1 (rows):
Sorted Array:
[[1 3 4]
 [1 5 9]
 [2 5 6]]
Indices of Sorted Array:
[[1 0 2]
 [0 1 2]
 [0 2 1]]
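
For a 2D array, the index array returned by `np.argsort(..., axis=...)` cannot be used with plain indexing to get the sorted array. One option, sketched below, is `np.take_along_axis()`, which applies the indices along the given axis.

```python
import numpy as np

array_2d = np.array([[3, 1, 4],
                     [1, 5, 9],
                     [2, 6, 5]])

# Indices that sort each row
idx = np.argsort(array_2d, axis=1)

# Apply the indices row by row
sorted_rows = np.take_along_axis(array_2d, idx, axis=1)
print(sorted_rows)
print(np.array_equal(sorted_rows, np.sort(array_2d, axis=1)))   # True
```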
