# Numpy Exercises

- All the exercises in this section must be implemented using the `numpy` library.
- The vast majority of exercises can be solved **without** any `for` or `while` loop, just using Numpy's functions or vectorized operations.
- The vast majority of exercises can be solved in 1 to 6 lines of code.
- If you don't know how to solve something, feel free to search the internet for your problem. For example, `"How to sort an array in numpy?"`.
- It is better if you don't use ChatGPT to find the solution, so that you get used to programming with `numpy`.

In [None]:
import numpy as np

## Zeros

Initialize a null vector (with zeroes) of length 20.

In [1]:
np.zeros(20)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0.])

## Matrix Init

Initialize a null matrix (with zeros) with size 3x2.

In [2]:
np.zeros((3,2))

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

## Replace

Replace the minimum element of a random vector with a 0.

In [10]:
v = np.random.random(10)  # random vector to work on
v[v.argmin()] = 0
v

array([0.57285348, 0.90757159, 0.16290208, 0.56451768, 0.72194142,
       0.69789873, 0.44909756, 0.        , 0.24303745, 0.97388981])

## Consecutive

Create a vector with all consecutive values between 12 and 38.

In [6]:
np.arange(12, 39)

array([12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
       29, 30, 31, 32, 33, 34, 35, 36, 37, 38])

## Vector Init

Initialize a vector of length 10 with integer random numbers between 1 and 20.

In [11]:
v = np.random.randint(1, 21, size=10)  # upper bound is exclusive, so this draws from 1..20
v

array([ 8,  3,  7, 10, 17, 19, 11,  3,  2,  2])

a) Filter the elements that are greater than 3 and less than 12.

In [12]:
v[(v > 3) & (v < 12)]

array([ 8,  7, 10, 11])

b) Filter the elements that are less than or equal to 5 or greater than 15.

In [13]:
v[(v <= 5) | (v > 15)]

array([ 3, 17, 19,  3,  2,  2])

## Random Matrix

Create a 4x4 matrix with integer random numbers between 0 and 5.

In [15]:
np.random.randint(6, size=(4,4))

array([[4, 5, 0, 5],
       [5, 3, 2, 5],
       [3, 0, 3, 5],
       [5, 3, 5, 2]])

## Not Zero

Find the indices of the elements that are not 0 in the vector: [2, 1, 0, 0, 3, 1, 0].

In [24]:
v = np.array([2, 1, 0, 0, 3, 1, 0])
np.where(v != 0)[0]

array([0, 1, 4, 5])

## Statistics

Create a random vector of length 10 and find the minimum, maximum, mean and standard deviation.

In [24]:
v = np.random.random(size=10)
v.max(), v.min(), v.mean(), v.std()

(0.9528553516873413,
 0.07689562026138008,
 0.39214818535806034,
 0.265099059517726)

## Surround

Create a 10x10 matrix with zeroes inside and ones surrounding them, like the following:

```
[[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]]
```

In [26]:
V = np.ones((10,10))
V[1:-1, 1:-1] = 0
V

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

## Zero Replace

Define a function that, given a vector, returns it with all the elements greater than 10 replaced by 0.

Example: [4, 7, 15, 21, 3, 34] → [4, 7, 0, 0, 3, 0]

In [31]:
def sub10(v):
    v = np.array(v)
    v[v > 10] = 0
    return v

In [32]:
sub10([4, 7, 15, 21, 3, 34])

array([4, 7, 0, 0, 3, 0])

## Negate

Define a function that negates (changes the sign) all the elements of a vector with values between 2 and 6 (including also negative values between -6 and -2).

Example: [4, 7, -6, 21, 3, 34] → [-4,  7, 6, 21, -3, 34]

In [35]:
def reverse_sign(v):
    v = np.array(v)
    # Compare absolute values so that -6..-2 is matched as well as 2..6
    filt = (np.abs(v) >= 2) & (np.abs(v) <= 6)
    v[filt] = v[filt] * -1
    return v

In [38]:
reverse_sign([4, 7, 6, 21, 3, 34])

array([-4,  7, -6, 21, -3, 34])

## Matrix Negatives

Given a matrix containing both positive and negative numbers, modify it so:
- Each **positive number** remains unchanged.
- Each **negative number** is replaced by the sum of all positive numbers in the same column.

For example, this matrix:

```
np.array([
    [ 5, -2,  3],
    [-1,  4, -6],
    [ 2, -3,  1]
])
```

becomes:

```
np.array([
    [5, 4, 3],
    [7, 4, 4],
    [2, 4, 1]
])
```

Note: The `np.where()` function may be very useful.

In [16]:
m = np.array([
    [ 5, -2,  3],
    [-1,  4, -6],
    [ 2, -3,  1]
])

In [17]:
row_idxs, col_idxs = np.where(m < 0)
m[m < 0] = 0
cols_sum = m.sum(axis=0)
m[row_idxs, col_idxs] = cols_sum[col_idxs]

m

array([[5, 4, 3],
       [7, 4, 4],
       [2, 4, 1]])

A more intuitive way (starting again from the original matrix `m`):

In [None]:
idxs = np.where(m < 0)
m[m < 0] = 0
cols_sum = m.sum(axis=0)

for i, j in zip(*idxs):
    m[i, j] = cols_sum[j]

m

array([[5, 4, 3],
       [7, 4, 4],
       [2, 4, 1]])

There is an even shorter and trickier version (again starting from the original matrix `m`):

In [30]:
# Compute column-wise sums of positive values
pos_sums = np.where(m > 0, m, 0).sum(axis=0)

# Build result matrix: replace negatives by the column sum, keep everything else
m = np.where(m < 0, pos_sums, m)

m

array([[5, 4, 3],
       [7, 4, 4],
       [2, 4, 1]])

## Dates

Get the dates for today, yesterday and tomorrow. For today's date, you can use `np.datetime64('today')` then subtract or add 1 day to that.

In [None]:
yesterday = np.datetime64('today') - np.timedelta64(1, 'D')
today = np.datetime64('today')
tomorrow = np.datetime64('today') + np.timedelta64(1, 'D')

yesterday, today, tomorrow

(numpy.datetime64('2022-10-26'),
 numpy.datetime64('2022-10-27'),
 numpy.datetime64('2022-10-28'))

## September

Get all dates in September 2023.

In [13]:
np.arange('2023-09', '2023-10', dtype='datetime64[D]')

array(['2023-09-01', '2023-09-02', '2023-09-03', '2023-09-04',
       '2023-09-05', '2023-09-06', '2023-09-07', '2023-09-08',
       '2023-09-09', '2023-09-10', '2023-09-11', '2023-09-12',
       '2023-09-13', '2023-09-14', '2023-09-15', '2023-09-16',
       '2023-09-17', '2023-09-18', '2023-09-19', '2023-09-20',
       '2023-09-21', '2023-09-22', '2023-09-23', '2023-09-24',
       '2023-09-25', '2023-09-26', '2023-09-27', '2023-09-28',
       '2023-09-29', '2023-09-30'], dtype='datetime64[D]')

## Week

Create an array of dates starting from today and covering the next 7 days.

In [26]:
np.array([np.datetime64('today') + np.timedelta64(i, 'D') for i in range(8)])

array(['2023-08-15', '2023-08-16', '2023-08-17', '2023-08-18',
       '2023-08-19', '2023-08-20', '2023-08-21', '2023-08-22'],
      dtype='datetime64[D]')
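
The list comprehension above works, but the same array can also be built without a Python-level loop. A minimal sketch, assuming day resolution is what is wanted; the names `today`, `offsets` and `week` are illustrative:

```python
import numpy as np

today = np.datetime64('today')

# Offsets of 0..7 days, expressed as timedelta64 values
offsets = np.arange(8) * np.timedelta64(1, 'D')

# Broadcasting adds every offset to today's date at once
week = today + offsets
week
```

Multiplying an integer range by a one-day `timedelta64` keeps the unit explicit, which avoids relying on implicit conversions.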

## Diff

Given an array of dates, find the difference (in days) between each date and the first date.

In [None]:
dates = np.array(['2023-08-14', '2023-08-17', '2023-08-20', '2025-10-10'], dtype='datetime64')

In [27]:
dates[1:] - dates[0]

array([  3,   6, 788], dtype='timedelta64[D]')

## Closest

Define a function `closest(v, n)` that, given a vector and a number $N$, returns the element closest to $N$. In case of a tie return any of those elements.

In [19]:
def closest(v, n):
    v = np.array(v)
    idx = np.abs(v - n).argmin()
    return v[idx]

In [50]:
closest([1, 2, 3, 4], 5)

4

In [51]:
closest([1, 2, 3, 4], 2)

2

In [23]:
closest([1, 2, 2, 3, 4], 2.6)

3

In [25]:
assert closest([1, 2, 3, 4], 2) == 2
assert closest([1, 2, 3, 4], 5) == 4
assert closest([1, 2, 3, 4], -3) == 1
assert closest([1, 2, 2, 3, 4], 2) == 2
assert closest([1, 2, 2, 3, 4], 2.4) == 2
assert closest([1, 2, 2, 3, 4], 2.6) == 3

## N-Greater

Define a function that, given a positive number $N$ and a vector, returns the $N$ greatest elements of the vector. You can assume $N$ is smaller than the length of the vector.

In [57]:
def greater(v, n):
    v = np.array(v)
    v.sort()
    return v[::-1][:n]

In [58]:
greater([3, 7, 9, 3, 5], 2)

array([9, 7])

## Unique

Create a random vector of length 20 with values between 1 and 8. Which unique elements (without repetition) were created?

In [59]:
v = np.random.randint(1, 9, 20)  # upper bound is exclusive, so this draws from 1..8
np.unique(v)

array([1, 2, 3, 4, 5, 6])

## NaN

What will be the result of the following statements?

First try to understand the concepts of `np.nan`, `np.inf` and `set`. Next, try to mentally solve the statements by using your logic. Finally, test if your expected results are correct. 

```python
0 * np.nan
np.nan == np.nan
np.inf > np.nan
np.nan - np.nan
np.nan in set([np.nan])
```
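
To check your answers afterwards, the sketch below evaluates each statement and records the result in a comment (the names `r1` to `r5` are illustrative). The last statement is the surprising one: membership tests on a set check identity before equality, and `np.nan` is a single shared object, so the lookup succeeds even though `np.nan == np.nan` is false.

```python
import numpy as np

r1 = 0 * np.nan               # nan: arithmetic with NaN propagates NaN
r2 = np.nan == np.nan         # False: NaN is not equal to anything, itself included
r3 = np.inf > np.nan          # False: every ordering comparison with NaN is False
r4 = np.nan - np.nan          # nan
r5 = np.nan in set([np.nan])  # True: the very same object is found by identity

print(r1, r2, r3, r4, r5)
```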

## Highest Mean

Given a 10x10 matrix of random values between 0 and 1, find and print the row with the highest mean value.

In [29]:
matrix = np.random.rand(10, 10)
mean_values = matrix.mean(axis=1)
matrix[np.argmax(mean_values)]

array([0.88357048, 0.86021135, 0.86918851, 0.3697665 , 0.52902631,
       0.90608512, 0.29074085, 0.66909895, 0.51804262, 0.71613131])

# Python vs Numpy

For each one of the following exercises, implement the functions using only Python, without any external libraries. Then compare the execution time with the `%%timeit` magic command (see the first example).

## Maximum

Write a function in Python that finds the maximum value of a list of numbers. Code it yourself; do not use Python's built-in `max()` function.

Create a list of random numbers with 10,000 elements or more.

Then test the average time it takes to run each of the `max` functions with the list (using the `%%timeit` magic command):
- Your custom `max` function.
- Python's built-in `max()` function.
- Numpy's `max()` function.

In [1]:
def my_max(l):
    # Find the maximum of a list of numbers
    mx = l[0]
    for e in l[1:]:
        if e > mx:
            mx = e
    return mx

In [2]:
tmp = np.random.randint(10, size=10000)

In [5]:
%%timeit

# My max
my_max(tmp)

594 µs ± 6.11 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [7]:
%%timeit

# Python's max
max(tmp)

455 µs ± 4.74 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [6]:
%%timeit

# Numpy's max
np.max(tmp)

8.1 µs ± 31.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
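
Outside a notebook, where `%%timeit` is not available, the standard-library `timeit` module gives a comparable measurement. The sketch below is one way to set it up; `arr`, `lst` and the `t_*` names are illustrative. It times the pure-Python functions on a plain list (obtained with `tolist()`), since the exercise asks for a list, and times `np.max` on the array. Absolute timings depend on the machine.

```python
import timeit

import numpy as np


def my_max(values):
    """Pure-Python maximum, same idea as the solution above."""
    mx = values[0]
    for e in values[1:]:
        if e > mx:
            mx = e
    return mx


arr = np.random.randint(10, size=10_000)
lst = arr.tolist()  # plain Python list of ints

runs = 100
# Average seconds per call over `runs` executions of each variant
t_custom = timeit.timeit(lambda: my_max(lst), number=runs) / runs
t_builtin = timeit.timeit(lambda: max(lst), number=runs) / runs
t_numpy = timeit.timeit(lambda: np.max(arr), number=runs) / runs

print(f"custom max:   {t_custom * 1e6:.1f} µs")
print(f"built-in max: {t_builtin * 1e6:.1f} µs")
print(f"np.max:       {t_numpy * 1e6:.1f} µs")
```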


## Mean

Write a function in Python that finds the average value of a list of numbers. Code it yourself, do not use Numpy's `mean()` function.

Create a list of random numbers with 10,000 elements or more.

Then test the average time it takes to run each of the `mean` functions with the list (using the `%%timeit` magic command):
- Your custom `mean` function.
- Numpy's `mean()` function.

In [1]:
def mean1(v):
    sum_ = 0
    count = 0
    for e in v:
        sum_ += e
        count += 1
    return sum_ / count

In [2]:
def mean2(v):
    return sum(v) / len(v)

In [None]:
tmp = np.random.randint(10, size=10000)

In [6]:
%%timeit

mean1(tmp)

1.2 ms ± 137 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [7]:
%%timeit

mean2(tmp)

753 µs ± 21.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [8]:
%%timeit

np.mean(tmp)

15.3 µs ± 340 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


## One-hot Encoding

Implement a function that returns a matrix with the one-hot encoding of a vector.

One-hot encoding: https://www.datacamp.com/tutorial/one-hot-encoding-python-tutorial

- Example: 
    - Input: [0, 2, 1, 2, 0, 1]
    - Output:
```
[[1., 0., 0.],
[0., 0., 1.],
[0., 1., 0.],
[0., 0., 1.],
[1., 0., 0.],
[0., 1., 0.]]
```

In [62]:
def one_hot_encode(vector):
    # Identify the number of unique classes
    n_classes = np.unique(vector).shape[0]
    # Initialize an array of zeros with shape (length of vector, number of classes)
    one_hot = np.zeros((vector.shape[0], n_classes))
    # For each item in the vector, set the corresponding class index to 1
    for i, value in enumerate(vector):
        one_hot[i, value] = 1
    return one_hot

In [63]:
vector = np.array([0, 2, 1, 2, 0, 1])
one_hot_encode(vector)

array([[1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [0., 0., 1.],
       [1., 0., 0.],
       [0., 1., 0.]])
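
The loop in `one_hot_encode` can be replaced by a single fancy-indexing assignment. A minimal sketch, assuming the classes are the integers from 0 to the maximum of the vector; `one_hot_encode_vectorized` is an illustrative name:

```python
import numpy as np


def one_hot_encode_vectorized(vector):
    vector = np.asarray(vector)
    # Assumption: classes are the integers 0..max(vector)
    n_classes = vector.max() + 1
    one_hot = np.zeros((vector.size, n_classes))
    # Fancy indexing: for every row i, set column vector[i] to 1 in one step
    one_hot[np.arange(vector.size), vector] = 1
    return one_hot


one_hot_encode_vectorized([0, 2, 1, 2, 0, 1])
```

Sizing the matrix by the maximum value rather than by the number of unique values also covers a vector in which some class never appears, such as `[0, 2]`.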

## Min-Max Scaling

Implement a function `minmax_scale(values)` that rescales a list (or NumPy array) of numeric values so that all values fall within the range **[0, 1]**. Specifically, the smallest number will have the value of $0$, while the largest one will have the value of $1$. The rest of the numbers will fall in between.

This kind of normalization is common in machine learning, ensuring that all features contribute equally to a model’s training process, regardless of their original scale.

Given an array of values $X$, the formula for Min-Max scaling for each one of its values $x_i$ is:

$x_i^{\text{scaled}} = \frac{x_i - \min(X)}{\max(X) - \min(X)}$



In [9]:
def minmax_scale(values):
    # Accept lists as well as arrays
    x = np.asarray(values, dtype=float)
    x_min = x.min()
    return (x - x_min) / (x.max() - x_min)

minmax_scale([10, 15, 20, 30])

array([0.  , 0.25, 0.5 , 1.  ])

## Fruit Ninja

You are a Fruit Ninja master, slicing fruits with precision. You have sliced 100 fruits, each giving you a random score between 1 and 100. However, due to a ninja technique, every third score is doubled. Calculate your total score.

In [34]:
scores = np.random.randint(1, 101, 100)
scores[2::3] *= 2  # every third score: the 3rd, 6th, 9th, ...
np.sum(scores)

6458

## Time Traveler

A historian time traveler wants to study major events in the past century. Create an array of dates starting from 1923-08-14 to 2023-08-14 in 10-year increments, then randomly select 3 dates to travel to.

In [39]:
dates = np.arange('1923-08-14', '2023-08-14', dtype='datetime64[10Y]')
# To ensure the dates, not just years
dates = dates.astype('datetime64[D]')
np.random.choice(dates, 3, replace=False)

array(['1950-01-01', '1920-01-01', '2010-01-01'], dtype='datetime64[D]')
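
Note that `datetime64[10Y]` counts whole decades, so the dates above snap to 1 January of 1920, 1930 and so on rather than keeping 14 August. If the month and day should be preserved, one possible sketch is to step in months and then restore the day offset. The variable names are illustrative, and the end date 2023-08-14 is assumed to be included:

```python
import numpy as np

start = np.datetime64('1923-08-14')

# First day of the starting month, and how many days past it the start date lies
start_month = start.astype('datetime64[M]')
day_offset = start - start_month.astype('datetime64[D]')  # 13 days

# 11 steps of 120 months (10 years): 1923-08, 1933-08, ..., 2023-08
months = start_month + np.arange(11) * np.timedelta64(120, 'M')

# Back to day resolution, then restore the day of the month
dates = months.astype('datetime64[D]') + day_offset

np.random.choice(dates, 3, replace=False)
```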

## Aliens

Aliens are transmitting binary messages — arrays of 100 elements containing only 0s and 1s.

However, their real message is hidden inside the signal: it always starts and ends with 1, and everything else before the first 1 and after the last 1 is just noise.

For example, given the message `[0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0]`, the real message you have to extract is the range from the first 1 until the last one, both included `[1, 0, 1, 1, 0, 0, 1]`.

In [10]:
# Function that generates a random alien message

def get_aliens_message():
    # Creating a message where majority is zeros
    message = np.zeros(100, dtype=int)

    # Introducing the real message
    start_index = np.random.randint(10, 40)
    end_index = np.random.randint(60, 90)

    message[start_index:end_index] = np.random.randint(2, size=end_index-start_index)
    message[start_index] = 1
    message[end_index] = 1
    return message

In [None]:
message = get_aliens_message()

In [48]:
# Extract real message
start, end = np.where(message == 1)[0][[0, -1]]
# Real message
message[start:end+1]

array([1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0,
       1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0,
       0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1])

## Temperature Log

Implement the function `detect_temperature_anomalies(values)`.

You have a 365-day temperature log for a city, represented as an array of floats. Identify the days where the temperature is at least 2 standard deviations away from the mean.

In [None]:
def detect_temperature_anomalies(temperatures):
    mean_temp = np.mean(temperatures)
    # "At least" 2 standard deviations, hence >= rather than >
    anomalies = np.abs(temperatures - mean_temp) >= 2 * np.std(temperatures)
    return np.where(anomalies)[0]

In [57]:
# Simulated temperatures around 25°C with std deviation 5°C
temps = np.random.normal(25, 5, 365)
detect_temperature_anomalies(temps)

array([  6,  37,  81, 110, 146, 156, 161, 187, 192, 233, 263, 275, 318,
       336, 361])

## Max Pooling

In Convolutional Neural Networks (CNNs), used for image processing, there’s a step called max pooling. It helps reduce the size of a matrix (often an image) while keeping only the most important information — the maximum value within small regions.

Write a function `max_pooling(matrix)` that performs 2×2 max pooling on a given matrix. That is, for each 2×2 region, take the maximum value and add it to a new matrix. Finally, return the resulting smaller matrix.

If the input matrix has dimensions $m \times n$, the resulting matrix will have dimensions $(m-1) \times (n-1)$.

For example, for the input matrix

    [[1, 3, 2, 4],
     [5, 6, 1, 0],
     [2, 8, 3, 1],
     [4, 7, 2, 5]]

the resulting matrix from the max pooling operation is

    [[6, 6, 4],
     [8, 8, 3],
     [8, 8, 5]]

**Explanation**:
- The function slides a 2×2 window one step at a time.
- For the top-left region of 2×2 that contains the values 1, 3, 5 and 6, the maximum value is 6. This gives the first value in the top-left of the returned matrix.
- This repeats across the entire matrix for each 2×2 region.

In [22]:
matrix = np.array([
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [2, 8, 3, 1],
    [4, 7, 2, 5],
])

In [23]:
def max_pooling(matrix):
    pool = np.empty(shape=(matrix.shape[0]-1, matrix.shape[1]-1))
    for i in range(0, len(matrix)-1):
        for j in range(0, len(matrix[0])-1):
            submat = matrix[i:i+2, j:j+2]
            pool[i, j] = submat.max()
    return pool

max_pooling(matrix)

array([[6., 6., 4.],
       [8., 8., 3.],
       [8., 8., 5.]])
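
The two nested loops can also be avoided. As a sketch (not the only way to do it), every 2×2 window with stride 1 is made of four shifted views of the input, and the element-wise maximum of those views gives the pooled matrix. The name `max_pooling_vectorized` is illustrative:

```python
import numpy as np


def max_pooling_vectorized(matrix):
    m = np.asarray(matrix)
    # The four corners of every 2x2 window, as shifted views of the input
    top_left, top_right = m[:-1, :-1], m[:-1, 1:]
    bottom_left, bottom_right = m[1:, :-1], m[1:, 1:]
    # Element-wise maximum across the four views
    return np.maximum(np.maximum(top_left, top_right),
                      np.maximum(bottom_left, bottom_right))


matrix = np.array([
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [2, 8, 3, 1],
    [4, 7, 2, 5],
])
max_pooling_vectorized(matrix)
```

Unlike the loop version, which fills a float array created with `np.empty`, this variant keeps the input's integer dtype.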

## The Game of Life

Implement "The Game of Life" using Numpy.

In [26]:
# Author: Nicolas Rougier

def iterate(Z):
    # Count neighbours
    N = (Z[0:-2,0:-2] + Z[0:-2,1:-1] + Z[0:-2,2:] +
         Z[1:-1,0:-2]                + Z[1:-1,2:] +
         Z[2:  ,0:-2] + Z[2:  ,1:-1] + Z[2:  ,2:])

    # Apply rules
    birth = (N==3) & (Z[1:-1,1:-1]==0)
    survive = ((N==2) | (N==3)) & (Z[1:-1,1:-1]==1)
    Z[...] = 0
    Z[1:-1,1:-1][birth | survive] = 1
    return Z

In [27]:
Z = np.random.randint(0, 2, (10,10))
for i in range(10):
    Z = iterate(Z)
    print(Z)
    print()

[[0 0 0 0 0 0 0 0 0 0]
 [0 1 0 1 1 0 1 1 1 0]
 [0 1 0 1 1 0 1 1 0 0]
 [0 1 0 0 0 0 0 0 1 0]
 [0 1 1 1 1 0 1 0 1 0]
 [0 0 0 0 0 0 1 0 1 0]
 [0 1 1 1 1 1 0 0 0 0]
 [0 0 0 1 1 1 1 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]]

[[0 0 0 0 0 0 0 0 0 0]
 [0 0 0 1 1 0 1 0 1 0]
 [0 1 0 1 1 0 1 0 0 0]
 [0 1 0 0 0 0 1 0 1 0]
 [0 1 1 1 0 1 0 0 1 0]
 [0 0 0 0 0 0 1 0 0 0]
 [0 0 1 0 0 0 0 1 0 0]
 [0 0 0 0 0 0 1 0 0 0]
 [0 0 0 0 1 1 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]]

[[0 0 0 0 0 0 0 0 0 0]
 [0 0 1 1 1 0 0 1 0 0]
 [0 0 0 1 1 0 1 0 0 0]
 [0 1 0 0 0 0 1 0 0 0]
 [0 1 1 0 0 1 1 0 0 0]
 [0 1 0 1 0 0 1 1 0 0]
 [0 0 0 0 0 0 1 1 0 0]
 [0 0 0 0 0 1 1 0 0 0]
 [0 0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]]

[[0 0 0 0 0 0 0 0 0 0]
 [0 0 1 0 1 1 0 0 0 0]
 [0 0 0 0 1 0 1 1 0 0]
 [0 1 0 1 1 0 1 1 0 0]
 [0 1 0 0 0 1 0 0 0 0]
 [0 1 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 1 0 1 0 0]
 [0 0 0 0 0 1 1 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]]

[[0 0 0 0 0 0 0 0 0 0]
 [0 0 0 1 1 1 1 0 0 0]
 [0 0 1 0 0 0 0 1 0 0]
 [0