# 7. Other Useful NumPy Functions

In [None]:
import numpy as np

In this part, we will look at some other useful NumPy functions.

The function `np.where(condition, x, y)` returns an array whose elements are taken from `x` where `condition` is `True` and from `y` where it is `False`. This is best demonstrated with an example.

In [None]:
arr = np.array([10, 20, 30, 40, 50])

condition = arr > 30
x = np.array([-1, -2, -3, -4, -5])
y = np.array([1, 2, 3, 4, 5])
result = np.where(condition, x, y)

print(f"Original Array: {arr}")
print(f"Condition (array > 30): {condition}")
print(f"x array (selected when True): {x}")
print(f"y array (selected when False): {y}")
print(f"Result of np.where: {result}")
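The `x` and `y` arguments do not have to be full arrays: scalars are broadcast against the condition. Called with only a condition, `np.where()` instead returns the indices where the condition holds. A minimal sketch of both forms, using a made-up `temps` array for illustration:

```python
import numpy as np

# Hypothetical example data, not taken from the exercise files
temps = np.array([-3.0, 1.5, 0.0, 7.2, -0.5])

# Scalar x: every negative value is replaced by 0.0, the rest are kept
clipped = np.where(temps < 0, 0.0, temps)
print(f"Clipped: {clipped}")

# Condition only: returns a tuple with one index array per dimension
(below_zero_idx,) = np.where(temps < 0)
print(f"Indices of negative values: {below_zero_idx}")
```

The single-argument form is handy when you need the positions of the matching elements rather than a new array of values.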

The function `np.unique()` finds the unique elements of an array and returns them in sorted order. If we pass `return_counts=True`, the function also returns the count of each unique element (how many times it appears in the original array). We can also apply `np.unique()` along a given axis.

In [None]:
arr = np.array([1, 2, 2, 3, 1, 5, 6, 5])
unique_elements = np.unique(arr)

print(f"Original Array: {arr}")
print(f"Unique Elements: {unique_elements}")

# With counting
unique_elements, counts = np.unique(arr, return_counts=True)
print(f"Unique Elements: {unique_elements}")
print(f"Counts: {counts}")

In [None]:
# Unique values along an axis (unique rows in this case)
arr = np.array([[1, 0],
                [3, 1],
                [1, 1],
                [1, 0],
                [6, 2],
                [3, 1],
                [1, 3],
                [1, 0]])

unique_elements = np.unique(arr, axis=0)
print(f"Original Array:\n{arr}")
print(f"Unique Elements:\n{unique_elements}")
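Besides `return_counts`, `np.unique()` also accepts `return_index` and `return_inverse`. The sketch below uses a made-up `colors` array to show what these return: the position of the first occurrence of each unique value, and the indices that rebuild the original array from the unique values.

```python
import numpy as np

# Hypothetical example data
colors = np.array(["red", "blue", "red", "green", "blue", "red"])

values, first_idx, inverse = np.unique(
    colors, return_index=True, return_inverse=True
)

print(f"Unique values (sorted): {values}")
print(f"Index of first occurrence: {first_idx}")
print(f"Inverse indices: {inverse}")

# Indexing the unique values with the inverse indices restores the input
print(f"Reconstructed: {values[inverse]}")
```

The inverse indices can serve as integer codes for categorical data, since each distinct value is mapped to a fixed integer.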

There are two main sorting functions provided by NumPy:
- `np.sort()` - returns a sorted copy of an array.
- `np.argsort()` - returns the indices that would sort an array.

In [None]:
array = np.array([3, 1, 2, 5, 4])

# Using np.sort()
sorted_array = np.sort(array)
print(f"Original Array: {array}")
print(f"Sorted Array: {sorted_array}\n")

# Using np.argsort()
sorted_indices = np.argsort(array)
sorted_array = array[sorted_indices]
print(f"Original Array: {array}")
print(f"Indices of Sorted Array: {sorted_indices}")
print(f"Sorted Array: {sorted_array}")
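Because `np.argsort()` returns indices, the same indices can reorder a second array that is aligned with the first. The following sketch assumes two made-up parallel arrays, `names` and `scores`, and also shows one way to get descending order by reversing the index array.

```python
import numpy as np

# Hypothetical parallel arrays: names[i] belongs to scores[i]
names = np.array(["Ada", "Bo", "Cy", "Di"])
scores = np.array([72, 95, 60, 88])

# Indices that sort the scores in ascending order
order = np.argsort(scores)
print(f"Names by ascending score: {names[order]}")

# Reverse the indices for descending order, then keep the first two
top2 = order[::-1][:2]
for name, score in zip(names[top2], scores[top2]):
    print(f"{name}: {score}")
```

Reversing with `[::-1]` is only one option; sorting the negated values with `np.argsort(-scores)` gives the same order here because all scores are distinct.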


These sorting functions also support the `axis=` argument.

In [None]:
array_2d = np.array([[3, 1, 4],
                     [1, 5, 9],
                     [2, 6, 5]])

# Sort along axis 0 (each column is sorted independently)
sorted_axis0 = np.sort(array_2d, axis=0)
sorted_indices_axis0 = np.argsort(array_2d, axis=0)

# Sort along axis 1 (each row is sorted independently)
sorted_axis1 = np.sort(array_2d, axis=1)
sorted_indices_axis1 = np.argsort(array_2d, axis=1)

print("Original 2D Array:")
print(array_2d)
print()

print("Sorted along Axis 0 (columns):")
print("Sorted Array:")
print(sorted_axis0)
print("Indices of Sorted Array:")
print(sorted_indices_axis0)
print()

print("Sorted along Axis 1 (rows):")
print("Sorted Array:")
print(sorted_axis1)
print("Indices of Sorted Array:")
print(sorted_indices_axis1)
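For a 2D array, the index array returned by `np.argsort(..., axis=...)` cannot be applied with plain `array[indices]` as in the 1D case. One way to apply it is `np.take_along_axis()`. The sketch below reuses `array_2d` from above and also shows a common related pattern, reordering whole rows by the values in one column.

```python
import numpy as np

array_2d = np.array([[3, 1, 4],
                     [1, 5, 9],
                     [2, 6, 5]])

# Apply per-row argsort indices to get the row-wise sorted array
idx = np.argsort(array_2d, axis=1)
rebuilt = np.take_along_axis(array_2d, idx, axis=1)
print("Rebuilt from indices:")
print(rebuilt)
print(f"Same as np.sort: {np.array_equal(rebuilt, np.sort(array_2d, axis=1))}")

# Reorder entire rows by the values in the first column
row_order = np.argsort(array_2d[:, 0])
rows_by_first_col = array_2d[row_order]
print("Rows ordered by first column:")
print(rows_by_first_col)
```

In the second pattern the rows stay intact and only their order changes, which differs from sorting each column independently with `axis=0`.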

## Exercise

You are given two files, `data_dates.npy` and `measurements.npy`, each containing a NumPy array. The file `data_dates.npy` contains an array of shape `(367,)` holding date strings. The file `measurements.npy` contains an array of shape `(367, 2)`, where the first and second columns correspond to the average temperature and the amount of precipitation for a given day, respectively.

The measurements are taken from [Seklima](https://seklima.met.no/observations/) and were recorded at Florida, Bergen.

In [None]:
dates = np.load("data_dates.npy", allow_pickle=True)
measurements = np.load("measurements.npy")

print(f"Data shape: {measurements.shape}")

print("First 10 rows of data:")
print("Date\t\tTemperature\tPrecipitation")
for date, temp, prec in zip(dates[:10], measurements[:10, 0], measurements[:10, 1]):
    print(f"{date}\t{temp}\t\t{prec}")

### Task 1

Using `np.argsort()`, find and print the 10 dates with the highest average temperatures and the 10 dates with the lowest average temperatures in the dataset.

Print the dates together with the temperature on that day. The output should be something like this:

```
Top 10 dates with the highest temperatures
14.06.2023 Temperature: 19.4
21.05.2024 Temperature: 19.4
08.09.2023 Temperature: 19.5
...

Top 10 dates with the lowest temperatures
06.01.2024 Temperature: -7.5
05.01.2024 Temperature: -6.9
09.02.2024 Temperature: -5.2
...
```

**Hint:** You can extract the temperatures from `measurements` by indexing the first column with `measurements[:, 0]`. This will give you a 1-dimensional array consisting only of the temperature measurements.

In [None]:
# Your code here
sorted_idxs = ...
highest_idxs = ...
lowest_idxs = ...

### Task 2

Use `np.argmin()` and `np.argmax()` to find the day with the lowest temperature and the day with the highest temperature.

Does this agree with your results from the previous task? What happens when there is a tie (as in this case for `argmax`)?

Your output should be something like this:
```
Lowest temperature was -7.5 on 06.01.2024
Highest temperature was 21.4 on 09.07.2023
```

In [None]:
# Your code here
temperatures = ...
max_idx = ...
min_idx = ...

### Task 3

Compute the mean temperature and precipitation for all dates.

**Hint:** Use `np.mean()` on `measurements` together with the `axis=` argument to compute both means at the same time. Which axis should we compute the mean along?

Your output should be something like this: `Mean temperature: 8.67, mean precipitation: 6.59`.

In [None]:
# Your code here
means = ...

### Task 4

Create an array `labels` using `np.where()` on the precipitation measurements, such that it has the value `0` where the precipitation is zero and `1` where the precipitation that day is greater than zero.

Then use `np.stack()` to stack the precipitation measurements and the labels as two columns, and print the first 20 rows.

Finally, print the sum of all values in `labels`. What does this number mean? Can you find other ways to compute the same number using NumPy?

The output should be:
```
[[ 0.   0. ]
 [ 0.   0. ]
 [ 0.   0. ]
 [ 0.   0. ]
 [ 0.   0. ]
 [ 0.   0. ]
 [ 0.3  1. ]
 [ 0.2  1. ]
 [ 7.   1. ]
 [ 1.6  1. ]
 [ 0.3  1. ]
 [ 0.   0. ]
 [ 0.   0. ]
 [47.1  1. ]
 [ 0.   0. ]
 [10.5  1. ]
 [ 0.1  1. ]
 [ 3.   1. ]
 [ 8.7  1. ]
 [ 2.6  1. ]]
238
```

In [None]:
# Your code here
precipitation = ...
labels = ...
stacked = ...