# Exercise - NumPy to CuPy - `ndarray` Basics

Let's revisit our first NumPy exercise and try porting it to CuPy.

**TODO: Add an import of CuPy, update `xp`, and rerun the cells one by one to see if there are any issues.**

In [None]:
import numpy as np

xp = np
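
The `xp` alias is what makes the port mostly mechanical: every later cell calls `xp.<function>`, so rebinding that one name switches the array library. Here is a minimal sketch of the idea using NumPy only, so it runs without a GPU. The helper name `describe` is hypothetical and not part of the exercise, and the sketch assumes the module passed in exposes `arange` and a `sum` method the way NumPy does.

```python
import numpy as np

xp = np  # Rebind this one name to switch the array library used below.


def describe(module, n):
    """Build the integers 1..n with `module` and report where the array came from."""
    arr = module.arange(1, n + 1)
    return type(arr).__module__, int(arr.sum())


backend, total = describe(xp, 10)
print(backend, total)
```

The first value tells you which library actually produced the array, which is a quick way to confirm that a change to `xp` took effect.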

Create the input data array with the numbers `1` to `500_000_000`.

In [None]:
arr = xp.arange(1, 500_000_001)
arr

Calculate how large the array is in GB with `nbytes`.

In [None]:
arr.nbytes / 1e9
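
As a sanity check on the number above, `nbytes` is just the element count times the size of one element. The sketch below uses a small stand-in array (the name `small` is illustrative) and pins the dtype to `int64`, since the default integer width can differ between platforms.

```python
import numpy as np

# Small stand-in for the exercise array; the dtype is pinned so the result
# does not depend on the platform's default integer width.
small = np.arange(1, 1001, dtype=np.int64)

bytes_by_hand = small.size * small.itemsize  # 1000 elements * 8 bytes each
print(small.nbytes, bytes_by_hand)
print(small.nbytes / 1e9)  # size in GB, taking 1 GB as 10**9 bytes
```

By the same arithmetic, 500,000,000 eight-byte integers occupy 4 GB, assuming the array was created as `int64`.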

How many dimensions does the array have?

In [None]:
arr.ndim # `len(arr.shape)` also works, but is longer to type.

How many elements does the array have?

In [None]:
arr.size # For a 1D array, `arr.shape[0]` also works, but `arr.size` multiplies the sizes of all dimensions.

What is the shape of the array?

In [None]:
arr.shape

Create a new array with `5_000_000` elements containing equally spaced values between `0` and `1000` (inclusive).

In [None]:
arr = xp.linspace(0, 1000, 5_000_000, endpoint=True)
arr
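
To see what "inclusive" means here, this small NumPy sketch compares `endpoint=True` with `endpoint=False` on five points instead of five million. The variable names are illustrative.

```python
import numpy as np

inclusive = np.linspace(0, 1000, 5, endpoint=True)   # step is 1000 / (5 - 1)
exclusive = np.linspace(0, 1000, 5, endpoint=False)  # step is 1000 / 5
print(inclusive)
print(exclusive)
```

With `endpoint=True` the last value is exactly `1000`; with `endpoint=False` the sequence stops one step short of it.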

Create a random array that is `10_000` by `5_000`.

In [None]:
arr = xp.random.rand(10_000, 5_000)
arr

Sort that array.

In [None]:
arr = xp.sort(arr)
arr
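
Note that `sort` on a 2D array does not sort all values globally. By default it sorts along the last axis, so each row is sorted independently. A small NumPy sketch with made-up values:

```python
import numpy as np

m = np.array([[9, 1, 8],
              [3, 7, 2]])

rows_sorted = np.sort(m)          # default axis=-1: sorts within each row
cols_sorted = np.sort(m, axis=0)  # sorts within each column instead
print(rows_sorted)
print(cols_sorted)
```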

Reshape the CuPy array to have the last dimension of length `5`.

In [None]:
arr = arr.reshape((-1, 5))
# -1 infers the size of that dimension from the rest. `arr.reshape((10_000_000, 5))` would also work.
arr
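
The `-1` inference is easier to follow on a small array. In this NumPy sketch (the name `grid` is illustrative), 40 elements are reshaped so that the last dimension has length 5, which leaves 40 / 5 = 8 for the first.

```python
import numpy as np

grid = np.arange(40).reshape(4, 10)  # 40 elements in 4 rows of 10
reshaped = grid.reshape((-1, 5))     # -1 resolves to 40 / 5 = 8
print(reshaped.shape)
print(reshaped[:2])  # the first original row, split into two rows of 5
```

Because elements are read in row-major order, each original row of 10 becomes two consecutive rows of 5.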

Find the sum of each row. Rows are indexed along axis 0, but summing a row adds up its values across the columns, so the reduction runs along axis 1.

In [None]:
arr_sum = xp.sum(arr, axis=1) # You could also write `arr.sum(axis=1)`.
arr_sum
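
If the choice of axis feels backwards, it can help to compare both options on a tiny array. A NumPy sketch with made-up values:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

row_sums = m.sum(axis=1)  # collapses the columns: one value per row
col_sums = m.sum(axis=0)  # collapses the rows: one value per column
print(row_sums, col_sums)
```

The axis you pass is the one that disappears from the result's shape.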

Normalize each row of the array by dividing it by the row sum you just computed, using broadcasting.

In [None]:
arr_normalized = arr / arr_sum[:, xp.newaxis]
arr_normalized
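
The `[:, xp.newaxis]` indexing is what makes the shapes line up for broadcasting. This NumPy sketch walks through the shapes on a 3 by 2 array with made-up values:

```python
import numpy as np

m = np.array([[1.0, 3.0],
              [2.0, 2.0],
              [4.0, 12.0]])

row_sums = m.sum(axis=1)          # shape (3,)
column = row_sums[:, np.newaxis]  # shape (3, 1): one sum per row, as a column
normalized = m / column           # (3, 2) / (3, 1) broadcasts to (3, 2)
print(column.shape)
print(normalized)
```

Without the extra axis, `m / row_sums` would try to match shape `(3, 2)` against `(3,)`, and the trailing dimensions (2 and 3) are incompatible.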

Prove that your normalized array is actually normalized by checking that every row sums to 1.

**TODO: Try changing `xp.testing.assert_allclose` to `np.testing.assert_allclose`. What happens?**

In [None]:
xp.testing.assert_allclose(xp.sum(arr_normalized, axis=1), 1.0)

**TODO: Create two arrays (one NumPy, one CuPy) that discretize the sine function from 0 to 2π with `50_000_000` points. Benchmark how long it takes NumPy and CuPy to sort the array.**

_Hint: You can use `linspace` to help generate the data - see the example in earlier cells._

_Hint: To accurately time both NumPy and CuPy calls, use [`cupyx.profiler.benchmark`](https://docs.cupy.dev/en/stable/reference/generated/cupyx.profiler.benchmark.html). Don't go overboard with the `n_repeat` parameter._
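
As a rough sketch of how `cupyx.profiler.benchmark` is called, the example below times `cp.sum` rather than a sort, so the exercise itself is left to you. It assumes CuPy is installed and a CUDA device is available, and skips the benchmark otherwise. The names `gpu_available`, `data`, and `result` are illustrative.

```python
try:
    import cupy as cp
    from cupyx.profiler import benchmark
    gpu_available = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    # CuPy is not installed, or no CUDA device is usable.
    gpu_available = False

if gpu_available:
    data = cp.arange(1_000_000, dtype=cp.float32)
    # n_repeat is kept small on purpose; the default is much larger.
    result = benchmark(cp.sum, (data,), n_repeat=10)
    print(result)  # summary of CPU and GPU times
    # cpu_times has one entry per repeat; gpu_times has one row per device.
    print(result.cpu_times.mean(), result.gpu_times.mean())
else:
    print("No CUDA device found; skipping the benchmark sketch.")
```

The function under test is passed as the first argument and its positional arguments as a tuple. The returned object's `cpu_times` and `gpu_times` arrays hold the raw timings in seconds.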

In [None]:
import cupyx as cpx

arr_np = ...
arr_cp = ...

...

**EXTRA CREDIT: Benchmark with different array sizes and find the size at which CuPy and NumPy take the same amount of time. Try to extract the timing data from `cupyx.profiler.benchmark`'s return value and customize how the output is displayed. You could even make a graph.**

In [None]:
...