### CS2101 - Programming for Science and Finance
Prof. Götz Pfeiffer<br />
School of Mathematical and Statistical Sciences<br />
University of Galway

# Computer Lab 5

Provide answers to the problems in the boxes provided.  Partial marks will be awarded for
participation and engagement.

**Important:** When finished, print this notebook into a **pdf** file and submit this pdf to
**canvas**.  (Submissions in other formats will not be accepted.)

**Deadline** is next Monday at 5pm.

## Setup

This is a `jupyter` notebook.   You can open and interact
with the notebook through one of the sites recommended at
its [GitHub repository](https://github.com/gpfeiffer/cs2101).

Or, you can
install and use `jupyter` as a `python` package on your own laptop or PC.  

* First, import some packages

```python
import numpy as np
import matplotlib.pyplot as plt
```

* Set up a random number generator `rng`.

```python
rng = np.random.default_rng()
```
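
Calling `default_rng()` without arguments gives different numbers on every run. As a small optional sketch (the seed `2101` and the names `demo_rng`, `sample` and `again` are illustrative choices, not part of the lab), a seeded generator makes results reproducible, and its `random` method draws floats from the half-open interval $[0, 1)$:

```python
import numpy as np

# Assumption: the seed value 2101 is arbitrary, chosen only for illustration.
demo_rng = np.random.default_rng(2101)
sample = demo_rng.random(3)          # 3 floats, each in [0, 1)

# A second generator with the same seed reproduces the same numbers.
again = np.random.default_rng(2101).random(3)
print(sample)
print(np.array_equal(sample, again))
```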

## 1. Sorting and Ranking

1. Use the random number generator to make a list of $10$ random values between $0$ and $1$, perhaps as a numpy array `values` of length $10$. 

2. Look up the documentation for the functions `np.sort` and `np.argsort`.  What do they have in common, and what is the difference?
   Use `np.sort` to compute a sorted copy of the `values` array and assign it to `ranked`.  Use `np.argsort` to compute the list of sorting indices for the array `values` and assign it to `ranks`.

3.  Check that the array `values[ranks]` is equal to array `ranked`. 
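
To illustrate the relationship between the two functions, here is a minimal sketch on a hypothetical three-element array `a` (the names `a`, `a_sorted` and `order` are made up for this example). It shows that indexing an array with its own sorting indices reproduces the sorted copy:

```python
import numpy as np

a = np.array([0.3, 0.1, 0.2])   # hypothetical example data

a_sorted = np.sort(a)           # sorted copy; a itself is unchanged
order = np.argsort(a)           # indices that would sort a

print(a_sorted)                 # [0.1 0.2 0.3]
print(order)                    # [1 2 0]
print(np.array_equal(a[order], a_sorted))
```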

##  2. Random Points

1. Use the random number generator to make a list of $10$ random points in the $x, y$-plane, with coordinates between $0$ and $1$, perhaps as a numpy array `points` of shape $10 \times 2$.

2. Use slicing and indices to select the $x$-values of your points and plot them against `range(10)`.  Do the same for the $y$-values of your points.

3. Plot the $10$ points as a scatter plot in the $x, y$-plane.
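
As a reminder of how column slicing works, the following sketch uses a hypothetical $3 \times 2$ array `pts` (not the `points` array of the exercise) with one point per row:

```python
import numpy as np

# Hypothetical 3 x 2 array: one point per row, columns are x and y.
pts = np.array([[0.1, 0.9],
                [0.4, 0.5],
                [0.8, 0.2]])

xs = pts[:, 0]   # all rows, column 0: the x-values
ys = pts[:, 1]   # all rows, column 1: the y-values
print(xs)
print(ys)
```

One-dimensional arrays such as `xs` and `ys` can be passed directly to `plt.plot` or `plt.scatter`.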

## 3. Extra Dimensions

* One way in which NumPy extends Python's indexing scheme to its multidimensional arrays is by allowing us to add an extra dimension.  This is done by using the constant `None`, or equivalently `np.newaxis`, as an index or slice. The effect is the same as inserting an extra size `1` at that position in a `reshape` command.
  Look up the documentation for `np.newaxis`.

1. What is the shape of `points[:,np.newaxis,:]` ?

2. What is the shape of `points[np.newaxis,:,:]` ?

3. What is the shape of `points[None,:,None,:]`?
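
The same mechanism can be tried out on a simpler case first. This sketch uses a hypothetical one-dimensional array `v` (the names `v`, `column` and `row` are illustrative only) to show where the new axis of size $1$ ends up:

```python
import numpy as np

v = np.arange(3)                 # hypothetical 1D array of shape (3,)

print(np.newaxis is None)        # np.newaxis is just another name for None

column = v[:, np.newaxis]        # new axis after the existing one
row = v[None, :]                 # new axis before the existing one
print(column.shape)              # (3, 1)
print(row.shape)                 # (1, 3)

# Same result as an explicit reshape with an extra 1.
print(np.array_equal(column, v.reshape(3, 1)))
```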

## 4. Broadcasting

* Recall Broadcasting: two numpy arrays can be added if they have the "same" shape. Same shape means same number of dimensions and, in each dimension, the same size unless one of the sizes is $1$.  In that case the entry is repeated as often as needed in that dimension.  Look up the documentation of `np.broadcast` and `np.broadcast_to`.

1. Compute `np.broadcast_to(points[:,None,:], (10,10,2))` and assign the result to `points1`.  What is the shape of `points1`?

2. Compute `np.broadcast_to(points[None,:,:], (10,10,2))` and assign the result to `points2`.  What is the shape of `points2`?

3.  Compute the sum of `points1` and `points2` and assign it to `sums`.  What are the entries in the resulting array?  

4.  Check that `sums` is equal to the sum of `points[:,None]` and `points[None,:]`.  Look up the documentation of `np.array_equal`.
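
For comparison, here is a minimal sketch of the same idea on two small hypothetical integer arrays, `col` of shape `(3, 1)` and `row` of shape `(1, 4)`. It suggests that adding the arrays directly gives the same result as broadcasting both to the common shape first:

```python
import numpy as np

col = np.array([[0], [10], [20]])    # shape (3, 1)
row = np.array([[1, 2, 3, 4]])       # shape (1, 4)

table = col + row                    # implicitly broadcast to shape (3, 4)
print(table.shape)
print(table)

# Explicitly broadcasting both operands first gives the same result.
col_b = np.broadcast_to(col, (3, 4))
row_b = np.broadcast_to(row, (3, 4))
print(np.array_equal(table, col_b + row_b))
```

Note that `np.broadcast_to` returns a read-only view rather than a copy, so the repeated entries do not take up extra memory.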

## 5. Nearest Neighbors

* Let's plot $50$ random points in the $x, y$-plane, and connect each point with its two nearest neighbors by an edge, as follows.

1. Use the random number generator to make a list of $50$ random points in the $x, y$-plane, with coordinates between $0$ and $1$, as a numpy array `points` of shape $50 \times 2$.

2. Plot the $50$ points as a scatter plot in the $x, y$-plane.

3. Using `None` as a slice, convert the array `points` into an array of shape `(50, 1, 2)` and assign the result to `points1`.

4. Using `None` as a slice, convert the array `points` into an array of shape `(1, 50, 2)` and assign the result to `points2`.

5.  Now, compute and rank the distances between every pair of points as follows. Recall that the squared distance between points $a = (a_0, a_1)$ and $b = (b_0, b_1)$ in the $x, y$-plane is the sum of the squared differences in each dimension:
  $$
  \|a - b\|^2 = (a_0 - b_0)^2 + (a_1 - b_1)^2.
  $$
  First, using two suitably reshaped (via `None` slices) variants of the `points` array, compute a 3D array `diffs` whose $i,j$-entry is `points[i] - points[j]`.

6. Compute the squares of the values in `diffs` and assign the result to `diffs2`.

7. Using `np.sum` with argument `axis=-1`, add the $x$ and $y$ components of the squared differences in `diffs2` and assign the result to `dist2`.

8. Optionally, display `dist2` as an image.

9. Check if the diagonal is $0$.  Look up the documentation of `np.diagonal`.

10. Next, for each point $a$, rank its neighbors by distance. That is: apply `np.argsort` to each row of `dist2`.  Assign the result to `nearest`.

11.  Finally, plot the points and join each point with its two nearest neighbors by an edge:  start with a scatter plot of all $50$ points.  Then, loop over all the points.  For each point `points[i]` find the indices `j` of its two nearest neighbors, then use a command like
  ```python
  plt.plot(*zip(points[i], points[j]), color='r', alpha=0.5)
  ```
  to draw an edge between them.
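
The distance computation above can be checked by hand on a tiny case. The following sketch uses four hypothetical points `pts` (chosen so that all distances are distinct) and the illustrative names `order` and `neighbors`. It is one possible way to arrange the steps, not the only one:

```python
import numpy as np

# Four hypothetical points with easily checked squared distances.
pts = np.array([[0.0, 0.0],
                [1.0, 0.0],
                [0.0, 2.0],
                [5.0, 5.0]])

diffs = pts[:, None, :] - pts[None, :, :]   # shape (4, 4, 2)
dist2 = np.sum(diffs**2, axis=-1)           # shape (4, 4), squared distances
order = np.argsort(dist2, axis=1)           # sort each row by distance

# With distinct points, column 0 of each row is the point itself
# (distance 0), so the two nearest neighbors sit in columns 1 and 2.
neighbors = order[:, 1:3]
print(dist2)
print(neighbors)
```

Since every point has distance $0$ to itself, the first index in each sorted row is the point's own index, which is why the sketch skips column $0$.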

##  Submit your work in PDF format!