### Programming for Science and Finance

*Prof. Götz Pfeiffer, School of Mathematical and Statistical Sciences, University of Galway*

# Notebook 5: Scientific Programming I

This notebook accompanies **Part II: Scientific Programming**. You will:

* learn to create and manipulate **NumPy arrays**, the core data structure for scientific computing in Python;
* explore powerful **array generators** such as `zeros`, `arange` and `linspace`;
* practice **multidimensional indexing and slicing** to access, reshape, and modify data efficiently;
* apply array operations to **digital images**, understanding how pixel data can be processed numerically;
* work with the **RGB colour model** and perform simple image transformations using array arithmetic;
* and experience the speed and elegance of **vectorized operations**, where entire datasets are processed without explicit loops.

By the end of this notebook, you’ll see how the same array-based approach supports both **scientific analysis** and **visual applications**, setting the stage for numerical simulation and data modelling in later notebooks.

## Task 1. NumPy Arrays.

When working with large amounts of data, one soon notices

* Python lists need a lot of space;
* Python loops need a lot of time.

One reason for this is Python's dynamic typing, which allows lists to contain inhomogeneous data and variables to have no fixed type.

If you know that you are working with homogeneous data, there are lots of savings to be made, both in terms of space and in terms of time.

That's what the `numpy` package promises to provide.

In [None]:
import numpy as np
print(np.__version__)

`numpy` provides a new **data type** for **homogeneous** lists of data: the array, of class `np.ndarray`, which is created with the function `np.array`.

In [None]:
l = [3, 1, 4, 1, 5, 9, 2, 6]
a = np.array(l)
print(a)
a
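
To illustrate what "homogeneous" means in practice, here is a small sketch (the names `ints` and `mixed` are only for illustration): when the input list mixes integers and floats, `numpy` picks one common type for all entries.

```python
import numpy as np

ints = np.array([3, 1, 4])       # all entries are integers
mixed = np.array([3, 1.5, 4])    # a single float forces a common float type

print(ints.dtype)    # an integer type (the exact width depends on the platform)
print(mixed.dtype)   # float64
print(mixed)         # the integers 3 and 4 are now stored as 3.0 and 4.0
```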

Even with such a small example, the fact that an array only needs to store the common **type** of its entries once can lead to big savings in terms of space.

In [None]:
from sys import getsizeof
print("Size of list:", getsizeof(l) + sum(getsizeof(x) for x in l))
print("Size of array:", getsizeof(a))
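
The savings in time can be sketched in a similar way. The following comparison is only an illustration: the problem size `n`, the function names and the use of `timeit` are assumptions of this sketch, and the measured times will differ from machine to machine.

```python
import timeit
import numpy as np

n = 100_000
values = list(range(n))                 # plain Python list
array = np.arange(n, dtype=np.int64)    # NumPy array with the same entries

def loop_sum_of_squares():
    # explicit Python loop over the list
    total = 0
    for v in values:
        total += v * v
    return total

def numpy_sum_of_squares():
    # the same computation on the whole array at once
    return int(np.sum(array * array))

print("loop: ", timeit.timeit(loop_sum_of_squares, number=5))
print("numpy:", timeit.timeit(numpy_sum_of_squares, number=5))
```

Both functions return the same number; only the time needed to compute it differs.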

`numpy` arrays can be multidimensional, e.g., 2-D matrices.

In [None]:
ll = [list(range(i, i+4)) for i in [2,5,7]]
print(ll)
aa = np.array(ll)
print(aa)

---
**Exercises.**

1. Create a $3\times 3$ matrix `A` of integers and a scalar `s = 2.5`.
Predict and verify what happens for `A + s` and `s * A`.

2. Compute the transpose of `A` as `A.T` and verify that `A.T.T` equals `A`.

3. Convert `A` to a matrix of floating point numbers via `A.astype(float)`.

4. Create and print a $3 \times 3 \times 3$ array of integers.
---

## Task 2.  Array Generators

Each array `a` has a number of attributes, most importantly, `a.shape` and `a.dtype`.

* `a.shape`: its **length** in each dimension,
* `a.dtype`: the common **type** of all its entries.


In [None]:
print(a.shape)
print(a.dtype)
print(aa.shape)
aa.dtype

Further attributes are derived from an array's shape:

* `a.ndim`: the number of dimensions (i.e., `len(a.shape)`),
* `a.size`: its total number of entries (i.e., `math.prod(a.shape)`).

In [None]:
print(a.ndim)
print(a.size)
print(aa.ndim)
print(aa.size)

The product of a list of numbers can be computed with `math.prod`.

In [None]:
from math import prod
prod(aa.shape)

There are a number of ways to quickly make arrays of a certain kind.
For example, an array full of **zeros**, of a given **shape** and **type**:

In [None]:
np.zeros((3, 4), int)

Or **ones**:

In [None]:
np.ones(10, float)

Or matrices with random entries, which can be useful, e.g. for testing purposes.
For this, you first need to make a **random number generator**, let's call it `rng`.

In [None]:
rng = np.random.default_rng()

With this, we can create, say, a $3 \times 4$ array of uniformly distributed random values between $0$ and $1$:

In [None]:
rng.random((3,4))

Or a $2 \times 3 \times 4$ array of random integers in the range `range(0, 10)`:

In [None]:
aaa = rng.integers(0, 10, (2, 3, 4))
print(aaa)
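
For testing purposes it is often convenient to get the same "random" numbers on every run. A minimal sketch, assuming an arbitrary seed value `2024` (the seed and the names `rng1`, `rng2`, `x1`, `x2` are not part of the notebook): two generators created with the same seed produce identical arrays.

```python
import numpy as np

rng1 = np.random.default_rng(2024)   # seeded generator
rng2 = np.random.default_rng(2024)   # second generator with the same seed

x1 = rng1.integers(0, 10, (2, 3, 4))
x2 = rng2.integers(0, 10, (2, 3, 4))

print(np.array_equal(x1, x2))            # True: same seed, same values
print(x1.min() >= 0 and x1.max() < 10)   # True: the upper bound 10 is excluded
```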

---
**Exercises.**

1. Create a $3 \times 3$ array of integers all equal to $1$.
2. Use `np.arange` to create an array of integers corresponding to `range(1, 10)`.
3. Use `np.linspace` to create an array of $5$ values, evenly spaced between $1$ and $3$ (inclusive).
4. Compare `np.arange(0, 1.1, 0.2)` to `np.linspace(0, 1, 6)`.
   What is the difference in how the endpoints are handled?
5. Use `np.eye` to create a $3 \times 3$ identity matrix.
6. Use `np.array` to convert the `Matrix` object `ma` below into an array.
   ```python
   from tensor import *
   ma = Matrix([Vector([1, 0, 1]), Vector([2, 1, 1]),Vector([0, 1, 1]), Vector([1, 1, 2])])
   ```

---

## Task 3. Multidimensional Indexing and Slicing.

**Recall** that `l[i]` accesses the $i$th element of a list `l`, counting from $0$.  
This kind of **indexing** can also be used to assign a new value to the $i$th element of the list `l`, as in `l[i] = 2`.  
**Negative indices** count from the end of the list, so that `l[-1]` is the last element of `l`.

In a similar way, **slicing** is used to **extract a sublist** of values from a list `l`.  
In general, `l[start:stop:step]` yields the list of values from positions `start` up to (but excluding) `stop`, with step size `step`.  
Here the part `:step` is optional, with default step size $1$.  
Also, `start` can be omitted and defaults to $0$, `stop` can be omitted and defaults to `len(l)`, so that `l[:]` results in an exact copy of `l`.
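
As a quick reminder of these rules, here is a short sketch on the list `l` from Task 1 (the name `c` for the copy is only for illustration):

```python
l = [3, 1, 4, 1, 5, 9, 2, 6]

print(l[2:5])     # [4, 1, 5]   positions 2, 3, 4
print(l[:3])      # [3, 1, 4]   start defaults to 0
print(l[5:])      # [9, 2, 6]   stop defaults to len(l)
print(l[1:7:2])   # [1, 1, 9]   positions 1, 3, 5
print(l[-1])      # 6           last element

c = l[:]          # a copy of the list
c[0] = 0          # changing the copy ...
print(l[0])       # 3           ... leaves the original untouched
```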

In a **multi-dimensional** numpy array, elements and sublists can be accessed with the same syntax, using a single index or a slice, one for **each dimension**.

In [None]:
print(aaa[1,1,3])
aaa[0,2,:]

Unlike with plain Python lists, array slices are **views** rather than **copies** of the array data.  
Hence, updating elements of a view will update the original array as well.
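
The difference between a view and a copy can be sketched as follows. The names `data`, `view` and `cp` are only for illustration; `np.shares_memory` is used here to check whether two arrays refer to the same underlying data.

```python
import numpy as np

data = np.arange(10)

view = data[2:5]          # a slice is a view of the same memory
view[0] = 99              # writing to the view ...
print(data[2])            # 99: ... changes the original array

cp = data[2:5].copy()     # an explicit copy has its own memory
cp[1] = -1                # writing to the copy ...
print(data[3])            # 3: ... leaves the original untouched

print(np.shares_memory(data, view))   # True
print(np.shares_memory(data, cp))     # False
```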

**Reshaping** is another useful operation on arrays.  As long as the `size` of an array remains the same, its elements can be refitted into any given shape.

In [None]:
b = np.arange(1,13).reshape(3,4)
print(b)
print(b.reshape(4,3))
print(b.reshape(2,2,3))
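
Two further aspects of `reshape`, sketched on a separate array `seq` (a name chosen only for this example): one dimension may be given as `-1`, in which case its length is inferred from the array's size, and a shape that does not match the size is rejected with a `ValueError`.

```python
import numpy as np

seq = np.arange(1, 13)            # 12 entries

print(seq.reshape(2, -1).shape)   # (2, 6): the -1 is inferred as 12 / 2
print(seq.reshape(-1, 4).shape)   # (3, 4): the -1 is inferred as 12 / 4

try:
    seq.reshape(5, 3)             # 5 * 3 = 15 does not match the size 12
except ValueError as err:
    print("ValueError:", err)
```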

---
**Exercises.**

1. For this and the following exercises, set `x = np.arange(1, 11)`.
2. Use slicing to select the first $5$ elements from `x`.
3. Use slicing to select the last $5$ elements from `x`.
4. Use slicing to select the elements at positions $4$, $5$ and $6$ from `x`.
5. Use slicing to select every other element from `x`, i.e., the elements at the **even** positions.
6. Use slicing to select the elements at the odd positions in `x`.
7. Use slicing to list the elements of `x` in reverse order.
8. Use slicing to list every other element of `x` in reverse order, starting at position $7$.

---

## Task 4. Digital Image Manipulation.

* Motto: What looks like an array is an array ...
* We will treat a digital image as a numpy array of suitable shape and type.
* This will allow us to formulate (and program) certain image manipulation tasks as matrix operations.
 
* `Pillow` is the maintained fork of the **Python Imaging Library** (PIL). Import it as `PIL`. Use mainly its `Image` class.

In [None]:
import PIL
print(PIL.__version__)
from PIL import Image

* Use `Image.open` (on a filename) to load image data into a python session.
* Here, the image is stored as a file `"long_walk.png"` in the folder `"images"`.

In [None]:
img = Image.open("images/long_walk.png")
img

* The image as such is not an array.

In [None]:
type(img)

* But we can easily **convert** an image into array format ...

In [None]:
pic = np.array(img)
print(pic.shape)
print(pic.dtype)

* ... and convert the array back to an image.

In [None]:
Image.fromarray(pic)

---
**Exercises.**

1. What is the image corresponding to the slice `pic[:100,:100,:]`?
2. What image corresponds to `pic[:100,:100]`?
3. What image results from slicing `pic[::-1]`?
4. What is `pic[:,::-1]`?

---

## Task 5.  The RGB color model

* Interestingly, the image becomes a 3-dimensional array, that is a matrix with $201$ rows and $1000$ columns, where each entry is itself an array of length $3$.
* This corresponds to the **RGB color model**, where an image is represented as a rectangular grid (i.e., a matrix) of **pixels**.
* And each pixel is a collection of $3$ intensity values $(r, g, b)$, one for each of the colors Red, Green and Blue.
* The possible values of $r, g, b$ are the integers in the range between $0$ and $255$, that is unsigned $8$-bit integers, denoted as type `uint8` in numpy.

In [None]:
pic[99,99]
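
To see the RGB model without depending on an image file, one can build a tiny image by hand. The following sketch uses a hypothetical $2 \times 2$ array `tiny` with one red, one green, one blue and one white pixel.

```python
import numpy as np

# 2 rows, 2 columns, 3 colour intensities (r, g, b) per pixel
tiny = np.array([[[255, 0, 0], [0, 255, 0]],
                 [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

print(tiny.shape)          # (2, 2, 3)
print(tiny.dtype)          # uint8
print(tiny[0, 0])          # the top left pixel: pure red
print(tiny[1, 1])          # the bottom right pixel: white (all intensities at 255)
```

Passing `tiny` to `Image.fromarray` should display it as a (very small) image.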

* We can use **slicing** to crop an image, e.g., to pick just the top left corner.

In [None]:
small = pic[:101,:201,:]
Image.fromarray(small)

* We can **separate the $3$ colors** by setting $2$ color intensities to $0$ in a **copy** of the image.  Again, slicing avoids explicit `for` loops.

In [None]:
r = pic.copy()
r[:,:,1:] = 0
print(r[99,99])
Image.fromarray(r)

In [None]:
b = pic.copy()
b[:,:,:2] = 0
print(b[99,99])
Image.fromarray(b)

In [None]:
g = pic.copy()
g[:,:,::2] = 0
print(g[99,99])
Image.fromarray(g)

* Addition: note how the original image is the **sum** of `r`, `g` and `b`.

In [None]:
Image.fromarray(r + g + b)
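
That the three single-colour images add up to the original can also be checked numerically. The sketch below uses a small random array `sample` as a stand-in for `pic` (the names `demo_rng`, `sample`, `red`, `green`, `blue` and `total` are assumptions of this example).

```python
import numpy as np

demo_rng = np.random.default_rng(0)
sample = demo_rng.integers(0, 256, (4, 5, 3), dtype=np.uint8)  # stand-in image

red = sample.copy()
red[:, :, 1:] = 0        # keep only channel 0
green = sample.copy()
green[:, :, ::2] = 0     # keep only channel 1
blue = sample.copy()
blue[:, :, :2] = 0       # keep only channel 2

total = red + green + blue
print(np.array_equal(total, sample))   # True
```

No overflow can occur in this sum, since at each position only one of the three summands is nonzero.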

---
**Exercises.**

1. Separate the colors and display as grayscale images by setting

   ```python
   R, G, B = pic[:,:,0], pic[:,:,1], pic[:,:,2]
   ```
   and then convert each of these $2$-dimensional arrays into an `Image`.
2. Compute and display a grayscale version of the image as

   ```python
   0.299*R + 0.587*G + 0.114*B
   ```

---

##  Task 6.  Vectorized Operations

Note that the sum `r + g + b` of three arrays of the same shape is a `numpy` shorthand for this more explicit Python code:

In [None]:
rgb = np.zeros(r.shape, dtype=r.dtype)
for i in range(r.shape[0]):
    for j in range(r.shape[1]):
        for k in range(r.shape[2]):
            rgb[i, j, k] = r[i, j, k] + g[i, j, k] + b[i, j, k]

Image.fromarray(rgb)

* **Scaling** is another example:  Suppose we want a darker version of `pic`, with each color intensity just half of its original value. 
In principle, all we need to do is to multiply each entry of `pic` by $0.5$.
However, the entries in `pic` have type `uint8`, and multiplying such a number with a `float` results in a `float`.

In [None]:
p = pic[99,99,0]
print(p)
print(0.5 * p)
type(0.5 * p)

* But we can always convert this number back to the type of `pic[99, 99, 0]`, using the type `np.uint8` as a type conversion function.

In [None]:
np.uint8(0.5 * p)

* So, here we go, looping over the pixels ...

In [None]:
rgb = np.zeros(pic.shape, dtype=pic.dtype)
for i in range(pic.shape[0]):
    for j in range(pic.shape[1]):
        for k in range(pic.shape[2]):
            rgb[i, j, k] = np.uint8(0.5 * pic[i, j, k])

Image.fromarray(rgb)

* Again, `numpy` allows us to replace all three `for` loops by **vectorized operations** that apply to the entire array in one go, including the type conversion to `uint8`.

In [None]:
Image.fromarray(np.uint8(0.5 * pic))

* Next, let's try and make a lighter version of the picture, by scaling pixels with a value bigger than $1$.

In [None]:
Image.fromarray(np.uint8(1.5 * pic))

* The result is partially disturbed by some random noise.  What happened?
* A possible explanation is that scaling by $1.5$ can result in values larger than $255$, which get truncated (typically by wrapping around modulo $256$) to unintended small values when converted back to `uint8`.

In [None]:
np.uint8(1.5 * np.uint8(200))
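
A closely related effect can be observed with pure `uint8` arithmetic, where results larger than $255$ wrap around modulo $256$. This is only a sketch with made-up values (`a8`, `b8` and `wide` are illustrative names); how an out-of-range `float` is converted to `uint8` may depend on the platform and the `numpy` version, so the integer case is shown instead.

```python
import numpy as np

a8 = np.array([200, 100, 255], dtype=np.uint8)
b8 = np.array([100, 100, 1], dtype=np.uint8)

print((a8 + b8).tolist())   # [44, 200, 0]: 300 and 256 do not fit into 8 bits
print(300 % 256)            # 44: the same value, computed by hand

wide = a8.astype(np.int64) + b8   # computing in a wider type avoids the wrap
print(wide.tolist())              # [300, 200, 256]
```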

* In order to avoid this truncation, we should take care not to let the scaled value exceed $255$.

In [None]:
np.uint8(min(1.5 * np.uint8(200), 255))

* Equipped with this technique, we can now loop over the pixels again.

In [None]:
rgb = np.zeros(pic.shape, dtype=pic.dtype)
for i in range(pic.shape[0]):
    for j in range(pic.shape[1]):
        for k in range(pic.shape[2]):
            rgb[i, j, k] = np.uint8(min(1.5 * pic[i, j, k], 255))

Image.fromarray(rgb)

* Can this be vectorized?  Unfortunately
  ```python
      min(1.5 * pic, 255)
  ```
  does not work, and neither does
  ```python
      np.min(1.5 * pic, 255)
  ```
  (why?)  
* But `np.minimum` does work! (Check its documentation to find out how it is different from `np.min`.)

In [None]:
Image.fromarray(np.uint8(np.minimum(1.5 * pic, 255)))
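
On a small example the effect of `np.minimum` is easy to follow by hand. The sketch below uses a made-up $2 \times 2$ array `vals`; it also shows `np.clip`, which is not used in this notebook but gives the same result here by limiting values from both sides.

```python
import numpy as np

vals = np.array([[10, 200], [100, 250]], dtype=np.uint8)
scaled = 1.5 * vals                  # floats: 15, 300, 150, 375

print(np.min(scaled))                # 15.0: one number, the smallest entry

capped = np.minimum(scaled, 255)     # entry by entry: min(entry, 255)
print(capped.tolist())               # [[15.0, 255.0], [150.0, 255.0]]

clipped = np.clip(scaled, 0, 255)    # alternative: limit to the interval [0, 255]
print(np.array_equal(capped, clipped))   # True

print(np.uint8(capped).tolist())     # [[15, 255], [150, 255]]
```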

* Finally: Flip!

In [None]:
Image.fromarray(np.flip(pic, 0))

In [None]:
Image.fromarray(np.flip(pic, 1))

In [None]:
Image.fromarray(np.flip(pic, 2))

In [None]:
Image.fromarray(pic)
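
What `np.flip` does along a given axis can be checked on a small matrix, here a hypothetical $2 \times 3$ array `m`. The sketch also compares each flip with a slice using step size $-1$.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)    # [[0, 1, 2], [3, 4, 5]]

print(np.flip(m, 0).tolist())     # [[3, 4, 5], [0, 1, 2]]: rows in reverse order
print(np.flip(m, 1).tolist())     # [[2, 1, 0], [5, 4, 3]]: columns in reverse order

print(np.array_equal(np.flip(m, 0), m[::-1]))      # True
print(np.array_equal(np.flip(m, 1), m[:, ::-1]))   # True
```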

---
**Exercises.**

1. Use `np.flip` (or a corresponding slice) to rotate the image by 180 degrees.

2. Let `x = np.linspace(0, 2*np.pi, 1000)`. 
   Compute `np.sin(x)**2 + np.cos(x)**2` and verify it equals 1.

3. Create two arrays, an array `A` of shape $(3,1)$ and an array `B` of shape $(1,3)$, and compute
   `A + B`.  Explain the result.
---

## Summary

- NumPy arrays are the foundation of scientific programming in Python.
- Array operations replace explicit loops, leading to concise and efficient code.
- Slicing and broadcasting enable powerful manipulation of data in any dimension.
- Images are naturally represented as 3-D arrays (height × width × colour).
- Vectorization is the key to high-performance numerical computing.
