## Announcements
* Tutorial 4 due tomorrow at 12pm
* HW 3 due Thu 9/11 at 12pm

# NumPy and Arrays, Part I

<!-- <a href="https://www.northernsoul.me.uk/wonder-materials-graphene/created-by-digital-micrograph-gatan-inc/" target="_blank"><img src="img/graphene.jpg" width=500px /> </a> -->
<a href="https://www.northernsoul.me.uk/wonder-materials-graphene/created-by-digital-micrograph-gatan-inc/" target="_blank"><img src="https://raw.githubusercontent.com/wlough/CU-Phys2600-Fall2025/main/lectures/img/graphene.jpg" width=500px /> </a>


## PHYS 2600: Scientific Computing

## Lecture 5

Especially as scientists, it's easy to appreciate that _relationships between data_ are almost as important as the data themselves.  For example, consider the force on an electric charge $q$:

$$
\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}).
$$

To compute $\mathbf{F}$ we need 10 numbers: the charge $q$, and 3 components each for $\mathbf{E}$, $\mathbf{B}$, and $\mathbf{v}$.

Now imagine trying to calculate this in Python by assigning 10 different variables:

In [None]:
q = -1.6e-19
Ex = 1.4
Ey = 1.4
Ez = 0.0

Fx = q * Ex
Fy = q * Ey
Fz = q * Ez

And this is only a tiny part of the whole formula so far!

The code we would go on to write in this example would be guilty of some cardinal sins of programming:

1. __Too much repetition:__ The same lines over and over, just swapping `x` --> `y` --> `z`.  Repetition in code is bad: it wastes effort, and it makes the code hard to change later.

2. __Not enough abstraction:__ If this is part of a larger E&M simulation, there will probably be lots of dot products and cross products - again, we don't want to copy/paste those formulas and change variable names!  _Abstraction_ (defining functions) both saves repetition _and_ makes our code easier to read.


To fix these problems, we need to be able to write code which is aware of the _connections_ between data - e.g. that `Ex`, `Ey` and `Ez` are all related to each other as components of a 3-vector.

## Introducing the array

There are many ways to store data relationships in Python; one of the most rigid, and one of the most useful for scientific data, is the __array__ (roughly, a "_typed list_").  An array is an ordered collection where __all entries share a common data type__.

The type restriction lets us easily store an array as a _contiguous block of memory_, divided into data-sized chunks.  For example, storing (15, 2, 4, 13):

<!-- <img src="img/lva-array.png" /> -->
<img src="https://raw.githubusercontent.com/wlough/CU-Phys2600-Fall2025/main/lectures/img/lva-array.png" />

(We haven't covered binary yet, so don't worry about the details.)  The point of writing it like this is to expose a useful feature: __vector operations__ act on the array as a whole.  (Loosely speaking, adding this to another 4-integer array is like adding two big binary numbers in one step, rather than handling each entry separately!)

By far the most widely used array module for Python is __NumPy__.

<!-- <img src="img/numpylogo.png" width=500px/> -->
<img src="https://raw.githubusercontent.com/wlough/CU-Phys2600-Fall2025/main/lectures/img/numpylogo.png" width=500px/>


You would be hard-pressed to find _any_ scientific computing project in Python that doesn't use NumPy somewhere!


In [None]:
import numpy as np ## `np` is the conventional 'short name'
a = np.array([15, 2, 4, 13])
print(a)
print(type(a))

__Note the syntax__: with `np.array` we need both `()` and `[]`.  (This is actually _type-casting_: we are converting a different Python object, the __list__, which is written with `[]`, into an array.  We'll come back to lists later.)

Any NumPy array has type `numpy.ndarray`, but __this doesn't tell us the data type of its elements.__



The data type is an __attribute__ of the array, accessed as `.dtype` using dot notation:

In [None]:
print(a.dtype)

(Dot notation is for objects that belong to other objects - here, the `dtype` of array `a`.)  Here the `dtype` is `int64` - "integer, 64 bits".  (64 bits is the amount of storage per integer - we'll come back to this in the future.)  The other most common type we'll see is `float64`, which stores decimals (like the Python `float`).

There are plenty of other dtypes available; [here's a list of many of them](https://numpy.org/doc/stable/user/basics.types.html).
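As a small sketch of what the `dtype` controls, the cell below builds one integer array and one float array and prints how much storage each uses. The names `ints` and `floats` are made up for this example, and the integer dtype is set explicitly so the output does not depend on the platform default.

```python
import numpy as np

# Integer array, with the dtype requested explicitly for this sketch
ints = np.array([15, 2, 4, 13], dtype=np.int64)

# Writing the entries with decimal points gives a float64 array
floats = np.array([15.0, 2.0, 4.0, 13.0])

# itemsize = bytes per entry, nbytes = bytes for the whole array
print(ints.dtype, ints.itemsize, ints.nbytes)
print(floats.dtype, floats.itemsize, floats.nbytes)
```

Each 64-bit entry takes 8 bytes, so a 4-entry array occupies 32 bytes of data.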

Reminder: You can use `dir(obj)` to get a list of all the things that belong to an object `obj`, and `help(obj)` or `help(obj.thing)` to show documentation for objects or the things that belong to them.

* An __attribute__ is any named piece of data that belongs to an object.
* A __method__ is a function that belongs to an object.

In [None]:
dir(a)
# help(a)
# help(a.dtype)
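To make the attribute/method distinction concrete, here is a short sketch using a few things that belong to every NumPy array. Which ones to show is just a choice for illustration; `dir(a)` lists many more.

```python
import numpy as np

a = np.array([15, 2, 4, 13])

# Attributes: data attached to the array, accessed WITHOUT parentheses
n_entries = a.size    # total number of entries
shape = a.shape       # tuple giving the length along each axis

# Methods: functions attached to the array, called WITH parentheses
total = a.sum()       # add up all entries
largest = a.max()     # largest entry

print(n_entries, shape, total, largest)
```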


What happens if we try to make an array using _different_ kinds of data?  NumPy will try to identify an appropriately general data type to store everything as:

In [None]:
np.array([1.1, 3.2, 'hi', 7])  # "U32" = Unicode string

This forces typecasting, and can lead to strange and unexpected behaviors!  If there's ever any uncertainty, you can choose the data type explicitly:

In [None]:
np.array([1.1, 3.2, '7'], dtype=np.int64)  # request 64-bit integers explicitly
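A milder and more common case of automatic type selection is mixing integers and decimals. The sketch below (with made-up variable names) shows NumPy picking a float type on its own, and two ways of choosing the type ourselves: the `dtype` argument when the array is created, and the `astype` method afterwards.

```python
import numpy as np

# Integers and a decimal together: NumPy picks a floating-point type
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)

# Ask for floats up front, even though every entry is an integer
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float)

# Convert an existing float array to integers; the decimal part is dropped
truncated = np.array([1.7, 2.2, -0.5]).astype(np.int64)
print(truncated)
```

Note that converting floats to integers this way truncates toward zero rather than rounding.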

## Vector math in NumPy

NumPy arrays deal with operations like vector addition or scalar multiplication naturally:

In [None]:
v1 = np.array([2, 1, 3])
v2 = np.array([0, -3, 1])
v1_plus_v2 = v1 + v2
three_times_v2 = 3 * v2
print(v1_plus_v2)
print(three_times_v2)

Things like the dot and cross products are available too!  You'll see them in the tutorial.  We can write out the force from above nice and simply using NumPy abstraction:

$$
\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}),
$$

becomes:

`F = q*(E + np.cross(v, B))`
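Here is that one-liner as a runnable sketch. The charge and electric field reuse the numbers from the earlier cell; the values of `v` and `B` are assumptions chosen only for illustration.

```python
import numpy as np

q = -1.6e-19                      # charge (same value as the earlier cell)
E = np.array([1.4, 1.4, 0.0])     # electric field components (same as earlier)
v = np.array([0.0, 0.0, 2.0e5])   # velocity: assumed value for illustration
B = np.array([0.0, 0.5, 0.0])     # magnetic field: assumed value for illustration

# One line replaces the three per-component formulas
F = q * (E + np.cross(v, B))
print(F)
```

All 10 numbers are still there, but they now live in 4 variables whose names match the symbols in the formula.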


## From vectors to arrays

We can use arrays to deal with ordinary vectors, but they are actually much more general and powerful!  There are lots of examples of structured data that are perfect for array storage, but far from what we usually think of as a vector in physics.

Can you think of some examples of information that would be useful to store as an _array_, that isn't a vector in the ordinary sense?

* __An image file__ is really just an array of numbers: for example, a grayscale 640x480 picture could be represented by an array of length $640 \times 480 = 307{,}200$, with each entry being an integer from 0 (black) to 255 (white).
* If we're running a mechanics experiment, and we measure the position of something every second for 30 seconds, __the experimental measurements__ can be written as a length-30 array of positions.
* __Your preferences on Netflix__ could be stored as a big array of ratings for every movie and TV show in their catalog: thumbs up (+1), thumbs down (-1), or not rated (0).  (And something like this array is a key input into what Netflix uses to make recommendations to you!)

In computing, it's common to refer to all of these as "vectors".

In fact, we can see this explicitly if we load some data, like an image!  (Don't worry about the details below; we'll meet the `matplotlib` module soon!)

In [None]:
from skimage import data
import matplotlib.pyplot as plt

f = data.camera()
plt.imshow(f, cmap='gray')
plt.axis('off')
plt.show()
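To see the "image as array" idea without needing any image files or extra packages, here is a minimal sketch that builds a tiny synthetic grayscale image by hand. The 4x6 size and the gray level 128 are arbitrary choices for this example.

```python
import numpy as np

# A tiny 4x6 grayscale "image": every pixel set to mid-gray (128 out of 255)
img = np.full((4, 6), 128, dtype=np.uint8)

print(img.shape, img.dtype)   # (rows, columns) and the per-pixel data type
print(img)

# The same pixels viewed as one long array of length 4*6
flat = img.reshape(-1)
print(flat.size)
```

The `uint8` dtype stores integers from 0 to 255, which matches the black-to-white range described above.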

When we generalize from spatial vectors to computing "vectors", new and different operations start to make sense.  The most important is a __vectorized function__.  The idea is simple: if we have a function $f(x)$ that acts on a _single element_ of our array, then the vectorized function acts on each element to make a new array:

$$
\begin{aligned}
v &= [v_0, v_1, \dots, v_n] \\
f(v) &= [f(v_0), f(v_1), \dots, f(v_n)]
\end{aligned}
$$

Again, because of the array structure, $f(v)$ can be done efficiently: loosely speaking, as single operations over huge binary numbers rather than one entry at a time.
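As a quick sketch of this definition, `np.sqrt` is one of NumPy's built-in vectorized functions. The cell below applies it to a whole array at once and checks that the result matches applying it to each element separately. The explicit loop is written only for comparison.

```python
import numpy as np

v = np.array([0.0, 1.0, 4.0, 9.0])

# Vectorized: one call acts on every element and returns a new array
roots = np.sqrt(v)

# The same thing done element by element, for comparison only
roots_loop = np.array([np.sqrt(x) for x in v])

print(roots)
print(np.array_equal(roots, roots_loop))
```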

As an example, suppose we want to convert a week of temperature measurements from Fahrenheit to Celsius.  The measurements are a natural length-7 array:


In [None]:
temps_F = np.array([55, 64, 81, 86, 61, 73, 52])

Now we apply the usual conversion function:
$$
T_C = \frac{5}{9} (T_F - 32)
$$

In [None]:
temps_C = 5/9 * (temps_F - 32)
print(temps_C)

to get the array in Celsius!  

Note that although this is perfectly correct, it also includes operations that are _undefined_ on traditional vectors (if you write $\vec{E} - 32$ on an E&M quiz, you'll definitely lose some points!). NumPy makes this legal by using __broadcasting__:

* First, it treats the scalar 32 as if it were an array of the same shape as `temps_F`:
  $$
    [55, 64, 81, \dots] - 32 \longrightarrow [55, 64, 81, \dots] - [32, 32, 32, \dots]
  $$
* Then it performs ordinary elementwise subtraction:
  $$
   [55, 64, 81, \dots] - [32, 32, 32, \dots] = [55-32, 64-32, 81-32, \dots]
  $$
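The two steps can be written out by hand as a sketch. Here `np.full` builds the array of 32s explicitly; this is only a way to picture what broadcasting does, and NumPy does not necessarily create such an array internally.

```python
import numpy as np

temps_F = np.array([55, 64, 81, 86, 61, 73, 52])

# Step 1 made explicit: stretch the scalar into an array of matching shape
thirty_twos = np.full(temps_F.shape, 32)

# Step 2: ordinary elementwise subtraction of two same-shape arrays
explicit = temps_F - thirty_twos

# Broadcasting: NumPy handles both steps for us
broadcast = temps_F - 32

print(explicit)
print(np.array_equal(explicit, broadcast))
```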

## Tutorial 5

Time for our next tutorial!