# Sub-Arrays in NumPy

<!-- <a href="https://arxiv.org/abs/1707.08563" target="_blank"><img src="img/exomoon_kepler_transit.gif" width=460px /></a> -->
<a href="https://arxiv.org/abs/1707.08563" target="_blank"><img src="https://raw.githubusercontent.com/wlough/CU-Phys2600-Fall2025/main/lectures/img/exomoon_kepler_transit.gif" width=460px /></a>


## PHYS 2600: Scientific Computing

## Lecture 10

# FAQ: any and all
In the tutorial last week we had several questions about `np.any` and `np.all`. Both of these functions take the results of a logical test applied to __every__ element of an array. Then they return __one Boolean value__ from the results of all those tests.
- `np.any` returns `True` if the test is passed for __at least one__ element in the array. This is equivalent to chaining the test on each element of the array with `or`.
- `np.all` returns `True` if the test is passed for __every__ element in the array. This is equivalent to chaining the test on each element of the array with `and`.

Let's look at an example of using this to look for negative numbers in arrays.

In [None]:
import numpy as np

ex_array = np.linspace(-3, 3, 7)
print(ex_array)
ex_array < 0

In `ex_array`, the first three values are negative. So when I do the test `ex_array < 0`, the first three entries are `True` and the rest are `False`.

What will be the result of `np.any` and `np.all` on `ex_array < 0`?

In [None]:
print(np.any(ex_array < 0))
print(np.all(ex_array < 0))
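
To make the "chaining with `or`/`and`" picture concrete, here is a small sketch that writes the chain out by hand for a three-element array and compares it with `np.any` and `np.all`. The names `x0`, `x1`, `x2`, `small`, `chained_or`, and `chained_and` are made up for this illustration.

```python
import numpy as np

x0, x1, x2 = -1, 2, 3
small = np.array([x0, x1, x2])

# The same test on each element, chained by hand
chained_or = (x0 < 0) or (x1 < 0) or (x2 < 0)
chained_and = (x0 < 0) and (x1 < 0) and (x2 < 0)

print(np.any(small < 0), chained_or)  # both True: at least one value is negative
print(np.all(small < 0), chained_and)  # both False: not every value is negative
```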

# FAQ: compact logical tests

For HW 5, I got questions about functions with conditional branching, like this:

In [None]:
def one_is_odd(x, y):  # Version of the function that enumerates every possible case
    if x % 2 != 0:  # x is odd
        if y % 2 != 0:  # y is odd
            return False  # Return False because both are odd
        else:  # y is even
            return True  # Return True because only x is odd
    else:  # x is even
        if y % 2 != 0:  # y is odd
            return True  # Return True because only y is odd
        else:  # y is even
            return False  # Return False because neither is odd


one_is_odd(3, 4)

This code is correct! But it's not very __compact__. Most programmers prefer compact code for several reasons:
- Laziness.
- It's easier to read.
- Writing compact code makes mistakes less likely and debugging easier.
- Compact code is more aesthetically appealing.

How could I make `one_is_odd` more compact? First, I can directly return the results of Boolean tests. Compare these two ways of writing the code:

In [None]:
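# the long way (wrapped in an illustrative function so the cell runs)
def is_odd_long(x):
    if x % 2 != 0:  # x is odd
        return True
    else:  # x is even
        return False


# the short way (also wrapped in an illustrative function)
def is_odd_short(x):
    return x % 2 != 0  # True if x is odd, False if x is even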

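Second, I can combine multiple tests into one line with `and`, `or`, `not`, or a comparison such as `!=`. Applied to two Boolean tests, `!=` is `True` exactly when one test passes and the other fails.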


In [None]:
def one_is_odd(x, y):
    return (x % 2 != 0) != (y % 2 != 0)  # one-line function!


one_is_odd(3, 4)
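
One way to gain confidence in a compact rewrite is to check it against the long version on every case. The sketch below does that for the four odd/even combinations; the names `one_is_odd_long`, `one_is_odd_short`, and `test_pairs` are chosen here just for the comparison.

```python
def one_is_odd_long(x, y):  # enumerates every possible case
    if x % 2 != 0:
        if y % 2 != 0:
            return False
        else:
            return True
    else:
        if y % 2 != 0:
            return True
        else:
            return False


def one_is_odd_short(x, y):  # compact one-line version
    return (x % 2 != 0) != (y % 2 != 0)


# One representative pair for each combination: odd/odd, odd/even, even/odd, even/even
test_pairs = [(3, 5), (3, 4), (2, 5), (2, 4)]
for x, y in test_pairs:
    print(x, y, one_is_odd_long(x, y), one_is_odd_short(x, y))
```

If the last two columns agree on every row, the two versions behave the same way on these cases.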

## Array indexing

So far, we've only done operations on _whole_ arrays.  But sometimes, we want to work with just part of an array.  Let's start with getting _individual values_ out.

Each entry in an array is labeled by its __index__ - its position counting from left to right (or right to left with negative indices):

<!-- <img src="img/array-index-forward-backward.png" width=400px /> -->
<img src="https://raw.githubusercontent.com/wlough/CU-Phys2600-Fall2025/main/lectures/img/array-index-forward-backward.png" width=400px />

Python follows a __zero-based__ index convention: the first entry has index 0.  (There are arguments about whether starting at 0 or 1 is better; starting at 0 is the more common convention, inherited from languages like C.)

Getting an array element (called "__indexing__") is done with square brackets `[]`:

In [None]:
import numpy as np

a = np.array([1, 2, 4, 8, 16])
a[3]

What will happen if I try to access `a[5]`?

In [None]:
a[5]

Trying to get an array element that doesn't exist results in an `IndexError`. (With 5 entries, the index runs from 0 to 4.)

If we're passed an array and we don't know how long it is, we can use the `len()` built-in function to find its length:

In [None]:
len(a)  # Valid indices run from 0 to len(a) - 1.
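
The negative indices mentioned earlier count from the right, so they can be related to `len()`. Here is a short sketch of that relationship; the name `n` is introduced only for this example.

```python
import numpy as np

a = np.array([1, 2, 4, 8, 16])
n = len(a)

print(a[-1], a[n - 1])  # last element, written two ways
print(a[-2], a[n - 2])  # second-to-last element, written two ways
print(a[-n], a[0])  # first element, written two ways
```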

In addition to reading individual entries in an array, we can also _assign_ to them using the same notation:

In [None]:
a = np.arange(5)
print(f"{a=}")
print(f"{a.dtype=}")
a[2] = -7.4
print(f"{a=}")  # Notice the automatic typecast!
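
The typecast in the cell above depends on the array's type. As a sketch of the contrast, the example below makes the same assignment into an integer array and into a float array; the names `int_array` and `float_array` are just for illustration.

```python
import numpy as np

int_array = np.arange(5)  # integer dtype
int_array[2] = -7.4  # the fractional part is dropped to fit the integer type
print(f"{int_array=}")

float_array = np.arange(5).astype(float)  # same values, stored as floats
float_array[2] = -7.4  # stored without truncation
print(f"{float_array=}")
```

If you expect to store non-integer values, it is safer to start from a float array.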

Remember, our mental picture of an array is that the name `a` _points_ to a contiguous chunk of memory that we reserved when we created it:

<!-- <img src="img/lva-array.png" /> -->
<img src="https://raw.githubusercontent.com/wlough/CU-Phys2600-Fall2025/main/lectures/img/lva-array.png" />

So it's only natural that we can overwrite some part of this memory (subject to the constraint that everything in the array is the _same type_, which is why `-7.4` became `-7` in an integer array).

This picture brings us to a little warning about an effect called __mutation__, which can lead to nasty surprises if you forget about it!  

In Python, there is an important distinction between the _name_ `a`, and the _values_ in the array that it points to.  

What does this imply if we assign `b = a`? 

In [None]:
a = np.arange(5)
print(f"{a=}")
b = a
b += 2
a[4] = 99
print(f"{a=}, {b=}")

<!-- <img src="img/array-alias.png" style="float:left;margin:10px;"/> -->
<img src="https://raw.githubusercontent.com/wlough/CU-Phys2600-Fall2025/main/lectures/img/array-alias.png" style="float:left;margin:10px;"/>

By assigning `b = a`, we made __two names point to the same array__.  Any change made through either name is then reflected in both `a` and `b`!


Because of the way names work in Python, it's important to be on your guard for situations where two names point to the same thing.  When in doubt, explicitly _make a new object_ before you make changes.

In [None]:
a = np.arange(5)
print(f"{a=}")
b = a.copy()  # calling the Numpy array's copy() method
c = np.copy(a)  # using the numpy.copy function
d = 1 * a  # multiplying by 1
a[4] = 99
print(f"{a=}")
print(f"{b=}, {c=}, {d=}")
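
If you are unsure whether two names refer to the same array, you can ask Python directly. A minimal sketch, using the `is` operator and NumPy's `np.shares_memory` function (the names `alias` and `independent` are made up for this example):

```python
import numpy as np

a = np.arange(5)
alias = a  # a second name for the same array
independent = a.copy()  # a new array with its own memory

print(alias is a)  # True: both names point to the same object
print(independent is a)  # False: a different object
print(np.shares_memory(a, alias))  # True
print(np.shares_memory(a, independent))  # False
```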

## Extending and combining arrays

Another natural consequence of the idea that `np.array()` creates a contiguous chunk of memory for the array is that we can't make an array longer by assigning to values past the end:

In [None]:
a = np.arange(5)
print(f"{a=}, a[4]={a[4]}")
a[5] = 5

This index error is the same thing we got for attempting to read `a[5]`.  

So how can we make an array longer?

The short answer is __we can never make an array longer!__  When an array is stored, the computer gives us _just enough_ memory and is free to give the _next_ chunk of memory to something else that asks for it.

So, "enlarging" an array (or shortening it) always means _making a new array with a new chunk of memory._  For example, `np.append()` will combine two arrays end to end:

In [None]:
print(f"{a=}")
c = np.append(a, np.arange(3, 10))
print(f"{c=}")

Since `c` is a brand-new array, we can modify it without changing `a`:

In [None]:
c[0] = -4
print(f"{a=}, {c=}")

If we really want to "change" an array, we can perform the simple trick of overwriting the _name_:

In [None]:
print(f"{b=}")
b = np.delete(b, 2)  # Remove the value at position 2
print(f"{b=}")
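
To see why overwriting the name is needed, here is a sketch in which the results of `np.append()` and `np.delete()` are stored under new names instead. The names `original`, `longer`, and `shorter` are chosen for this illustration.

```python
import numpy as np

original = np.arange(5)
longer = np.append(original, 5)  # new array with one extra entry at the end
shorter = np.delete(original, 2)  # new array without the value at position 2

print(f"{original=}")  # unchanged by either call
print(f"{longer=}")
print(f"{shorter=}")
```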

There are a [lot of NumPy routines for array manipulation](https://docs.scipy.org/doc/numpy-1.15.0/reference/routines.array-manipulation.html); again, it's better to look them up when you need them than to try to memorize the list.  (Most of them are used on multi-dimensional arrays, which we haven't met yet.)

## Slicing

Since `numpy` arrays hold so much data, we often need to access and manipulate a smaller sub-array.  The __slice__ notation makes this easy and intuitive!

Slice notation takes the form `a[i:j:k]`.  The numbers work the same way as in the `range()` and `np.arange()` functions: 
- `i` is the index of the first element selected
- `j` is the stop index; the element `a[j]` itself is _not_ included
- `k` is the increment

In other words, we start at `a[i]` and count by `k`, stopping before we reach `a[j]`.  When `k` is 1, this selects all elements from `a[i]` to `a[j-1]`.



In [None]:
print(f"{a=}")
print(f"{a[0:4:2]=}")
print(f"{a[2:5:1]=}")
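
The connection to `np.arange()` can be checked directly: the indices picked out by a slice are the values that `np.arange()` produces for the same three numbers. A small sketch, with example values of `i`, `j`, `k` chosen arbitrarily:

```python
import numpy as np

a = np.arange(10, 20)  # values 10 through 19
i, j, k = 1, 8, 3

indices = np.arange(i, j, k)  # the indices 1, 4, 7
print(f"{indices=}")
print(f"{a[i:j:k]=}")  # picks out a[1], a[4], a[7]
print(a[1], a[4], a[7])
```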

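For convenience, there are defaults for all three parts `i,j,k` of the slice, so we can skip some of them.  For an array of length `N` and a positive increment:
- `i` defaults to `0` (the start of the array)
- `j` defaults to `N` (the end of the array)
- `k` defaults to 1

(When `k` is negative, the defaults for `i` and `j` swap ends, so the slice starts at the last element and runs back through the first.)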

Here are some common usage patterns:

* `a[i:j]`: all elements from `a[i]` to `a[j-1]` (counting by 1.)
* `a[i:]`: all elements from `a[i]` to the end of the array.
* `a[:j]`: all elements from `a[0]` to `a[j-1]`.
* `a[::k]`: all elements in the array, but counting by `k`.


The last one most often appears as a tricky way of getting an array _backwards_, as `a[::-1]`.

In [None]:
print(f"{a[3:]=}, {a[:3]=}")
print(f"{a[::-1]=}")  # negative values of k let us count backwards
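
Negative indices also work inside slices, which gives short ways to write "the last few" or "all but the last". The sketch below shows a few such patterns on an example array chosen for this illustration.

```python
import numpy as np

a = np.array([1, 2, 4, 8, 16])

print(f"{a[-2:]=}")  # the last two elements
print(f"{a[:-1]=}")  # everything except the last element
print(f"{a[3:0:-1]=}")  # from a[3] backwards, stopping before a[0]
```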

Note that a slice of a NumPy array is _not_ a copy: __it points to part of the original array__.  This means we can cause mutation through slices, which is sometimes useful, but you should be aware of it!
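
Here is a minimal sketch of mutation through a slice, along with the `copy()` method as a way to avoid it. The names `first_three` and `safe` are introduced only for this example.

```python
import numpy as np

a = np.arange(5)
first_three = a[:3]  # a slice: points into the memory of a
first_three[0] = 99  # this also changes a[0]
print(f"{a=}")

safe = a[:3].copy()  # an independent copy of the slice
safe[1] = -1  # a is not affected
print(f"{a=}, {safe=}")
```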

## Generalized indexing 

We can take indexing to another level of abstraction!

"_Generalized indexing_" or "_advanced indexing_" lets us do even more general operations than slicing.  It lets us get sub-arrays by passing an __array of indices__:

In [None]:
a = np.array([17, -3, -44, 8, 56])
some_elements = np.array([0, 1, 4])
print(f"{a[some_elements]=}")

Or we can pass a __mask__, which is an array of Boolean values (of the same length as our array) that tells NumPy which values we want to select:

In [None]:
my_mask = np.array([True, True, False, False, True])
print(f"{a[my_mask]=}")

We can create masks by hand, but they naturally show up if we run some sort of Boolean test on our array:

In [None]:
a = np.array([17, -3, -44, 8, 56])
pos_mask = a > 0
print(f"{pos_mask=}")
print(f"{a[a > 0]=}")
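
The two kinds of generalized indexing are closely related: a mask can be converted to an array of indices. As a sketch, the example below uses NumPy's `np.flatnonzero` function, which returns the positions where the mask is `True`; the name `pos_indices` is made up for this illustration.

```python
import numpy as np

a = np.array([17, -3, -44, 8, 56])
pos_mask = a > 0  # Boolean mask
pos_indices = np.flatnonzero(pos_mask)  # indices where the mask is True

print(f"{pos_indices=}")
print(f"{a[pos_mask]=}")
print(f"{a[pos_indices]=}")  # same selection as with the mask
```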

You can think of a mask in the literal sense as an array of opaque and transparent cells, which we just overlay onto our initial array:

<!-- <img src="img/mask.png" width=400px /> -->
<img src="https://raw.githubusercontent.com/wlough/CU-Phys2600-Fall2025/main/lectures/img/mask.png" width=400px />

This is great for _selecting_ a subset of our data, but a more powerful application involves __running the arrows backwards__: using the mask to _modify_ a subset of our data.  For example, we could zero out all of the positive entries in `a` using our mask:

In [None]:
a[pos_mask] = 0
print(f"{a=}")

<!-- <img src="img/mask-reverse.png" width=400px /> -->
<img src="https://raw.githubusercontent.com/wlough/CU-Phys2600-Fall2025/main/lectures/img/mask-reverse.png" width=400px />

By the way, __a mask is a new (Boolean) array__, which means that it doesn't automatically change when we change the original array!

In [None]:
print(f"{a=}, {pos_mask=}")
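
If you need a mask that matches the current contents of the array, rerun the test. A short sketch of the difference between the old mask and a fresh one (the name `new_mask` is introduced just for this example):

```python
import numpy as np

a = np.array([17, -3, -44, 8, 56])
pos_mask = a > 0
a[pos_mask] = 0  # zero out the positive entries

print(f"{pos_mask=}")  # still marks where the positive entries used to be
new_mask = a > 0  # rerun the test on the modified array
print(f"{new_mask=}")  # all False: no positive entries remain
```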

## Tutorial 10

Go ahead and open `tut10`.