<a href="https://csdms.colorado.edu"><img style="float: center; width: 75%" src="https://raw.githubusercontent.com/csdms/ivy/main/media/logo.png"></a>

# Arrays

Arrays are containers that store multiple values of the same type.
Values in an array are stored contiguously in memory,
which makes operations on arrays fast.
Arrays can be multidimensional.

In this lesson,
we'll explore how to work with arrays using topographic data
in the form of a digital elevation model.

We'll use a library called NumPy.

## NumPy

NumPy is a fundamental Python package for scientific computing.
NumPy's core data structure is the *n-dimensional array*, or *ndarray*, a high-performance multidimensional array object built for efficient computation.
Use NumPy arrays if you want to do numerical computation on a large number of data values.

Import NumPy with its common nickname, `np`.

In [None]:
import numpy as np
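
As a minimal sketch of what "same type" and "contiguous" mean in practice, the cell below builds a small array and inspects it. The elevation values are made up for illustration and are not taken from the lesson's data file.

```python
import numpy as np

# Hypothetical elevations in meters; not part of the lesson's data file.
elevations = np.array([1520.0, 1534.5, 1498.2, 1601.7])

# Every element shares a single data type.
print(elevations.dtype)

# The values sit in one contiguous block of memory.
print(elevations.flags["C_CONTIGUOUS"])

# Arithmetic is applied to the whole array at once, without a Python loop.
relief = elevations - elevations.min()
print(relief)
```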

## Exploring topographic data

Once we’ve loaded the library, we can
call a function inside that library to read a data file included in the Ivy coursefiles:

In [None]:
np.loadtxt("data/topo.asc", delimiter=",")

The expression `np.loadtxt(...)` is a function call
that asks Python to run the function `loadtxt` that belongs to the NumPy library, which we imported as `np`.
This dotted notation, with the syntax `thing.component`, is used
everywhere in Python to refer to parts of things.

The function call to `np.loadtxt` has two arguments:
the name of the file we want to read,
and the delimiter that separates values on a line.
Both need to be character strings (or strings, for short)
so we write them in quotes.
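
To see the role of the delimiter without needing the data file, here is a small sketch that reads comma-separated values from an in-memory text buffer. The text and the name `grid` are hypothetical; `np.loadtxt` accepts a file-like object as well as a file name, which is what makes this possible.

```python
import io

import numpy as np

# Hypothetical comma-separated text standing in for a file on disk.
text = "1.0,2.0,3.0\n4.0,5.0,6.0\n"

# The delimiter tells loadtxt how values on each line are separated.
grid = np.loadtxt(io.StringIO(text), delimiter=",")

print(grid)
print(grid.shape)
```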

Within the Jupyter Notebook, pressing Shift+Enter runs the
commands in the selected cell. Because we haven't told IPython what to
do with the output of `np.loadtxt`, the notebook just displays it on
the screen. In this case, that output is the data we just loaded. By
default, only a few rows and columns are shown (with `...` to omit
elements when displaying big arrays).

Our call to `np.loadtxt` read the file but didn’t save it to memory.
In order to access the data, we need to assign the values to a variable.
A variable is just a name that refers to an object. Python’s variable
names must begin with a letter or an underscore and are case sensitive.
We can assign a variable name to an object using `=`.

## Objects and their names

What happens when a function is called but the output is not assigned to
a variable is a bit more complicated than simply not saving it. The call
to `np.loadtxt` read the file and created an object in memory that
contains the data, but because we didn't assign it to a variable name,
there is no way for us to refer to this object.

Let’s re-run `np.loadtxt` and assign the output to a variable name:

In [None]:
topo = np.loadtxt("data/topo.asc", delimiter=",")

This command doesn’t produce any visible output. If we want to see the
data, we can print the variable’s value with the command `print`:

In [None]:
print(topo)

Using its variable name, we can see what [type](reference.html#type) of object the variable refers to:

In [None]:
type(topo)

The function `type` tells us that the variable name `topo` currently
points to an N-dimensional array created by the NumPy library. We can also get the shape of the
array:

In [None]:
topo.shape

This tells us that `topo` has 500 rows and 500 columns. The file
we imported contains elevation data (in meters, at 2 meter spacing) for an
area along the Front Range of Colorado, so the area that this array represents is 1 km x 1 km.

The object of type `numpy.ndarray` that the variable `topo` is assigned to
contains the values of the array as well as some extra information about the
array. These are the members or attributes of the object, and they
describe the data in the same way an adjective describes a noun. The
command `topo.shape` reads the `shape` attribute of the object with the variable name
`topo`, which describes its dimensions. We use the same dotted notation
for the attributes of objects that we use for the functions inside
libraries because they have the same part-and-whole relationship.
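
As a short sketch, `shape` is only one of several attributes an array carries. The example below uses a hypothetical 3 x 4 array named `grid` in place of `topo`, so it runs without the data file.

```python
import numpy as np

# Hypothetical 3 x 4 grid standing in for `topo`.
grid = np.zeros((3, 4))

print(grid.shape)  # number of rows and columns
print(grid.ndim)   # number of axes
print(grid.size)   # total number of elements
print(grid.dtype)  # data type shared by every element
```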

## Who's who in the memory

You can use the `whos` command at any time to see what variables you have created and what modules you have loaded into the computer's memory. Because this is an IPython command, it only works in an IPython terminal or a Jupyter Notebook.

Try it to check what is currently in memory:

In [None]:
whos

## Indexing

We can access individual values in an array using an [index](reference.html#index) in square brackets:

In [None]:
print("elevation at the corner of topo:", topo[0, 0], "meters")

In [None]:
print("elevation at a point in topo:", topo[137, 65], "meters")

When referring to entries in a two-dimensional array, the indices are
ordered `[row, column]`. The expression `topo[137, 65]` should not surprise you, but `topo[0, 0]` might. Programming languages like Fortran and MATLAB
start counting at 1 because that’s what (most) humans have done for
thousands of years. Languages in the C family (including C++, Java,
Perl, and Python) count from 0 because that’s simpler for computers to
do. So if we have an M×N array in Python, the indices go from 0 to M-1
on the first axis (rows) and 0 to N-1 on the second (columns). In
MATLAB, the same array (or matrix) would have indices that go from 1 to
M and 1 to N. Zero-based indexing takes a bit of getting used to, but
one way to remember the rule is that the index is how many steps we have
to take from the start to get to the item we want.
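
The counting rule can be checked on a small array. In this sketch the array `grid` is hypothetical, and its values are chosen so that the entry at `[row, column]` equals `10 * row + column`, which makes each position easy to read off.

```python
import numpy as np

# Hypothetical 3 x 4 array: the value at [row, column] is 10 * row + column.
grid = np.array([[0, 1, 2, 3],
                 [10, 11, 12, 13],
                 [20, 21, 22, 23]])

n_rows, n_cols = grid.shape

print(grid[0, 0])                    # first row, first column
print(grid[1, 2])                    # second row, third column
print(grid[n_rows - 1, n_cols - 1])  # last row, last column
```

Asking for `grid[n_rows, n_cols]` would raise an `IndexError`, because the largest valid index on each axis is one less than the axis length.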

Python also allows for **negative indices** to refer to the position of
elements with respect to the end of each axis. An index of -1 refers to
the last item along an axis, -2 to the second to last, and so on. Since
index `[0, 0]` is the upper left corner of an array, index `[-1, -1]`
is therefore the lower right corner of the array.

Print the lower right corner of the `topo` array: 

In [None]:
print(topo[-1, -1])

Print the upper left corner of the `topo` array: 

In [None]:
print(topo[0, 0])
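
A negative index `-k` points at the same element as the positive index `length - k` on that axis. The sketch below checks this on a hypothetical 3 x 4 array named `grid`.

```python
import numpy as np

# Hypothetical 3 x 4 array holding the integers 0 through 11.
grid = np.arange(12).reshape(3, 4)

n_rows, n_cols = grid.shape

# The last row and last column, written both ways.
print(grid[-1, -1], grid[n_rows - 1, n_cols - 1])

# The second-to-last row, first column, written both ways.
print(grid[-2, 0], grid[n_rows - 2, 0])
```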

> ## In the Corner
>
> What may also surprise you is that when Python displays an array,
> it shows the element with index `[0, 0]` in the upper left corner
> rather than the lower left.
> This is consistent with the way mathematicians draw matrices,
> but different from Cartesian coordinates.
> The indices are (row, column) instead of (column, row) for the same reason,
> which can be confusing when plotting data.

## Slicing

A command like `topo[0, 0]` selects a single element in the array `topo`.
Indices can also be used to [slice](reference.html#slice) sections of the array. For example, we
can select the 5 x 5 block in the top left corner of the array like this:

In [None]:
print(topo[0:5, 0:5])

The slice `[0:5]` means "Start at index 0 and go along the axis up to,
but not including, index 5".

We don’t need to include the upper or lower bound of the slice if we
want to go all the way to the edge. If we don’t include the lower bound,
Python uses 0 by default; if we don’t include the upper bound, the slice
runs to the end of the axis. If we don’t include either (i.e., if we
just use ‘:’), the slice includes everything. 

Print out the first 5 rows and last 6 columns of the `topo` array:

In [None]:
print(topo[:5, -6:])
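
The effect of leaving out slice bounds is easiest to see from the shapes of the results. This sketch uses a hypothetical 4 x 5 array named `grid` rather than `topo`.

```python
import numpy as np

# Hypothetical 4 x 5 array holding the integers 0 through 19.
grid = np.arange(20).reshape(4, 5)

top_left = grid[0:2, 0:3]  # rows 0 and 1, columns 0 through 2
first_rows = grid[:2, :]   # lower bound omitted: start at index 0
last_cols = grid[:, -2:]   # upper bound omitted: run to the end of the axis
everything = grid[:, :]    # both bounds omitted: the whole axis

print(top_left.shape)
print(first_rows.shape)
print(last_cols.shape)
print(everything.shape)
```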

## Point elevations: Practice your skills

Use indexing to answer the following questions and check your answers
against the data visualization:

* Is the NW corner of the region higher than the SW corner? What's the elevation difference? Assume the NW corner is the upper left corner of the array, at index `[0, 0]`, rather than the Cartesian NW (see "In the Corner" above).
* What's the elevation difference between the NE corner and the SE corner?
* What's the elevation at the center of the region shown in the array?

In [None]:
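# NW corner minus SW corner (first column: first row vs. last row)
print(topo[0, 0] - topo[-1, 0])
# NE corner minus SE corner (last column: first row vs. last row)
print(topo[0, -1] - topo[-1, -1])
# Elevation at the center of the array
print(topo[topo.shape[0] // 2, topo.shape[1] // 2])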

## Numerical operations on arrays

We can perform basic mathematical operations on each individual element of a NumPy array. We can create a new array with elevations in feet:

In [None]:
topo_in_feet = topo * 3.2808
print("Elevation in meters:", topo[0, 0])
print("Elevation in feet:", topo_in_feet[0, 0])

Arrays of the same size can be used together in arithmetic operations:

In [None]:
double_topo = topo + topo
print("Double topo:", double_topo[0, 0], "meters")
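
Both kinds of arithmetic work element by element. The sketch below shows this on a hypothetical 2 x 2 array named `meters`, small enough to check by hand.

```python
import numpy as np

# Hypothetical 2 x 2 grid of elevations in meters.
meters = np.array([[100.0, 200.0],
                   [300.0, 400.0]])

# A scalar is applied to every element.
feet = meters * 3.2808

# Two arrays of the same shape are combined element by element.
doubled = meters + meters

print(feet)
print(doubled)
print(np.array_equal(doubled, 2 * meters))
```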

We can also perform statistical operations on arrays:

In [None]:
print("Mean elevation:", topo.mean(), "meters")

> ## Methods vs. attributes
>
> `mean` is a method that belongs to the array `topo`, i.e., it is a
> function `topo` can inherently call just because of its type.
> When we call `topo.mean()`, we are asking `topo` to calculate its mean
> value. Because it is a function, we need to include parentheses in the
> command. Because it is a `numpy.ndarray`, `topo` also has an attribute called `shape`, but it doesn't take parentheses because
> `shape` is a stored value (a tuple), not a function.

If we leave off the parentheses, Python does not raise an error; it displays the method itself instead of calling it:

In [None]:
topo.mean
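
The difference between the two can be checked directly with Python's built-in `callable` function. This is a small sketch on a hypothetical array named `grid`; it also shows what happens in the opposite mix-up, when parentheses are added to an attribute.

```python
import numpy as np

# Hypothetical 2 x 3 array of ones.
grid = np.ones((2, 3))

print(callable(grid.mean))   # a method can be called
print(callable(grid.shape))  # the shape tuple cannot

# Adding parentheses to an attribute raises a TypeError.
try:
    grid.shape()
except TypeError as error:
    print("TypeError:", error)
```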

NumPy arrays have many other useful methods. Print the minimum and maximum elevation of the `topo` dataset:

In [None]:
print("Highest elevation:", topo.max(), "meters")
print("Lowest elevation:", topo.min(), "meters")

We can also call methods on slices of the array:

In [None]:
half_len = topo.shape[0] // 2  # integer division gives a valid index

print("Highest elevation of NW quarter:", topo[:half_len, :half_len].max(), "meters")

print("Highest elevation of SE quarter:", topo[half_len:, half_len:].max(), "meters")

Methods can also be used along individual axes (rows or columns) of an
array. If we want to see how the mean elevation changes with longitude
(E-W), we can use the method along `axis=0`:

In [None]:
print(topo.mean(axis=0))

To see how the mean elevation changes with latitude (N-S), we can use
`axis=1`:

In [None]:
print(topo.mean(axis=1))
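
The `axis` argument names the axis that is collapsed, which is easy to confuse. Here is a minimal sketch on a hypothetical 2 x 3 array named `grid`, small enough to verify by hand.

```python
import numpy as np

# Hypothetical array with 2 rows and 3 columns.
grid = np.array([[1.0, 2.0, 3.0],
                 [5.0, 6.0, 7.0]])

# axis=0 averages down the rows, leaving one value per column.
col_means = grid.mean(axis=0)

# axis=1 averages across the columns, leaving one value per row.
row_means = grid.mean(axis=1)

print(col_means)  # one entry for each of the 3 columns
print(row_means)  # one entry for each of the 2 rows
```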