While a lot of powerful tools are built into languages like Python, even more tools exist in libraries.

In order to load our elevation data, we need to import a library called NumPy. You should use this library whenever you want to do numerical work in Python, especially if you have matrices or arrays. We can load NumPy using:

In [1]:
import numpy

Importing a library is like getting a piece of lab equipment out of a storage locker and setting it up on the bench. Libraries provide additional functionality to Python, much like a new piece of equipment adds functionality to a lab space. Once we’ve loaded the library, we can use a tool inside that library to read the data file:

In [2]:
numpy.loadtxt('topo.asc', delimiter=',')

array([[ 3198.8391,  3198.123 ,  3197.1584, ...,  2583.3293,  2585.4368,
         2589.1079],
       [ 3198.3306,  3197.5242,  3196.4102, ...,  2582.6992,  2584.9167,
         2587.801 ],
       [ 3197.9968,  3196.9197,  3195.7188, ...,  2581.8328,  2583.8159,
         2586.0325],
       ..., 
       [ 3325.1509,  3334.7822,  3343.3154, ...,  2780.8191,  2769.3235,
         2762.373 ],
       [ 3325.0823,  3335.0308,  3345.4963, ...,  2775.3345,  2765.7131,
         2759.6555],
       [ 3326.6824,  3336.5305,  3348.1343, ...,  2769.7661,  2762.5242,
         2756.6877]])

The expression `numpy.loadtxt(...)` is a function call. It asks Python to run the function `loadtxt` that exists within the library `numpy`. This dotted notation, with the syntax `thing.component`, is used everywhere in Python to refer to parts of things.

The function call to `numpy.loadtxt` has two parameters: the name of the file we want to read and the delimiter that separates values on a line. Both need to be character strings (or strings for short) so we write them in quotes.
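
To try the two parameters without needing `topo.asc`, here is a minimal sketch that reads a few comma-separated numbers from an in-memory text buffer. The buffer and the names `fake_file` and `small` are made up for illustration; `numpy.loadtxt` accepts a file-like object in place of a file name. The example uses `print(...)` with a single argument so that it runs under both Python 2 and Python 3.

```python
import io
import numpy

# Hypothetical stand-in for a small comma-delimited data file.
fake_file = io.StringIO(u"1.0,2.0,3.0\n4.0,5.0,6.0\n")

# Same call pattern as above: something to read, plus the delimiter string.
small = numpy.loadtxt(fake_file, delimiter=',')

print(small)
```

Changing the delimiter to something that does not match the data (for example `';'`) should make the call fail, which is a quick way to see what the parameter does.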

Within the Jupyter Notebook, pressing Shift+Enter runs the commands in the selected cell. Because we haven't told IPython what to do with the output of `numpy.loadtxt`, the notebook just displays it on the screen. In this case, that output is the data we just loaded. By default, only a few rows and columns are shown (with `...` to omit elements when displaying big arrays).

Our call to `numpy.loadtxt` read the file but didn’t save it to memory. In order to access the data, we need to assign the values to a variable. A variable is just a name that refers to an object. Python’s variable names must begin with a letter or an underscore and are case sensitive. We can assign a variable name to an object using `=`.

## Naming objects {.callout}

What happens when a function is called but the output is not assigned to a variable is a bit more complicated than simply not saving it. The call to `numpy.loadtxt` read the file and created an object in memory that contains the data, but because we didn't assign it to a variable name, there is no way for us to refer to this object. While this difference might seem irrelevant (and, in practice, it is!), it will be important to consider how variable names are assigned to objects when we talk about mutable and immutable objects later on.

A good explanation of how Python handles variables and objects can be found here: https://jeffknupp.com/blog/2012/11/13/is-python-callbyvalue-or-callbyreference-neither/
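
The idea that a variable is a name attached to an object can be sketched with a tiny array. The names `first` and `second` are made up for this illustration.

```python
import numpy

# A small made-up array; the name `first` refers to it.
first = numpy.array([1.0, 2.0, 3.0])

# Assignment attaches a second name to the same object; nothing is copied.
second = first

print(second is first)   # True: both names refer to one object

# Calling a function without assigning the result: the array is created,
# but none of our variable names refers to it.
numpy.array([4.0, 5.0, 6.0])
```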

Let’s re-run `numpy.loadtxt` and assign the output to a variable name:

In [3]:
topo = numpy.loadtxt('topo.asc', delimiter=',')

This command doesn’t produce any visible output. If we want to see the data, we can print the variable’s value with the command `print`:

In [4]:
print topo

[[ 3198.8391  3198.123   3197.1584 ...,  2583.3293  2585.4368  2589.1079]
 [ 3198.3306  3197.5242  3196.4102 ...,  2582.6992  2584.9167  2587.801 ]
 [ 3197.9968  3196.9197  3195.7188 ...,  2581.8328  2583.8159  2586.0325]
 ..., 
 [ 3325.1509  3334.7822  3343.3154 ...,  2780.8191  2769.3235  2762.373 ]
 [ 3325.0823  3335.0308  3345.4963 ...,  2775.3345  2765.7131  2759.6555]
 [ 3326.6824  3336.5305  3348.1343 ...,  2769.7661  2762.5242  2756.6877]]


Using its variable name, we can ask what type of object `topo` refers to:

In [5]:
print type(topo)

<type 'numpy.ndarray'>


The function `type` tells us that the variable name `topo` currently refers to an N-dimensional array created by the NumPy library. The file we imported contains elevation data (in meters, 2 degree spacing) for an area along the Front Range of Colorado. We can get the shape of the array:

In [6]:
print topo.shape

(500, 500)


This tells us that `topo` has 500 rows and 500 columns. The object of type `numpy.ndarray` that the variable `topo` refers to has some information associated with it called attributes. This extra information describes the data in the same way an adjective describes a noun. The command `topo.shape` accesses the `shape` attribute of the object assigned to `topo`, which describes its dimensions. We use the same dotted notation for the attributes of objects that we use for the functions inside libraries because they have the same part-and-whole relationship.
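
`shape` is only one of several attributes an array carries. As a small sketch, the array `grid` below is a made-up 3×4 stand-in for `topo`; the attributes shown (`shape`, `ndim`, `size`, `dtype`) are standard NumPy array attributes.

```python
import numpy

# Hypothetical 3-row, 4-column array standing in for `topo`.
grid = numpy.zeros((3, 4))

print(grid.shape)   # (3, 4): rows, columns
print(grid.ndim)    # 2: number of axes
print(grid.size)    # 12: total number of elements
print(grid.dtype)   # float64: type of each element
```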

## Who's who in the memory {.callout}

You can use the `whos` command at any time to see what variables you have created and what modules you have loaded into the computer's memory. As this is an IPython command, it will only work if you are in an IPython terminal or the Jupyter Notebook.

In [7]:
whos

Variable   Type       Data/Info
-------------------------------
numpy      module     <module 'numpy' from '/Us<...>ages/numpy/__init__.pyc'>
topo       ndarray    500x500: 250000 elems, type `float64`, 2000000 bytes (1 Mb)


## Indexing

We can access individual values in an array by providing an index in square brackets:

In [8]:
print 'elevation at the corner of topo:', topo[0,0], 'meters'

elevation at the corner of topo: 3198.8391 meters


In [9]:
print 'elevation at some random spot in topo:', topo[137,65], 'meters'

elevation at some random spot in topo: 3251.1179 meters


When referring to values in a two-dimensional array, the indices are ordered `[row,column]`. The expression `topo[137, 65]` may not surprise you but `topo[0,0]` might. Programming languages like Fortran and MATLAB start counting at 1 because that’s what (most) humans have done for thousands of years. Languages in the C family (including C++, Java, Perl, and Python) count from 0 because that’s simpler for computers to do. So if we have an M×N array in Python, the indices go from 0 to M-1 on the first axis (rows) and 0 to N-1 on the second (columns). In MATLAB, the same array (or matrix) would have indices that go from 1 to M and 1 to N. Zero-based indexing takes a bit of getting used to, but one way to remember the rule is that the index is how many steps we have to take from the start to get to the item we want.

Python also allows for negative indices to refer to the position of elements with respect to the end of each axis. Since index `[0,0]` is the upper left corner of an array, index `[-1,-1]` is therefore the lower right corner of the array.

In [18]:
print topo[-1,-1]

2756.6877
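
The relationship between zero-based and negative indices is easier to see on an array small enough to read in full. This sketch uses a made-up 3×4 array named `grid`, not the lesson's data.

```python
import numpy

# Hypothetical 3x4 array holding the values 0..11, filled row by row.
grid = numpy.arange(12).reshape(3, 4)

rows, cols = grid.shape

print(grid[0, 0])                 # 0: upper left corner
print(grid[rows - 1, cols - 1])   # 11: lower right corner, counting from the start
print(grid[-1, -1])               # 11: the same element, counting from the end
```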


## In the Corner {.callout}

It may also surprise you that Python displays an array with the element with index [0, 0] in the upper left corner rather than the lower left. This is consistent with the way mathematicians draw matrices but different from Cartesian coordinates. The indices are (row, column) instead of (column, row) for the same reason, which can be confusing when plotting data.

## Slicing

A command like `topo[137,65]` selects a single element in the array `topo`. Indices can also be used to select sections of an array. For example, we can select the 5×5 block in the top left corner of the array like this:

In [12]:
print topo[0:5, 0:5]

[[ 3198.8391  3198.123   3197.1584  3196.2017  3193.8813]
 [ 3198.3306  3197.5242  3196.4102  3194.7559  3191.9763]
 [ 3197.9968  3196.9197  3195.7188  3193.3855  3190.5371]
 [ 3198.054   3196.7031  3194.9573  3192.4451  3189.5288]
 [ 3198.3289  3196.9111  3195.335   3192.7874  3190.0085]]


The slice `[0:5]` means "Start at index 0 and go along the axis up to, but not including, index 5".

We don’t need to include the upper or lower bound of the slice if we want to go all the way to the edge. If we don’t include the lower bound, Python uses 0 by default; if we don’t include the upper bound, the slice runs to the end of the axis. If we don’t include either (i.e., if we just use ‘:’), the slice includes everything. For example, we can select the top right quarter of the array by leaving out the lower bound on the rows and the upper bound on the columns:

In [13]:
print topo[:len(topo)/2, len(topo)/2:]

[[ 3008.1116  3012.2922  3015.3018 ...,  2583.3293  2585.4368  2589.1079]
 [ 3009.9558  3014.0007  3016.5647 ...,  2582.6992  2584.9167  2587.801 ]
 [ 3010.8604  3014.1228  3016.7412 ...,  2581.8328  2583.8159  2586.0325]
 ..., 
 [ 3370.0918  3368.5371  3366.7148 ...,  2687.8396  2682.4326  2676.8521]
 [ 3370.478   3368.7561  3366.8923 ...,  2685.9941  2681.2888  2676.9924]
 [ 3371.2021  3369.3376  3367.3677 ...,  2687.7014  2685.5146  2683.1936]]
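
The default bounds can be checked on a made-up 4×4 array (`grid` is hypothetical, not the lesson's data). The sketch uses `//` for the midpoint so that the index stays an integer under both Python 2 and Python 3.

```python
import numpy

# Hypothetical 4x4 array holding the values 0..15, filled row by row.
grid = numpy.arange(16).reshape(4, 4)
half = len(grid) // 2   # integer division: 2

top_left = grid[:half, :half]    # lower bounds omitted: start at 0
top_right = grid[:half, half:]   # upper column bound omitted: run to the end
everything = grid[:, :]          # both bounds omitted on both axes

print(top_left)
print(top_right)
```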


## len() and other built-in functions {.callout}

The function `len()` returns the length of a sequence (a list, a string, etc.). For a NumPy array it returns the length of the first axis, i.e., the number of rows in a two-dimensional array. Because it is a built-in function, it is always available to the Python interpreter and doesn't have to be imported. The function `type()` is another built-in function. You can read about them here: https://docs.python.org/2/library/functions.html
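
Because `topo` is square, it cannot show which axis `len()` measures. A quick check on a made-up non-square array (the name `wide` is hypothetical) makes this visible:

```python
import numpy

# Hypothetical array with 3 rows and 5 columns.
wide = numpy.zeros((3, 5))

print(len(wide))               # 3: the number of rows
print(wide.shape[1])           # 5: the number of columns
print(len([10, 20, 30, 40]))   # 4: len() also works on plain lists
```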

## Numerical operations on arrays

We can perform basic mathematical operations on each individual element of a NumPy array. We can create a new array with elevations in feet:

In [20]:
topo_in_feet = topo * 3.2808
print 'Elevation in meters:', topo[0,0]
print 'Elevation in feet:', topo_in_feet[0,0]

Elevation in meters: 3198.8391
Elevation in feet: 10494.7513193


Arrays of the same size can also be combined in arithmetic operations, which are applied element by element:

In [24]:
double_topo = topo + topo
print 'Double topo:', double_topo[0,0], 'meters'

Double topo: 6397.6782 meters
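
To see that these operations act on every element and not just on `[0,0]`, here is a small sketch with a made-up 2×2 array of elevations (the name `meters` and its values are invented for illustration):

```python
import numpy

# Hypothetical elevations in meters.
meters = numpy.array([[1000.0, 2000.0],
                      [3000.0, 4000.0]])

# Multiplying by a single number scales every element.
feet = meters * 3.2808

# Adding two arrays of the same shape adds matching elements.
doubled = meters + meters

print(feet)
print(doubled)
```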


We can also perform statistical operations on arrays:

In [26]:
print 'Mean elevation:', topo.mean(), 'meters'

Mean elevation: 3153.62166407 meters


## Methods vs. attributes {.callout}

`mean` is a method that belongs to the array `topo`, i.e., it is a function that belongs to `topo` just like the attribute `shape` does. When we call `topo.mean()`, we are asking `topo` to calculate its mean value. Because it is a function, we need to include parentheses in the command. A call to `topo.shape` doesn't include parentheses because attributes are objects, not functions.

Python will kindly tell us if we mix up the parentheses:

In [29]:
topo.mean

<function mean>

In [30]:
topo.shape()

TypeError: 'tuple' object is not callable
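
If you are unsure whether something needs parentheses, one option is to ask Python with the built-in function `callable()`. This is a small sketch on a made-up 2×2 array named `grid`:

```python
import numpy

# Hypothetical 2x2 array.
grid = numpy.array([[1.0, 2.0],
                    [3.0, 4.0]])

print(callable(grid.mean))    # True: a method, so call it with parentheses
print(callable(grid.shape))   # False: an attribute, so no parentheses

print(grid.mean())   # 2.5
print(grid.shape)    # (2, 2)
```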

NumPy arrays have many other useful methods:

In [31]:
print 'Highest elevation:', topo.max()
print 'Lowest elevation:', topo.min()

Highest elevation: 3831.2617
Lowest elevation: 2565.0293


We can also call methods on slices of the array:

In [45]:
print 'Highest elevation of NW quarter:'
print topo[:len(topo)/2, :len(topo)/2].max(), 'meters'

print 'Highest elevation of SE quarter:'
print topo[len(topo)/2:, len(topo)/2:].max(), 'meters'

Highest elevation of NW quarter:
3600.709 meters
Highest elevation of SE quarter:
3575.3262 meters


Methods can also be used along individual axes (rows or columns) of an array. If we want to see how the mean elevation changes with longitude (E-W), we can use the method along `axis=0`, which averages down the rows and returns one value per column:

In [49]:
print topo.mean(axis=0)

[ 3428.2054708  3427.6972338  3427.2261988  3426.584768   3426.0234734
  3425.8775458  3425.8402916  3425.8877396  3426.0181264  3426.367201
  3426.5001356  3426.5056046  3426.7196578  3426.83595    3426.8902076
  3426.8796098  3427.0106752  3427.1497438  3426.9835134  3426.5728362
  3426.2113722  3425.9928652  3426.2578698  3426.8117054  3427.2077288
  3427.4824742  3427.6150816  3427.7935558  3428.1331802  3428.6451532
  3428.8814682  3428.1315612  3426.4833232  3424.2040212  3420.8957744
  3417.5235518  3414.35544    3411.0524528  3407.5421916  3403.9730066
  3400.180012   3396.443263   3392.9023536  3389.4827594  3386.0996288
  3382.5170704  3378.9200258  3375.4427244  3371.995064   3368.3281432
  3364.5542568  3360.6775356  3356.9672814  3353.675611   3350.8723902
  3348.2830278  3345.8318692  3343.3299914  3340.7651926  3338.4619194
  3336.1401564  3333.9848948  3332.149046   3330.5379514  3329.1352982
  3328.1674988  3327.3276044  3326.4997932  3325.6943524  3325.0253948
  3324.

To see how the mean elevation changes with latitude (N-S), we can use `axis=1`, which averages across the columns and returns one value per row:

In [47]:
print topo.mean(axis=1)

[ 2942.8326116  2943.959394   2945.045799   2945.9719808  2946.8093426
  2947.596404   2948.420486   2949.274708   2949.997857   2950.5324138
  2950.9564236  2951.5676038  2952.2330212  2953.0946878  2954.1665572
  2955.4393058  2956.5564424  2957.7191494  2959.0154026  2960.4356226
  2961.9122632  2963.4793924  2965.1737982  2966.7024046  2968.1888318
  2969.7119052  2971.2346754  2972.723161   2974.1720514  2975.805006
  2977.9770652  2980.2357216  2982.4065244  2984.609059   2986.675906
  2988.7993386  2990.8434744  2992.8401032  2994.9146994  2996.9524666
  2998.9516138  3000.9031134  3002.6750622  3004.3238364  3005.8296008
  3007.2238078  3008.5737466  3009.9403134  3011.3514118  3012.7873008
  3014.2559672  3015.721623   3017.2519796  3018.9853812  3020.974201
  3023.0632786  3025.2736084  3027.3475148  3029.1742274  3030.797473
  3032.199445   3033.479611   3034.4935618  3035.3736924  3036.234053
  3036.9581828  3037.488056   3037.9723376  3038.5729022  3039.3633732
  3040.1873
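
The meaning of the `axis` argument is easier to check on an array small enough to average by hand. This sketch uses a made-up 2×3 array (`grid`, `col_means`, and `row_means` are hypothetical names):

```python
import numpy

# Hypothetical array with 2 rows and 3 columns.
grid = numpy.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])

# axis=0 collapses the rows: one mean per column (3 values).
col_means = grid.mean(axis=0)

# axis=1 collapses the columns: one mean per row (2 values).
row_means = grid.mean(axis=1)

print(col_means)
print(row_means)
```

The result always has one entry for each position along the axis that was not averaged away, so for `topo` both calls return 500 values.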

## Plotting