# How do we work with dimensionality?

In [1]:
import numpy as np

### First, let's work with pure Python

In [2]:
simplePython = [0, 1, 2, 3]
twoDims      = [[0, 1, 2, 3], [4, 5, 6, 7]]
threeDims    = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]

As you can see, every extra layer of lists nested within lists adds a new dimension to our data. This can go on forever, or to be more accurate, until your computer complains about it.
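
To make the link between nesting and dimensions concrete, here is a minimal sketch that counts the layers of nesting. The helper `nesting_depth` is a hypothetical name introduced for illustration, and it assumes every list is non-empty and evenly nested.

```python
def nesting_depth(value):
    """Count how many list layers wrap the innermost values (hypothetical helper)."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        value = value[0]  # assumes each list is non-empty and evenly nested
    return depth

examples = (
    [0, 1, 2, 3],
    [[0, 1, 2, 3], [4, 5, 6, 7]],
    [[[0, 1], [2, 3]], [[4, 5], [6, 7]]],
)
depths = [nesting_depth(example) for example in examples]
print(depths)  # [1, 2, 3]
```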

To get the number two from each of these lists we use indexes:

In [3]:
print([simplePython[2], twoDims[0][2], threeDims[0][1][0]])

[2, 2, 2]


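We can also take slices, but reaching into the next level after slicing is tricky: a slice of a list is just another list at the same level, so indexing it again still selects from that level.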

In [4]:
print([simplePython[:], twoDims[0][:], threeDims[0][:]])

[[0, 1, 2, 3], [0, 1, 2, 3], [[0, 1], [2, 3]]]
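
As a small illustration of why this is tricky in pure Python, the sketch below tries to pull the value at index 2 out of every inner list. Slicing the outer list does not help, so a list comprehension is used instead. The names `row_again` and `column` are made up for this example.

```python
twoDims = [[0, 1, 2, 3], [4, 5, 6, 7]]

# Slicing the outer list only copies it, so indexing afterwards
# still selects a whole inner list rather than a column.
row_again = twoDims[:][0]

# To take index 2 from every inner list, loop over them explicitly.
column = [row[2] for row in twoDims]

print(row_again)  # [0, 1, 2, 3]
print(column)     # [2, 6]
```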


NumPy supports all the same options, of course, but it also lets you separate the indexes for each dimension with commas, which is more concise:

In [5]:
simplePython = np.array([0, 1, 2, 3])
twoDims      = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])
threeDims    = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])

print([simplePython[:], twoDims[0,:], threeDims[0,:]])

[array([0, 1, 2, 3]), array([0, 1, 2, 3]), array([[0, 1],
       [2, 3]])]
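
The comma syntax also makes it possible to slice one dimension and index another in a single step. The following sketch rebuilds the same arrays so that it runs on its own; the names `column`, `element`, and `firstOfEach` are illustrative only.

```python
import numpy as np

twoDims = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])
threeDims = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])

column = twoDims[:, 2]            # index 2 from every row
element = threeDims[0, 1, 0]      # same element as threeDims[0][1][0]
firstOfEach = threeDims[:, :, 0]  # first item of every innermost list

print(column)       # [2 6]
print(element)      # 2
print(firstOfEach)
```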


### NumPy axis operations work on the dimensions

The zeroth axis is always the outermost list, the "top." If we take the mean of twoDims along axis 0, NumPy works across every list inside the outermost list.

Across those lists, it starts with the zeroth value of each one. Here, the zeroth values in twoDims are 0 and 4.

twoDims.mean is going to output an array, so we already know that its first value is 2 (the average of 0 and 4). Have a look:

In [6]:
twoDimsMean = twoDims.mean(axis=0)
print(twoDimsMean)

[ 2.  3.  4.  5.]
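
The same axis 0 result can be reproduced in pure Python, which may make the pairing of values easier to see. This is only a sketch of the idea, not how NumPy computes it internally; `rows` and `axis0Mean` are names chosen for this example.

```python
rows = [[0, 1, 2, 3], [4, 5, 6, 7]]

# zip(*rows) groups the values that share a position: (0, 4), (1, 5), ...
axis0Mean = [sum(pair) / len(pair) for pair in zip(*rows)]

print(axis0Mean)  # [2.0, 3.0, 4.0, 5.0]
```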


The next axis, axis 1, simply takes the mean of all the values within each inner list.

In [7]:
twoDimsMean = twoDims.mean(axis=1)
print(twoDimsMean)  # Observe how they differ

[ 1.5  5.5]
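
One way to keep the two axes straight is to look at shapes: the axis you average over is the one that disappears from the result. The sketch below checks this for the 2-D array; the variable names are illustrative.

```python
import numpy as np

twoDims = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])

shapeBefore = twoDims.shape              # (2, 4): 2 inner lists of 4 values
shapeAxis0 = twoDims.mean(axis=0).shape  # (4,): the axis of length 2 is gone
shapeAxis1 = twoDims.mean(axis=1).shape  # (2,): the axis of length 4 is gone

print(shapeBefore, shapeAxis0, shapeAxis1)
```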


# How does this work in 3-D?

As you can see below, the zeroth axis still gives the same values as in the 2-D case, but the output is nested to reflect the number of dimensions.

In [8]:
threeDimsMean = threeDims.mean(axis=0)
print(threeDimsMean)

[[ 2.  3.]
 [ 4.  5.]]
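
The values match the 2-D case because threeDims holds the same eight numbers as twoDims, just split into 2 x 2 x 2. A quick check, assuming the arrays defined earlier (rebuilt here so the snippet runs on its own):

```python
import numpy as np

twoDims = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])
threeDims = twoDims.reshape(2, 2, 2)  # same data as the nested 3-D literal

# Flatten the 3-D result and compare it with the 2-D axis 0 mean.
flattened = threeDims.mean(axis=0).ravel()
sameValues = np.array_equal(flattened, twoDims.mean(axis=0))

print(sameValues)  # True
```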


At axis 1, NumPy does the same thing we saw above for axis 0 of the 2-D array, but now it has to do it twice, once for each inner 2-D array. In effect, the result is the same as calling .mean(axis=0) on two separate 2-D arrays.

In [9]:
threeDims[0]

array([[0, 1],
       [2, 3]])

In [10]:
threeDims[0].mean(axis=0)

array([ 1.,  2.])

In [11]:
threeDimsMean = threeDims.mean(axis=1)
print(threeDimsMean)

[[ 1.  2.]
 [ 5.  6.]]
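
To confirm that claim, the sketch below averages each inner 2-D array along its own axis 0 and compares the stacked result with the axis 1 mean of the whole array. `perBlock` and `whole` are names made up for this check.

```python
import numpy as np

threeDims = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])

# Average each inner 2-D array along its own axis 0, then stack the results.
perBlock = np.array([block.mean(axis=0) for block in threeDims])
whole = threeDims.mean(axis=1)

print(np.array_equal(perBlock, whole))  # True
```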


What about axis 2? It's a 3-D array, after all.

In [12]:
threeDimsMean = threeDims.mean(axis=2)
print(threeDimsMean)

[[ 0.5  2.5]
 [ 4.5  6.5]]


As you can see, just as with axis 1 of the 2-D array, it simply takes the average of the elements in each of the lowest-level lists.
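
Because the last axis is always the innermost one, it can also be addressed as axis=-1, whatever the number of dimensions. The sketch below compares that with axis=2 and with a pure-Python version of the same averaging; the variable names are illustrative.

```python
import numpy as np

threeDims = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])

# axis=-1 counts from the end, so here it refers to axis 2.
sameAxis = np.array_equal(threeDims.mean(axis=2), threeDims.mean(axis=-1))

# Pure-Python equivalent: average each lowest-level list.
innermostMeans = [[sum(inner) / len(inner) for inner in block]
                  for block in threeDims.tolist()]

print(sameAxis)        # True
print(innermostMeans)  # [[0.5, 2.5], [4.5, 6.5]]
```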