# 2. Multidimensional Arrays


An array is an ordered and structured collection of elements. Arrays are structured around the number of dimensions they contain, as well as how many elements exist along each dimension. Today we will focus on arrays with more than one dimension.

A one-dimensional array that contains six elements looks like:

  4 5 6 7 8 9

Or we could have a two-dimensional array that contains six elements with two rows and three columns:

  4 5 6
  
  7 8 9

In general, an array can be $n$-dimensional. One way to think of $n$-dimensional arrays is in terms of a bookshelf analogy:

- A 1d array is a single row of a bookshelf, where a book can be identified by its position in the row
- A 2d array is the whole bookshelf, where a book can be identified by its row number and its position in the row
- A 3d array is a room full of bookshelves, where a book can be identified by the number of the bookshelf, row, and position in the row
- A 4d array is a library with rooms of bookshelves, where a book can be identified by the room, bookshelf, row, and position in the row
- ...and so on...
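
The analogy maps directly onto indexing: each extra dimension adds one more index. The following is only a small sketch with made-up arrays (the names `row`, `shelf`, `room`, and `library` are illustrative, and the arrays are built with `numpy.arange` and `reshape`, which is covered later in this section):

```python
import numpy

# 1d: a single row of six books, identified by position
row = numpy.arange(6)
# 2d: a bookshelf with 2 rows of 3 books each
shelf = numpy.arange(6).reshape((2, 3))
# 3d: a room with 2 bookshelves
room = numpy.arange(12).reshape((2, 2, 3))
# 4d: a library with 2 rooms
library = numpy.arange(24).reshape((2, 2, 2, 3))

print(row[4])               # position               -> 4
print(shelf[1, 1])          # row, position          -> 4
print(room[1, 0, 2])        # bookshelf, row, pos.   -> 8
print(library[1, 1, 0, 2])  # room, bookshelf, row, position -> 20
```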

The constraints for multi-dimensional numpy arrays are the same as for one-dimensional arrays: 
- all elements must be of the same type
- all elements are contiguous in memory to allow for fast computations
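
Both constraints can be observed on small arrays. This is a minimal sketch: the variable names are made up, and `flags['C_CONTIGUOUS']` is just one way to ask NumPy whether an array occupies a single contiguous block of memory.

```python
import numpy

# Mixing integers and a float: NumPy converts every element to one common type
mixed = numpy.array([1, 2.5, 3])
print(mixed, mixed.dtype)            # all elements are stored as float64

# A newly created array is laid out in one contiguous block of memory
grid = numpy.array([[1, 2, 3], [4, 5, 6]])
print(grid.flags['C_CONTIGUOUS'])    # True
print(grid.itemsize, grid.nbytes)    # bytes per element, total bytes
```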

Similar to 1-dimensional arrays, we can use four methods to create multi-dimensional arrays:
- From Python lists and tuples
- From a file
- Using intrinsic Numpy functions
- Using functions in the random module in Numpy

In the remainder of this section, we will see examples of creating multi-dimensional Numpy arrays.

## 2.1 Creating multidimensional arrays

### 2.1.1 Creating multidimensional arrays from Python lists and tuples

Last week we focused on creating arrays with one dimension. We learned about several attributes that describe an array's characteristics, including:
- `x.ndim` : the number of dimensions.
- `x.shape` : the length of each dimension.
- `x.size` : the total number of elements.
- `x.dtype`: the datatype of the elements in the array.

In [None]:
import numpy

b = numpy.array([1, 2, 3])
print( b )
print( b.ndim, b.shape, b.size, b.dtype )


When creating two-dimensional arrays, the simplest way is to start from Python structures. Specifically, create the numpy array from a Python list in which each element is itself a list. The outer list is the list of rows, and each inner list specifies the elements of that row.

In [None]:
a = numpy.array([[1, 3, 6, 5], [2, 4, 3, 0]])
#a = numpy.array(((1, 3, 6, 5), (2, 4, 3, 0)))

print( a )
print( a.ndim, a.shape, a.size, a.dtype )

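A true 2-dimensional array (also called a matrix) is created only if every row has the same number of elements. Otherwise the nested lists are "ragged". Older versions of NumPy silently created a one-dimensional array of Python lists in that case, but recent versions (1.24 and later) raise a `ValueError` unless `dtype=object` is requested explicitly:

In [None]:
# the rows have different lengths, so dtype=object is required in recent NumPy versions
b = numpy.array([[1, 3, 6, 5], [2, 4, 3]], dtype=object)

print( b )
print( b.ndim, b.shape, b.size, b.dtype )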


N-dimensional arrays can be created in the same way as two-dimensional arrays but with more nested lists:

In [None]:
# create a three dimensional array:
z1 = numpy.array([[[1, 3], [2, 4]], [[11, 13], [12, 14]] ])

# dimensions/shape/size of the array:
print(z1)
print( "Number of dimensions:", z1.ndim )
print( "Length of each dimension:", z1.shape )
print( "The total number of elements:", z1.size )

## Exercise 2.1

First, create a 2x3x2 three-dimensional array (you decide which values to include). Then, create a 2x2x3 three-dimensional array. Verify that the arrays have the same number of dimensions and elements but different shapes.

### 2.1.2 Reading and writing multidimensional arrays from/to file

Numpy offers two main ways to save arrays: as a text file, or as a numpy-specific binary file. A text file is very similar to a csv file and can be read by other programs. The binary format preserves the numpy structure, including the shape and datatype of the array (in the same spirit as Python's pickle library). The two functions are:

- `numpy.savetxt(filename, array)` : save a 1-dimensional or 2-dimensional array to a text file. 
- `numpy.save(filename, array)` : save an array to a binary file in numpy `.npy` format.

The corresponding functions for reading these formats are `numpy.loadtxt()` and `numpy.load()`.

In [None]:
# create a two dimensional array:
a = numpy.array([[1, 3, 6, 5], [2, 4, 3, 0]])
print(a, a.shape, a.dtype)

# saving the array in a text file
numpy.savetxt('textfile.txt', a, fmt='%d', delimiter=',')

In [None]:
# loading a two dimensional array from disk
b = numpy.loadtxt('textfile.txt', dtype='int', delimiter=',')
print(b, b.shape, b.dtype)

In [None]:
# create a three dimensional array:
a = numpy.array([[[1, 3], [2, 4]], [[11, 13], [12, 14]] ])
print(a, a.shape, a.dtype)

# saving the array in a binary file
numpy.save('binaryfile.npy', a)

In [None]:
# loading an array from a binary file
b = numpy.load('binaryfile.npy')
print(b, b.shape, b.dtype)

## Exercise 2.2

Part 1: Generate a one-dimensional array. Save it to disk as a text file and as an npy file. Create a new array by reading the data back from the text file, and another new array by reading the npy file. Is the information preserved in both formats?

Part 2: Repeat part 1 with a two-dimensional array and then a three-dimensional array. What works and what does not?

### 2.1.3 Creating N-dimensional arrays using Numpy functions

Creating arrays by specifying nested lists of elements is tedious. Therefore, numpy provides convenient functions for creating multi-dimensional arrays.

An empty array or an array with only zeros or ones can be created with the `empty`, `zeros` and `ones` functions.
Be careful with the empty array. This is the fastest way to initialize an array, but the content of the empty array can be anything (it will be whatever was already written in that location in memory).

The shape of the array is the input for these functions:


In [None]:
# create a one-dimensional empty array:
a = numpy.empty([3])
print(a, a.shape)

In [None]:
# create two-dimensional array with only zeros:
b = numpy.zeros([2, 3])
print(b, b.shape)

In [None]:
# create a four-dimensional array with only ones:
c = numpy.ones([2,2,2,2])
print(c, c.shape)

An identity matrix is a two-dimensional square matrix (same number of rows and columns) in which all values are zeros, except for ones along the diagonal.
The identity matrix can be created with the `eye(n)` function. Since the identity matrix is always square, only one input parameter is needed to create the 2-dimensional matrix.

In [None]:
# create n by n identity matrix:
print(numpy.eye(4))
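
To connect `eye(n)` with the definition above, the same matrix can be built by hand from `zeros`. This is only a sketch for checking the definition; the names `n` and `manual` are made up.

```python
import numpy

n = 4

# Start from all zeros and place ones along the diagonal
manual = numpy.zeros([n, n])
for i in range(n):
    manual[i, i] = 1.0

# The hand-built matrix matches the one returned by eye(n)
print(numpy.array_equal(manual, numpy.eye(n)))   # True
print(numpy.eye(n).shape, numpy.eye(n).dtype)    # (4, 4) float64
```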

### 2.1.4 Creating random N-dimensional arrays
By specifying the shape of the array, we can use the functions in the random module to create random arrays with any number of dimensions:


In [None]:
a = numpy.random.random([2,2])
print(a, a.shape)


In [None]:
a = numpy.random.randint(0, 100, [10,2])
print(a, a.shape)


In [None]:
a = numpy.random.uniform(0, 100, [3,2,2])
print(a, a.shape)


In [None]:
a = numpy.random.normal(0, 5, [2,2])
print(a, a.shape)
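
The arguments before the shape control the range or distribution of the values. The sketch below repeats the four calls with comments on what each one produces and checks the ranges (the variable names are made up for illustration):

```python
import numpy

floats = numpy.random.random([2, 2])             # floats in [0, 1)
ints = numpy.random.randint(0, 100, [10, 2])     # integers from 0 up to, but not including, 100
spread = numpy.random.uniform(0, 100, [3, 2, 2]) # floats drawn uniformly between 0 and 100
bell = numpy.random.normal(0, 5, [2, 2])         # normal distribution: mean 0, standard deviation 5

print(floats.min() >= 0, floats.max() < 1)
print(ints.dtype, ints.min() >= 0, ints.max() < 100)
print(spread.shape, bell.shape)
```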

## 2.2 Array stacking and splitting

More complex arrays can be created by combining two or more one-dimensional arrays. This is called vector stacking. There are two functions to stack arrays and they produce arrays with different shapes:

- horizontal stack: `numpy.hstack([x, y, z])`
- vertical stack: `numpy.vstack([x, y, z])`

In [None]:
x = numpy.arange(0, 5)                     
y = numpy.arange(5, 10)   
z = numpy.arange(10, 15)

In [None]:
print("Horizontal stack: " )
a = numpy.hstack([x, y, z])
print(a)
print(a.shape)

In [None]:
print("Vertical stack: ")
a = numpy.vstack([x, y, z])
print(a)
print(a.shape)

The opposite of stacking is splitting a two-dimensional array into a list of smaller arrays. The first argument is the 2D array and the second argument is the number of pieces to create. Each piece is still two-dimensional: for example, splitting a 3 by 5 array vertically into three pieces gives three arrays of shape (1, 5). The number of pieces must evenly divide the number of rows (for a vertical split) or the number of columns (for a horizontal split). There are two ways of splitting:
- Vertical split: `numpy.vsplit(x, N)`
- Horizontal split: `numpy.hsplit(x, N)`


In [None]:
a = numpy.arange(0, 5)                     
b = numpy.arange(5, 10)   
c = numpy.arange(10, 15)
d = numpy.vstack([a, b, c])

print(d, d.shape)

In [None]:
print("\nVertical split: ")
print( numpy.vsplit(d, 3) )

In [None]:
print("\nHorizontal split: ")
print( numpy.hsplit(d, 5) )
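
When the requested number of pieces does not evenly divide the rows or columns, the split fails. A minimal sketch of both outcomes, using a 3 by 5 array like `d` above:

```python
import numpy

d = numpy.arange(15).reshape((3, 5))   # 3 rows, 5 columns

# 3 rows can be split into 3 equal pieces
pieces = numpy.vsplit(d, 3)
print(len(pieces), pieces[0].shape)    # 3 (1, 5)

# 5 columns cannot be divided into 2 equal pieces
try:
    numpy.hsplit(d, 2)
except ValueError as err:
    print("hsplit failed:", err)
```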

## 2.3 Reshaping arrays

We can use the `reshape()` function to rearrange the elements of a multi-dimensional numpy array along its different dimensions.

However, it is critical that the total number of elements matches. For example, if the initial array has 24 elements, then we can reshape it to a 4 by 6 matrix. We could also transform it into a three-dimensional 2 by 3 by 4 array.

When reshaping, the shape of the new array is specified as a tuple that lists the number of entries along each dimension. For a two-dimensional array, the first element in the tuple is the number of rows and the second is the number of columns.

In [None]:
# reshape into a two dimensional array:
y = numpy.arange(2, 14, 2)
print(y)
print( y.ndim, y.shape, y.size)

z = y.reshape((2, 3))
print( z )
print( z.ndim, z.shape, z.size)

In a three-dimensional array, however, the first value specifies the number of "layers", the second specifies the number of "rows" and the third specifies the "columns".

In [None]:
# reshape into a three dimensional array:
z2 = numpy.random.random([6,4]).reshape((2, 3, 4))

# dimensions/shape/size of the array:
print( z2 )
print( "Number of dimensions:", z2.ndim )
print( "Length of each dimension:", z2.shape )
print( "The total number of elements:", z2.size )
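
The element-count rule for reshaping can be checked directly. The sketch below uses the 24-element example from the text, plus one shape that does not fit:

```python
import numpy

x = numpy.arange(24)                   # 24 elements

print(x.reshape((4, 6)).shape)         # (4, 6)
print(x.reshape((2, 3, 4)).shape)      # (2, 3, 4)

# 5 * 5 = 25 does not match the 24 elements, so this reshape fails
try:
    x.reshape((5, 5))
except ValueError as err:
    print("reshape failed:", err)
```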

## Exercise 2.3

First, create a 2x3x2 three-dimensional array containing random values from a normal distribution. Then, reshape that array into a 2x2x3 three-dimensional array. Verify the shape has changed.

### 2.3.1 Shallow vs. deep copying

Reshaping a numpy array is (usually) blazingly fast. This is because numpy does not copy the elements in memory. Instead, the new variable points to the same memory location (this is called a shallow copy; numpy calls the result a view). This can introduce unanticipated behavior:

In [None]:
a = numpy.arange(8)
print(a)
print()

b = a.reshape((2,4))
print(b)
print()

b[0,0] = -42

print("b changed:")
print(b)
print()

print("but so did a:")
print(a)
print()

The `base` attribute of a numpy array indicates whether a variable points to its own memory or to the memory of another variable. If a variable's `base` is `None`, then it does not point to the memory of another variable.

In [None]:
print(a.base)

In [None]:
print(b.base)
print(b.base is a)

To avoid a shallow copy, use the array's `copy()` method to create a deep copy, which is stored in a new location in memory:

In [None]:
a = numpy.arange(8)
print(a)
print()

b = a.copy()
print(b)
print()

b[0] = -42

print("b changed:")
print(b)
print()

print("but because of the copy() function, a did not:")
print(a)
print()
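
Besides inspecting `base`, the function `numpy.shares_memory` offers a direct check of whether two arrays overlap in memory. A short sketch comparing a reshaped array with a copy (the names `view` and `duplicate` are made up):

```python
import numpy

a = numpy.arange(8)
view = a.reshape((2, 4))   # shallow copy: points to the memory of a
duplicate = a.copy()       # deep copy: has its own memory

print(numpy.shares_memory(a, view))        # True
print(numpy.shares_memory(a, duplicate))   # False
print(view.base is a, duplicate.base is None)   # True True
```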

## 2.4 Multi-Dimensional Array Indexing

### 2.4.1 Indexing a Single Element
We can use one index per dimension to retrieve a single element from a multi-dimensional array:

In [None]:
c = numpy.random.uniform(-0.5, .5, (3,4,5))
print(c)
print("========")
print(c[1,2,0])

In [None]:
print(c[1])
print('------------------------------------')
print(c[1][2])
print('------------------------------------')
print(c[1][2][0])

### 2.4.2 Slicing

We can fetch all the elements along a specific dimension using the `:` operator:

In [None]:
z = numpy.arange(24).reshape((6, 4))
print( z )

In [None]:
# Fetching all the elements in the first row:
print( z[0, :] )

In [None]:
# Fetching all the elements in the third column:
print( z[:, 2] )

We can alternatively use the `:` operator to retrieve specific slices of an array:

In [None]:
print(z)
print('-----------------------')
print( z[2:5, 1:3] )

In [None]:
print( z[:3,-3:-1] )
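
Negative indices in a slice count from the end of the dimension, and an optional third value sets the step. A small sketch on the same 6 by 4 array:

```python
import numpy

z = numpy.arange(24).reshape((6, 4))

# -3:-1 selects columns 1 and 2 (the last column, -1, is excluded)
print(z[:3, -3:-1])
print(numpy.array_equal(z[:3, -3:-1], z[:3, 1:3]))   # True

# start:stop:step, here every second row of the first column
print(z[::2, 0])    # [ 0  8 16]
```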

## Exercise 2.4

Generate a $5\times 5 \times 5$ 3D array of random numbers between -10.0 and 10.0. Reshape it to a $5 \times 25$ matrix. Extract the last 10 elements of the first two rows of this matrix.


### 2.4.3 Boolean Indexing
We can use boolean indexing to retrieve the elements of an array that satisfy a certain condition. Note that the output will be a vector (i.e. a one-dimensional array).

In [None]:
a = numpy.random.normal(0,5, [3,5])
print(a)

In [None]:
b = a[a<0]
print(b, b.shape)
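
It can help to look at the condition on its own: `a < 0` is itself an array of booleans with the same shape as `a`, and that mask does the selecting. A sketch with fixed values so the result is easy to follow (the names `mask` and `negatives` are made up):

```python
import numpy

a = numpy.array([[ 1.5, -2.0,  0.5],
                 [-0.5,  3.0, -4.0]])

mask = a < 0
print(mask)             # True where the element is negative
negatives = a[mask]
print(negatives)        # the three negative values, in row order

# The same mask can be used to assign new values
a[mask] = 0
print(a)
```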

### 2.4.4 Using `ix_` function
For more complex indexing, we can use `ix_` from numpy. This function constructs an open mesh from multiple sequences, so we can select an arbitrary combination of indices along each dimension of a multi-dimensional array.

In [None]:
a = numpy.random.normal(0,5, [3,5])
print(a)
print('-----------------')
print(a[numpy.ix_([0,2], [1,2,4])])
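
The difference from passing plain index lists is easiest to see on an array with known values. In this sketch, `ix_` selects every combination of the listed rows and columns, while two plain lists are paired element by element:

```python
import numpy

a = numpy.arange(15).reshape((3, 5))

# ix_: rows 0 and 2 crossed with columns 1, 2 and 4, giving a 2 by 3 result
grid = a[numpy.ix_([0, 2], [1, 2, 4])]
print(grid)

# Plain lists: only the elements a[0, 1] and a[2, 4]
pairs = a[[0, 2], [1, 4]]
print(pairs)
```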



For complete information about indexing see http://docs.scipy.org/doc/numpy/user/basics.indexing.html.


## 2.5 An example of indexing in RGB images

An RGB image is composed of three color channels: Red (R), Green (G), and Blue (B). Each of these channels can be thought of as a 2D array (or matrix) where each entry in the matrix corresponds to the intensity of that color for a particular pixel in the image. When these channels are combined, they form a 3D array that represents the RGB image. Each entry in the 3D array is an integer value between 0 and 255 (for an 8-bit image), indicating the intensity of the color:
- 0 represents no intensity (black).
- 255 represents full intensity.

The 3D array representing an RGB image has dimensions (height, width, 3), where:

- height is the number of pixels in the vertical direction of the image.
- width is the number of pixels in the horizontal direction of the image.
- 3 represents the three color layers (R, G, B). The first layer (or depth slice) of the 3D array corresponds to the Red channel. The second layer corresponds to the Green channel. The third layer corresponds to the Blue channel.


In [None]:
import matplotlib.pyplot as plt
color_image = plt.imread('lena_color.bmp')
plt.imshow(color_image)
print(color_image.shape, color_image.dtype)


To access the RGB values of a specific pixel at (x, y) coordinates:

- Red value: array[y, x, 0]
- Green value: array[y, x, 1]
- Blue value: array[y, x, 2]
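
Note that the row index `y` comes first. The sketch below uses a tiny made-up image (not the file loaded above) so that the values are known in advance:

```python
import numpy

# A made-up image: 2 pixels high, 3 pixels wide, 3 color channels, all black
image = numpy.zeros((2, 3, 3), dtype='uint8')

# Color the pixel at x=2, y=0 orange
x, y = 2, 0
image[y, x] = [255, 128, 0]

print(image[y, x, 0])   # red   -> 255
print(image[y, x, 1])   # green -> 128
print(image[y, x, 2])   # blue  -> 0
print(image[y, x])      # all three channels of the pixel at once
```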

We can use array indexing to retrieve the intensities of the different color planes in an RGB image:

In [None]:
# retrieving and showing the intensities of the red layer for all pixels
red_plane = numpy.zeros(color_image.shape, dtype='uint8')
red_plane[:,:,0] = color_image[:,:,0]
plt.imshow(red_plane)

In [None]:
# retrieving and showing the intensities of the green layer for all pixels
green_plane = numpy.zeros(color_image.shape, dtype='uint8')
green_plane[:,:,1] = color_image[:,:,1]
plt.imshow(green_plane)

In [None]:
# retrieving and showing the intensities of the blue layer for all pixels
blue_plane = numpy.zeros(color_image.shape, dtype='uint8')
blue_plane[:,:,2] = color_image[:,:,2]
plt.imshow(blue_plane)

We can also use array indexing to crop an RGB image:

In [None]:
# Cropping the image using array indexing
plt.imshow(color_image[100:400, 100:355, :])
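
The shape of a crop follows from the slice bounds. The sketch below uses an all-black stand-in image, and its 512 by 512 size is an assumption made for illustration, not the size of the file loaded above:

```python
import numpy

# Stand-in image: 512 by 512 pixels with 3 color channels (assumed size)
image = numpy.zeros((512, 512, 3), dtype='uint8')

# Rows 100 to 399 and columns 100 to 354, keeping all three channels
cropped = image[100:400, 100:355, :]
print(cropped.shape)            # (300, 255, 3)

# Like other slices, the crop points to the memory of the original image
print(cropped.base is image)    # True
```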