## The Array Object <a class="anchor" id="array-object"></a>

**Learning outcome:** by the end of this section, you will be able to manipulate NumPy array objects through indexing and explain key features and properties of NumPy arrays.

The multidimensional array object is at the core of all of NumPy's functionality.  Let's explore this object some more.

### Array properties

Let's create a NumPy array and take a look at some of its properties.

In [None]:
import numpy as np

arr = np.ones((3, 2, 4))

print("Array shape:", arr.shape)
print("Array element dtype:", arr.dtype)

Note that although the array is filled with ones, its elements are not integers: unless told otherwise, NumPy creates arrays of 64-bit floating-point numbers, so the `dtype` reported above is `float64`.
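As a minimal sketch of how the element type is chosen, the cell below creates one array without a `dtype` argument and one with an explicit `dtype`. The variable names `default_arr` and `int_arr` are illustrative only and are not used elsewhere in this section.

```python
import numpy as np

# With no dtype argument, np.ones produces 64-bit floats.
default_arr = np.ones((3, 2, 4))

# Passing dtype explicitly overrides the default element type.
int_arr = np.ones((3, 2, 4), dtype=np.int32)

print("Default dtype:", default_arr.dtype)   # float64
print("Explicit dtype:", int_arr.dtype)      # int32
```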

#### Exercise

##### Part 1

Use the code cells below to explore features of this array.

* What is the `type` of the array created above? (Hint: you can find the type of an object in Python using `type(<object>)`, where `<object>` is replaced with the name of the variable containing the object.)

##### Part 2

Consider the following NumPy array properties:

* `ndim`,
* `nbytes`,
* `size`,
* `T`.

For each of these properties:

* Use the array documentation to find out what the property represents.
* Apply the property to the array `arr` defined above. Can you explain the result you get in each case?

### Indexing

You can index NumPy arrays in the same way as other Python sequences, by using square brackets `[]`. This means we can index to retrieve a single element, multiple consecutive elements, or a more complex sequence such as every second element:

In [None]:
arr = np.array([1, 2, 3, 4, 5, 6])
print(arr)

In [None]:
print("arr[2] = {}".format(arr[2]))
print("arr[2:5] = {}".format(arr[2:5]))
print("arr[::2] = {}".format(arr[::2]))

You can also index multidimensional arrays by giving one index or slice per dimension, separated by commas, which allows you to select multiple elements along several dimensions at once. (Sometimes this is referred to as "extended slicing".) Remember that Python uses zero-based indexing!

In [None]:
lst_2d = [[1, 2, 3], [4, 5, 6]]
arr_2d = arr.reshape(2, 3)

print("2D list:")
print(lst_2d)

In [None]:
print("2D array:")
print(arr_2d)

In [None]:
print("Single array element:")
print(arr_2d[1, 2])

In [None]:
print("Single row:")
print(arr_2d[1])
print("First two columns:")
print(arr_2d[:, :2])

If you provide fewer indices than the array has dimensions, the index is expanded with "`:`" for all of the remaining dimensions:

In [None]:
print('Second row: {} is equivalent to {}'.format(arr_2d[1], arr_2d[1, :]))

The same expansion can be requested explicitly with an **ellipsis**, written "`...`", which automatically expands to "`:`" for each dimension left unspecified in the index:

In [None]:
arr1 = np.empty((4, 6, 3))
print('Original shape: ', arr1.shape)
print(arr1[...].shape)
print(arr1[..., 0:2].shape)
print(arr1[2:4, ..., ::2].shape)
print(arr1[2:4, :, ..., ::-1].shape)
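To make the expansion concrete, the following sketch compares an ellipsis index with the equivalent index written out using explicit "`:`" entries. The array `arr3` is an illustrative stand-in filled with known values (rather than `np.empty`) so that the two results can be compared element by element.

```python
import numpy as np

arr3 = np.arange(4 * 6 * 3).reshape(4, 6, 3)

# "..." stands in for as many ":" as are needed to cover the unspecified dimensions.
explicit_last = arr3[:, :, 0:2]
ellipsis_last = arr3[..., 0:2]
print(ellipsis_last.shape)                            # (4, 6, 2)
print(np.array_equal(explicit_last, ellipsis_last))   # True

# The ellipsis can also sit in the middle of the index.
explicit_mid = arr3[2:4, :, ::2]
ellipsis_mid = arr3[2:4, ..., ::2]
print(ellipsis_mid.shape)                             # (2, 6, 2)
print(np.array_equal(explicit_mid, ellipsis_mid))     # True
```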

#### Boolean Indexing

NumPy provides syntax to index conditionally, based on the data in the array.

You can pass in an array of True and False values (a boolean array), or, more commonly, a condition that returns a boolean array.

In [None]:
print(arr_2d)

In [None]:
bools = arr_2d % 2 == 0
print(bools)

In [None]:
print(arr_2d[bools])
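The condition does not need to be stored in a variable first. As a small sketch, the cell below rebuilds the same 2D array and passes conditions directly inside the square brackets; the names `evens` and `middle` are illustrative only.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

# The condition is evaluated element-wise and used directly as the index.
evens = arr_2d[arr_2d % 2 == 0]
print(evens)   # [2 4 6]

# Conditions can be combined with & (and) and | (or); the parentheses are required.
middle = arr_2d[(arr_2d > 1) & (arr_2d < 5)]
print(middle)  # [2 3 4]
```

Note that the result of boolean indexing is always one-dimensional here, because the selected elements need not form a rectangular block.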

#### Exercise

##### Part 1

Why do these indexing examples give the stated results?

 * result of `arr_2d[1, 0]` is `4`
 * result of `arr_2d[0]` is `[1, 2, 3]`
 * result of `arr_2d[1, 1:]` is `[5, 6]`
 * result of `arr_2d[0:, ::2]` is `[[1, 3], [4, 6]]`

##### Part 2

How would you index `arr_2d` to retrieve:

 * the third value: resulting in `3`
 * the second row: resulting in `[4 5 6]` 
 * the first column: resulting in `[1 4]`
 * the first column, retaining the outside dimension: resulting in `[[1] [4]]`
 * only values greater than or equal to 3: resulting in `[3 4 5 6]`

### Arrays are not lists

Question: why do the following examples produce different results?

In [None]:
print(lst_2d[0:2][1])
print(arr_2d[0:2, 1])

This result points to an important lesson: despite their similar indexing syntax, NumPy arrays often behave quite differently from Python lists. Let's explore the differences (and some similarities) between the two.

#### dtype

Every NumPy array has a single, fixed data type, stored in its `dtype` property. This is the type of all the elements of the array.

This is in contrast to Python lists, which can hold elements of different types.
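The contrast can be sketched as follows: a list keeps each element as a separate Python object, whereas an array built from mixed numeric values ends up with one common `dtype`. The names `mixed_list` and `mixed_arr` are illustrative only.

```python
import numpy as np

# A list keeps each element as its own Python object, so types can be mixed.
mixed_list = [1, 2.5, "three"]
print([type(item).__name__ for item in mixed_list])  # ['int', 'float', 'str']

# An array has a single dtype; here NumPy picks float64 so that both values fit.
mixed_arr = np.array([1, 2.5])
print(mixed_arr.dtype)  # float64
```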

#### Exercise

* What happens in Python when you add an integer to a float?
* What happens when you put an integer into a NumPy float array?
* What happens when you do numerical calculations between arrays of different types?


### Generating 2D coordinate arrays

A common task when working with NumPy is to generate arrays that hold the coordinates of our data.

When orthogonal 1D coordinate arrays already exist, NumPy's `meshgrid` function is very useful:

In [None]:
x = np.linspace(0, 9, 3)
y = np.linspace(4, 8, 3)
x2d, y2d = np.meshgrid(x, y)
print(x2d)
print(y2d)
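As a hedged illustration of why this is useful, the sketch below uses coordinate arrays of different lengths so that the output shape is easier to read, then evaluates a simple expression at every grid point. The expression for `z` is an arbitrary choice made for this example.

```python
import numpy as np

x = np.linspace(0, 9, 3)   # 3 points along x: 0, 4.5, 9
y = np.linspace(4, 8, 2)   # 2 points along y: 4, 8
x2d, y2d = np.meshgrid(x, y)

# With the default indexing="xy", both outputs have shape (len(y), len(x)).
print(x2d.shape, y2d.shape)  # (2, 3) (2, 3)

# Each (x2d[i, j], y2d[i, j]) pair is one grid point, so a function of x and y
# can be evaluated over the whole grid in a single expression.
z = x2d + 10 * y2d
print(z[1, 2])  # 89.0, from x = 9 and y = 8
```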

#### The Array Object: Summary of key points
 * properties: `shape`, `dtype`.
 * arrays are homogeneous; all elements have the same type: `dtype`.
 * indexing arrays with slices produces further arrays: views on the original arrays.
 * multi-dimensional indexing (slicing) and boolean indexing.
 * combinations to form 2D arrays: `meshgrid`.
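
The point about views can be checked directly. The following is a minimal sketch, using an illustrative array named `base`: a slice shares memory with the array it came from, whereas the result of boolean indexing is an independent copy.

```python
import numpy as np

base = np.arange(6).reshape(2, 3)

# Basic slicing returns a view: it shares memory with the original array.
first_row = base[0, :]
first_row[0] = 100
print(base[0, 0])                          # 100
print(np.shares_memory(base, first_row))   # True

# Boolean indexing returns a copy, so changing it leaves the original untouched.
selected = base[base > 3]
selected[0] = -1
print(base[0, 0])                          # still 100
print(np.shares_memory(base, selected))    # False
```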
 