# NumPy ndarray

## What is NumPy?

The [NumPy](http://docs.scipy.org/doc/numpy/index.html) package is the foundation of most scientific computing in Python. Much of NumPy's core functionality is implemented in C/Fortran, so array operations are typically much faster than equivalent code written with native Python data types. NumPy provides:

- the **ndarray** (n-dimensional array), NumPy's primary object
- fast array operations
- large library of linear algebra procedures
- and much, much, more...

It is customary to import NumPy as

In [None]:
import numpy as np

## The ndarray

A NumPy `ndarray`, or `array`, object represents a multidimensional, homogeneous array of fixed-size items. An associated data-type object describes the format of each element in the array. In NumPy, dimensions are referred to as axes. The number of axes is sometimes called the rank of the array; it is stored in the `ndim` attribute.

<div style="background-color: #FFFFFF; margin-right: 10px; padding-bottom: 8px; padding-left: 8px; padding-right: 8px; padding-top: 8px; border: 2px solid black;">
<b>Important</b><br/>
In NumPy we speak of <tt>n</tt> dimensional arrays.  It is common in mathematics to refer to a 2 dimensional array as a <em>matrix</em>.  In NumPy, however, this is not the convention.  As you are learning about arrays, I encourage you to think about them as arrays and not matrices.
</div>

Arrays can be constructed in a number of ways. A common method is to construct an array from an existing sequence:

In [None]:
a = np.array([10, 20, 30, 40])
b = np.array(['crunchy frog', 'ram bladder', 'lark vomit'])

The first example is an array of four integers. The second is an array of three character strings. Unlike the items of a list, the elements of an ndarray all have the same type. For example, assigning a string to an element of the integer array `a` raises an error:

In [None]:
a[0] = 'string'  # raises ValueError: 'string' cannot be converted to an integer
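Not every mismatched assignment fails, though. The following minimal sketch (the values are chosen only for illustration) shows that a float assigned into an integer array is silently truncated, while a string that cannot be parsed as an integer is rejected:

```python
import numpy as np

a = np.array([10, 20, 30, 40])

# A float is cast to the array's integer dtype; the fractional part is dropped.
a[0] = 3.7
print(a)

# A string that cannot be converted to an integer raises ValueError.
try:
    a[1] = 'string'
except ValueError as err:
    print('ValueError:', err)
```

The silent truncation is worth keeping in mind: the array's `dtype` never changes as a result of an element assignment.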

The data-type of each element in an array is stored in the array's `dtype` attribute:

In [None]:
print(a.dtype)
print(b.dtype)

The data-type is inferred from its arguments at the time of construction. When elements of different types are passed to a constructor, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).  The data-type can also be explicitly specified:

In [None]:
c = np.array([1, 2, 3], dtype=np.float64)
print(c)
print(c.dtype)
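The upcasting behavior described above can be observed directly. This is a small sketch with made-up values; the exact integer width reported for `ints` depends on the platform:

```python
import numpy as np

ints = np.array([1, 2, 3])           # all integers -> integer dtype
mixed = np.array([1, 2, 3.5])        # int + float -> float64
cplx = np.array([1, 2.0, 3 + 0j])    # float + complex -> complex128
text = np.array([1, 'two'])          # number + str -> unicode string dtype

for arr in (ints, mixed, cplx, text):
    print(arr.dtype)
```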

Multidimensional arrays are constructed from nested sequences:

In [None]:
m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
m

The number of axes in an array is stored in the `ndim` attribute

In [None]:
m.ndim

The `shape` attribute gives the number of elements along each axis:

In [None]:
m.shape

The total number of elements in an array is given by the `size` attribute

In [None]:
m.size

## Methods for creating arrays

### ones
The function `ones` creates an array full of ones

In [None]:
np.ones(5)

The dimension of the array can be defined at the time of construction

In [None]:
np.ones((5,5))

**Attention MATLAB users:** a multidimensional shape must be passed to `ones` as a single argument (typically a tuple), unlike the function of the same name in MATLAB, which takes each dimension as a separate argument.
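As a quick illustration of this point (a sketch, not an exhaustive description of the `ones` signature): the shape is one argument, and a second positional argument is interpreted as the `dtype`, so the MATLAB-style call fails.

```python
import numpy as np

grid = np.ones((2, 3))     # shape given as a tuple
same = np.ones([2, 3])     # a list is accepted as well
print(grid.shape, same.shape)

# MATLAB-style call: the second positional argument is taken as the dtype.
try:
    np.ones(2, 3)
except TypeError as err:
    print('TypeError:', err)
```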

### zeros
The function `zeros` creates an array full of zeros

In [None]:
np.zeros(5)

In [None]:
np.zeros((2,2))

### empty
The function `empty` creates an array without initializing its entries. The initial content is arbitrary: it is whatever happened to be in that region of memory.

In [None]:
a = np.empty(5)
a

The array method `fill` fills the elements with a constant value

In [None]:
a.fill(3)
a

### arange
To create sequences of numbers, NumPy provides a function analogous to Python's built-in `range` that returns an array:

In [None]:
start, end, step = 0, 10, 1
np.arange(start, end, step)

When `arange` is used with floating point arguments, it is generally not possible to predict the number of elements obtained, due to the finite floating point precision. 

### linspace
It is usually better to use the function `linspace` that receives as an argument the number of elements that we want, instead of the step:

In [None]:
np.linspace(-5, 10, 6)
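A short sketch of two `linspace` options that are often useful here: `retstep=True` also returns the spacing that was used, and `endpoint=False` excludes the stop value. The numbers are illustrative only.

```python
import numpy as np

# Six evenly spaced points from -5 to 10, both endpoints included.
points, step = np.linspace(-5, 10, 6, retstep=True)
print(points)
print(step)

# Four points on the half-open interval [0, 1).
half_open = np.linspace(0, 1, 4, endpoint=False)
print(half_open)
```

Because the number of points is an argument, the length of the result is always known in advance, which is the property `arange` cannot guarantee with floating point steps.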

### identity
The `identity` and `eye` functions return a 2 dimensional identity array

In [None]:
np.identity(3)

In [None]:
np.eye(2)

The keyword `k` allows specifying the diagonal

In [None]:
np.eye(4, k=1)

### loadtxt
The function `loadtxt` reads in text files and converts their contents to an array

In [None]:
a = np.loadtxt('aux/data.csv', delimiter=',', skiprows=1)
a

### Other methods for creating arrays

NumPy provides many other functions to create arrays, including:

    zeros_like, ones_like, empty_like, full, full_like, asarray
    copy, frombuffer, fromstring, fromfunction, asmatrix
    
and several others.
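A brief sketch of three of the functions listed above; the array `template` and the lambda are hypothetical and exist only for this example:

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]])

z = np.zeros_like(template)     # zeros with the shape and dtype of template
f = np.full((2, 3), 7.5)        # every element set to 7.5
# fromfunction calls the function with arrays of index coordinates.
g = np.fromfunction(lambda i, j: 10 * i + j, (2, 3))

print(z)
print(f)
print(g)
```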

## Changing the Shape of an Array

The size of an array is *fixed* at the time of creation.  But the *shape* of the array can be changed through a variety of means.

In [None]:
ar = np.arange(12)
ar

The `reshape` function returns its argument with a modified shape

In [None]:
ar = ar.reshape(4,3)
print(ar)

### Array storage
By default, the last index of an array changes most rapidly as one moves through the array as stored in memory (so-called row-major storage). For a 2 dimensional array, this is equivalent to the statement that a matrix is stored by rows. This is different from languages such as MATLAB and Fortran, which use column-major storage. The storage convention can be observed by viewing a flattened multidimensional array:

In [None]:
ar.flatten()

The functions `flatten()` and `reshape()` can also be instructed, using an optional argument, to use Fortran-style arrays, in which the leftmost index changes the fastest.
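The optional argument in question is `order`. A minimal sketch on a small 2 x 3 array (chosen here only for illustration) shows the difference between the default C order and Fortran order:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)        # [[0, 1, 2], [3, 4, 5]]

row_major = x.flatten()               # default order='C': rightmost index fastest
col_major = x.flatten(order='F')      # Fortran order: leftmost index fastest
print(row_major)
print(col_major)

# With order='F', reshape fills the new shape column by column.
filled_by_columns = np.arange(6).reshape((2, 3), order='F')
print(filled_by_columns)
```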

The shape of the array can be explicitly changed, though the total size must not change

In [None]:
ar.shape = 6, 2
ar

In [None]:
ar.transpose()  # equivalently, ar.T

Whatever reshaping operation is used, the new shape must be consistent with the size of the original array:

In [None]:
ar.reshape(5,3)  # raises ValueError: 12 elements cannot fill a 5x3 array

The `reshape` function returns its argument with a modified shape, but leaves the original array intact.  

The `resize` method modifies the array itself:

In [None]:
ar.resize((3,4))
ar

If a dimension is given as -1 in a reshaping operation, the other dimensions are automatically calculated:

In [None]:
ar.reshape(2,-1)
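The following sketch pulls these points together on a fresh array (the names `x`, `y`, and `flat` are used only here): `reshape` returns a new array object and leaves the shape of the original alone, and `-1` stands for whichever length makes the total size work out.

```python
import numpy as np

x = np.arange(12)

y = x.reshape(3, -1)     # -1 is inferred as 4, since 3 * 4 == 12
print(y.shape)
print(x.shape)           # the original is still one dimensional

flat = y.reshape(-1)     # a single -1 flattens the array
print(flat.shape)
```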

## Joining different arrays

Since the size of a NumPy array is *fixed* at the time of creation, in-place methods such as the list object's `append` and `extend` make little sense for arrays (they change the size of the object). NumPy instead provides similar *functions* that return new arrays.

The `np.concatenate` function concatenates 2 or more arrays:

In [None]:
t1 = np.array([1,2,3])
t2 = np.array([3,4,5])
t3 = np.array([6,7,8])
np.concatenate((t1, t2))

In [None]:
np.concatenate((t1, t2, t3))

The `column_stack` and `row_stack` functions create two dimensional arrays from one dimensional arrays (`row_stack` is an alias of `vstack`, and recent NumPy releases recommend calling `vstack` directly):

In [None]:
print(np.column_stack((t1, t2)))

In [None]:
print(np.row_stack((t1, t2)))

When `concatenate` is used to join multidimensional arrays, the concatenation occurs along the first axis by default:

In [None]:
t4 = np.array([[1,2,3],[4,5,6]])
t5 = np.array([[7,8,9],[10,11,12]])
np.concatenate((t4, t5))

The `axis` argument allows concatenating along a different axis

In [None]:
np.concatenate((t4, t5), axis=1)

Setting `axis=None` concatenates the flattened arrays:

In [None]:
np.concatenate((t4, t5), axis=None)

Other functions for joining arrays include:

    append, hstack, vstack, dstack
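A compact sketch of these four functions applied to two one dimensional arrays (the same values as `t1` and `t2` above, redefined here so the example stands alone):

```python
import numpy as np

t1 = np.array([1, 2, 3])
t2 = np.array([3, 4, 5])

h = np.hstack((t1, t2))       # joined end to end: shape (6,)
v = np.vstack((t1, t2))       # stacked as rows: shape (2, 3)
d = np.dstack((t1, t2))       # stacked along a third axis: shape (1, 3, 2)
ap = np.append(t1, [9, 10])   # returns a new array; t1 is not modified

print(h, v, d.shape, ap, sep='\n')
```

Note that `np.append` is a function returning a new array, not an in-place method like `list.append`.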

## Indexing, slicing, and iterating

The syntax for accessing the elements of an array is the same as for accessing the elements of a list - the bracket operator. The expression inside the brackets specifies the index. Remember that the indices start at 0:

In [None]:
a[0]

For arrays with dimension > 1, a tuple is used to specify the index.  By default, matrix convention for indexing is used by arrays, i.e., the first index is the row.

In [None]:
m[0,1] # second element in first row of m

The slice operator works on arrays

In [None]:
m[:, 2]  # third column of m

In [None]:
m[[0,-1], :]  #  first and last row of m

In [None]:
m[:, [0, -1]]  # first and last column of m

### Fancy Indexing

In addition to the usual indexing and slicing, arrays support *fancy* indexing, that is, indexing with arrays of integers or arrays of booleans:

In [None]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
i = np.array([0, 3])
a[i]

In [None]:
j = np.array([True, True, False, False, True, False, True, False, True])
a[j]
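In practice, boolean index arrays are rarely typed out by hand; they usually come from a comparison on the array itself. A small sketch, reusing the same values:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

mask = a % 2 == 0        # boolean array, True where the element is even
evens = a[mask]
print(mask)
print(evens)

# A boolean index can also be an assignment target.
clipped = a.copy()
clipped[clipped > 6] = 0
print(clipped)
```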

### Multi-dimensional indices

Indices can be given for more than one dimension. The arrays of indices for each dimension must have the same shape, or be broadcastable to a common shape.

In [None]:
a = np.arange(12).reshape(3,4)
a

In [None]:
i = np.array([[0,1],        # indices for the first dim of a
              [1,2]])
j = np.array([[2,1],        # indices for the second dim
              [3,3]])
a[i,j]                  # i and j must have equal shape

In [None]:
a[:,j]                  # i.e., a[ : , j]

In [None]:
a[i,2]

### Mutability of arrays

Like lists, arrays are mutable. When the bracket operator appears on the left side of an assignment, it identifies the element of the array that will be assigned.

In [None]:
numbers = np.array([17., 123])
numbers[1] = 5
numbers

In [None]:
matrix = np.array([[1, 2], [3, 4]])
matrix[0,1] = 12
matrix

Indexing with arrays can be used as a target to assign to:

In [None]:
a = np.arange(12)
a[[0, 2, 4]] = [53, 54, 55]
a

### Broadcasting
Subject to some constraints, NumPy will “broadcast” a smaller array across the larger array so that they have compatible shapes:

In [None]:
a.resize(3,4)
a[:,1] = 44
a
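Broadcasting applies to arithmetic as well as assignment. The sketch below, with illustrative values, combines a `(3, 4)` array with a row of shape `(4,)` and a column of shape `(3, 1)`, and shows a shape combination that violates the constraints:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
row = np.array([10, 20, 30, 40])          # shape (4,)
col = np.array([[100], [200], [300]])     # shape (3, 1)

by_row = a + row     # row is repeated down the 3 rows
by_col = a + col     # col is repeated across the 4 columns
print(by_row)
print(by_col)

# Shapes (3, 4) and (3,) do not line up on the trailing axis.
try:
    a + np.ones(3)
except ValueError as err:
    print('ValueError:', err)
```

Shapes are compared from the trailing axis backwards; two axes are compatible when they are equal or when one of them has length 1.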

### Iterating

Iterating over multidimensional arrays is done with respect to the first axis:

In [None]:
b = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23],
              [30, 31, 32, 33],
              [40, 41, 42, 43]])
for row in b:
    print(row)

However, to perform an operation on each element in the array, use the `flat` attribute, which is an iterator over all the elements of the array:

In [None]:
for element in b.flat:
    print(element, end=' ')

## Copies and views

### No copy

Simple assignments make no copy of array objects or of their data.

In [None]:
a = np.arange(12)
b = a            # no new object is created

In [None]:
b is a           # a and b are two names for the same ndarray object

In [None]:
b.shape = 3,4    # changes the shape of a
a.shape

### View

Different array objects can share the same data. The `view` method creates a new array object that looks at the same data.

In [None]:
c = a.view()
c is a

In [None]:
c.base is a                        # c is a view of the data owned by a

In [None]:
c.flags.owndata

In [None]:
c.shape = 2,6                      # a's shape doesn't change
a.shape

In [None]:
c[0,4] = 1234                      # a's data changes
a

Slicing an array returns a view of it:

In [None]:
s = a[ : , 1:3]   
# spaces added for clarity; could also be written "s = a[:,1:3]"
s[:] = 10           # s[:] is a view of s. Note the difference between s=10 and s[:]=10
a
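One way to check whether two arrays look at the same data is `np.shares_memory`. The sketch below (array names are local to this example) contrasts a slice, which is a view, with an integer-array index, which produces a copy:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

s = a[:, 1:3]        # basic slice: a view onto a's data
f = a[[0, 2], :]     # fancy index: an independent copy

print(np.shares_memory(a, s))
print(np.shares_memory(a, f))

f[:] = -1            # does not touch a
s[:] = 10            # writes through to a
print(a)
```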

### Deep copy

The copy method makes a complete copy of the array and its data.

In [None]:
d = a.copy()                          # a new array object with new data is created
d is a

In [None]:
d.base is a                           # d doesn't share anything with a

In [None]:
d[0,0] = 9999
a