# Arrays

Arrays are n-dimensional objects of ordered variables.  In Python, the NumPy package allows fast manipulation of array objects. The [NumPy Reference Manual](http://docs.scipy.org/doc/numpy/index.html) is a good source for more information. In IPython Notebook, NumPy may already be loaded.  Otherwise you need to run the `import numpy` command.

All components of a NumPy array are of one variable type (note that there are specialized arrays called structured arrays and object arrays, not considered here, that can break this rule).  This allows very fast implementation of functions across an array.  Arrays can be one-dimensional, looking like a list, but because of the uniformity of structure will be processed much faster than a list.
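As a rough illustration of the single-type rule, the sketch below (not part of the original notebook) builds arrays from uniform and mixed inputs and inspects the resulting dtype.  When the inputs are mixed, NumPy picks one type that can hold every element.

```python
import numpy as np

ints = np.array([1, 2, 3])        # all integers -> an integer dtype
mixed = np.array([1, 2.5, 3])     # one float forces every element to float

print(ints.dtype)
print(mixed.dtype)
print(mixed)                      # the integers are now stored as floats
```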

Arrays are created by the array( ) function.  The shape property gives the dimensions of the array and can be useful for debugging. The size property gives the overall number of elements.  The dtype property gives the type of variable.  Note that NumPy arrays support many numerical types; see [dtype documentation](http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html) for more info. The different types of scalar variables can be found [here](http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html).

NumPy routines (usually) need to be imported before they can be used.  There are two different ways to import in Python: `import ... (as)` (with an optional alias) or `from...import...`.  Both forms are used in the examples below.

In [None]:
import numpy as np  #here we import the numpy package using np as an alias

v = np.array([1,2,3,4])

#dimensions are represented by nested square brackets

m = np.array([[1.0, 2.0], [3.0, 4.0]])

print v.shape
print m.shape

In [None]:
print v.size
print m.size

In [None]:
print v.dtype
print m.dtype

In [None]:
#dtype can be specified at creation
m = np.array([[1, 2], [3, 4]], dtype = 'int8')
print m
print m.dtype

One of the big advantages of NumPy is speed.  The two code snippets below do the same thing - add the corresponding elements of two series together (more on NumPy math later).  The first uses Python lists and a loop, the second uses NumPy arrays.  Comparing the two timings demonstrates the speed-up from using NumPy arrays.

In [None]:
import time  #time module  - use time? for more info

t1 = time.time()
X = range(10000000)
Y = range(10000000)
Z = []
for i in range(len(X)):
    Z.append(X[i] + Y[i])
print time.time() - t1

t1 = time.time()
X = np.arange(10000000)
Y = np.arange(10000000)
Z = X + Y
print time.time() - t1

### Generating arrays

There are several methods to generate arrays.  Some common ways include arange (equivalent of the list function range), linspace, random, zeros, ones, mgrid, diag. You can find more info on array creation routines [here](http://docs.scipy.org/doc/numpy/reference/routines.array-creation.html).

In [None]:
from numpy import arange,linspace,random,zeros,ones,mgrid,diag #alternative way to import functions

x = arange(1, 10, 0.5) #start, stop, step
print x

In [None]:
print linspace(-5, 5, 15)

In [None]:
print random.rand(10)

In [None]:
print zeros(10)

In [None]:
print ones((3,3))

In [None]:
x = mgrid[0:5,0:5]
print x

In [None]:
print diag([1,2,3])

In [None]:
print diag([1,1,1])

In [None]:
print diag([1,2,3], k=1)# k sets offset

### Indexing and Slicing Arrays

Indexing and slicing follow Python rules with added dimensions separated by commas.

In [None]:
example_array = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

print example_array.shape

In [None]:
print example_array[1,2]

In [None]:
print example_array[1, :]

Index slicing uses the format lower:upper:step for each dimension.

In [None]:
print example_array[0:3:2,:]

In [None]:
print example_array[:,::2]
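Because each dimension follows the usual Python slicing rules, negative steps and negative indices also work.  The short sketch below is an illustrative addition that reuses the same example_array values; the variable names are made up for this example.

```python
import numpy as np

example_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

reversed_rows = example_array[::-1, :]   # rows in reverse order
last_two_cols = example_array[:, -2:]    # last two columns of every row

print(reversed_rows)
print(last_two_cols)
```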

### Fancy indexing

Fancy indexing (or fancy slicing) allows selection of subsets of an array. A list of integers serves as an index for the array.

In [None]:
matrix = np.array([[n+m*10 for n in range(5)] for m in range(5)])
    
print matrix

In [None]:
print matrix[[0,2,4]]

In [None]:
print matrix[:,[0,2,4]]

In [None]:
print matrix[[0,2,4],0:3:2]

In [None]:
print matrix[[0,4,2]] #You can change the order

In [None]:
print matrix[:,[0,2,-2]]#negative numbers index from the end

In [None]:
matrix[[0,2,4],0:3:2] += 5
print matrix
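One point worth checking for yourself: assigning through a fancy index (as in the cell above) changes the original array, but the array returned by a fancy index is a new array with its own data.  The sketch below is an added illustration; the name subset is hypothetical.

```python
import numpy as np

matrix = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

subset = matrix[[0, 2, 4]]   # fancy indexing: new array holding rows 0, 2, 4
subset[0, 0] = 99            # changes the new array only

print(subset[0, 0])
print(matrix[0, 0])          # the original value is untouched
```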

### Masks

Applying a comparison (such as > or ==) to an array produces a boolean array of the same shape, called a mask.  Masks can be used to index an array, either to extract the values where the mask is True or to reassign them.

In [None]:
example_array = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print example_array

In [None]:
mask = example_array>3
print mask

In [None]:
print example_array[mask]

In [None]:
example_array[mask] = 0
print example_array
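Masks can also be combined.  The sketch below is an added example, assuming the element-wise operator & (and) is what you want; | (or) and ~ (not) work the same way.  It selects the values that are both greater than 3 and even.

```python
import numpy as np

example_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Parentheses are required: & binds more tightly than the comparisons
mask = (example_array > 3) & (example_array % 2 == 0)

print(mask)
print(example_array[mask])
```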

### Copies and Views

A somewhat confusing aspect of NumPy arrays is the relationship between linked arrays. Simple assignment makes no copy of an array object or of its data.  Instead, both names refer to the same array, so a change made through one name is visible through the other.

In [None]:
a = np.arange(12)
b = a
print b is a

In [None]:
print a
b[2]=4
print a

In [None]:
print b
a[0] = 1
print b

The view( ) method makes a linked view of an array, sometimes called a shallow copy.  The view is a different array object, but it looks at the same data, so changing an element in one changes it in the other.

In [None]:
a = np.arange(12)
b = a
c = a.view()
print b is a
print c is a
print a
print b
print c

In [None]:
print a
c[2]=0
print a

In [None]:
print b.shape
print c.shape
a.shape = (2,6) #changes the shape of the array (more later)
print b.shape
print c.shape

print a
print b
print c

Slicing an array makes a view of the array.  Although it has a different shape, it looks at the same data.

In [None]:
example_array = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
b = example_array[1, :]
print example_array.shape
print b.shape

In [None]:
print example_array
b[0]=1
print example_array
print b

The copy( ) method makes a complete copy.  Elements are no longer linked.

In [None]:
a = np.arange(12)
d = a.copy()
print d
a[0] = 1
print d
print a
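When it is not obvious whether two arrays are linked, NumPy can tell you.  The sketch below is an added illustration that uses np.shares_memory and the base attribute to compare the four cases discussed in this section: plain assignment, view( ), slicing, and copy( ).

```python
import numpy as np

a = np.arange(12)
b = a            # same object, just another name
c = a.view()     # new array object, shared data
s = a[2:6]       # slice: also a view
d = a.copy()     # independent data

print(b is a)
print(np.shares_memory(a, c))
print(np.shares_memory(a, s))
print(np.shares_memory(a, d))
print(c.base is a)   # a view remembers the array that owns its data
```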

# Operations on arrays

### Basic operations

In [None]:
# math with scalars

a = np.array([1, 2, 3, 4])
print a

In [None]:
print a + 1

In [None]:
print 2**a

Math operations can be performed across arrays of the same shape.

In [None]:
a = np.array([1, 2, 3, 4])
b = np.ones(4)+1
c = np.array([1, 2])

print a
print b
print a+b

In [None]:
print a
print b
print a-b

In [None]:
print a
print b
print a*b

In [None]:
print a/b 

In [None]:
print c+b #arrays have incompatible shapes, so this raises a ValueError

### Array broadcasting

NumPy can perform operations on arrays of different shapes using [broadcasting rules](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).  Shapes are compared one dimension at a time, starting from the last axis.  Two dimensions are compatible if:
* they are the same length, or
* one of them has length 1.
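The compatibility check can be sketched in a few lines of plain Python.  The helper shapes_compatible below is hypothetical (it is not a NumPy function); it compares two shapes from the last axis backwards, which is how the broadcasting documentation describes the rule.  np.broadcast is then used to ask NumPy for the resulting shape.

```python
import numpy as np

def shapes_compatible(shape1, shape2):
    """Hypothetical helper: True if the two shapes can be broadcast together."""
    # Compare dimensions from the last axis backwards; zip stops at the
    # shorter shape, so missing leading dimensions count as compatible.
    for n1, n2 in zip(reversed(shape1), reversed(shape2)):
        if n1 != n2 and n1 != 1 and n2 != 1:
            return False
    return True

print(shapes_compatible((4, 3), (3,)))
print(shapes_compatible((4, 1), (3,)))
print(shapes_compatible((4,), (2,)))

# NumPy reports the broadcast result shape for compatible arrays
result_shape = np.broadcast(np.zeros((4, 1)), np.zeros(3)).shape
print(result_shape)
```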

In [None]:
a = np.tile(np.arange(0, 40, 10), (3, 1)).T #T transposes an array - more later

print a.shape
print a

In [None]:
b = np.array([0, 1, 2])

print b.shape
print b

In [None]:
print a + b

If one array has length 1 along an axis, broadcasting stretches that array along the axis to match the other array, and the result takes the longer length.

In [None]:
c = np.arange(0, 40, 10)
c = c[:, np.newaxis]  # adds a new axis -> 2D array - more later
print c

In [None]:
d = np.array([0, 1, 2])
print d

In [None]:
e= c+d
print e

In [None]:
print c.shape
print d.shape
print e.shape

In [None]:
print c-d
print c*d

### Universal functions

Universal functions operate elementwise on an array, producing an array as output.  These include math and trig operations, and comparison functions.  See [here](http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs) for details.

In [None]:
a = np.array([1, 2, 3, 4])
b = np.array([4, 3, 2, 1])
print np.log2(a)

In [None]:
print np.sin(a)

In [None]:
print np.maximum(a,b)
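The comparison functions mentioned above behave the same way.  As a small added sketch, np.greater compares two arrays element by element and gives the same result as the > operator.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([4, 3, 2, 1])

greater = np.greater(a, b)   # element-wise comparison, same as a > b

print(greater)
print(a > b)
```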

There are corresponding [string functions](http://docs.scipy.org/doc/numpy/reference/routines.char.html) that can be used element-wise on an array.  Note that these functions live in a submodule, so they have to be called through it.

In [None]:
fishes = np.array(['Oryzias latipes','Danio rerio','Takifugu rubripes'])

print np.char.find(fishes,'rer') #index of the first match in each string, -1 if not found

### Array reduction

Array reduction functions provide summaries across dimensions of the array.  In NumPy, axis 0 is the vertical axis (down columns), axis 1 is horizontal (across rows).

In [None]:
a = np.array([[1,1], [2,2]])
print a
print a.sum(axis=0)#cols


In [None]:
print a
print a.sum(axis=1) #rows

In [None]:
b = np.array([[1, 2, 3], [5, 6, 7]])
print b
print b.mean()

In [None]:
print b
print b.mean(axis=1)#rows
print b.mean(axis=0)#cols
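A reduction along an axis removes that axis from the result.  The added sketch below uses max( ) instead of sum( ) or mean( ) and prints the shapes to make this visible; the names col_max and row_max are made up for the example.

```python
import numpy as np

b = np.array([[1, 2, 3], [5, 6, 7]])

col_max = b.max(axis=0)   # collapse axis 0: one value per column
row_max = b.max(axis=1)   # collapse axis 1: one value per row

print(b.shape)
print(col_max)
print(col_max.shape)
print(row_max)
print(row_max.shape)
```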

# Changing array shape

### Flipping arrays

One simple way to change an array is to transpose axes.  There are several ways to do this.  These functions return a view of the array.

In [None]:
a = np.array([[n+m*10 for n in range(3)] for m in range(4)])
print a

.T transposes a 2-axis array

In [None]:
print a.T

swapaxes(array,axis 1,axis 2) swaps any 2 axes.

In [None]:
print np.swapaxes(a,1,0)

transpose(array, axes=None) reorders all axes.  The axes argument is a tuple of integers giving the new order of the axes; if it is left as None, the order of the axes is reversed.

In [None]:
print np.transpose(a,(1,0))
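The axes tuple is easier to follow on an array with more than two axes.  The sketch below is an added example using a hypothetical 3-axis array called cube (built with reshape, which is covered later).

```python
import numpy as np

cube = np.arange(24).reshape((2, 3, 4))

moved = np.transpose(cube, (2, 0, 1))   # old axis 2 first, then 0, then 1
print(cube.shape)
print(moved.shape)

# element [i, j, k] of cube is element [k, i, j] of moved
print(cube[1, 2, 3] == moved[3, 1, 2])
```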

There are also functions to flip values in axes: fliplr( ), flipud( )

In [None]:
print np.fliplr(a)

In [None]:
print np.flipud(a)

rot90(array, k=1) rotates the array counterclockwise by 90°, k times.

In [None]:
print np.rot90(a) 
#compare to transpose
print a.T

In [None]:
print np.rot90(a,3)

### Changing the dimensions of arrays

reshape( (tup) ) is used to reshape an array.  It returns a view of the array whenever possible; if the data cannot be reinterpreted in place, it returns a copy instead.

In [None]:
a = np.array([[n+m*10 for n in range(3)] for m in range(4)])
print a
print a.shape

In [None]:
b = a.reshape((2,6))#takes a tuple for its argument
print a
print b
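Whether reshape( ) gives back a view depends on how the data is laid out in memory.  The added sketch below uses np.shares_memory to check two cases: a freshly created array, and the transpose of that array.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))

b = a.reshape((2, 6))      # contiguous data: reshape returns a view
t = a.T.reshape((2, 6))    # transposed data is not contiguous: a copy is made

print(np.shares_memory(a, b))
print(np.shares_memory(a, t))
```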

Array shape can also be changed by direct assignment.

In [None]:
print b
b.shape = (4,3)
print b

Using -1 as an array dimension gives "whatever is needed".

In [None]:
c = np.arange(30)
print c
c.shape = 2,-1,3
print c
print c.shape

ravel( ) creates a one-dimensional view of an array where possible (a copy is made only if needed).

In [None]:
d = np.ravel(c)
print d

flatten( ) creates a one-dimensional copy of an array.

In [None]:
e= c.flatten()
d[2] = 0
print c
print e

Dimensions can be restored.

In [None]:
e.shape = c.shape
print e

### Joining and splitting arrays

There are several functions to join arrays.  hstack( (tup) ) joins arrays horizontally (column-wise), producing a new array.  vstack( ) joins vertically (row-wise) and dstack( ) joins depth-wise (along z axis).

In [None]:
a = np.array([[1,2],[5,6],[9,10]])
b = np.array([[3,4],[7,8],[11,12]])
c = np.hstack((a,b))#tuple for argument
print a
print b
print c

In [None]:
d= np.dstack((c,c))
print d
print d.shape
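vstack( ) works the same way along the other direction.  The added sketch below also uses np.concatenate, which is not covered in the text above; it joins arrays along an axis you name, so axis=1 should match hstack( ) for 2-D arrays.

```python
import numpy as np

a = np.array([[1, 2], [5, 6], [9, 10]])
b = np.array([[3, 4], [7, 8], [11, 12]])

v = np.vstack((a, b))                 # rows of b are placed below rows of a
h = np.concatenate((a, b), axis=1)    # join along axis 1 (columns)

print(v.shape)
print(h.shape)
print(np.array_equal(h, np.hstack((a, b))))
```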

The split functions work similarly. hsplit( ) splits arrays horizontally (column-wise), producing new arrays.  vsplit( ) splits vertically (row-wise) and dsplit( ) splits depth-wise (along z axis). Each takes (array, indices or sections) for arguments. See [split](http://docs.scipy.org/doc/numpy/reference/generated/numpy.split.html#numpy.split) for more information.

In [None]:
a = np.arange(16.0).reshape(4, 4)
print a

In [None]:
b,c =  np.hsplit(a,2)#split into two equal portions
print b
print c

In [None]:
b,c,d = np.hsplit(a,[1,2])#list of indices to split
print b
print c
print d
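For comparison, here is the row-wise counterpart.  This added sketch applies vsplit( ) to the same 4x4 array, once with a number of sections and once with a list of indices; the output names are made up for the example.

```python
import numpy as np

a = np.arange(16.0).reshape(4, 4)

top, bottom = np.vsplit(a, 2)      # two equal blocks of rows
first, rest = np.vsplit(a, [1])    # split before row 1

print(top.shape)
print(bottom.shape)
print(first.shape)
print(rest.shape)
```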

### Adding a dimension

newaxis adds a new axis of length 1 to an array when used in an index.  It is an alias for None.

In [None]:
a = np.arange(4)
print a
print a.shape

In [None]:
b=a[:,np.newaxis]
print b
print b.shape

In [None]:
c=a[np.newaxis,:]
print c
print c.shape
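A common reason to add an axis is to set up broadcasting.  In the added sketch below, a column version and a row version of the same array are multiplied to build a small multiplication table; the names col, row and table are hypothetical.

```python
import numpy as np

a = np.arange(4)

print(np.newaxis is None)   # newaxis is just a readable name for None

col = a[:, np.newaxis]      # shape (4, 1)
row = a[np.newaxis, :]      # shape (1, 4)

table = col * row           # broadcasting gives a (4, 4) table
print(table.shape)
print(table)
```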

### Adding items

NumPy has several functions to add or remove items:  [insert( )](http://docs.scipy.org/doc/numpy/reference/generated/numpy.insert.html#numpy.insert), [delete( )](http://docs.scipy.org/doc/numpy/reference/generated/numpy.delete.html#numpy.delete), [append( )](http://docs.scipy.org/doc/numpy/reference/generated/numpy.append.html#numpy.append).  These functions return a new array (a copy); the original array is not modified.

#### insert( )

In [None]:
a = np.array([10,20,30,40])
np.insert(a,[1,3],50) # insert value 50 before elements [1] and [3]

In [None]:
np.insert(a,[1,3],[50,60]) # insert value 50 before element [1] and value 60 before element [3]

In [None]:
a = np.array([[10,20,30],[40,50,60],[70,80,90]])
np.insert(a, [1,2], 100, axis=0) # insert row with values 100 before row[1] and before row[2]

In [None]:
np.insert(a, [0,1], [[100],[200]], axis=0)

In [None]:
np.insert(a, [0,1], [100,200], axis=1)

#### delete( )

In [None]:
a = np.array([0, 10, 20, 30, 40])
np.delete(a, [2,4]) # remove a[2] and a[4]

In [None]:
a = np.arange(16).reshape(4,4)
print a
b = np.delete(a, np.s_[1:3], axis=0) # remove rows 1 and 2
print b

In [None]:
c = np.delete(a, np.s_[1:3], axis=1) # remove columns 1 and 2
print c

#### append( )

In [None]:
a = np.array([10,20,30,40])
np.append(a,50)

In [None]:
np.append(a,[50,60])

In [None]:
a = np.array([[10,20,30],[40,50,60],[70,80,90]])
np.append(a,[[15,15,15]],axis=0)

In [None]:
np.append(a,[[15],[15],[15]],axis=1)
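Two details are easy to miss with append( ): without an axis argument the inputs are flattened first, and the original array is never changed.  The added sketch below checks both on a small 2x3 array.

```python
import numpy as np

a = np.array([[10, 20, 30], [40, 50, 60]])

flat = np.append(a, [[70, 80, 90]])           # no axis: inputs are flattened
rows = np.append(a, [[70, 80, 90]], axis=0)   # axis=0: a new row is added

print(flat.shape)
print(rows.shape)
print(a.shape)    # the original array is unchanged
```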