# numpy (numerical python)

For a more comprehensive introduction to NumPy, see the official tutorial here:  [https://docs.scipy.org/doc/numpy-dev/user/quickstart.html](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html).

In [3]:
import numpy

* The main object in NumPy is the multidimensional array.  In NumPy, dimensions are called *axes*.  The **```numpy.array()```** function creates an array from a sequence of values.  Note the two sets of parentheses **```(())```**: the outer pair belongs to the function call, and the inner pair (or a pair of square brackets) wraps the sequence itself:

In [26]:
array1 = numpy.array(( [1,2,3], [4,5,6], [7,8,9] ))
array2 = numpy.array([ (1,2), (3,4) ]) # brackets and parentheses both work
array3 = numpy.array(( (1,2), (3,4) ))
array4 = numpy.array(( 'apple', 'orange', 'banana' )) # they can hold strings too!

In [27]:
print(array1)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
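
To make the idea of axes concrete, here is a small sketch (the name `demo` is only for illustration) that inspects an array's basic attributes:

```python
import numpy

# A 2-D array: 3 rows (axis 0) by 3 columns (axis 1)
demo = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(demo.ndim)   # number of axes
print(demo.shape)  # length along each axis
print(demo.size)   # total number of elements
print(demo.dtype)  # element type (the exact integer size depends on the platform)
```

A 1-D array has a single axis, so its shape is a one-element tuple such as `(3,)`.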


* If your array contains multiple data types, every element is converted to a single common type (here, the float ```7.0``` becomes the string ```'7.0'```)
* Read more about numpy data types (dtypes) [here](https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html)

In [29]:
array0 = numpy.array(( 'apple', 7.0))
print(array0)
print(type(array0[1]))

['apple' '7.0']
<class 'numpy.str_'>
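
The same conversion happens with numbers only. As a rough sketch (variable names are illustrative), mixing integers and floats gives a float array, and adding a string turns everything into strings:

```python
import numpy

ints_and_floats = numpy.array([1, 2, 3.5])     # integers are upcast to float
numbers_and_text = numpy.array([1, 2.5, 'a'])  # everything becomes a string

print(ints_and_floats.dtype)
print(numbers_and_text.dtype)
print(numbers_and_text)
```

Checking `.dtype` after building an array is a quick way to catch an accidental conversion to strings.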


## NOW YOU TRY

__DEBUGGING IN PYTHON:__
* A line number is given, which is sometimes helpful.  Toggle line numbers on/off in a notebook by typing:   ```esc``` and then ```shift```+```L```
* Here, it says ```invalid syntax```, so we've done something wrong... __can you fix it?__

In [30]:
print(array1) # print function requires () in Python 3.x
print() # print empty line for separator
print(array1.shape)
print()
print(array2)
print()
print(array3
print()
print(array4)

SyntaxError: invalid syntax (<ipython-input-30-ed729f202d23>, line 8)

In [31]:
print("I had an " + array4[0] + " with lunch.") # note that + concatenates the strings before print() displays the result

I had an apple with lunch.


## numpy array copying versus assigning

In [32]:
array5 = array3 # assignment:  array3 and array5 are the same array, where python is concerned
print(array5)

[[1 2]
 [3 4]]


In [34]:
array5 *= 2 # multiply array5 by 2, in place (the output shows a factor of 4, presumably because this cell was run twice)
print(array5)

[[ 4  8]
 [12 16]]


In [35]:
print(array3) # note array3 is the SAME as array5, and changes accordingly

[[ 4  8]
 [12 16]]


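* To avoid this aliasing behavior, make an explicit copy with ```numpy.copy()``` (or the ```ARRAY.copy()``` method).  For general Python objects, the standard library provides [```copy.copy()``` and ```copy.deepcopy()```](https://docs.python.org/2/library/copy.html)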

In [36]:
# avoid this behavior by using numpy.copy
array5 = numpy.copy(array3)
array5 *= 2
print(array5)
print(array3)

[[ 8 16]
 [24 32]]
[[ 4  8]
 [12 16]]
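
One way to check whether two names refer to the same data is sketched below. The variable names are made up for this example; `is` and `numpy.shares_memory()` are the only tools it relies on:

```python
import numpy

original = numpy.array([[1, 2], [3, 4]])
alias = original               # a second name for the same array
independent = original.copy()  # a new array with its own data

alias *= 2          # also changes `original`
independent *= 10   # leaves `original` alone

print(alias is original)                            # True
print(numpy.shares_memory(original, independent))   # False
print(original)
```

If `numpy.shares_memory()` returns `True`, an in-place change to one array can show up in the other.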


## Placeholder arrays:  zeros, ones, and empty

You can create a large placeholder array to fill in later.  This can be done using **```numpy.zeros()```**, **```numpy.ones()```**, or **```numpy.empty()```**:

In [37]:
zeros_array = numpy.zeros((5,5)) # (nrows,ncols)
ones_array = numpy.ones((5,5))
empty_array = numpy.empty((3,3))

In [38]:
print(empty_array)

[[2.37663529e-312 2.29175545e-312 4.99006302e-322]
 [2.41907520e-312 2.33419537e-312 2.14321575e-312]
 [2.05833592e-312 2.05833592e-312 2.05833592e-312]]
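
As the output above suggests, ```numpy.empty()``` does not initialize its contents, so the values are whatever happened to be in memory. A minimal sketch of the "create now, fill in later" pattern follows (names are illustrative; `numpy.full()` is included as a related option for a constant other than 0 or 1):

```python
import numpy

squares = numpy.zeros(5)        # float placeholder, all 0.0
for i in range(squares.size):
    squares[i] = i ** 2         # fill in the values afterwards

sevens = numpy.full((2, 3), 7)  # constant-filled array
scratch = numpy.empty((2, 2))   # arbitrary values until assigned
scratch[:] = 0.5                # overwrite every element

print(squares)
print(sevens)
print(scratch)
```

Only read from an array made with ```numpy.empty()``` after every element has been assigned.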


## Means, sums, and dimensions

* You can take the mean of an array two different ways:
  * ```numpy.mean(ARRAY)```
  * ```ARRAY.mean()```
* Note it also accepts an axis argument; if you don't specify an axis, it flattens the array and takes the mean of everything

In [53]:
# create a 5x5 array of random floats between 0 and 1
random_array = numpy.random.random((5,5))

In [54]:
numpy.mean(random_array, axis=0) # takes mean down columns

array([0.5575241 , 0.46772763, 0.63513285, 0.45329772, 0.4845157 ])

In [55]:
random_array.mean(axis=0)

array([0.5575241 , 0.46772763, 0.63513285, 0.45329772, 0.4845157 ])

In [56]:
print(array1.shape) # same as numpy.shape(array1)
print(array1.mean()) # same as numpy.mean(array1)
print(array1.sum()) # same as numpy.sum(array1)

(3, 3)
5.0
45
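
Because the random array changes on every run, a small fixed array (named `small` here purely for illustration) makes the effect of the axis argument easier to verify by hand:

```python
import numpy

small = numpy.array([[1, 2, 3],
                     [4, 5, 6]])

print(small.mean())        # mean of all six elements -> 3.5
print(small.mean(axis=0))  # one mean per column -> [2.5 3.5 4.5]
print(small.mean(axis=1))  # one mean per row -> [2. 5.]
print(small.sum(axis=1))   # one sum per row -> [ 6 15]
```

The axis you name is the one that gets collapsed: `axis=0` collapses the rows and leaves one value per column.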


## NaNs in numpy
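* A NaN ("not a number") value in NumPy is written ```numpy.nan```
* Ordinary functions such as ```numpy.mean()``` return NaN if any element is NaN.  To ignore NaNs, use ```numpy.nanmean()```, ```numpy.nansum()```, ```numpy.nanstd()```, ```numpy.nanvar()```, ```numpy.nanmin()```, ```numpy.nanmax()```, etc.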

In [75]:
random_array[3,3] = numpy.nan
print(random_array)

[[0.73902065 0.28209815 0.90723246 0.16299982 0.97776084]
 [0.10435583 0.52874617 0.86795418 0.29510036 0.11521195]
 [0.58674807 0.95751604 0.17274166 0.13875863 0.16312933]
 [0.36319015 0.3986568  0.90902882        nan 0.69228615]
 [0.99430581 0.17162099 0.31870713 0.84230233 0.47419021]]


In [76]:
numpy.mean(random_array)

nan

In [77]:
numpy.nanmean(random_array)

0.5068192723693312
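
To find or count NaNs, use ```numpy.isnan()``` rather than ```==```, since NaN never compares equal to anything, including itself. A short sketch with an illustrative array:

```python
import numpy

values = numpy.array([1.0, numpy.nan, 3.0])

print(numpy.nan == numpy.nan)        # False: NaN never equals itself
print(numpy.isnan(values))           # boolean mask marking the NaN entries
print(numpy.isnan(values).sum())     # number of NaNs -> 1
print(values[~numpy.isnan(values)])  # keep only the non-NaN values
print(numpy.nanmean(values))         # mean ignoring NaN -> 2.0
```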

## The ```numpy.arange()``` and ```numpy.linspace()``` functions

* **```numpy.arange()```** is analogous to **```range()```**, but here, an array is returned

In [84]:
print(numpy.arange(10)) # integer arguments give an array with an integer dtype
print(numpy.arange(0, 1.1, 0.1)) # a float argument gives a float dtype; the stop value (1.1) is excluded
print(numpy.linspace(0, 1, 11)) # numpy.linspace(start, stop, num); stop is included by default

[0 1 2 3 4 5 6 7 8 9]
[0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
[0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
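
With a non-integer step, the number of elements ```numpy.arange()``` returns can be affected by floating-point rounding, so ```numpy.linspace()``` is often the safer choice when you know how many points you want. The sketch below (illustrative names) uses the `retstep` and `endpoint` keyword arguments of ```numpy.linspace()```:

```python
import numpy

# 11 points from 0 to 1 inclusive; retstep=True also returns the spacing
points, step = numpy.linspace(0, 1, 11, retstep=True)
print(points.size, step)

# endpoint=False leaves out the stop value, as arange does
half_open = numpy.linspace(0, 1, 10, endpoint=False)
print(half_open)

print(numpy.arange(5).dtype.kind)    # 'i' -> integer dtype
print(numpy.arange(5.0).dtype.kind)  # 'f' -> float dtype
```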


# Operations on arrays

Basic arithmetic operations on NumPy arrays occur *element-wise*.  

In [69]:
A = numpy.array(( [1,2], [3,4], [5,6] ), dtype=float)
print(A)

[[1. 2.]
 [3. 4.]
 [5. 6.]]


In [70]:
B = numpy.array(( [7,8], [9,10], [11,12] ))
print(B)

[[ 7  8]
 [ 9 10]
 [11 12]]


In [71]:
print(A+B)
print(A-B)

[[ 8. 10.]
 [12. 14.]
 [16. 18.]]
[[-6. -6.]
 [-6. -6.]
 [-6. -6.]]


In [72]:
print(A*B) # element-wise multiplication

[[ 7. 16.]
 [27. 40.]
 [55. 72.]]


### Dot products on arrays

The inner dimensions of the two arrays in a dot product must match: ```[i x j] . [j x k] = [i x k]```

__Can you solve the error below?  Hint:  Get the transpose of an array using ```array_name.T``` or ```numpy.transpose(array_name)``` (both return the transposed array and leave the original unchanged)__

In [73]:
print(A.shape)
print(B.shape)
print(numpy.dot(A,B)) # matrix dot product

(3, 2)
(3, 2)


ValueError: shapes (3,2) and (3,2) not aligned: 2 (dim 1) != 3 (dim 0)
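
The shape rule can be checked on separate arrays without giving away the exercise. This is only a sketch: `left` and `right` are illustrative names, filled with ones so the result is easy to predict:

```python
import numpy

left = numpy.ones((2, 3))    # [i x j] = [2 x 3]
right = numpy.ones((3, 4))   # [j x k] = [3 x 4]

product = numpy.dot(left, right)
print(product.shape)         # [i x k] -> (2, 4)
print((left @ right).shape)  # the @ operator gives the same matrix product for 2-D arrays
print(left.T.shape)          # transposing swaps the axes -> (3, 2)
```

Each entry of `product` is a sum over the shared dimension `j`, which is why the inner dimensions have to agree.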

Means, sums, minima, and maxima can be computed quickly using either the **```A.function()```** or the **```numpy.function(A)```** notation:

In [74]:
print(A.mean(),
      A.sum(),
      A.min(),
      A.max())
print(numpy.mean(A),
      numpy.sum(A),
      numpy.min(A),
      numpy.max(A))

3.5 21.0 1.0 6.0
3.5 21.0 1.0 6.0


### Indexing, slicing, iterating

To index a NumPy array, use square brackets **```[i,j]```** for **```[row,column]```**.

(Don't forget zero indexing.)

In [22]:
C = numpy.arange(100).reshape((10,10)) # note array is reshaped upon creation
C[0,0]

0

In [23]:
C[:2,:3] # prints first 2 rows and 3 columns
#C[0:2,0:3] # does the same

array([[ 0,  1,  2],
       [10, 11, 12]])

In [24]:
D = numpy.arange(20)
print(D)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]


Reverse arrays using the **```[::-1]```** syntax:

In [25]:
D[::-1]

array([19, 18, 17, 16, 15, 14, 13, 12, 11, 10,  9,  8,  7,  6,  5,  4,  3,
        2,  1,  0])

Take every Nth element using the **```[::N]```** syntax:

In [26]:
D[::2]

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

To take every nth element within a range, combine all three parts as **```[start:stop:step]```**:

In [27]:
D[0:16:3] # elements 0 through 15, taking every 3rd one

array([ 0,  3,  6,  9, 12, 15])
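
The same slicing rules apply along each axis of a 2-D array, and looping over a 2-D array yields one row at a time. The sketch below uses an illustrative 3x4 array named `grid`; its last lines show that a basic slice is a view onto the original data rather than a copy:

```python
import numpy

grid = numpy.arange(12).reshape(3, 4)

print(grid[1, :])      # the whole second row -> [4 5 6 7]
print(grid[:, -1])     # the last column -> [ 3  7 11]
print(grid[::2, 1:3])  # rows 0 and 2, columns 1 and 2

# iterating over a 2-D array goes row by row
row_sums = [int(row.sum()) for row in grid]
print(row_sums)        # [6, 22, 38]

# basic slices are views: writing through one changes the original
corner = grid[:2, :2]
corner[0, 0] = 99
print(grid[0, 0])      # 99
```

Use `.copy()` on a slice when you need an independent array.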

### Reshaping versus resizing

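The **```numpy.reshape()```** function (or the ```ARRAY.reshape()``` method) returns a *new* array object with the specified shape and leaves the original array's shape unchanged.
The **```ARRAY.resize()```** method resizes the array *in place*.  (The ```numpy.resize()``` function, by contrast, returns a new array.)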

Be aware of this difference when flattening or reshaping arrays.

In [28]:
F = numpy.linspace(0,10,50)
F.shape

(50,)

In [29]:
F.reshape(10,5)

array([[ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653],
       [ 1.02040816,  1.2244898 ,  1.42857143,  1.63265306,  1.83673469],
       [ 2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286],
       [ 3.06122449,  3.26530612,  3.46938776,  3.67346939,  3.87755102],
       [ 4.08163265,  4.28571429,  4.48979592,  4.69387755,  4.89795918],
       [ 5.10204082,  5.30612245,  5.51020408,  5.71428571,  5.91836735],
       [ 6.12244898,  6.32653061,  6.53061224,  6.73469388,  6.93877551],
       [ 7.14285714,  7.34693878,  7.55102041,  7.75510204,  7.95918367],
       [ 8.16326531,  8.36734694,  8.57142857,  8.7755102 ,  8.97959184],
       [ 9.18367347,  9.3877551 ,  9.59183673,  9.79591837, 10.        ]])

In [30]:
F.shape # F has not been rewritten!  Shape is still (50,)

(50,)

In [31]:
F.resize(10,5)
print(F)

[[ 0.          0.20408163  0.40816327  0.6122449   0.81632653]
 [ 1.02040816  1.2244898   1.42857143  1.63265306  1.83673469]
 [ 2.04081633  2.24489796  2.44897959  2.65306122  2.85714286]
 [ 3.06122449  3.26530612  3.46938776  3.67346939  3.87755102]
 [ 4.08163265  4.28571429  4.48979592  4.69387755  4.89795918]
 [ 5.10204082  5.30612245  5.51020408  5.71428571  5.91836735]
 [ 6.12244898  6.32653061  6.53061224  6.73469388  6.93877551]
 [ 7.14285714  7.34693878  7.55102041  7.75510204  7.95918367]
 [ 8.16326531  8.36734694  8.57142857  8.7755102   8.97959184]
 [ 9.18367347  9.3877551   9.59183673  9.79591837 10.        ]]


In [32]:
print(F.shape)
print(F.size)

(10, 5)
50
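
The three variants can be compared side by side on a small array. This is a rough sketch with illustrative names; passing `-1` to `reshape` asks NumPy to work out that axis length from the total size:

```python
import numpy

base = numpy.arange(6)
reshaped = base.reshape(2, -1)  # -1 is inferred as 3 here
print(base.shape, reshaped.shape)

in_place = numpy.arange(6)
in_place.resize(3, 2)           # method: changes in_place itself and returns None
print(in_place.shape)

# function: returns a new array, repeating the data when more elements are needed
repeated = numpy.resize(numpy.arange(6), (2, 4))
print(repeated)
```

Assign the result of `reshape` to a name (for example `F = F.reshape(10, 5)`) if you want to keep the new shape.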
