# NumPy

> why you would want to use arrays instead of lists

NumPy's arrays are more compact than Python lists: a nested list holding a million values would take at least 20 MB or so in Python, while a NumPy 3D array with single-precision floats in the cells would fit in 4 MB. Reading and writing items is also faster with NumPy.

Maybe you don't care that much for just a million cells, but you definitely would for a billion cells. Neither approach would fit in a 32-bit architecture, but on a 64-bit build NumPy would get away with 4 GB or so, while Python alone would need at least about 12 GB (lots of pointers, which double in size). That means a much costlier piece of hardware!

The difference is mostly due to "indirectness". A Python list is an array of pointers to Python objects: on a 32-bit build, at least 4 bytes per pointer plus 16 bytes for even the smallest Python object (4 for the type pointer, 4 for the reference count, 4 for the value, and the memory allocator rounds up to 16). A NumPy array is an array of uniform values: single-precision numbers take 4 bytes each, double-precision ones 8 bytes. Less flexible, but you pay substantially for the flexibility of standard Python lists!
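
As a rough check of the figures above, the sketch below compares a million single-precision values in a NumPy array with the same number of floats in a plain list. The variable names are illustrative, and the list figure depends on the Python build: a 64-bit CPython will typically report more than the 32-bit numbers quoted above.

```python
import sys
import numpy as np

n = 1_000_000

# NumPy: one contiguous buffer of 4-byte floats.
arr = np.zeros(n, dtype=np.float32)
array_bytes = arr.nbytes

# Plain list: the list's own pointer array plus one float object per element.
py_list = [float(i) for i in range(n)]
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

print(f"NumPy array: {array_bytes / 1e6:.1f} MB")
print(f"Python list: {list_bytes / 1e6:.1f} MB")
```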

>conda install numpy

We'll cover only the basics of NumPy: vectors, arrays, matrices, and random number generation.

In [None]:
import numpy as np

### From a Python List

In [None]:
np.array(my_list)
np.array(my_matrix)
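
A minimal sketch of the two calls above, assuming `my_list` is a flat list and `my_matrix` is a list of equal-length rows (the values here are made up for illustration):

```python
import numpy as np

my_list = [1, 2, 3]
my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

vec = np.array(my_list)    # 1-D array (a vector)
mat = np.array(my_matrix)  # 2-D array (a matrix)

print(vec.shape, mat.shape)
```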

### Built-in Methods

In [None]:
np.arange(start,end)           # start inclusive, end exclusive
np.arange(start,end,increment) # increment is the step between values
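
For instance, with illustrative bounds, `np.arange` behaves like Python's `range`: the end value is excluded.

```python
import numpy as np

digits = np.arange(0, 10)    # 0 through 9; 10 is excluded
evens = np.arange(0, 11, 2)  # step of 2, so 10 is included because end is 11

print(digits)
print(evens)
```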

In [None]:
np.zeros(n)
np.zeros((n,m)) # tuple or list

np.ones(n)
np.ones((n,m))
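
A small sketch with made-up sizes. Note that both functions return floating-point arrays unless a `dtype` is passed.

```python
import numpy as np

zeros_1d = np.zeros(3)      # shape (3,)
zeros_2d = np.zeros((2, 3)) # shape (2, 3); the shape is passed as one tuple
ones_2d = np.ones((2, 3))

print(zeros_2d)
print(ones_2d.dtype)
```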

In [None]:
np.linspace(start,end,num) # num = number of evenly spaced samples; end is inclusive
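
The third argument of `np.linspace` is a sample count, not a step size, which is the main difference from `np.arange`. A short illustration with arbitrary bounds:

```python
import numpy as np

points = np.linspace(0, 10, 5)  # 5 samples from 0 to 10, both ends included

print(points)  # [ 0.   2.5  5.   7.5 10. ]
```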

In [None]:
np.eye(n) # n*n identity matrix
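
As a quick illustration (the matrix `m` is made up), multiplying by the identity matrix leaves a matrix unchanged:

```python
import numpy as np

identity = np.eye(3)              # 3x3, ones on the diagonal, zeros elsewhere
m = np.arange(9.0).reshape(3, 3)  # arbitrary 3x3 matrix
product = identity @ m            # matrix product

print(identity)
```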

In [None]:
np.random.rand(n)    # uniform distribution over [0, 1)
np.random.rand(m,n)

np.random.randn(n)   # standard normal distribution (mean 0, variance 1)
np.random.randn(m,n)

np.random.randint(low,high)   # one integer from low (inclusive) to high (exclusive)
np.random.randint(low,high,n)

np.random.seed(42)
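
The sketch below shows what seeding buys you: resetting the seed replays the same sequence. The sizes and bounds are arbitrary. Exact values are not shown because they depend on the generator.

```python
import numpy as np

np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)          # same seed again
second = np.random.rand(3)  # identical to first

dice = np.random.randint(1, 7, 10)  # ten integers from 1 to 6
normal = np.random.randn(2, 3)      # 2x3 draws from the standard normal

print(np.array_equal(first, second))
```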

In [None]:
my_arr.dtype
my_arr.shape
my_arr.reshape(n,m)

my_arr.max()
my_arr.argmax()

my_arr.min()
my_arr.argmin()
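
A short example of these attributes and methods on made-up data. `reshape` needs the new shape to hold exactly the same number of elements, and `argmax`/`argmin` return positions rather than values.

```python
import numpy as np

my_arr = np.arange(12)
reshaped = my_arr.reshape(3, 4)  # 12 elements -> 3 rows x 4 columns

values = np.array([3, 9, 1, 7])
largest, where_largest = values.max(), values.argmax()    # 9 at index 1
smallest, where_smallest = values.min(), values.argmin()  # 1 at index 2

print(my_arr.dtype, reshaped.shape)
```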

###  Bracket Indexing and Selection

In [None]:
# arr is numpy array
arr[n]                # Get the value at index n
arr[0:5]              # Get values in a range (end exclusive)
arr[0:5]=100          # Set every value in the range (broadcasting); modifies arr in place
arr_copy = arr.copy() # Slices are views, so ask for a copy explicitly
arr_2d[row,col]       # Single element; arr_2d[row][col] also works
arr_2d[:2,1:]         # Rows 0-1, columns 1 onward (top-right corner); shape (2,2) for a 3x3 array


### Conditional Selection

In [None]:
bool_arr = arr>4  # boolean array: True where the element is greater than 4
arr[bool_arr]     # keeps only the elements where bool_arr is True
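
A minimal worked example on made-up data. The comparison yields a boolean mask of the same shape, and indexing with the mask keeps only the matching elements.

```python
import numpy as np

arr = np.arange(1, 11)   # 1 through 10
bool_arr = arr > 4       # element-wise comparison
selected = arr[bool_arr] # same as arr[arr > 4]

print(bool_arr)
print(selected)
```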

### Arithmetic


In [None]:
arr + arr
arr * arr
arr - arr
arr / arr # 0/0 gives nan with a RuntimeWarning, not an exception
1 / arr   # 1/0 gives inf, also with a RuntimeWarning
arr**3
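
Arithmetic is element-wise, and division by zero produces `nan` or `inf` with a warning instead of raising. The sketch below uses an illustrative array that contains a zero and silences the warnings with `np.errstate` to keep the output clean.

```python
import numpy as np

arr = np.arange(0, 4)  # [0, 1, 2, 3]

doubled = arr + arr    # element-wise addition
cubed = arr ** 3       # element-wise power

# Suppress the RuntimeWarnings that 0/0 and 1/0 would otherwise emit.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = arr / arr  # first element is 0/0 -> nan
    recip = 1 / arr    # first element is 1/0 -> inf

print(ratio)
print(recip)
```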

### Universal Array Functions

In [None]:
np.sqrt(arr)
np.exp(arr)
np.sin(arr)
np.log(arr)
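
Universal functions apply to every element without an explicit loop. A brief illustration with inputs chosen so the results are easy to check:

```python
import numpy as np

arr = np.array([1.0, 4.0, 9.0])

roots = np.sqrt(arr)                        # [1., 2., 3.]
round_trip = np.log(np.exp(arr))            # log undoes exp, up to rounding
sines = np.sin(np.array([0.0, np.pi / 2]))  # approximately [0, 1]

print(roots)
```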

### Summary Statistics on Arrays

In [None]:
arr.sum()
arr.mean()

arr.max()
arr.min()

arr.var()
arr.std()
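
One detail worth knowing: `var()` and `std()` default to the population formulas (dividing by `n`). Pass `ddof=1` for the sample versions. A small check on made-up data:

```python
import numpy as np

arr = np.array([1.0, 2.0, 3.0, 4.0])

total = arr.sum()             # 10.0
average = arr.mean()          # 2.5
pop_var = arr.var()           # divides by n (ddof=0)
sample_var = arr.var(ddof=1)  # divides by n - 1
pop_std = arr.std()           # square root of pop_var

print(total, average, pop_var, sample_var)
```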

### Axis Logic
![axis_logic.png](attachment:axis_logic.png)

In [None]:
arr_2d
arr_2d.shape
arr_2d.sum(axis=0) # collapses the rows (moves down vertically): one sum per column
arr_2d.sum(axis=1) # collapses the columns (moves across horizontally): one sum per row
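
A concrete case helps here. The axis you pass is the one that disappears from the result. The array below is made up for illustration.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])  # shape (2, 3)

col_sums = arr_2d.sum(axis=0)   # collapse rows: [1+4, 2+5, 3+6]
row_sums = arr_2d.sum(axis=1)   # collapse columns: [1+2+3, 4+5+6]

print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
```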