In [1]:
name = '2017-09-25-numpy-intro'
title = 'Intro to NumPy'
tags = 'numpy, basics'
author = 'Denis Sergeev'

In [2]:
from nb_tools import connect_notebook_to_post
from IPython.core.display import HTML

html = connect_notebook_to_post(name, title, tags, author)

Python's basic built-in data types include

* high-level number objects: integer, floating point, complex, etc.
* containers: lists (cheap insertion and append), dictionaries (fast lookup)

### Lists

* One-dimensional
* Can contain items of different types
* Mutable, i.e. items can be added, dropped, or replaced
* Similar to MATLAB's cell arrays

In [3]:
my_collection = [1, 4, 6, 10]

In [4]:
my_collection.append(100000)

In [5]:
my_collection.remove(1)

In [6]:
my_collection[1] = 'abcdef'

In [7]:
my_collection

[4, 'abcdef', 10, 100000]

### Zero-based indexing

In [8]:
a = [10, 20, 30, 40, 50, 60, 70]

In [9]:
low, high = 2, 4

In [10]:
a[:low]

[10, 20]

In [11]:
a[low:high]

[30, 40]

In [12]:
a[high:]

[50, 60, 70]

Slicing works with any object that supports indexing, including strings:

In [13]:
s = 'qwerty'

In [14]:
s[1:-1]

'wert'
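The slices above are half-open: the start index is included and the end index is excluded. A minimal sketch of two consequences of that convention, reusing the list from the cells above (the names `middle` and `rebuilt` are only for illustration):

```python
a = [10, 20, 30, 40, 50, 60, 70]
low, high = 2, 4

# A slice a[low:high] includes index low and excludes index high,
# so its length is simply high - low.
middle = a[low:high]
print(len(middle) == high - low)

# Adjacent slices share a boundary index without overlapping,
# so they reassemble the original sequence exactly.
rebuilt = a[:low] + a[low:high] + a[high:]
print(rebuilt == a)
```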

#### Another example

Given a 2D image, `img`, stored in row-major order, we want to find the linear position in the array of the element at position (x, y). Using **zero**-based indexing, that linear position is simply `img[y * width + x]`, whereas with **one**-based indexing (written here in MATLAB-style syntax) it is `img((y - 1) * width + x)`. Now there is a `-1` in there!
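The zero-based formula can be checked with a small sketch. The image size and the helper `linear_index` are hypothetical, chosen only for illustration; the "image" is a flat list whose values equal their own positions:

```python
width, height = 4, 3  # hypothetical image size

# Flat row-major storage: row 0 first, then row 1, and so on.
img = list(range(width * height))

def linear_index(x, y, width):
    """Return the flat position of column x, row y (both zero-based)."""
    return y * width + x

# Rebuild the rows explicitly to compare against the formula.
rows = [img[r * width:(r + 1) * width] for r in range(height)]

x, y = 2, 1
pos = linear_index(x, y, width)
print(img[pos] == rows[y][x])
```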

### Why not use lists?

* Lists in Python are very general: they can hold arbitrary objects as elements.

* Addition and scalar multiplication are defined for lists, but they do not do what we want for numerical computation.

Addition results in concatenation:

In [15]:
x = [1, 2, 3]
y = [10, 20, 30]
x + y

[1, 2, 3, 10, 20, 30]

And multiplication results in repeating:

In [16]:
x = [2, 3]
x * 3

[2, 3, 2, 3, 2, 3]
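To get numerical behaviour out of plain lists, the element-wise operations have to be spelled out by hand. A minimal sketch (the names `elementwise_sum` and `scaled` are illustrative):

```python
x = [1, 2, 3]
y = [10, 20, 30]

# Element-wise sum: pair items up with zip and add each pair.
elementwise_sum = [xi + yi for xi, yi in zip(x, y)]

# Scalar multiplication: scale every item individually.
scaled = [3 * xi for xi in x]

print(elementwise_sum)
print(scaled)
```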

## Enter NumPy arrays

### Aside: import conventions

In [17]:
import numpy as np

### NumPy arrays

***NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy, dimensions are called axes. The number of axes is called the rank.***

**Why it is useful:** Memory-efficient container that provides fast numerical operations.
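A short sketch of what "homogeneous" and "axes" mean in practice, and of how array arithmetic differs from the list behaviour shown earlier. The variable names are illustrative only:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20, 30])

# Arithmetic is applied element by element.
total = x + y
scaled = x * 3

# All elements share one type; mixing ints and floats upcasts to float.
mixed = np.array([1, 2.5, 3])

# A 2D array has two axes; shape gives the length along each axis.
table = np.zeros((2, 3))
print(total, scaled, mixed.dtype, table.ndim, table.shape)
```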

#### Let's compare it to list operations

In [18]:
l = list(range(1000))

In [19]:
%timeit [i**2 for i in l]

266 µs ± 1.13 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [20]:
a = np.arange(1000)

In [21]:
%timeit a**2

1.15 µs ± 26.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
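`%timeit` is an IPython magic, so it only works in a notebook or IPython shell. Outside IPython, a rough equivalent can be sketched with the standard-library `timeit` module. The timings depend on the machine, so no numbers are quoted here; `n_runs` is an arbitrary choice:

```python
import timeit
import numpy as np

values = list(range(1000))
arr = np.arange(1000)

# Both approaches compute the same squares.
squares_list = [i**2 for i in values]
squares_arr = arr**2

# timeit.timeit returns the total time in seconds for `number` runs.
n_runs = 1000
t_list = timeit.timeit(lambda: [i**2 for i in values], number=n_runs)
t_arr = timeit.timeit(lambda: arr**2, number=n_runs)
print(f"list:  {t_list / n_runs * 1e6:.2f} µs per loop")
print(f"array: {t_arr / n_runs * 1e6:.2f} µs per loop")
```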


### Creating arrays

#### Manually

In [22]:
a = np.array([3, 4, 5, 6])

In [23]:
a

array([3, 4, 5, 6])

The class of the array is called `ndarray`:

In [24]:
type(a)

numpy.ndarray

In [25]:
a.ndim

1

A scalar (zero-dimensional) array:

In [26]:
a0 = np.array(7)

In [27]:
a0.ndim

0

In [28]:
b = np.array([[10, 20, 30], [9, 8, 7]])

In [29]:
b

array([[10, 20, 30],
       [ 9,  8,  7]])

In [30]:
c = np.array([[[1], [2]], [[3], [4]]])

In [31]:
c.shape

(2, 2, 1)

In [32]:
c.max()

4

Note that `c.shape` is equivalent to `size(c)` in MATLAB.
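For reference, the attributes that describe an array's dimensions can be compared side by side. A small sketch using the same 3D array; the MATLAB names in the comments are given only as rough analogues:

```python
import numpy as np

c = np.array([[[1], [2]], [[3], [4]]])

print(c.ndim)    # number of axes
print(c.shape)   # length along each axis, like MATLAB's size(c)
print(c.size)    # total number of elements, like MATLAB's numel(c)
print(len(c))    # length along the first axis only
```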

#### Common mistakes

In [33]:
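try:
    # WRONG: the values must be passed as a single sequence.
    # Older NumPy raises ValueError here, newer releases raise TypeError.
    a = np.array(1,2,3,4)
except (ValueError, TypeError) as e:
    print(e)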

only 2 non-keyword arguments accepted


In [34]:
a = [1,2,3,4]

In [35]:
a = np.array(a) # RIGHT

In [36]:
b = a.copy()

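Do not call the `np.ndarray` class directly to create an array. It is a low-level constructor that interprets its arguments as a shape and leaves the contents uninitialised; use `np.array` or one of the creation functions below instead.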
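A minimal sketch of the difference, with illustrative variable names. The values in `raw` are whatever happened to be in memory, so they are not printed:

```python
import numpy as np

# Low-level constructor: allocates memory but does not initialise it,
# and the argument is interpreted as a shape, not as data.
raw = np.ndarray((2, 3))

# Preferred: build from existing data or use a creation function.
from_data = np.array([2, 3])
zeros = np.zeros((2, 3))

print(raw.shape, from_data.shape, zeros.shape)
```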

### Functions for creating arrays

#### evenly spaced

In [37]:
np.arange(1, 9, 2) # start, end (exclusive), step

array([1, 3, 5, 7])

#### by a number of points

In [38]:
np.linspace(0, 1, 6)   # start, end (inclusive), number of points

array([ 0. ,  0.2,  0.4,  0.6,  0.8,  1. ])

In [39]:
np.logspace(-3, 2, 7)  # start exponent, end exponent (inclusive), number of points

array([  1.00000000e-03,   6.81292069e-03,   4.64158883e-02,
         3.16227766e-01,   2.15443469e+00,   1.46779927e+01,
         1.00000000e+02])
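One way to read `np.logspace` is as `np.linspace` applied to the exponents. The sketch below checks that relationship for the call above, assuming the default base of 10:

```python
import numpy as np

# logspace(start, stop, num) returns base**exponent for num exponents
# evenly spaced between start and stop (inclusive); base defaults to 10.
direct = np.logspace(-3, 2, 7)
via_linspace = 10.0 ** np.linspace(-3, 2, 7)

print(np.allclose(direct, via_linspace))
```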

#### filled with a specific number

* Zeros

In [40]:
np.zeros((2, 3))

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

* Ones

In [41]:
np.ones((3, 2))

array([[ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.]])

* Empty

In [42]:
np.empty([2,3])

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

The function `empty` creates an array without initialising its contents, so the values are arbitrary and depend on the state of the memory (the ones shown above are just leftovers). Do not rely on them. By default, the dtype of the created array is float64.
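A typical pattern, sketched under the assumption that every element will be overwritten before it is read (the name `buf` and the fill value are illustrative):

```python
import numpy as np

buf = np.empty((2, 3))   # contents are arbitrary at this point
print(buf.dtype)

# Assign every element before reading from the array.
buf[:] = 7.0
print(buf)
```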

* Random numbers

In [43]:
np.random.seed(1234)

In [44]:
np.random.rand(4)       # uniform in [0, 1)

array([ 0.19151945,  0.62210877,  0.43772774,  0.78535858])

In [45]:
np.random.randn(4)      # standard normal (Gaussian)

array([-0.72058873,  0.88716294,  0.85958841, -0.6365235 ])
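Newer NumPy releases (1.17 and later) also offer a `Generator` interface, which keeps the random state in an object instead of a global seed. A minimal sketch, assuming such a version is installed; the numbers it produces differ from the legacy outputs above, so none are shown:

```python
import numpy as np

# Seeded generator, independent of np.random.seed.
rng = np.random.default_rng(1234)

uniform = rng.random(4)             # uniform samples in the half-open interval [0, 1)
gaussian = rng.standard_normal(4)   # samples from the standard normal distribution

print(uniform)
print(gaussian)
```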

#### Special cases

In [46]:
np.eye(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [47]:
np.diag(np.array([1, 2, 3, 4]))

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])
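`np.diag` works in both directions: given a 1D array it builds a matrix, and given a 2D array it extracts a diagonal. A short sketch with illustrative names; the optional `k` argument selects an off-diagonal:

```python
import numpy as np

m = np.diag(np.array([1, 2, 3, 4]))         # 1D input: build a 4x4 diagonal matrix
d = np.diag(m)                              # 2D input: extract the main diagonal
upper = np.diag(np.array([1, 2, 3]), k=1)   # k=1 places values on the first superdiagonal

print(d)
print(upper.shape)
```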

### Missing data

In [48]:
a = np.array([1, 2, 3, np.nan])

In [49]:
b = list(a)
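A minimal sketch of how `np.nan` behaves as a missing-value marker. The comments describe the behaviour rather than quoting exact printed output:

```python
import numpy as np

a = np.array([1, 2, 3, np.nan])

print(a.dtype)            # NaN is a float, so the whole array is float64
print(np.isnan(a))        # boolean array marking the missing entries
print(a.mean())           # NaN propagates through ordinary reductions
print(np.nanmean(a))      # NaN-aware reductions skip the missing entries
```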

In [50]:
a = np.array([1, 2, 3])

In [51]:
b = np.ma.masked_less(a, 2)

In [52]:
b

masked_array(data = [-- 2 3],
             mask = [ True False False],
       fill_value = 999999)

In [53]:
b.data

array([1, 2, 3])

In [54]:
b.mask

array([ True, False, False], dtype=bool)
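Masked entries are skipped by reductions, which is the main practical difference from a plain array. A brief sketch using the same masked array as above:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.ma.masked_less(a, 2)   # mask every element smaller than 2

print(b.mean())               # only the unmasked values 2 and 3 contribute
print(b.filled(0))            # replace masked entries with a chosen value
print(b.compressed())         # keep only the unmasked values as a plain array
```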

**We will have a separate session on Masked Arrays in NumPy.**

### Looking for help

* Interactive help

In [55]:
np.rollaxis??

In [56]:
np.*space*?

* With NumPy: a built-in search engine (note that `np.lookfor` was removed in NumPy 2.0, so this only works with older versions)

In [57]:
np.lookfor('create array')

Search results for 'create array'
---------------------------------
numpy.array
    Create an array.
numpy.memmap
    Create a memory-map to an array stored in a *binary* file on disk.
numpy.diagflat
    Create a two-dimensional array with the flattened input as a diagonal.
numpy.fromiter
    Create a new 1-dimensional array from an iterable object.
numpy.partition
    Return a partitioned copy of an array.
numpy.ctypeslib.as_array
    Create a numpy array from a ctypes array or a ctypes POINTER.
numpy.ma.diagflat
    Create a two-dimensional array with the flattened input as a diagonal.
numpy.ma.make_mask
    Create a boolean mask from an array.
numpy.ctypeslib.as_ctypes
    Create and return a ctypes object from a numpy array.  Actually
numpy.ma.mrecords.fromarrays
    Creates a mrecarray from a (flat) list of masked arrays.
numpy.ma.mvoid.__new__
    Create a new masked array from scratch.
numpy.lib.format.open_memmap
    Open a .npy file as a memory-mapped array.
numpy.ma.MaskedArr

## References
* [MATLAB to Python: A Migration Guide](https://www.enthought.com/webinar/python-for-matlab-users/#matlab-white-paper-download)
* [NumPy docs](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)
* [SciPy lectures](http://www.scipy-lectures.org/)

In [58]:
HTML(html)