![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)
<hr style="margin-bottom: 40px;">

<img src="https://user-images.githubusercontent.com/7065401/39118381-910eb0c2-46e9-11e8-81f1-a5b897401c23.jpeg"
    style="width:300px; float: right; margin: 0 40px 40px 40px;"></img>

# NumPy: Numeric computing library

NumPy (Numerical Python) is one of the core packages for numerical computing in Python. Pandas, Matplotlib, Statsmodels and many other scientific libraries rely on NumPy.

NumPy's major contributions are:

* Efficient numeric computation with C primitives
* Efficient collections with vectorized operations
* An integrated and natural Linear Algebra API
* A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.

Let's expand on efficiency. In Python, **everything is an object**, which means that even simple ints are objects, with all the machinery required to make objects work. We call them "Boxed Ints". In contrast, NumPy uses primitive numeric types (floats, ints), which makes storage and computation efficient.
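
As a rough illustration of the boxed-versus-primitive difference, the sketch below compares one Python int with one array element. The variable names are just for this example, and the exact size of a Python int is a CPython implementation detail that may vary between builds.

```python
import sys
import numpy as np

boxed = [1, 2, 3, 4]                      # a list of four separate int objects
packed = np.array(boxed, dtype=np.int64)  # four raw 8-byte integers in one buffer

# Size of one boxed int (CPython-specific, typically 28 bytes on 64-bit builds)
boxed_item_bytes = sys.getsizeof(boxed[0])

# Size of one element inside the array buffer
packed_item_bytes = packed.itemsize

print(boxed_item_bytes, packed_item_bytes)
print(packed.nbytes)    # total buffer size: 4 elements * 8 bytes
print(packed.strides)   # bytes to step over to reach the next element
```

The array stores only the numeric payload, so per-element overhead stays at `itemsize` regardless of how many elements there are.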

<img src="https://docs.google.com/drawings/d/e/2PACX-1vTkDtKYMUVdpfVb3TTpr_8rrVtpal2dOknUUEOu85wJ1RitzHHf5nsJqz1O0SnTt8BwgJjxXMYXyIqs/pub?w=726&h=396" />


![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Hands on! 

In [1]:
import sys
import numpy as np

## Basic NumPy Arrays

In [2]:
np.array([1, 2, 3, 4])

array([1, 2, 3, 4])

In [3]:
a = np.array([1, 2, 3, 4])

In [4]:
b = np.array([0, .5, 1, 1.5, 2])

In [5]:
a[0], a[1]

(np.int64(1), np.int64(2))

In [6]:
a[0:]

array([1, 2, 3, 4])

In [7]:
a[1:3]

array([2, 3])

In [8]:
a[1:-1]

array([2, 3])

In [9]:
a[::2]

array([1, 3])

In [10]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [11]:
b[0], b[2], b[-1]

(np.float64(0.0), np.float64(1.0), np.float64(2.0))

In [12]:
b[[0, 2, -1]]

array([0., 1., 2.])
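
One detail worth keeping in mind with the two indexing styles above: a basic slice returns a view onto the same data, while indexing with a list of positions returns a new array. A minimal sketch (variable names are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3, 4])

view = a[1:3]        # basic slice: a view onto the same buffer
copy = a[[1, 2]]     # integer-list ("fancy") indexing: a new array

print(np.shares_memory(a, view))
print(np.shares_memory(a, copy))

view[0] = 99         # writes through to `a`
copy[1] = -1         # does not affect `a`
print(a)
```

If you need an independent array from a slice, calling `.copy()` on it is the usual approach.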

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Array Types

In [13]:
a

array([1, 2, 3, 4])

In [14]:
a.dtype

dtype('int64')

In [15]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [16]:
b.dtype

dtype('float64')

In [17]:
np.array([1, 2, 3, 4], dtype=np.float64)

array([1., 2., 3., 4.])

In [None]:
np.array([1, 2, 3, 4], dtype=np.int8)

In [None]:
c = np.array(['a', 'b', 'c'])

In [None]:
c.dtype

In [None]:
d = np.array([{'a': 1}, sys])

In [None]:
d.dtype
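
To summarize the cells in this section, here is a small sketch of how the dtype follows from the data or from an explicit request. The names are only for illustration.

```python
import sys
import numpy as np

ints = np.array([1, 2, 3, 4])
as_float = ints.astype(np.float64)             # explicit conversion, returns a new array
small = np.array([1, 2, 3, 4], dtype=np.int8)  # one byte per element

text = np.array(['a', 'b', 'c'])     # fixed-width Unicode strings
mixed = np.array([{'a': 1}, sys])    # arbitrary Python objects

print(as_float.dtype, small.dtype, small.itemsize)
print(text.dtype.kind)    # 'U' means Unicode string
print(mixed.dtype)        # object: the array only stores references
```

Object arrays lose most of NumPy's efficiency benefits, since every element is a regular boxed Python object again.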

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Dimensions and shapes

In [18]:
A = np.array([[1, 2, 3], [4, 5, 6]])

In [19]:
A.shape

(2, 3)

In [20]:
A.ndim

2

In [21]:
A.size

6

In [None]:
B = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4],
        [3, 2, 1]
    ]
])

In [None]:
B

In [None]:
B.shape

In [None]:
B.ndim

In [None]:
B.size

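If the nested lists don't form a consistent shape, NumPy can't build a regular numeric array. Recent NumPy versions raise a `ValueError` unless you pass `dtype=object`, in which case the array just holds regular Python objects:

In [None]:
C = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4]
    ]
], dtype=object)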

In [None]:
C.dtype

In [None]:
C.shape

In [None]:
C.size

In [None]:
type(C[0])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Indexing and Slicing of Matrices

In [29]:
# Square matrix
A = np.array([
#    0  1  2
    [1, 2, 3], # 0
    [4, 5, 6], # 1
    [7, 8, 9]  # 2
])

In [None]:
A[1]

In [22]:
A[1][0]

np.int64(4)

In [None]:
# A[d1, d2, d3, d4]

In [None]:
A[[0,1], [0,1]]

In [None]:
A[0:2]

In [None]:
A[:, :2]

In [None]:
A[:2, :2]

In [None]:
A[:2, 2:]

In [None]:
A

In [24]:
A[1] = np.array([10, 10, 10])

In [25]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [ 7,  8,  9]])

In [31]:
A[2] = 99

In [32]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [99, 99, 99]])
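
The indexing forms used in this section can be compared side by side. This is a small sketch on a fresh 3x3 matrix; the variable names are only for illustration.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

same = A[1][0] == A[1, 0]          # both select row 1, column 0
diagonal_pair = A[[0, 1], [0, 1]]  # pairs up indices: picks (0, 0) and (1, 1)
top_right = A[:2, 2:]              # rows 0-1, column 2 onward; result stays 2-D

print(same)
print(diagonal_pair)
print(top_right)
```

`A[1, 0]` is generally preferred over `A[1][0]`, because the chained form first builds an intermediate row and then indexes into it.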

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Summary statistics

In [33]:
a = np.array([1, 2, 3, 4])

In [39]:
a.sum()

np.int64(10)

In [35]:
a.mean()

np.float64(2.5)

In [None]:
a.std()

In [None]:
a.var()

In [None]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [None]:
A.sum()

In [None]:
A.mean()

In [None]:
A.std()

In [None]:
A.sum(axis=0)

In [None]:
A.sum(axis=1)

In [None]:
A.mean(axis=0)

In [None]:
A.mean(axis=1)

In [None]:
A.std(axis=0)

In [None]:
A.std(axis=1)
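
The `axis` argument names the dimension that gets collapsed, which is easy to mix up. A short sketch with the same matrix:

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

col_sums = A.sum(axis=0)    # collapse the rows: one total per column
row_sums = A.sum(axis=1)    # collapse the columns: one total per row
row_means = A.mean(axis=1)  # one mean per row

print(col_sums)   # [12 15 18]
print(row_sums)   # [ 6 15 24]
print(row_means)  # [2. 5. 8.]
```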

And [many more](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html#array-methods)...

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Broadcasting and Vectorized operations

In [50]:
a = np.arange(4)

In [51]:
a

array([0, 1, 2, 3])

In [52]:
a + 10

array([10, 11, 12, 13])

In [None]:
a * 10

In [None]:
a

In [54]:
a += 100

In [55]:
a

array([100, 101, 102, 103])

In [57]:
l = [0, 1, 2, 3]

In [58]:
for i in l:
    i = i * 10
    print(i)

0
10
20
30


In [59]:
[i * 10 for i in l]

[0, 10, 20, 30]

In [None]:
a = np.arange(4)

In [None]:
a

In [None]:
b = np.array([10, 10, 10, 10])

In [None]:
b

In [None]:
a + b

In [None]:
a * b
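
The examples above combine an array with a scalar or with an array of the same shape. Broadcasting also applies when shapes differ but are compatible; the sketch below, with made-up arrays, stretches a row and a column across a 2x3 matrix.

```python
import numpy as np

M = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])     # shape (3,): added to every row
col = np.array([[100], [200]])   # shape (2, 1): added to every column

plus_row = M + row
plus_col = M + col

print(plus_row)
print(plus_col)
```

No copies of `row` or `col` are written out by hand; NumPy lines the shapes up from the trailing dimension and repeats the size-1 dimensions as needed.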

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Boolean arrays
_(Also called masks)_

In [60]:
a = np.arange(4)

In [61]:
a

array([0, 1, 2, 3])

In [63]:
a[0], a[-1]

(np.int64(0), np.int64(3))

In [62]:
a[[0, -1]]

array([0, 3])

In [None]:
a[[True, False, False, True]]

In [None]:
a

In [64]:
a >= 2

array([False, False,  True,  True])

In [68]:
a[a >= 2]

array([2, 3])

In [69]:
a.mean()

np.float64(1.5)


In [71]:
a[a > a.mean()]

array([2, 3])

In [None]:
a[~(a > a.mean())]

In [None]:
a[(a == 0) | (a == 1)]

In [72]:
a[(a <= 2) & (a % 2 == 0)]

array([0, 2])

In [86]:
np.random.seed(3)
np.random.randint(20)

10

In [91]:
A = np.random.randint(100, size=(2, 3, 3))

In [92]:
A

array([[[71, 37, 46],
        [33,  1, 85],
        [74, 99, 91]],

       [[16, 80, 32],
        [16, 18, 75],
        [55, 96, 95]]], dtype=int32)

In [None]:
# A has shape (2, 3, 3), so apply the 3x3 mask to the last two axes
A[:, np.array([
    [True, False, True],
    [False, True, False],
    [True, False, True]
])]

In [93]:
A > 30

array([[[ True,  True,  True],
        [ True, False,  True],
        [ True,  True,  True]],

       [[False,  True,  True],
        [False, False,  True],
        [ True,  True,  True]]])

In [94]:
A[A > 30]

array([71, 37, 46, 33, 85, 74, 99, 91, 80, 32, 75, 55, 96, 95],
      dtype=int32)
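
Masks are useful for more than selecting values. As a small sketch (names are illustrative), the same boolean array can be used to count matches, assign in place, or build a new array with `np.where`:

```python
import numpy as np

a = np.arange(4)         # [0, 1, 2, 3]
mask = a >= 2            # [False, False, True, True]

count = mask.sum()       # True counts as 1, so this counts the matches

clipped = a.copy()
clipped[mask] = 0        # assign to every position where the mask is True

replaced = np.where(mask, -1, a)   # -1 where True, original value elsewhere

print(count)
print(clipped)
print(replaced)
```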

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Linear Algebra

In [None]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [None]:
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])

In [None]:
A.dot(B)

In [None]:
A @ B

In [None]:
B.T

In [None]:
A

In [None]:
B.T @ A
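
For matrix products the inner dimensions have to agree. The sketch below reuses the two matrices from this section to show the resulting shapes, plus what happens when the shapes don't line up.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # shape (3, 3)
B = np.array([[6, 5], [4, 3], [2, 1]])            # shape (3, 2)

AB = A @ B        # (3, 3) @ (3, 2) -> (3, 2)
BtA = B.T @ A     # (2, 3) @ (3, 3) -> (2, 3)

print(AB)
print(BtA)

try:
    B @ A         # (3, 2) @ (3, 3): inner dimensions 2 and 3 differ
except ValueError as exc:
    print("shape mismatch:", exc)
```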

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Size of objects in Memory

### Int, floats

In [95]:
# A small Python int takes more than 24 bytes
sys.getsizeof(1)

28

In [96]:
# Large integers take even more space
sys.getsizeof(10**100)

72

In [97]:
# NumPy's default integer is much smaller (8 bytes here)
np.dtype(int).itemsize

8

In [98]:
# NumPy's 8-bit integer is smaller still
np.dtype(np.int8).itemsize

1

In [99]:
np.dtype(float).itemsize

8

### Lists are even larger

In [100]:
# A one-element list
sys.getsizeof([1])

64

In [101]:
# An array of one element in numpy
np.array([1]).nbytes

8

### And performance is also important

In [102]:
l = list(range(100000))

In [103]:
a = np.arange(100000)

In [104]:
%time np.sum(a ** 2)

CPU times: total: 0 ns
Wall time: 916 μs


np.int64(333328333350000)

In [105]:
%time sum([x ** 2 for x in l])

CPU times: total: 15.6 ms
Wall time: 9.4 ms


333328333350000
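
`%time` only works inside IPython and measures a single run. Outside a notebook, one possible way to make the same comparison is the standard library's `timeit` module. This is a sketch: the run count of 20 is an arbitrary choice, and the `dtype` is set explicitly as a precaution against overflow on platforms where the default integer is 32-bit.

```python
import timeit
import numpy as np

n = 100_000
l = list(range(n))
a = np.arange(n, dtype=np.int64)

# Both approaches compute the same sum of squares
numpy_result = int(np.sum(a ** 2))
python_result = sum(x ** 2 for x in l)
print(numpy_result == python_result)

# Total time for 20 repetitions of each approach
numpy_time = timeit.timeit(lambda: np.sum(a ** 2), number=20)
python_time = timeit.timeit(lambda: sum(x ** 2 for x in l), number=20)
print(f"NumPy: {numpy_time:.4f}s  pure Python: {python_time:.4f}s")
```

Absolute timings depend on the machine, so only the relative difference is meaningful.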

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Useful NumPy functions

### `random` 

In [None]:
np.random.random(size=2)

In [None]:
np.random.normal(size=2)

In [None]:
np.random.rand(2, 4)
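
The functions above use NumPy's legacy global random state. Newer NumPy code often uses a `Generator` created with `np.random.default_rng`, which keeps the seed local to one object. A minimal sketch, where the seed value 42 is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

uniform = rng.random(size=2)              # uniform in [0, 1), like np.random.random(size=2)
normal = rng.normal(size=2)               # standard normal samples
grid = rng.random(size=(2, 4))            # like np.random.rand(2, 4)
ints = rng.integers(0, 100, size=(3, 3))  # integers in [0, 100)

# A second generator with the same seed reproduces the same stream
again = np.random.default_rng(seed=42).random(size=2)

print(uniform, again)
print(grid.shape, ints.shape)
```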

---
### `arange`

In [None]:
np.arange(10)

In [None]:
np.arange(5, 10)

In [None]:
np.arange(0, 1, .1)

---
### `reshape`

In [None]:
np.arange(10).reshape(2, 5)

In [None]:
np.arange(10).reshape(5, 2)
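
`reshape` only rearranges the same elements, so the new shape has to account for all of them. As a small sketch, one dimension can be given as `-1` to let NumPy infer it, and an incompatible shape raises an error:

```python
import numpy as np

flat = np.arange(10)

two_rows = flat.reshape(2, -1)   # -1 lets NumPy infer 5 columns
print(two_rows.shape)
print(two_rows)

try:
    flat.reshape(3, 4)           # 10 elements cannot fill 12 slots
except ValueError as exc:
    print("cannot reshape:", exc)
```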

---
### `linspace`

In [None]:
np.linspace(0, 1)  # start and stop are required; num defaults to 50

In [None]:
np.linspace(0, 1, 5)

In [None]:
np.linspace(0, 1, 20)

In [None]:
np.linspace(0, 1, 20, False)
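
The fourth positional argument in the last call is `endpoint`. The sketch below spells it out by keyword and contrasts `linspace` (you choose the number of points) with `arange` (you choose the step):

```python
import numpy as np

closed = np.linspace(0, 1, 5)                  # includes the stop value
half_open = np.linspace(0, 1, 5, endpoint=False)  # excludes the stop value
stepped = np.arange(0, 1, 0.25)                # step-based, stop always excluded

print(closed)     # [0.   0.25 0.5  0.75 1.  ]
print(half_open)  # [0.  0.2 0.4 0.6 0.8]
print(stepped)    # [0.   0.25 0.5  0.75]
```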

---
### `zeros`, `ones`, `empty`

In [None]:
np.zeros(5)

In [None]:
np.zeros((3, 3))

In [None]:
np.zeros((3, 3), dtype=np.int32)

In [None]:
np.ones(5)

In [None]:
np.ones((3, 3))

In [None]:
np.empty(5)

In [None]:
np.empty((2, 2))
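
Unlike `zeros` and `ones`, `np.empty` only allocates memory and doesn't initialize it, so whatever values it shows are arbitrary. A short sketch of the typical pattern, where the buffer is filled right after allocation:

```python
import numpy as np

zeros = np.zeros((2, 2))   # float64 by default
ones = np.ones((2, 2), dtype=np.int32)

buf = np.empty((2, 2))     # contents are whatever happened to be in memory
buf.fill(7.0)              # overwrite every element before reading it

print(zeros.dtype, ones.dtype)
print(buf)
```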

---
### `identity` and `eye`

In [None]:
np.identity(3)

In [None]:
np.eye(3)  # the number of rows is required; columns default to the same value

In [None]:
np.eye(8, 4)

In [106]:
np.eye(8, 4, k=1)

array([[0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [107]:
np.eye(8, 4, k=-3)

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.]])
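
On a small square matrix the effect of `k` is easier to see. The sketch below also checks that `identity(n)` and `eye(n)` agree, and that multiplying by the identity leaves a matrix unchanged. The matrix `M` is made up for the example.

```python
import numpy as np

I = np.identity(3)
same = np.array_equal(I, np.eye(3))   # identity(n) matches eye(n)

upper = np.eye(3, k=1)    # ones on the first diagonal above the main one
lower = np.eye(3, k=-1)   # ones on the first diagonal below the main one

M = np.arange(9).reshape(3, 3)
unchanged = np.array_equal(M @ I, M)  # the identity is neutral for @

print(same, unchanged)
print(upper)
print(lower)
```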

In [108]:
np.flip(np.arange(4))  # flip needs an array argument

array([3, 2, 1, 0])

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)