In [2]:
import numpy as np

# IPython or Jupyter Related

## Put an egg on it

 - Put `?` at the end of almost anything to get its docs

`np.arange?`

 - Tab to autocomplete
 - `%timeit` to time a statement averaged over many runs
 - `%time` to time a single run

In [7]:
np.arange?

In [9]:
arr = np.random.random(10000)

%timeit sum(arr)
%timeit np.sum(arr)

659 µs ± 1.01 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
6.78 µs ± 38.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
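
Outside IPython the `%timeit` magic is not available. The standard-library `timeit` module gives a rough equivalent; the sketch below assumes a repeat count of 100, which is an arbitrary choice and not what `%timeit` picks automatically.

```python
import timeit

import numpy as np

arr = np.random.random(10000)
n_calls = 100  # arbitrary repeat count chosen for this sketch

# timeit.timeit returns the total time in seconds for n_calls calls.
t_builtin = timeit.timeit(lambda: sum(arr), number=n_calls)
t_numpy = timeit.timeit(lambda: np.sum(arr), number=n_calls)

print(f"sum:    {t_builtin / n_calls * 1e6:.1f} µs per call")
print(f"np.sum: {t_numpy / n_calls * 1e6:.1f} µs per call")
```

Exact numbers depend on the machine, so no expected output is shown here.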


# Numpy

## About ufunc

> A universal function (or ufunc for short) is a function that operates on ndarrays in an element-by-element fashion, supporting array broadcasting, type casting, and several other standard features.

The loop over the elements runs in compiled code instead of the Python interpreter, so they are fast.
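
A small sketch of what the quoted definition looks like in practice, using `np.add` (the ufunc behind `+` on arrays). The array names are made up for the illustration.

```python
import numpy as np

a = np.array([1, 2, 3])        # integer array
b = np.array([0.5, 0.5, 0.5])  # float array

# np.add is an instance of np.ufunc.
print(isinstance(np.add, np.ufunc))  # True

# Element by element, with type casting: int + float gives float.
total = np.add(a, b)
print(total, total.dtype)  # [1.5 2.5 3.5] float64

# Broadcasting: the scalar 10 is stretched to the shape of a.
print(np.add(a, 10))  # [11 12 13]

# ufuncs also carry methods such as reduce, which folds the array with the operation.
print(np.add.reduce(a))  # 6
```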

**TODO:** Custom ufunc with or without using CPython
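
One possible starting point for this TODO that needs no compiled code is `np.frompyfunc`, which wraps a plain Python function as a ufunc. This is only a sketch under assumptions: `clipped_diff` is a hypothetical function invented for the example, the result comes back with `object` dtype, and the Python function is still called once per element, so it does not get the speed of a built-in ufunc.

```python
import numpy as np


def clipped_diff(x, y):
    """Hypothetical scalar function: x - y, but never below zero."""
    return max(x - y, 0)


# Wrap the scalar function as a ufunc with 2 inputs and 1 output.
clipped_diff_ufunc = np.frompyfunc(clipped_diff, 2, 1)

a = np.arange(5)
result = clipped_diff_ufunc(a, 2)  # the scalar 2 is broadcast against a

print(isinstance(clipped_diff_ufunc, np.ufunc))  # True
print(result.dtype)                              # object
print(result.astype(int))                        # [0 0 0 1 2]
```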

## axis parameter

There is a nice sentence about it: "The `axis` keyword specifies the dimension of the array that will be collapsed, rather than the dimension that will be returned."

`M.max(axis=1)`

Here the column axis is collapsed: if `M.shape` is `(3, 4)`, the result has shape `(3,)`, one value per row.

In [11]:
M = np.random.random((3, 4))
print(M.shape)
print(M.max(axis=1).shape)

(3, 4)
(3,)
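
To make the "collapsed" axis easier to see, here is a small sketch with concrete values and `sum` instead of `max`. The `keepdims=True` call is an extra not mentioned above; it keeps the collapsed axis as a dimension of size 1.

```python
import numpy as np

M = np.arange(12).reshape((3, 4))

# axis=0 collapses the rows: one value per column remains.
print(M.sum(axis=0))  # [12 15 18 21]

# axis=1 collapses the columns: one value per row remains.
print(M.sum(axis=1))  # [ 6 22 38]

# keepdims=True keeps the collapsed axis with size 1.
print(M.sum(axis=1, keepdims=True).shape)  # (3, 1)
```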


## Broadcasting

Rules of broadcasting:

 - Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
 - Rule 2: If the shapes of the two arrays do not match in some dimension, the array with size 1 in that dimension is stretched to match the other shape.
 - Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.

In [8]:
M = np.ones((2, 3))
a = np.arange(3)
print(M + a)

a = np.arange(3).reshape((3, 1))
b = np.arange(3)
print(a + b)

M = np.ones((3, 2))
a = np.arange(3)

try:
    print(M + a)
except ValueError as e:
    print(e)

[[1. 2. 3.]
 [1. 2. 3.]]
[[0 1 2]
 [1 2 3]
 [2 3 4]]
operands could not be broadcast together with shapes (3,2) (3,) 
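
The first example (`M + a` with shapes `(2, 3)` and `(3,)`) can be replayed one rule at a time. This is a sketch of the idea, not how NumPy does it internally; `a_padded` and `a_stretched` are names introduced only for this illustration.

```python
import numpy as np

M = np.ones((2, 3))
a = np.arange(3)

# Rule 1: a has shape (3,), so it is padded on the left to (1, 3).
a_padded = a[np.newaxis, :]
print(a_padded.shape)  # (1, 3)

# Rule 2: the size-1 dimension is stretched to match M's shape (2, 3).
a_stretched = np.broadcast_to(a_padded, M.shape)
print(a_stretched.shape)  # (2, 3)

# Doing the steps by hand gives the same result as letting NumPy broadcast.
print(np.array_equal(M + a, M + a_stretched))  # True
```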


## Boolean Numpy

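The `&`, `|`, `^` and `~` operators are useful with boolean numpy operations. Note the difference between `&` and `and`: `&` works element by element, while `and` asks for the truth value of each whole array and raises a `ValueError` for arrays with more than one element.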
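
A minimal sketch of that difference, using a made-up range check on a small array:

```python
import numpy as np

arr = np.arange(10)

# & combines the two masks element by element. The parentheses are needed
# because & binds more tightly than the comparison operators.
mask = (arr > 2) & (arr < 7)
print(arr[mask])   # [3 4 5 6]

# ~ inverts the mask.
print(arr[~mask])  # [0 1 2 7 8 9]

# `and` needs a single truth value for each whole array, which is ambiguous.
try:
    (arr > 2) and (arr < 7)
except ValueError as e:
    print("ValueError:", e)
```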

### Reminders

 - `np.any`

 - `np.all`

In [16]:
arr = np.arange(10)

count = np.sum(arr > 5)
print(count)

print(np.any(arr))
print(arr.any())
print(arr.all())

4
True
True
False


## Fancy Indexing

Fancy indexing in numpy refers to passing an array of indices to access multiple array elements at once.

**Note:** The return value reflects the broadcast shape of the indices, rather than the shape of the array being indexed.

 - Broadcasting rules apply when two index arrays are given (like row and col)
 - Fancy indexing can be combined with the other indexing schemes (like boolean masks or slicing)
 - Fancy indexing is very useful for picking random rows from a 2D matrix (like a training-test split)

In [37]:
arr = np.arange(9)
print(arr)

ind = np.array([[0, 2], [6, 8]])
print(arr[ind])

arr = arr.reshape((3, 3))
print(arr)

row = np.array([[0, 0], [2, 2]])
col = np.array([[0, 2], [0, 2]])
print(arr[row, col])

# Broadcasting rules apply when two indices are given (like row & col) : 
print()
print("Broadcasting rules apply when two indices are given (like row & col) : ")

print("Array")
arr = np.arange(16).reshape((4, 4))
print(arr)

print("Indices")
row = np.arange(3)[:, np.newaxis]
col = row.ravel()
print(row)
print(col)

print("Result")
print(arr[row, col])

# Combined Indexing
print()
print("Combined Indexing")

arr = np.arange(25).reshape(5, 5)
print(arr)

row = np.arange(5)[::-1]

print(arr[row, :3])

[0 1 2 3 4 5 6 7 8]
[[0 2]
 [6 8]]
[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[0 2]
 [6 8]]

Broadcasting rules apply when two indices are given (like row & col) : 
Array
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
Indices
[[0]
 [1]
 [2]]
[0 1 2]
Result
[[ 0  1  2]
 [ 4  5  6]
 [ 8  9 10]]

Combined Indexing
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
[[20 21 22]
 [15 16 17]
 [10 11 12]
 [ 5  6  7]
 [ 0  1  2]]
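
The training-test split mentioned in the list above can be sketched with a shuffled index array. The data, the seed and the 7/3 split size are all assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)    # seed chosen arbitrarily
X = np.arange(20).reshape((10, 2))  # 10 made-up samples with 2 features each

# Shuffle the row indices once, then pick rows with fancy indexing.
perm = rng.permutation(X.shape[0])
n_train = 7                       # arbitrary split size

train = X[perm[:n_train]]
test = X[perm[n_train:]]

print(train.shape, test.shape)  # (7, 2) (3, 2)
```

Every row ends up in exactly one of the two parts because `perm` contains each row index once.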


## Sorting

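**Note:** `x.argsort()` and `np.argsort(x)` return the indices that would sort the array, which is handy for reordering other arrays in the same way.

 - `np.sort` returns a sorted **copy** of the array; the `sort` method of an array sorts it in place.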

In [59]:
# argsort can be used to sort parallel arrays

x = np.array([2, 0, 3, 4])
y = x * 2

print(x)
print(y)

ind_sort = x.argsort()
x = x[ind_sort]
y = y[ind_sort]

print(x)
print(y)

# axis parameter is also available

M = np.array([[1, 9, 3, 4], [9, 8, 7, 2]])
print()
print(M)
print()
print(np.sort(M, axis=1))

[2 0 3 4]
[4 0 6 8]
[0 2 3 4]
[0 4 6 8]

[[1 9 3 4]
 [9 8 7 2]]

[[1 3 4 9]
 [2 7 8 9]]
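
A short sketch of the copy versus in-place behaviour, reusing the values of `x` from the cell above:

```python
import numpy as np

x = np.array([2, 0, 3, 4])

# np.sort returns a new array and leaves x untouched.
sorted_copy = np.sort(x)
print(x)            # [2 0 3 4]
print(sorted_copy)  # [0 2 3 4]

# The sort method reorders x itself and returns None.
returned = x.sort()
print(returned)     # None
print(x)            # [0 2 3 4]
```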
