# NumPy

> NumPy stands for "Numerical Python". It is an extension module for Python, written mostly in C. Because its mathematical and numerical functions are precompiled, they run much faster than equivalent pure-Python code.

NumPy enriches the programming language Python with powerful data structures, implementing **multi-dimensional** arrays and **matrices**.

> In short, NumPy provides an array object.

### Why are NumPy arrays useful?
They are memory-efficient containers that provide fast numerical operations.

## Python objects:
1. High-level number objects: integers, floating point
2. Containers: lists (cheap insertion and append), dictionaries (fast lookup)

## NumPy provides:
1. An extension package to Python for multi-dimensional arrays
2. Closer to hardware (efficiency)
3. Designed for scientific computation (convenience)
4. A style of programming also known as array-oriented computing

In [1]:
# Before we can use numpy we need to import it.

# import numpy 

# By convention it is imported under the alias np

import numpy as np

In [2]:
a = np.array([0,1,2,3,4])

print(a)
print(f"Type of a is {type(a)}")

print(np.arange(10))

[0 1 2 3 4]
Type of a is <class 'numpy.ndarray'>
[0 1 2 3 4 5 6 7 8 9]


## Proof for fast computations
- `x` is a plain Python `range`, squared with a list comprehension
- `y` is a NumPy array, squared with a single vectorized operation

In [3]:
# Square the numbers from list x
x = range(1000)
%timeit [i**2 for i in x]

495 µs ± 5.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [4]:
y = np.arange(1000)
%timeit y**2

1.56 µs ± 155 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
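
The same comparison can be reproduced outside IPython with the standard-library `timeit` module. This is a minimal sketch: the array size and the repeat count are arbitrary choices, and absolute timings depend on the machine.

```python
import timeit

import numpy as np

n = 1000
x = range(n)       # plain Python sequence
y = np.arange(n)   # NumPy array holding the same values

# Both approaches compute the same squares
squares_list = [i**2 for i in x]
squares_array = y**2

# Total time in seconds for 1000 repetitions of each approach
t_list = timeit.timeit(lambda: [i**2 for i in x], number=1000)
t_array = timeit.timeit(lambda: y**2, number=1000)

print(f"list comprehension: {t_list:.4f} s")
print(f"numpy array:        {t_array:.4f} s")
```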


# Creating Arrays

In [5]:
# one dimensional array
a = np.array([0, 1, 2, 3])

print(a)
print(f"Dimensions of array a: {a.ndim}")
print(f"Shape  of array a: {a.shape}")
print(f"Length of array a: {len(a)}")

[0 1 2 3]
Dimensions of array a: 1
Shape  of array a: (4,)
Length of array a: 4


In [6]:
# two dimensional array
b = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])

print(b)
print(f"Dimensions of array b: {b.ndim}")
print(f"Shape  of array b: {b.shape}")
print(f"Length of array b: {len(b)}")

[[0 1 2 3]
 [4 5 6 7]]
Dimensions of array b: 2
Shape  of array b: (2, 4)
Length of array b: 2


In [7]:
# three dimensional array
c = np.array([[[0, 1], [2, 3]], [[4, 5], [6, 7]]])

print(c)
print(f"Dimensions of array c: {c.ndim}")
print(f"Shape  of array c: {c.shape}")
print(f"Length of array c: {len(c)}")

[[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]
Dimensions of array c: 3
Shape  of array c: (2, 2, 2)
Length of array c: 2


## `arange()`
- Array-valued version of the built-in Python `range` function.
- Arguments
    - `start` (inclusive)
    - `stop`  (exclusive)
    - `step`  (by default 1)

In [8]:
a = np.arange(10)


print(a)

[0 1 2 3 4 5 6 7 8 9]


In [9]:
b = np.arange(1, 10, 2)

print(b)

[1 3 5 7 9]


## `linspace()` 
- Returns evenly spaced numbers over a specified interval
- Arguments
    - `start`
    - `stop`
    - `num`: number of points
    - `endpoint`: defaults to `True`, so the `stop` value is included

In [10]:
# generate 6 evenly spaced numbers between 0 and 1
a = np.linspace(0, 1, 6)

print(a)

[0.  0.2 0.4 0.6 0.8 1. ]
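
As a small sketch of the `endpoint` argument described above, the example below also uses `retstep=True`, which asks `linspace` to return the spacing between points. The variable names are chosen only for illustration.

```python
import numpy as np

# Default: the stop value is included, so 6 points span 5 intervals of 0.2
with_stop, step = np.linspace(0, 1, 6, retstep=True)

# endpoint=False: the stop value is excluded, so 5 points also have a step of 0.2
without_stop = np.linspace(0, 1, 5, endpoint=False)

print(with_stop, step)
print(without_stop)
```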


## `ones()`
- Returns a new array of the given `shape` and `dtype`, filled with **ones**.

In [11]:
a = np.ones((3,3))

print(a)

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


## `zeros()`
- Returns a new array of the given `shape` and `dtype`, filled with **zeros**.

In [12]:
a = np.zeros((3,3))

print(a)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


## `eye()`
- Identity matrix
- Return a 2-D array with ones on the diagonal and zeros elsewhere.

In [13]:
a = np.eye(3)

print(a)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [14]:
# 3x2 matrix
b = np.eye(3, 2)

print(b)

[[1. 0.]
 [0. 1.]
 [0. 0.]]


## `diag()`
- Extract a diagonal or construct a diagonal array

In [15]:
# construct a diagonal array with given values [1,2,3,4]
a = np.diag([1,2,3,4])

print(a)

[[1 0 0 0]
 [0 2 0 0]
 [0 0 3 0]
 [0 0 0 4]]


In [16]:
# extract the diagonal elements from a

np.diag(a)

array([1, 2, 3, 4])

## random
Create arrays of random values:
- `rand()` returns random values from a uniform distribution over `[0, 1)` in a given shape.
- `randint()` returns random integers from `low` (inclusive) to `high` (exclusive).
- `randn()` returns a sample (or samples) from the "standard normal" distribution.

> **Standard normal distribution:** a special case of the normal (Gaussian) distribution where the **mean = 0** and the **standard deviation = 1**
$$\boxed{X \sim \mathcal{N}(\mu = 0, \sigma = 1)}$$

In [17]:
a = np.random.rand(4)
print(a)

print('*' * 50)

b = np.random.rand(3,3)
print(b)

[0.01464125 0.47550838 0.22766257 0.24523465]
**************************************************
[[0.14893099 0.88392945 0.06630808]
 [0.16661243 0.77746981 0.86901172]
 [0.07335718 0.50379842 0.73763927]]


In [18]:
# standard normal distribution
a = np.random.randn(4)
print(a)

print('*' * 50)

b = np.random.randn(3,3)
print(b)

[-0.78269038  1.18867735 -0.77541818 -0.52879142]
**************************************************
[[-1.01399824 -1.55041856  0.91123702]
 [ 0.67818505 -0.96456901 -1.38107696]
 [-0.96675033  1.99383171 -1.55064533]]


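**NOTE:**
To draw samples from a normal distribution with a different mean and standard deviation using `randn`, scale and shift its output:
   - `mu` is the desired mean
   - `sigma` is the desired standard deviation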

```python
mu = 150
sigma = 25
norm_dist = sigma * np.random.randn(150) + mu
```
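
To check that the scale-and-shift formula behaves as described, the sketch below draws a large sample and compares its empirical mean and standard deviation with `mu` and `sigma`. It uses `np.random.default_rng` with an arbitrary seed instead of `randn` so that the result is reproducible; the sample size is likewise an assumption.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility

mu = 150
sigma = 25

# standard_normal draws from N(0, 1), just like randn
samples = sigma * rng.standard_normal(100_000) + mu

print(f"empirical mean: {samples.mean():.2f}")
print(f"empirical std:  {samples.std():.2f}")
```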

# Basic Data types
- Every array has a `dtype` that describes the data type of its elements.


**Each built-in data type has a character code that uniquely identifies it.**

|Character| Description|
|---------|:-----------:|
|`'b'`| boolean |
|`'i'`| (signed) integer|
|`'u'`| unsigned integer|
|`'f'`| floating-point|
|`'c'`| complex-floating point|
|`'m'`| timedelta|
|`'M'`| datetime|
|`'O'`| (Python) objects|
|`'S'`, `'a'`| (byte-)string|
|`'U'`| Unicode|
|`'V'`| raw data (void)|

Reference: [NumPy data types](https://numpy.org/doc/stable/user/basics.types.html)
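
The character codes in the table can be read back from any array through `dtype.kind`. The following is a short sketch; the sample values are arbitrary.

```python
import numpy as np

samples = [
    [True, False],   # boolean
    [1, 2],          # signed integer
    [1.5, 2.5],      # floating point
    [1 + 2j],        # complex floating point
    ['a', 'bc'],     # Unicode string
]

# Collect the one-character kind code for each array
kinds = [np.array(values).dtype.kind for values in samples]
print(kinds)
```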

In [19]:
a = np.arange(10)
a.dtype

dtype('int32')

In [20]:
a = np.arange(10, dtype='float64')

print(a)
print(a.dtype)

[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
float64


In [21]:
a = np.zeros((3,3))

print(a)

a.dtype

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


dtype('float64')

In [22]:
a = np.array([1+2j, 2+4j])  

print(a)
print(a.dtype)

[1.+2.j 2.+4.j]
complex128


In [23]:
a = np.array([True, False, True, False]) 

print(a.dtype)

bool


In [24]:
a = np.array(['Ram', 'Robert', 'Rahim'])

print(a)

a.dtype

['Ram' 'Robert' 'Rahim']


dtype('<U6')

# Indexing and Slicing

- Items of an array can be accessed and assigned in the same way as items of a Python list.

## Indexing

In [25]:
a = np.arange(10)
print(a)

[0 1 2 3 4 5 6 7 8 9]


In [26]:
a[5]

5

In [27]:
a = np.diag([1,2,3,4])

print(a)

[[1 0 0 0]
 [0 2 0 0]
 [0 0 3 0]
 [0 0 0 4]]


In [28]:
print(a[2, 2])

3


In [29]:
a[2, 1] = 7

print(a)

[[1 0 0 0]
 [0 2 0 0]
 [0 7 3 0]
 [0 0 0 4]]


## Slicing
- `[start:stop:step]`, where `start` is inclusive and `stop` is exclusive

In [30]:
a = np.arange(10)

print(a)

[0 1 2 3 4 5 6 7 8 9]


In [31]:
a[5:]

array([5, 6, 7, 8, 9])

In [32]:
a[2::2]

array([2, 4, 6, 8])

In [33]:
a[:-3]

array([0, 1, 2, 3, 4, 5, 6])

In [34]:
a[6:] = 10

print(a)

[ 0  1  2  3  4  5 10 10 10 10]


In [35]:
b = np.arange(5)

print(b)

[0 1 2 3 4]


In [36]:
a[5:] = b[::-1]

print(a)

[0 1 2 3 4 4 3 2 1 0]
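
The same `start:stop:step` notation applies to each axis of a multi-dimensional array, with axes separated by commas. A minimal sketch, using a small example matrix `m` that is not part of the cells above:

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
# m is:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

first_row = m[0, :]    # all columns of row 0
last_col = m[:, -1]    # last column of every row
block = m[1:, ::2]     # rows 1 onward, every second column

print(first_row)
print(last_col)
print(block)
```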


# Copies and Views

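A slicing operation creates a view on the original array, which is just a way of accessing array data. Thus the original array is not copied in memory. You can use `np.shares_memory()` (an exact check) or `np.may_share_memory()` (a faster check based only on memory bounds, which can report false positives) to test whether two arrays share the same memory block.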

In [37]:
a = np.arange(10)

print(a)

[0 1 2 3 4 5 6 7 8 9]


In [38]:
b = a[::2]
print(b)

[0 2 4 6 8]


In [39]:
np.shares_memory(a,b)

True

In [40]:
b[0] = 10
b

array([10,  2,  4,  6,  8])

In [41]:
# even though we modified "b",
# "a" was updated too because both share the same memory
a

array([10,  1,  2,  3,  4,  5,  6,  7,  8,  9])

In [42]:
a = np.arange(10)
b = a[::2].copy()

b

array([0, 2, 4, 6, 8])

In [43]:
np.shares_memory(a,b)

False

In [44]:
np.may_share_memory(a, b)

False

In [45]:
b[0] = 10

print(f"a: {a}")
print(f"b: {b}")

a: [0 1 2 3 4 5 6 7 8 9]
b: [10  2  4  6  8]
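
Another way to tell a view from a copy is the `base` attribute: for a view of an array that owns its data, `base` refers to that array, while for an independent copy it is `None`. A short sketch, with variable names chosen for illustration:

```python
import numpy as np

a = np.arange(10)

view = a[::2]          # slicing returns a view
copy = a[::2].copy()   # explicit copy owns its own data

print(view.base is a)     # the view points back at a
print(copy.base is None)  # the copy has no base array
```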


# Fancy Indexing

NumPy arrays can be indexed with slices, but also with boolean arrays **(masks)** or integer arrays. This method is called **fancy indexing**. It creates copies, not views.

In [46]:
# randint(low, high, size)
a = np.random.randint(0, 20, 15)
print(a)

[14 12 16 11  7 17  7  0  9 17  3 14  4  3 14]


## Using Boolean mask

In [47]:
mask = (a % 2 == 0)

print(mask)

[ True  True  True False False False False  True False False False  True
  True False  True]


In [48]:
extract_from_a = a[mask]

print(extract_from_a)

[14 12 16  0 14  4 14]


**Indexing with a mask can be very useful to assign a new value to a sub-array:**

In [49]:
a[mask] = -1

print(a)

[-1 -1 -1 11  7 17  7 -1  9 17  3 -1 -1  3 -1]


## Indexing with an array of integers

In [50]:
a = np.arange(0, 100, 10)
print(a)

[ 0 10 20 30 40 50 60 70 80 90]


In [51]:
a[[2,3,2,4,2]]

array([20, 30, 20, 40, 20])

In [52]:
a[[9,7]] = -200

print(a)

[   0   10   20   30   40   50   60 -200   80 -200]
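
The statement that fancy indexing creates copies can be checked directly. In this sketch, modifying the fancy-indexed result leaves the original untouched, while modifying a slice of the same elements changes it.

```python
import numpy as np

a = np.arange(0, 100, 10)

picked = a[[2, 3, 4]]   # fancy indexing: a new array
picked[0] = -1
print(a[2])             # still 20, a was not modified

sliced = a[2:5]         # slicing: a view on a
sliced[0] = -1
print(a[2])             # now -1, a was modified through the view
```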


# Elementwise operations

## Basic Operations

### with scalars

In [53]:
a = np.array([1, 2, 3, 4])
a + 1

array([2, 3, 4, 5])

In [54]:
a ** 2

array([ 1,  4,  9, 16], dtype=int32)

### Arithmetic operates elementwise

In [56]:
b = np.ones(4, dtype='int32') + 1

print(f"a: {a}")
print(f"b: {b}")

a: [1 2 3 4]
b: [2 2 2 2]


In [57]:
print(f"a + b : {a + b}")
print(f"a - b : {a - b}")
print(f"a * b : {a * b}")
print(f"a / b : {a / b}")

a + b : [3 4 5 6]
a - b : [-1  0  1  2]
a * b : [2 4 6 8]
a / b : [0.5 1.  1.5 2. ]


### Matrix Multiplication
- If `a` and `b` are two matrices, then `a * b` is elementwise multiplication, which requires both matrices to have the same (or broadcast-compatible) shape. True matrix-matrix multiplication uses the `dot` product instead, which requires the number of columns of `a` to equal the number of rows of `b`.

In [58]:
a = np.arange(1, 10).reshape(3,3)
b = np.eye(3)


print("Matrix a")
print(a)

print("\nMatrix b")
print(b)

print("\nMatrix a * b")
print(a * b)

Matrix a
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Matrix b
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Matrix a * b
[[1. 0. 0.]
 [0. 5. 0.]
 [0. 0. 9.]]


In [59]:
a = np.array([[1,2,3], [4,5,6], [7, 8, 9]])

b = np.array([[1,2], [3, 4], [5, 6]])

print("Matrix a")
print(a)

print("\nMatrix b")
print(b)

# a * b would raise a ValueError here because the shapes (3, 3) and (3, 2) cannot be broadcast together
# print(a * b)

print("\nMatrix a.b")
print(a.dot(b))

Matrix a
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Matrix b
[[1 2]
 [3 4]
 [5 6]]

Matrix a.b
[[ 22  28]
 [ 49  64]
 [ 76 100]]


In [60]:
c = np.diag([1, 2, 3, 4])


print("matrix")
print(c)

print("\nMultiplying matrix itself")
print("======== Using * =========\n")
print(c * c)
print("\n======== using .dot() =======\n")
print(c.dot(c))


matrix
[[1 0 0 0]
 [0 2 0 0]
 [0 0 3 0]
 [0 0 0 4]]

Multiplying matrix itself
======== Using * =========

[[ 1  0  0  0]
 [ 0  4  0  0]
 [ 0  0  9  0]
 [ 0  0  0 16]]

======== using .dot() =======

[[ 1  0  0  0]
 [ 0  4  0  0]
 [ 0  0  9  0]
 [ 0  0  0 16]]


### Comparisons

In [61]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 2, 2, 4])

print(f"a : {a}")
print(f"b : {b}")

a : [1 2 3 4]
b : [5 2 2 4]


In [62]:
print(f"a == b : {a == b }")
print(f"a > b : {a > b}")
print(f"a < b : {a < b}")
print(f"a >= b : {a >= b}")
print(f"a <= b : {a <= b}")
print(f"a != b : {a != b }")

a == b : [False  True False  True]
a > b : [False False  True False]
a < b : [ True False False False]
a >= b : [False  True  True  True]
a <= b : [ True  True False  True]
a != b : [ True False  True False]


#### Array-wise comparisons

In [63]:
# array-wise comparisons
a = np.array([1, 2, 3, 4])
b = np.array([5, 2, 2, 4])
c = np.array([1, 2, 3, 4])

print(f"a : {a}")
print(f"b : {b}")
print(f"c : {c}")

a : [1 2 3 4]
b : [5 2 2 4]
c : [1 2 3 4]


In [64]:
print(f"array 'a' equals array 'b': {np.array_equal(a, b)}")
print(f"array 'a' equals array 'c': {np.array_equal(a, c)}")

array 'a' equals array 'b': False
array 'a' equals array 'c': True


### Logical Operations

In [65]:
a = np.array([1, 1, 0, 0], dtype=bool)
b = np.array([1, 0, 1, 0], dtype=bool)

print(f"a : {a}")
print(f"b : {b}")

a : [ True  True False False]
b : [ True False  True False]


In [66]:
print(f"Logical or: {np.logical_or(a, b)}")
print(f"Logical and: {np.logical_and(a, b)}")

Logical or: [ True  True  True False]
Logical and: [ True False False False]


### Transcendental functions

In [67]:
a = np.arange(1,6)

print(f"a: {a}")

a: [1 2 3 4 5]


In [68]:
print(f"sin : {np.sin(a)}")
print(f"log : {np.log(a)}")
print(f"exp : {np.exp(a)}")

sin : [ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427]
log : [0.         0.69314718 1.09861229 1.38629436 1.60943791]
exp : [  2.71828183   7.3890561   20.08553692  54.59815003 148.4131591 ]


### Shape Mismatch

In [69]:
a = np.arange(4)
b = np.array([1, 2])


print(f"a : {a}")
print(f"b : {b}")

a : [0 1 2 3]
b : [1 2]


In [70]:
a + b

ValueError: operands could not be broadcast together with shapes (4,) (2,) 

## Broadcasting
Basic operations on NumPy arrays (addition, etc.) are elementwise.

This works on arrays of the same shape.

However, it is also possible to operate on arrays of different shapes if NumPy can stretch them to a common shape: this conversion is called **broadcasting**.

The image below gives an example of broadcasting:
![Broadcasting](./assets/broadcasting.png)

In [71]:
a = np.tile(np.arange(0, 40, 10), (3,1))

print("Matrix a")
print(a)

a = a.T
print("\nMatrix a transpose")
print(a)

Matrix a
[[ 0 10 20 30]
 [ 0 10 20 30]
 [ 0 10 20 30]]

Matrix a transpose
[[ 0  0  0]
 [10 10 10]
 [20 20 20]
 [30 30 30]]


In [72]:
b = np.array([0, 1, 2])

print("Matrix b")
print(b)

Matrix b
[0 1 2]


In [73]:
print("Matrix a + b")
print(a + b)

Matrix a + b
[[ 0  1  2]
 [10 11 12]
 [20 21 22]
 [30 31 32]]


In [74]:
a = np.arange(0, 40, 10)

print("Matrix a")
print(a)


print(f"shape of a: {a.shape}")

Matrix a
[ 0 10 20 30]
shape of a: (4,)


In [75]:
a = a[:, np.newaxis]
print("Matrix a")
print(a)


print(f"shape of a: {a.shape}")

Matrix a
[[ 0]
 [10]
 [20]
 [30]]
shape of a: (4, 1)


In [76]:
# broadcasting lets us add arrays of different shapes: (4, 1) + (3,) -> (4, 3)
print("Matrix a + b")
print(a + b)

Matrix a + b
[[ 0  1  2]
 [10 11 12]
 [20 21 22]
 [30 31 32]]
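
To make the stretching explicit, `np.broadcast_to` can expand each operand to the common shape before adding. This is only an illustration of what broadcasting does conceptually; NumPy does not need to materialize the expanded arrays during `a + b`.

```python
import numpy as np

a = np.arange(0, 40, 10)[:, np.newaxis]   # shape (4, 1)
b = np.array([0, 1, 2])                   # shape (3,)

# Stretch both operands to the common shape (4, 3)
a_full = np.broadcast_to(a, (4, 3))
b_full = np.broadcast_to(b, (4, 3))

print(a_full)
print(b_full)
print(np.array_equal(a_full + b_full, a + b))
```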


## Basic Reductions

### Computing sum

In [77]:
x = np.array([1, 2, 3, 4])

print(f"x : {x}")
print(f"sum of x: {np.sum(x)}")

x : [1 2 3 4]
sum of x: 10


In [78]:
# axis=0 sums down the rows, giving one value per column
# axis=1 sums across the columns, giving one value per row
x = np.array([[1, 1], [2, 2]])

print("Matrix x")
print(x)
print(f"sum of x on axis 0: {x.sum(axis=0)}")
print(f"sum of x on axis 1: {x.sum(axis=1)}")

Matrix x
[[1 1]
 [2 2]]
sum of x on axis 0: [3 3]
sum of x on axis 1: [2 4]
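
A reduction removes the axis it sums over. Passing `keepdims=True` keeps that axis with length 1, which makes the result broadcast cleanly against the original array. A small sketch on the same matrix, normalizing each row so it sums to 1:

```python
import numpy as np

x = np.array([[1, 1], [2, 2]])

row_sums = x.sum(axis=1)                      # shape (2,)
row_sums_2d = x.sum(axis=1, keepdims=True)    # shape (2, 1)

# Each row is divided by its own sum thanks to broadcasting
normalized = x / row_sums_2d

print(row_sums.shape, row_sums_2d.shape)
print(normalized)
```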


### Other reductions

In [79]:
x = np.array([1, 3, 2, 8, 9, 4, 5])

print(f"x: {x}")
print(f"Minimum in x: {x.min()}")
print(f"Maximum in x: {x.max()}")
print(f"argmin (index of minimum element) in x: {x.argmin()}")
print(f"argmax (index of maximum element) in x: {x.argmax()}")

x: [1 3 2 8 9 4 5]
Minimum in x: 1
Maximum in x: 9
argmin (index of minimum element) in x: 0
argmax (index of maximum element) in x: 4


### Logical Operations
- `all()`: checks whether all elements in the array are True
- `any()`: checks whether any element in the array is True

In [80]:
np.all([True, True, False])

False

In [81]:
np.any([True, False, False])

True

In [82]:
# Note: can be used for array comparisons
a = np.zeros((50, 50))
np.any(a != 0)

False

In [83]:
np.all(a == a)

True

In [84]:
a = np.array([1, 2, 3, 2])
b = np.array([2, 2, 3, 2])
c = np.array([6, 4, 4, 5])
((a <= b) & (b <= c)).all()

True

### Statistics

In [85]:
x = np.array([1, 2, 3, 1])
y = np.array([[1, 2, 3], [5, 6, 1]])


print(f"x: {x}")
print(f"mean of x: {x.mean()}")
print(f"median of x: {np.median(x)}")
print(f"median of y (last axis): {np.median(y, axis=-1)}")
print(f"std. deviation of x: {x.std()}")

x: [1 2 3 1]
mean of x: 1.75
median of x: 1.5
median of y (last axis): [2. 5.]
std. deviation of x: 0.82915619758885


# Array shape manipulation

## Flattening

Return a contiguous flattened array. A 1-D array, containing the elements of the input, is returned. A copy is made only if needed.

In [86]:
a = np.array([[1, 2, 3], [4, 5, 6]])

print("before flattening a")
print(a)

b = a.ravel()
print("flattened a")
print(b)

before flattening a
[[1 2 3]
 [4 5 6]]
flattened a
[1 2 3 4 5 6]


In [87]:
print("Matrix a")
print(a)

print("\nMatrix a Transpose")
print(a.T)

print("\nFlattened matrix a transpose")
print(a.T.ravel())

Matrix a
[[1 2 3]
 [4 5 6]]

Matrix a Transpose
[[1 4]
 [2 5]
 [3 6]]

Flattened matrix a transpose
[1 4 2 5 3 6]


## Reshaping
Reshaping is the inverse operation to flattening.

In [88]:
print("Matrix a")
print(a)
print(f"shape of a: {a.shape}")

Matrix a
[[1 2 3]
 [4 5 6]]
shape of a: (2, 3)


In [89]:
b = a.ravel()
print(f"b: {b}")
print(f"shape of b: {b.shape}")

b: [1 2 3 4 5 6]
shape of b: (6,)


In [90]:
b = b.reshape((2, 3))
print("After reshaping b")
print(b)
print(f"shape of b: {b.shape}")

After reshaping b
[[1 2 3]
 [4 5 6]]
shape of b: (2, 3)


In [91]:
b[0, 0] = 88
print(b)

[[88  2  3]
 [ 4  5  6]]


**NOTE: Reshape may also return a copy**

In [92]:
a = np.zeros((3, 2))
b = a.T.reshape(3*2)

# b is a copy here, so modifying b does not affect a
b[0] = 50
 
print(a)
print()
print(b)

[[0. 0.]
 [0. 0.]
 [0. 0.]]

[50.  0.  0.  0.  0.  0.]
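
Whether `reshape` returned a view or a copy can be checked with `np.shares_memory`. In this sketch, reshaping a contiguous array gives a view, while reshaping its transpose forces a copy, as in the cell above.

```python
import numpy as np

a = np.zeros((3, 2))

flat_view = a.reshape(6)     # a is contiguous, so no copy is needed
flat_copy = a.T.reshape(6)   # the transpose is not contiguous in C order

print(np.shares_memory(a, flat_view))
print(np.shares_memory(a, flat_copy))
```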


## Adding a dimension

Indexing with the `np.newaxis` object allows us to add an axis to an array.

Each use of `np.newaxis` increases the number of dimensions of the existing array by one. Thus:

- a 1-D array becomes a 2-D array
- a 2-D array becomes a 3-D array
- an n-D array becomes an (n+1)-D array

In [93]:
z = np.array([1, 2, 3])

print(f"z: {z}")

z: [1 2 3]


In [94]:
z[:, np.newaxis]

array([[1],
       [2],
       [3]])
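
The position of `np.newaxis` in the index decides where the new axis goes. The sketch below builds a column and a row from the same 1-D array and multiplies them with broadcasting to get an outer product. `np.expand_dims` is shown as an equivalent spelling.

```python
import numpy as np

z = np.array([1, 2, 3])

column = z[:, np.newaxis]   # shape (3, 1)
row = z[np.newaxis, :]      # shape (1, 3)

# Same result as column, using a function instead of indexing
column_alt = np.expand_dims(z, axis=1)

# (3, 1) * (1, 3) broadcasts to (3, 3)
outer = column * row

print(column.shape, row.shape)
print(outer)
```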

## Dimension Shuffling

In [95]:
a = np.arange(4*3*2).reshape(4, 3, 2)

print(a)

print(f"\nshape of a: {a.shape}")

[[[ 0  1]
  [ 2  3]
  [ 4  5]]

 [[ 6  7]
  [ 8  9]
  [10 11]]

 [[12 13]
  [14 15]
  [16 17]]

 [[18 19]
  [20 21]
  [22 23]]]

shape of a: (4, 3, 2)


In [96]:
a[0, 2, 1]

5

In [97]:
a[3, 1, 0]

20
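
The cells above only index into `a`. As a hedged sketch of what shuffling dimensions can look like, `transpose` can reorder the axes so that an element at `a[k, i, j]` is found at `b[i, j, k]`. The axis order `(1, 2, 0)` is just one possible choice.

```python
import numpy as np

a = np.arange(4 * 3 * 2).reshape(4, 3, 2)

# New axis 0 is old axis 1, new axis 1 is old axis 2, new axis 2 is old axis 0
b = a.transpose(1, 2, 0)

print(b.shape)
print(b[2, 1, 0], a[0, 2, 1])   # the same element under both index orders
```

The result is a view, so writing to `b` also changes `a`.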

## Resizing

In [98]:
a = np.arange(4)
print(f"a: {a}")
a.resize((8,))
print(f"a after resize: {a}")

a: [0 1 2 3]
a after resize: [0 1 2 3 0 0 0 0]
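
Note that the `resize` method and the `np.resize` function fill the extra space differently: the method pads with zeros in place, while the function returns a new array filled with repeated copies of the input. A minimal sketch; `refcheck=False` is passed only so the example also runs in environments that hold extra references to the array.

```python
import numpy as np

a = np.arange(4)

# In-place method: new elements are zero-filled
padded = a.copy()
padded.resize((8,), refcheck=False)

# Function: returns a new array, repeating the input to fill it
repeated = np.resize(a, 8)

print(padded)
print(repeated)
```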


## Sorting

In [99]:
# Sorting along an axis:
a = np.array([[5, 4, 6], [2, 3, 2]])
b = np.sort(a, axis=1)
# inplace sorting: a.sort(axis=1)

print("Matrix a")
print(a)

print("After sorting along axis")
print(b)

Matrix a
[[5 4 6]
 [2 3 2]]
After sorting along axis
[[4 5 6]
 [2 2 3]]


In [100]:
#sorting with fancy indexing
a = np.array([4, 3, 1, 2])
b = np.argsort(a)

print(f"a: {a}")
print(f"argsort of a: {b}")

a: [4 3 1 2]
argsort of a: [2 3 1 0]


In [101]:
a[b]

array([1, 2, 3, 4])
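
The indices returned by `argsort` can also reorder a second array that is aligned with the first, or be reversed for descending order. In this sketch the `scores` and `labels` arrays are hypothetical data made up for the example.

```python
import numpy as np

scores = np.array([4, 3, 1, 2])
labels = np.array(['d', 'c', 'a', 'b'])   # labels[i] belongs to scores[i]

order = np.argsort(scores)   # indices that sort scores in ascending order

labels_sorted = labels[order]      # labels reordered by ascending score
descending = scores[order[::-1]]   # scores from largest to smallest

print(labels_sorted)
print(descending)
```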