# NumPy Notes

The standard Python implementation, CPython, is written in C.

![image.png](attachment:11c7c95e-3c47-42f7-8a63-5bde586a4cbf.png)

![image.png](attachment:63b5bf37-706d-459f-8b49-a93c3c5e8b02.png)

The figures above show the difference between how a NumPy array and a native Python list are represented in memory.
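As a rough illustration of that difference, the sketch below compares the memory taken by a Python list of integers with a NumPy array holding the same values. The variable names are made up for this example, and the list figure is only an estimate: `sys.getsizeof` counts the list's pointer table plus each integer object, and the exact numbers vary by Python version and platform.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int64)

# A list holds pointers to separate Python int objects, each with its own header.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array holds the raw 8-byte values back to back in one buffer.
array_bytes = arr.nbytes

print('List (estimate):', list_bytes)
print('Array:', array_bytes)

# All elements of an array share one dtype, so mixed input is upcast.
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64
```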

In [1]:
import numpy as np

In [10]:
x = np.array([1, 2, 3, 4])

#Attributes
print('Array: ', x)
print('Type:', x.dtype)
print('Shape: ', x.shape)
print('Size: ', x.size)
print('Number of dimensions: ', x.ndim)
print('Each item bytes: ', x.itemsize)
print('Total bytes: ', x.nbytes)

Array:  [1 2 3 4]
Type: int32
Shape:  (4,)
Size:  4
Number of dimensions:  1
Each item bytes:  4
Total bytes:  16


In [12]:
x = np.array([range(i, i+3) for i in [1, 2, 3]])

#Attributes
print('Array: ', x)
print('Type:', x.dtype)
print('Shape: ', x.shape)
print('Size: ', x.size)
print('Number of dimensions: ', x.ndim)
print('Each item bytes: ', x.itemsize)
print('Total bytes: ', x.nbytes)

Array:  [[1 2 3]
 [2 3 4]
 [3 4 5]]
Type: int32
Shape:  (3, 3)
Size:  9
Number of dimensions:  2
Each item bytes:  4
Total bytes:  36


## Creating Various NumPy Arrays

In [33]:
# Array filled with zeros
x = np.zeros((3,3))
print('Zero array: ', x)

# Array filled with ones
x = np.ones((3,3))
print('Ones array: ', x)

# Identity matrix
x = np.eye(3)
print('Identical Array: ', x)

# Uninitialized array: the values are whatever was already in that memory
x = np.empty(3)
print('Empty Array: ', x)

#Random array
np.random.seed(20)
x = np.random.random((3,3))  # Uniformly distributed random values between 0 and 1
print('Random Array: ', x)

#Random array
np.random.seed(20)
x = np.random.randint(1, 10, (3,3))  # Random integers in the half-open interval [1, 10)
print('Random Array: ', x)

#Random array
np.random.seed(20)
x = np.random.normal(0, 1, (3,3))  # Normally distributed 
print('Random Array: ', x)

#Random array
np.random.seed(20)
x = np.random.randn(3, 3)  # Standard normal distribution 
print('Random Array: ', x)

# arange: integers from 1 up to, but not including, 20
x = np.arange(1, 20)
print('Array: ', x)

# linspace: 4 evenly spaced values from 1 to 10, both endpoints included
x = np.linspace(1, 10, 4)
print('Array: ', x)

Zero array:  [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
Ones array:  [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
Identical Array:  [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
Empty Array:  [1. 1. 1.]
Random Array:  [[0.5881308  0.89771373 0.89153073]
 [0.81583748 0.03588959 0.69175758]
 [0.37868094 0.51851095 0.65795147]]
Random Array:  [[4 5 7]
 [8 3 1]
 [7 9 6]]
Random Array:  [[ 0.88389311  0.19586502  0.35753652]
 [-2.34326191 -1.08483259  0.55969629]
 [ 0.93946935 -0.97848104  0.50309684]]
Random Array:  [[ 0.88389311  0.19586502  0.35753652]
 [-2.34326191 -1.08483259  0.55969629]
 [ 0.93946935 -0.97848104  0.50309684]]
Array:  [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
Array:  [ 1.  4.  7. 10.]


## Slicing and Reversing

In [50]:
x = np.array([range(i, i+3) for i in [1, 2, 3]])

a = x[:2, :2]
print(a)
print("")

b = x[:2, 1::-1]
print(b)
print("")

c = x[1::-1, :2]
print(c)

[[1 2]
 [2 3]]

[[2 1]
 [3 2]]

[[2 3]
 [1 2]]


Slicing creates a view of the original array, not a copy. Any change made to the subarray is therefore reflected in the original array.

In [47]:
a[0,0] = 100
print('subarray: ', a)
print(" ")
print('Original Array: ', x)

subarray:  [[100   2]
 [  2   3]]
 
Original Array:  [[100   2   3]
 [  2   3   4]
 [  3   4   5]]


In [52]:
# Call .copy() on the slice so that changes do not affect the original array
x = np.array([range(i, i+3) for i in [1, 2, 3]])
a = x[:2, :2].copy()
print(a)
print("")

a[0,0] = 100
print('subarray: ', a)
print(" ")
print('Original Array: ', x)

[[1 2]
 [2 3]]

subarray:  [[100   2]
 [  2   3]]
 
Original Array:  [[1 2 3]
 [2 3 4]
 [3 4 5]]
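One way to check whether a subarray is a view or a copy, rather than modifying it and inspecting the original, is `np.shares_memory` together with the `.base` attribute. The following is a small sketch with illustrative variable names.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [2, 3, 4],
              [3, 4, 5]])

view = x[:2, :2]          # basic slicing returns a view
copy = x[:2, :2].copy()   # explicit copy owns its own data
fancy = x[[0, 1]]         # indexing with a list of indices returns a copy

print(np.shares_memory(x, view))   # True
print(np.shares_memory(x, copy))   # False
print(np.shares_memory(x, fancy))  # False

# A view keeps a reference to the array that owns the data.
print(view.base is x)     # True
print(copy.base is None)  # True
```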


## Reshaping of Array

In [63]:
# reshape returns a view whenever possible; it makes a copy only when the data is not laid out contiguously in memory.
x = np.random.random(10)
print(x.shape)

a = x.reshape(2,5)
print(a.shape)

b = x[np.newaxis, :]  #Adds new dimension or axis to the existing array
print(b.shape)

b = b[np.newaxis, :]  #Adds new dimension or axis to the existing array
print(b.shape)

(10,)
(2, 5)
(1, 10)
(1, 1, 10)
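The view-versus-copy behaviour of `reshape` can be checked directly. In this sketch (variable names are illustrative), reshaping a contiguous array gives a view, while flattening a transposed array forces a copy because its elements can no longer be walked with a single stride.

```python
import numpy as np

x = np.arange(10)

a = x.reshape(2, 5)              # contiguous data: reshape returns a view
print(np.shares_memory(x, a))    # True

t = a.T                          # transposed view, shape (5, 2), not C-contiguous
r = t.reshape(10)                # cannot be expressed as a view, so NumPy copies
print(np.shares_memory(x, r))    # False
print(r)                         # [0 5 1 6 2 7 3 8 4 9]

# np.newaxis can also add a trailing axis to build a column vector.
col = x[:, np.newaxis]
print(col.shape)                 # (10, 1)
```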


## Concatenating and Splitting

In [64]:
a = np.random.random((3,3))
b = np.random.random((3,3))
c = np.concatenate([a, b])
print(c)

[[0.76525884 0.97250331 0.915947  ]
 [0.58744678 0.44437533 0.44107594]
 [0.69399137 0.72664097 0.73473276]
 [0.38422053 0.94539591 0.79366926]
 [0.25781467 0.63648694 0.84514052]
 [0.00777515 0.07703627 0.98942514]]


In [65]:
a = np.random.random((3,3))
b = np.random.random((3,3))
c = np.concatenate([a, b], axis = 1)
print(c)

[[0.79096501 0.73688738 0.86638985 0.27593379 0.38178337 0.6762014 ]
 [0.23075412 0.10550225 0.74293527 0.73647103 0.9706047  0.68136272]
 [0.47515634 0.75965283 0.48599313 0.490452   0.12840324 0.50383093]]


In [74]:
a = np.random.random((5,5))
b = np.split(a, [3],axis = 1)
print(b)

[array([[0.98403382, 0.63645996, 0.75290762],
       [0.2202413 , 0.23810885, 0.31070211],
       [0.90491839, 0.01369274, 0.76670161],
       [0.00486931, 0.20694957, 0.32946859],
       [0.58451544, 0.81150522, 0.03213032]]), array([[0.96224455, 0.82506169],
       [0.98050899, 0.63882056],
       [0.00439819, 0.04328805],
       [0.9596451 , 0.908882  ],
       [0.75606801, 0.08422223]])]
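With small integer arrays the effect of the `axis` argument and of the split indices is easier to follow than with random values. The sketch below also uses `np.vstack` and `np.hstack`, which behave like `np.concatenate` along axis 0 and axis 1 for 2-D inputs; the array names are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)       # [[0 1 2], [3 4 5]]
b = np.arange(6, 12).reshape(2, 3)   # [[6 7 8], [9 10 11]]

rows = np.vstack([a, b])             # stacks vertically, shape (4, 3)
cols = np.hstack([a, b])             # stacks horizontally, shape (2, 6)
print(rows.shape, cols.shape)

# Splitting at column index 3 recovers the two original blocks.
left, right = np.split(cols, [3], axis=1)
print(np.array_equal(left, a), np.array_equal(right, b))  # True True

# N split points produce N + 1 pieces.
first, middle, last = np.split(np.arange(10), [3, 7])
print(first, middle, last)           # [0 1 2] [3 4 5 6] [7 8 9]
```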


## Miscellaneous Functions

In [75]:
x = np.array([-1, 2, -3, 4])
x = np.abs(x)
print(x)

[1 2 3 4]


In [76]:
print("x =", x)
print("e^x =", np.exp(x))
print("2^x =", np.exp2(x))
print("3^x =", np.power(3, x))

x = [1 2 3 4]
e^x = [ 2.71828183  7.3890561  20.08553692 54.59815003]
2^x = [ 2.  4.  8. 16.]
3^x = [ 3  9 27 81]


In [77]:
x = np.arange(1, 6)
print(np.add.reduce(x))

15


In [78]:
print(np.add.accumulate(x))

[ 1  3  6 10 15]
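`reduce` and `accumulate` are methods available on any binary ufunc, not only `np.add`. A short sketch showing how they relate to the more familiar `sum`, `cumsum`, and `prod`:

```python
import numpy as np

x = np.arange(1, 6)  # [1 2 3 4 5]

# add.reduce is the same operation as sum; add.accumulate is cumsum.
print(np.add.reduce(x) == x.sum())                            # True
print(np.array_equal(np.add.accumulate(x), np.cumsum(x)))     # True

# The same methods on np.multiply give the product and running product.
print(np.multiply.reduce(x))      # 120
print(np.multiply.accumulate(x))  # [  1   2   6  24 120]

# outer applies the ufunc to every pair of elements (a multiplication table).
table = np.multiply.outer(x, x)
print(table.shape)                # (5, 5)
```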


In [105]:
# Aggregation functions
np.random.seed(0)
x = np.random.random((3,3))
print(x)
print('Min: ', np.min(x))
print('Max: ', np.max(x))
print('Sum: ', np.sum(x))
print('Percentile: ', np.percentile(x, 75))
print('Mean: ', x.mean())
print('Std: ', x.std())
print('Argument Min: ', np.argmin(x))
print('Argument Max: ', np.argmax(x))

[[0.5488135  0.71518937 0.60276338]
 [0.54488318 0.4236548  0.64589411]
 [0.43758721 0.891773   0.96366276]]
Min:  0.4236547993389047
Max:  0.9636627605010293
Sum:  5.7742213143196475
Percentile:  0.7151893663724195
Mean:  0.6415801460355164
Std:  0.17648980804276407
Argument Min:  4
Argument Max:  8
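Note that `argmin` and `argmax` return an index into the flattened array by default, which is why the results above are single integers for a 3x3 array. The sketch below, on a small hand-written array, converts such an index back to a (row, column) pair with `np.unravel_index` and shows aggregation along one axis.

```python
import numpy as np

x = np.array([[3, 7, 1],
              [8, 2, 9]])

flat_max = np.argmax(x)                      # index into the flattened array
print(flat_max)                              # 5
print(np.unravel_index(flat_max, x.shape))   # row 1, column 2

# axis=0 collapses the rows (one result per column);
# axis=1 collapses the columns (one result per row).
print(x.min(axis=0))     # [3 2 1]
print(x.max(axis=1))     # [7 9]
print(x.argmax(axis=1))  # [1 2]
```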


## Broadcasting 

1. If the two arrays differ in their number of dimensions, the shape of the
one with fewer dimensions is padded with ones on its leading (left) side.
2. If the shape of the two arrays does not match in any dimension, the array
with shape equal to 1 in that dimension is stretched to match the other shape.
3. If in any dimension the sizes disagree and neither is equal to 1, an error is
raised.
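The three rules can be traced on concrete shapes. The following sketch uses illustrative arrays; the comments describe how each shape is padded and stretched.

```python
import numpy as np

# Rules 1 and 2: (3, 3) + (3,)  ->  (3,) is padded to (1, 3), then stretched to (3, 3).
a = np.ones((3, 3))
b = np.arange(3)
print((a + b).shape)   # (3, 3)

# Rule 2 on both operands: (3, 1) + (3,) -> (3, 1) + (1, 3) -> both stretched to (3, 3).
col = np.arange(3)[:, np.newaxis]
row = np.arange(3)
grid = col + row
print(grid)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]

# Rule 3: (3, 2) + (3,) -> (3, 2) + (1, 3); the last dimensions are 2 and 3,
# neither is 1, so NumPy raises a ValueError.
try:
    np.ones((3, 2)) + np.arange(3)
    failed = False
except ValueError:
    failed = True
print('Incompatible shapes raised an error:', failed)
```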

In [112]:
# Centering the elements: the scalar mean is broadcast across the whole array
a = np.arange(1, 5)
b = a.mean()
c = a - b
print(c)

[-1.5 -0.5  0.5  1.5]


In [109]:
#Masking
x = np.random.randint(1, 10, (3,3))
print(x)
print(' ')
a = x[x > 4]
print(a)

[[3 1 1]
 [5 6 6]
 [7 9 5]]
 
[5 6 6 7 9 5]
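The expression `x > 4` is itself a Boolean array, which can be counted, combined with other conditions, or used for assignment. A brief sketch, reusing the values printed above as a fixed array:

```python
import numpy as np

x = np.array([[3, 1, 1],
              [5, 6, 6],
              [7, 9, 5]])

mask = x > 4
print(mask.dtype)              # bool
print(np.count_nonzero(mask))  # 6 elements are greater than 4

# Combine conditions with & and |; the parentheses are required.
print(x[(x > 4) & (x < 9)])    # [5 6 6 7 5]

# A mask can also select the elements to overwrite.
y = x.copy()
y[y > 4] = 0
print(y)
# [[3 1 1]
#  [0 0 0]
#  [0 0 0]]
```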


## Sorting

By default, `np.sort` uses a quicksort-based algorithm, which runs in O(N log N) time on average.

In [117]:
x = np.array([2, 1, 4, 3, 5])
np.sort(x)  # Returns a sorted copy and leaves x unchanged (the result is not displayed here)

X = np.random.randint(0, 10, (4, 6))
print('Array: ',X)

# sort each column of X
print('Sorted Array: ', np.sort(X, axis=0))

Array:  [[4 8 4 3 7 5]
 [5 0 1 5 9 3]
 [0 5 0 1 2 4]
 [2 0 3 2 0 7]]
Sorted Array:  [[0 0 0 1 0 3]
 [2 0 1 2 2 4]
 [4 5 3 3 7 5]
 [5 8 4 5 9 7]]


In [118]:
# Partial sort (partitioning): the 3 smallest values end up to the left of index 3, in arbitrary order
x = np.array([7, 2, 3, 1, 6, 5, 4])
np.partition(x, 3)

array([2, 1, 3, 4, 6, 5, 7])
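The guarantee made by `np.partition` can be verified without relying on the exact order of the output, which may differ between NumPy versions. The sketch below also shows `np.argsort`, which returns the indices that would sort the array; variable names are illustrative.

```python
import numpy as np

x = np.array([7, 2, 3, 1, 6, 5, 4])
k = 3
p = np.partition(x, k)

# The element at index k is where it would be in a full sort;
# everything before it is no larger, everything after it is no smaller.
print(p[k] == np.sort(x)[k])   # True
print(sorted(p[:k].tolist()))  # [1, 2, 3]
print(p[:k].max() <= p[k] <= p[k + 1:].min())  # True

# argsort returns indices rather than values.
y = np.array([2, 1, 4, 3, 5])
idx = np.argsort(y)
print(idx)      # [1 0 3 2 4]
print(y[idx])   # [1 2 3 4 5]
```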

## Time Series

In [10]:
# The date string must be in ISO 8601 format, as below.
x = np.datetime64("2015-07-04")
print(x)
y = np.timedelta64(4, "D")
print(y)
# datetime64 imposes a trade-off between time resolution and maximum time span:
# as the precision increases, the span that can be represented decreases, and vice versa.
# However, NumPy lacks many of the convenience features found in datetime and dateutil.

2015-07-04
4 days
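Because `datetime64` and `timedelta64` are ordinary NumPy dtypes, date arithmetic is vectorized. The following sketch shows a few operations and how the unit (resolution) is inferred from the input string or set explicitly; variable names are illustrative.

```python
import numpy as np

d = np.datetime64("2015-07-04")

# Adding a timedelta64 shifts the date.
later = d + np.timedelta64(4, "D")
print(later)   # 2015-07-08

# Subtracting two dates gives a timedelta64.
print(later - d == np.timedelta64(4, "D"))   # True

# Vectorized arithmetic: three consecutive days starting at d.
days = d + np.arange(3).astype("timedelta64[D]")
print(days)

# The unit is inferred from the string, or can be given explicitly.
print(d.dtype)                                    # datetime64[D]
print(np.datetime64("2015-07-04 12:00").dtype)    # datetime64[m]
print(np.datetime64("2015-07-04", "ns").dtype)    # datetime64[ns]
```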
