NumPy enables fast numerical computation in Python. Its core is implemented in C, which is why array operations run far faster than equivalent pure-Python loops. The key feature of NumPy is the ndarray object. An ndarray is homogeneous: all of its elements share a single data type.

In [1]:
import numpy as np

In [2]:
vector = np.array([1,2,3,4,5])
print(f"Vector :{vector}")

# Every array has a shape: the size of the array along each dimension.
print(f"Shape :{vector.shape}")

# Print the number of dimensions
print(f"Dim : {vector.ndim}")
print(f"Data type :{vector.dtype}")

Vector :[1 2 3 4 5]
Shape :(5,)
Dim : 1
Data type :int32


NumPy orders the dimensions in the shape of a 3D array as follows:

(depth, rows, columns)

So a 3D array with 3 rows, 2 columns and a depth of 2 has the following shape:
(2,3,2)



In [9]:
a = np.array([1,2,3,4,5,6,7,8,9,10,11,12])
a = a.reshape(2,3,2)

print(a)

[[[ 1  2]
  [ 3  4]
  [ 5  6]]

 [[ 7  8]
  [ 9 10]
  [11 12]]]


In [13]:
a = np.zeros((3,8,1))
print(a)

[[[0.]
  [0.]
  [0.]
  [0.]
  [0.]
  [0.]
  [0.]
  [0.]]

 [[0.]
  [0.]
  [0.]
  [0.]
  [0.]
  [0.]
  [0.]
  [0.]]

 [[0.]
  [0.]
  [0.]
  [0.]
  [0.]
  [0.]
  [0.]
  [0.]]]


### arange

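The arange function is similar to Python's range function, but it returns an array instead of a range object. If the data type is not specified, it is inferred from the arguments: integer arguments give an integer array, and float arguments give np.float64.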

In [14]:
a = np.arange(15)
print(a)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
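
As a small sketch of how arange handles its arguments and data type (the variable names are only illustrative), the following shows the start/stop/step form and the dtype that results in each case:

```python
import numpy as np

# Integer arguments give an integer array (the exact width depends on the platform).
ints = np.arange(2, 11, 3)        # start=2, stop=11 (exclusive), step=3
print(ints, ints.dtype.kind)      # kind 'i' means signed integer

# A float argument makes the result float64.
floats = np.arange(0, 1, 0.25)
print(floats, floats.dtype)

# The dtype can also be requested explicitly.
explicit = np.arange(5, dtype=np.float64)
print(explicit, explicit.dtype)
```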


### zeros, zeros_like

zeros(shape) returns an array of the given shape initialised with 0. Note that shape should be an int (for a 1D array) or a tuple of ints.

zeros_like(array) returns an array with the same shape and data type as array, initialised with zeros.

ones and ones_like work the same way, except that the array is initialised with ones.

empty and empty_like also work the same way, but they allocate the array without initialising it (hence they can be slightly faster). The contents are whatever happened to be in that memory, so treat them as garbage. In the run shown below the arrays happen to contain ones, most likely memory left over from the earlier ones example; this must not be relied on.

In [16]:
print("Zeros")
a = np.zeros((3,3))
print(f"A: {a}")

b = np.zeros_like(a)
print(f"B: {b}")

Zeros
A: [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
B: [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [17]:
print("Ones")
a = np.ones((3,3))
print(f"A: {a}")

b = np.ones_like(a)
print(f"B: {b}")

Ones
A: [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
B: [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


In [18]:
print("Empty")
a = np.empty((3,3))
print(f"A: {a}")

b = np.empty_like(a)
print(f"B: {b}")

Empty
A: [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
B: [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


### astype

astype converts an array from one data type to another. Note that, by default, astype creates a new copy of the input array, even if the data type is the same.

Converting from floating point to integer loses information: the fractional part is truncated, not rounded.

In [21]:
a = np.array([1,2,3,4.5,6,7,8,9])
print(f"A :{a}, dtype :{a.dtype}")

b = a.astype(int)
print(f"B :{b}, dtype :{b.dtype}")


A :[1.  2.  3.  4.5 6.  7.  8.  9. ], dtype :float64
B :[1 2 3 4 6 7 8 9], dtype :int32
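
To illustrate the two points above, here is a minimal sketch (the values are made up) showing that float-to-int conversion truncates toward zero, and that astype with default arguments returns a separate array:

```python
import numpy as np

a = np.array([-1.7, -0.2, 0.9, 2.5])

# Float-to-int conversion truncates toward zero; it does not round.
truncated = a.astype(int)
print(truncated)                  # values: -1, 0, 0, 2

# Round first if nearest-integer behaviour is wanted.
rounded = np.rint(a).astype(int)
print(rounded)                    # values: -2, 0, 1, 2

# With default arguments astype returns a new array, even for the same dtype.
same = a.astype(a.dtype)
print(np.shares_memory(a, same))  # False
```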


### Vectorization and vector-scalar operations

Using for loops in code is not only error-prone but also inefficient. NumPy operations that act on whole arrays let us avoid such loops.
This process is called vectorization.

###### Operations on arrays of the same size are applied element-wise.

In [23]:
a = np.array([[1,2,3],[4,5,6]])
b = np.array([[4,5,6],[1,2,3]])

c = a + b
print(c)

d = a - b
print(d)

m = a * b
print(m)

[[5 7 9]
 [5 7 9]]
[[-3 -3 -3]
 [ 3  3  3]]
[[ 4 10 18]
 [ 4 10 18]]
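
To make the contrast with an explicit loop concrete, the sketch below computes the same sum twice, once with nested for loops and once with a single vectorized expression. The names looped and vectorized are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[4, 5, 6], [1, 2, 3]])

# Explicit loops: visit every position and add the elements one by one.
looped = np.empty_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        looped[i, j] = a[i, j] + b[i, j]

# Vectorized: a single expression, with the loop running inside NumPy.
vectorized = a + b

print(np.array_equal(looped, vectorized))  # True
```

The vectorized form is shorter and leaves no index arithmetic to get wrong; on large arrays it is also typically much faster.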


Operations between a scalar and an array are also applied element-wise: the scalar is combined with every element of the array.

In [24]:
a = 3
b = np.array([[1,2,3],[4,5,6]])

c = a + b
print(c)

c = a * b
print(c)

c = 1.0 / b
print(c)

[[4 5 6]
 [7 8 9]]
[[ 3  6  9]
 [12 15 18]]
[[1.         0.5        0.33333333]
 [0.25       0.2        0.16666667]]


### Slicing

You can slice using the following syntax:

    array[start_index : end_index]

For an n-dimensional array, give one slice per axis, separated by commas:

    array[start_index : end_index, start_index : end_index]

Slicing NumPy arrays is similar to slicing Python lists. One main distinction is that a NumPy slice is not a copy but a view of the original array. Hence, any operation on the slice is reflected in the original array.

In [25]:
a = np.arange(20)
print(a)
a[10:15] = 5
print(a)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
[ 0  1  2  3  4  5  6  7  8  9  5  5  5  5  5 15 16 17 18 19]


If you want to avoid the above scenario, you can use copy().

In [27]:
a = np.arange(20)
print(a)
b = a[10:15].copy()
# b is an independent copy, so the original array is left unchanged
print(b)
print(a)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
[10 11 12 13 14]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
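
The following sketch, using a small made-up 2D array, shows slicing along two axes and checks the view behaviour directly with np.shares_memory. A plain Python list is included for comparison.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

# Rows 0-1 and columns 1-2; the end index is exclusive, as with lists.
block = grid[0:2, 1:3]
print(block)

# A basic slice is a view: it shares memory with the original array.
print(np.shares_memory(grid, block))   # True
block[0, 0] = 100
print(grid[0, 1])                      # 100

# A Python list slice, by contrast, is an independent copy.
items = list(range(5))
part = items[1:3]
part[0] = 100
print(items)                           # [0, 1, 2, 3, 4]
```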


### Boolean indexing

Boolean indexing lets you filter an array or check whether any entries have a specific value.

In [28]:
a = np.array(["ramesh", "suresh", "aadesh", "rakesh", "parvesh"])
a == "suresh"

array([False,  True, False, False, False])

In [29]:
# lists the entries where value != "suresh"
a[~(a == "suresh")]

array(['ramesh', 'aadesh', 'rakesh', 'parvesh'], dtype='<U7')

###### Use | for or and & for and. Python's and and or keywords will not work with NumPy's boolean arrays.

In [30]:
(a == "suresh") | (a == "rakesh")

array([False,  True, False,  True, False])
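
As a further sketch, a mask built from one array can be used to index another array of the same length, and can also be used for assignment. The scores array below is made up purely for illustration.

```python
import numpy as np

names = np.array(["ramesh", "suresh", "aadesh", "rakesh", "parvesh"])
scores = np.array([72, 45, 88, 45, 91])   # made-up values for illustration

# A mask built on one array can index another array of the same length.
print(names[scores > 70])

# Combine conditions with & and |; the parentheses are required
# because & and | bind more tightly than comparisons.
mask = (scores > 50) & (names != "aadesh")
print(names[mask])

# Assigning through a mask changes the selected entries in place.
scores[scores < 50] = 0
print(scores)
```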

### Fancy indexing

You can index with a list of integers to select rows in a given order. For instance, suppose you want the rows at indices 1, 3 and 2 (that is, the 2nd, 4th and 3rd rows), in that order.

In [32]:
a = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4]])
print(a)
a[[1,3,2]]

[[1 1 1]
 [2 2 2]
 [3 3 3]
 [4 4 4]]


array([[2, 2, 2],
       [4, 4, 4],
       [3, 3, 3]])
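
A few more cases, sketched with small made-up arrays: negative and repeated indices, indexing with two lists at once, and the fact that fancy indexing returns a copy rather than a view.

```python
import numpy as np

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])

# Negative indices count from the end, and an index may repeat.
print(a[[-1, 0, 0]])

# Two index lists are paired element by element:
# this picks b[0, 2] and b[3, 1], not a 2x2 block.
b = np.arange(12).reshape(4, 3)
print(b[[0, 3], [2, 1]])         # elements 2 and 10

# Unlike a basic slice, fancy indexing returns a copy.
picked = a[[1, 3, 2]]
picked[0, 0] = 99
print(a[1, 0])                   # still 2
```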

### Transposing

You can obtain the transpose of a matrix using matrix.T, where matrix is the name of your array.

In [33]:
a = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4]])
print(a.T)

[[1 2 3 4]
 [1 2 3 4]
 [1 2 3 4]]
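
As a brief sketch with a made-up 2x3 array, the example below shows how the shape changes, that .T is a view of the same data, and one common use of the transpose in a matrix product:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
t = a.T

print(a.shape, t.shape)          # (2, 3) (3, 2)

# .T is a view, so writing through it changes the original.
t[0, 1] = 100
print(a[1, 0])                   # 100

# A common use: the product of a matrix's transpose with the matrix.
gram = a.T @ a
print(gram.shape)                # (3, 3)
```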


### Universal Functions

NumPy has a variety of universal functions (ufuncs) that operate element-wise on arrays and also accept scalars. Some examples are sqrt, exp, log, log10, sin, cos, etc.

In [34]:
a = 20
b = np.random.rand(2,2)
print(np.exp(a))
print(np.exp(b))

485165195.4097903
[[2.51406066 1.00664691]
 [2.33869787 1.48586741]]
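
Because the example above uses random input, its output differs from run to run. The sketch below uses fixed, made-up values instead, and also shows a ufunc that takes two arrays (np.maximum):

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0, 16.0])
y = np.array([2.0, 3.0, 10.0, 15.0])

# Unary ufunc: applied to each element.
print(np.sqrt(x))                # values: 1, 2, 3, 4

# Binary ufunc: applied to pairs of elements.
print(np.maximum(x, y))          # values: 2, 4, 10, 16

# Ufuncs accept plain Python scalars too.
print(np.sqrt(16))               # 4.0
```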


### meshgrid

One of the most useful functions is meshgrid. A common use is to visualize the decision boundaries of a classifier: you train the classifier, create a meshgrid covering every pixel in the plot, and then classify each pixel. When each pixel is colored according to its predicted class, the boundaries become clearly visible.

Using meshgrid requires three steps.

    1. Create xs (1D array)
    2. Create ys (1D array)
    3. Create the meshgrid (two 2D arrays, xx and yy) which together give the coordinates of every pixel in the graph.


In [38]:
xs = np.linspace(1, 10, 100)
ys = np.linspace(1, 10, 100)

xx, yy = np.meshgrid(xs, ys)
# plot with xx and yy
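
To see what meshgrid actually returns, the sketch below uses a tiny grid that can be printed in full. The labelling rule at the end is a stand-in for a trained classifier (an assumption for illustration, not part of the original example).

```python
import numpy as np

xs = np.linspace(0, 2, 3)        # [0. 1. 2.]
ys = np.linspace(0, 1, 2)        # [0. 1.]

xx, yy = np.meshgrid(xs, ys)
print(xx.shape, yy.shape)        # (2, 3) (2, 3): len(ys) rows, len(xs) columns
print(xx)
print(yy)

# Every (xx[i, j], yy[i, j]) pair is one grid point.
points = np.column_stack([xx.ravel(), yy.ravel()])
print(points.shape)              # (6, 2)

# Stand-in for a classifier: label each point by whether it lies above the line y = x / 2.
labels = (points[:, 1] > points[:, 0] / 2).astype(int).reshape(xx.shape)
print(labels)
```

The labels array has the same shape as xx and yy, so it can be passed to a plotting function that colors each grid point by class.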

### where

If you have three arrays x, y and condition, then np.where(condition, x, y) is a vectorized replacement for applying the following to each element:

    if condition:
        use x
    else:
        use y

In [39]:
a = [0,-1,2,3,-4,-5]
b = [9,3,4,11,2,3]
c = [True, False, True, True, False, True]
np.where(c, a, b)

array([ 0,  3,  2,  3,  2, -5])
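
In practice the condition is often computed from the data, and x or y may be scalars. A minimal sketch, reusing the values of a from above:

```python
import numpy as np

a = np.array([0, -1, 2, 3, -4, -5])

# The condition can be computed from the data, and x or y can be scalars.
clipped = np.where(a < 0, 0, a)
print(clipped)                   # negatives replaced by 0

# Equivalent list comprehension, for comparison.
looped = [0 if value < 0 else value for value in a.tolist()]
print(looped)

# With only a condition, np.where returns the indices where it is True.
(indices,) = np.where(a < 0)
print(indices)
```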

### mean, sum, std

NumPy provides a variety of functions for statistical use. You can also specify the axis along which to reduce.

In [40]:
a = np.random.rand(3,3)
print(a)
print(np.mean(a))  # both are fine
print(a.mean())

print(np.std(a))

[[0.63296933 0.1877776  0.19109017]
 [0.10336466 0.09954088 0.05026692]
 [0.89597247 0.52586115 0.37208226]]
0.3398806031309812
0.3398806031309812
0.27358075082412686
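
The axis argument is easier to follow with fixed numbers. In the sketch below (a made-up 2x3 array), axis=0 reduces down the rows and gives one value per column, while axis=1 reduces across the columns and gives one value per row:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.sum())                   # 21, over all elements
print(a.sum(axis=0))             # column sums: 5, 7, 9
print(a.sum(axis=1))             # row sums: 6, 15
print(a.mean(axis=0))            # column means: 2.5, 3.5, 4.5
print(a.std(axis=1))             # per-row standard deviation
```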
