
# What is NumPy?

- NumPy is the fundamental package for scientific computing in Python. It is used for working with arrays.
- It also provides functions for linear algebra, Fourier transforms, and matrix operations.
- NumPy stands for Numerical Python.

# Install NumPy

In [1]:
pip install numpy

Note: you may need to restart the kernel to use updated packages.


# Why Use NumPy?

Even though Python lists are great on their own, NumPy has a number of key features that give it great advantages over Python lists.

### Benefits of using NumPy

**1) Speed** - On large arrays, NumPy operations are often orders of magnitude faster than the equivalent loops over Python lists. The speed comes from NumPy arrays being stored compactly in memory and from the optimized routines NumPy uses for arithmetic, statistical, and linear algebra operations.<br>
**2) Multidimensional array data structures** - NumPy is optimized for matrix operations and lets us perform linear algebra operations efficiently, which makes it well suited to machine learning problems.<br>
**3) Optimized built-in mathematical functions** - NumPy has a large number of optimized built-in mathematical functions. They let you perform complex mathematical computations quickly and with very little code, which makes your programs more readable and easier to understand.

## 1. Speed

In [2]:
# Why use NumPy?
import time
import numpy as np
x = np.random.random(100000000)

# Case 1: mean computed with Python's built-in sum() and len()
start = time.time()
sum(x) / len(x)
print("using built-in python function: ", time.time() - start)

# Case 2: mean computed with NumPy's vectorized np.mean()
start = time.time()
np.mean(x)
print("using NumPy: ", time.time() - start)

using built-in python function:  8.79297661781311
using NumPy:  0.06635451316833496
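
A single `time.time()` measurement like the one above can be noisy. As a rough sketch, the same comparison can be repeated with the standard-library `timeit` module, keeping the best of a few runs. The smaller array size (1,000,000 elements) and the repeat count of 3 are assumptions chosen only to keep the cell quick; the exact timings will differ from machine to machine.

```python
import timeit
import numpy as np

# Smaller array than in the cell above so that the comparison runs quickly
x = np.random.random(1_000_000)

# Run each approach 3 times (one call per run) and keep the fastest run
t_builtin = min(timeit.repeat(lambda: sum(x) / len(x), number=1, repeat=3))
t_numpy = min(timeit.repeat(lambda: np.mean(x), number=1, repeat=3))

print("built-in sum() / len():", t_builtin)
print("np.mean():             ", t_numpy)

# Both approaches should agree on the value, up to floating-point rounding
same_value = np.isclose(sum(x) / len(x), np.mean(x))
print("same result:", same_value)
```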


# Creating NumPy array

### 1. Using regular Python lists

In [3]:
import numpy as np

In [4]:
#  a 1-D Array of Integers (Rank #1 Array)

a = np.array([1, 2, 3, 4, 5])
print(a)
print(type(a))

[1 2 3 4 5]
<class 'numpy.ndarray'>


In [5]:
print("rank: ", a.ndim)

rank:  1


In [6]:
print(a.shape)

(5,)


In [7]:
print(a.dtype)

int64


In [8]:
# 2-D Array (Rank #2 Array)

b = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(b)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [9]:
print(b.shape)

(3, 4)


In [10]:
print("rank: ", b.ndim)

rank:  2


In [11]:
# 1-D Array of Strings (Rank #1 Array)

c = np.array(["Hello", "World"])
print(c.dtype)

<U5


In [12]:
# 1-D Array of Int and String (Rank #1 Array)

d = np.array([1, 2, 3, "Hello"])
print(d.dtype)

<U21


In [13]:
# 1-D Array of Int and Float

e = np.array([1, 2.3, 5])
print(e.dtype)

float64


In [14]:
# 1-D Array of Int and Float, forcing the datatype of each element to int64 (the fractional part of 2.3 is discarded)

f = np.array([1, 2.3, 4], dtype = np.int64)
print(f.dtype)

int64
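
The cell above only prints the dtype, so it does not show what happened to `2.3`. The sketch below prints the values as well, and compares the truncating conversion with rounding first. The arrays `g`, `truncated` and `rounded` are hypothetical names introduced for illustration.

```python
import numpy as np

# Forcing an integer dtype drops the fractional part of 2.3
f = np.array([1, 2.3, 4], dtype=np.int64)
print(f)  # values: 1, 2, 4

# astype() converts an existing array; float -> int truncates toward zero
g = np.array([1.9, -1.9, 2.5])
truncated = g.astype(np.int64)
print(truncated)  # values: 1, -1, 2

# To round instead of truncate, round first and convert afterwards
# (np.rint rounds halves to the nearest even integer, so 2.5 becomes 2)
rounded = np.rint(g).astype(np.int64)
print(rounded)  # values: 2, -2, 2
```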


In [15]:
#  Save the NumPy array to a File
x = np.array([1, 2, 3, 4, 5])

# save x into the current directory as 'my_array'
np.save('my_array', x)

In [16]:
# load the saved array from current directory into variable y
y = np.load('my_array.npy')
print(y)

[1 2 3 4 5]


### 2. Using built-in NumPy functions

In [17]:
# 3X4 ndarray full of zeros

X = np.zeros((3, 4))
print(X)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


In [18]:
# 3X4 ndarray full of ones

Y = np.ones((3, 4))
print(Y)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


In [19]:
# 3X4 ndarray full of sevens

Z = np.full((3, 4), 7)
print(Z)

[[7 7 7 7]
 [7 7 7 7]
 [7 7 7 7]]


In [20]:
# 5X5 identity matrix

I = np.eye(5)
print(I)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


In [21]:
# 4X4 diagonal matrix having diagonal values 1, 2, 3 and 4

D = np.diag([1, 2, 3, 4])
print(D)

[[1 0 0 0]
 [0 2 0 0]
 [0 0 3 0]
 [0 0 0 4]]


In [22]:
# rank 1 ndarray that has sequential integers from 0 to 9
N = np.arange(10)

# rank 1 ndarray that has sequential integers from 10 to 29
M = np.arange(10, 30)

# rank 1 ndarray that has evenly spaced integers from 10 up to (but not including) 30 in steps of 3
O = np.arange(10, 30, 3)
print(N)
print(M)
print(O)

[0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29]
[10 13 16 19 22 25 28]


In [23]:
# rank 1 ndarray that has 50 evenly spaced floats between 10 and 30 (both endpoints included)
Q = np.linspace(10, 30, 50)
print(Q)

[10.         10.40816327 10.81632653 11.2244898  11.63265306 12.04081633
 12.44897959 12.85714286 13.26530612 13.67346939 14.08163265 14.48979592
 14.89795918 15.30612245 15.71428571 16.12244898 16.53061224 16.93877551
 17.34693878 17.75510204 18.16326531 18.57142857 18.97959184 19.3877551
 19.79591837 20.20408163 20.6122449  21.02040816 21.42857143 21.83673469
 22.24489796 22.65306122 23.06122449 23.46938776 23.87755102 24.28571429
 24.69387755 25.10204082 25.51020408 25.91836735 26.32653061 26.73469388
 27.14285714 27.55102041 27.95918367 28.36734694 28.7755102  29.18367347
 29.59183673 30.        ]
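
`np.arange` and `np.linspace` differ in what you specify (a step versus a number of points) and in how they treat the end value. A small side-by-side sketch, using values picked only to keep the output short:

```python
import numpy as np

# arange: you choose the step; the stop value (30) is excluded
a = np.arange(10, 30, 5)
print(a)  # values: 10, 15, 20, 25

# linspace: you choose the number of points; the stop value is included by default
b = np.linspace(10, 30, 5)
print(b)  # values: 10., 15., 20., 25., 30.

# endpoint=False excludes the stop value; retstep=True also returns the spacing
c, step = np.linspace(10, 30, 4, endpoint=False, retstep=True)
print(c, step)  # values: 10., 15., 20., 25. with step 5.0
```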


In [24]:
#  reshape M into a 5X4 ndarray
m = M.reshape(5, 4)
print(m)

[[10 11 12 13]
 [14 15 16 17]
 [18 19 20 21]
 [22 23 24 25]
 [26 27 28 29]]


In [25]:
n = np.arange(0, 100).reshape(10, 10)
print(n)

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]
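
`reshape` only works when the new shape holds exactly the same number of elements. The sketch below shows two related points: passing `-1` lets NumPy infer one dimension, and an incompatible shape raises a `ValueError`. The names `m`, `flat` and `error_message` are introduced here for illustration.

```python
import numpy as np

M = np.arange(10, 30)          # 20 elements

# -1 asks NumPy to work out the missing dimension: 20 / 5 = 4 columns
m = M.reshape(5, -1)
print(m.shape)                 # (5, 4)

# reshape(-1) flattens back to a rank 1 ndarray
flat = m.reshape(-1)
print(flat.shape)              # (20,)

# 3 * 7 = 21 does not match 20 elements, so this reshape fails
error_message = None
try:
    M.reshape(3, 7)
except ValueError as err:
    error_message = str(err)
    print("reshape failed:", error_message)
```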


In [26]:
# 5X5 ndarray with random floats in the half-open interval [0.0, 1.0)
r = np.random.random((5, 5))
print(r)

[[0.38267208 0.93649719 0.0841749  0.97859433 0.9368141 ]
 [0.23820201 0.21201156 0.83688133 0.89706083 0.61604438]
 [0.87850504 0.42729314 0.65233941 0.2540982  0.27443699]
 [0.89783604 0.12439072 0.67884378 0.75380297 0.75369976]
 [0.41866344 0.52158947 0.73981456 0.52258119 0.90904998]]


In [27]:
#  5X4 ndarray with random integers in the half-open interval [4, 20)
r = np.random.randint(4, 20, (5, 4))
print(r)

[[ 4  9 18 19]
 [15 14  7 17]
 [10  4 19 14]
 [11 17 13 10]
 [ 5 13  4  8]]


In [28]:
# normal(Gaussian) distribution, mean = 0, standard deviation = 0.1

r = np.random.normal(0, 0.1, size = (100, 100))

print("mean: ", r.mean())
print("std: ", r.std())
print("max: ", r.max())
print("min: ", r.min())
print("Positive: ", (r>0).sum())
print("Negative: ", (r<0).sum())

mean:  -0.0001244873616076588
std:  0.09900873078420691
max:  0.38330555037630487
min:  -0.37731305810663573
Positive:  5067
Negative:  4933
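
The random cells above give different numbers on every run. If reproducible output is wanted, one option is NumPy's `Generator` interface created with `np.random.default_rng` and a fixed seed. This is an alternative to the `np.random.*` functions used in this notebook, not a replacement the notebook itself relies on, and the seed value 42 is an arbitrary assumption.

```python
import numpy as np

# A seeded generator produces the same sequence every time it is recreated
rng = np.random.default_rng(seed=42)

floats = rng.random((5, 5))                    # floats in [0.0, 1.0)
ints = rng.integers(4, 20, size=(5, 4))        # integers in [4, 20)
normal = rng.normal(0, 0.1, size=(100, 100))   # mean 0, standard deviation 0.1

# Recreating the generator with the same seed reproduces the first draw
rng_again = np.random.default_rng(seed=42)
reproduced = np.array_equal(floats, rng_again.random((5, 5)))
print("reproducible:", reproduced)
```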


# Accessing, Deleting, and Inserting Elements Into ndarrays

In [29]:
# indexing

x = np.array([1, 2, 3, 4, 5])

In [30]:
print("1st element: ", x[0])
print("2nd element: ", x[1])
print("3rd element: ", x[2])
print("5th element: ", x[4])

1st element:  1
2nd element:  2
3rd element:  3
5th element:  5


In [31]:
# negative indexing
print("1st element: ", x[-5])
print("2nd element: ", x[-4])
print("3rd element: ", x[-3])
print("5th element: ", x[-1])


1st element:  1
2nd element:  2
3rd element:  3
5th element:  5


In [32]:
y = np.arange(1, 10).reshape(3, 3)
print(y)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [33]:
print("element at (0, 0): ", y[0, 0])
print("element at (1, 2): ", y[1, 2])
print("element at (2, 1): ", y[2, 1])

element at (0, 0):  1
element at (1, 2):  6
element at (2, 1):  8


In [34]:
# modifying element

y[0, 0] = 100
print(y)


[[100   2   3]
 [  4   5   6]
 [  7   8   9]]


In [35]:
# deleting element

X = np.delete(x, [0, 4])
print(X)

[2 3 4]


In [36]:
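# without an axis argument, np.delete works on the flattened array;
# index 0 is listed twice but is removed only once
Y = np.delete(y, [0, 0])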
print(Y)

[2 3 4 5 6 7 8 9]
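
To remove whole rows or columns from a 2-D array rather than elements of the flattened array, `np.delete` accepts an `axis` argument. A minimal sketch on a fresh 3X3 array (the variable names are chosen here for illustration):

```python
import numpy as np

y = np.arange(1, 10).reshape(3, 3)

# axis=0 deletes rows: drop the first row
without_first_row = np.delete(y, 0, axis=0)
print(without_first_row)       # rows [4 5 6] and [7 8 9]

# axis=1 deletes columns: drop the first and last columns
middle_column = np.delete(y, [0, 2], axis=1)
print(middle_column)           # a single column holding 2, 5, 8

# np.delete returns a new array; y itself is unchanged
print(y.shape)                 # (3, 3)
```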


In [37]:
# appending a row (axis = 0)

W = np.append(y, [[10, 20, 30]], axis = 0)
print(W)

[[100   2   3]
 [  4   5   6]
 [  7   8   9]
 [ 10  20  30]]


In [38]:
# appending a column (axis = 1)
V = np.append(y, [[10], [20], [30]], axis = 1)
print(V)

[[100   2   3  10]
 [  4   5   6  20]
 [  7   8   9  30]]


In [39]:
# inserting an element at a position

Z = np.insert(W, 2, [9, 8, 7], axis = 0)
print(Z)

[[100   2   3]
 [  4   5   6]
 [  9   8   7]
 [  7   8   9]
 [ 10  20  30]]


In [40]:
x = np.array([1,2])
y = np.array([[3,4],[5,6]])


# stack x on top of y
K = np.vstack((x, y))
print("K:\n", K)

# stack x as a column to the right of y
M = np.hstack((y,x.reshape(2,1)))
print("M:\n", M)

K:
 [[1 2]
 [3 4]
 [5 6]]
M:
 [[3 4 1]
 [5 6 2]]
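
The same two results can also be built with `np.concatenate` by giving `x` an explicit second dimension first. This is only a sketch of an equivalent approach; `K2` and `M2` are names introduced here so they do not overwrite the arrays above.

```python
import numpy as np

x = np.array([1, 2])
y = np.array([[3, 4], [5, 6]])

# x[np.newaxis, :] has shape (1, 2), so it can be joined to y as an extra row
K2 = np.concatenate((x[np.newaxis, :], y), axis=0)

# x[:, np.newaxis] has shape (2, 1), so it can be joined to y as an extra column
M2 = np.concatenate((y, x[:, np.newaxis]), axis=1)

print(np.array_equal(K2, np.vstack((x, y))))                 # True
print(np.array_equal(M2, np.hstack((y, x.reshape(2, 1)))))   # True
```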


# Slicing ndarrays

- Three types of slicing:

1. array[ start:end ]
2. array[ start: ]
3. array[ :end ]

- start - inclusive
- end - exclusive
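
Besides the forms listed above, a slice can take an optional third value, the step (`array[start:end:step]`). This goes slightly beyond the three forms listed here, so treat it as a supplementary sketch:

```python
import numpy as np

x = np.arange(10)

every_second = x[2:8:2]    # start 2 (inclusive), end 8 (exclusive), step 2
print(every_second)        # values: 2, 4, 6

every_third = x[::3]       # whole array, step 3
print(every_third)         # values: 0, 3, 6, 9

reversed_x = x[::-1]       # a negative step walks the array backwards
print(reversed_x)          # values: 9 down to 0

# Steps work per axis on 2-D arrays too: rows 0 and 2, columns 1 and 3
X = np.arange(0, 20).reshape(4, 5)
sub = X[::2, 1::2]
print(sub)                 # rows [1 3] and [11 13]
```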


In [41]:
X = np.arange(0, 20).reshape(4, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


In [42]:
# [row, column]
y = X[1:3, 2:4]
print(y)

[[ 7  8]
 [12 13]]


In [43]:
# rows- second through last, columns- third through last
y = X[1:, 2:]
print(y)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [44]:
# rows- first through third, columns- first through second
y = X[:3, :2]
print(y)

[[ 0  1]
 [ 5  6]
 [10 11]]


In [45]:
# all elements in the third column
y = X[:, 2]
print(y)

[ 2  7 12 17]


In [46]:
# slicing with 2:3 instead of indexing with 2 keeps the column as a 2-D array of shape (4, 1)
y = X[:, 2:3]
print(y)

[[ 2]
 [ 7]
 [12]
 [17]]


In [47]:
# slicing a row with 2:3 keeps it 2-D; indexing with 2 returns a 1-D array
y = X[2:3, :]
z = X[2, :]
print(y)
print(z)

[[10 11 12 13 14]]
[10 11 12 13 14]


In [48]:
z = X[1:, 2:]
print("X: ")
print(X)
print("z: ")
print(z)


X: 
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
z: 
[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


**Note-** When we slice an ndarray and save the result in a new variable, as we did above, the data is not copied. The new variable is a view of the original array, so changing one changes the other.

In [49]:
z[2, 2] = 333333
print("X: ", X)
print("z: ", z)

X:  [[     0      1      2      3      4]
 [     5      6      7      8      9]
 [    10     11     12     13     14]
 [    15     16     17     18 333333]]
z:  [[     7      8      9]
 [    12     13     14]
 [    17     18 333333]]


In [50]:
# .copy()- It returns a copy of the array

z = X[1:, 2:].copy()
z[2, 2] = 6666
print("X: ", X)
print("z: ", z)

X:  [[     0      1      2      3      4]
 [     5      6      7      8      9]
 [    10     11     12     13     14]
 [    15     16     17     18 333333]]
z:  [[   7    8    9]
 [  12   13   14]
 [  17   18 6666]]
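
One way to check whether two arrays share data, without modifying either of them first, is `np.shares_memory`. The sketch below rebuilds a small example; `view` and `copied` are names introduced for illustration.

```python
import numpy as np

X = np.arange(0, 20).reshape(4, 5)

view = X[1:, 2:]             # a slice: shares data with X
copied = X[1:, 2:].copy()    # an explicit copy: owns its own data

print(np.shares_memory(X, view))     # True
print(np.shares_memory(X, copied))   # False

# Writing through the view changes X; the copy keeps the old value
view[0, 0] = -1
print(X[1, 2])        # -1
print(copied[0, 0])   # 7
```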


In [51]:
# extract diagonal elements
d = np.diag(X)
print(d)

[ 0  6 12 18]


In [52]:
# extract the elements just above the main diagonal (k = 1)
d = np.diag(X, k=1)
print(d)

[     1      7     13 333333]


In [53]:
# extract the elements just below the main diagonal (k = -1)
d = np.diag(X, k = -1)
print(d)

[ 5 11 17]


In [54]:
# unique elements from array
X = [[1, 2, 3], [2, 3, 3]]
print(np.unique(X))

[1 2 3]


# Boolean Indexing, Set Operations, and Sorting

### Boolean Indexing

Slicing an array by index is useful when we know the exact indices of the elements we want to select.
When we do not know the indices but can describe the elements with a condition, boolean indexing can help.
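
Under the hood, a condition such as `X > 10` produces a boolean array of the same shape as `X`, and indexing with it keeps the elements where the mask is `True`. A minimal sketch that makes the intermediate mask visible (the names `mask`, `selected` and `even_and_big` are introduced here):

```python
import numpy as np

X = np.arange(0, 20).reshape(4, 5)

# The comparison is evaluated element by element
mask = X > 10
print(mask.dtype, mask.shape)   # bool (4, 5)
print(mask.sum())               # 9 elements satisfy the condition

# Indexing with the mask returns the matching elements as a 1-D array
selected = X[mask]
print(selected)                 # values: 11 through 19

# Conditions are combined with & (and), | (or), ~ (not);
# each condition needs its own parentheses
even_and_big = X[(X % 2 == 0) & (X > 10)]
print(even_and_big)             # values: 12, 14, 16, 18
```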

In [55]:
X = np.arange(0, 20).reshape(4, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


In [56]:
# select elements greater than 10
print(X[X>10])

[11 12 13 14 15 16 17 18 19]


In [57]:
# select elements less than or equal to 7
print(X[X<=7])

[0 1 2 3 4 5 6 7]


In [58]:
# select all the elements with values between 11 and 19 (inclusive)
print(X[(X>10)&(X<20)])

[11 12 13 14 15 16 17 18 19]


In [59]:
# replace all the elements with values between 11 and 19 (inclusive) with -1
X[(X>10)&(X<20)] = -1
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 -1 -1 -1 -1]
 [-1 -1 -1 -1 -1]]


### Set Operations

In [60]:
x = np.array([1, 2, 3, 4, 5])
y = np.array([6, 7, 3, 2, 8, 9])

In [61]:
print(np.intersect1d(x, y))
print(np.setdiff1d(x, y))
print(np.union1d(x, y))

[2 3]
[1 4 5]
[1 2 3 4 5 6 7 8 9]
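
Two related functions are sometimes useful alongside the three above: `np.setxor1d` for elements that appear in exactly one of the arrays, and `np.isin` for an element-wise membership test. They are not used elsewhere in this notebook, so the following is only a supplementary sketch on the same `x` and `y`:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([6, 7, 3, 2, 8, 9])

# Elements found in exactly one of the two arrays
only_one = np.setxor1d(x, y)
print(only_one)        # values: 1, 4, 5, 6, 7, 8, 9

# setdiff1d is not symmetric: elements of y that are not in x
y_minus_x = np.setdiff1d(y, x)
print(y_minus_x)       # values: 6, 7, 8, 9

# Boolean mask telling which elements of x also occur in y
in_y = np.isin(x, y)
print(in_y)            # False, True, True, False, False
```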


### Sorting

**ndarray sort method (X.sort())-**
- It sorts the array in place, so the original array is replaced by the sorted one.

**numpy.sort function (np.sort(X))-**
- It sorts the ndarray out of place: it returns a sorted copy and doesn't change the original ndarray.

In [62]:
s = np.random.randint(1, 20, size = (10, ))
print(s)

[17 15 19 14  5  5 11 19 19 17]


In [63]:
print(np.sort(s))
print(s)

[ 5  5 11 14 15 17 17 19 19 19]
[17 15 19 14  5  5 11 19 19 17]


In [64]:
s = np.random.randint(1, 20, size = (10, ))
print(s)
s.sort()
print(s)

[19  3  7  8 11  4  5  2  4 10]
[ 2  3  4  4  5  7  8 10 11 19]


In [65]:
s = np.random.randint(1, 20, size = (5, 5))
print(s)

[[ 5 17 16 12 19]
 [13 14 14  6 19]
 [ 5 15  6 18  8]
 [14  2 14 13  4]
 [ 1 10  3  9  5]]


In [66]:
# sort each column (along axis 0)
print(np.sort(s, axis = 0))

[[ 1  2  3  6  4]
 [ 5 10  6  9  5]
 [ 5 14 14 12  8]
 [13 15 14 13 19]
 [14 17 16 18 19]]


In [67]:
# sort each row (along axis 1)
print(np.sort(s, axis = 1))

[[ 5 12 16 17 19]
 [ 6 13 14 14 19]
 [ 5  6  8 15 18]
 [ 2  4 13 14 14]
 [ 1  3  5  9 10]]
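
Two common follow-ups to the sorting cells above are getting the indices that would sort an array (`np.argsort`) and sorting in descending order. The sketch below uses a small fixed array instead of random integers so that the results are predictable; the array values are chosen here for illustration.

```python
import numpy as np

s = np.array([17, 15, 19, 14, 5])

# argsort returns the positions of the elements in ascending order of value
order = np.argsort(s)
print(order)          # positions: 4, 3, 1, 0, 2

# Indexing with those positions gives the sorted array
print(s[order])       # values: 5, 14, 15, 17, 19

# np.sort has no descending option, so one approach is to reverse the result
descending = np.sort(s)[::-1]
print(descending)     # values: 19, 17, 15, 14, 5
```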
