**The Essential NumPy Topics for Data Analytics**


**1. Introduction to NumPy**

NumPy is the foundation of data analytics in Python. It provides:

- Fast numerical operations
- Efficient arrays
- Tools for statistics, reshaping, filtering, and more
- The underlying engine for Pandas, SciPy, Scikit‑Learn, and Matplotlib

**1.1 Install (if needed)**

In [None]:
!pip install numpy

**1.2 Import**

In [None]:
import numpy as np

**2. NumPy Arrays (ndarray)**

**2.1 What an ndarray is**

A NumPy array is:
- A homogeneous data container (all elements have the same data type)
- A multi‑dimensional structure (1D, 2D, 3D, …)
- A fast, memory‑efficient alternative to Python lists
- A structure that supports vectorized operations (operations on entire arrays at once)

Think of it as a super‑powered list designed for mathematics.
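
As a rough illustration of two of the properties above (homogeneity and vectorization), here is a minimal sketch. The variable names (`mixed`, `prices`, `discounted`) are made up for this example.

```python
import numpy as np

# Homogeneous: mixing ints and a float upcasts every element to float64
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)          # float64

# Vectorized: one expression operates on every element, no explicit loop
prices = np.array([10.0, 20.0, 30.0])
discounted = prices * 0.9
print(discounted)

# The equivalent with a plain Python list needs a loop or comprehension
discounted_list = [p * 0.9 for p in [10.0, 20.0, 30.0]]
```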



**2.2 Creating NumPy Arrays**

In [None]:
a = np.array([1, 2, 3, 4, 5])             # 1D array
b = np.array([[6, 7, 8], [9, 10, 11]])    # 2D array; all elements share one dtype and every row must have the same length

**✔ Array attributes**

In [None]:
print(a.shape)      # shape: size along each dimension
print(a.ndim)       # number of dimensions
print(a.size)       # total elements
print(a.dtype)      # data type


In [None]:
print(b)          # print the array
print(b.shape)      # shape: size along each dimension, (2, 3) here
print(b.ndim)       # number of dimensions
print(b.size)       # total elements
print(b.dtype)      # data type

**3. Array Creation Techniques**


In [None]:
## Create a 3×3 array filled with zeros
np.zeros((3, 3))





In [None]:
## Create a 2×4 array filled with ones
np.ones((2, 4)) # shape = (2, 4)



In [None]:
## Create values from 0 to 10 (exclusive) with step size 2
np.arange(0, 10, 2) # start=0, stop=10 (exclusive), step=2



In [None]:
## Create 5 evenly spaced numbers between 0 and 1
np.linspace(0, 1, 5) # start=0, stop=1, num=5 ; unlike arange, linspace includes the endpoint
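
To make the difference between `arange` and `linspace` concrete, here is a small side-by-side sketch. The names `by_step`, `by_count`, and `no_end` are illustrative only.

```python
import numpy as np

by_step = np.arange(0, 1, 0.25)                  # fixed step, stop is excluded
by_count = np.linspace(0, 1, 5)                  # fixed count, stop is included by default
no_end = np.linspace(0, 1, 5, endpoint=False)    # fixed count, stop excluded

print(by_step)    # [0.   0.25 0.5  0.75]
print(by_count)   # [0.   0.25 0.5  0.75 1.  ]
print(no_end)
```

A reasonable rule of thumb: reach for `arange` when the step matters and `linspace` when the number of points matters.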



In [None]:
## Create a 3×3 array of random numbers between 0 and 1
np.random.rand(3, 3)

**4. Indexing & Slicing**


In [None]:
arr = np.array([10, 20, 30, 40, 50])

arr[0]      # 10
arr[-1]     # 50
arr[1:4]    # [20 30 40]


In [None]:
# Create a 2D array
arr_2d = np.array([1, 2, 3, 4, 5, 6]).reshape(2, 3)
print(arr_2d)

In [None]:
# Create a 3D array
arr_3d = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(2, 2, 2)

In [None]:
array= np.array([[1,2,3],[4,5,6],[7,8,9]])

In [None]:
print(array)

In [None]:
print(array[1, 1])  # row index 1, column index 1 -> 5
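
Beyond single elements, 2D arrays can be sliced by row and column. The sketch below uses a hypothetical array called `grid`; the comments show each value at the point where it is created.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

first_row = grid[0]          # [1 2 3]
second_col = grid[:, 1]      # [2 5 8]
top_right = grid[:2, 1:]     # rows 0-1, columns 1-2 -> [[2 3] [5 6]]

# Basic slices are views: writing through one changes the original array
second_col[0] = 99
print(grid[0, 1])            # 99
```

If an independent copy is needed, calling `.copy()` on the slice avoids modifying the original.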

**5. Boolean Filtering (VERY important for analytics)**

In [None]:
data = np.array([12, 5, 18, 7, 30])

data[data > 10]                      # [12 18 30]
data[(data > 10) & (data < 20)]      # [12 18]
print(data[data > 10])
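
A few related mask patterns tend to come up alongside filtering. This is a minimal sketch with a made-up `scores` array, not an exhaustive list.

```python
import numpy as np

scores = np.array([12, 5, 18, 7, 30])

mask = scores > 10                 # boolean array, same shape as scores
print(mask)                        # [ True False  True False  True]
print(mask.sum())                  # 3 -> number of matching elements

outside = scores[(scores < 7) | (scores > 20)]   # OR: use |, not the keyword "or"
not_big = scores[~mask]                          # NOT: use ~
capped = np.where(scores > 20, 20, scores)       # replace values instead of filtering
```

Each condition needs its own parentheses because `&` and `|` bind more tightly than comparison operators.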

**6. Reshaping & Dimensionality**

Used constantly in ML and preprocessing.



In [None]:
arr = np.arange(12)



In [None]:
arr.reshape(3, 4)  # 3 rows, 4 columns; the data stays the same, only the structure changes

In [None]:
arr.reshape(-1, 6)     # 6 columns; -1 lets NumPy infer the row count (2 here)
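
The `-1` placeholder works in any single position, as long as the total number of elements still fits. A short sketch (the names `two_rows` and `column` are illustrative):

```python
import numpy as np

arr = np.arange(12)

two_rows = arr.reshape(2, -1)      # -1 is inferred as 6
column = arr.reshape(-1, 1)        # column vector, shape (12, 1)
print(two_rows.shape, column.shape)

# The total size must match, otherwise reshape raises ValueError
try:
    arr.reshape(5, -1)             # 12 is not divisible by 5
except ValueError as exc:
    print("cannot reshape:", exc)
```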

In [None]:
array_flattened = np.arange(12).reshape(3, 4).flatten()  # converts any multi‑dimensional array into a 1‑dimensional copy

print(array_flattened)
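
`flatten()` always returns a copy. A closely related method, `ravel()`, returns a view when the memory layout allows it, which matters if the result is modified. A minimal sketch, assuming a small contiguous array named `grid`:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)

flat_copy = grid.flatten()   # always a copy
flat_view = grid.ravel()     # a view here, because grid is contiguous

flat_copy[0] = 100           # does not affect grid
flat_view[1] = 200           # does affect grid
print(grid)
```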

**7. Combining & Splitting Data**


In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.vstack([a, b]))  # stack as rows -> shape (2, 3)
print(np.hstack([a, b]))  # join end to end -> shape (6,)
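
Comparing the resulting shapes is often the quickest way to see what each stacking function does. The sketch below also includes `np.column_stack` and `np.concatenate`, two related NumPy functions not used above.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

rows = np.vstack([a, b])             # shape (2, 3): each input becomes a row
flat = np.hstack([a, b])             # shape (6,): inputs joined end to end
cols = np.column_stack([a, b])       # shape (3, 2): each input becomes a column
same = np.concatenate([a, b])        # for 1D inputs, same result as hstack

print(rows.shape, flat.shape, cols.shape)
```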

**Splitting**



In [None]:
arr = np.arange(10) 
np.split(arr, 2)  # two equal halves: [0 1 2 3 4] and [5 6 7 8 9]
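
`np.split` requires an even division when given a count. The sketch below shows two alternatives: splitting at explicit indices, and `np.array_split`, which tolerates uneven pieces.

```python
import numpy as np

arr = np.arange(10)

halves = np.split(arr, 2)            # two arrays of 5 elements
pieces = np.split(arr, [3, 7])       # split at indices -> arr[:3], arr[3:7], arr[7:]
uneven = np.array_split(arr, 3)      # 10 elements into 3 parts -> sizes 4, 3, 3

print([len(p) for p in uneven])      # [4, 3, 3]
```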

**8. Mathematical & Statistical Operations**

In [None]:
arr = np.array([10,20,30,40]) 
print(arr.sum()) 
print(arr.mean()) 
print(arr.std())  # standard deviation (population, ddof=0 by default)
print(arr.var())  # variance (population, ddof=0 by default)
print(arr.min()) 
print(arr.max()) 
print(arr.argmin()) # index of minimum value
print(arr.argmax()) # index of maximum value
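
On 2D data, most of these functions accept an `axis` argument, and `std`/`var` accept `ddof` to switch between population and sample statistics. A minimal sketch with a made-up `sales` table:

```python
import numpy as np

sales = np.array([[10, 20, 30],
                  [40, 50, 60]])

total = sales.sum()              # 210, over all elements
per_column = sales.sum(axis=0)   # [50 70 90], collapses the rows
per_row = sales.mean(axis=1)     # [20. 50.], collapses the columns

values = np.array([10, 20, 30, 40])
pop_std = values.std()           # population standard deviation (ddof=0)
sample_std = values.std(ddof=1)  # sample standard deviation (ddof=1)
print(pop_std, sample_std)
```

Note that Pandas defaults to `ddof=1`, so the two libraries can give different results on the same data unless `ddof` is set explicitly.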

**9. Broadcasting (critical for analytics)**


In [None]:
# NumPy can apply operations across arrays of different shapes. This is called broadcasting.
# It lets you combine arrays of compatible shapes without explicitly reshaping or copying them.
arr = np.array([1,2,3]) 
arr + 10             # [11 12 13]

In [None]:
m = np.array([[1,2,3],[4,5,6]])
m + np.array([10, 20, 30])  # (2, 3) + (3,) -> [[11 22 33] [14 25 36]]
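
A common analytics use of broadcasting is centering data by column or by row. The sketch below assumes a small float matrix `m`; `keepdims=True` keeps the reduced axis so the shapes line up for row-wise subtraction.

```python
import numpy as np

m = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

col_means = m.mean(axis=0)                   # shape (3,)
centered = m - col_means                     # (2, 3) - (3,)   -> (2, 3)

row_means = m.mean(axis=1, keepdims=True)    # shape (2, 1)
row_centered = m - row_means                 # (2, 3) - (2, 1) -> (2, 3)

print(centered)
print(row_centered)
```

Shapes are compared from the trailing dimension backwards; two dimensions are compatible when they are equal or one of them is 1.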


**10. Random Number Generation**

In [None]:
rng = np.random.default_rng()  # rng is a random number generator object, not an array that stores data

rng.integers(0, 10, size=5)        # 5 random integers from 0 (inclusive) to 10 (exclusive)
rng.random(3)                      # 3 random floats in the interval [0, 1)
rng.normal(0, 1, 5)                # 5 draws from a normal distribution with mean 0 and standard deviation 1
rng.choice([1, 2, 3, 4], size=3)   # randomly selects 3 values from the list (with replacement by default)



In [None]:
rng.random(3)
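
For reproducible results, the generator can be given a seed. The sketch below uses an arbitrary seed of 42; any integer works, and the same seed produces the same sequence of draws.

```python
import numpy as np

rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draw_a = rng_a.integers(0, 10, size=5)
draw_b = rng_b.integers(0, 10, size=5)
print(np.array_equal(draw_a, draw_b))   # True: same seed, same draws

sample = rng_a.choice([1, 2, 3, 4], size=3, replace=False)  # no repeated values
shuffled = rng_a.permutation(5)                             # 0..4 in random order
```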

**11. Linear Algebra**

In [None]:
a=np.array([[1,2],[4,5]])
b=np.array([[7,9],[10,11]])

In [None]:
a @ b  # matrix multiplication; a and b are both 2×2, so no transpose is needed

In [None]:
a @ b.T           # multiply a by the transpose of b
np.linalg.inv(a)  # inverse of a matrix
np.linalg.det(a)  # determinant (-3 here, up to floating-point rounding)
np.linalg.eig(a)  # eigenvalues and eigenvectors, returned together
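
These results can be checked numerically. The sketch below reuses the same 2×2 matrix with a made-up right-hand side `rhs`, and also shows `np.linalg.solve`, which is generally preferred over computing an explicit inverse when solving a linear system.

```python
import numpy as np

a = np.array([[1.0, 2.0], [4.0, 5.0]])
rhs = np.array([5.0, 14.0])          # hypothetical right-hand side

x = np.linalg.solve(a, rhs)          # solves a @ x = rhs
print(x)

inv = np.linalg.inv(a)
print(np.allclose(a @ inv, np.eye(2)))          # True: a times its inverse is the identity

eigenvalues, eigenvectors = np.linalg.eig(a)
# Each column v of eigenvectors satisfies a @ v = eigenvalue * v
print(np.allclose(a @ eigenvectors, eigenvectors * eigenvalues))  # True
```

`np.allclose` is used instead of `==` because floating-point results are rarely exact.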

**12. Saving & Loading Data**

In [None]:
np.save("data.npy", arr)  # write arr to a binary .npy file
np.load("data.npy")       # read it back as an ndarray
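
When several arrays belong together, or the data needs to be readable outside NumPy, `np.savez` and `np.savetxt` are worth knowing. This is a minimal sketch; the file names and the keys `features` and `labels` are made up, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import numpy as np

arr = np.arange(5)
labels = np.array([0, 1, 0, 1, 1])

with tempfile.TemporaryDirectory() as tmp:
    # Several arrays in one .npz archive, retrieved by keyword name
    npz_path = os.path.join(tmp, "bundle.npz")
    np.savez(npz_path, features=arr, labels=labels)
    with np.load(npz_path) as bundle:
        restored = bundle["features"].copy()
        names = sorted(bundle.files)

    # Plain-text CSV, readable by spreadsheets and other tools
    csv_path = os.path.join(tmp, "data.csv")
    np.savetxt(csv_path, arr, delimiter=",", fmt="%d")
    from_csv = np.loadtxt(csv_path, delimiter=",")   # loaded as float64 by default

print(names, restored, from_csv)
```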