# Introduction to NumPy

- Definition
- NumPy (Numerical Python) is a Python library used for fast numerical computation.
- It provides an efficient multi-dimensional array object called ndarray and supports operations such as mathematical functions, linear algebra, random numbers, statistics, etc.


### Why NumPy?
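| Python List                                     | NumPy Array                            |
| ----------------------------------------------- | -------------------------------------- |
| Slower for numerical work (interpreted loops)   | Faster (loops run in optimized C code) |
| Can store mixed data types                      | Stores a single data type              |
| No vectorization (needs explicit loops)         | Supports vectorized operations         |
| More memory per element                         | Less memory per element                |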
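
The memory and vectorization rows of the table can be checked with a small experiment. The sketch below is only illustrative: the list size is an arbitrary choice, and the list measurement relies on `sys.getsizeof`, which is specific to CPython and gives an approximate figure.

```python
import sys
import numpy as np

n = 1_000  # arbitrary size chosen for illustration
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects, so count both.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
array_bytes = np_arr.nbytes  # raw data buffer only: n * 8 bytes

print("list bytes (approx.):", list_bytes)
print("array bytes:", array_bytes)

# Vectorized arithmetic replaces an explicit Python loop.
doubled_list = [x * 2 for x in py_list]
doubled_arr = np_arr * 2
print(doubled_list == doubled_arr.tolist())
```

Exact byte counts vary between Python versions, but the array buffer should come out several times smaller than the list.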


In [3]:
# How to install NumPy.
!pip install numpy

Defaulting to user installation because normal site-packages is not writeable


In [2]:
import numpy as np

print("NumPy Version:- ", np.__version__)

NumPy Version:-  1.26.4


In [5]:
print("Numpy Available :- ", np)

Numpy Available :-  <module 'numpy' from 'C:\\Users\\sagar\\AppData\\Roaming\\Python\\Python312\\site-packages\\numpy\\__init__.py'>


## NumPy Array Fundamentals
- An ndarray is a multidimensional, homogeneous (same datatype) array in NumPy.
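
Because an ndarray holds a single datatype, NumPy converts mixed inputs to one common type. A minimal sketch of that behavior (the variable names are illustrative):

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])   # one float forces every element to float

print(ints.dtype.kind)   # 'i' (integer; the exact width depends on the platform)
print(mixed.dtype)       # float64
print(mixed)
```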

### Creating NumPy Arrays
- Different ways to create arrays

### 1. Using np.array()

In [3]:
arr = np.array([10, 20, 30])
print(arr)

[10 20 30]


### 2. Multi-dimensional array

In [4]:
arr2 = np.array([[1, 2, 3],
                 [4, 5, 6]])
print(arr2)


[[1 2 3]
 [4 5 6]]


### 3. Zeros array

In [5]:
np.zeros((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

### 4. Ones array

In [6]:
np.ones((3,3))


array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

### 5. Full array with constant

In [7]:
np.full((2,2),7)

array([[7, 7],
       [7, 7]])

In [8]:
np.full((3,3),5)

array([[5, 5, 5],
       [5, 5, 5],
       [5, 5, 5]])

### 6. Identity matrix

In [9]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [54]:
np.eye(4,4,k=-1, dtype=int) # Shift Diagonal Downwards (k = -1)

array([[0, 0, 0, 0],
       [1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0]])

In [55]:
np.eye(4,4,k=1, dtype=int) # Shift Diagonal Upwards (k = 1)

array([[0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]])

In [50]:
np.identity(3, dtype= int)

array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])

### 7. Random Array

In [13]:
np.random.rand(3,3)

array([[0.85529146, 0.99450626, 0.66430944],
       [0.35925627, 0.5165937 , 0.19729736],
       [0.95199354, 0.94102246, 0.5789598 ]])

In [18]:
np.random.randint(1,10,(3,7))

array([[9, 3, 6, 7, 9, 7, 8],
       [5, 6, 3, 8, 6, 9, 5],
       [2, 6, 7, 6, 8, 1, 5]])
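
The random arrays above change on every run. When repeatable numbers are needed, one option is NumPy's `Generator` interface with a fixed seed. This is a sketch of an alternative to the `np.random.rand` / `np.random.randint` calls used above, not a replacement the notebook itself uses; the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)       # seed value is an arbitrary choice
floats = rng.random((3, 3))                # uniform floats in [0, 1)
ints = rng.integers(1, 10, size=(3, 7))    # integers from 1 to 9 (10 is excluded)

# Re-creating the generator with the same seed reproduces the same numbers.
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(floats, rng_again.random((3, 3))))   # True
```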

### 8. Arange / Linspace / Logspace

In [20]:
np.arange(0, 10, 2)   

array([0, 2, 4, 6, 8])

In [21]:
np.linspace(1, 5, 5)


array([1., 2., 3., 4., 5.])

In [22]:
np.logspace(1, 3, 4)

array([  10.        ,   46.41588834,  215.443469  , 1000.        ])
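
The three functions differ in how they treat the end value and the spacing. The short sketch below compares them; the specific ranges are chosen only for illustration.

```python
import numpy as np

# arange excludes the stop value; linspace includes it by default.
print(np.arange(0, 10, 2))      # [0 2 4 6 8]
print(np.linspace(0, 10, 6))    # 6 evenly spaced points, 0 and 10 both included

# logspace(a, b, n) is 10 raised to linspace(a, b, n).
exponents = np.linspace(1, 3, 4)
same = np.allclose(np.logspace(1, 3, 4), 10.0 ** exponents)
print(same)                     # True
```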

## Array Attributes

- Definition
- Array attributes are properties (accessed without parentheses) that describe an array's shape, element type, and memory usage.

| Attribute      | Meaning                                      |
| -------------- | -------------------------------------------- |
| `arr.shape`    | Size along each dimension (a tuple)          |
| `arr.ndim`     | Number of dimensions                         |
| `arr.size`     | Total number of elements                     |
| `arr.dtype`    | Datatype of the elements                     |
| `arr.itemsize` | Size (bytes) per element                     |
| `arr.nbytes`   | Total bytes used by the data (size × itemsize) |


In [25]:
arr = np.array([[1,2,3],[4,5,6]])

print("Dimensions of array :- ",arr.shape)    
print("Number of dimensions:- ",arr.ndim)    
print("Total elements:- ", arr.size)    
print("Data Types :- ",arr.dtype)    
print("Size per element (bytes):- ", arr.itemsize)
print("Total Memory :- ",arr.nbytes)   
print("Transpose Matrix :- ",arr.T)

Dimensions of array :-  (2, 3)
Number of dimensions:-  2
Total elements:-  6
Data Types :-  int32
Size per element (bytes):-  4
Total Memory :-  24
Transpose Matrix :-  [[1 4]
 [2 5]
 [3 6]]
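
The `int32` and 4-byte values in the output above come from the machine the notebook ran on; the default integer type varies by platform and NumPy version. A small sketch, assuming you want results that do not depend on the platform, is to pass `dtype` explicitly:

```python
import numpy as np

# Fix the dtype explicitly so the result does not depend on the platform.
arr32 = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
arr64 = arr32.astype(np.int64)

for a in (arr32, arr64):
    # nbytes is always size * itemsize
    print(a.dtype, a.itemsize, a.nbytes, a.size * a.itemsize == a.nbytes)
```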


### Indexing & Slicing

- Definition
1. Indexing → Select single element
2. Slicing → Select range of elements

In [27]:
# 1D Indexing
arr = np.array([10,20,30,40])
print(arr[0])   
print(arr[-1]) 

10
40


In [28]:
#1D Slicing
arr[1:3]

array([20, 30])

In [29]:
arr[:3]

array([10, 20, 30])

In [30]:
arr[2:]

array([30, 40])

In [31]:
arr[::2]

array([10, 30])

In [32]:
# 2D Indexing

arr = np.array([[10,20,30],
                [40,50,60]])

print(arr[0,1]) 
print(arr[1,2])

20
60


In [35]:
#2D Slicing
print(arr[:, 1]  )
print(arr[0, :]     )
print(arr[0:2, 1:3] )


[20 50]
[10 20 30]
[[20 30]
 [50 60]]
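
One detail worth knowing about the slices above: a basic slice is a view on the original data, so writing through it changes the original array. The sketch below illustrates this and shows `.copy()` as one way to get independent data.

```python
import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

view = arr[:, 1]          # basic slice: shares memory with arr
view[0] = 99
print(arr[0, 1])          # 99, the original changed

safe = arr[:, 1].copy()   # independent copy
safe[1] = -1
print(arr[1, 1])          # 50, the original is untouched
```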


### Integer Indexing / Fancy Indexing
- Integer indexing (fancy indexing) means selecting elements of a NumPy array using lists or arrays of integer positions instead of single integers or slices. The result is a copy, not a view.

In [88]:
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

indexes = [0, 2, 4]   # we want elements at positions 0, 2, 4

print(arr[indexes])


[10 30 50]


In [89]:
arr = np.array([[10, 20, 30],
                [40, 50, 60],
                [70, 80, 90]])

print(arr[[0, 2]])   # select row 0 and row 2

[[10 20 30]
 [70 80 90]]


In [90]:
print(arr[:, [0, 2]])  # all rows , select column 0 and 2

[[10 30]
 [40 60]
 [70 90]]


In [91]:
arr = np.array([[5, 6, 7],
                [8, 9, 10],
                [11, 12, 13]])

rows = [0, 1, 2]
cols = [2, 1, 0]

print(arr[rows, cols])


[ 7  9 11]
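
In the last example the two index lists are paired element by element: `(0, 2)`, `(1, 1)`, `(2, 0)`. The sketch below makes that pairing explicit with a loop and, for contrast, uses `np.ix_` to select a full block of rows and columns instead.

```python
import numpy as np

arr = np.array([[5, 6, 7],
                [8, 9, 10],
                [11, 12, 13]])
rows = [0, 1, 2]
cols = [2, 1, 0]

paired = arr[rows, cols]   # elements (0,2), (1,1), (2,0)
by_loop = np.array([arr[r, c] for r, c in zip(rows, cols)])
print(np.array_equal(paired, by_loop))   # True

# np.ix_ selects the full rows x cols block instead of pairing the indices.
block = arr[np.ix_([0, 2], [0, 2])]
print(block)
```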


### Vectorized Operations

- Definition
- Vectorization means performing operations on entire arrays without explicit Python loops; the looping happens inside NumPy's compiled code.

In [36]:
#Arithmetic
arr = np.array([1,2,3])

print(arr + 10)
print(arr * 2)
print(arr ** 2)


[11 12 13]
[2 4 6]
[1 4 9]


In [37]:
#Array vs Array

a = np.array([1,2,3])
b = np.array([4,5,6])

print(a + b)
print(a * b)


[5 7 9]
[ 4 10 18]
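
To see what vectorization replaces, the sketch below computes the same element-wise product twice: once with an explicit loop and once with a single array expression.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Explicit loop: one Python-level operation per element.
looped = np.empty_like(a)
for i in range(len(a)):
    looped[i] = a[i] * b[i]

# Vectorized: the same element-wise product in a single expression.
vectorized = a * b

print(looped, vectorized)
```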


### Universal Functions (ufuncs) & Aggregation functions

- Definition
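- Ufuncs are fast element-wise mathematical functions (for example `np.sqrt`, `np.exp`, `np.sin`).
- Aggregation functions (for example `np.sum`, `np.mean`, `np.std`) reduce an array to a single value, or to one value per row or column.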

In [58]:
arr = np.array([1,2,3,4,5])

In [59]:
print(np.sqrt(arr))

[1.         1.41421356 1.73205081 2.         2.23606798]


In [60]:
np.exp(arr)

array([  2.71828183,   7.3890561 ,  20.08553692,  54.59815003,
       148.4131591 ])

In [61]:
np.log(arr)

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791])

In [62]:
np.sin(arr)

array([ 0.84147098,  0.90929743,  0.14112001, -0.7568025 , -0.95892427])

In [63]:
np.cos(arr)

array([ 0.54030231, -0.41614684, -0.9899925 , -0.65364362,  0.28366219])

In [64]:
np.tan(arr)

array([ 1.55740772, -2.18503986, -0.14254654,  1.15782128, -3.38051501])

In [65]:
np.sum(arr)

15

In [66]:
np.mean(arr)

3.0

In [67]:
np.std(arr)

1.4142135623730951

In [68]:
np.median(arr)

3.0

In [69]:
np.min(arr)

1

In [70]:
np.max(arr)

5
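
The aggregations above run over the whole array. On a 2D array the `axis` argument selects the direction of the reduction, as the sketch below shows. It also shows `ddof=1`, which switches `np.std` from the population formula (the default, used above) to the sample formula.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.sum(m))            # 21, all elements
print(np.sum(m, axis=0))    # [5 7 9], one sum per column
print(np.sum(m, axis=1))    # [ 6 15], one sum per row
print(np.mean(m, axis=1))   # [2. 5.], mean of each row

arr = np.array([1, 2, 3, 4, 5])
print(np.std(arr))          # population standard deviation (ddof=0)
print(np.std(arr, ddof=1))  # sample standard deviation
```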

### Boolean Indexing
- Definition
- Boolean indexing selects the elements for which a condition is True.

In [46]:
arr = np.array([10, 20, 30, 40])

print(arr[arr > 20])
print(arr[arr % 2 == 0])


[30 40]
[10 20 30 40]
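
Conditions can be combined with `&` (and) and `|` (or). The sketch below shows this, along with `np.where` for choosing values by condition; the thresholds are arbitrary.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

mask = (arr > 10) & (arr < 40)        # parentheses are required around each comparison
print(mask)                           # [False  True  True False]
print(arr[mask])                      # [20 30]
print(arr[(arr < 20) | (arr > 30)])   # [10 40]

# np.where picks values element-wise based on a condition.
replaced = np.where(arr > 20, arr, 0)
print(replaced)                       # [ 0  0 30 40]
```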


### NumPy Broadcasting

- Definition
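- Broadcasting allows operations between arrays of different shapes by virtually stretching dimensions of size 1 (or missing dimensions) to match, without copying data.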

In [71]:
import numpy as np

a = np.array([1, 2, 3])
b = 5
print("Array a:\n", a)
print("\nResult (a + b):\n", a + b)


Array a:
 [1 2 3]

Result (a + b):
 [6 7 8]


In [73]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])

b = np.array([10, 20, 30])

print("Array a:\n", a)
print("\nArray b:\n", b)
print("\nResult (a + b):\n", a + b)


Array a:
 [[1 2 3]
 [4 5 6]]

Array b:
 [10 20 30]

Result (a + b):
 [[11 22 33]
 [14 25 36]]


In [74]:
a = np.array([[1],
              [2],
              [3]])

b = np.array([10, 20, 30])

print("Array a:\n", a)
print("\nArray b:\n", b)
print("\nResult (a + b):\n", a + b)


Array a:
 [[1]
 [2]
 [3]]

Array b:
 [10 20 30]

Result (a + b):
 [[11 21 31]
 [12 22 32]
 [13 23 33]]


In [75]:
# 2D matrix * scalar
a = np.array([[10, 20],
              [30, 40]])

print("Array a:\n", a)
print("\nResult (a * 2):\n", a * 2)


Array a:
 [[10 20]
 [30 40]]

Result (a * 2):
 [[20 40]
 [60 80]]


In [76]:
# 2D array + column vector (shapes (3, 3) and (3, 1))
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

b = np.array([1, 2, 3])  # 1-D array with shape (3,), not yet a column

b = b.reshape(3, 1)  # reshape into a column vector with shape (3, 1)

print("Array a:\n", a)
print("\nColumn vector b:\n", b)

print("\nResult (a + b):\n", a + b)


Array a:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

Column vector b:
 [[1]
 [2]
 [3]]

Result (a + b):
 [[ 2  3  4]
 [ 6  7  8]
 [10 11 12]]


| Practical | Operation      | Works? | Reason                |
| --------- | -------------- | ------ | --------------------- |
| 1         | Array + Scalar | ✔      | scalar expands        |
| 2         | 2D + 1D row    | ✔      | row repeats           |
| 3         | Column + Row   | ✔      | both expand           |
| 4         | 2D * scalar    | ✔      | scalar expands        |
| 5         | 2D + column    | ✔      | column repeats        |
| 6         | (3,) + (2,)    | ❌      | shapes not compatible |
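
Row 6 of the table has no example above. The sketch below shows the failure, and then uses `np.broadcast_shapes` (available in recent NumPy releases) to check which result shape two inputs would produce.

```python
import numpy as np

a = np.array([1, 2, 3])   # shape (3,)
b = np.array([1, 2])      # shape (2,)

error_message = None
try:
    a + b
except ValueError as err:
    error_message = str(err)
print("Broadcast failed:", error_message)

# Shapes are compared from the last axis backwards; each pair of sizes
# must be equal, or one of them must be 1.
print(np.broadcast_shapes((2, 3), (3,)))   # (2, 3)
print(np.broadcast_shapes((3, 1), (3,)))   # (3, 3)
```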


### Shape Manipulation 
- Shape manipulation means changing the structure of an array without changing the data.


### 1. reshape()
- reshape() changes the shape (rows × columns) of an array without changing its data. The total number of elements must stay the same.

In [77]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

reshaped = arr.reshape(2, 3)
print(reshaped)


[[1 2 3]
 [4 5 6]]
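
Two further points about `reshape()` are sketched below: passing `-1` lets NumPy work out one dimension, and a shape whose element count does not match raises an error.

```python
import numpy as np

arr = np.arange(1, 7)            # [1 2 3 4 5 6]

print(arr.reshape(3, -1).shape)  # (3, 2): -1 lets NumPy infer the missing size
print(arr.reshape(-1, 1).shape)  # (6, 1): a single column

reshape_error = None
try:
    arr.reshape(4, 2)            # 4 * 2 != 6
except ValueError as err:
    reshape_error = str(err)
print("reshape failed:", reshape_error)
```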


### 2. ravel() & flatten()
- Both convert an array into a 1D array.

| Function      | Returns                              | Meaning                                   |
| ------------- | ------------------------------------ | ----------------------------------------- |
| **ravel()**   | View when possible, otherwise a copy | Changes may reflect in the original array |
| **flatten()** | Always a copy                        | Changes DO NOT reflect                    |


In [78]:
arr = np.array([[1, 2], [3, 4]])

print("ravel:", arr.ravel())
print("flatten:", arr.flatten())


ravel: [1 2 3 4]
flatten: [1 2 3 4]


In [79]:
r = arr.ravel()
r[0] = 100
print(arr)   # original changed


[[100   2]
 [  3   4]]
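
The other half of the comparison can be checked the same way. The sketch below shows that `flatten()` leaves the original untouched, and that `ravel()` falls back to a copy when the data is not contiguous (here, a transposed array).

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

f = arr.flatten()
f[0] = 100
print(arr[0, 0])                   # 1, flatten() returned a copy

r = arr.ravel()
print(np.shares_memory(arr, r))    # True for this contiguous array

# ravel() on a non-contiguous array (here a transpose) has to copy.
rt = arr.T.ravel()
print(np.shares_memory(arr, rt))   # False
```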


### 3. transpose() / arr.T
- Transpose swaps rows and columns.

In [80]:
arr.T
# or
np.transpose(arr)


array([[100,   3],
       [  2,   4]])

In [81]:
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(arr.T)


[[1 4]
 [2 5]
 [3 6]]


### 4. Stacking (Combining arrays)
- Used to join multiple arrays. `np.vstack` stacks arrays top to bottom (rows increase).

In [82]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.vstack((a, b)))


[[1 2 3]
 [4 5 6]]


### 5. Horizontal Stacking (hstack)
Joins arrays left to right. For 2D arrays the number of columns increases; 1D arrays are simply concatenated into a longer 1D array.

In [83]:
print(np.hstack((a, b)))


[1 2 3 4 5 6]


In [84]:
print(np.stack((a, b), axis=0))  # vertical
print(np.stack((a, b), axis=1))  # column-wise pairing


[[1 2 3]
 [4 5 6]]
[[1 4]
 [2 5]
 [3 6]]
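
The examples above stack 1D arrays. With 2D inputs the "rows increase" and "columns increase" behavior is easier to see. The sketch below uses two small illustrative matrices and compares the results with `np.concatenate`.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])
y = np.array([[5, 6],
              [7, 8]])

print(np.vstack((x, y)).shape)   # (4, 2): rows increase
print(np.hstack((x, y)).shape)   # (2, 4): columns increase

# For 2D inputs these match np.concatenate along axis 0 and axis 1.
print(np.array_equal(np.vstack((x, y)), np.concatenate((x, y), axis=0)))
print(np.array_equal(np.hstack((x, y)), np.concatenate((x, y), axis=1)))

# np.stack always adds a new axis.
print(np.stack((x, y)).shape)    # (2, 2, 2)
```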


### 6. Splitting
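Splits an array into equal partitions. With an integer argument, the array length must be evenly divisible by the number of parts; otherwise `np.split` raises an error.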

In [85]:
#np.split(array, parts)

arr = np.array([10, 20, 30, 40, 50, 60])

print(np.split(arr, 3))


[array([10, 20]), array([30, 40]), array([50, 60])]


In [86]:
#Horizontal Split (hsplit)

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8]])

print(np.hsplit(arr, 2))


[array([[1, 2],
       [5, 6]]), array([[3, 4],
       [7, 8]])]


In [87]:
#Vertical Split (vsplit)
print(np.vsplit(arr, 2))


[array([[1, 2, 3, 4]]), array([[5, 6, 7, 8]])]
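
When the length is not evenly divisible, `np.split` fails. The sketch below shows the error and two ways around it: `np.array_split`, which allows unequal parts, and passing explicit cut indices to `np.split`.

```python
import numpy as np

arr = np.arange(7)               # 7 elements cannot be cut into 3 equal parts

split_error = None
try:
    np.split(arr, 3)
except ValueError as err:
    split_error = str(err)
print("split failed:", split_error)

parts = np.array_split(arr, 3)   # allows unequal parts: sizes 3, 2, 2
print([p.tolist() for p in parts])

# A list of indices gives explicit cut points instead of a part count.
by_index = np.split(arr, [2, 5])
print([p.tolist() for p in by_index])
```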


# Assignments

1. Hospital Data Analysis

You are given patient temperature readings (°F) for one week as a NumPy array: [98.4, 99.1, 101.2, 100.5, 98.9, 102.1, 99.8]
Tasks:

- Find average temperature
- Find max & min temperature
- Convert Fahrenheit → Celsius using vectorized operations
- Identify days where temperature > 100°F
  
2. Student Marks Dataset

Marks of 50 students in 3 subjects stored in a NumPy 2D array.
Tasks:

- Find average marks per subject
- Find top 5 scoring students
- Identify students scoring < 40 in any subject
- Add 5 grace marks to all students using broadcasting

3. Logistics: Delivery Time Analysis

Array contains delivery times (in hours) of 500 parcels.
Tasks:

- Find mean, median, std deviation
- Identify fast deliveries (< 12 hrs)
- Identify extremely slow deliveries (> 48 hrs)
- Convert hours → days

4. Manufacturing: Machine Sensor Readings

Machine collects vibration values every second → 10,000 readings.
Tasks:

- Smooth the data using moving average (window=5)
- Detect abnormal values (> threshold)
- Reshape into batches of 100 samples
- Find max vibration per batch

5. Ecommerce: Product Rating Processing

Ratings stored for 1,000 users (1–5).
Tasks:

- Count how many users gave rating 5
- Replace all rating 1 → 2
- Standardize ratings (z-score normalization)
- Calculate frequency of each rating

6. Cricket Analytics: Player Performance Matrix

Runs scored in 10 matches by 11 players (11×10 matrix).
Tasks:

- Find average runs per player
- Find match with maximum total runs
- Extract players scoring > 50 in any match (boolean indexing)
- Add 10 bonus runs to match 5 using broadcasting