1. Python List vs NumPy Array (Memory & Performance)

| Feature      | Python List                                  | NumPy Array                                 |
| ------------ | -------------------------------------------- | ------------------------------------------- |
| Data Type    | Can store mixed types                        | Stores a single data type (dtype)           |
| Memory Usage | More (stores references to separate objects) | Less (values packed in contiguous memory)   |
| Speed        | Slower for math operations                   | Much faster for math (loops run in C)       |
| Operations   | Element-wise math needs explicit loops       | Supports vectorized operations              |


In [1]:
import numpy as np

# Python Lists
list1 = [1, 2, 3, 4]
list2 = [5, 6, 7, 8]

# Element-wise multiplication using loop
list_result = [a * b for a, b in zip(list1, list2)]
print("List Multiplication:", list_result)

# NumPy Arrays
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

# Element-wise multiplication
array_result = arr1 * arr2
print("NumPy Multiplication:", array_result)


List Multiplication: [5, 12, 21, 32]
NumPy Multiplication: [ 5 12 21 32]
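
The cell above shows the difference in syntax, but not the memory and speed claims from the table. A rough way to check them is sketched below. The size `n`, the repeat count, and the variable names are illustrative assumptions, and the timings will vary from machine to machine.

```python
import sys
import timeit

import numpy as np

n = 100_000  # illustrative size, not taken from the notebook

py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list holds pointers to separate int objects, so count both the
# pointer table and the objects it refers to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# nbytes is the raw data buffer only: n items * 8 bytes for int64.
array_bytes = np_arr.nbytes

print("List bytes: ", list_bytes)
print("Array bytes:", array_bytes)

# Time element-wise squaring with a comprehension vs a vectorized op.
list_time = timeit.timeit(lambda: [x * x for x in py_list], number=20)
array_time = timeit.timeit(lambda: np_arr * np_arr, number=20)

print(f"List time:  {list_time:.4f} s")
print(f"Array time: {array_time:.4f} s")
```

On a typical machine the array should use several times less memory and finish the arithmetic noticeably faster, but treat the exact ratio as something to measure rather than assume.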


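2. What is Broadcasting?

Broadcasting allows NumPy to perform element-wise operations on arrays of different but compatible shapes.

Shapes are compared from the last dimension backwards; two dimensions are compatible when they are equal or when one of them is 1. Instead of copying data, NumPy treats the smaller array as if it were repeated along the missing or size-1 dimensions.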

In [2]:
import numpy as np

A = np.array([[1,2,3],
              [4,5,6],
              [7,8,9]])

B = np.array([10,20,30])

result = A + B
print(result)


[[11 22 33]
 [14 25 36]
 [17 28 39]]
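
To see the rule at work, the sketch below reuses `A` and `B` from the cell above. The `col` array and the deliberately mismatched two-element array are added here purely for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
B = np.array([10, 20, 30])

# Shapes are compared from the right: (3, 3) vs (3,) gives (3, 3).
print(np.broadcast_shapes(A.shape, B.shape))

# broadcast_to returns a read-only view. The row axis has stride 0,
# so the same three values are reused for every row without a copy.
B_view = np.broadcast_to(B, A.shape)
print(B_view.strides)

# A (3, 1) column is stretched across the columns instead of the rows.
col = np.array([[100], [200], [300]])
col_result = A + col
print(col_result)

# Mismatched trailing dimensions (3 vs 2) cannot be broadcast.
try:
    A + np.array([1, 2])
    failed = False
except ValueError as exc:
    failed = True
    print("Broadcast error:", exc)
```

The zero stride on the first axis is what "no copying" means in practice: NumPy reads the same memory for each row.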


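3. What Are Missing Values?

Missing values represent empty or unknown data.

In pandas they are represented as:

NaN (Not a Number), the floating-point marker used in numeric columns

None, the Python null object, which pandas converts to NaN in numeric columns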
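
A small sketch of how the two markers behave, using a throwaway Series `s` that is not part of the original notebook:

```python
import numpy as np
import pandas as pd

# None in a numeric Series is stored as NaN, which forces a float dtype.
s = pd.Series([1, None, 3])
print(s.dtype)            # float64
print(s.isna().tolist())  # [False, True, False]

# NaN never compares equal to itself, so use isna() rather than ==.
print(np.nan == np.nan)   # False
print(pd.isna(None), pd.isna(np.nan))  # True True
```

This is why the integer scores in the next cell are printed as floats such as `90.0`.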

In [3]:
import pandas as pd
import numpy as np

data = {
    "Math": [90, 85, np.nan, 70],
    "Science": [80, np.nan, 75, 60],
    "English": [np.nan, 88, 92, 85]
}

df = pd.DataFrame(data)
print(df)


   Math  Science  English
0  90.0     80.0      NaN
1  85.0      NaN     88.0
2   NaN     75.0     92.0
3  70.0     60.0     85.0


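df.isnull() → returns True/False for every cell (True where the value is missing)

df.isnull().sum() → counts the missing values in each column

df.mean() → calculates the column-wise average, skipping NaN by default

df.fillna(df.mean()) → replaces each NaN with the mean of its column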
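
The four calls listed above can be tried on the same marks table. This is a minimal sketch; the variable names `missing_counts`, `col_means`, and `filled` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Math": [90, 85, np.nan, 70],
    "Science": [80, np.nan, 75, 60],
    "English": [np.nan, 88, 92, 85],
})

# Boolean table: True wherever a value is missing.
print(df.isnull())

# Summing the booleans counts the missing values per column.
missing_counts = df.isnull().sum()
print(missing_counts)

# Column means; NaN entries are skipped by default.
col_means = df.mean()
print(col_means)

# fillna returns a new DataFrame and leaves df unchanged.
filled = df.fillna(col_means)
print(filled)
```

Filling with the mean is only one option. Whether it is appropriate depends on the data, and it is used here because it is the approach the notes mention.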

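4. What is Boolean Indexing?

It filters data using True/False conditions: a comparison on a column produces a Boolean Series, and indexing the DataFrame with that Series keeps only the rows where it is True.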

In [4]:
data = {
    "Name": ["A", "B", "C", "D", "E"],
    "Age": [25, 30, 22, 35, 28],
    "Salary": [50000, 60000, 45000, 70000, 52000],
    "Department": ["IT", "HR", "IT", "Finance", "HR"],
    "Experience": [2, 5, 1, 8, 3]
}

df = pd.DataFrame(data)

# Filter: Age > 25 AND Salary > 50000
filtered = df[(df["Age"] > 25) & (df["Salary"] > 50000)]
print(filtered)


  Name  Age  Salary Department  Experience
1    B   30   60000         HR           5
3    D   35   70000    Finance           8
4    E   28   52000         HR           3
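
Besides `&` (AND), conditions can be combined with `|` (OR) and negated with `~` (NOT). The sketch below uses the same employee table; the specific filters are illustrative and not from the original notebook.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Age": [25, 30, 22, 35, 28],
    "Salary": [50000, 60000, 45000, 70000, 52000],
    "Department": ["IT", "HR", "IT", "Finance", "HR"],
    "Experience": [2, 5, 1, 8, 3],
})

# A condition on a column is itself a Boolean Series (the "mask").
mask = df["Age"] > 25
print(mask.tolist())  # [False, True, False, True, True]

# OR: each condition needs its own parentheses, because | and &
# bind more tightly than the comparison operators.
young_or_top = df[(df["Age"] < 25) | (df["Salary"] >= 70000)]
print(young_or_top)

# NOT: everyone outside HR.
not_hr = df[~(df["Department"] == "HR")]
print(not_hr)

# .loc filters rows and selects a column in one step.
it_names = df.loc[df["Department"] == "IT", "Name"]
print(it_names.tolist())  # ['A', 'C']
```

Python's `and`, `or`, and `not` keywords do not work element-wise on a Series, which is why the symbol operators are used.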


5. Purpose of groupby() in Pandas

What is groupby?

It splits the rows into groups based on the values in one or more columns, applies an operation to each group, and combines the results. Common operations include:

sum

mean

count

max/min

In [None]:
import pandas as pd

data = {
    "Department": ["IT", "HR", "IT", "Finance", "HR"],
    "Salary": [50000, 60000, 55000, 70000, 65000]
}

df = pd.DataFrame(data)

result = df.groupby("Department")["Salary"].mean()
print(result)
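
Several of the operations listed earlier can be computed in one pass with `agg`. This is a sketch on the same data; the names `means` and `summary` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Department": ["IT", "HR", "IT", "Finance", "HR"],
    "Salary": [50000, 60000, 55000, 70000, 65000],
})

# One aggregation: mean salary per department.
means = df.groupby("Department")["Salary"].mean()
print(means)

# Several aggregations at once; the result is a DataFrame with one
# row per department and one column per aggregation.
summary = df.groupby("Department")["Salary"].agg(
    ["sum", "mean", "count", "min", "max"]
)
print(summary)
```

By default the group keys are sorted, so the rows appear in the order Finance, HR, IT rather than in order of first appearance.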
