# 🎯 NumPy Interview & Exam Preparation

## Complete Question Bank with Solutions

This notebook contains **75 carefully crafted questions** covering all NumPy topics from basic to advanced level.

### Question Categories:
- **Section 1**: Fundamentals (Q1-15)
- **Section 2**: Array Operations (Q16-30)
- **Section 3**: Advanced Indexing & Manipulation (Q31-45)
- **Section 4**: Linear Algebra & Statistics (Q46-60)
- **Section 5**: Performance & Real-World Problems (Q61-75)

### How to Use:
1. Try to solve each question on your own first
2. Run the solution cells to check your answer
3. Study the explanations provided
4. Mark difficult questions for review

**Good luck! 🚀**

In [2]:
import numpy as np
print(f"NumPy version: {np.__version__}")
print("✅ Ready for interview preparation!")

NumPy version: 1.25.0
✅ Ready for interview preparation!


---
# Section 1: Fundamentals (Q1-15)

Testing basic NumPy concepts, array creation, and properties.

### Q1: What will be the output of the following code?

```python
arr = np.array([1, 2, 3, 4, 5])
result = arr[1:4]
result[0] = 999
print(arr)
```

**Options:**
- A) `[1, 2, 3, 4, 5]`
- B) `[1, 999, 3, 4, 5]`
- C) `[999, 2, 3, 4, 5]`
- D) Error

**Difficulty:** Medium  
**Topic:** Views vs Copies

In [10]:
# Solution Q1
arr = np.array([1, 2, 3, 4, 5])
result = arr[1:4]
result[0] = 999
print(arr)

print("\n✅ Answer: B) [1, 999, 3, 4, 5]")
print("\nExplanation:")
print("Slicing creates a VIEW, not a copy. Changes to 'result' affect the original array.")
print("result[0] modifies arr[1] because they share memory.")

[  1 999   3   4   5]

✅ Answer: B) [1, 999, 3, 4, 5]

Explanation:
Slicing creates a VIEW, not a copy. Changes to 'result' affect the original array.
result[0] modifies arr[1] because they share memory.
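
To check whether two arrays really share memory, rather than inferring it from a side effect, `np.shares_memory` can help. The sketch below reuses the array from Q1; the names `view` and `independent` are illustrative only.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

view = arr[1:4]                 # basic slice: shares memory with arr
independent = arr[1:4].copy()   # explicit copy: owns its own buffer

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False

independent[0] = 999            # does not touch arr
print(arr)                      # [1 2 3 4 5]
```

Calling `.copy()` on the slice is the usual way to avoid the surprise shown in Q1.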


### Q2: Create a 5x5 matrix with values ranging from 0 to 24 and extract the border elements.

**Expected Output:**
```
Border elements: [0, 1, 2, 3, 4, 9, 14, 19, 24, 23, 22, 21, 20, 15, 10, 5]
```

**Difficulty:** Medium  
**Topic:** Array Creation & Indexing

### Q2b: Given the following 2D NumPy array, how would you retrieve the subarray `[[60, 70], [100, 110]]`?

```python
arr = np.array([
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160]
])
```

**Note:** This question may have more than one correct answer.

**Difficulty:** Medium  
**Topic:** 2D Array Indexing and Slicing

In [None]:
# Solution Q2b
arr = np.array([
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160]
])

print("Original array:")
print(arr)
print(f"Shape: {arr.shape}\n")

print("Target subarray: [[60, 70], [100, 110]]")
print("This is rows 1-2, columns 1-2\n")

# Method 1: Using positive indices
result1 = arr[1:3, 1:3]
print("Method 1 - arr[1:3, 1:3]:")
print(result1)

# Method 2: Using negative indices
result2 = arr[1:-1, 1:-1]
print("\nMethod 2 - arr[1:-1, 1:-1]:")
print(result2)

# Method 3: Mixed positive and negative indices
result3 = arr[1:3, -3:-1]
print("\nMethod 3 - arr[1:3, -3:-1]:")
print(result3)

# Verify all methods give same result
print("\n✅ All three methods are correct!")
print(f"Method 1 == Method 2: {np.array_equal(result1, result2)}")
print(f"Method 1 == Method 3: {np.array_equal(result1, result3)}")

print("\n📝 Explanation:")
print("Row slicing [1:3] means:")
print("  - Start at row 1 (second row: [50, 60, 70, 80])")
print("  - End before row 3 (stop at row 2: [90, 100, 110, 120])")
print("\nColumn slicing [1:3] means:")
print("  - Start at column 1 (second column)")
print("  - End before column 3 (stop at column 2)")
print("\nNegative indices:")
print("  - arr[1:-1, 1:-1] excludes first and last row/column")
print("  - [-3:-1] runs from the 3rd-from-last index up to, but not including, the last")

print("\n💡 Remember:")
print("- Slicing is [start:stop] where stop is EXCLUSIVE")
print("- Negative indices count from the end (-1 is last)")
print("- arr[row_slice, col_slice] for 2D arrays")

In [None]:
# Solution Q2
matrix = np.arange(25).reshape(5, 5)
print("Matrix:")
print(matrix)

# Extract border elements
border = np.concatenate([
    matrix[0, :],      # Top row
    matrix[1:, -1],    # Right column (excluding top)
    matrix[-1, -2::-1],  # Bottom row (reversed, excluding right corner)
    matrix[-2:0:-1, 0]   # Left column (reversed, excluding corners)
])

print("\nBorder elements:", border)

print("\n✅ Explanation:")
print("We extract: top row → right column → bottom row (reversed) → left column (reversed)")

### Q3: What is the difference between `np.zeros((3, 4))` and `np.empty((3, 4))`?

**Difficulty:** Easy  
**Topic:** Array Creation

In [None]:
# Solution Q3
zeros_array = np.zeros((3, 4))
empty_array = np.empty((3, 4))

print("np.zeros((3, 4)):")
print(zeros_array)
print("\nnp.empty((3, 4)):")
print(empty_array)

print("\n✅ Answer:")
print("- np.zeros() initializes all elements to 0")
print("- np.empty() does NOT initialize - it holds whatever bytes were already in memory")
print("- np.empty() can be slightly faster, but its values are arbitrary")
print("- Prefer np.zeros() unless you will immediately overwrite every element")

### Q4: How do you create an array of 10 evenly spaced values between 0 and 1 (inclusive)?

**Options:**
- A) `np.arange(0, 1, 0.1)`
- B) `np.linspace(0, 1, 10)`
- C) `np.range(0, 1, 10)`
- D) `np.logspace(0, 1, 10)`

**Difficulty:** Easy  
**Topic:** Array Creation

In [None]:
# Solution Q4
print("Testing each option:\n")

# Option A
a = np.arange(0, 1, 0.1)
print(f"A) np.arange(0, 1, 0.1): {a}")
print(f"   Length: {len(a)} - Doesn't include 1.0\n")

# Option B
b = np.linspace(0, 1, 10)
print(f"B) np.linspace(0, 1, 10): {b}")
print(f"   Length: {len(b)} - Includes both 0 and 1 ✓\n")

# Option C - doesn't exist
print("C) np.range() - Does not exist in NumPy\n")

# Option D
d = np.logspace(0, 1, 10)
print(f"D) np.logspace(0, 1, 10): {d}")
print(f"   Creates logarithmically spaced values (10^0 to 10^1)\n")

print("✅ Answer: B) np.linspace(0, 1, 10)")
print("\nKey difference: linspace includes endpoint by default, arange doesn't")
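
One detail that is easy to miss: with 10 points including both endpoints, the spacing is 1/9 rather than 0.1. The sketch below illustrates this, along with the `endpoint=False` variant; the variable names are illustrative.

```python
import numpy as np

pts = np.linspace(0, 1, 10)
step = pts[1] - pts[0]              # 1/9, because 10 points span 9 gaps

# Excluding the endpoint gives a step of exactly 1/10: 0.0, 0.1, ..., 0.9
half_open = np.linspace(0, 1, 10, endpoint=False)

# Including the endpoint with a step of 0.1 needs 11 points
eleven = np.linspace(0, 1, 11)

print(step)
print(half_open)
print(eleven)
```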

### Q5: What will `arr.reshape(3, -1)` do if `arr` has 12 elements?

**Difficulty:** Easy  
**Topic:** Reshaping

In [None]:
# Solution Q5
arr = np.arange(12)
print("Original array:", arr)
print(f"Shape: {arr.shape}\n")

reshaped = arr.reshape(3, -1)
print("After reshape(3, -1):")
print(reshaped)
print(f"Shape: {reshaped.shape}")

print("\n✅ Answer:")
print("Creates a 3x4 array")
print("\nExplanation:")
print("The -1 tells NumPy to automatically calculate that dimension")
print("Since we have 12 elements and want 3 rows: 12 / 3 = 4 columns")
print("This is useful when you know one dimension but not the other")

### Q6: Find all elements in an array that are between 5 and 10 (inclusive).

Given: `arr = np.array([2, 7, 4, 9, 12, 6, 3, 11, 8, 5])`

**Difficulty:** Easy  
**Topic:** Boolean Indexing

In [None]:
# Solution Q6
arr = np.array([2, 7, 4, 9, 12, 6, 3, 11, 8, 5])
print("Original array:", arr)

# Method 1: Boolean indexing
result = arr[(arr >= 5) & (arr <= 10)]
print("\nElements between 5 and 10:", result)

# Method 2: Using np.where
indices = np.where((arr >= 5) & (arr <= 10))
result2 = arr[indices]
print("Using np.where:", result2)

print("\n✅ Key Points:")
print("- Use & for 'and', | for 'or' with boolean conditions")
print("- Must use parentheses around each condition")
print("- Cannot use 'and'/'or' keywords - must use &/| operators")
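
To see why the `and` keyword cannot be used here, the sketch below triggers the error on purpose and then shows `np.logical_and` as an equivalent to `&`. The `raised` flag is only there to make the outcome visible.

```python
import numpy as np

arr = np.array([2, 7, 4, 9, 12, 6, 3, 11, 8, 5])

# 'and' asks Python for a single True/False for a whole array, which is ambiguous
try:
    mask = (arr >= 5) and (arr <= 10)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# Element-wise alternative, equivalent to (arr >= 5) & (arr <= 10)
mask = np.logical_and(arr >= 5, arr <= 10)
print(arr[mask])  # [7 9 6 8 5]
```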

### Q7: What is the output of `np.arange(10).reshape(2, 5).T.shape`?

**Options:**
- A) `(2, 5)`
- B) `(5, 2)`
- C) `(10,)`
- D) Error

**Difficulty:** Medium  
**Topic:** Reshaping & Transpose

In [None]:
# Solution Q7
arr = np.arange(10)
print("Step 1 - arange(10):", arr)
print(f"Shape: {arr.shape}\n")

reshaped = arr.reshape(2, 5)
print("Step 2 - reshape(2, 5):")
print(reshaped)
print(f"Shape: {reshaped.shape}\n")

transposed = reshaped.T
print("Step 3 - Transpose (.T):")
print(transposed)
print(f"Shape: {transposed.shape}")

print("\n✅ Answer: B) (5, 2)")
print("\nExplanation:")
print("1. arange(10) creates [0,1,2,3,4,5,6,7,8,9] with shape (10,)")
print("2. reshape(2,5) creates 2 rows × 5 columns")
print("3. .T transposes it to 5 rows × 2 columns")

### Q8: Create a checkerboard pattern (8x8) with 0s and 1s.

**Expected Output:**
```
[[0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 ...]
```

**Difficulty:** Medium  
**Topic:** Array Creation & Indexing

In [None]:
# Solution Q8

# Method 1: Using slicing
checkerboard1 = np.zeros((8, 8), dtype=int)
checkerboard1[::2, 1::2] = 1  # Even rows, odd columns
checkerboard1[1::2, ::2] = 1  # Odd rows, even columns
print("Method 1 - Using slicing:")
print(checkerboard1)

# Method 2: Using np.indices and modulo
checkerboard2 = np.indices((8, 8)).sum(axis=0) % 2
print("\nMethod 2 - Using indices and modulo:")
print(checkerboard2)

# Method 3: Using tile
pattern = np.array([[0, 1], [1, 0]])
checkerboard3 = np.tile(pattern, (4, 4))
print("\nMethod 3 - Using tile:")
print(checkerboard3)

print("\n✅ All three methods produce the same checkerboard pattern!")
print("Method 1 is most explicit, Method 3 is most concise")

### Q9: What's the difference between `arr.flatten()` and `arr.ravel()`?

**Difficulty:** Medium  
**Topic:** Array Manipulation

In [None]:
# Solution Q9
arr = np.array([[1, 2, 3], [4, 5, 6]])
print("Original array:")
print(arr)

# Test flatten
flat = arr.flatten()
flat[0] = 999
print("\nAfter flatten and modifying flat[0]=999:")
print("Flattened:", flat)
print("Original:", arr)
print("Original unchanged? Yes - flatten() creates a COPY")

# Test ravel
arr = np.array([[1, 2, 3], [4, 5, 6]])
rav = arr.ravel()
rav[0] = 999
print("\nAfter ravel and modifying rav[0]=999:")
print("Raveled:", rav)
print("Original:", arr)
print("Original changed? Yes - ravel() creates a VIEW when possible")

print("\n✅ Key Differences:")
print("flatten() - Always returns a COPY (independent)")
print("ravel()   - Returns a VIEW when possible (shares memory, faster)")
print("\nUse flatten() when you need independence")
print("Use ravel() for better performance")
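
The "when possible" qualifier matters. As a small sketch, raveling a transposed array in the default C order cannot be expressed as a view, so NumPy falls back to a copy:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

contiguous_ravel = arr.ravel()      # C-contiguous input: a view
transposed_ravel = arr.T.ravel()    # transposed input: a copy is required

print(np.shares_memory(arr, contiguous_ravel))  # True
print(np.shares_memory(arr, transposed_ravel))  # False
```

So code that relies on `ravel()` writing through to the original should check `np.shares_memory` first, or modify the original array directly.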

### Q10: Reverse a 1D array using NumPy.

Given: `arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])`

**Difficulty:** Easy  
**Topic:** Indexing

In [None]:
# Solution Q10
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print("Original:", arr)

# Method 1: Slicing with negative step
reversed1 = arr[::-1]
print("\nMethod 1 - arr[::-1]:", reversed1)

# Method 2: Using np.flip
reversed2 = np.flip(arr)
print("Method 2 - np.flip(arr):", reversed2)

# Method 3: Using np.flipud (flip up-down, works for 1D)
reversed3 = np.flipud(arr)
print("Method 3 - np.flipud(arr):", reversed3)

print("\n✅ All methods work!")
print("arr[::-1] is most common and concise")
print("np.flip() is more explicit and works for multi-dimensional arrays")

### Q11: What is the output of `np.array([1, 2, 3]) + np.array([[1], [2], [3]])`?

**Difficulty:** Medium  
**Topic:** Broadcasting

In [None]:
# Solution Q11
arr1 = np.array([1, 2, 3])
arr2 = np.array([[1], [2], [3]])

print("Array 1 (1D):", arr1)
print("Shape:", arr1.shape)
print("\nArray 2 (2D column):")
print(arr2)
print("Shape:", arr2.shape)

result = arr1 + arr2
print("\nResult of addition:")
print(result)
print("Shape:", result.shape)

print("\n✅ Explanation (Broadcasting):")
print("arr1 shape: (3,)   → broadcasts to (3, 3)")
print("arr2 shape: (3, 1) → broadcasts to (3, 3)")
print("\nEach row of arr2 is added to arr1:")
print("[1,2,3] + 1 = [2,3,4]")
print("[1,2,3] + 2 = [3,4,5]")
print("[1,2,3] + 3 = [4,5,6]")
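
The resulting shape can also be checked without performing the addition. The sketch below assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available; `row` and `col` are illustrative names for the two operands in Q11.

```python
import numpy as np

row = np.array([1, 2, 3])        # shape (3,)
col = row[:, np.newaxis]         # shape (3, 1), same values as a column

# Ask NumPy what shape the two operands broadcast to
print(np.broadcast_shapes(row.shape, col.shape))  # (3, 3)

total = row + col
print(total)
```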

### Q12: Find the indices of the maximum value in a 2D array.

Given: `arr = np.array([[3, 7, 2], [9, 1, 5], [4, 8, 6]])`

**Difficulty:** Easy  
**Topic:** Searching

In [None]:
# Solution Q12
arr = np.array([[3, 7, 2], [9, 1, 5], [4, 8, 6]])
print("Array:")
print(arr)

# Method 1: Using argmax with unravel_index
max_index_flat = np.argmax(arr)
max_index_2d = np.unravel_index(max_index_flat, arr.shape)
print(f"\nMethod 1 - Maximum value: {arr.max()}")
print(f"Flat index: {max_index_flat}")
print(f"2D index (row, col): {max_index_2d}")
print(f"Value at {max_index_2d}: {arr[max_index_2d]}")

# Method 2: Using where
max_val = arr.max()
indices = np.where(arr == max_val)
print(f"\nMethod 2 - Using np.where:")
print(f"Row index: {indices[0][0]}, Column index: {indices[1][0]}")

print("\n✅ The maximum value 9 is at position (1, 0)")
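
When the maximum occurs more than once, `np.argmax` reports only the first occurrence in flattened order. The sketch below uses a hypothetical array with a three-way tie (not the array from Q12) and `np.argwhere` to list every position.

```python
import numpy as np

# Hypothetical array in which the maximum (9) appears three times
arr = np.array([[3, 9, 2], [9, 1, 5], [4, 8, 9]])

first = np.unravel_index(np.argmax(arr), arr.shape)   # first occurrence only
all_max = np.argwhere(arr == arr.max())               # one (row, col) pair per occurrence

print(first)
print(all_max)
```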

### Q13: Replace all odd numbers in an array with -1.

Given: `arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])`

**Difficulty:** Easy  
**Topic:** Boolean Indexing

In [None]:
# Solution Q13
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print("Original:", arr)

# Method 1: Boolean indexing (modifies original)
arr_copy1 = arr.copy()
arr_copy1[arr_copy1 % 2 == 1] = -1
print("\nMethod 1 - Boolean indexing:", arr_copy1)

# Method 2: Using np.where (creates new array)
arr_copy2 = np.where(arr % 2 == 1, -1, arr)
print("Method 2 - np.where:", arr_copy2)

print("\n✅ Both methods work!")
print("Method 1 modifies in-place (more memory efficient)")
print("Method 2 creates new array (preserves original)")

### Q14: What is the difference between `np.array([1, 2]) * 2` and `np.array([1, 2]) @ np.array([2, 2])`?

**Difficulty:** Medium  
**Topic:** Array Operations

In [None]:
# Solution Q14
arr = np.array([1, 2])

result1 = arr * 2
print("arr * 2:", result1)
print("Type:", type(result1), "Shape:", result1.shape)

result2 = arr @ np.array([2, 2])
print("\narr @ [2, 2]:", result2)
print("Type:", type(result2))

print("\n✅ Key Differences:")
print("\n1. arr * 2 (Element-wise multiplication):")
print("   - Multiplies each element by 2")
print("   - Result: [1*2, 2*2] = [2, 4]")
print("   - Returns an array")

print("\n2. arr @ [2, 2] (Dot product / Matrix multiplication):")
print("   - Computes: (1*2) + (2*2) = 2 + 4 = 6")
print("   - Returns a scalar")
print("   - @ is the matrix multiplication operator")

print("\nRemember:")
print("* = element-wise multiplication")
print("@ = matrix/dot product multiplication")

### Q15: Create a 3x3 identity matrix and add 5 to all elements. What's the result?

**Difficulty:** Easy  
**Topic:** Array Creation & Operations

In [None]:
# Solution Q15
identity = np.eye(3)
print("3x3 Identity matrix:")
print(identity)

result = identity + 5
print("\nAfter adding 5:")
print(result)

print("\n✅ Explanation:")
print("Identity matrix has 1s on diagonal, 0s elsewhere")
print("Adding 5 to all elements:")
print("- Diagonal: 1 + 5 = 6")
print("- Off-diagonal: 0 + 5 = 5")
print("\nBroadcasting applies the scalar 5 to every element")

---
# Section 2: Array Operations (Q16-30)

Testing mathematical operations, aggregations, and transformations.

### Q16: Normalize an array so all values are between 0 and 1.

Given: `arr = np.array([10, 20, 30, 40, 50])`

**Difficulty:** Medium  
**Topic:** Array Operations

In [None]:
# Solution Q16
arr = np.array([10, 20, 30, 40, 50])
print("Original array:", arr)

# Min-Max Normalization formula: (x - min) / (max - min)
normalized = (arr - arr.min()) / (arr.max() - arr.min())
print("\nNormalized array:", normalized)

# Verify
print(f"\nMin value: {normalized.min()} (should be 0)")
print(f"Max value: {normalized.max()} (should be 1)")

print("\n✅ Formula: (x - min) / (max - min)")
print("This scales all values to the range [0, 1]")
print("\nExample: For 30:")
print("(30 - 10) / (50 - 10) = 20 / 40 = 0.5")

### Q17: What is the output of `np.arange(9).reshape(3,3).sum(axis=0)`?

**Options:**
- A) `45`
- B) `[3, 12, 21]`
- C) `[9, 12, 15]`
- D) `[[0,1,2], [3,4,5], [6,7,8]]`

**Difficulty:** Medium  
**Topic:** Aggregations with axis

In [None]:
# Solution Q17
arr = np.arange(9).reshape(3, 3)
print("Array:")
print(arr)

result = arr.sum(axis=0)
print("\nsum(axis=0):", result)

# Show calculation
print("\n✅ Answer: C) [9, 12, 15]")
print("\nExplanation:")
print("axis=0 means sum DOWN the columns (collapse rows)")
print("\nColumn 0: 0 + 3 + 6 = 9")
print("Column 1: 1 + 4 + 7 = 12")
print("Column 2: 2 + 5 + 8 = 15")

print("\nRemember:")
print("axis=0 → sum down columns (result has same # of columns)")
print("axis=1 → sum across rows (result has same # of rows)")
print("no axis → sum everything (single number)")
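
A related option worth knowing is `keepdims=True`, which keeps the collapsed axis with length 1 so the result broadcasts cleanly against the original array. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

col_sums = arr.sum(axis=0)                  # shape (3,)
row_sums = arr.sum(axis=1, keepdims=True)   # shape (3, 1) instead of (3,)

# Because row_sums is a column, each row is divided by its own sum
row_fractions = arr / row_sums

print(col_sums)
print(row_sums.shape)
print(row_fractions.sum(axis=1))  # each row now sums to 1
```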

### Q18: Swap two rows in a 2D array.

Given: `arr = np.array([[1,2,3], [4,5,6], [7,8,9]])`  
Swap rows 0 and 2.

**Difficulty:** Easy  
**Topic:** Array Manipulation

In [None]:
# Solution Q18
arr = np.array([[1,2,3], [4,5,6], [7,8,9]])
print("Original array:")
print(arr)

# Method 1: Using indexing (creates new array)
swapped1 = arr[[2, 1, 0]]
print("\nMethod 1 - Fancy indexing:")
print(swapped1)

# Method 2: In-place swap using fancy-index assignment
arr_copy = arr.copy()
arr_copy[[0, 2]] = arr_copy[[2, 0]]
print("\nMethod 2 - In-place swap:")
print(arr_copy)

print("\n✅ Both methods work!")
print("Method 1: arr[[2, 1, 0]] reorders all rows")
print("Method 2: arr[[0, 2]] = arr[[2, 0]] swaps specific rows in-place")

### Q19: Find the most frequent value in a NumPy array.

Given: `arr = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])`

**Difficulty:** Medium  
**Topic:** Statistics & Unique

In [None]:
# Solution Q19
arr = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])
print("Array:", arr)

# Get unique values and their counts
values, counts = np.unique(arr, return_counts=True)
print("\nUnique values:", values)
print("Counts:", counts)

# Find the most frequent
most_frequent_index = np.argmax(counts)
most_frequent_value = values[most_frequent_index]
max_count = counts[most_frequent_index]

print(f"\n✅ Most frequent value: {most_frequent_value}")
print(f"Appears {max_count} times")

# Alternative: using bincount (only for non-negative integers)
if arr.min() >= 0:
    counts_bincount = np.bincount(arr)
    most_frequent_alt = np.argmax(counts_bincount)
    print(f"\nAlternative method (bincount): {most_frequent_alt}")
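
If two values are tied for the highest count, `np.argmax` returns only the first, which here means the smallest tied value because `np.unique` sorts its output. The sketch below uses a hypothetical array with a tie and shows one way to recover all modes.

```python
import numpy as np

# Hypothetical array: 1 and 2 both appear twice
arr = np.array([1, 1, 2, 2, 3])

values, counts = np.unique(arr, return_counts=True)

first_mode = values[np.argmax(counts)]        # only the first tied value
all_modes = values[counts == counts.max()]    # every value with the top count

print(first_mode)
print(all_modes)
```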

### Q20: Calculate the cumulative sum of an array.

Given: `arr = np.array([1, 2, 3, 4, 5])`

**Expected Output:** `[1, 3, 6, 10, 15]`

**Difficulty:** Easy  
**Topic:** Cumulative Operations

### Q21b: Given the following arrays, which concatenation operations are correct?

```python
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array([[7, 8], [9, 10]])
d = np.array([[11, 12], [13, 14]])
```

**Select all correct options:**

**A)** To produce `[1, 2, 3, 4, 5, 6]`, use: `np.concatenate((a, b), axis=0)`

**B)** To produce `[[1, 2, 3], [4, 5, 6]]`, use: `np.concatenate((a, b), axis=1)`

**C)** To produce `[[7, 8], [9, 10], [11, 12], [13, 14]]`, use: `np.concatenate((c, d), axis=0)`

**D)** To produce `[[7, 8, 11, 12], [9, 10, 13, 14]]`, use: `np.concatenate((c, d), axis=1)`

**Note:** This question may have more than one correct answer.

**Difficulty:** Medium  
**Topic:** Array Concatenation with axis parameter

In [None]:
# Solution Q21b
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array([[7, 8], [9, 10]])
d = np.array([[11, 12], [13, 14]])

print("Arrays:")
print(f"a = {a}, shape: {a.shape}")
print(f"b = {b}, shape: {b.shape}")
print(f"c =\n{c}, shape: {c.shape}")
print(f"d =\n{d}, shape: {d.shape}")

print("\n" + "="*60)
print("Testing each option:\n")

# Option A: concatenate 1D arrays with axis=0
print("A) np.concatenate((a, b), axis=0):")
result_a = np.concatenate((a, b), axis=0)
print(f"   Result: {result_a}")
print(f"   Expected: [1, 2, 3, 4, 5, 6]")
print(f"   ✅ CORRECT - axis=0 stacks 1D arrays end-to-end\n")

# Option B: Try concatenate 1D arrays with axis=1
print("B) np.concatenate((a, b), axis=1):")
try:
    result_b = np.concatenate((a, b), axis=1)
    print(f"   Result: {result_b}")
except Exception as e:
    print(f"   ❌ ERROR: {e}")
    print(f"   INCORRECT - 1D arrays don't have axis=1")
    print(f"   To get [[1,2,3], [4,5,6]], need to reshape first or use vstack\n")

# Option C: concatenate 2D arrays with axis=0
print("C) np.concatenate((c, d), axis=0):")
result_c = np.concatenate((c, d), axis=0)
print(f"   Result:\n{result_c}")
print(f"   Expected: [[7, 8], [9, 10], [11, 12], [13, 14]]")
print(f"   ✅ CORRECT - axis=0 stacks rows vertically\n")

# Option D: concatenate 2D arrays with axis=1
print("D) np.concatenate((c, d), axis=1):")
result_d = np.concatenate((c, d), axis=1)
print(f"   Result:\n{result_d}")
print(f"   Expected: [[7, 8, 11, 12], [9, 10, 13, 14]]")
print(f"   ✅ CORRECT - axis=1 stacks columns horizontally\n")

print("="*60)
print("\n✅ CORRECT ANSWERS: A, C, D")
print("\n📝 Key Concepts:")
print("\nFor 1D arrays:")
print("  - axis=0: Concatenates end-to-end [1,2,3] + [4,5,6] = [1,2,3,4,5,6]")
print("  - axis=1: Not valid (1D arrays only have axis=0)")
print("\nFor 2D arrays:")
print("  - axis=0: Stacks VERTICALLY (adds more rows)")
print("  - axis=1: Stacks HORIZONTALLY (adds more columns)")

print("\n💡 Remember:")
print("  axis=0 → along rows (vertical stacking)")
print("  axis=1 → along columns (horizontal stacking)")
print("\nThink of axis as the dimension that GROWS when the arrays are joined.")
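
For completeness, here is a sketch of three ways to get the `[[1, 2, 3], [4, 5, 6]]` result that option B was aiming for. Each one gives the 1D arrays a second axis before (or while) joining them.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked = np.stack((a, b), axis=0)    # joins along a NEW leading axis
vstacked = np.vstack((a, b))          # treats each 1D array as a row

# concatenate works once each array is reshaped to (1, 3)
via_concat = np.concatenate((a[np.newaxis, :], b[np.newaxis, :]), axis=0)

print(stacked)
```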

In [None]:
# Solution Q20
arr = np.array([1, 2, 3, 4, 5])
print("Original array:", arr)

cumsum = np.cumsum(arr)
print("\nCumulative sum:", cumsum)

print("\n✅ Explanation:")
print("Position 0: 1")
print("Position 1: 1 + 2 = 3")
print("Position 2: 1 + 2 + 3 = 6")
print("Position 3: 1 + 2 + 3 + 4 = 10")
print("Position 4: 1 + 2 + 3 + 4 + 5 = 15")

# Also works with 2D arrays
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("\n2D array:")
print(arr_2d)
print("\nCumulative sum (axis=0):")
print(np.cumsum(arr_2d, axis=0))
print("\nCumulative sum (axis=1):")
print(np.cumsum(arr_2d, axis=1))

### Q21: How do you concatenate two arrays vertically and horizontally?

Given: `arr1 = np.array([[1, 2], [3, 4]])` and `arr2 = np.array([[5, 6], [7, 8]])`

**Difficulty:** Easy  
**Topic:** Array Manipulation

In [None]:
# Solution Q21
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

print("Array 1:")
print(arr1)
print("\nArray 2:")
print(arr2)

# Vertical concatenation (stack rows)
vertical = np.vstack([arr1, arr2])
print("\nVertical concatenation (vstack):")
print(vertical)

# Alternative for vertical
vertical_alt = np.concatenate([arr1, arr2], axis=0)
print("\nAlternative (concatenate axis=0):")
print(vertical_alt)

# Horizontal concatenation (stack columns)
horizontal = np.hstack([arr1, arr2])
print("\nHorizontal concatenation (hstack):")
print(horizontal)

# Alternative for horizontal
horizontal_alt = np.concatenate([arr1, arr2], axis=1)
print("\nAlternative (concatenate axis=1):")
print(horizontal_alt)

print("\n✅ Summary:")
print("Vertical: vstack() or concatenate(axis=0)")
print("Horizontal: hstack() or concatenate(axis=1)")

### Q22: What happens when you try to reshape an array with incompatible dimensions?

Try: `np.arange(10).reshape(3, 4)`

**Difficulty:** Easy  
**Topic:** Reshaping

In [None]:
# Solution Q22
arr = np.arange(10)
print(f"Array with {arr.size} elements:", arr)

try:
    reshaped = arr.reshape(3, 4)
    print("Reshaped successfully:", reshaped)
except ValueError as e:
    print(f"\n❌ Error: {e}")
    
print("\n✅ Explanation:")
print("Cannot reshape 10 elements into 3×4 (which needs 12 elements)")
print("The total number of elements must match!")
print("\nValid reshapes for 10 elements:")
print("- (10,) - original")
print("- (2, 5) - 2 rows, 5 columns")
print("- (5, 2) - 5 rows, 2 columns")
print("- (10, 1) - column vector")
print("- (1, 10) - row vector")

# Show valid reshape
valid = arr.reshape(2, 5)
print("\nValid reshape (2, 5):")
print(valid)

### Q23: Calculate the standard deviation and variance of an array.

Given: `arr = np.array([2, 4, 4, 4, 5, 5, 7, 9])`

**Difficulty:** Easy  
**Topic:** Statistics

In [None]:
# Solution Q23
arr = np.array([2, 4, 4, 4, 5, 5, 7, 9])
print("Array:", arr)

mean = np.mean(arr)
variance = np.var(arr)
std_dev = np.std(arr)

print(f"\nMean: {mean}")
print(f"Variance: {variance}")
print(f"Standard Deviation: {std_dev}")

# Verify relationship: std = sqrt(variance)
print(f"\nVerification: sqrt(variance) = {np.sqrt(variance)}")
print(f"Matches std_dev? {np.isclose(np.sqrt(variance), std_dev)}")

print("\n✅ Key Concepts:")
print("- Variance measures spread of data from mean")
print("- Standard deviation = square root of variance")
print("- Lower values = data closer to mean")
print("- Higher values = data more spread out")

# Manual calculation
print("\nManual calculation:")
deviations = arr - mean
squared_deviations = deviations ** 2
variance_manual = squared_deviations.mean()
print(f"Variance (manual): {variance_manual}")
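
Note that `np.var` and `np.std` compute the population statistics by default (dividing by N). A common follow-up question is how to get the sample versions (dividing by N - 1); the `ddof` parameter covers that, as this sketch on the same array shows.

```python
import numpy as np

arr = np.array([2, 4, 4, 4, 5, 5, 7, 9])

pop_var = np.var(arr)               # divides by N      -> 32 / 8
sample_var = np.var(arr, ddof=1)    # divides by N - 1  -> 32 / 7
sample_std = np.std(arr, ddof=1)

print(pop_var, sample_var, sample_std)
```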

### Q24: Use `np.where()` to replace negative values with 0 and positive values with their square.

Given: `arr = np.array([-2, -1, 0, 1, 2, 3])`

**Difficulty:** Medium  
**Topic:** Advanced Indexing

In [None]:
# Solution Q24
arr = np.array([-2, -1, 0, 1, 2, 3])
print("Original array:", arr)

# Replace: negative → 0, positive → square
result = np.where(arr < 0, 0, arr**2)
print("\nResult:", result)

print("\n✅ Explanation:")
print("np.where(condition, value_if_true, value_if_false)")
print("\nFor each element:")
print("-2: negative → 0")
print("-1: negative → 0")
print(" 0: not negative → 0² = 0")
print(" 1: not negative → 1² = 1")
print(" 2: not negative → 2² = 4")
print(" 3: not negative → 3² = 9")

# Alternative: nested where for more conditions
result2 = np.where(arr < 0, 0, np.where(arr == 0, 0, arr**2))
print("\nWith explicit zero handling:", result2)

### Q25: Create a 10x10 array with random values and find the row and column with the highest sum.

**Difficulty:** Medium  
**Topic:** Random & Aggregations

In [None]:
# Solution Q25
np.random.seed(42)
arr = np.random.rand(10, 10)
print("10x10 random array (first 3 rows):")
print(arr[:3])

# Sum each row
row_sums = arr.sum(axis=1)
max_row_index = np.argmax(row_sums)
print(f"\nRow sums: {row_sums}")
print(f"Row with highest sum: Row {max_row_index}")
print(f"Sum: {row_sums[max_row_index]:.4f}")

# Sum each column
col_sums = arr.sum(axis=0)
max_col_index = np.argmax(col_sums)
print(f"\nColumn sums: {col_sums}")
print(f"Column with highest sum: Column {max_col_index}")
print(f"Sum: {col_sums[max_col_index]:.4f}")

print("\n✅ Remember:")
print("axis=1 → sum across columns (gives row sums)")
print("axis=0 → sum down rows (gives column sums)")

### Q26: What is the difference between `np.sort()` and `np.argsort()`?

**Difficulty:** Easy  
**Topic:** Sorting

In [None]:
# Solution Q26
arr = np.array([40, 10, 30, 20, 50])
print("Original array:", arr)
print("Indices:       ", list(range(len(arr))))

# np.sort - returns sorted values
sorted_values = np.sort(arr)
print("\nnp.sort(arr):", sorted_values)

# np.argsort - returns indices that would sort the array
sorted_indices = np.argsort(arr)
print("\nnp.argsort(arr):", sorted_indices)

# Verify: using argsort indices gives sorted array
print("\nVerification:")
print("arr[argsort]:", arr[sorted_indices])
print("Matches np.sort? ", np.array_equal(arr[sorted_indices], sorted_values))

print("\n✅ Key Differences:")
print("np.sort()    → Returns sorted VALUES")
print("np.argsort() → Returns INDICES that would sort the array")
print("\nUse argsort when you need to:")
print("- Sort multiple related arrays together")
print("- Keep track of original positions")
print("- Get top-k elements with their positions")
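
As one way to illustrate the "top-k with positions" use case, the sketch below picks the two largest values of the same array along with their original indices. The choice of `k = 2` is arbitrary.

```python
import numpy as np

arr = np.array([40, 10, 30, 20, 50])
k = 2

order = np.argsort(arr)            # indices that sort ascending
top_k_idx = order[-k:][::-1]       # last k indices, reversed to descending order
top_k_vals = arr[top_k_idx]

print(top_k_idx)   # positions of the k largest values
print(top_k_vals)  # the values themselves
```

For very large arrays, `np.argpartition` can avoid a full sort, at the cost of returning the top-k indices in no particular order.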

### Q27: Find the intersection of two arrays.

Given: `arr1 = np.array([1, 2, 3, 4, 5])` and `arr2 = np.array([3, 4, 5, 6, 7])`

**Difficulty:** Easy  
**Topic:** Set Operations

In [None]:
# Solution Q27
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([3, 4, 5, 6, 7])

print("Array 1:", arr1)
print("Array 2:", arr2)

# Intersection - common elements
intersection = np.intersect1d(arr1, arr2)
print("\nIntersection:", intersection)

# Other set operations
union = np.union1d(arr1, arr2)
print("Union:", union)

difference = np.setdiff1d(arr1, arr2)
print("Difference (in arr1, not arr2):", difference)

symmetric_diff = np.setxor1d(arr1, arr2)
print("Symmetric difference:", symmetric_diff)

print("\n✅ Set Operations:")
print("intersect1d() → Common elements")
print("union1d()     → All unique elements")
print("setdiff1d()   → In first, not in second")
print("setxor1d()    → In either, but not both")

### Q28: Calculate the percentile values (25th, 50th, 75th) of an array.

Given: `arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])`

**Difficulty:** Easy  
**Topic:** Statistics

In [None]:
# Solution Q28
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print("Array:", arr)

# Calculate percentiles
p25 = np.percentile(arr, 25)
p50 = np.percentile(arr, 50)
p75 = np.percentile(arr, 75)

print(f"\n25th percentile (Q1): {p25}")
print(f"50th percentile (Q2/Median): {p50}")
print(f"75th percentile (Q3): {p75}")

# Calculate all at once
percentiles = np.percentile(arr, [25, 50, 75])
print(f"\nAll percentiles: {percentiles}")

# Interquartile range (IQR)
iqr = p75 - p25
print(f"\nInterquartile Range (IQR): {iqr}")

print("\n✅ Percentiles divide data into 100 equal parts")
print("25th percentile: 25% of data is below this value")
print("50th percentile: Same as median")
print("75th percentile: 75% of data is below this value")
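
The percentile values are not always elements of the array, because NumPy's default method interpolates linearly between neighbours. The sketch below reproduces the 25th percentile by hand under that assumption; `pos`, `lo`, and `hi` are illustrative names.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
q = 25

sorted_arr = np.sort(arr)
pos = (len(sorted_arr) - 1) * q / 100       # fractional index: 9 * 0.25 = 2.25
lo, hi = int(np.floor(pos)), int(np.ceil(pos))

# Interpolate between the two neighbouring values
manual = sorted_arr[lo] + (pos - lo) * (sorted_arr[hi] - sorted_arr[lo])

print(manual)                  # 3.25
print(np.percentile(arr, q))   # matches under the default linear method
```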

### Q29: Clip array values to a specific range [min, max].

Given: `arr = np.array([1, 5, 10, 15, 20, 25, 30])`  
Clip to range [10, 20]

**Difficulty:** Easy  
**Topic:** Array Operations

In [None]:
# Solution Q29
arr = np.array([1, 5, 10, 15, 20, 25, 30])
print("Original array:", arr)

clipped = np.clip(arr, 10, 20)
print("\nClipped to [10, 20]:", clipped)

print("\n✅ Explanation:")
print("Values below 10 → become 10")
print("Values above 20 → become 20")
print("Values in [10, 20] → unchanged")
print("\nMapping:")
for orig, clip in zip(arr, clipped):
    if orig < 10:
        print(f"{orig} → {clip} (below min)")
    elif orig > 20:
        print(f"{orig} → {clip} (above max)")
    else:
        print(f"{orig} → {clip} (in range)")

print("\nCommon use: Preventing outliers in data")

### Q30: What will be the output of `np.tile(np.array([1, 2]), (3, 2))`?

**Options:**
- A) `[[1, 2], [1, 2], [1, 2]]`
- B) `[[1, 2, 1, 2], [1, 2, 1, 2], [1, 2, 1, 2]]`
- C) `[[1, 1], [2, 2]]`
- D) Error

**Difficulty:** Medium  
**Topic:** Array Manipulation

In [None]:
# Solution Q30
arr = np.array([1, 2])
result = np.tile(arr, (3, 2))

print("Original array:", arr)
print("\nnp.tile(arr, (3, 2)):")
print(result)

print("\n✅ Answer: B)")
print("\nExplanation:")
print("tile(arr, (3, 2)) means:")
print("- Repeat 3 times vertically (3 rows)")
print("- Repeat 2 times horizontally (2 times per row)")
print("\nResult: 3 rows, each containing [1, 2, 1, 2]")

# Show step by step
print("\nStep by step:")
print("1. Original: [1, 2]")
print("2. Tile horizontally 2 times: [1, 2, 1, 2]")
print("3. Tile vertically 3 times:")
print("   [[1, 2, 1, 2],")
print("    [1, 2, 1, 2],")
print("    [1, 2, 1, 2]]")

---
# Section 3: Advanced Indexing & Manipulation (Q31-45)

Testing complex indexing, broadcasting, and array transformations.

### Q31: Extract the diagonal elements of a 2D array.

Given: `arr = np.array([[1,2,3], [4,5,6], [7,8,9]])`

**Difficulty:** Easy  
**Topic:** Indexing

In [None]:
# Solution Q31
arr = np.array([[1,2,3], [4,5,6], [7,8,9]])
print("Array:")
print(arr)

# Method 1: Using np.diag
diagonal = np.diag(arr)
print("\nDiagonal (np.diag):", diagonal)

# Method 2: Using diagonal method
diagonal2 = arr.diagonal()
print("Diagonal (.diagonal()):", diagonal2)

# Extract other diagonals
upper_diag = np.diag(arr, k=1)  # k=1 for upper diagonal
lower_diag = np.diag(arr, k=-1)  # k=-1 for lower diagonal

print(f"\nUpper diagonal (k=1): {upper_diag}")
print(f"Lower diagonal (k=-1): {lower_diag}")

print("\n✅ Main diagonal: elements where row index = column index")
print("arr[0,0]=1, arr[1,1]=5, arr[2,2]=9")

### Q32: Use fancy indexing to extract elements at positions (0,0), (1,2), and (2,1).

Given: `arr = np.array([[1,2,3], [4,5,6], [7,8,9]])`

**Difficulty:** Medium  
**Topic:** Fancy Indexing

In [None]:
# Solution Q32
arr = np.array([[1,2,3], [4,5,6], [7,8,9]])
print("Array:")
print(arr)

# Fancy indexing with row and column arrays
rows = np.array([0, 1, 2])
cols = np.array([0, 2, 1])

result = arr[rows, cols]
print(f"\nExtract positions (0,0), (1,2), (2,1):")
print(f"Result: {result}")

print("\n✅ Explanation:")
print("arr[0, 0] = 1")
print("arr[1, 2] = 6")
print("arr[2, 1] = 8")

# Alternative: list of tuples (less efficient)
positions = [(0,0), (1,2), (2,1)]
result2 = np.array([arr[r, c] for r, c in positions])
print(f"\nAlternative method: {result2}")

print("\nFancy indexing: arr[rows, cols]")
print("Pairs up corresponding elements from the row and column index arrays")
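
Paired fancy indexing is easy to confuse with selecting a sub-grid. The sketch below contrasts the two on the same `arr`; the index lists `[0, 2]` and `[0, 1]` are chosen only for illustration. `np.ix_` builds an open mesh, so every selected row is combined with every selected column.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Paired indexing: picks (0, 0) and (2, 1), one value per (row, col) pair
paired = arr[[0, 2], [0, 1]]

# Cross-product indexing: rows 0 and 2 combined with columns 0 and 1
block = arr[np.ix_([0, 2], [0, 1])]

print("Paired:", paired)
print("Block:")
print(block)
```

Paired indexing returns a 1D array with one element per pair, while `np.ix_` returns a 2D block.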

### Q33: Create a 5x5 matrix with row values ranging from 0 to 4.

**Expected Output:**
```
[[0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4]]
```

**Difficulty:** Medium  
**Topic:** Broadcasting

In [None]:
# Solution Q33

# Method 1: Using broadcasting
row = np.arange(5)
matrix1 = np.zeros((5, 5), dtype=int) + row
print("Method 1 - Broadcasting:")
print(matrix1)

# Method 2: Using tile
matrix2 = np.tile(np.arange(5), (5, 1))
print("\nMethod 2 - Tile:")
print(matrix2)

# Method 3: Using repeat and reshape
matrix3 = np.repeat(np.arange(5), 5).reshape(5, 5, order='F')
print("\nMethod 3 - Repeat:")
print(matrix3)

# Method 4: Using meshgrid (the FIRST output varies along columns)
matrix4, _ = np.meshgrid(np.arange(5), np.arange(5))
print("\nMethod 4 - Meshgrid:")
print(matrix4)

print("\n✅ All methods create the same pattern!")
print("Method 2 (tile) is the most concise and readable")
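
If the matrix only needs to be read, the same pattern can be produced without copying any data. This is a minimal sketch using `np.broadcast_to`; the name `matrix5` is just an illustrative continuation of the numbering above.

```python
import numpy as np

row = np.arange(5)

# broadcast_to returns a read-only view: the row is not duplicated in memory
view = np.broadcast_to(row, (5, 5))
print(view)
print("Writeable?", view.flags.writeable)

# Call .copy() when an independent, writable matrix is needed
matrix5 = view.copy()
matrix5[0, 0] = 99
print("Original row unchanged:", row)
```

The view is useful for large grids; `.copy()` gives a regular writable array at the cost of the full memory footprint.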

### Q34: Split an array into 3 equal parts.

Given: `arr = np.arange(12)`

**Difficulty:** Easy  
**Topic:** Array Manipulation

In [None]:
# Solution Q34
arr = np.arange(12)
print("Original array:", arr)

# Split into 3 equal parts
parts = np.split(arr, 3)

print("\nSplit into 3 parts:")
for i, part in enumerate(parts):
    print(f"Part {i+1}: {part}")

# Alternative: split at specific indices
parts2 = np.split(arr, [4, 8])
print("\nAlternative - split at indices [4, 8]:")
for i, part in enumerate(parts2):
    print(f"Part {i+1}: {part}")

# Array split (works with 2D)
arr_2d = np.arange(12).reshape(3, 4)
print("\n2D array:")
print(arr_2d)

vsplit_result = np.vsplit(arr_2d, 3)
print("\nVertical split (3 parts):")
for i, part in enumerate(vsplit_result):
    print(f"Part {i+1}:")
    print(part)

print("\n✅ np.split() divides array into equal parts")
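
`np.split()` only works when the array divides evenly. A short sketch of what happens otherwise, using a 10-element array chosen for illustration, and how `np.array_split()` handles the remainder:

```python
import numpy as np

arr = np.arange(10)

# np.split insists on equal parts and raises ValueError otherwise
try:
    np.split(arr, 3)
    split_failed = False
except ValueError as exc:
    split_failed = True
    print("np.split failed:", exc)

# np.array_split allows uneven parts (earlier parts get the extra elements)
parts = np.array_split(arr, 3)
for i, part in enumerate(parts):
    print(f"Part {i+1}: {part}")
```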

### Q35: Use `np.select()` to categorize ages into groups.

Given: `ages = np.array([5, 15, 25, 45, 65, 85])`

Categories:
- Child: < 13
- Teen: 13-19
- Adult: 20-64
- Senior: >= 65

**Difficulty:** Medium  
**Topic:** Advanced Indexing

In [None]:
# Solution Q35
ages = np.array([5, 15, 25, 45, 65, 85])
print("Ages:", ages)

# Define conditions
conditions = [
    ages < 13,
    (ages >= 13) & (ages < 20),
    (ages >= 20) & (ages < 65),
    ages >= 65
]

# Define corresponding categories
categories = ['Child', 'Teen', 'Adult', 'Senior']

# Apply conditions (a string default keeps the result dtype consistent;
# NumPy 2.x raises a TypeError when string choices meet the integer default 0)
result = np.select(conditions, categories, default='Unknown')
print("\nCategories:", result)

print("\n✅ Breakdown:")
for age, category in zip(ages, result):
    print(f"Age {age:2d} → {category}")

print("\nnp.select() is like a vectorized if-elif-else")
print("Easier to read than nested np.where() for multiple conditions")
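
Because `np.select()` takes the first condition that is true, the lower bounds can be dropped, exactly as in an `elif` chain. A minimal sketch, assuming the same age groups; the names `choices` and `groups` are illustrative:

```python
import numpy as np

ages = np.array([5, 15, 25, 45, 65, 85])

# Conditions are checked in order and the first True one wins,
# so each upper bound alone is enough (like elif branches)
conditions = [ages < 13, ages < 20, ages < 65]
choices = ['Child', 'Teen', 'Adult']

# 'default' covers elements that match no condition (here: age >= 65)
groups = np.select(conditions, choices, default='Senior')
print(groups)
```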

### Q36: Find all unique rows in a 2D array.

Given: `arr = np.array([[1,2], [3,4], [1,2], [5,6], [3,4]])`

**Difficulty:** Medium  
**Topic:** Unique Operations

In [None]:
# Solution Q36
arr = np.array([[1,2], [3,4], [1,2], [5,6], [3,4]])
print("Original array:")
print(arr)

# Find unique rows
unique_rows = np.unique(arr, axis=0)
print("\nUnique rows:")
print(unique_rows)

# With return_counts to see duplicates
unique_rows, counts = np.unique(arr, axis=0, return_counts=True)
print("\nUnique rows with counts:")
for row, count in zip(unique_rows, counts):
    print(f"{row} appears {count} time(s)")

print("\n✅ np.unique(arr, axis=0) finds unique rows")
print("axis=0 treats each row as a unit to compare")

### Q37: Create a 3x3 array and replace the maximum value in each row with 0.

**Difficulty:** Medium  
**Topic:** Indexing & Aggregations

In [None]:
# Solution Q37
arr = np.array([[3, 7, 2], [9, 1, 5], [4, 8, 6]])
print("Original array:")
print(arr)

# Find column index of max in each row
max_col_indices = np.argmax(arr, axis=1)
print(f"\nMax column indices per row: {max_col_indices}")

# Create row indices
row_indices = np.arange(arr.shape[0])

# Replace max values with 0
arr_copy = arr.copy()
arr_copy[row_indices, max_col_indices] = 0

print("\nAfter replacing max in each row with 0:")
print(arr_copy)

print("\n✅ Explanation:")
print("Row 0: max is 7 at column 1 → replaced with 0")
print("Row 1: max is 9 at column 0 → replaced with 0")
print("Row 2: max is 8 at column 1 → replaced with 0")
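
One detail worth knowing for interviews: `np.argmax()` returns only the first maximum, so ties are not all replaced. The sketch below uses a modified array with a tie in row 0 (an assumption made for illustration) and compares the `argmax` approach with a boolean mask.

```python
import numpy as np

arr = np.array([[3, 7, 7], [9, 1, 5], [4, 8, 6]])

# argmax returns only the FIRST maximum in each row
first_only = arr.copy()
first_only[np.arange(arr.shape[0]), arr.argmax(axis=1)] = 0

# A boolean mask against the row maximum catches every tie
row_max = arr.max(axis=1, keepdims=True)
all_ties = np.where(arr == row_max, 0, arr)

print("argmax approach:")
print(first_only)
print("mask approach:")
print(all_ties)
```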

### Q38: Generate a random 5x5 matrix and normalize each column to have mean=0 and std=1.

**Difficulty:** Hard  
**Topic:** Statistics & Broadcasting

In [None]:
# Solution Q38
np.random.seed(42)
arr = np.random.rand(5, 5) * 10
print("Original array:")
print(arr.round(2))

# Calculate mean and std for each column
col_means = arr.mean(axis=0)
col_stds = arr.std(axis=0)

print(f"\nColumn means: {col_means.round(2)}")
print(f"Column stds: {col_stds.round(2)}")

# Standardize: (x - mean) / std
standardized = (arr - col_means) / col_stds

print("\nStandardized array:")
print(standardized.round(2))

# Verify standardization
print(f"\nVerification:")
print(f"New column means: {standardized.mean(axis=0).round(10)}")
print(f"New column stds: {standardized.std(axis=0).round(2)}")

print("\n✅ Z-score normalization formula: (x - μ) / σ")
print("Broadcasting automatically applies to each column")
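
The formula divides by the column standard deviation, which is zero for a constant column. A hedged sketch of a reusable helper follows; `standardize_columns` and its `ddof` parameter are hypothetical names, and mapping constant columns to zeros is one possible convention, not the only one.

```python
import numpy as np

def standardize_columns(x, ddof=0):
    """Z-score each column; constant columns become all zeros."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    std = x.std(axis=0, ddof=ddof)
    # Replace zero std with 1 so constant columns do not divide by zero
    safe_std = np.where(std == 0, 1.0, std)
    return (x - mean) / safe_std

data = np.array([[1.0, 5.0, 10.0],
                 [2.0, 5.0, 20.0],
                 [3.0, 5.0, 60.0]])
z = standardize_columns(data)
print(z.round(3))
```

Passing `ddof=1` switches to the sample standard deviation, which is what many statistics packages use by default.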

### Q39: Find the k largest elements in an array.

Given: `arr = np.array([3, 7, 1, 9, 2, 8, 5])`, find k=3 largest elements.

**Difficulty:** Medium  
**Topic:** Sorting

In [None]:
# Solution Q39
arr = np.array([3, 7, 1, 9, 2, 8, 5])
k = 3

print("Array:", arr)
print(f"Find {k} largest elements\n")

# Method 1: Sort and take last k
sorted_arr = np.sort(arr)
largest_method1 = sorted_arr[-k:]
print(f"Method 1 - Sort and slice: {largest_method1}")

# Method 2: Using argsort to get indices
indices = np.argsort(arr)[-k:]
largest_method2 = arr[indices]
print(f"Method 2 - Argsort indices: {indices}")
print(f"Values: {largest_method2}")

# Method 3: Using argpartition (most efficient for large arrays)
partition_indices = np.argpartition(arr, -k)[-k:]
largest_method3 = arr[partition_indices]
print(f"Method 3 - Argpartition: {largest_method3}")

# Sort the result
largest_sorted = np.sort(largest_method3)
print(f"Sorted: {largest_sorted}")

print("\n✅ For k largest:")
print("- Small arrays: np.sort(arr)[-k:]")
print("- Large arrays: np.argpartition(arr, -k)[-k:] (O(n) vs O(n log n))")
print("- Need indices too: np.argsort(arr)[-k:]")
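
`np.argpartition()` does not order the k elements it selects. A small sketch of a helper that returns them largest first by sorting only the k candidates; `top_k` is a hypothetical name:

```python
import numpy as np

def top_k(arr, k):
    """Return (indices, values) of the k largest elements, largest first."""
    arr = np.asarray(arr)
    # argpartition puts the k largest in the last k slots, in arbitrary order
    idx = np.argpartition(arr, -k)[-k:]
    # Sort only those k candidates, descending
    idx = idx[np.argsort(arr[idx])[::-1]]
    return idx, arr[idx]

indices, values = top_k([3, 7, 1, 9, 2, 8, 5], k=3)
print("Indices:", indices)
print("Values:", values)
```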

### Q40: Calculate the Euclidean distance between two points.

Given: `point1 = np.array([1, 2, 3])` and `point2 = np.array([4, 6, 8])`

**Difficulty:** Easy  
**Topic:** Mathematical Operations

In [None]:
# Solution Q40
point1 = np.array([1, 2, 3])
point2 = np.array([4, 6, 8])

print("Point 1:", point1)
print("Point 2:", point2)

# Method 1: Manual calculation
diff = point2 - point1
squared_diff = diff ** 2
distance1 = np.sqrt(np.sum(squared_diff))
print(f"\nMethod 1 - Manual: {distance1:.4f}")

# Method 2: Using np.linalg.norm
distance2 = np.linalg.norm(point2 - point1)
print(f"Method 2 - np.linalg.norm: {distance2:.4f}")

# Method 3: Using np.sqrt and dot product
distance3 = np.sqrt(np.dot(diff, diff))
print(f"Method 3 - Dot product: {distance3:.4f}")

print("\n✅ Euclidean distance formula:")
print("d = √[(x₂-x₁)² + (y₂-y₁)² + (z₂-z₁)²]")
print(f"d = √[(4-1)² + (6-2)² + (8-3)²]")
print(f"d = √[9 + 16 + 25] = √50 ≈ {distance1:.4f}")

print("\nBest method: np.linalg.norm(point2 - point1)")

### Q41: Create a correlation matrix for a dataset.

Given: 3 variables with 5 observations each

**Difficulty:** Medium  
**Topic:** Statistics

In [None]:
# Solution Q41
# Create sample data: 3 variables, 5 observations
var1 = np.array([1, 2, 3, 4, 5])
var2 = np.array([2, 4, 6, 8, 10])  # Perfectly correlated with var1
var3 = np.array([5, 4, 3, 2, 1])   # Negatively correlated with var1

data = np.array([var1, var2, var3])
print("Data (3 variables, 5 observations):")
print(data)

# Calculate correlation matrix
corr_matrix = np.corrcoef(data)
print("\nCorrelation matrix:")
print(corr_matrix.round(3))

print("\n✅ Interpretation:")
print("Diagonal: 1.0 (variable correlated with itself)")
print(f"var1 vs var2: {corr_matrix[0,1]:.3f} (perfect positive correlation)")
print(f"var1 vs var3: {corr_matrix[0,2]:.3f} (perfect negative correlation)")
print(f"var2 vs var3: {corr_matrix[1,2]:.3f} (perfect negative correlation)")

print("\nCorrelation values:")
print("  1.0: Perfect positive correlation")
print("  0.0: No correlation")
print(" -1.0: Perfect negative correlation")

### Q42: Implement a moving average with window size 3.

Given: `arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])`

**Difficulty:** Medium  
**Topic:** Array Operations

In [None]:
# Solution Q42
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
window = 3

print("Original array:", arr)
print(f"Window size: {window}\n")

# Method 1: Using np.convolve
moving_avg1 = np.convolve(arr, np.ones(window)/window, mode='valid')
print(f"Method 1 - np.convolve: {moving_avg1}")

# Method 2: Manual calculation with loop
moving_avg2 = np.array([arr[i:i+window].mean() 
                        for i in range(len(arr)-window+1)])
print(f"Method 2 - Manual: {moving_avg2}")

# Method 3: Using cumsum (efficient for large arrays)
cumsum = np.cumsum(np.insert(arr, 0, 0))
moving_avg3 = (cumsum[window:] - cumsum[:-window]) / window
print(f"Method 3 - Cumsum: {moving_avg3}")

print("\n✅ Explanation (window=3):")
print("Position 0: (1+2+3)/3 = 2.0")
print("Position 1: (2+3+4)/3 = 3.0")
print("Position 2: (3+4+5)/3 = 4.0")
print("...")
print("Position 7: (8+9+10)/3 = 9.0")

print("\nBest method: np.convolve() - clean and efficient")
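
Another option, assuming NumPy 1.20 or newer, is `sliding_window_view`, which exposes every window as a row of a read-only view. The name `moving_avg4` simply continues the numbering above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
window = 3

# Each row of 'windows' is one length-3 window; no data is copied
windows = sliding_window_view(arr, window)
moving_avg4 = windows.mean(axis=1)

print("Windows shape:", windows.shape)
print("Moving average:", moving_avg4)
```

The same view works for any per-window statistic, for example `windows.max(axis=1)` for a rolling maximum.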

### Q43: Find the indices of elements that are closest to a given value.

Given: `arr = np.array([1.5, 2.3, 3.7, 4.1, 5.9])`, find closest to `3.5`

**Difficulty:** Medium  
**Topic:** Searching

In [None]:
# Solution Q43
arr = np.array([1.5, 2.3, 3.7, 4.1, 5.9])
target = 3.5

print("Array:", arr)
print(f"Target value: {target}\n")

# Calculate absolute differences
differences = np.abs(arr - target)
print("Differences from target:", differences)

# Find index of minimum difference
closest_index = np.argmin(differences)
closest_value = arr[closest_index]

print(f"\nClosest index: {closest_index}")
print(f"Closest value: {closest_value}")
print(f"Distance: {differences[closest_index]}")

# Find multiple closest values (top k)
k = 3
top_k_indices = np.argsort(differences)[:k]
print(f"\nTop {k} closest indices: {top_k_indices}")
print(f"Top {k} closest values: {arr[top_k_indices]}")

print("\n✅ Method: np.argmin(np.abs(arr - target))")

### Q44: Remove all NaN values from an array.

Given: `arr = np.array([1, 2, np.nan, 4, np.nan, 6])`

**Difficulty:** Easy  
**Topic:** NaN Handling

In [None]:
# Solution Q44
arr = np.array([1, 2, np.nan, 4, np.nan, 6])
print("Original array:", arr)

# Remove NaN values
cleaned = arr[~np.isnan(arr)]
print("After removing NaN:", cleaned)

# Alternative: np.isfinite also drops +/-inf, not just NaN
cleaned2 = arr[np.isfinite(arr)]
print("Using isfinite:", cleaned2)

print("\n✅ Methods:")
print("~np.isnan(arr)  - Remove NaN only")
print("np.isfinite(arr) - Remove NaN and Inf")

# Replace NaN with a value instead of removing
arr_copy = arr.copy()
arr_copy[np.isnan(arr_copy)] = 0
print("\nReplace NaN with 0:", arr_copy)

# Using nanmean, nansum (ignore NaN in calculations)
print(f"\nMean (ignoring NaN): {np.nanmean(arr)}")
print(f"Sum (ignoring NaN): {np.nansum(arr)}")

### Q45: Create a function to calculate pairwise distances between all points in a 2D array.

Given: `points = np.array([[0,0], [1,1], [2,2]])`

**Difficulty:** Hard  
**Topic:** Broadcasting & Linear Algebra

In [None]:
# Solution Q45
points = np.array([[0,0], [1,1], [2,2]])
print("Points:")
print(points)

# Method 1: Using broadcasting (vectorized, fast)
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
distances = np.sqrt(np.sum(diff**2, axis=2))

print("\nPairwise distance matrix:")
print(distances.round(3))

# Method 2: Using nested loops (slow, for comparison)
n = len(points)
distances_loop = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        distances_loop[i, j] = np.linalg.norm(points[i] - points[j])

print("\nUsing loops (same result):")
print(distances_loop.round(3))

print("\n✅ Explanation:")
print("Distance from point 0 to point 1:")
print(f"  √[(1-0)² + (1-0)²] = √2 ≈ {distances[0,1]:.3f}")
print("Distance from point 0 to point 2:")
print(f"  √[(2-0)² + (2-0)²] = √8 ≈ {distances[0,2]:.3f}")

print("\nBroadcasting method is much faster for large arrays!")
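
The question asks for a function, so here is the broadcasting approach wrapped up as one. This is a minimal sketch; `pairwise_distances` is a hypothetical name and the input is assumed to be an `(n, d)` array of points.

```python
import numpy as np

def pairwise_distances(points):
    """Return the (n, n) Euclidean distance matrix for an (n, d) array."""
    points = np.asarray(points, dtype=float)
    # Shape (n, 1, d) minus shape (1, n, d) broadcasts to (n, n, d)
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

d = pairwise_distances([[0, 0], [1, 1], [2, 2]])
print(d.round(3))
```

The intermediate `diff` array holds `n * n * d` values, so for very large `n` the memory cost can outweigh the speed benefit.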

---
# Section 4: Linear Algebra & Statistics (Q46-60)

Testing matrix operations, statistical analysis, and mathematical functions.

### Q46: Calculate the determinant and inverse of a matrix.

Given: `matrix = np.array([[4, 7], [2, 6]])`

**Difficulty:** Easy  
**Topic:** Linear Algebra

In [None]:
# Solution Q46
matrix = np.array([[4, 7], [2, 6]])
print("Matrix:")
print(matrix)

# Calculate determinant
det = np.linalg.det(matrix)
print(f"\nDeterminant: {det}")

# Calculate inverse (only if determinant is not numerically zero;
# det is a float, so compare with a tolerance rather than == 0)
if not np.isclose(det, 0):
    inverse = np.linalg.inv(matrix)
    print("\nInverse matrix:")
    print(inverse)
    
    # Verify: matrix × inverse = identity
    identity = matrix @ inverse
    print("\nMatrix × Inverse (should be identity):")
    print(identity.round(10))
else:
    print("\nMatrix is singular (determinant = 0), no inverse exists")

print("\n✅ Key concepts:")
print("- Determinant ≠ 0 → matrix is invertible")
print("- Determinant = 0 → matrix is singular (no inverse)")
print("- A × A⁻¹ = I (identity matrix)")
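
In floating point, the determinant is a fragile test for invertibility, because it scales with the size of the entries. A common alternative is the condition number. The sketch below is one way to use it; `safe_inverse` and the `max_cond` threshold of `1e12` are illustrative assumptions, not a standard.

```python
import numpy as np

def safe_inverse(matrix, max_cond=1e12):
    """Return the inverse, or None if the matrix is (nearly) singular."""
    matrix = np.asarray(matrix, dtype=float)
    # A huge condition number means the inverse would amplify rounding errors
    if np.linalg.cond(matrix) > max_cond:
        return None
    return np.linalg.inv(matrix)

good = safe_inverse([[4, 7], [2, 6]])
bad = safe_inverse([[1, 2], [2, 4]])   # second row = 2 * first row

print("Invertible matrix:")
print(good)
print("Singular matrix:", bad)
```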

### Q47: Solve a system of linear equations using NumPy.

Solve:
- 2x + 3y = 8
- 4x + y = 10

**Difficulty:** Medium  
**Topic:** Linear Algebra

In [None]:
# Solution Q47
# System: 2x + 3y = 8
#         4x + y = 10

# Coefficient matrix A
A = np.array([[2, 3], [4, 1]])
# Constants vector b
b = np.array([8, 10])

print("System of equations:")
print("2x + 3y = 8")
print("4x + y = 10")
print("\nCoefficient matrix A:")
print(A)
print("\nConstants vector b:", b)

# Solve using np.linalg.solve
solution = np.linalg.solve(A, b)
x, y = solution

print(f"\nSolution:")
print(f"x = {x}")
print(f"y = {y}")

# Verify the solution
print("\nVerification:")
print(f"2({x}) + 3({y}) = {2*x + 3*y} (should be 8)")
print(f"4({x}) + 1({y}) = {4*x + 1*y} (should be 10)")

# Alternative: using matrix inverse
solution_alt = np.linalg.inv(A) @ b
print(f"\nAlternative (using inverse): x={solution_alt[0]}, y={solution_alt[1]}")

print("\n✅ np.linalg.solve() is faster and numerically more stable than forming the inverse")

### Q48: Calculate eigenvalues and eigenvectors of a matrix.

Given: `matrix = np.array([[4, 2], [1, 3]])`

**Difficulty:** Medium  
**Topic:** Linear Algebra

In [None]:
# Solution Q48
matrix = np.array([[4, 2], [1, 3]])
print("Matrix:")
print(matrix)

# Calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(matrix)

print(f"\nEigenvalues:")
print(eigenvalues)

print(f"\nEigenvectors:")
print(eigenvectors)

# Verify: A × v = λ × v
print("\nVerification (A × v = λ × v):")
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    
    Av = matrix @ v
    lam_v = lam * v
    
    print(f"\nEigenvalue {i+1}: λ = {lam:.4f}")
    print(f"A × v = {Av}")
    print(f"λ × v = {lam_v}")
    print(f"Match? {np.allclose(Av, lam_v)}")

print("\n✅ Eigenvalues/vectors are crucial for:")
print("- PCA (dimensionality reduction)")
print("- Stability analysis")
print("- Google PageRank algorithm")

### Q49: Perform matrix multiplication using @ operator vs np.dot().

What's the difference?

**Difficulty:** Easy  
**Topic:** Linear Algebra

In [None]:
# Solution Q49
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print("Matrix A:")
print(A)
print("\nMatrix B:")
print(B)

# Using @ operator (Python 3.5+)
result_at = A @ B
print("\nA @ B:")
print(result_at)

# Using np.dot()
result_dot = np.dot(A, B)
print("\nnp.dot(A, B):")
print(result_dot)

# Using np.matmul()
result_matmul = np.matmul(A, B)
print("\nnp.matmul(A, B):")
print(result_matmul)

print("\n✅ All three give same result for 2D matrices!")

# Difference with 1D arrays
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])

print("\n1D arrays:")
print(f"vec1 @ vec2 = {vec1 @ vec2} (dot product)")
print(f"np.dot(vec1, vec2) = {np.dot(vec1, vec2)} (dot product)")

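print("\n✅ Summary:")
print("@ operator - Modern, clean syntax (recommended); same as np.matmul()")
print("np.dot()   - Also accepts scalars; for N-D inputs it sums over the last axis of A and the second-to-last axis of B")
print("np.matmul() - Broadcasts over leading (batch) dimensions; scalars not allowed")
print("\nFor 1D and 2D arrays, all three give the same result")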
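
The difference only shows up with more than two dimensions. A short sketch with stacks of matrices; the shapes are chosen purely for illustration:

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# matmul / @: leading axis is a batch, giving 2 separate (3x4) @ (4x5) products
matmul_shape = np.matmul(a, b).shape

# dot: combines every matrix in a with every matrix in b
dot_shape = np.dot(a, b).shape

print("matmul shape:", matmul_shape)
print("dot shape:   ", dot_shape)
```

For batched matrix products, which is the usual intent, `@` or `np.matmul()` is the one to reach for.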

### Q50: Calculate the trace and rank of a matrix.

Given: `matrix = np.array([[1,2,3], [4,5,6], [7,8,9]])`

**Difficulty:** Easy  
**Topic:** Linear Algebra

In [None]:
# Solution Q50
matrix = np.array([[1,2,3], [4,5,6], [7,8,9]])
print("Matrix:")
print(matrix)

# Trace (sum of diagonal elements)
trace = np.trace(matrix)
print(f"\nTrace: {trace}")
print(f"Calculation: {matrix[0,0]} + {matrix[1,1]} + {matrix[2,2]} = {trace}")

# Rank (number of linearly independent rows/columns)
rank = np.linalg.matrix_rank(matrix)
print(f"\nRank: {rank}")

print("\n✅ Interpretations:")
print(f"Trace = {trace}: Sum of diagonal elements")
print(f"Rank = {rank}: This 3×3 matrix has only {rank} independent rows")
print("  (The third row is a linear combination of first two)")

# Verify rank with a full-rank matrix
full_rank_matrix = np.array([[1,0,0], [0,1,0], [0,0,1]])
print(f"\nIdentity matrix rank: {np.linalg.matrix_rank(full_rank_matrix)}")
print("(Full rank = 3 for 3×3 matrix)")

### Q51: Generate random samples from different probability distributions.

Generate 1000 samples from:
- Normal distribution (mean=0, std=1)
- Uniform distribution (0 to 1)
- Binomial distribution (n=10, p=0.5)

**Difficulty:** Easy  
**Topic:** Random & Statistics

In [None]:
# Solution Q51
np.random.seed(42)
n_samples = 1000

# Normal distribution
normal_samples = np.random.normal(loc=0, scale=1, size=n_samples)
print("Normal distribution (mean=0, std=1):")
print(f"Mean: {normal_samples.mean():.3f}")
print(f"Std: {normal_samples.std():.3f}")
print(f"Sample: {normal_samples[:5]}")

# Uniform distribution
uniform_samples = np.random.uniform(low=0, high=1, size=n_samples)
print("\nUniform distribution (0 to 1):")
print(f"Mean: {uniform_samples.mean():.3f} (should be ~0.5)")
print(f"Min: {uniform_samples.min():.3f}")
print(f"Max: {uniform_samples.max():.3f}")
print(f"Sample: {uniform_samples[:5]}")

# Binomial distribution
binomial_samples = np.random.binomial(n=10, p=0.5, size=n_samples)
print("\nBinomial distribution (n=10, p=0.5):")
print(f"Mean: {binomial_samples.mean():.3f} (should be ~5)")
print(f"Sample: {binomial_samples[:10]}")

print("\n✅ Common distributions:")
print("Normal: np.random.normal(mean, std, size)")
print("Uniform: np.random.uniform(low, high, size)")
print("Binomial: np.random.binomial(n, p, size)")
print("Poisson: np.random.poisson(lam, size)")
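
The `np.random.*` functions above use NumPy's legacy global random state. Since NumPy 1.17 there is also the `Generator` interface, created with `np.random.default_rng()`. A minimal sketch of the same draws; the Poisson rate `lam=3` is an arbitrary illustrative value:

```python
import numpy as np

rng = np.random.default_rng(42)  # independent generator, no global state

normal_samples = rng.normal(loc=0, scale=1, size=1000)
uniform_samples = rng.uniform(low=0, high=1, size=1000)
binomial_samples = rng.binomial(n=10, p=0.5, size=1000)
poisson_samples = rng.poisson(lam=3, size=1000)

print(f"Normal mean:   {normal_samples.mean():.3f}")
print(f"Uniform mean:  {uniform_samples.mean():.3f}")
print(f"Binomial mean: {binomial_samples.mean():.3f}")
print(f"Poisson mean:  {poisson_samples.mean():.3f}")
```

A seeded `Generator` produces a different stream than `np.random.seed()` with the same number, so results will not match the cell above.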

### Q52: Calculate the covariance between two variables.

Given: `x = np.array([1, 2, 3, 4, 5])` and `y = np.array([2, 4, 6, 8, 10])`

**Difficulty:** Easy  
**Topic:** Statistics

In [None]:
# Solution Q52
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])

print("x:", x)
print("y:", y)

# Calculate covariance matrix
cov_matrix = np.cov(x, y)
print("\nCovariance matrix:")
print(cov_matrix)

# Extract covariance between x and y
cov_xy = cov_matrix[0, 1]
print(f"\nCovariance between x and y: {cov_xy}")

# Manual calculation
mean_x = x.mean()
mean_y = y.mean()
cov_manual = np.sum((x - mean_x) * (y - mean_y)) / (len(x) - 1)
print(f"Manual calculation: {cov_manual}")

# Correlation coefficient
corr = np.corrcoef(x, y)[0, 1]
print(f"\nCorrelation coefficient: {corr}")

print("\n✅ Interpretation:")
print(f"Covariance = {cov_xy}: Positive → variables increase together")
print(f"Correlation = {corr}: Perfect positive correlation (1.0)")
print("\nCovariance shows the direction of the relationship; its magnitude depends on the units")
print("Correlation normalizes covariance to [-1, 1] range")

### Q53: Fit a polynomial to data and evaluate it.

Given data points, fit a 2nd degree polynomial.

**Difficulty:** Medium  
**Topic:** Polynomials

In [None]:
# Solution Q53
# Generate data: y = 2x² + 3x + 1 with noise
np.random.seed(42)
x = np.linspace(0, 5, 20)
y_true = 2*x**2 + 3*x + 1
y_noisy = y_true + np.random.randn(20) * 5

print("Fitting 2nd degree polynomial to noisy data...")

# Fit polynomial
coefficients = np.polyfit(x, y_noisy, deg=2)
print(f"\nFitted coefficients: {coefficients}")
print(f"True coefficients: [2, 3, 1]")

# Create polynomial function
poly = np.poly1d(coefficients)
print(f"\nPolynomial: {poly}")

# Evaluate at new points
x_new = np.array([1, 2, 3])
y_pred = poly(x_new)
print(f"\nEvaluate at x={x_new}:")
print(f"Predicted y: {y_pred}")

# Alternative: using np.polyval
y_pred_alt = np.polyval(coefficients, x_new)
print(f"Using polyval: {y_pred_alt}")

# Calculate R² score (goodness of fit)
y_fitted = poly(x)
ss_res = np.sum((y_noisy - y_fitted)**2)
ss_tot = np.sum((y_noisy - y_noisy.mean())**2)
r_squared = 1 - (ss_res / ss_tot)
print(f"\nR² score: {r_squared:.4f}")

print("\n✅ np.polyfit() finds best-fit polynomial coefficients")

### Q54: Calculate the histogram of an array.

Given: `data = np.random.randn(1000) * 10 + 50`

**Difficulty:** Easy  
**Topic:** Statistics

In [None]:
# Solution Q54
np.random.seed(42)
data = np.random.randn(1000) * 10 + 50

print(f"Generated {len(data)} random values")
print(f"Mean: {data.mean():.2f}, Std: {data.std():.2f}")

# Create histogram with 10 bins
counts, bin_edges = np.histogram(data, bins=10)

print("\nHistogram (10 bins):")
print(f"Bin edges: {bin_edges.round(1)}")
print(f"Counts: {counts}")

# Show bins with counts
print("\nBin ranges and counts:")
for i in range(len(counts)):
    print(f"[{bin_edges[i]:6.2f}, {bin_edges[i+1]:6.2f}): {counts[i]:4d} values")

# Custom bin edges
custom_bins = [0, 30, 40, 50, 60, 70, 100]
counts_custom, _ = np.histogram(data, bins=custom_bins)
print(f"\nCustom bins {custom_bins}:")
print(f"Counts: {counts_custom}")

# Digitize - find which bin each value belongs to
bins = [30, 40, 50, 60, 70]
bin_indices = np.digitize(data[:10], bins)
print(f"\nFirst 10 values: {data[:10].round(1)}")
print(f"Bin indices: {bin_indices}")

print("\n✅ np.histogram() groups data into bins for analysis")

### Q55: Calculate numerical gradient (derivative) of a function.

Given: `y = x²`, calculate dy/dx

**Difficulty:** Medium  
**Topic:** Calculus

In [None]:
# Solution Q55
# Function: y = x²
x = np.linspace(0, 10, 100)
y = x**2

print("Function: y = x²")
print("Analytical derivative: dy/dx = 2x")

# Calculate numerical gradient
dy_dx = np.gradient(y, x)

# Check at specific points
test_points = [1, 2, 5]
for point in test_points:
    idx = np.argmin(np.abs(x - point))
    numerical = dy_dx[idx]
    analytical = 2 * x[idx]
    print(f"\nAt x={x[idx]:.2f}:")
    print(f"  Numerical: {numerical:.4f}")
    print(f"  Analytical: {analytical:.4f}")
    print(f"  Difference: {abs(numerical - analytical):.6f}")

# 2D gradient example
xs = np.linspace(-2, 2, 50)
ys = np.linspace(-2, 2, 50)
X, Y = np.meshgrid(xs, ys)
Z = X**2 + Y**2  # Function: z = x² + y²

# Calculate gradients; pass the coordinates so the real grid spacing is used
# (axis 0 varies with y, axis 1 varies with x)
dZ_dY, dZ_dX = np.gradient(Z, ys, xs)

print("\n2D function: z = x² + y²")
print(f"Gradient shape: {dZ_dX.shape}")
print("∂z/∂x should be 2x, ∂z/∂y should be 2y")

print("\n✅ np.gradient() computes numerical derivatives")

### Q56: Perform numerical integration using trapezoidal rule.

Integrate: ∫sin(x)dx from 0 to π

**Difficulty:** Medium  
**Topic:** Calculus

In [None]:
# Solution Q56
# Integrate sin(x) from 0 to π
# Analytical result: -cos(π) - (-cos(0)) = -(-1) - (-1) = 2

x = np.linspace(0, np.pi, 100)
y = np.sin(x)

# Numerical integration using trapezoidal rule
# (NumPy 2.0 deprecates np.trapz in favor of np.trapezoid)
integral = np.trapz(y, x)

print("Integrating: ∫sin(x)dx from 0 to π")
print(f"\nNumerical result: {integral:.6f}")
print(f"Analytical result: 2.0")
print(f"Error: {abs(integral - 2.0):.6f}")

# Effect of number of points
for n_points in [10, 50, 100, 500, 1000]:
    x_n = np.linspace(0, np.pi, n_points)
    y_n = np.sin(x_n)
    integral_n = np.trapz(y_n, x_n)
    error = abs(integral_n - 2.0)
    print(f"\n{n_points:4d} points: {integral_n:.8f}, error: {error:.8f}")

print("\n✅ More points → better accuracy")
print("np.trapz() uses trapezoidal rule for integration")
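
How much better? For a smooth integrand the trapezoidal error shrinks roughly with the square of the step size, so doubling the number of intervals should cut the error by about 4. The sketch below checks this with the rule written out by hand, so it does not depend on which NumPy version is installed; `trapezoid_rule` and `integration_error` are hypothetical helper names.

```python
import numpy as np

def trapezoid_rule(y, x):
    """Composite trapezoidal rule written out explicitly."""
    return np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

def integration_error(n_intervals):
    """Absolute error when integrating sin(x) over [0, pi]."""
    x = np.linspace(0, np.pi, n_intervals + 1)
    return abs(trapezoid_rule(np.sin(x), x) - 2.0)

err_100 = integration_error(100)
err_200 = integration_error(200)
print(f"Error with 100 intervals: {err_100:.2e}")
print(f"Error with 200 intervals: {err_200:.2e}")
print(f"Ratio: {err_100 / err_200:.2f}  (about 4 for an O(h^2) method)")
```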

### Q57: What is the difference between np.mean() and np.average()? When would you use np.average()?

**Difficulty:** Medium  
**Topic:** Statistics

In [None]:
# Solution Q57
arr = np.array([1, 2, 3, 4, 5])

# np.mean() - simple average
mean = np.mean(arr)
print(f"np.mean(arr): {mean}")

# np.average() - can use weights!
average = np.average(arr)
print(f"np.average(arr): {average}")
print("Without weights, they're the same\n")

# Weighted average
weights = np.array([1, 1, 1, 2, 2])  # Last two values weighted more
weighted_avg = np.average(arr, weights=weights)

print("With weights [1, 1, 1, 2, 2]:")
print(f"Weighted average: {weighted_avg}")

# Manual calculation
manual = np.sum(arr * weights) / np.sum(weights)
print(f"Manual calculation: {manual}")

# Real-world example: grade calculation
grades = np.array([85, 90, 78, 92])
weights = np.array([0.2, 0.3, 0.2, 0.3])  # Different weight for each test

final_grade = np.average(grades, weights=weights)
print(f"\nGrade calculation:")
print(f"Grades: {grades}")
print(f"Weights: {weights}")
print(f"Final grade: {final_grade:.2f}")

print("\n✅ Key Difference:")
print("np.mean()    - Simple average only")
print("np.average() - Can compute weighted average")
print("\nUse np.average() when different values have different importance")

### Q58: Calculate the outer product of two vectors.

Given: `a = np.array([1, 2, 3])` and `b = np.array([4, 5, 6])`

**Difficulty:** Easy  
**Topic:** Linear Algebra

In [None]:
# Solution Q58
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Vector a:", a)
print("Vector b:", b)

# Outer product
outer = np.outer(a, b)
print("\nOuter product (a ⊗ b):")
print(outer)

# Manual calculation to show what it does
print("\n✅ Explanation:")
print("Each element of 'a' multiplied by each element of 'b':")
for i, val_a in enumerate(a):
    row = [val_a * val_b for val_b in b]
    print(f"a[{i}] × b = {val_a} × {b} = {row}")

# Compare with inner (dot) product
inner = np.inner(a, b)
print(f"\nInner (dot) product: {inner}")
print(f"Calculation: {a[0]}×{b[0]} + {a[1]}×{b[1]} + {a[2]}×{b[2]} = {inner}")

print("\n✅ Key Differences:")
print("Outer product: Returns a matrix (m×n)")
print("Inner product: Returns a scalar")
print("\nOuter product useful for creating matrices from vectors")

### Q59: Calculate the Frobenius norm of a matrix.

Given: `matrix = np.array([[1, 2], [3, 4]])`

**Difficulty:** Easy  
**Topic:** Linear Algebra

In [None]:
# Solution Q59
matrix = np.array([[1, 2], [3, 4]])
print("Matrix:")
print(matrix)

# Frobenius norm (default for matrices)
frobenius_norm = np.linalg.norm(matrix)
print(f"\nFrobenius norm: {frobenius_norm:.4f}")

# Manual calculation: sqrt of sum of squared elements
manual = np.sqrt(np.sum(matrix**2))
print(f"Manual calculation: {manual:.4f}")

# Other norms
norm_1 = np.linalg.norm(matrix, ord=1)
norm_inf = np.linalg.norm(matrix, ord=np.inf)
print(f"\n1-norm: {norm_1}")
print(f"Infinity norm: {norm_inf}")

print("\n✅ Frobenius norm formula:")
print("||A||_F = √(Σ|a_ij|²)")
print(f"= √(1² + 2² + 3² + 4²) = √30 ≈ {frobenius_norm:.4f}")
print("\nMeasures the 'size' or 'magnitude' of a matrix")

### Q60: Use np.random.choice() to sample with and without replacement.

Sample 5 elements from an array.

**Difficulty:** Easy  
**Topic:** Random Sampling

In [None]:
# Solution Q60
np.random.seed(42)
arr = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
print("Original array:", arr)

# Sample WITH replacement (can pick same element multiple times)
sample_with = np.random.choice(arr, size=5, replace=True)
print(f"\nWith replacement (size=5): {sample_with}")
print("Notice: elements can repeat")

# Sample WITHOUT replacement (each element picked at most once)
sample_without = np.random.choice(arr, size=5, replace=False)
print(f"\nWithout replacement (size=5): {sample_without}")
print("Notice: all elements are unique")

# With probabilities (weighted sampling)
probabilities = np.array([0.05, 0.05, 0.1, 0.1, 0.2, 0.2, 0.1, 0.1, 0.05, 0.05])
sample_weighted = np.random.choice(arr, size=100, p=probabilities)
print(f"\nWeighted sampling (100 samples):")
print(f"Most common value: {np.bincount(sample_weighted // 10).argmax() * 10}")
print("(50 and 60 have highest probability: 0.2)")

print("\n✅ Parameters:")
print("replace=True  → Can pick same element multiple times")
print("replace=False → Each element picked at most once")
print("p=probabilities → Weighted sampling")

---
# Section 5: Performance & Real-World Problems (Q61-75)

Testing optimization techniques, memory management, and practical applications.

### Q61: Compare performance: Python loop vs NumPy vectorization.

Calculate squares of 1 million numbers.

**Difficulty:** Easy  
**Topic:** Performance

In [None]:
# Solution Q61
import time

n = 1_000_000
arr = np.arange(n)

# Method 1: Python loop
# (time.perf_counter() is the high-resolution clock intended for timing)
start = time.perf_counter()
result_loop = [x**2 for x in arr]
time_loop = time.perf_counter() - start

# Method 2: NumPy vectorized
start = time.perf_counter()
result_numpy = arr ** 2
time_numpy = time.perf_counter() - start

print(f"Array size: {n:,} elements\n")
print(f"Python loop: {time_loop:.4f} seconds")
print(f"NumPy vectorized: {time_numpy:.4f} seconds")
print(f"\nSpeedup: {time_loop/time_numpy:.1f}x faster!")

print("\n✅ Why NumPy is faster:")
print("1. Written in C (compiled, not interpreted)")
print("2. Vectorized operations (no Python loop overhead)")
print("3. Optimized memory access patterns")
print("4. Uses SIMD instructions when available")
print("\n💡 ALWAYS use vectorized operations in NumPy!")

### Q62: Explain the difference between shallow copy and deep copy in NumPy.

**Difficulty:** Medium  
**Topic:** Memory Management

In [None]:
# Solution Q62
original = np.array([[1, 2, 3], [4, 5, 6]])
print("Original array:")
print(original)

# Assignment (no copy - same object)
assigned = original
assigned[0, 0] = 999
print("\nAfter assignment and modifying assigned[0,0]=999:")
print("Original:", original)
print("Assigned:", assigned)
print("Same object? ", assigned is original)

# Reset
original = np.array([[1, 2, 3], [4, 5, 6]])

# View (shallow copy - shares data)
view = original.view()
view[0, 0] = 999
print("\nAfter view() and modifying view[0,0]=999:")
print("Original:", original)
print("View:", view)
print("Shares data? ", view.base is original)

# Reset
original = np.array([[1, 2, 3], [4, 5, 6]])

# Deep copy (independent)
deep = original.copy()
deep[0, 0] = 999
print("\nAfter copy() and modifying deep[0,0]=999:")
print("Original:", original)
print("Deep copy:", deep)
print("Independent? ", deep.base is None)

print("\n✅ Summary:")
print("Assignment (=)  → Same object, same data")
print("View (.view()) → New object, shares data")
print("Copy (.copy()) → New object, independent data")
print("\n💡 Use .copy() when you need independence!")
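
A related interview point is which indexing operations return views and which return copies. `np.shares_memory()` can check this directly. A small sketch; the variable names are illustrative:

```python
import numpy as np

a = np.arange(6)

sliced = a[1:4]        # basic slicing returns a view
fancy = a[[1, 2, 3]]   # fancy (integer-array) indexing returns a copy
masked = a[a > 2]      # boolean indexing returns a copy

print("slice shares memory:", np.shares_memory(a, sliced))
print("fancy shares memory:", np.shares_memory(a, fancy))
print("mask shares memory: ", np.shares_memory(a, masked))
```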

### Q63: Implement image brightness adjustment using NumPy.

Increase brightness of a grayscale image by 50 units.

**Difficulty:** Easy  
**Topic:** Real-World Application

In [None]:
# Solution Q63
# Simulate a grayscale image (uint8; randint(0, 200) yields values 0-199)
np.random.seed(42)
image = np.random.randint(0, 200, size=(5, 5), dtype=np.uint8)

print("Original image (5x5 pixels):")
print(image)
print(f"Min: {image.min()}, Max: {image.max()}, Mean: {image.mean():.1f}")

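# Increase brightness by 50. Widen to int16 first: uint8 arithmetic wraps around
# (e.g. 250 + 50 -> 44), so clipping after a uint8 addition would be too late.
brightened = np.clip(image.astype(np.int16) + 50, 0, 255).astype(np.uint8)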

print("\nBrightened image (+50):")
print(brightened)
print(f"Min: {brightened.min()}, Max: {brightened.max()}, Mean: {brightened.mean():.1f}")

# Decrease brightness (widen first, otherwise values below 50 wrap around to 206-255)
darkened = np.clip(image.astype(np.int16) - 50, 0, 255).astype(np.uint8)
print("\nDarkened image (-50):")
print(darkened)

# Adjust contrast (multiply by factor)
contrast_increased = np.clip(image * 1.5, 0, 255).astype(np.uint8)
print("\nIncreased contrast (×1.5):")
print(contrast_increased)

print("\n✅ Image operations:")
print("Brightness: Add/subtract constant")
print("Contrast: Multiply by factor")
print("Widen the dtype before adding, then clip to [0, 255] and cast back to uint8")
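
The reason for widening the dtype is easiest to see on a tiny example. The sketch below uses three hand-picked pixel values and a hypothetical helper `adjust_brightness` (not part of the solution above) to contrast wrapped and saturated results:

```python
import numpy as np

pixels = np.array([10, 100, 250], dtype=np.uint8)

def adjust_brightness(img, delta):
    """Shift uint8 pixel values by delta, saturating at 0 and 255."""
    widened = img.astype(np.int16) + delta   # int16 has room for -255..510
    return np.clip(widened, 0, 255).astype(np.uint8)

wrapped = pixels + np.uint8(50)              # uint8 arithmetic wraps modulo 256
saturated = adjust_brightness(pixels, 50)
darkened_demo = adjust_brightness(pixels, -50)

print("Wrapped:  ", wrapped)        # 250 + 50 wraps to 44
print("Saturated:", saturated)      # 250 + 50 saturates at 255
print("Darkened: ", darkened_demo)  # 10 - 50 saturates at 0
```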

### Q64: Calculate portfolio returns and risk.

Given stock prices and holdings, calculate total value and volatility.

**Difficulty:** Medium  
**Topic:** Financial Analysis

In [None]:
# Solution Q64
# Stock data: [price, shares]
stocks = np.array([
    [150.0, 10],   # Stock A
    [75.0, 20],    # Stock B
    [200.0, 5],    # Stock C
    [50.0, 30]     # Stock D
])

prices = stocks[:, 0]
shares = stocks[:, 1]

print("Portfolio:")
for i, (price, share) in enumerate(stocks):
    print(f"Stock {chr(65+i)}: ${price:.2f} × {share:.0f} shares")

# Calculate total value
values = prices * shares
total_value = np.sum(values)
print(f"\nTotal portfolio value: ${total_value:,.2f}")

# Portfolio weights
weights = values / total_value
print("\nPortfolio allocation:")
for i, weight in enumerate(weights):
    print(f"Stock {chr(65+i)}: {weight*100:.1f}%")

# Simulate daily returns (random for demo)
np.random.seed(42)
daily_returns = np.random.randn(30, 4) * 0.02  # 30 days, 4 stocks

# Calculate portfolio daily returns
portfolio_returns = np.dot(daily_returns, weights)

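# Risk metrics
mean_return = np.mean(portfolio_returns)
volatility = np.std(portfolio_returns)  # population std (ddof=0)
# Simplified Sharpe ratio: assumes a zero risk-free rate and is not annualized
sharpe_ratio = mean_return / volatility if volatility > 0 else 0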

print(f"\nRisk Metrics (30-day simulation):")
print(f"Average daily return: {mean_return*100:.3f}%")
print(f"Volatility (std dev): {volatility*100:.3f}%")
print(f"Sharpe ratio: {sharpe_ratio:.3f}")

print("\n✅ Key concepts:")
print("Portfolio value = Σ(price × shares)")
print("Weights = individual values / total value")
print("Volatility = standard deviation of returns")
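
Portfolio volatility can also be obtained from the covariance matrix of the individual stocks as the square root of w·Σ·w. A sketch under stated assumptions: the returns are simulated and `weights_demo` is a hypothetical allocation, not the one computed above.

```python
import numpy as np

rng = np.random.default_rng(42)
returns_demo = rng.normal(0.0, 0.02, size=(30, 4))  # simulated daily returns
weights_demo = np.array([0.3, 0.3, 0.2, 0.2])       # hypothetical weights summing to 1

cov = np.cov(returns_demo, rowvar=False)            # 4x4 sample covariance (ddof=1)
var_from_cov = weights_demo @ cov @ weights_demo

# Same quantity computed directly from the portfolio return series
var_direct = np.var(returns_demo @ weights_demo, ddof=1)

print(f"Volatility via covariance: {np.sqrt(var_from_cov):.5f}")
print(f"Volatility via series:     {np.sqrt(var_direct):.5f}")
```

The two routes agree only when the same `ddof` is used in both; `np.cov` defaults to `ddof=1` while `np.std` and `np.var` default to `ddof=0`.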

### Q65: Implement k-nearest neighbors distance calculation.

Find k=3 nearest neighbors to a query point.

**Difficulty:** Medium  
**Topic:** Machine Learning

In [None]:
# Solution Q65
# Training data points
training_data = np.array([
    [1, 2],
    [2, 3],
    [3, 1],
    [6, 5],
    [7, 7],
    [8, 6]
])

# Query point
query = np.array([5, 4])
k = 3

print("Training data:")
print(training_data)
print(f"\nQuery point: {query}")
print(f"Find k={k} nearest neighbors\n")

# Calculate Euclidean distances
distances = np.sqrt(np.sum((training_data - query)**2, axis=1))
print("Distances to query:")
for i, dist in enumerate(distances):
    print(f"Point {i} {training_data[i]}: {dist:.3f}")

# Find k nearest indices
nearest_indices = np.argsort(distances)[:k]
nearest_points = training_data[nearest_indices]
nearest_distances = distances[nearest_indices]

print(f"\n{k} Nearest Neighbors:")
for idx, point, dist in zip(nearest_indices, nearest_points, nearest_distances):
    print(f"Point {idx} {point}: distance = {dist:.3f}")

# Predict class (if we had labels)
labels = np.array([0, 0, 0, 1, 1, 1])  # Class labels
nearest_labels = labels[nearest_indices]
predicted_class = np.bincount(nearest_labels).argmax()

print(f"\nClass labels: {labels}")
print(f"Nearest labels: {nearest_labels}")
print(f"Predicted class: {predicted_class}")

print("\n✅ KNN algorithm:")
print("1. Calculate distances to all training points")
print("2. Sort and select k nearest")
print("3. Majority vote for classification")
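
Sorting every distance is more work than needed when only the k smallest matter. A sketch using `np.argpartition`, reusing the same six points under illustrative names. Note that three of these points are equally far from the query, so which of them ends up as the third neighbor (and therefore the majority vote) depends on tie-breaking.

```python
import numpy as np

points = np.array([[1, 2], [2, 3], [3, 1], [6, 5], [7, 7], [8, 6]])
query_pt = np.array([5, 4])
k_demo = 3

# Squared distances preserve the ordering, so the sqrt can be skipped
sq_dists = np.sum((points - query_pt) ** 2, axis=1)

# argpartition puts the k smallest entries first, in arbitrary order
candidates = np.argpartition(sq_dists, k_demo - 1)[:k_demo]
nearest = candidates[np.argsort(sq_dists[candidates])]  # sort only the k winners

print("Squared distances:", sq_dists)
print("Nearest indices:  ", nearest)
```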

### Q66: Implement Min-Max normalization for machine learning preprocessing.

Normalize features to [0, 1] range.

**Difficulty:** Easy  
**Topic:** Data Preprocessing

In [None]:
# Solution Q66
# Dataset with different scales
data = np.array([
    [25, 50000, 3],      # Age, Salary, Years
    [35, 75000, 8],
    [45, 90000, 15],
    [22, 45000, 1],
    [50, 120000, 20]
])

print("Original data (Age, Salary, Years):")
print(data)
print(f"\nMin per column: {data.min(axis=0)}")
print(f"Max per column: {data.max(axis=0)}")

# Min-Max normalization: (x - min) / (max - min)
min_vals = data.min(axis=0)
max_vals = data.max(axis=0)
normalized = (data - min_vals) / (max_vals - min_vals)

print("\nNormalized data [0, 1]:")
print(normalized.round(3))

# Verify
print(f"\nMin per column: {normalized.min(axis=0)}")
print(f"Max per column: {normalized.max(axis=0)}")

# Denormalize (reverse the process)
denormalized = normalized * (max_vals - min_vals) + min_vals
print("\nDenormalized (should match original):")
print(denormalized.round(0))

print("\n✅ Min-Max Normalization:")
print("Formula: (x - min) / (max - min)")
print("Scales all features to [0, 1]")
print("Preserves the relative spacing of values within each feature")
print("\nUseful when:")
print("- Features have different scales")
print("- Algorithm sensitive to magnitude (e.g., KNN, Neural Networks)")
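
The formula divides by zero when a column is constant, and in a real pipeline the min and max would normally come from the training data only. A sketch of a hypothetical helper `min_max_scale` that covers both points; the sample arrays are made up for illustration:

```python
import numpy as np

def min_max_scale(train, other=None):
    """Scale columns to [0, 1] using min/max taken from `train` only."""
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span = np.where(span == 0, 1, span)   # constant column: avoid dividing by zero
    target = train if other is None else other
    return (target - lo) / span

train_demo = np.array([[25, 3, 7], [35, 8, 7], [50, 20, 7]], dtype=float)
test_demo = np.array([[30, 10, 7]], dtype=float)

train_scaled = min_max_scale(train_demo)
test_scaled = min_max_scale(train_demo, test_demo)

print(train_scaled.round(3))
print(test_scaled.round(3))
```

Values in `other` that fall outside the training range will land outside [0, 1], which is expected behavior for this approach.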

### Q67: Calculate confusion matrix metrics (precision, recall, F1-score).

Given predictions and actual labels.

**Difficulty:** Medium  
**Topic:** Machine Learning Metrics

In [None]:
# Solution Q67
# Binary classification results
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 1])

print("Actual:   ", y_true)
print("Predicted:", y_pred)

# Calculate confusion matrix components
true_positive = np.sum((y_true == 1) & (y_pred == 1))
true_negative = np.sum((y_true == 0) & (y_pred == 0))
false_positive = np.sum((y_true == 0) & (y_pred == 1))
false_negative = np.sum((y_true == 1) & (y_pred == 0))

print("\nConfusion Matrix:")
print(f"                Predicted")
print(f"              0      1")
print(f"Actual  0    {true_negative}      {false_positive}")
print(f"        1    {false_negative}      {true_positive}")

# Calculate metrics
accuracy = (true_positive + true_negative) / len(y_true)
precision = true_positive / (true_positive + false_positive) if (true_positive + false_positive) > 0 else 0
recall = true_positive / (true_positive + false_negative) if (true_positive + false_negative) > 0 else 0
f1_score = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0

print(f"\nMetrics:")
print(f"Accuracy:  {accuracy:.3f} ({accuracy*100:.1f}%)")
print(f"Precision: {precision:.3f} (of predicted positives, {precision*100:.1f}% were correct)")
print(f"Recall:    {recall:.3f} (of actual positives, {recall*100:.1f}% were found)")
print(f"F1-Score:  {f1_score:.3f} (harmonic mean of precision and recall)")

print("\n✅ Formulas:")
print("Precision = TP / (TP + FP)")
print("Recall = TP / (TP + FN)")
print("F1 = 2 × (Precision × Recall) / (Precision + Recall)")
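
The four counts can also be gathered in one step, which extends naturally to more than two classes. A sketch that encodes each (actual, predicted) pair as a single integer and counts with `np.bincount`; the `_demo` names are illustrative copies of the arrays above:

```python
import numpy as np

y_true_demo = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
y_pred_demo = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 1])
n_cls = 2

# Encode each (actual, predicted) pair as one integer, then count occurrences
pair_codes = y_true_demo * n_cls + y_pred_demo
cm = np.bincount(pair_codes, minlength=n_cls ** 2).reshape(n_cls, n_cls)
print(cm)  # rows = actual class, columns = predicted class

correct = np.diag(cm)
precision_per_class = correct / cm.sum(axis=0)   # column sums = predicted counts
recall_per_class = correct / cm.sum(axis=1)      # row sums = actual counts
print("Precision per class:", precision_per_class.round(3))
print("Recall per class:   ", recall_per_class.round(3))
```

This sketch assumes every class is predicted and present at least once; otherwise the divisions need the same zero guards used in the solution above.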

### Q68: Implement one-hot encoding for categorical variables.

Convert categorical labels to binary matrix.

**Difficulty:** Easy  
**Topic:** Data Preprocessing

In [None]:
# Solution Q68
# Categorical labels (0, 1, 2 representing different classes)
labels = np.array([0, 2, 1, 0, 2, 1, 1, 0])
n_classes = 3

print("Original labels:", labels)
print(f"Number of classes: {n_classes}\n")

# Method 1: Using np.eye (identity matrix trick)
one_hot1 = np.eye(n_classes)[labels]
print("Method 1 - Using np.eye:")
print(one_hot1.astype(int))

# Method 2: Manual creation
one_hot2 = np.zeros((len(labels), n_classes))
one_hot2[np.arange(len(labels)), labels] = 1
print("\nMethod 2 - Manual:")
print(one_hot2.astype(int))

# Reverse: one-hot to labels
labels_recovered = np.argmax(one_hot1, axis=1)
print(f"\nRecovered labels: {labels_recovered}")
print(f"Match original? {np.array_equal(labels, labels_recovered)}")

# Example with string categories
categories = np.array(['cat', 'dog', 'bird', 'cat', 'bird'])
unique_categories = np.unique(categories)
label_map = {cat: i for i, cat in enumerate(unique_categories)}
numeric_labels = np.array([label_map[cat] for cat in categories])
one_hot_cats = np.eye(len(unique_categories))[numeric_labels]

print(f"\nString categories: {categories}")
print(f"Unique: {unique_categories}")
print(f"Numeric labels: {numeric_labels}")
print("One-hot encoded:")
print(one_hot_cats.astype(int))

print("\n✅ One-hot encoding:")
print("Converts categorical variables to binary vectors")
print("Each category gets its own column with 1/0 values")
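
For string categories, the dictionary-based mapping can be replaced by `np.unique` with `return_inverse=True`, which returns the integer codes directly. A brief sketch with illustrative variable names:

```python
import numpy as np

categories_demo = np.array(['cat', 'dog', 'bird', 'cat', 'bird'])

# return_inverse gives, for each element, its index into the sorted unique values
uniques, codes = np.unique(categories_demo, return_inverse=True)
one_hot = np.eye(len(uniques), dtype=int)[codes]

print("Unique:", uniques)
print("Codes: ", codes)
print(one_hot)
```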

### Q69: Implement train-test split for machine learning.

Split data into 80% training and 20% testing.

**Difficulty:** Easy  
**Topic:** Data Preprocessing

In [None]:
# Solution Q69
# Sample dataset
np.random.seed(42)
X = np.random.rand(100, 5)  # 100 samples, 5 features
y = np.random.randint(0, 2, 100)  # Binary labels

print(f"Dataset: {X.shape[0]} samples, {X.shape[1]} features")
print(f"Labels: {y.shape[0]} samples\n")

# Split ratio
train_ratio = 0.8
n_samples = len(X)
n_train = int(n_samples * train_ratio)

# Method 1: Simple split (no shuffling)
X_train_simple = X[:n_train]
X_test_simple = X[n_train:]
y_train_simple = y[:n_train]
y_test_simple = y[n_train:]

print("Method 1 - Simple split:")
print(f"Train: {X_train_simple.shape}, Test: {X_test_simple.shape}")

# Method 2: Random shuffle then split
indices = np.arange(n_samples)
np.random.shuffle(indices)

X_shuffled = X[indices]
y_shuffled = y[indices]

X_train = X_shuffled[:n_train]
X_test = X_shuffled[n_train:]
y_train = y_shuffled[:n_train]
y_test = y_shuffled[n_train:]

print(f"\nMethod 2 - Shuffled split:")
print(f"Train: {X_train.shape}, Test: {X_test.shape}")
print(f"Train labels: {y_train.shape}, Test labels: {y_test.shape}")

# Verify split
print(f"\nVerification:")
print(f"Total samples: {n_samples}")
print(f"Train samples: {len(X_train)} ({len(X_train)/n_samples*100:.0f}%)")
print(f"Test samples: {len(X_test)} ({len(X_test)/n_samples*100:.0f}%)")

print("\n✅ Best practice:")
print("1. Shuffle data before splitting (avoid order bias)")
print("2. Common ratios: 80-20, 70-30, 60-20-20 (train-val-test)")
print("3. Set random seed for reproducibility")
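
The shuffled split can be wrapped in a reusable function. The sketch below uses `np.random.default_rng`, the generator interface recommended for new code; the function name `train_test_split_np` and its parameters are hypothetical:

```python
import numpy as np

def train_test_split_np(X, y, test_ratio=0.2, seed=0):
    """Shuffle row indices once, then slice them into test and train parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(len(X) * test_ratio))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

X_demo = np.arange(500).reshape(100, 5)   # 100 distinct rows
y_demo = np.arange(100) % 2

X_tr, X_te, y_tr, y_te = train_test_split_np(X_demo, y_demo)
print(X_tr.shape, X_te.shape, y_tr.shape, y_te.shape)
```

Using one permutation for both `X` and `y` keeps features and labels aligned.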

### Q70: Calculate cosine similarity between vectors.

Measure similarity between two documents/vectors.

**Difficulty:** Medium  
**Topic:** Similarity Metrics

In [None]:
# Solution Q70
# Two document vectors (word frequencies)
doc1 = np.array([3, 2, 0, 5])
doc2 = np.array([1, 0, 0, 0])
doc3 = np.array([3, 2, 0, 4])

print("Document vectors:")
print(f"Doc 1: {doc1}")
print(f"Doc 2: {doc2}")
print(f"Doc 3: {doc3}")

# Cosine similarity formula: (A·B) / (||A|| × ||B||)
def cosine_similarity(a, b):
    dot_product = np.dot(a, b)
    norm_a = np.linalg.norm(a)
    norm_b = np.linalg.norm(b)
    return dot_product / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0

# Calculate similarities
sim_1_2 = cosine_similarity(doc1, doc2)
sim_1_3 = cosine_similarity(doc1, doc3)
sim_2_3 = cosine_similarity(doc2, doc3)

print(f"\nCosine similarities:")
print(f"Doc1 vs Doc2: {sim_1_2:.4f}")
print(f"Doc1 vs Doc3: {sim_1_3:.4f}")
print(f"Doc2 vs Doc3: {sim_2_3:.4f}")

print("\n✅ Interpretation:")
print(f"Doc1 and Doc3 are most similar ({sim_1_3:.4f})")
print("Cosine similarity ranges from -1 to 1 (0 to 1 for non-negative vectors like word counts):")
print("  1.0 = identical direction")
print("  0.0 = orthogonal (no similarity)")
print(" -1.0 = opposite direction")

# Pairwise similarity matrix
docs = np.array([doc1, doc2, doc3])
n = len(docs)
similarity_matrix = np.zeros((n, n))

for i in range(n):
    for j in range(n):
        similarity_matrix[i, j] = cosine_similarity(docs[i], docs[j])

print("\nPairwise similarity matrix:")
print(similarity_matrix.round(4))

print("\n💡 Used in:")
print("- Document similarity")
print("- Recommendation systems")
print("- Text mining")
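
The nested loop for the pairwise matrix can be replaced by one matrix product after normalizing each row to unit length. A sketch that assumes no document vector is all zeros; `docs_demo` repeats the three vectors above:

```python
import numpy as np

docs_demo = np.array([[3, 2, 0, 5],
                      [1, 0, 0, 0],
                      [3, 2, 0, 4]], dtype=float)

norms = np.linalg.norm(docs_demo, axis=1, keepdims=True)   # shape (3, 1)
unit = docs_demo / norms                                    # assumes no all-zero rows
sim = unit @ unit.T                                         # all pairwise dot products at once

print(sim.round(4))
```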

### Q71: Implement batch processing for large datasets.

Process data in batches to manage memory.

**Difficulty:** Medium  
**Topic:** Memory Management

In [None]:
# Solution Q71
# Large dataset simulation
n_samples = 1000
n_features = 10
data = np.random.rand(n_samples, n_features)

print(f"Dataset: {n_samples} samples, {n_features} features")
print(f"Memory: {data.nbytes / 1024:.2f} KB\n")

# Process in batches
batch_size = 100
n_batches = int(np.ceil(n_samples / batch_size))

print(f"Processing in batches of {batch_size}")
print(f"Total batches: {n_batches}\n")

# Example: Calculate mean for each batch
batch_means = []

for i in range(n_batches):
    start_idx = i * batch_size
    end_idx = min((i + 1) * batch_size, n_samples)
    
    batch = data[start_idx:end_idx]
    batch_mean = batch.mean(axis=0)
    batch_means.append(batch_mean)
    
    print(f"Batch {i+1}: samples {start_idx:4d}-{end_idx:4d}, shape {batch.shape}")

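# Combine batch results (a plain mean of batch means is exact only because
# every batch here has the same size; weight by batch size otherwise)
all_batch_means = np.array(batch_means)
overall_mean = all_batch_means.mean(axis=0)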

print(f"\nOverall mean (from batches): {overall_mean[:3].round(4)}...")

# Verify with direct calculation
direct_mean = data.mean(axis=0)
print(f"Direct mean: {direct_mean[:3].round(4)}...")
print(f"Match? {np.allclose(overall_mean, direct_mean)}")

# Generator function for batches
def batch_generator(data, batch_size):
    n_samples = len(data)
    for i in range(0, n_samples, batch_size):
        yield data[i:min(i + batch_size, n_samples)]

print("\nUsing generator:")
for i, batch in enumerate(batch_generator(data, batch_size)):
    if i < 3:  # Show first 3
        print(f"Batch {i+1}: {batch.shape}")

print("\n✅ Batch processing benefits:")
print("- Reduces peak memory usage for intermediate results")
print("- Enables processing of datasets larger than RAM (when batches are loaded lazily, e.g. from disk)")
print("- Useful for training neural networks")
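
Averaging the batch means is exact only when every batch has the same number of rows. A sketch that stays correct with a short final batch by accumulating sums and counts instead; the dataset size of 1050 is chosen on purpose so the last batch is half full:

```python
import numpy as np

rng = np.random.default_rng(0)
data_demo = rng.random((1050, 4))      # 1050 rows: the last batch of 100 is half full
batch = 100

running_sum = np.zeros(data_demo.shape[1])
count = 0
for start in range(0, len(data_demo), batch):
    chunk = data_demo[start:start + batch]   # slicing past the end is safe
    running_sum += chunk.sum(axis=0)
    count += len(chunk)

streamed_mean = running_sum / count
print("Streamed mean:", streamed_mean.round(4))
print("Direct mean:  ", data_demo.mean(axis=0).round(4))
```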

### Q72: Detect and handle outliers using IQR method.

Remove outliers from a dataset.

**Difficulty:** Medium  
**Topic:** Data Cleaning

In [None]:
# Solution Q72
# Data with outliers
np.random.seed(42)
data = np.concatenate([
    np.random.normal(50, 10, 95),  # Normal data
    np.array([150, 200, -50, 180, 190])  # Outliers
])

print(f"Dataset: {len(data)} values")
print(f"Min: {data.min():.1f}, Max: {data.max():.1f}, Mean: {data.mean():.1f}")

# IQR method for outlier detection
Q1 = np.percentile(data, 25)
Q3 = np.percentile(data, 75)
IQR = Q3 - Q1

# Define outlier bounds
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

print(f"\nIQR Analysis:")
print(f"Q1 (25th percentile): {Q1:.2f}")
print(f"Q3 (75th percentile): {Q3:.2f}")
print(f"IQR: {IQR:.2f}")
print(f"\nOutlier bounds:")
print(f"Lower: {lower_bound:.2f}")
print(f"Upper: {upper_bound:.2f}")

# Identify outliers
outliers = (data < lower_bound) | (data > upper_bound)
n_outliers = np.sum(outliers)

print(f"\nOutliers detected: {n_outliers}")
print(f"Outlier values: {data[outliers]}")

# Remove outliers
data_clean = data[~outliers]
print(f"\nAfter removing outliers:")
print(f"Remaining: {len(data_clean)} values")
print(f"Min: {data_clean.min():.1f}, Max: {data_clean.max():.1f}, Mean: {data_clean.mean():.1f}")

# Alternative: Cap outliers instead of removing
data_capped = np.clip(data, lower_bound, upper_bound)
print(f"\nAfter capping outliers:")
print(f"Min: {data_capped.min():.1f}, Max: {data_capped.max():.1f}, Mean: {data_capped.mean():.1f}")

print("\n✅ IQR Method:")
print("Outliers = values < Q1 - 1.5×IQR or > Q3 + 1.5×IQR")
print("\nOptions:")
print("1. Remove outliers (reduces dataset size)")
print("2. Cap outliers (preserves dataset size)")
print("3. Transform data (log, sqrt)")

### Q73: Implement simple linear regression from scratch.

Fit y = mx + b to data points.

**Difficulty:** Hard  
**Topic:** Machine Learning

In [None]:
# Solution Q73
# Generate data: y = 2x + 3 with noise
np.random.seed(42)
x = np.linspace(0, 10, 50)
y_true = 2 * x + 3
y = y_true + np.random.randn(50) * 2

print("Fitting linear regression: y = mx + b")
print(f"True parameters: m=2, b=3\n")

# Method 1: Using formulas
# m = (n*Σxy - Σx*Σy) / (n*Σx² - (Σx)²)
# b = (Σy - m*Σx) / n

n = len(x)
sum_x = np.sum(x)
sum_y = np.sum(y)
sum_xy = np.sum(x * y)
sum_x2 = np.sum(x ** 2)

m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
b = (sum_y - m * sum_x) / n

print(f"Method 1 - Formulas:")
print(f"Slope (m): {m:.4f}")
print(f"Intercept (b): {b:.4f}")

# Method 2: Using numpy polyfit
coeffs = np.polyfit(x, y, deg=1)
m2, b2 = coeffs
print(f"\nMethod 2 - np.polyfit:")
print(f"Slope (m): {m2:.4f}")
print(f"Intercept (b): {b2:.4f}")

# Make predictions
y_pred = m * x + b

# Calculate R² score
ss_res = np.sum((y - y_pred)**2)
ss_tot = np.sum((y - y.mean())**2)
r2 = 1 - (ss_res / ss_tot)

print(f"\nModel evaluation:")
print(f"R² score: {r2:.4f}")

# Calculate MSE and RMSE
mse = np.mean((y - y_pred)**2)
rmse = np.sqrt(mse)
print(f"MSE: {mse:.4f}")
print(f"RMSE: {rmse:.4f}")

print("\n✅ Linear regression finds best-fit line")
print("Minimizes sum of squared errors")
print("R² closer to 1 = better fit")
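
The same fit can be written as a least-squares problem on a design matrix, which generalizes to several features. A sketch using `np.linalg.lstsq` on freshly simulated data; the `_demo` names and the noise level are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
x_demo = np.linspace(0, 10, 50)
y_demo = 2 * x_demo + 3 + rng.normal(0, 2, size=50)

# Design matrix: one column for x, one column of ones for the intercept
A = np.column_stack([x_demo, np.ones_like(x_demo)])
(slope, intercept), *_ = np.linalg.lstsq(A, y_demo, rcond=None)

print(f"Slope: {slope:.4f}, Intercept: {intercept:.4f}")
```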

### Q74: Implement Monte Carlo simulation to estimate π.

Use random sampling to approximate π.

**Difficulty:** Medium  
**Topic:** Simulation
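
One possible approach, sketched here under the assumption that the usual quarter-circle method is intended: draw uniform points in the unit square and count the fraction that lands inside the quarter circle of radius 1. That fraction approximates π/4. The sample size and seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n_points = 1_000_000   # illustrative sample size

# Uniform points in the unit square [0, 1) x [0, 1)
xy = rng.random((n_points, 2))
inside = np.sum(xy[:, 0] ** 2 + xy[:, 1] ** 2 <= 1.0)  # points within the quarter circle

pi_estimate = 4 * inside / n_points
print(f"Estimated pi: {pi_estimate:.5f}")
print(f"Error: {abs(pi_estimate - np.pi):.5f}")
```

The error shrinks roughly as 1/sqrt(n), so each extra digit of accuracy costs about 100 times more samples.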

---
# Section 6: Additional Exam-Style Questions (Q76-90)

Covering file I/O, structured arrays, dtype operations, and other commonly tested topics.

### Q76: Load data from a CSV file with headers using NumPy.

Given a CSV file `data.csv`:
```
Name,Age,Score
Alice,25,85
Bob,30,92
Charlie,35,78
```

How do you load this data?

**Difficulty:** Easy  
**Topic:** File I/O

In [None]:
# Solution Q76
# Note: this absolute path is machine-specific; point csv_path at your own copy of the CSV
csv_path = '/Users/rishan/upgrad/pythonlearning/numpy/data_header.csv'

# Method 1: Using np.genfromtxt (handles headers)
data = np.genfromtxt(csv_path, delimiter=',', skip_header=1, dtype=None, encoding='utf-8')
print("Method 1 - np.genfromtxt (skip_header=1):")
print(data)
print(f"Type: {type(data)}, Shape: {data.shape}\n")

# Method 2: Load with names (structured array)
data_structured = np.genfromtxt(csv_path, delimiter=',', names=True, dtype=None, encoding='utf-8')
print("Method 2 - np.genfromtxt (names=True):")
print(data_structured)
print(f"\nColumn names: {data_structured.dtype.names}")
print(f"Ages: {data_structured['Age']}")
print(f"Scores: {data_structured['Score']}\n")

# Method 3: Load only numeric columns
numeric_data = np.loadtxt(csv_path, delimiter=',', skiprows=1, usecols=(1, 2))
print("Method 3 - np.loadtxt (numeric columns only):")
print(numeric_data)
print(f"Shape: {numeric_data.shape}")

print("\n✅ Key Functions:")
print("np.genfromtxt() - More flexible, handles missing data")
print("  - skip_header=1: Skip first row")
print("  - names=True: Use first row as column names")
print("  - dtype=None: Auto-detect types")
print("\nnp.loadtxt() - Faster but less flexible")
print("  - skiprows=1: Skip header")
print("  - usecols=(1,2): Select specific columns")

print("\n💡 For mixed data types (text + numbers), use genfromtxt with names=True")
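
To try the loading calls without a file on disk, the CSV text from the question can be fed through `io.StringIO`, since `np.genfromtxt` accepts file-like objects. A self-contained sketch (the variable names are illustrative):

```python
import io
import numpy as np

csv_text = """Name,Age,Score
Alice,25,85
Bob,30,92
Charlie,35,78
"""

# StringIO stands in for a file on disk
table = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True,
                      dtype=None, encoding='utf-8')

print(table.dtype.names)
print("Mean score:", table['Score'].mean())
```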

### Q77: Save and load NumPy arrays in binary format (.npy and .npz).

**Difficulty:** Easy  
**Topic:** File I/O

In [None]:
# Solution Q77
import os

# Create sample arrays
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([10, 20, 30, 40, 50])

print("Original arrays:")
print("arr1:")
print(arr1)
print("\narr2:", arr2)

# Save single array (.npy)
np.save('/tmp/single_array.npy', arr1)
print("\n✅ Saved arr1 to single_array.npy")

# Load single array
loaded_arr1 = np.load('/tmp/single_array.npy')
print("\nLoaded from .npy file:")
print(loaded_arr1)
print(f"Arrays match: {np.array_equal(arr1, loaded_arr1)}")

# Save multiple arrays (.npz)
np.savez('/tmp/multiple_arrays.npz', first=arr1, second=arr2)
print("\n✅ Saved multiple arrays to multiple_arrays.npz")

# Load multiple arrays
loaded_data = np.load('/tmp/multiple_arrays.npz')
print("\nLoaded from .npz file:")
print(f"Available arrays: {list(loaded_data.keys())}")
print(f"first:\n{loaded_data['first']}")
print(f"second: {loaded_data['second']}")

# Compressed save (.npz with compression)
np.savez_compressed('/tmp/compressed.npz', arr1=arr1, arr2=arr2)
print("\n✅ Saved compressed arrays")

# Check file sizes
npy_size = os.path.getsize('/tmp/single_array.npy')
npz_size = os.path.getsize('/tmp/multiple_arrays.npz')
compressed_size = os.path.getsize('/tmp/compressed.npz')

print(f"\nFile sizes:")
print(f".npy (single): {npy_size} bytes")
print(f".npz (multiple): {npz_size} bytes")
print(f".npz compressed: {compressed_size} bytes")

print("\n✅ Summary:")
print("np.save()  → Save single array (.npy)")
print("np.load()  → Load array")
print("np.savez() → Save multiple arrays (.npz)")
print("np.savez_compressed() → Save compressed (.npz)")
print("\n💡 Binary format is faster and preserves exact values")
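
The hard-coded `/tmp` paths above do not exist on every platform, and an `.npz` archive stays open until it is closed. A portable sketch using `tempfile` and a `with` block around `np.load`; the file name `arrays_demo.npz` is arbitrary:

```python
import os
import tempfile
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "arrays_demo.npz")
    np.savez(path, a=a, b=b)

    # NpzFile loads members lazily; the with-block closes the archive afterwards
    with np.load(path) as archive:
        names = sorted(archive.files)
        a_back = archive["a"]
        b_back = archive["b"]

print(names)
print(np.array_equal(a, a_back), np.array_equal(b, b_back))
```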

### Q78: Create and work with structured arrays (record arrays).

Create an array to store employee data: name, age, and salary.

**Difficulty:** Medium  
**Topic:** Structured Arrays

In [None]:
# Solution Q78

# Define structured dtype
dtype = np.dtype([('name', 'U10'), ('age', 'i4'), ('salary', 'f8')])

# Create structured array
employees = np.array([
    ('Alice', 25, 50000.0),
    ('Bob', 30, 65000.0),
    ('Charlie', 35, 75000.0),
    ('Diana', 28, 58000.0)
], dtype=dtype)

print("Structured array:")
print(employees)
print(f"\nDtype: {employees.dtype}")
print(f"Field names: {employees.dtype.names}")

# Access by field name
print("\n✅ Access by field:")
print(f"All names: {employees['name']}")
print(f"All ages: {employees['age']}")
print(f"All salaries: {employees['salary']}")

# Access by index
print("\n✅ Access by index:")
print(f"First employee: {employees[0]}")
print(f"First employee name: {employees[0]['name']}")

# Filter data
print("\n✅ Filtering:")
high_earners = employees[employees['salary'] > 60000]
print(f"Employees earning > 60000:")
print(high_earners)

young_employees = employees[employees['age'] < 30]
print(f"\nEmployees under 30:")
print(young_employees)

# Statistics on fields
print("\n✅ Statistics:")
print(f"Average age: {employees['age'].mean():.1f}")
print(f"Average salary: ${employees['salary'].mean():,.2f}")
print(f"Max salary: ${employees['salary'].max():,.2f}")

# Sort by field
sorted_by_salary = np.sort(employees, order='salary')
print("\n✅ Sorted by salary:")
print(sorted_by_salary)

# Add calculated field (bonus = 10% of salary)
print("\n✅ Calculated field:")
bonuses = employees['salary'] * 0.1
for emp, bonus in zip(employees, bonuses):
    print(f"{emp['name']}: ${bonus:,.2f}")

print("\n✅ Structured arrays are useful for:")
print("- Heterogeneous data (mixed types)")
print("- Named fields (like database records)")
print("- CSV/database import")
print("- Scientific data with metadata")

### Q79: What is the difference between C-order (row-major) and F-order (column-major)?

**Difficulty:** Medium  
**Topic:** Memory Layout

In [None]:
# Solution Q79

# Create arrays with different memory layouts
arr_c = np.array([[1, 2, 3], [4, 5, 6]], order='C')  # C-order (row-major)
arr_f = np.array([[1, 2, 3], [4, 5, 6]], order='F')  # F-order (column-major)

print("Both arrays have same values:")
print(arr_c)
print("\nBut different memory layouts:")
print(f"C-order flags: {arr_c.flags}")
print(f"\nF-order flags: {arr_f.flags}")

# Memory layout visualization
print("\n✅ C-order (row-major) - Default in NumPy/Python:")
print("Memory: [1, 2, 3, 4, 5, 6]")
print("Stores row by row (rows are contiguous)")
print("[[1, 2, 3],")
print(" [4, 5, 6]]")

print("\n✅ F-order (column-major) - Default in Fortran/MATLAB:")
print("Memory: [1, 4, 2, 5, 3, 6]")
print("Stores column by column (columns are contiguous)")
print("[[1, 2, 3],")
print(" [4, 5, 6]]")

# Performance implications
import time

large_c = np.arange(10000000).reshape(10000, 1000)  # C-order (default)
large_f = np.asfortranarray(large_c)  # same values, column-major memory layout

# Row-wise access (C-order is faster)
start = time.time()
sum_c_rows = np.sum(large_c, axis=1)
time_c_rows = time.time() - start

start = time.time()
sum_f_rows = np.sum(large_f, axis=1)
time_f_rows = time.time() - start

print(f"\n✅ Performance (row-wise sum):")
print(f"C-order: {time_c_rows:.4f}s")
print(f"F-order: {time_f_rows:.4f}s")
print(f"C-order is {time_f_rows/time_c_rows:.2f}x faster for row operations")

# Column-wise access (F-order is faster)
start = time.time()
sum_c_cols = np.sum(large_c, axis=0)
time_c_cols = time.time() - start

start = time.time()
sum_f_cols = np.sum(large_f, axis=0)
time_f_cols = time.time() - start

print(f"\n✅ Performance (column-wise sum):")
print(f"C-order: {time_c_cols:.4f}s")
print(f"F-order: {time_f_cols:.4f}s")
print(f"F-order is {time_c_cols/time_f_cols:.2f}x faster for column operations")

print("\n✅ Key Points:")
print("C-order (row-major):")
print("  - Default in NumPy, Python, C")
print("  - Faster for row-wise operations")
print("  - Last index changes fastest")
print("\nF-order (column-major):")
print("  - Default in Fortran, MATLAB, R")
print("  - Faster for column-wise operations")
print("  - First index changes fastest")
print("\n💡 Use order that matches your access pattern!")
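
The layout difference shows up directly in `.strides`, the number of bytes to step in memory when moving one index along each axis. A small sketch with `dtype=np.int64` fixed explicitly so the byte counts are the same on every platform:

```python
import numpy as np

arr_c = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64, order='C')
arr_f = np.asfortranarray(arr_c)   # same values, column-major layout

print("C strides:", arr_c.strides)   # (24, 8): next row is 24 bytes away, next column 8
print("F strides:", arr_f.strides)   # (8, 16): next row is 8 bytes away, next column 16

# order='K' reads elements in the order they sit in memory
mem_c = arr_c.ravel(order='K')
mem_f = arr_f.ravel(order='K')
print("C memory order:", mem_c)
print("F memory order:", mem_f)
```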

### Q80: Convert between different data types (dtype) and understand precision.

**Difficulty:** Easy  
**Topic:** Data Types

In [None]:
# Solution Q80

# Create array with default dtype
arr = np.array([1.5, 2.7, 3.9, 4.2])
print(f"Original array: {arr}")
print(f"Dtype: {arr.dtype}")
print(f"Memory: {arr.nbytes} bytes\n")

# Convert to different types
arr_int = arr.astype(np.int32)
print(f"As int32: {arr_int}")
print(f"Dtype: {arr_int.dtype}")
print(f"Memory: {arr_int.nbytes} bytes")
print("⚠️ Note: Decimal parts are truncated, not rounded\n")

# Convert to int with rounding
arr_int_rounded = np.round(arr).astype(np.int32)
print(f"Rounded then int32: {arr_int_rounded}\n")

# Different float precisions
arr_f16 = arr.astype(np.float16)
arr_f32 = arr.astype(np.float32)
arr_f64 = arr.astype(np.float64)

print("Different float precisions:")
print(f"float16: {arr_f16}, memory: {arr_f16.nbytes} bytes")
print(f"float32: {arr_f32}, memory: {arr_f32.nbytes} bytes")
print(f"float64: {arr_f64}, memory: {arr_f64.nbytes} bytes\n")

# Precision loss example
large_num = np.array([1234567.89], dtype=np.float64)
print(f"Original (float64): {large_num[0]:.2f}")
print(f"As float32: {large_num.astype(np.float32)[0]:.2f}")
print(f"As float16: {large_num.astype(np.float16)[0]:.2f}")
print("⚠️ Note: float32 keeps only ~7 significant digits; float16 overflows to inf (max ≈ 65504)\n")

# Integer overflow
small_int = np.array([100], dtype=np.int8)  # int8 holds -128..127
print(f"int8 value: {small_int[0]}")
# astype() uses C-style casting, so out-of-range values wrap around silently
overflow = np.array([300]).astype(np.int8)
print(f"300 as int8: {overflow[0]} (overflow: 300 - 256 = 44)")

# Safe casting check
print("\n✅ Safe casting:")
print(f"Can safely cast float64 to int32? {np.can_cast(np.float64, np.int32)}")
print(f"Can safely cast int32 to float64? {np.can_cast(np.int32, np.float64)}")

# Common dtypes
print("\n✅ Common NumPy dtypes:")
print("Integers: int8, int16, int32, int64, uint8, uint16, uint32, uint64")
print("Floats: float16, float32, float64")
print("Complex: complex64, complex128")
print("Boolean: bool")
print("String: U (Unicode), S (byte string)")

print("\n💡 Choose dtype based on:")
print("- Range of values needed")
print("- Memory constraints")
print("- Precision requirements")
print("- Compatibility with other systems")
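
When choosing a dtype by range and precision, `np.iinfo` and `np.finfo` report the limits directly. A short sketch; the helper `fits` is a hypothetical convenience, not a NumPy function:

```python
import numpy as np

for int_type in (np.int8, np.uint8, np.int32):
    info = np.iinfo(int_type)
    print(f"{int_type.__name__:6s} range: {info.min} to {info.max}")

for float_type in (np.float16, np.float32, np.float64):
    info = np.finfo(float_type)
    print(f"{float_type.__name__:8s} max: {float(info.max):.4g}, eps: {float(info.eps):.3g}")

def fits(value, int_type):
    """Return True if `value` is representable in the given integer dtype."""
    info = np.iinfo(int_type)
    return info.min <= value <= info.max

print("200 fits in int8? ", fits(200, np.int8))
print("200 fits in uint8?", fits(200, np.uint8))
```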

### Q81: Use np.meshgrid() to create coordinate matrices.

Create a grid of x and y coordinates for plotting.

**Difficulty:** Medium  
**Topic:** Meshgrid & Broadcasting

In [None]:
# Solution Q81

# Create 1D coordinate arrays
x = np.array([0, 1, 2])
y = np.array([0, 1, 2, 3])

print("1D arrays:")
print(f"x = {x}")
print(f"y = {y}")

# Create 2D coordinate grids
X, Y = np.meshgrid(x, y)

print("\n✅ Meshgrid output:")
print("X (x-coordinates at each point):")
print(X)
print("\nY (y-coordinates at each point):")
print(Y)

print("\n📝 Interpretation:")
print("Each position (i,j) in the grid has coordinates:")
print(f"Point (0,0): x={X[0,0]}, y={Y[0,0]}")
print(f"Point (0,1): x={X[0,1]}, y={Y[0,1]}")
print(f"Point (1,2): x={X[1,2]}, y={Y[1,2]}")

# Use case: Evaluate function on grid
print("\n✅ Example: Evaluate z = x² + y² on grid")
Z = X**2 + Y**2
print("Z values:")
print(Z)

# Visualize the grid points
print("\n✅ All grid points (x, y, z):")
for i in range(Y.shape[0]):
    for j in range(X.shape[1]):
        print(f"({X[i,j]}, {Y[i,j]}, {Z[i,j]})", end="  ")
    print()

# Real-world example: Distance from origin
print("\n✅ Example: Distance from origin at each point")
distances = np.sqrt(X**2 + Y**2)
print("Distances:")
print(distances.round(2))

# Create finer grid
x_fine = np.linspace(-2, 2, 5)
y_fine = np.linspace(-2, 2, 5)
X_fine, Y_fine = np.meshgrid(x_fine, y_fine)

print(f"\n✅ Finer grid: {X_fine.shape}")
print("X coordinates:")
print(X_fine)

print("\n✅ Common uses:")
print("- Creating 2D/3D plots")
print("- Evaluating functions on grids")
print("- Image processing")
print("- Contour plots")
print("- Vector field visualization")

print("\n💡 meshgrid creates all combinations of x and y coordinates")
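
Two details are worth seeing side by side: the default `indexing='xy'` puts `len(y)` first in the output shape, and for simple elementwise formulas broadcasting can produce the same grid without building `X` and `Y` at all. A sketch with the same `x` and `y` as above:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([0, 1, 2, 3])

X_xy, Y_xy = np.meshgrid(x, y)                 # default indexing='xy' -> shape (len(y), len(x))
X_ij, Y_ij = np.meshgrid(x, y, indexing='ij')  # matrix indexing -> shape (len(x), len(y))

# Broadcasting gives the same grid values without materializing X and Y
Z_broadcast = x[np.newaxis, :] ** 2 + y[:, np.newaxis] ** 2

print("xy shape:", X_xy.shape)
print("ij shape:", X_ij.shape)
print(Z_broadcast)
```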

### Q82: Use np.newaxis to add dimensions to arrays.

**Difficulty:** Medium  
**Topic:** Array Manipulation & Broadcasting

In [None]:
# Solution Q82

# Start with 1D array
arr = np.array([1, 2, 3, 4])
print("Original 1D array:")
print(f"arr = {arr}")
print(f"Shape: {arr.shape}\n")

# Add axis to make column vector
col_vector = arr[:, np.newaxis]
print("Column vector (add axis at end):")
print(col_vector)
print(f"Shape: {col_vector.shape}\n")

# Alternative: using reshape
col_vector_alt = arr.reshape(-1, 1)
print("Alternative using reshape(-1, 1):")
print(col_vector_alt)
print(f"Shape: {col_vector_alt.shape}\n")

# Add axis to make row vector
row_vector = arr[np.newaxis, :]
print("Row vector (add axis at start):")
print(row_vector)
print(f"Shape: {row_vector.shape}\n")

# Use case: Broadcasting for outer product
a = np.array([1, 2, 3])
b = np.array([10, 20, 30, 40])

print("Broadcasting example:")
print(f"a = {a}, shape: {a.shape}")
print(f"b = {b}, shape: {b.shape}")

# Create outer product using broadcasting
outer = a[:, np.newaxis] * b[np.newaxis, :]
print("\nOuter product (a[:, newaxis] * b[newaxis, :]):")
print(outer)
print(f"Shape: {outer.shape}")

# Verify with np.outer
outer_verify = np.outer(a, b)
print(f"\nMatches np.outer(a, b)? {np.array_equal(outer, outer_verify)}")

# Multiple newaxis
arr_1d = np.array([1, 2, 3])
arr_3d = arr_1d[np.newaxis, :, np.newaxis]
print(f"\n✅ Adding multiple dimensions:")
print(f"Original shape: {arr_1d.shape}")
print(f"After [newaxis, :, newaxis]: {arr_3d.shape}")
print(arr_3d)

# Practical example: Normalize rows
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("\n✅ Practical example - Normalize rows:")
print("Original matrix:")
print(matrix)

row_sums = matrix.sum(axis=1)
print(f"\nRow sums: {row_sums}, shape: {row_sums.shape}")

# Need to add axis for broadcasting
normalized = matrix / row_sums[:, np.newaxis]
print("\nNormalized (each row sums to 1):")
print(normalized.round(3))
print(f"Row sums after normalization: {normalized.sum(axis=1)}")

print("\n✅ Summary:")
print("np.newaxis adds a dimension of size 1")
print("arr[:, np.newaxis] → column vector (n, 1)")
print("arr[np.newaxis, :] → row vector (1, n)")
print("\n💡 Essential for broadcasting operations!")
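
For reductions such as the row sums above, `keepdims=True` keeps the reduced axis as size 1, so no `np.newaxis` is needed afterwards. A brief sketch that also shows `np.expand_dims` as an equivalent spelling:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)

row_sums_kept = matrix.sum(axis=1, keepdims=True)   # shape (3, 1), ready to broadcast
normalized = matrix / row_sums_kept

# np.expand_dims is another spelling of the same reshaping
same_shape = np.expand_dims(matrix.sum(axis=1), axis=1)

print(row_sums_kept.shape, same_shape.shape)
print(normalized.round(3))
```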

### Q83: Use np.repeat() and np.tile() - what's the difference?

**Difficulty:** Easy  
**Topic:** Array Manipulation

In [None]:
# Solution Q83

arr = np.array([1, 2, 3])
print("Original array:", arr)

print("\n" + "="*60)
print("np.repeat() - Repeats EACH element")
print("="*60)

# Repeat each element 3 times
repeated = np.repeat(arr, 3)
print(f"\nnp.repeat(arr, 3): {repeated}")
print("Each element repeated: [1,1,1, 2,2,2, 3,3,3]")

# Different repeats for each element
repeated_var = np.repeat(arr, [2, 3, 1])
print(f"\nnp.repeat(arr, [2,3,1]): {repeated_var}")
print("Element-wise repeats: [1,1, 2,2,2, 3]")

print("\n" + "="*60)
print("np.tile() - Repeats ENTIRE array")
print("="*60)

# Tile entire array 3 times
tiled = np.tile(arr, 3)
print(f"\nnp.tile(arr, 3): {tiled}")
print("Whole array repeated: [1,2,3, 1,2,3, 1,2,3]")

# 2D tiling
tiled_2d = np.tile(arr, (2, 3))
print(f"\nnp.tile(arr, (2, 3)):")
print(tiled_2d)
print("2 rows, each with array repeated 3 times")

print("\n" + "="*60)
print("Side-by-side comparison")
print("="*60)

print(f"\nOriginal:        {arr}")
print(f"repeat(arr, 3):  {np.repeat(arr, 3)}")
print(f"tile(arr, 3):    {np.tile(arr, 3)}")

# 2D example
arr_2d = np.array([[1, 2], [3, 4]])
print("\n✅ 2D array example:")
print("Original:")
print(arr_2d)

print("\nnp.repeat(arr_2d, 2, axis=0) - repeat rows:")
print(np.repeat(arr_2d, 2, axis=0))

print("\nnp.repeat(arr_2d, 2, axis=1) - repeat columns:")
print(np.repeat(arr_2d, 2, axis=1))

print("\nnp.tile(arr_2d, (2, 2)) - tile entire array:")
print(np.tile(arr_2d, (2, 2)))

print("\n✅ Key Differences:")
print("np.repeat():")
print("  - Repeats EACH element individually")
print("  - [1,2,3] → [1,1,1, 2,2,2, 3,3,3]")
print("  - Can specify different repeats per element")
print("  - Use axis parameter for 2D arrays")

print("\nnp.tile():")
print("  - Repeats ENTIRE array as a unit")
print("  - [1,2,3] → [1,2,3, 1,2,3, 1,2,3]")
print("  - Creates tiled patterns")
print("  - Use tuple for multi-dimensional tiling")

print("\n💡 Think: repeat = echo each element, tile = copy whole array")
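
One place the two functions complement each other is enumerating every (row, column) index pair of a grid. This is a small sketch with assumed grid sizes `n_rows` and `n_cols`, not the only way to do it (`np.meshgrid` or `np.indices` would also work):

```python
import numpy as np

n_rows, n_cols = 2, 3  # hypothetical grid size

# repeat: each row index is echoed once per column
row_idx = np.repeat(np.arange(n_rows), n_cols)
# tile: the whole run of column indices is copied once per row
col_idx = np.tile(np.arange(n_cols), n_rows)

pairs = np.column_stack((row_idx, col_idx))
print(pairs)
```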

### Q84: Use np.all() and np.any() for boolean array operations.

**Difficulty:** Easy  
**Topic:** Boolean Operations

In [None]:
# Solution Q84

# Create test arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([0, 2, 0, 4, 0])
arr3 = np.array([1, 1, 1, 1, 1])

print("Arrays:")
print(f"arr1 = {arr1}")
print(f"arr2 = {arr2}")
print(f"arr3 = {arr3}")

print("\n✅ np.all() - Check if ALL elements satisfy condition")
print(f"All elements > 0 in arr1? {np.all(arr1 > 0)}")
print(f"All elements > 0 in arr2? {np.all(arr2 > 0)}")
print(f"All elements == 1 in arr3? {np.all(arr3 == 1)}")

print("\n✅ np.any() - Check if ANY element satisfies condition")
print(f"Any element > 3 in arr1? {np.any(arr1 > 3)}")
print(f"Any element > 10 in arr1? {np.any(arr1 > 10)}")
print(f"Any element == 0 in arr2? {np.any(arr2 == 0)}")

# 2D array example
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("\n✅ 2D array:")
print(matrix)

print("\nWithout axis (entire array):")
print(f"All elements > 0? {np.all(matrix > 0)}")
print(f"Any element > 8? {np.any(matrix > 8)}")

print("\nWith axis=0 (check each column):")
print(f"All elements > 3 in each column? {np.all(matrix > 3, axis=0)}")
print(f"Any element > 5 in each column? {np.any(matrix > 5, axis=0)}")

print("\nWith axis=1 (check each row):")
print(f"All elements > 0 in each row? {np.all(matrix > 0, axis=1)}")
print(f"Any element > 5 in each row? {np.any(matrix > 5, axis=1)}")

# Practical examples
print("\n✅ Practical examples:")

# Check if array has any NaN
arr_with_nan = np.array([1, 2, np.nan, 4])
print(f"\nArray: {arr_with_nan}")
print(f"Has any NaN? {np.any(np.isnan(arr_with_nan))}")

# Check if all values in range
data = np.array([0.5, 0.8, 0.3, 0.9])
print(f"\nData: {data}")
print(f"All values in [0, 1]? {np.all((data >= 0) & (data <= 1))}")

# Validate input
ages = np.array([25, 30, -5, 45])
print(f"\nAges: {ages}")
print(f"All ages positive? {np.all(ages > 0)}")
if not np.all(ages > 0):
    print("⚠️ Invalid data detected!")

# Check for duplicates: after sorting, equal neighbours differ by 0
arr_unique = np.array([1, 2, 3, 4, 5])
arr_dups = np.array([1, 2, 2, 3, 4])
print(f"\nArray 1: {arr_unique}")
print(f"Has duplicates? {np.any(np.diff(np.sort(arr_unique)) == 0)}")
print(f"\nArray 2: {arr_dups}")
print(f"Has duplicates? {np.any(np.diff(np.sort(arr_dups)) == 0)}")

print("\n✅ Summary:")
print("np.all() → Returns True if ALL elements are True")
print("np.any() → Returns True if ANY element is True")
print("\nUse with axis parameter for row/column-wise checks")
print("\n💡 Essential for data validation and filtering!")
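
A common follow-up is to turn a row-wise `np.any()` result into a filter. The sketch below uses a made-up `readings` matrix and drops every row that contains a NaN; it also shows the edge case that `np.all()` of an empty array is `True` while `np.any()` is `False`:

```python
import numpy as np

# Hypothetical sensor readings; one row contains a missing value
readings = np.array([[1.0, 2.0, 3.0],
                     [4.0, np.nan, 6.0],
                     [7.0, 8.0, 9.0]])

# One boolean per row: does this row contain any NaN?
row_has_nan = np.any(np.isnan(readings), axis=1)
clean = readings[~row_has_nan]

print(row_has_nan)
print(clean)

# Edge case: reductions over an empty array
empty = np.array([])
print(np.all(empty), np.any(empty))
```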

### Q85: What happens with integer division and floor division in NumPy?

**Difficulty:** Easy  
**Topic:** Array Operations

In [None]:
# Solution Q85

arr = np.array([10, 11, 12, 13, 14])
divisor = 3

print(f"Array: {arr}")
print(f"Divisor: {divisor}\n")

# True division (/)
true_div = arr / divisor
print(f"True division (arr / {divisor}):")
print(f"Result: {true_div}")
print(f"Dtype: {true_div.dtype}")
print("Returns float, preserves decimal values\n")

# Floor division (//)
floor_div = arr // divisor
print(f"Floor division (arr // {divisor}):")
print(f"Result: {floor_div}")
print(f"Dtype: {floor_div.dtype}")
print("Returns integer, rounds down to nearest integer\n")

# Modulo (%)
modulo = arr % divisor
print(f"Modulo (arr % {divisor}):")
print(f"Result: {modulo}")
print("Returns remainder after division\n")

# Verify: dividend = quotient * divisor + remainder
print("✅ Verification: arr = (arr // 3) * 3 + (arr % 3)")
for val in arr:
    quotient = val // divisor
    remainder = val % divisor
    reconstructed = quotient * divisor + remainder
    print(f"{val} = {quotient} × {divisor} + {remainder} = {reconstructed}")

# Negative numbers
print("\n✅ With negative numbers:")
arr_neg = np.array([-10, -11, -12, 13, 14])
print(f"Array: {arr_neg}")
print(f"Floor division by 3: {arr_neg // 3}")
print(f"Modulo 3: {arr_neg % 3}")
print("Note: Floor division rounds toward negative infinity")

# Different dtypes
print("\n✅ Integer vs Float arrays:")
int_arr = np.array([10, 11, 12], dtype=np.int32)
float_arr = np.array([10.0, 11.0, 12.0], dtype=np.float64)

print(f"Integer array / 3: {int_arr / 3} (dtype: {(int_arr / 3).dtype})")
print(f"Float array / 3: {float_arr / 3} (dtype: {(float_arr / 3).dtype})")
print(f"Integer array // 3: {int_arr // 3} (dtype: {(int_arr // 3).dtype})")
print(f"Float array // 3: {float_arr // 3} (dtype: {(float_arr // 3).dtype})")

# Practical example
print("\n✅ Practical example: Convert minutes to hours and minutes")
minutes = np.array([65, 90, 125, 180])
hours = minutes // 60
remaining_mins = minutes % 60

print(f"Minutes: {minutes}")
for m, h, rm in zip(minutes, hours, remaining_mins):
    print(f"{m} minutes = {h} hour(s) and {rm} minute(s)")

print("\n✅ Summary:")
print("/ (true division)  → Always returns float")
print("// (floor division) → Rounds down to nearest integer")
print("% (modulo)         → Returns remainder")
print("\n💡 Use // when you need integer results (e.g., indexing)")
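
The difference between flooring and truncating only shows up for negative values. As a short sketch with illustrative inputs, `np.divmod` returns the floor quotient and remainder in one call, while `np.trunc` on the true quotient rounds toward zero instead:

```python
import numpy as np

values = np.array([-7, 7])  # illustrative inputs

# Floor quotient and remainder in a single call
q, r = np.divmod(values, 3)

# Truncation rounds toward zero, so it differs from // for negatives
truncated = np.trunc(values / 3).astype(int)

print(q, r)        # floor quotient and remainder
print(truncated)   # quotient rounded toward zero
```

With floor division the remainder takes the sign of the divisor, which is why `q * 3 + r` reconstructs the original values.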

### Q86: Use np.squeeze() and np.expand_dims() to manipulate dimensions.

**Difficulty:** Easy  
**Topic:** Array Shape Manipulation

In [None]:
# Solution Q86

print("="*60)
print("np.squeeze() - Remove dimensions of size 1")
print("="*60)

# Array with unnecessary dimensions
arr = np.array([[[1, 2, 3]]])
print(f"\nOriginal array shape: {arr.shape}")
print(f"Array:\n{arr}")

squeezed = np.squeeze(arr)
print(f"\nAfter squeeze: {squeezed}")
print(f"Shape: {squeezed.shape}")
print("Removed all dimensions of size 1")

# Selective squeeze
arr_multi = np.array([[[[1, 2, 3]]]])
print(f"\n✅ Array with multiple size-1 dimensions: {arr_multi.shape}")
squeezed_axis0 = np.squeeze(arr_multi, axis=0)
print(f"Squeeze axis=0: shape {squeezed_axis0.shape}")
squeezed_axis1 = np.squeeze(arr_multi, axis=1)
print(f"Squeeze axis=1: shape {squeezed_axis1.shape}")

print("\n" + "="*60)
print("np.expand_dims() - Add dimensions")
print("="*60)

# Start with 1D array
arr_1d = np.array([1, 2, 3, 4])
print(f"\nOriginal 1D array: {arr_1d}")
print(f"Shape: {arr_1d.shape}")

# Add dimension at different positions
expanded_0 = np.expand_dims(arr_1d, axis=0)
print(f"\nexpand_dims(axis=0): {expanded_0}")
print(f"Shape: {expanded_0.shape} (row vector)")

expanded_1 = np.expand_dims(arr_1d, axis=1)
print(f"\nexpand_dims(axis=1):\n{expanded_1}")
print(f"Shape: {expanded_1.shape} (column vector)")

# Multiple expansions
expanded_multi = np.expand_dims(np.expand_dims(arr_1d, axis=0), axis=0)
print(f"\nDouble expand_dims: shape {expanded_multi.shape}")

print("\n✅ Comparison with alternatives:")
print(f"expand_dims(arr, 0) same as arr[np.newaxis, :]: {np.array_equal(expanded_0, arr_1d[np.newaxis, :])}")
print(f"expand_dims(arr, 1) same as arr[:, np.newaxis]: {np.array_equal(expanded_1, arr_1d[:, np.newaxis])}")

# Practical example
print("\n✅ Practical example: Prepare data for neural network")
image_data = np.random.rand(28, 28)  # Single grayscale image
print(f"Single image shape: {image_data.shape}")

# Add batch and channel dimensions
batched = np.expand_dims(image_data, axis=0)  # Add batch dim
print(f"After adding batch dim: {batched.shape}")

batched_channeled = np.expand_dims(batched, axis=-1)  # Add channel dim
print(f"After adding channel dim: {batched_channeled.shape}")
print("Now ready for neural network input (batch, height, width, channels)")

# Reverse with squeeze
recovered = np.squeeze(batched_channeled)
print(f"\nAfter squeeze: {recovered.shape}")
print(f"Back to original? {np.array_equal(image_data, recovered)}")

print("\n✅ Summary:")
print("np.squeeze():")
print("  - Removes dimensions of size 1")
print("  - squeeze() removes all, squeeze(axis=n) removes specific")
print("  - Useful for cleaning up array shapes")

print("\nnp.expand_dims():")
print("  - Adds dimension of size 1 at specified axis")
print("  - Useful for broadcasting and batch processing")
print("  - Alternative to np.newaxis")

print("\n💡 Common in deep learning for batch/channel dimensions!")
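
A bare `np.squeeze()` removes every size-1 axis, including a batch axis that happens to hold one item. Passing `axis` makes the intent explicit and fails loudly if that axis is not size 1. A minimal sketch, assuming a `(batch, height, width, channels)` layout:

```python
import numpy as np

batch = np.zeros((1, 28, 28, 1))  # hypothetical (batch, height, width, channels) input

without_batch = np.squeeze(batch, axis=0)      # drop only the batch axis
image_only = np.squeeze(batch, axis=(0, 3))    # drop batch and channel axes

try:
    np.squeeze(batch, axis=1)                  # axis 1 has size 28, not 1
    raised = False
except ValueError:
    raised = True

print(without_batch.shape, image_only.shape, raised)
```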

### Q87: Use np.take() and np.put() for advanced indexing.

**Difficulty:** Medium  
**Topic:** Advanced Indexing

In [None]:
# Solution Q87

arr = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
print("Original array:", arr)

print("\n✅ np.take() - Extract elements at specific indices")
indices = [0, 2, 5, 8]
result = np.take(arr, indices)
print(f"np.take(arr, {indices}): {result}")
print("Equivalent to: arr[[0, 2, 5, 8]]")

# Take with mode parameter
print("\n✅ np.take() with mode (handle out-of-bounds)")
indices_oob = [0, 5, 15, -2]
result_wrap = np.take(arr, indices_oob, mode='wrap')
result_clip = np.take(arr, indices_oob, mode='clip')
print(f"Indices (some out of bounds): {indices_oob}")
print(f"mode='wrap': {result_wrap}")
print(f"mode='clip': {result_clip}")

# 2D array
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("\n✅ np.take() on 2D array:")
print("Matrix:")
print(matrix)
print(f"np.take(matrix, [0, 4, 8]): {np.take(matrix, [0, 4, 8])}")
print("Takes from flattened array")

# Take along axis
print(f"\nnp.take(matrix, [0, 2], axis=0):")
print(np.take(matrix, [0, 2], axis=0))
print("Takes rows 0 and 2")

print(f"\nnp.take(matrix, [0, 2], axis=1):")
print(np.take(matrix, [0, 2], axis=1))
print("Takes columns 0 and 2")

print("\n✅ np.put() - Replace elements at specific indices")
arr_copy = arr.copy()
print(f"Original: {arr_copy}")
np.put(arr_copy, [1, 3, 5], [999, 888, 777])
print(f"After np.put(arr, [1,3,5], [999,888,777]): {arr_copy}")
print("Modified in-place!")

# Put with single value
arr_copy2 = arr.copy()
np.put(arr_copy2, [0, 2, 4, 6], 0)
print(f"\nAfter np.put(arr, [0,2,4,6], 0): {arr_copy2}")
print("Sets multiple indices to same value")

# Put with mode
arr_copy3 = arr.copy()
np.put(arr_copy3, [0, 5, 15], [100, 200, 300], mode='clip')
print(f"\nWith out-of-bounds index and mode='clip': {arr_copy3}")
print("Index 15 clipped to last valid index")

print("\n✅ Summary:")
print("np.take(arr, indices):")
print("  - Extracts elements at specified indices")
print("  - Returns new array")
print("  - Can specify axis for multidimensional arrays")
print("  - mode: 'raise', 'wrap', 'clip'")

print("\nnp.put(arr, indices, values):")
print("  - Replaces elements at specified indices")
print("  - Modifies array in-place")
print("  - Can broadcast single value to multiple indices")

print("\n💡 Unlike plain fancy indexing, take/put offer 'wrap' and 'clip' modes for out-of-bounds indices!")
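
Without an `axis` argument, both `np.take()` and `np.put()` address a multidimensional array by its flattened (row-major) positions. The sketch below uses a small illustrative `grid` to show that flat position 5 in a 3×4 array corresponds to row 1, column 1:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)  # illustrative 3x4 array

# Flat positions 0, 5 and 11 map to (0, 0), (1, 1) and (2, 3)
picked = np.take(grid, [0, 5, 11])
same = grid.ravel()[[0, 5, 11]]     # equivalent fancy indexing on the flat view

# put() writes through the same flat positions, in place
np.put(grid, [0, 5, 11], -1)

print(picked, same)
print(grid)
```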

### Q88: Use np.apply_along_axis() to apply a function along an axis.

**Difficulty:** Medium  
**Topic:** Advanced Operations

In [None]:
# Solution Q88

# Create sample data
data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

print("Data matrix:")
print(data)

# Define custom function
def range_func(x):
    """Calculate range (max - min)"""
    return x.max() - x.min()

# Apply along axis 0 (down columns)
result_axis0 = np.apply_along_axis(range_func, 0, data)
print(f"\n✅ Apply range_func along axis=0 (columns):")
print(f"Result: {result_axis0}")
print("Calculates range for each column:")
for i, r in enumerate(result_axis0):
    print(f"  Column {i}: max={data[:, i].max()}, min={data[:, i].min()}, range={r}")

# Apply along axis 1 (across rows)
result_axis1 = np.apply_along_axis(range_func, 1, data)
print(f"\n✅ Apply range_func along axis=1 (rows):")
print(f"Result: {result_axis1}")
print("Calculates range for each row:")
for i, r in enumerate(result_axis1):
    print(f"  Row {i}: max={data[i].max()}, min={data[i].min()}, range={r}")

# More complex function
def normalize_to_unit(x):
    """Normalize to unit vector"""
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x

print("\n✅ Normalize each row to unit vector:")
normalized = np.apply_along_axis(normalize_to_unit, 1, data)
print(normalized.round(3))

# Verify normalization
norms = np.linalg.norm(normalized, axis=1)
print(f"Norms of normalized rows: {norms.round(3)}")
print("All should be 1.0 ✓")

# Function that returns multiple values
def stats_func(x):
    """Return mean and std"""
    return np.array([x.mean(), x.std()])

print("\n✅ Function returning multiple values:")
stats = np.apply_along_axis(stats_func, 1, data)
print("Statistics (mean, std) for each row:")
print(stats.round(2))

# Lambda function
print("\n✅ Using lambda function:")
squared_sums = np.apply_along_axis(lambda x: np.sum(x**2), 1, data)
print(f"Sum of squares for each row: {squared_sums}")

# Comparison with vectorized approach
print("\n✅ Performance note:")
print("For simple operations, vectorized NumPy is faster:")
print(f"Vectorized range: {data.max(axis=1) - data.min(axis=1)}")
print(f"apply_along_axis: {result_axis1}")
print("Use apply_along_axis for complex custom functions")

print("\n✅ Summary:")
print("np.apply_along_axis(func, axis, arr):")
print("  - Applies function to 1D slices along specified axis")
print("  - axis=0: apply to each column")
print("  - axis=1: apply to each row")
print("  - Useful for custom operations not available in NumPy")
print("\n💡 Prefer vectorized operations when possible for speed!")
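
The row normalization above also has a fully vectorized form. As a sketch, `keepdims=True` keeps the norms as an `(n, 1)` column so they broadcast against each row; the `apply_along_axis` version is computed alongside only to confirm both give the same result:

```python
import numpy as np

data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]], dtype=float)

# keepdims=True yields shape (3, 1), which broadcasts across each row
norms = np.linalg.norm(data, axis=1, keepdims=True)
unit_rows = data / norms

# Same computation through apply_along_axis, for comparison
via_apply = np.apply_along_axis(lambda row: row / np.linalg.norm(row), 1, data)

print(np.allclose(unit_rows, via_apply))
```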

### Q89: Use np.pad() to add padding to arrays.

**Difficulty:** Easy  
**Topic:** Array Manipulation

In [None]:
# Solution Q89

# Original array
arr = np.array([[1, 2], [3, 4]])
print("Original array:")
print(arr)
print(f"Shape: {arr.shape}\n")

# Constant padding (default)
padded_constant = np.pad(arr, pad_width=1, mode='constant', constant_values=0)
print("✅ Constant padding (pad_width=1, value=0):")
print(padded_constant)
print(f"Shape: {padded_constant.shape}\n")

# Different padding for each side
padded_asymmetric = np.pad(arr, pad_width=((1, 2), (2, 1)), mode='constant', constant_values=0)
print("✅ Asymmetric padding ((1,2), (2,1)):")
print(padded_asymmetric)
print(f"Shape: {padded_asymmetric.shape}")
print("(1,2) = 1 row top, 2 rows bottom")
print("(2,1) = 2 cols left, 1 col right\n")

# Edge padding (repeat edge values)
padded_edge = np.pad(arr, pad_width=1, mode='edge')
print("✅ Edge padding (repeat edge values):")
print(padded_edge)
print("Corners and edges repeated\n")

# Reflect padding (mirror)
padded_reflect = np.pad(arr, pad_width=1, mode='reflect')
print("✅ Reflect padding (mirror without edge):")
print(padded_reflect)
print("Values reflected across boundary\n")

# Symmetric padding (mirror with edge)
padded_symmetric = np.pad(arr, pad_width=1, mode='symmetric')
print("✅ Symmetric padding (mirror with edge):")
print(padded_symmetric)
print("Like reflect but includes edge\n")

# Wrap padding (circular)
padded_wrap = np.pad(arr, pad_width=1, mode='wrap')
print("✅ Wrap padding (circular/periodic):")
print(padded_wrap)
print("Wraps around to opposite side\n")

# 1D example
arr_1d = np.array([1, 2, 3, 4, 5])
print("✅ 1D array padding:")
print(f"Original: {arr_1d}")
print(f"Constant (pad=2): {np.pad(arr_1d, 2, mode='constant', constant_values=0)}")
print(f"Edge (pad=2): {np.pad(arr_1d, 2, mode='edge')}")
print(f"Reflect (pad=2): {np.pad(arr_1d, 2, mode='reflect')}\n")

# Practical example: Image padding for convolution
print("✅ Practical: Image padding for convolution")
image = np.random.randint(0, 256, (5, 5))  # upper bound is exclusive, so values span 0-255
print(f"Original image (5×5):")
print(image)

padded_image = np.pad(image, pad_width=1, mode='constant', constant_values=0)
print(f"\nPadded image (7×7) - ready for 3×3 convolution:")
print(padded_image)
print("Padding prevents size reduction during convolution")

print("\n✅ Padding modes summary:")
print("'constant'  - Fill with constant value (default 0)")
print("'edge'      - Repeat edge values")
print("'reflect'   - Mirror without repeating edge")
print("'symmetric' - Mirror with repeating edge")
print("'wrap'      - Circular/periodic boundary")
print("'linear_ramp' - Linear ramp from edge to end value")
print("'maximum'   - Fill with maximum of vector")
print("'minimum'   - Fill with minimum of vector")
print("'mean'      - Fill with mean of vector")

print("\n💡 Essential for image processing and CNNs!")
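
The `reflect`, `symmetric`, and `wrap` modes are easy to confuse on paper. A tiny 1D sketch (the three-element `arr` is chosen only so the pattern is visible) makes the difference concrete:

```python
import numpy as np

arr = np.array([1, 2, 3])  # small illustrative input

edge = np.pad(arr, 2, mode='edge')            # [1 1 1 2 3 3 3]
reflect = np.pad(arr, 2, mode='reflect')      # [3 2 1 2 3 2 1]
symmetric = np.pad(arr, 2, mode='symmetric')  # [2 1 1 2 3 3 2]
wrap = np.pad(arr, 2, mode='wrap')            # [2 3 1 2 3 1 2]

for name, padded in [('edge', edge), ('reflect', reflect),
                     ('symmetric', symmetric), ('wrap', wrap)]:
    print(f"{name:<10} {padded}")
```

`reflect` mirrors around the edge element without repeating it, while `symmetric` mirrors around the boundary and therefore repeats it.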

### Q90: What is the difference between np.array() and np.asarray()?

**Difficulty:** Medium  
**Topic:** Array Creation

In [None]:
# Solution Q90

print("="*60)
print("Key Difference: Copy behavior")
print("="*60)

# Test with existing NumPy array
original = np.array([1, 2, 3, 4, 5])
print(f"\nOriginal array: {original}")
print(f"ID: {id(original)}")

# Using np.array() - copies by default (copy=True)
arr_copy = np.array(original)
print(f"\nnp.array(original): {arr_copy}")
print(f"ID: {id(arr_copy)}")
print(f"Same object? {arr_copy is original}")
print("✅ np.array() creates a NEW copy by default (copy=True)")

# Using np.asarray() - Returns original if possible
arr_asarray = np.asarray(original)
print(f"\nnp.asarray(original): {arr_asarray}")
print(f"ID: {id(arr_asarray)}")
print(f"Same object? {arr_asarray is original}")
print("✅ np.asarray() returns ORIGINAL if already ndarray")

# Modify to see the difference
arr_copy[0] = 999
arr_asarray[0] = 888

print(f"\nAfter modifications:")
print(f"Original: {original}")
print(f"arr_copy: {arr_copy}")
print(f"arr_asarray: {arr_asarray}")
print("Note: asarray shares memory with original!")

print("\n" + "="*60)
print("With Python lists (both create new arrays)")
print("="*60)

py_list = [1, 2, 3, 4, 5]
arr1 = np.array(py_list)
arr2 = np.asarray(py_list)

print(f"\nPython list: {py_list}")
print(f"np.array(list): {arr1}, ID: {id(arr1)}")
print(f"np.asarray(list): {arr2}, ID: {id(arr2)}")
print(f"Same object? {arr1 is arr2}")
print("Both create new arrays from Python lists")

print("\n" + "="*60)
print("With dtype conversion")
print("="*60)

float_arr = np.array([1.0, 2.0, 3.0])
print(f"\nOriginal float array: {float_arr}, dtype: {float_arr.dtype}")

# Force dtype change
int_copy = np.array(float_arr, dtype=np.int32)
int_asarray = np.asarray(float_arr, dtype=np.int32)

print(f"\nnp.array(float_arr, dtype=int32): {int_copy}")
print(f"Same object? {int_copy is float_arr}")

print(f"\nnp.asarray(float_arr, dtype=int32): {int_asarray}")
print(f"Same object? {int_asarray is float_arr}")
print("When dtype differs, both create new arrays")

print("\n" + "="*60)
print("Performance implications")
print("="*60)

import time

large_arr = np.arange(1000000)

# perf_counter() is a high-resolution clock, so very short timings do not round to 0
# np.array() - copies by default
start = time.perf_counter()
for _ in range(1000):
    _ = np.array(large_arr)
time_array = time.perf_counter() - start

# np.asarray() - no copy if possible
start = time.perf_counter()
for _ in range(1000):
    _ = np.asarray(large_arr)
time_asarray = time.perf_counter() - start

print(f"\nCreating from existing array (1000 iterations):")
print(f"np.array():   {time_array:.4f}s (copies every time)")
print(f"np.asarray(): {time_asarray:.4f}s (no copy needed)")
print(f"Speedup: {time_array/time_asarray:.1f}x")

print("\n✅ Summary:")
print("\nnp.array():")
print("  - Creates a new copy by default (copy=True)")
print("  - Use when you need independent array")
print("  - Safer but uses more memory")

print("\nnp.asarray():")
print("  - Returns input if already ndarray (no copy)")
print("  - Creates copy only if necessary")
print("  - More memory efficient")
print("  - Use in functions that accept array-like inputs")

print("\n💡 Use np.asarray() in functions for efficiency:")
print("def my_function(data):")
print("    data = np.asarray(data)  # Efficient conversion")
print("    # ... work with data ...")
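
Whether a conversion actually copied can be checked with `np.shares_memory()` rather than comparing `id()` values. The sketch below wraps `np.asarray()` in a helper named `as_float_array`, which is a hypothetical name used only for this illustration:

```python
import numpy as np

def as_float_array(data):
    """Hypothetical helper: accept any array-like, return float64 without needless copies."""
    return np.asarray(data, dtype=np.float64)

original = np.array([1.0, 2.0, 3.0])   # already float64
ints = np.array([1, 2, 3])             # integer dtype, needs conversion

same = as_float_array(original)        # no copy: dtype already matches
converted = as_float_array(ints)       # copy: dtype has to change
from_list = as_float_array([1, 2, 3])  # new array built from a list

print(same is original, np.shares_memory(original, same))
print(np.shares_memory(ints, converted))
```

Because the float64 input comes back as the same object, in-place changes inside such a helper would be visible to the caller, so a function that mutates its input should copy explicitly.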

---
# 🎉 Complete! All 90 Questions Covered!

You've now completed **90 comprehensive NumPy interview questions** covering:

## 📚 **All Sections:**

### **Section 1: Fundamentals (Q1-15)**
✅ Array creation, views vs copies, reshaping, indexing, transpose, broadcasting basics

### **Section 2: Array Operations (Q16-30)**
✅ Normalization, aggregations, concatenation, sorting, set operations, statistics, clipping

### **Section 3: Advanced Indexing & Manipulation (Q31-45)**
✅ Fancy indexing, boolean masks, unique operations, standardization, distances, moving averages

### **Section 4: Linear Algebra & Statistics (Q46-60)**
✅ Matrix operations, eigenvalues, solving equations, distributions, polynomials, sampling

### **Section 5: Performance & Real-World (Q61-75)**
✅ Vectorization, memory management, image processing, financial analysis, ML algorithms, optimization

### **Section 6: Additional Exam Topics (Q76-90)**
✅ File I/O (CSV, .npy, .npz), structured arrays, memory layout (C/F order), dtype operations
✅ meshgrid, newaxis, repeat/tile, all/any, division operators, squeeze/expand_dims
✅ take/put, apply_along_axis, padding, array vs asarray

## 🎯 **Question Types Covered:**

✅ Multiple choice questions  
✅ Code output prediction  
✅ Implementation challenges  
✅ Conceptual understanding  
✅ Performance comparisons  
✅ Real-world applications  
✅ Debugging scenarios  
✅ Best practices  

## 📖 **Next Steps:**

1. **Practice actively** - Run all cells and experiment
2. **Time yourself** - Simulate exam conditions
3. **Explain concepts** - Teach someone else
4. **Review weak areas** - Focus on difficult questions
5. **Build projects** - Apply knowledge practically

## 🚀 **Advanced Topics to Explore:**

- **NumPy with Pandas** - Data analysis workflows
- **NumPy with Matplotlib** - Data visualization
- **NumPy with SciPy** - Scientific computing
- **NumPy with Scikit-learn** - Machine learning pipelines
- **NumPy with TensorFlow/PyTorch** - Deep learning

## 📚 **Resources:**

- Official NumPy Documentation: https://numpy.org/doc/
- NumPy User Guide: https://numpy.org/doc/stable/user/
- NumPy API Reference: https://numpy.org/doc/stable/reference/
- Practice Problems: https://www.w3resource.com/python-exercises/numpy/

**You're now fully prepared for NumPy interviews and exams! 💪🎓**

Good luck! 🍀

In [None]:
# Solution Q74
print("Monte Carlo simulation to estimate π\n")

# Simulate throwing darts at the unit square and counting hits inside the quarter circle
n_samples = 100000

# Random points in the unit square [0,1) × [0,1)
x = np.random.rand(n_samples)
y = np.random.rand(n_samples)

# Check if points fall inside the unit circle (x² + y² ≤ 1)
distances = np.sqrt(x**2 + y**2)
inside_circle = distances <= 1
n_inside = np.sum(inside_circle)

# Estimate π
# Area of quarter circle / Area of unit square = π/4
# So π ≈ 4 × (points in circle / total points)
pi_estimate = 4 * n_inside / n_samples

print(f"Samples: {n_samples:,}")
print(f"Points inside circle: {n_inside:,}")
print(f"Points outside circle: {n_samples - n_inside:,}")
print(f"\nEstimated π: {pi_estimate:.6f}")
print(f"Actual π: {np.pi:.6f}")
print(f"Error: {abs(pi_estimate - np.pi):.6f}")
print(f"Error %: {abs(pi_estimate - np.pi)/np.pi * 100:.3f}%")

# Show convergence with different sample sizes
print("\nConvergence analysis:")
sample_sizes = [100, 1000, 10000, 100000, 1000000]

for n in sample_sizes:
    x_n = np.random.rand(n)
    y_n = np.random.rand(n)
    inside_n = np.sum(np.sqrt(x_n**2 + y_n**2) <= 1)
    pi_n = 4 * inside_n / n
    error = abs(pi_n - np.pi)
    print(f"n={n:>7,}: π≈{pi_n:.6f}, error={error:.6f}")

print("\n✅ Monte Carlo method:")
print("1. Generate random points in unit square")
print("2. Count points inside unit circle")
print("3. π ≈ 4 × (inside / total)")
print("\nMore samples → better estimate!")
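
The estimate above changes on every run because the global random state is unseeded. A reproducible variant, sketched here with `np.random.default_rng` and an arbitrary seed of 42, also skips the square root, since comparing squared distances against 1 gives the same inside/outside answer:

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen for reproducibility
n_samples = 200_000

# One (n_samples, 2) array of points in the unit square
xy = rng.random((n_samples, 2))

# x² + y² <= 1 is equivalent to sqrt(x² + y²) <= 1, so the sqrt can be skipped
inside = (xy ** 2).sum(axis=1) <= 1.0

# The mean of a boolean array is the fraction of True values
pi_estimate = 4 * inside.mean()
print(f"Estimate: {pi_estimate:.4f}, error: {abs(pi_estimate - np.pi):.4f}")
```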

### Q75: What are the best practices for optimizing NumPy code?

**Difficulty:** Medium  
**Topic:** Performance Optimization

In [None]:
# Solution Q75
import time

print("NumPy Performance Best Practices\n")
print("="*60)

# 1. Vectorization vs Loops
n = 100000
arr = np.random.rand(n)

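# perf_counter() is a high-resolution clock, better suited to short timings than time()
start = time.perf_counter()
result_loop = np.array([x**2 + 2*x + 1 for x in arr])
time_loop = time.perf_counter() - start

start = time.perf_counter()
result_vec = arr**2 + 2*arr + 1
time_vec = time.perf_counter() - start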

print("1. VECTORIZATION")
print(f"   Loop: {time_loop:.4f}s")
print(f"   Vectorized: {time_vec:.4f}s")
print(f"   Speedup: {time_loop/time_vec:.1f}x")

# 2. Pre-allocation vs Growing
print("\n2. PRE-ALLOCATION")
start = time.perf_counter()
bad = np.array([])
for i in range(1000):
    bad = np.append(bad, i)  # np.append copies the whole array on every call
time_bad = time.perf_counter() - start

start = time.perf_counter()
good = np.zeros(1000)
for i in range(1000):
    good[i] = i
time_good = time.perf_counter() - start

print(f"   Growing array: {time_bad:.4f}s")
print(f"   Pre-allocated: {time_good:.4f}s")
print(f"   Speedup: {time_bad/time_good:.1f}x")

# 3. In-place operations
arr = np.random.rand(1000000)
print("\n3. IN-PLACE OPERATIONS")
print(f"   arr = arr * 2  (creates new array)")
print(f"   arr *= 2       (modifies in-place, saves memory)")

# 4. Use appropriate dtypes
print("\n4. APPROPRIATE DATA TYPES")
large_int = np.arange(1000000, dtype=np.int64)
small_int = np.arange(1000000, dtype=np.int32)
print(f"   int64: {large_int.nbytes / 1024 / 1024:.2f} MB")
print(f"   int32: {small_int.nbytes / 1024 / 1024:.2f} MB")
print(f"   Savings: {(1 - small_int.nbytes/large_int.nbytes)*100:.0f}%")

# 5. Broadcasting vs explicit loops
print("\n5. BROADCASTING")
matrix = np.random.rand(1000, 1000)
vector = np.random.rand(1000)

start = time.perf_counter()
result_loop = np.array([matrix[i] + vector for i in range(1000)])
time_loop = time.perf_counter() - start

start = time.perf_counter()
result_broadcast = matrix + vector
time_broadcast = time.perf_counter() - start

print(f"   Loop: {time_loop:.4f}s")
print(f"   Broadcasting: {time_broadcast:.4f}s")
print(f"   Speedup: {time_loop/time_broadcast:.1f}x")

print("\n" + "="*60)
print("\n✅ BEST PRACTICES SUMMARY:")
print("1. Use vectorized operations (avoid Python loops)")
print("2. Pre-allocate arrays (don't grow with append)")
print("3. Use in-place operations when possible (arr *= 2)")
print("4. Choose smallest appropriate dtype")
print("5. Leverage broadcasting")
print("6. Use views instead of copies when safe")
print("7. Use built-in NumPy functions (they're optimized)")
print("8. Consider memory layout (C vs Fortran order)")
print("9. Use np.where() instead of if-else in loops")
print("10. Profile your code to find bottlenecks")
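
Practice 9 in the list above can be illustrated with a ReLU-style clamp. This is a minimal sketch on made-up random data: the loop and `np.where()` produce identical results, but the second form runs in compiled code instead of per-element Python:

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed for a repeatable example
values = rng.normal(size=10_000)     # illustrative data

# Loop version: one Python-level branch per element
looped = np.empty_like(values)
for i, v in enumerate(values):
    looped[i] = v if v > 0 else 0.0

# Vectorized version: the condition is evaluated for the whole array at once
vectorized = np.where(values > 0, values, 0.0)

print(np.array_equal(looped, vectorized))
```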

---
# 🎉 Congratulations!

You've completed all **75 NumPy interview questions**!

## 📚 What You've Mastered:

### **Section 1: Fundamentals (Q1-15)**
✅ Array creation, properties, data types, indexing, reshaping, views vs copies

### **Section 2: Array Operations (Q16-30)**
✅ Mathematical operations, aggregations, broadcasting, sorting, set operations

### **Section 3: Advanced Indexing & Manipulation (Q31-45)**
✅ Fancy indexing, boolean masks, np.where/select, array manipulation, distances

### **Section 4: Linear Algebra & Statistics (Q46-60)**
✅ Matrix operations, eigenvalues, solving equations, distributions, polynomials

### **Section 5: Performance & Real-World (Q61-75)**
✅ Optimization, memory management, ML preprocessing, financial analysis, simulations

## 🎯 Interview Preparation Tips:

1. **Practice coding** - Run all cells and experiment with variations
2. **Understand concepts** - Don't just memorize, understand why
3. **Time yourself** - Practice solving under time pressure
4. **Explain out loud** - Practice explaining solutions clearly
5. **Review mistakes** - Mark difficult questions for review

## 📖 Additional Resources:

- **Official NumPy Docs**: https://numpy.org/doc/
- **NumPy Tutorial**: https://numpy.org/learn/
- **Practice Problems**: https://www.w3resource.com/python-exercises/numpy/

## 🚀 Next Steps:

- **Pandas**: Built on NumPy, for data analysis
- **SciPy**: Scientific computing
- **Scikit-learn**: Machine learning
- **TensorFlow/PyTorch**: Deep learning

**Good luck with your interviews! 💪**