### 1) What are broadcasting rules?

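Broadcasting lets NumPy perform element-wise operations on arrays of different shapes.
The shapes are compared dimension by dimension, from right to left; a missing leading dimension is treated as 1.
Two dimensions are compatible if they are equal or if one of them is 1.
If every pair is compatible, each size-1 dimension is stretched to match the other array without copying data; otherwise NumPy raises a ValueError.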
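
The rules above can be sketched with two small arrays. The array names are illustrative, and np.broadcast_shapes is assumed to be available (it was added in NumPy 1.20).

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4).reshape(1, 4)   # shape (1, 4)

# Compared right to left: 1 vs 4 is compatible, 3 vs 1 is compatible,
# so the result has shape (3, 4).
total = col + row
print(total.shape)

# Report the broadcast shape without doing any arithmetic.
# The missing leading dimension of (4,) is treated as 1.
result_shape = np.broadcast_shapes((3, 1), (4,))
print(result_shape)

# Trailing dimensions 3 and 4 are neither equal nor 1, so this fails.
try:
    np.ones(3) + np.ones(4)
    failed = False
except ValueError as exc:
    failed = True
    print("Not broadcastable:", exc)
```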

### 2) Why is broadcasting important in Machine Learning?

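Machine learning is dominated by large array and matrix operations. Broadcasting lets us apply scalars or smaller arrays (for example a per-feature mean or a bias vector) to a whole dataset without writing Python loops.
The work runs in NumPy's compiled loops, the smaller operand is not physically replicated to the full shape, and the code stays short.
This makes broadcasting a core tool for high-performance numerical computation.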
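
One common case is feature standardization. The sketch below uses a small made-up dataset (X and its values are hypothetical) and shows one possible way to do it, not a prescribed preprocessing recipe.

```python
import numpy as np

# Hypothetical dataset: 4 samples (rows) x 3 features (columns)
X = np.array([
    [1.0, 200.0, 0.5],
    [2.0, 220.0, 0.7],
    [3.0, 240.0, 0.9],
    [4.0, 260.0, 1.1],
])

mean = X.mean(axis=0)   # per-feature mean, shape (3,)
std = X.std(axis=0)     # per-feature standard deviation, shape (3,)

# (4, 3) combined with (3,): the per-feature statistics are
# broadcast across all rows, with no explicit loop.
X_scaled = (X - mean) / std

print(X_scaled.mean(axis=0))  # close to 0 for every feature
print(X_scaled.std(axis=0))   # close to 1 for every feature
```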

### 3) What is a random number?

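A random number is a value drawn from a probability distribution, so an individual value cannot be predicted in advance.
NumPy's generators are pseudorandom: the sequence comes from a deterministic algorithm and is reproducible when the same seed is used.
In NumPy, random numbers are used for simulations, sampling, and initializing ML models.
They also help in testing algorithms and training neural networks.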
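
As a small illustration of seeding, the sketch below creates two generators with np.random.default_rng. The seed value 42 is arbitrary and chosen only for the example.

```python
import numpy as np

# Two generators created with the same (arbitrary) seed
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.random(3)    # three floats in [0, 1)
second = rng_b.random(3)

print(first)
# The same seed gives the same sequence, which makes experiments repeatable.
print(np.array_equal(first, second))  # True
```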

### 4) Difference between rand and randint

np.random.rand generates floating-point numbers drawn uniformly from the half-open interval [0, 1).
np.random.randint generates integers from low (inclusive) to high (exclusive).
rand is mainly used for probabilities or ML weight initialization, while randint is used for discrete values such as indices or labels.
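
A short comparison of the two calls, with an arbitrary seed and illustrative sizes:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the run repeatable

# rand takes the dimensions as separate arguments and returns floats in [0, 1)
floats = np.random.rand(2, 3)

# randint takes low (inclusive) and high (exclusive): here 1..6, like a die
ints = np.random.randint(1, 7, size=5)

print(floats)
print(ints)
```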

### 5) Explain sorting in NumPy.

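Sorting means arranging data in ascending or descending order.
NumPy provides np.sort(), which returns a sorted copy in ascending order; for descending order, reverse the result (for example with [::-1]).
For multi-dimensional arrays it sorts along the axis given by the axis argument (the last axis by default), so each row or column is sorted independently.
Sorting is useful for ranking and analysis.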
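
A minimal sketch of sorting along each axis, using a small made-up array:

```python
import numpy as np

arr = np.array([[9, 1, 5],
                [3, 7, 2]])

rows_sorted = np.sort(arr, axis=1)    # sort within each row
cols_sorted = np.sort(arr, axis=0)    # sort within each column
rows_desc = np.sort(arr, axis=1)[:, ::-1]  # descending: reverse each sorted row

print(rows_sorted)  # [[1 5 9] [2 3 7]]
print(cols_sorted)  # [[3 1 2] [9 7 5]]
print(rows_desc)    # [[9 5 1] [7 3 2]]
print(arr)          # unchanged, because np.sort returns a copy
```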

### 6) How can we search data in NumPy?

We can search with boolean conditions (boolean indexing) or with np.where().
Boolean indexing returns the values that match a condition, while np.where(condition) returns their positions.
Searching is widely used in filtering and decision-making tasks.
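
Both approaches can be sketched on a small hypothetical array of temperatures (the names and the threshold of 24 are made up for the example):

```python
import numpy as np

temps = np.array([18, 25, 31, 22, 35])  # hypothetical readings

mask = temps > 24                  # boolean array, one flag per element
hot_values = temps[mask]           # boolean indexing returns the values
hot_positions = np.where(mask)[0]  # np.where(condition) returns the indices

# The three-argument form picks a value per element based on the condition.
labels = np.where(mask, "hot", "mild")

print(hot_values)     # [25 31 35]
print(hot_positions)  # [1 2 4]
print(labels)         # ['mild' 'hot' 'hot' 'mild' 'hot']
```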

### 7) What is joining? Explain with examples.

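Joining means combining two or more arrays into one.
NumPy provides functions like np.concatenate (join along an existing axis), np.vstack (stack vertically, row-wise), and np.hstack (stack horizontally, column-wise).
It is useful when merging datasets or appending records.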

Example:

np.concatenate((a, b))
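
A slightly fuller sketch, with a and b defined as small illustrative 1D arrays, compares the three functions:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

joined = np.concatenate((a, b))    # one 1D array of 6 elements
stacked_rows = np.vstack((a, b))   # a and b become the rows of a (2, 3) array
side_by_side = np.hstack((a, b))   # for 1D inputs this matches concatenate

print(joined)              # [1 2 3 4 5 6]
print(stacked_rows.shape)  # (2, 3)
print(side_by_side)        # [1 2 3 4 5 6]
```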

### 8) Explain shape and reshape.

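shape is a tuple giving the size of the array along each dimension, for example (rows, columns) for a 2D array.
reshape changes that structure without changing the data; the new shape must hold the same total number of elements.
It is useful when preparing data for ML models.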

Example:

a.reshape(2, 3)  # a must contain exactly 6 elements
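
To make the example runnable, the sketch below assumes a is a 1D array of six elements. It also shows -1 as a placeholder that lets NumPy infer one dimension.

```python
import numpy as np

a = np.arange(6)           # [0 1 2 3 4 5], shape (6,)

b = a.reshape(2, 3)        # 2 rows, 3 columns
c = a.reshape(-1, 2)       # -1 lets NumPy infer the row count: (3, 2)

print(a.shape, b.shape, c.shape)

# 6 elements cannot fill a (4, 2) array, so this raises ValueError.
try:
    a.reshape(4, 2)
    failed = False
except ValueError as exc:
    failed = True
    print("Cannot reshape:", exc)
```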

### 9) Difference between copy and view.

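A copy creates a new array with its own memory, so changes to it do not affect the original.
A view shares the same memory as the original, so changes in one affect the other.
Basic slicing returns a view; a.copy() returns a copy.
A copy is safer; a view is faster and needs no extra memory for the data.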
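
The difference can be checked directly with np.shares_memory. The array names below are illustrative.

```python
import numpy as np

original = np.array([10, 20, 30, 40])

view = original[1:3]          # basic slicing returns a view
dup = original[1:3].copy()    # explicit copy with its own memory

view[0] = 99    # writes through to original[1]
dup[1] = -1     # does not touch original

print(original)                           # [10 99 30 40]
print(np.shares_memory(original, view))   # True
print(np.shares_memory(original, dup))    # False
```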

### 10) Difference between flatten() and ravel()

Both convert a multi-dimensional array into 1D.
flatten() always returns a copy of the data.
ravel() returns a view when possible and a copy otherwise.
Changes made through the result of ravel() may therefore affect the original array.
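
A small sketch of the behaviour, using a made-up 2x2 array. The last part shows a case where ravel() has to copy, because the transposed array is not contiguous in the default C order.

```python
import numpy as np

grid = np.array([[1, 2],
                 [3, 4]])

flat = grid.flatten()   # always a copy
rav = grid.ravel()      # a view here, because grid is contiguous

flat[0] = 100   # grid is unchanged
rav[1] = 200    # grid[0, 1] becomes 200

print(grid)                          # [[  1 200] [  3   4]]
print(np.shares_memory(grid, flat))  # False
print(np.shares_memory(grid, rav))   # True

# The transpose is not C-contiguous, so ravel() must return a copy.
rav_t = grid.T.ravel()
print(np.shares_memory(grid, rav_t))  # False
```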

### 11) Difference between copy and shallow copy

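A (deep) copy duplicates the entire data into new memory.
A shallow copy creates a new container that refers to the same underlying data instead of duplicating it.
In NumPy, a view (for example from a.view() or slicing) acts as a shallow copy and reflects changes to the shared data.
a.copy() duplicates the array's buffer, but for arrays of dtype object it copies only the references, so the nested Python objects are still shared.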
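
Both cases can be sketched as follows. The array names are illustrative, and copy.deepcopy from the standard library is used here as one way to duplicate the nested objects of an object array.

```python
import copy
import numpy as np

# Numeric array: a view shares data, copy() duplicates it.
nums = np.array([1, 2, 3])
shallow_view = nums.view()
full_copy = nums.copy()
nums[0] = 100
print(shallow_view[0], full_copy[0])   # 100 1

# Object array: copy() duplicates only the references to the lists.
nested = np.empty(2, dtype=object)
nested[0] = [1, 2]
nested[1] = [3, 4]

ref_copy = nested.copy()            # new array, same list objects
deep_copy = copy.deepcopy(nested)   # new array, new list objects

nested[0].append(99)
print(ref_copy[0])    # [1, 2, 99], the list is shared
print(deep_copy[0])   # [1, 2], fully independent
```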

### Task 1 – Sales Data Analysis

In [1]:
import numpy as np

# rows -> regions, columns -> months
sales = np.array([
    [20000, 22000, 25000, 24000],  # Region 1
    [18000, 21000, 23000, 26000],  # Region 2
    [30000, 32000, 31000, 33000]   # Region 3
])

# Total sales per region
total_sales_region = np.sum(sales, axis=1)
print("Total sales per region:", total_sales_region)

# Average sales per month
avg_sales_month = np.mean(sales, axis=0)
print("Average sales per month:", avg_sales_month)

# Best performing region (argmax returns a 0-based index, so 2 means Region 3)
best_region = np.argmax(total_sales_region)
print("Best performing region:", best_region)


Total sales per region: [ 91000  88000 126000]
Average sales per month: [22666.66666667 25000.         26333.33333333 27666.66666667]
Best performing region: 2


### Task 2 – Student Ranking System (sorting + argsort)

In [2]:
import numpy as np

marks = np.array([85, 92, 78, 90, 88])

# Sort marks descending
sorted_marks = np.sort(marks)[::-1]
print("Sorted marks:", sorted_marks)

# Student indices ordered from highest to lowest mark
ranks = np.argsort(marks)[::-1]
print("Ranking (student positions):", ranks)


Sorted marks: [92 90 88 85 78]
Ranking (student positions): [1 3 4 0 2]


### Task 3 – Production Quality Check

In [3]:
import numpy as np

quality = np.array([45, 60, 55, 30, 75, 90])

good = quality[quality >= 50]
bad = quality[quality < 50]

print("Good products:", good)
print("Defective products:", bad)


Good products: [60 55 75 90]
Defective products: [45 30]


### Task 4 – Broadcasting in Pricing System

In [4]:
import numpy as np

prices = np.array([100, 200, 300])
tax = 10  # fixed tax, broadcast to every element of prices

final_price = prices + tax
print("Final prices after tax:", final_price)


Final prices after tax: [110 210 310]


### Task 5 – Customer Dataset

In [5]:
import numpy as np

# customer_id, age, purchase
data = np.array([
    [1, 22, 250],
    [2, 35, 400],
    [3, 28, 150],
    [4, 40, 500],
    [5, 30, 300]
])

# 1. Total customers
print("Total customers:", data.shape[0])

# 2. Extract purchase column
purchase = data[:, 2]
print("Purchase column:", purchase)

# 3. Average purchase
avg_purchase = np.mean(purchase)
print("Average purchase:", avg_purchase)

# 4. Max & Min purchase
print("Maximum purchase:", np.max(purchase))
print("Minimum purchase:", np.min(purchase))

# 5. Customers above average purchase
above_avg = data[purchase > avg_purchase]
print("Customers above average purchase:\n", above_avg)


Total customers: 5
Purchase column: [250 400 150 500 300]
Average purchase: 320.0
Maximum purchase: 500
Minimum purchase: 150
Customers above average purchase:
 [[  2  35 400]
 [  4  40 500]]


### Task 6 – Employee Dataset

In [6]:
import numpy as np

# employee data
emp_id = np.array([101, 102, 103, 104, 105])
names = np.array(["Asha", "Ravi", "Neha", "Ravi", "Kiran"])
age = np.array([25, 30, 35, 30, 28])
salary = np.array([30000, 50000, 70000, 50000, 45000])

# 1. Correlation between age & salary
correlation = np.corrcoef(age, salary)
print("Correlation matrix:\n", correlation)

# 2. Mean salary
print("Mean salary:", np.mean(salary))

# 3. Find duplicate names
unique, counts = np.unique(names, return_counts=True)
duplicates = unique[counts > 1]
print("Duplicate names:", duplicates)


Correlation matrix:
 [[1.         0.99586528]
 [0.99586528 1.        ]]
Mean salary: 49000.0
Duplicate names: ['Ravi']
