### Theoretical Questions

#### 1. Overfitting and Underfitting
- **Overfitting:** Overfitting occurs when a machine learning model captures noise in the training data, rather than the underlying distribution. It performs very well on training data but poorly on unseen data. Overfitting typically happens when the model is too complex relative to the amount and variability of the data.
- **Underfitting:** Underfitting occurs when a model is too simple to capture the underlying structure of the data. It performs poorly on both training and unseen data. Typical causes are a model with too little capacity, too few informative features, or overly strong regularization.
- **Techniques to address these issues:**
  - **Overfitting:** Apply regularization (L1/Lasso or L2/Ridge), reduce the complexity of the model, prune decision trees, increase the amount of training data, or use dropout or early stopping in neural networks. Use cross-validation to detect overfitting and to tune these choices.
  - **Underfitting:** Increase the model complexity, add more features, reduce regularization, or use more sophisticated algorithms.
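
The effect of model complexity can be illustrated with a small experiment. The sketch below is only an illustration under made-up assumptions: the target function `sin(pi * x)`, the noise level, the sample sizes, and the polynomial degrees are all arbitrary choices, not part of the original question. It fits polynomials of increasing degree and compares the error on the training set with the error on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily


def make_data(n_samples):
    """Noisy samples of sin(pi * x) on [-1, 1] (a made-up target)."""
    x = rng.uniform(-1.0, 1.0, size=n_samples)
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, size=n_samples)
    return x, y


x_train, y_train = make_data(15)   # small training set
x_test, y_test = make_data(200)    # held-out data


def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


errors = {}
for degree in (1, 3, 12):  # too simple, moderate, very flexible
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree {degree:2d}: train MSE = {errors[degree][0]:.4f}, "
          f"test MSE = {errors[degree][1]:.4f}")
```

The training error can only go down as the degree grows, because each polynomial family contains the previous one. The held-out error is what shows whether the extra flexibility helped or merely fit the noise.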

#### 2. Bias-Variance Tradeoff
- **Bias:** Bias is the error introduced by approximating a real-world problem, which may be complex, by a much simpler model. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).
- **Variance:** Variance is the error introduced by the model's sensitivity to small fluctuations in the training set. High variance can cause an algorithm to model the noise in the training data, rather than the intended outputs (overfitting).
- **Tradeoff:** The bias-variance tradeoff is the balance between these two sources of error that must be managed to build models that generalize well. Increasing the complexity of the model typically reduces bias but increases variance, and vice versa.
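
For squared-error loss the tradeoff can be written down explicitly. Assuming the data follow $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$ (notation introduced here for illustration), and taking the expectation over training sets and noise, the expected error of a fitted model $\hat{f}$ at a point $x$ decomposes as:

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
$$

Only the first two terms depend on the model, which is why reducing one at the expense of the other can still lower the total error.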

#### 3. Gradient Descent
- **How it works:** Gradient Descent is an optimization algorithm used to minimize the cost function in machine learning models. It iteratively adjusts the parameters of the model by moving in the direction of the negative gradient of the cost function, with the step size controlled by a learning rate.
- **Variants:**
  - **Batch Gradient Descent:** Uses the entire dataset to compute the gradient. It can be slow and computationally expensive for large datasets.
  - **Stochastic Gradient Descent (SGD):** Uses one data point at a time to compute the gradient. Each update is cheap, so it scales to large datasets, but the gradient estimate is noisy.
  - **Mini-batch Gradient Descent:** Uses a small random subset of the data to compute the gradient. It balances the stable gradient estimates of Batch Gradient Descent with the cheap updates of SGD.
- **Usage:** Use Batch Gradient Descent for smaller datasets, SGD for large and streaming datasets, and Mini-batch Gradient Descent for a compromise between the two.
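
The three variants differ only in how many samples feed each gradient step, so one function can cover all of them. The following is a minimal sketch, assuming a linear model trained with mean squared error on synthetic data; the function name `gradient_descent`, the learning rates, epoch counts, and the data itself are illustrative choices rather than recommended settings.

```python
import numpy as np


def gradient_descent(X, y, batch_size, lr=0.1, epochs=200, seed=0):
    """Minimize the mean squared error of a linear model y ~ X @ w.

    batch_size == len(X) -> batch gradient descent
    batch_size == 1      -> stochastic gradient descent
    anything in between  -> mini-batch gradient descent
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(epochs):
        order = rng.permutation(n_samples)  # reshuffle every epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            residual = X[idx] @ w - y[idx]
            grad = 2.0 * X[idx].T @ residual / len(idx)  # gradient of the batch MSE
            w -= lr * grad  # step against the gradient
    return w


# Synthetic data: y = 1 + 3 * x plus a little noise (made-up values).
data_rng = np.random.default_rng(42)
X = np.column_stack([np.ones(200), data_rng.normal(size=200)])
true_w = np.array([1.0, 3.0])
y = X @ true_w + data_rng.normal(0.0, 0.1, size=200)

w_batch = gradient_descent(X, y, batch_size=len(X))
w_sgd = gradient_descent(X, y, batch_size=1, lr=0.01, epochs=50)
w_mini = gradient_descent(X, y, batch_size=32, lr=0.05)

print("batch:     ", w_batch)
print("SGD:       ", w_sgd)
print("mini-batch:", w_mini)
```

SGD is given a smaller learning rate here because its single-sample gradients are noisier; all three runs should end up close to `true_w`.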

### Data Manipulation Tasks

#### Pandas Data Manipulation
Let's start with the given CSV file (`Sample_Sales_Data.csv`).

**Tasks:**
- **Load the data into a Pandas DataFrame.**
- **Calculate the total sales for each product category.**
- **Find the top 5 products with the highest sales.**

```python
import pandas as pd

# Load the data into a Pandas DataFrame
df = pd.read_csv('Sample_Sales_Data.csv')

# Calculate the total sales for each product category
total_sales_by_category = df.groupby('ProductCategory')['TotalSales'].sum()

# Find the top 5 products with the highest sales
top_5_products = df.groupby('ProductName')['TotalSales'].sum().nlargest(5)

# Output the results
print("Total sales by category:")
print(total_sales_by_category)
print("\nTop 5 products with the highest sales:")
print(top_5_products)
```


```
Total sales by category:
ProductCategory
Clothing         6800
Electronics    186000
Furniture       21250
Name: TotalSales, dtype: int64

Top 5 products with the highest sales:
ProductName
Smartphone    120000
Laptop         50000
Sofa           10000
Monitor         9000
Table           9000
Name: TotalSales, dtype: int64
```
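
Because the CSV file itself is not included here, the same steps can be tried on a small in-memory table. This is a sketch with made-up rows: only the column names `ProductCategory`, `ProductName`, and `TotalSales` come from the code above, and every value is hypothetical. It also shows how `nlargest` treats ties, which matters when two products have equal sales.

```python
import io

import pandas as pd

# Hypothetical rows; the real Sample_Sales_Data.csv is not reproduced here.
csv_text = """ProductCategory,ProductName,TotalSales
Electronics,Laptop,1000
Electronics,Laptop,1500
Electronics,Monitor,300
Furniture,Table,300
Furniture,Chair,120
Clothing,Jacket,80
"""
df = pd.read_csv(io.StringIO(csv_text))

# Sum sales per category and per product
by_category = df.groupby("ProductCategory")["TotalSales"].sum()
by_product = df.groupby("ProductName")["TotalSales"].sum()

# Default behavior: exactly n rows, ties at the cutoff are cut off
top_2 = by_product.nlargest(2)

# keep="all": every row tied with the n-th value is included
top_2_with_ties = by_product.nlargest(2, keep="all")

print(by_category)
print(top_2)
print(top_2_with_ties)
```

With `keep="all"` the result can contain more than `n` rows, so it is a reasonable choice when dropping one of several tied products would be misleading.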


#### NumPy Array Operations
Let's create a 3x3 NumPy array with random integers and perform the specified operations.

**Tasks:**
- **Compute the sum of all elements in the array.**
- **Find the mean and standard deviation of the elements.**
- **Normalize the array (subtract the mean and divide by the standard deviation).**

```python
import numpy as np

# Create a 3x3 NumPy array with random integers
array = np.random.randint(1, 100, size=(3, 3))

# Compute the sum of all elements in the array
sum_of_elements = np.sum(array)

# Find the mean and standard deviation of the elements
mean = np.mean(array)
std_dev = np.std(array)

# Normalize the array
normalized_array = (array - mean) / std_dev

# Output the results
print("Array:")
print(array)
print("\nSum of all elements:", sum_of_elements)
print("Mean of elements:", mean)
print("Standard deviation of elements:", std_dev)
print("\nNormalized array:")
print(normalized_array)
```


```
Array:
[[15 36 15]
 [41 22 21]
 [95 38 12]]

Sum of all elements: 295
Mean of elements: 32.77777777777778
Standard deviation of elements: 24.256855973691128

Normalized array:
[[-0.73289703  0.13283759 -0.73289703]
 [ 0.33896488 -0.44431883 -0.48554428]
 [ 2.56513962  0.2152885  -0.85657341]]
```
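
A quick sanity check on the normalization is that the result should have mean 0 and standard deviation 1. The sketch below reuses the array printed above so the numbers are reproducible. The `normalize` helper and its zero-variance guard are additions made here for illustration, as is the per-column variant, which is a common alternative when each column is a separate feature.

```python
import numpy as np

# The array shown in the output above, hard-coded for reproducibility.
array = np.array([[15, 36, 15],
                  [41, 22, 21],
                  [95, 38, 12]])


def normalize(values):
    """Subtract the overall mean and divide by the overall standard deviation."""
    std = values.std()
    if std == 0:
        # A constant array has no spread, so the division is undefined.
        raise ValueError("cannot normalize an array with zero standard deviation")
    return (values - values.mean()) / std


normalized = normalize(array)
print("mean after normalization:", normalized.mean())
print("std after normalization: ", normalized.std())

# Per-column variant: broadcasting applies each column's own mean and std.
col_normalized = (array - array.mean(axis=0)) / array.std(axis=0)
print("column means:", col_normalized.mean(axis=0))
print("column stds: ", col_normalized.std(axis=0))
```

The printed means will be very close to 0 rather than exactly 0 because of floating-point rounding.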
