### 1. What is a parameter?

A **parameter** is a numerical value that describes a characteristic of a **population** in statistics. It is fixed but usually unknown, as it represents the true value for the entire population.

**Example:**

* The average height of all adults in a country (true population mean, μ) is a parameter.
* In practice, we estimate it using a **statistic** (like the sample mean, x̄).
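
To make the distinction concrete, here is a minimal sketch that simulates a hypothetical population of heights (the numbers are invented for illustration) and compares the population mean with the mean of one random sample. The names `population`, `mu`, and `x_bar` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 100,000 adult heights in cm (made-up values).
population = rng.normal(loc=170, scale=10, size=100_000)
mu = population.mean()      # parameter: fixed for this population

# In practice only a sample is observed.
sample = rng.choice(population, size=200, replace=False)
x_bar = sample.mean()       # statistic: varies from sample to sample

print(f"population mean (mu) = {mu:.2f}")
print(f"sample mean (x_bar)  = {x_bar:.2f}")
```

Drawing a different sample would give a slightly different `x_bar`, while `mu` stays the same.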


### 2. What is correlation? What does negative correlation mean?

**Correlation** is a statistical measure of the strength and direction of the relationship between two variables. It is summarized by a correlation coefficient that ranges from **-1 to +1**.

* A **positive correlation** means that as one variable increases, the other tends to increase.
* A **negative correlation** means that as one variable increases, the other tends to decrease (inverse relationship).

**Example:** Hours studied ↑ vs. errors made in an exam ↓ → negative correlation.
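
As a rough illustration of the example above, the following sketch computes the Pearson coefficient with NumPy. The six data points are made up for the demonstration.

```python
import numpy as np

# Made-up data: hours studied and exam errors for six students.
hours_studied = np.array([1, 2, 3, 4, 5, 6])
errors_made = np.array([12, 10, 9, 6, 4, 3])

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
r = np.corrcoef(hours_studied, errors_made)[0, 1]
print(f"Pearson r = {r:.3f}")
```

A value close to -1 indicates a strong negative correlation for this toy data.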


### 3. Define Machine Learning. What are the main components in Machine Learning?

**Machine Learning (ML)** is a subset of artificial intelligence where systems learn patterns from data and improve their performance on tasks **without being explicitly programmed**.

**Main components of ML:**

* **Dataset:** Collection of input data used for training and testing.
* **Features:** Independent variables or attributes used for prediction.
* **Model/Algorithm:** Mathematical method (e.g., regression, decision tree) that learns from data.
* **Training Process:** Feeding data to the model to adjust parameters.
* **Evaluation:** Measuring performance using metrics (e.g., accuracy, precision).
* **Prediction:** Applying the trained model to unseen data.


### 4. How does loss value help in determining whether the model is good or not?

The **loss value** measures how far a model’s predictions are from the actual values; it quantifies the model’s **error**.

* A **low loss value** means predictions are close to the true outputs. Low loss on the training data alone is not enough; the loss on held-out (validation or test) data should also be low.
* A **high loss value** means large errors: the model has not yet learned the pattern or is underfitting.
* Tracking loss during training shows whether the model is improving. Training loss that keeps falling while validation loss rises suggests **overfitting**; both staying high suggests **underfitting**.
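
One common loss for regression is mean squared error (MSE). The sketch below, with invented predictions from two hypothetical models, shows how a lower loss corresponds to predictions that sit closer to the true values.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

y_true = [3.0, 5.0, 7.0, 9.0]
close_preds = [2.9, 5.2, 6.8, 9.1]   # hypothetical model A
far_preds = [1.0, 8.0, 4.0, 12.0]    # hypothetical model B

loss_a = mse(y_true, close_preds)
loss_b = mse(y_true, far_preds)
print(f"model A loss = {loss_a:.3f}, model B loss = {loss_b:.3f}")
```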


### 5. What are continuous and categorical variables?


* **Continuous Variables:** Numerical variables that can take an **infinite number of values within a range**. They are measurable.

  * *Examples:* height, weight, temperature, income.

* **Categorical Variables:** Variables that represent **distinct categories or groups** with no inherent numerical meaning.

  * *Examples:* gender, blood type, car brand, marital status.


### 6. How do we handle categorical variables in Machine Learning? What are the common techniques?

Categorical variables must be converted into a numerical format for most ML algorithms. Common techniques include:

* **Label Encoding:** Assigns each category a unique integer (e.g., `Male=0, Female=1`).
* **One-Hot Encoding:** Creates binary columns for each category (e.g., `Red=[1,0,0], Blue=[0,1,0]`).
* **Ordinal Encoding:** Assigns ordered integers when categories have a natural order (e.g., `Low=1, Medium=2, High=3`).
* **Target Encoding:** Replaces categories with statistical measures (e.g., mean of target variable for that category).

👉 Choice depends on whether the variable is **nominal** (no order) or **ordinal** (with order).
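
A minimal sketch of the nominal versus ordinal choice, using pandas and a made-up DataFrame. The column names and the `size_order` mapping are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "color": ["Red", "Blue", "Green", "Blue"],   # nominal: no order
    "size": ["Low", "High", "Medium", "Low"],    # ordinal: has an order
})

# Nominal: one binary column per category.
color_dummies = pd.get_dummies(df["color"], prefix="color")

# Ordinal: explicit mapping that preserves the order.
size_order = {"Low": 1, "Medium": 2, "High": 3}
df["size_encoded"] = df["size"].map(size_order)

print(color_dummies)
print(df)
```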


### 7. What do you mean by training and testing a dataset?

* **Training Dataset:** A portion of the data used to **teach the model** by adjusting its parameters so it can learn patterns and relationships.
* **Testing Dataset:** A separate portion of the data used **after training** to evaluate how well the model generalizes to unseen data.

👉 Splitting lets us check that the model has learned general patterns rather than memorized the training data, so its test performance is a fair estimate of how it will do on new, real-world data.



### 8. What is sklearn.preprocessing?

**`sklearn.preprocessing`** is a module in **Scikit-learn** that provides tools to **transform and scale data** before feeding it into machine learning models.

It includes methods for:

* **Scaling & Normalization:** `StandardScaler`, `MinMaxScaler`.
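* **Encoding Categorical Variables:** `OneHotEncoder` and `OrdinalEncoder` for features, `LabelEncoder` for target labels.
* **Generating Polynomial Features:** `PolynomialFeatures`.

Missing-value imputation is handled by a separate module: `SimpleImputer` lives in `sklearn.impute`, not in `sklearn.preprocessing`.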

👉 These preprocessing steps ensure data is in the right format and scale for better model performance.


### 9. What is a Test set?

A **Test set** is a subset of the dataset that is **kept aside during training** and used only to **evaluate the final model’s performance**.

* It represents **unseen data** to check how well the model generalizes.
* Metrics like **accuracy, precision, recall, F1-score, RMSE** are calculated on the test set.
* Helps detect overfitting: a model that scores well on training data but poorly on the test set has not generalized.


### 10. How do we split data for model fitting (training and testing) in Python? How do you approach a Machine Learning problem?

**Splitting Data in Python:**
We typically use **`train_test_split`** from Scikit-learn:

```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

* `test_size=0.2` → 20% data for testing, 80% for training.
* `random_state` ensures reproducibility.

---

**Approach to a Machine Learning Problem:**

1. **Understand the Problem:** Define objectives and target variable.
2. **Collect & Explore Data:** Perform EDA (Exploratory Data Analysis).
3. **Preprocess Data:** Handle missing values, encode categories, scale features (fit these transformations on the training portion only, to avoid data leakage).
4. **Split Data:** Train/Test (and sometimes validation) split.
5. **Select Model:** Choose algorithm(s) suitable for the problem.
6. **Train the Model:** Fit on training data.
7. **Evaluate Performance:** Test using metrics (accuracy, RMSE, etc.).
8. **Tune Hyperparameters:** Optimize model performance.
9. **Deploy & Monitor:** Put the model into production and track results.
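
The core of this workflow (split, preprocess, train, evaluate) can be sketched in a few lines. This is only one possible arrangement: the Iris dataset, logistic regression, and accuracy as the metric are choices made for illustration, not part of the steps above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Steps 1-2: a small, well-known classification dataset.
X, y = load_iris(return_X_y=True)

# Step 4: hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Steps 3, 5, 6: the scaler and the model are fitted on training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Step 7: evaluate on the held-out test set.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy = {accuracy:.2f}")
```

Wrapping the scaler and the model in one pipeline keeps the test rows out of the preprocessing fit.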


### 11. Why do we have to perform EDA before fitting a model to the data?

**`Exploratory Data Analysis (EDA)`** is the process of **analyzing and visualizing data** to understand its structure, patterns, and anomalies before applying machine learning models.

It helps in:

* **Identifying Missing Values & Outliers:** Detects gaps or extreme values that can affect model performance.
* **Understanding Data Distributions:** Reveals how features and the target variable are distributed.
* **Detecting Relationships & Correlations:** Highlights dependencies between features which may inform feature selection.
* **Assessing Class Imbalance:** Checks if the target variable has skewed classes that need handling.

👉 Performing EDA shows whether the data is **clean, consistent, and properly structured**, so that problems can be fixed before they hurt model accuracy and reliability.
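
A few pandas calls cover most of the checks listed above. The sketch below uses a small invented dataset (the column names `age`, `income`, and `churned` are assumptions) that contains one missing value, one outlier, and an imbalanced target.

```python
import numpy as np
import pandas as pd

# Made-up dataset with a missing value, an outlier, and imbalanced classes.
df = pd.DataFrame({
    "age": [25, 32, 47, np.nan, 51, 38],
    "income": [40_000, 52_000, 61_000, 58_000, 1_000_000, 45_000],
    "churned": [0, 0, 0, 0, 1, 0],
})

missing_counts = df.isna().sum()                            # missing values per column
summary = df.describe()                                     # basic distribution statistics
class_share = df["churned"].value_counts(normalize=True)    # class balance
correlations = df.corr(numeric_only=True)                   # pairwise Pearson correlations

print(missing_counts, summary, class_share, correlations, sep="\n\n")
```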


### 12. What is correlation?

**`Correlation`** is a statistical measure that describes the **strength and direction of a relationship** between two variables.

It helps in understanding how one variable **changes in relation to another**.

Key points:

* **Positive Correlation:** Both variables increase or decrease together.
* **Negative Correlation:** One variable increases while the other decreases.
* **No Correlation:** No linear relationship between the variables (coefficient near 0); a non-linear relationship may still exist.
* **Common Measure:** `Pearson correlation coefficient` (ranges from -1 to +1).

👉 Correlation is useful in **feature selection**, identifying redundant variables, and understanding patterns in the dataset before modeling.


### 13. What does negative correlation mean?

**`Negative Correlation`** refers to a relationship between two variables where **as one variable increases, the other decreases**, and vice versa.

Key points:

* **Direction:** Inverse; when one goes up, the other tends to go down. This is a tendency, not strict inverse proportionality.
* **Strength:** Measured by the correlation coefficient (ranges from -1 to 0 for negative correlation).
* **Example:** As temperature rises, heating energy consumption usually decreases.
* **Use in ML:** Helps identify features that move in opposite directions, which can inform feature selection and model interpretation.

👉 Negative correlation indicates an **inverse relationship** between variables.


### 14. How can you find correlation between variables in Python?

Correlation between variables can be computed with **pandas** and visualized with libraries such as **seaborn**.

Common methods:

* **Using `pandas.DataFrame.corr()`:**

```python
import pandas as pd

# df is assumed to be an existing DataFrame; numeric_only=True skips text columns
correlation_matrix = df.corr(numeric_only=True)
print(correlation_matrix)
```

This computes pairwise correlation (Pearson by default) between numerical features.

* **Visualizing with a Heatmap (Seaborn):**

```python
import seaborn as sns
import matplotlib.pyplot as plt

# numeric_only=True avoids errors when df contains non-numeric columns
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')
plt.show()
```

Heatmaps make it easier to spot strong positive or negative correlations.

* **Other Correlation Methods:** `df.corr(method='spearman')` or `method='kendall'` compute rank-based correlations, which capture monotonic relationships even when they are not linear.

👉 These techniques help identify **relationships between features** and guide feature selection or engineering for machine learning.
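
To see why the choice of method matters, the sketch below compares Pearson and Spearman on made-up data where `y` grows monotonically but not linearly with `x`.

```python
import pandas as pd

# Made-up monotonic but non-linear relationship: y = x ** 3.
df = pd.DataFrame({"x": range(1, 11)})
df["y"] = df["x"] ** 3

pearson_r = df.corr(method="pearson").loc["x", "y"]
spearman_r = df.corr(method="spearman").loc["x", "y"]

print(f"Pearson  = {pearson_r:.3f}")
print(f"Spearman = {spearman_r:.3f}")
```

Spearman works on ranks, so a perfectly monotonic relationship scores 1 even though the Pearson coefficient is lower.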


### 15. What is causation? Explain difference between correlation and causation with an example.

**`Causation`** (or causal relationship) occurs when a change in one variable **directly causes a change** in another variable.

Key points:

* **Direction of Influence:** One variable directly affects the other.
* **Requires Evidence:** Usually determined through experiments or controlled studies, not just observation.

**Difference between Correlation and Causation:**

| Aspect           | Correlation                                  | Causation                                 |
| ---------------- | -------------------------------------------- | ----------------------------------------- |
| **Definition**   | Measures how two variables move together     | One variable **directly affects** another |
| **Relationship** | May be direct, inverse, or even coincidental | Must be a direct cause-effect link        |
| **Example**      | Ice cream sales ↑ and drowning incidents ↑   | Heating the room ↑ → Room temperature ↑   |
| **Implication**  | Does not imply one causes the other          | Implies a causal effect                   |

👉 **Important:** Correlation does **not imply causation**. Two variables can be correlated due to coincidence or a third confounding factor.
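
The ice cream example can be simulated. In this sketch, temperature is a hypothetical confounder that drives both quantities; all coefficients and noise levels are invented. The two variables are strongly correlated, yet the correlation largely disappears once the effect of temperature is removed from each.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical confounder: daily temperature drives both quantities.
temperature = rng.uniform(10, 35, size=1000)
ice_cream_sales = 2.0 * temperature + rng.normal(0, 1, size=1000)
drownings = 0.5 * temperature + rng.normal(0, 1, size=1000)

raw_r = np.corrcoef(ice_cream_sales, drownings)[0, 1]

def residuals(y, x):
    """Part of y not explained by a straight-line fit on x."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return y - (slope * x + intercept)

# Correlation after removing the linear effect of temperature from both.
partial_r = np.corrcoef(
    residuals(ice_cream_sales, temperature),
    residuals(drownings, temperature),
)[0, 1]

print(f"raw correlation              = {raw_r:.3f}")
print(f"after controlling for temp.  = {partial_r:.3f}")
```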


### 16. What is an Optimizer? What are different types of optimizers? Explain each with an example.

An **`optimizer`** is an algorithm used in **machine learning and deep learning** to **adjust the model’s parameters (weights and biases) during training** in order to **minimize the loss function** and improve model performance.

Optimizers control **how the model learns** and can significantly affect **training speed and accuracy**.

**Common Types of Optimizers:**

* **1. Gradient Descent (GD):**
  Updates weights using the gradient of the loss function with respect to the parameters.

  * **Batch Gradient Descent:** Uses the entire dataset for one update.

    ```python
    # Conceptual
    weights = weights - learning_rate * gradient(loss)
    ```
  * **Example:** Training a linear regression model on a small dataset.

* **2. Stochastic Gradient Descent (SGD):**
  Updates weights after **each training sample**, so each update is cheap but noisier.

  * **Example:** Large datasets where full batch computation is costly.

* **3. Mini-batch Gradient Descent:**
  Compromise between GD and SGD; updates weights using small batches of data.

  * **Example:** Commonly used in deep learning (batch size = 32, 64, etc.).

* **4. Momentum:**
  Accelerates gradient descent by considering **past updates** to reduce oscillations.

  * **Example:** Helps speed up training in neural networks with steep or flat regions.

* **5. Adaptive Methods:**

  * **Adagrad:** Adjusts learning rate per parameter based on historical gradients.
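  * **RMSprop:** Modifies Adagrad by using an exponentially decaying average of squared gradients, so the learning rate does not shrink toward zero; works well in non-stationary settings.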
  * **Adam:** Combines Momentum and RMSprop for fast and efficient training.

    ```python
    # Using Adam in TensorFlow/Keras
    model.compile(optimizer='adam', loss='mse')
    ```

👉 Optimizers are **crucial** for efficiently finding the best model parameters and ensuring **faster convergence** during training.
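
The update rules for plain gradient descent and momentum can be sketched in pure Python. The toy loss `f(w) = (w - 3) ** 2`, the learning rate, and `beta` are assumptions chosen for illustration; real libraries apply the same idea to many parameters at once.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3) ** 2."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, learning_rate=0.1, steps=200):
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

def momentum(w, learning_rate=0.1, beta=0.9, steps=200):
    velocity = 0.0
    for _ in range(steps):
        # Velocity is a decaying sum of past gradients.
        velocity = beta * velocity + grad(w)
        w = w - learning_rate * velocity
    return w

w_gd = gradient_descent(w=0.0)
w_momentum = momentum(w=0.0)
print(w_gd, w_momentum)
```

Both runs start at `w = 0` and should end close to the minimizer `w = 3`.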


### 17. What is sklearn.linear_model?

**`sklearn.linear_model`** is a module in **Scikit-learn** that provides a collection of **linear models** for regression and classification tasks.

It includes algorithms that assume a **linear relationship between input features and the target variable**.

Key features and classes:

* **Linear Regression:** `LinearRegression()` – Fits a line to predict continuous values.
* **Logistic Regression:** `LogisticRegression()` – Used for binary or multiclass classification.
* **Ridge & Lasso Regression:** `Ridge()`, `Lasso()` – Linear regression with **L2 or L1 regularization** to prevent overfitting.
* **ElasticNet:** `ElasticNet()` – Combines L1 and L2 regularization.
* **SGDRegressor / SGDClassifier:** Implements **stochastic gradient descent** for linear models.

👉 This module is widely used for **predictive modeling** when relationships are approximately linear and provides tools for **regularization, feature selection, and efficient optimization**.
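
As a small illustration of regularization, the sketch below fits `LinearRegression` and `Ridge` on made-up data that follows `y = 2x + 1` exactly. The value `alpha=10.0` is arbitrary and chosen only to make the shrinkage visible.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Made-up data following y = 2x + 1 exactly.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)   # alpha chosen arbitrarily for illustration

print("OLS coef:  ", ols.coef_[0], "intercept:", ols.intercept_)
print("Ridge coef:", ridge.coef_[0], "intercept:", ridge.intercept_)
```

The L2 penalty pulls the Ridge coefficient toward zero, so it ends up smaller than the unregularized one.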


### 18. What does model.fit() do? What arguments must be given?

**`model.fit()`** is a method in **Scikit-learn** used to **train a machine learning model** on a given dataset. It **learns the relationships between input features and the target variable** by adjusting the model’s parameters.

Key points:

* **Purpose:** Fits the model to the training data so it can make predictions on new data.
* **Common Arguments:**

  * `X` → Input features (2D array or DataFrame)
  * `y` → Target variable (1D array, Series, or DataFrame); required for supervised models, omitted for unsupervised ones such as clustering
  * Optional parameters may include `sample_weight` or other model-specific options

Example:

```python
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)  # X_train = features, y_train = target
```

👉 After `fit()`, the model has **learned the optimal parameters** (e.g., weights in linear regression) and is ready for prediction with `model.predict()`.


### 19. What does model.predict() do? What arguments must be given?

**`model.predict()`** is a method in **Scikit-learn** used to **make predictions using a trained machine learning model**.

Key points:

* **Purpose:** Uses the model’s learned parameters (from `model.fit()`) to predict the target variable for new input data.
* **Common Arguments:**

  * `X` → Input features (2D array or DataFrame) for which predictions are required

Example:

```python
from sklearn.linear_model import LinearRegression

# Assuming model is already trained
predictions = model.predict(X_test)  # X_test = new input features
```

👉 The output is an **array of predicted values** (continuous for regression, class labels for classification) based on the input data.
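
For a classifier, `predict()` returns class labels, and many scikit-learn classifiers also offer `predict_proba()` for class probabilities. A minimal sketch on made-up, clearly separable data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up data: one feature, two well-separated classes.
X_train = np.array([[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X_train, y_train)

X_new = np.array([[0.5], [9.5]])          # must be 2D: (n_samples, n_features)
labels = clf.predict(X_new)               # hard class labels
probabilities = clf.predict_proba(X_new)  # one column per class

print(labels)
print(probabilities)
```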


### 20. What are continuous and categorical variables?

**Variables** in a dataset are generally classified into **continuous** and **categorical** based on the type of values they hold.

* **Continuous Variables:**

  * Take **numerical values** that can vary **infinitely within a range**.
  * Can be **measured** and often allow decimal points.
  * **Examples:** Age, Height, Weight, Temperature, Income.

* **Categorical Variables:**

  * Take **discrete values** representing **categories or groups**.
  * Often **non-numeric** or encoded as numbers for modeling.
  * **Examples:** Gender (Male/Female), Color (Red/Blue/Green), Payment Type (Cash/Card).

👉 Correctly identifying these variables is important for **data preprocessing, feature encoding, and choosing the right ML algorithms**.
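
In pandas, column dtypes give a quick first guess at which variables are continuous and which are categorical. The DataFrame below is invented for illustration. Note that dtype is only a heuristic: categories stored as integer codes will look numeric.

```python
import pandas as pd

df = pd.DataFrame({
    "height_cm": [170.2, 165.5, 180.1],
    "income": [40_000.0, 52_000.0, 61_000.0],
    "payment_type": ["Cash", "Card", "Card"],
    "color": ["Red", "Blue", "Green"],
})

# Numeric columns are candidates for continuous variables.
continuous_cols = df.select_dtypes(include="number").columns.tolist()
# Everything else is treated as categorical here.
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

print("continuous: ", continuous_cols)
print("categorical:", categorical_cols)
```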


### 21. What is feature scaling? How does it help in Machine Learning?

**`Feature Scaling`** is the process of **normalizing or standardizing the range of independent variables (features)** in a dataset so that they have comparable scales.

Key points:

* **Purpose:** Ensures that **all features contribute equally** to the model and prevents features with larger ranges from dominating.
* **Common Techniques:**

  * **Standardization:** `StandardScaler()` → scales features to have **mean = 0** and **standard deviation = 1**
  * **Normalization:** `MinMaxScaler()` → scales features to the **range `[0, 1]`**
* **Importance in ML:**

  * Improves **convergence speed** for gradient-based algorithms (e.g., **Gradient Descent, Neural Networks**)
  * Helps algorithms that rely on **distance metrics** (e.g., **KNN, K-Means, SVM**) perform better

Example:

```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```

👉 Feature scaling ensures the model **learns efficiently and avoids bias toward features with larger magnitudes**.
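
To show what standardization actually computes, this sketch applies the z-score formula by hand and compares it with `StandardScaler`. The feature values (age and income) are made up.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up features on very different scales: age (years), income (currency).
X = np.array(
    [[25, 40_000], [32, 52_000], [47, 61_000], [51, 58_000]], dtype=float
)

# Standardization by hand: z = (x - mean) / std, computed per column.
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

X_scaled = StandardScaler().fit_transform(X)

print(np.allclose(X_manual, X_scaled))
```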


### 22. How do we perform scaling in Python?

**`Feature Scaling in Python`** is typically done using **Scikit-learn’s preprocessing module**.

Common methods:

* **1. Standardization (Z-score Scaling):**
  Scales features to have **mean = 0** and **standard deviation = 1**.

  ```python
  from sklearn.preprocessing import StandardScaler

  scaler = StandardScaler()
  X_scaled = scaler.fit_transform(X)  # X = input features
  ```

* **2. Normalization (Min-Max Scaling):**
  Scales features to a **specific range**, usually `[0, 1]`.

  ```python
  from sklearn.preprocessing import MinMaxScaler

  scaler = MinMaxScaler()
  X_scaled = scaler.fit_transform(X)
  ```

* **3. MaxAbs Scaling:**
  Scales features by **maximum absolute value**, preserving sparsity.

  ```python
  from sklearn.preprocessing import MaxAbsScaler

  scaler = MaxAbsScaler()
  X_scaled = scaler.fit_transform(X)
  ```

* **4. Robust Scaling:**
  Uses **median and IQR**, robust to outliers.

  ```python
  from sklearn.preprocessing import RobustScaler

  scaler = RobustScaler()
  X_scaled = scaler.fit_transform(X)
  ```

👉 Scaling ensures **all features are on a similar scale**, which improves **model training efficiency and performance**, especially for **distance-based and gradient-based algorithms**.
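
When the data is split into training and test sets, a common practice is to fit the scaler on the training rows only and reuse it for the test rows, so that no information from the test set leaks into preprocessing. A minimal sketch with a made-up single feature:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X = np.arange(20, dtype=float).reshape(-1, 1)   # made-up single feature

X_train, X_test = train_test_split(X, test_size=0.25, random_state=0)

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn min/max from training rows
X_test_scaled = scaler.transform(X_test)        # reuse them; do not refit

print(X_train_scaled.min(), X_train_scaled.max())
print(X_test_scaled.ravel())
```

Scaled test values can fall slightly outside `[0, 1]` when the test set contains values beyond the training range; that is expected.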


### 23. What is sklearn.preprocessing?

**`sklearn.preprocessing`** is a module in **Scikit-learn** that provides tools to **prepare and transform data** before feeding it into machine learning models.

It includes methods for:

* **Scaling & Normalization:** `StandardScaler`, `MinMaxScaler`, `RobustScaler`.
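* **Encoding Categorical Variables:** `OneHotEncoder` and `OrdinalEncoder` for features, `LabelEncoder` for target labels.
* **Generating Polynomial Features:** `PolynomialFeatures`.

Missing-value imputation is handled by a separate module: `SimpleImputer` lives in `sklearn.impute`, not in `sklearn.preprocessing`.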

👉 These preprocessing steps ensure the data is **in the right format and scale**, improving model performance and training efficiency.


### 24. How do we split data for model fitting (training and testing) in Python?

**`Splitting Data for Model Fitting`** is the process of dividing a dataset into **training and testing sets** to evaluate a machine learning model’s performance on unseen data.

In Python, this is commonly done using **Scikit-learn’s `train_test_split`**:

```python
from sklearn.model_selection import train_test_split

# X = features, y = target variable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

* **Arguments:**

  * `X` → Input features
  * `y` → Target variable
  * `test_size` → Fraction of data for testing (e.g., 0.2 = 20%)
  * `random_state` → Ensures reproducibility

* **Purpose:**

  * **Training set:** Used to **train/fit** the model
  * **Testing set:** Used to **evaluate** model performance on unseen data

👉 Proper splitting helps **detect overfitting** and gives a **reliable estimate of model performance** on new data.
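
For classification with imbalanced classes, `train_test_split` also accepts a `stratify` argument that keeps class proportions similar in both parts. A sketch with a made-up target of 90 negatives and 10 positives:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up imbalanced target: 90 negatives, 10 positives.
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("positives in train:", int(y_train.sum()))
print("positives in test: ", int(y_test.sum()))
```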


### 25. Explain data encoding?

**`Data Encoding`** is the process of **converting categorical variables into numerical values** so that machine learning models can process them, as most algorithms require numerical input.

Common methods:

* **Label Encoding:**
  Assigns a **unique integer** to each category.

  ```python
  from sklearn.preprocessing import LabelEncoder

  le = LabelEncoder()
  y_encoded = le.fit_transform(y)  # y = categorical target
  ```

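  *Example:* `['Red', 'Blue', 'Green'] → [2, 0, 1]` (integers follow alphabetical order: Blue=0, Green=1, Red=2). `LabelEncoder` is intended for target labels; for input features, `OrdinalEncoder` is the equivalent.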

* **One-Hot Encoding:**
  Creates **binary columns** for each category to avoid implying ordinal relationships.

  ```python
  from sklearn.preprocessing import OneHotEncoder

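  ohe = OneHotEncoder()
  # X must be 2D (n_samples, n_features); the result is a sparse matrix by default
  X_encoded = ohe.fit_transform(X)  # call .toarray() to view it as a dense array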
  ```

  *Example:* `[['Red'], ['Blue']] → [[0, 1], [1, 0]]` (columns are ordered alphabetically: Blue, Red)

* **Ordinal Encoding:**
  Assigns **ordered integers** to categories with a natural ranking.
  *Example:* `['Low', 'Medium', 'High'] → [0, 1, 2]`

👉 Data encoding ensures categorical features are **interpretable by ML models** while preserving relationships or avoiding unintended biases.
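
The ordinal case has no code above, so here is a minimal sketch using scikit-learn's `OrdinalEncoder` with an explicit category order. The sample values are made up.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

X = np.array([["Low"], ["High"], ["Medium"], ["Low"]])

# Without an explicit order the categories would be sorted alphabetically
# (High, Low, Medium), which is not the intended ranking.
encoder = OrdinalEncoder(categories=[["Low", "Medium", "High"]])
X_encoded = encoder.fit_transform(X)

print(X_encoded.ravel())
```

Passing `categories` explicitly is what makes the integers reflect the Low < Medium < High ranking.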
