**Theory**

---

### 1. **What is a parameter?**

A **parameter** is an internal variable in a machine learning model that is learned from the training data (e.g., weights in linear regression or neural networks).
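As a small illustration, the sketch below fits scikit-learn's `LinearRegression` to synthetic data generated from y = 3x + 2 (the data and variable names are made up for this example). After fitting, the learned parameters are exposed as `coef_` and `intercept_`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data generated from y = 3x + 2 (illustrative only).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = 3.0 * X.ravel() + 2.0

model = LinearRegression()
model.fit(X, y)

# The parameters learned from the data: one weight per feature plus a bias.
learned_weight = model.coef_[0]
learned_bias = model.intercept_
print(learned_weight, learned_bias)
```

The printed values should be very close to the 3 and 2 used to generate the data.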

---

### 2. **What is correlation?**

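**Correlation** measures the strength and direction of the linear relationship between two variables. The Pearson correlation coefficient ranges from -1 (perfect negative) to +1 (perfect positive), with 0 indicating no linear relationship.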
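To make the range concrete, here is a minimal sketch using NumPy's `np.corrcoef` on made-up arrays: one that rises exactly linearly with `x` and one that falls exactly linearly.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pos = 2.0 * x + 1.0      # rises as x rises
y_neg = -0.5 * x + 10.0    # falls as x rises

# np.corrcoef returns a 2x2 correlation matrix; entry [0, 1] is r(x, y).
r_pos = np.corrcoef(x, y_pos)[0, 1]
r_neg = np.corrcoef(x, y_neg)[0, 1]
print(r_pos, r_neg)
```

Because both relationships are exactly linear, the coefficients land at the two extremes of the range (up to floating-point error).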

---

### 3. **What does negative correlation mean?**

**Negative correlation** means that as one variable increases, the other decreases. For example, more hours spent watching TV might correlate with lower test scores.

---

### 4. **Define Machine Learning. What are the main components in Machine Learning?**

**Machine Learning** is a subset of AI that enables systems to learn patterns from data and make predictions without being explicitly programmed for the task.
Main components:

* **Data**
* **Model**
* **Loss function**
* **Optimizer**
* **Evaluation**

---

### 5. **How does loss value help in determining whether the model is good or not?**

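A **lower loss value** indicates that the model's predictions are closer to the actual values. Loss on the training set alone is not enough, though: a model is good only if the loss is also low on validation or test data. A low training loss combined with a high validation loss signals overfitting.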
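As a rough illustration, the sketch below computes mean squared error (one common loss for regression) for two hypothetical sets of predictions. The function and the numbers are made up for this example.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

y_true = [3.0, 5.0, 7.0]
preds_a = [2.5, 5.5, 7.0]   # hypothetical model A: close to the targets
preds_b = [1.0, 8.0, 4.0]   # hypothetical model B: far from the targets

loss_a = mean_squared_error(y_true, preds_a)
loss_b = mean_squared_error(y_true, preds_b)
print(loss_a, loss_b)
```

Model A's loss is much smaller than model B's, which matches the intuition that its predictions sit closer to the actual values.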

---

### 6. **What are continuous and categorical variables?**

* **Continuous variables**: Numerical values that can take any value within a range (e.g., height, salary).
* **Categorical variables**: Represent groups or categories (e.g., gender, color).

---

### 7. **How do we handle categorical variables in Machine Learning? What are the common techniques?**

Common techniques:

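* **Label Encoding**: Maps each category to an integer. In scikit-learn, `LabelEncoder` is intended for target labels.
* **One-Hot Encoding**: Creates one binary column per category. Suited to unordered (nominal) categories.
* **Ordinal Encoding**: Maps categories to integers that follow a meaningful order (e.g., small < medium < large).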
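A minimal sketch of two of these techniques, using `pandas.get_dummies` for one-hot encoding and scikit-learn's `OrdinalEncoder` for ordinal encoding. The `color` and `size` columns are hypothetical example data.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "size": ["small", "large", "medium", "small"],
})

# One-hot encoding: one 0/1 column per category, for unordered categories.
one_hot = pd.get_dummies(df["color"], prefix="color", dtype=int)

# Ordinal encoding: integers that follow the order we pass in explicitly.
encoder = OrdinalEncoder(categories=[["small", "medium", "large"]])
size_codes = encoder.fit_transform(df[["size"]]).ravel()

print(one_hot)
print(size_codes)
```

Passing the category order explicitly matters here: without it, `OrdinalEncoder` sorts categories alphabetically, which would not reflect small < medium < large.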

---

### 8. **What do you mean by training and testing a dataset?**

* **Training set**: Used to train the model.
* **Testing set**: Used to evaluate model performance on unseen data.

---

### 9. **What is sklearn.preprocessing?**

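It’s a module in scikit-learn that provides utilities for preprocessing data, such as scaling (`StandardScaler`, `MinMaxScaler`) and encoding (`OneHotEncoder`, `OrdinalEncoder`, `LabelEncoder`). Imputation of missing values lives in the separate `sklearn.impute` module.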

---

### 10. **What is a Test set?**

A **test set** is a portion of data used to evaluate how well a trained model performs on unseen data.

---

### 11. **How do we split data for model fitting (training and testing) in Python?**

Using `train_test_split` from `sklearn.model_selection`:

```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
```
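For a version that runs on its own, the sketch below builds a tiny made-up dataset and fixes `random_state` so the split is reproducible. The array contents are placeholders, not real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 10 samples with 2 features each, plus one target per sample.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 20% of the rows for testing; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```

With 10 samples and `test_size=0.2`, 8 rows go to training and 2 to testing, and no row appears in both.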

---

### 12. **How do you approach a Machine Learning problem?**

1. Understand the problem
2. Collect and clean data
3. Perform EDA
4. Feature Engineering
5. Choose model
6. Train/test split
7. Train model
8. Evaluate and tune
9. Deploy
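The steps above can be sketched end to end on a small dataset. This is only one possible arrangement, assuming scikit-learn's bundled iris data, a `StandardScaler`, and a `LogisticRegression` model. Tuning and deployment are left out.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Steps 1-2: a small, already-clean dataset stands in for collected data.
X, y = load_iris(return_X_y=True, as_frame=True)

# Step 3: a quick look at the feature distributions.
summary = X.describe()

# Step 6: hold out a test set before fitting anything.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Steps 4, 5 and 7: scaling and the model are bundled in one pipeline and trained.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Step 8: evaluate on the held-out data.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

Putting the scaler inside the pipeline means it is fitted on the training rows only, so no information from the test set leaks into training.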

---

### 13. **Why do we have to perform EDA before fitting a model to the data?**

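**EDA (Exploratory Data Analysis)** helps identify patterns, detect outliers and missing values, and understand the distributions and relationships in the dataset. Doing this before fitting means data problems are found and fixed before they distort the model, and it informs the choice of features and model.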

---

### 14. **How can you find correlation between variables in Python?**

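Using `pandas`, where `DataFrame.corr()` returns the pairwise correlation matrix of the columns (Pearson by default):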

```python
df.corr()
```
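A self-contained sketch with a small made-up DataFrame (the column names and values are hypothetical) shows what the resulting matrix looks like:

```python
import pandas as pd

# Hypothetical data for five students.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "test_score": [52, 58, 65, 71, 80],
    "hours_tv": [5, 4, 4, 2, 1],
})

# Each cell holds the correlation between the row variable and the column variable.
corr_matrix = df.corr()
print(corr_matrix.round(2))
```

The diagonal is always 1 (each variable against itself). Here `hours_studied` correlates positively with `test_score` and negatively with `hours_tv`.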

---

### 15. **What is causation? Explain difference between correlation and causation with an example.**

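**Causation** means that a change in one variable directly produces a change in another.
**Correlation** only means that two variables move together; it does not imply causation.
Example: Ice cream sales and drowning deaths are correlated, but neither causes the other. Both rise in summer because hot weather drives both.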
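The ice cream example can be simulated. In the sketch below, all numbers are synthetic and chosen only for illustration: temperature drives both quantities, and neither affects the other. The raw correlation is strong, but it largely disappears once the effect of temperature is removed from each series.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical confounder: daily temperature drives both quantities.
temperature = rng.uniform(10, 35, size=500)
ice_cream_sales = 20 * temperature + rng.normal(0, 50, size=500)
drownings = 0.3 * temperature + rng.normal(0, 1.5, size=500)

# The two outcomes are clearly correlated with each other.
r_raw = np.corrcoef(ice_cream_sales, drownings)[0, 1]

def residuals(values, driver):
    """Remove the part of `values` explained by a straight-line fit on `driver`."""
    slope, intercept = np.polyfit(driver, values, deg=1)
    return values - (slope * driver + intercept)

# After accounting for temperature, little correlation is left.
r_adjusted = np.corrcoef(
    residuals(ice_cream_sales, temperature),
    residuals(drownings, temperature),
)[0, 1]
print(round(r_raw, 2), round(r_adjusted, 2))
```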

---

### 16. **What is an Optimizer? What are different types of optimizers? Explain each with an example.**

An **optimizer** adjusts model parameters to minimize the loss function.
Types:

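* **SGD** (stochastic gradient descent): Updates parameters using the gradient computed on a mini-batch of data. Simple, but can converge slowly.
* **Adam**: Combines momentum with RMSprop-style per-parameter adaptive learning rates. Widely used as a default.
* **RMSprop**: Scales each parameter's learning rate by a moving average of its recent squared gradients.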

Example in TensorFlow:

```python
import tensorflow as tf
optimizer = tf.keras.optimizers.Adam()  # default learning_rate is 0.001
```
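To show what an optimizer does under the hood, here is a minimal sketch of plain (full-batch) gradient descent written in NumPy. It is not SGD proper, which would use a random mini-batch at each step, and the data, learning rate, and iteration count are arbitrary choices for this example.

```python
import numpy as np

# Synthetic data generated from y = 2x + 1 (illustrative only).
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * X + 1.0

w, b = 0.0, 0.0          # parameters start at arbitrary values
learning_rate = 0.05

for _ in range(2000):
    error = (w * X + b) - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(error * X)
    grad_b = 2.0 * np.mean(error)
    # Step against the gradient to reduce the loss.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)
```

After enough steps, `w` and `b` approach the 2 and 1 used to generate the data. Optimizers such as Adam and RMSprop follow the same loop but adapt the step size per parameter.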

---

### 17. **What is sklearn.linear\_model?**

A module in scikit-learn that contains linear models such as `LinearRegression`, `LogisticRegression`, `Ridge`, and `Lasso`.

---

### 18. **What does model.fit() do? What arguments must be given?**

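`model.fit(X, y)` trains the model, learning its parameters from the features `X` (shape `(n_samples, n_features)`) and the target `y` (one value per sample). Supervised estimators require both arguments; unsupervised estimators such as clustering models need only `X`.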

---

### 19. **What does model.predict() do? What arguments must be given?**

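`model.predict(X)` generates predictions for the input features `X` using an already fitted model. `X` is the only required argument and must have the same number of features as the data used in `fit`.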
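A minimal sketch of the two calls together, assuming a `LogisticRegression` classifier and a tiny made-up dataset with one feature:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training data: shape (n_samples, n_features) for X, one label per sample for y.
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X_train, y_train)          # learn parameters from features and labels

# New samples must have the same number of features as the training data.
X_new = np.array([[0.5], [4.5]])
predictions = model.predict(X_new)   # one predicted class label per row
print(predictions)
```

For a classifier, `predict` returns class labels. Many classifiers also offer `predict_proba` when probabilities are needed.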

---

### 20. **What is feature scaling? How does it help in Machine Learning?**

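**Feature scaling** puts features on a comparable scale, for example by standardizing them to zero mean and unit variance or by rescaling them to a fixed range such as [0, 1]. This keeps features with large numeric ranges from dominating the others and speeds up convergence for models trained with gradient descent.
It helps especially with distance-based and margin-based models (e.g., KNN, SVM); tree-based models are largely unaffected.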

---

### 21. **How do we perform scaling in Python?**

Using `StandardScaler` from `sklearn.preprocessing`:

```python
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```
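When the data is split into training and test sets, a common practice is to fit the scaler on the training data only and reuse its statistics for the test data. The sketch below shows this with made-up arrays.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: the second feature has a much larger range than the first.
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[2.0, 400.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean and std from training data
X_test_scaled = scaler.transform(X_test)        # reuses the training mean and std

print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
print(X_test_scaled)
```

Calling `fit_transform` on the test set would compute new statistics from it, which leaks test information and makes the two sets inconsistent.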

---

### 22. **Explain data encoding?**

**Encoding** converts categorical values into numerical format for ML models.
Types:

* Label Encoding
* One-Hot Encoding

---


