
---

### **1. What is a parameter?**

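A **parameter** is a value inside a machine learning model that is learned from the training data rather than set by hand. Parameters determine how the model maps inputs to outputs. Examples include the weights and biases in linear regression and neural networks. Values chosen before training, such as the learning rate, are hyperparameters, not parameters.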
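
As a minimal sketch, assuming scikit-learn's `LinearRegression` and a made-up toy dataset, the learned parameters can be inspected after fitting:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data generated from y = 2x + 1 (illustrative values, not a real dataset)
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2 * X.ravel() + 1

model = LinearRegression()
model.fit(X, y)

# The parameters are the values the model learned from the data
weight = model.coef_[0]    # slope, close to 2
bias = model.intercept_    # intercept, close to 1
print(weight, bias)
```

The data was built from a known line, so the fitted slope and intercept should recover it almost exactly.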

---

### **2. What is correlation?**

**Correlation** is a statistical measure of how strongly two variables move together. It is usually reported as the Pearson correlation coefficient, which ranges from -1 to +1. A positive correlation means the variables tend to increase or decrease together, while a negative correlation means one tends to increase as the other decreases. A value near 0 indicates little or no linear relationship.

---

### **3. What does negative correlation mean?**

**Negative correlation** occurs when one variable tends to decrease as the other increases. For example, if the number of errors on a test falls as study time rises, study time and errors are negatively correlated.
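
The study-time example can be checked numerically. This is a small sketch with invented numbers, using NumPy's `np.corrcoef` to compute the Pearson coefficient:

```python
import numpy as np

# Hypothetical data: hours studied and errors made on a test
study_hours = np.array([1, 2, 3, 4, 5, 6])
test_errors = np.array([12, 10, 9, 6, 4, 2])

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is Pearson's r
r = np.corrcoef(study_hours, test_errors)[0, 1]
print(round(r, 3))  # a value close to -1 indicates a strong negative correlation
```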

---

### **4. Define Machine Learning. What are the main components in Machine Learning?**

**Machine Learning** is a subset of AI that enables computers to learn from data and make predictions without being explicitly programmed.

**Main components:**

* Data
* Model
* Loss function
* Optimizer
* Evaluation metrics

---

### **5. How does loss value help in determining whether the model is good or not?**

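The **loss value** measures how far the model's predictions are from the actual targets. A lower loss means the predictions are closer to the true values. Training loss alone is not enough, though: a model with a very low training loss but a much higher validation or test loss is overfitting, so compare the two before calling a model good.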
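
To make this concrete, here is a sketch that computes mean squared error, one common loss, for two sets of invented predictions. The helper `mean_squared_error` is written out by hand for illustration:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

y_true = [3.0, 5.0, 7.0]
close_preds = [2.9, 5.1, 7.2]   # hypothetical output of a well-fitted model
far_preds = [1.0, 8.0, 4.0]     # hypothetical output of a poorly fitted model

loss_close = mean_squared_error(y_true, close_preds)
loss_far = mean_squared_error(y_true, far_preds)
print(loss_close, loss_far)     # the first loss is much smaller than the second
```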

---

### **6. What are continuous and categorical variables?**

* **Continuous variables** are numerical and can take any value within a range (e.g., height, weight).
* **Categorical variables** take one of a limited set of categories or groups (e.g., gender, color).

---

### **7. How do we handle categorical variables in Machine Learning? What are the common techniques?**

Common techniques:

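* **Label Encoding**: maps each category to an integer
* **One-Hot Encoding**: creates one binary column per category
* **Ordinal Encoding**: maps categories to integers that follow a meaningful order (e.g., low < medium < high)

These methods convert categories into numerical values for model training.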
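
As an illustration of ordinal encoding, the sketch below uses scikit-learn's `OrdinalEncoder` on an invented T-shirt size feature. The category order is passed explicitly, since the default order is alphabetical:

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical ordered feature: T-shirt sizes
sizes = np.array([["small"], ["large"], ["medium"], ["small"]])

# Passing the categories explicitly fixes the order small < medium < large
encoder = OrdinalEncoder(categories=[["small", "medium", "large"]])
encoded = encoder.fit_transform(sizes)
print(encoded.ravel())  # small -> 0, medium -> 1, large -> 2
```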

---

### **8. What do you mean by training and testing a dataset?**

* **Training set** is used to teach the model patterns from data.
* **Testing set** is used to evaluate the model’s performance on unseen data.

---

### **9. What is sklearn.preprocessing?**

`sklearn.preprocessing` is a module in Scikit-learn that provides methods for:

* Encoding categorical variables
* Scaling numerical features
* Normalization, binarization, and more
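
A brief sketch of two of these tools, `MinMaxScaler` and `Binarizer`, applied to a made-up single-column feature:

```python
import numpy as np
from sklearn.preprocessing import Binarizer, MinMaxScaler

# Hypothetical single numeric feature
values = np.array([[10.0], [20.0], [30.0]])

# MinMaxScaler rescales each feature to the [0, 1] range by default
scaled = MinMaxScaler().fit_transform(values)

# Binarizer maps values above the threshold to 1 and the rest to 0
binary = Binarizer(threshold=15.0).fit_transform(values)
print(scaled.ravel(), binary.ravel())
```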

---

### **10. What is a Test set?**

A **Test set** is a portion of data that is kept aside to evaluate the final performance of a trained model. It checks how well the model generalizes.

---

### **11. How do we split data for model fitting (training and testing) in Python?**

Using Scikit-learn:

```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
```
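
For a version that runs on its own, the sketch below builds a small placeholder dataset (the arrays are invented) and adds `random_state` so the split is reproducible:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 10 samples, 2 features
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# random_state makes the shuffle reproducible; test_size=0.2 holds out 20% of rows
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # 8 training rows and 2 test rows
```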

---

### **12. How do you approach a Machine Learning problem?**

Steps:

1. Understand the problem
2. Collect and clean data
3. Perform EDA
4. Feature selection and engineering
5. Split data
6. Train model
7. Evaluate and tune
8. Deploy
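
The middle steps can be sketched end to end. This is one possible outline, not a fixed recipe. It assumes the Iris dataset bundled with scikit-learn as a stand-in for collected data, and logistic regression as an arbitrary model choice:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Steps 1-2: a ready-made, already clean dataset stands in for data collection
X, y = load_iris(return_X_y=True)

# Step 5: hold out 20% of the rows for the final evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Steps 4 and 6: scale the features, then train a classifier
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Step 7: evaluate on data the model has not seen
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

Putting the scaler inside the pipeline means it is fitted on the training split only, which avoids leaking test-set statistics into training.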

---

### **13. Why do we have to perform EDA before fitting a model to the data?**

**Exploratory Data Analysis (EDA)** helps you understand the structure of the data, detect outliers and missing values, check correlations, and uncover patterns. These findings guide preprocessing and model choice, which usually leads to better model performance.
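
A few typical EDA checks, sketched with pandas on an invented table. The 1.5 × IQR rule used for outliers is one common convention among several:

```python
import pandas as pd

# Hypothetical dataset with one missing value and one extreme income
df = pd.DataFrame({
    "age": [25, 30, 35, 40, None],
    "income": [40, 42, 45, 47, 400],
})

print(df.describe())        # summary statistics per column
missing = df.isna().sum()   # missing values per column

# Flag outliers with the 1.5 * IQR rule
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr)
outliers = df.loc[is_outlier, "income"].tolist()
print(missing["age"], outliers)
```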

---

### **14. What is correlation?**

(Repeated; see Q2)

---

### **15. What does negative correlation mean?**

(Repeated; see Q3)

---

### **16. How can you find correlation between variables in Python?**

Using Pandas:

```python
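# numeric_only=True skips non-numeric columns (pandas 2.0+ raises an error on them otherwise)
df.corr(numeric_only=True)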
```

For visualization:

```python
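import matplotlib.pyplot as plt
import seaborn as sns
sns.heatmap(df.corr(numeric_only=True), annot=True)  # annot=True prints each coefficient in its cell
plt.show()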
```

---

### **17. What is causation? Explain difference between correlation and causation with an example.**

**Causation** means a change in one variable **directly produces** a change in the other.

**Correlation** only means that two variables move together; it does not show that one causes the other.

Example: Ice cream sales and drowning incidents are correlated because both rise in summer. Hot weather drives both, so neither one causes the other.
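
A small simulation can make the ice cream example concrete. All numbers below are invented: temperature is simulated as the common driver, and the helper `residuals` removes its effect with a straight-line fit. Once temperature is accounted for, the correlation between the two variables should largely vanish:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated confounder: temperature drives both variables (all numbers are made up)
temperature = rng.normal(size=n)
ice_cream_sales = 2.0 * temperature + rng.normal(size=n)
drownings = 1.5 * temperature + rng.normal(size=n)

def residuals(y, x):
    """Part of y that a straight-line fit on x does not explain."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

raw_corr = np.corrcoef(ice_cream_sales, drownings)[0, 1]

# Correlation that remains after removing the effect of temperature from both
partial_corr = np.corrcoef(
    residuals(ice_cream_sales, temperature),
    residuals(drownings, temperature),
)[0, 1]
print(round(raw_corr, 2), round(partial_corr, 2))
```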

---

### **18. What is an Optimizer? What are different types of optimizers? Explain each with an example.**

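An **optimizer** is the algorithm that updates the model's parameters to minimize the loss function.

Common types:

* **SGD** (stochastic gradient descent): updates parameters using the gradient of a mini-batch and a fixed learning rate; simple, but noisy and sensitive to the learning rate
* **Adam**: adapts the learning rate per parameter using running averages of the gradient and its square
* **RMSProp**: scales each update by a running average of squared gradients; often used for recurrent neural networks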

Example using Adam in Keras:

```python
model.compile(optimizer='adam', loss='mse')
```
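
To show what the update rules do, here is a sketch in plain NumPy rather than Keras, minimizing an invented one-parameter loss. The gradient is exact, so the first loop is plain gradient descent; SGD applies the same update to gradients estimated from mini-batches. The Adam hyperparameters are the commonly quoted defaults, apart from a larger learning rate chosen for this toy problem:

```python
import numpy as np

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, which is minimized at w = 3."""
    return 2.0 * (w - 3.0)

# Plain gradient descent: step against the gradient with a fixed learning rate
w_sgd = 0.0
for _ in range(100):
    w_sgd -= 0.1 * grad(w_sgd)

# Adam: keep running averages of the gradient (m) and squared gradient (v)
w_adam, m, v = 0.0, 0.0, 0.0
lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 501):
    g = grad(w_adam)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)   # bias correction for the early steps
    v_hat = v / (1 - beta2 ** t)
    w_adam -= lr * m_hat / (np.sqrt(v_hat) + eps)

print(w_sgd, w_adam)  # both end up near the minimum at w = 3
```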

---

### **19. What is sklearn.linear\_model?**

It is a Scikit-learn module for linear models like:

* Linear Regression
* Logistic Regression
* Ridge and Lasso Regression
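
As a rough comparison of three of these estimators, the sketch below fits them on synthetic data where only the first feature matters. The `alpha` values are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic data: only the first of three features influences the target
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 4.0 * X[:, 0] + rng.normal(scale=0.5, size=100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)   # L2 penalty shrinks coefficients
lasso = Lasso(alpha=0.5).fit(X, y)    # L1 penalty can set coefficients to zero

for name, model in [("ols", ols), ("ridge", ridge), ("lasso", lasso)]:
    print(name, np.round(model.coef_, 2))
```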

---

### **20. What does model.fit() do? What arguments must be given?**

`model.fit()` trains the model on the training data.

Arguments:

* `X_train` – input features, shaped `(n_samples, n_features)`
* `y_train` – target labels (omitted for unsupervised estimators such as clustering)

---

### **21. What does model.predict() do? What arguments must be given?**

`model.predict()` returns predictions from the trained model.

Argument:

* `X_test` – new input data with the same features as the training data
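
A minimal sketch of `fit()` followed by `predict()`. `DecisionTreeClassifier` is an arbitrary choice here, since scikit-learn estimators share this interface, and the data is invented:

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: one feature, two classes
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0, 0, 1, 1]

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)   # learns from features and labels

# predict() takes features only and returns one label per row
X_test = [[0.2], [2.8]]
predictions = model.predict(X_test)
print(predictions)
```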

---

### **22. What are continuous and categorical variables?**

(Repeated; see Q6)

---

### **23. What is feature scaling? How does it help in Machine Learning?**

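**Feature scaling** puts all features on a comparable range so that no feature dominates only because of its units.

Example: Scaling helps gradient descent converge faster, and it keeps distance-based models such as k-nearest neighbors from being dominated by the feature with the largest values.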
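
One common form of scaling is standardization (z-scores). A hand-written NumPy sketch on invented values shows the arithmetic:

```python
import numpy as np

# Hypothetical features on very different scales: age in years, income in dollars
X = np.array([
    [25.0, 40_000.0],
    [35.0, 60_000.0],
    [45.0, 80_000.0],
])

# Standardization (z-score): subtract each column's mean, divide by its standard deviation
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_scaled)  # each column now has mean 0 and standard deviation 1
```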

---

### **24. How do we perform scaling in Python?**

Using StandardScaler:

```python
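from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
# Fit on the training split only, then apply the same statistics to the test split
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)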
```

---

### **25. What is sklearn.preprocessing?**

(Repeated; see Q9)

---

### **26. How do we split data for model fitting (training and testing) in Python?**

(Repeated; see Q11)

---

### **27. Explain data encoding.**

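**Data encoding** converts categorical data into a numerical format that models can work with.

Techniques:

* **Label Encoding**: assigns an integer to each category; the integers imply an order, so it suits ordinal data or target labels
* **One-Hot Encoding**: creates one binary column per category; suited to categories with no natural order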
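
Both techniques can be sketched with scikit-learn on an invented color column. `LabelEncoder` is documented as a tool for target labels, so its use on a feature here is only to show the mapping:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

colors = np.array(["red", "green", "blue", "green"])

# Label encoding: one integer per category, assigned in sorted order
labels = LabelEncoder().fit_transform(colors)

# One-hot encoding: one binary column per category (expects a 2D input)
one_hot = OneHotEncoder().fit_transform(colors.reshape(-1, 1)).toarray()
print(labels)    # blue -> 0, green -> 1, red -> 2
print(one_hot)   # columns are ordered blue, green, red
```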

---

