Here’s a comprehensive breakdown of each question:

---

### **1. What is a parameter?**

A **parameter** is a configuration variable that is internal to a model and whose value is estimated from the data during training (e.g., weights in linear regression or neural networks).
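
To make "estimated from the data" concrete, here is a minimal pure-Python sketch that computes the two parameters of a one-feature linear regression in closed form. The data values are invented for illustration; real libraries use more general solvers.

```python
# Closed-form simple linear regression: y ≈ slope * x + intercept.
# The data values are made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# slope = covariance(x, y) / variance(x)
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
var_x = sum((x - mean_x) ** 2 for x in xs)

slope = cov_xy / var_x                # learned parameter (weight)
intercept = mean_y - slope * mean_x   # learned parameter (bias)

print(f"slope={slope}, intercept={intercept}")
```

Nothing in the code sets `slope` or `intercept` by hand; both come out of the data, which is what distinguishes parameters from settings you choose yourself.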

---

### **2. What is correlation?**

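**Correlation** measures the strength and direction of a linear relationship between two variables. The most common measure, the Pearson coefficient, ranges from -1 (perfect negative relationship) to +1 (perfect positive relationship); a value near 0 means little or no linear relationship.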
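
As a rough illustration, the Pearson coefficient can be computed by hand in a few lines. The helper name `pearson_r` and the sample numbers are assumptions made up for this sketch.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]   # tends to rise with hours
errors_made = [9, 8, 6, 4, 2]       # tends to fall with hours

r_pos = pearson_r(hours_studied, exam_score)
r_neg = pearson_r(hours_studied, errors_made)
print(f"positive example: {r_pos:.2f}, negative example: {r_neg:.2f}")
```

The first pair gives a coefficient close to +1 and the second close to -1, matching the sign of the relationship.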

---

### **3. What does negative correlation mean?**

A **negative correlation** means that as one variable increases, the other tends to decrease. For example, more exercise might be associated with lower body weight.

---

### **4. Define Machine Learning. What are the main components in Machine Learning?**

**Machine Learning (ML)** is a field of AI where algorithms learn patterns from data to make decisions or predictions, rather than following rules written by hand for each case.

**Main components:**

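* **Data**: The examples the model learns from.
* **Features**: The input variables used to describe each example.
* **Model**: The function that maps features to predictions.
* **Loss function**: Measures how wrong the predictions are.
* **Optimizer**: Adjusts the model's parameters to reduce the loss.
* **Evaluation metrics**: Measure performance on held-out data (e.g., accuracy, RMSE).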

---

### **5. How does loss value help in determining whether the model is good or not?**

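The **loss value** quantifies how far off a model's predictions are from the actual labels. A lower loss indicates a better fit to the data it was computed on. To judge whether the model is actually good, compare the training loss with the loss on validation or test data: a low training loss combined with a much higher validation loss is a sign of overfitting.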
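
A small sketch of the idea, using mean squared error as the loss. The function name, the labels, and both sets of predictions are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 7.0]

preds_model_a = [2.5, 5.0, 7.5]   # close to the labels
preds_model_b = [1.0, 8.0, 4.0]   # far from the labels

loss_a = mean_squared_error(y_true, preds_model_a)
loss_b = mean_squared_error(y_true, preds_model_b)

print(f"model A loss: {loss_a:.3f}")
print(f"model B loss: {loss_b:.3f}")
```

Model A gets the lower loss because its predictions sit closer to the labels.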

---

### **6. What are continuous and categorical variables?**

* **Continuous variables**: Numeric values that can take any value within a range (e.g., height, salary).
* **Categorical variables**: Values drawn from a limited set of labels or groups (e.g., gender, color). They can be nominal (no order) or ordinal (ordered, e.g., low/medium/high).

---

### **7. How do we handle categorical variables in Machine Learning? What are the common techniques?**

**Techniques:**

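* **Label Encoding**: Assigns an integer to each category. The integers imply an order, so it suits target labels or tree-based models better than nominal input features.
* **One-Hot Encoding**: Creates one binary column per category, which avoids implying an order but adds many columns when there are many categories.
* **Ordinal Encoding**: Assigns integers that follow the natural order of the categories (e.g., small < medium < large).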
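
The three techniques can be sketched with scikit-learn's encoders as follows. The sample arrays are invented, and this assumes a reasonably recent scikit-learn install; calling `.toarray()` on the one-hot result is used here because the encoder returns a sparse matrix by default.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, OrdinalEncoder

colors = np.array(["red", "green", "blue", "green"])
sizes = np.array([["small"], ["large"], ["medium"], ["small"]])

# Label encoding: integers follow the sorted class names (blue=0, green=1, red=2).
label_enc = LabelEncoder()
color_labels = label_enc.fit_transform(colors)

# One-hot encoding: one binary column per category (blue, green, red).
onehot_enc = OneHotEncoder()
color_onehot = onehot_enc.fit_transform(colors.reshape(-1, 1)).toarray()

# Ordinal encoding: the order is stated explicitly rather than inferred.
ordinal_enc = OrdinalEncoder(categories=[["small", "medium", "large"]])
size_codes = ordinal_enc.fit_transform(sizes)

print(color_labels)
print(color_onehot)
print(size_codes.ravel())
```

Passing `categories=` to `OrdinalEncoder` matters: without it the encoder sorts the labels alphabetically, which would not match the intended small/medium/large order.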

---

### **8. What do you mean by training and testing a dataset?**

* **Training set**: Data used to train the model.
* **Testing set**: Data used to evaluate model performance on unseen data.

---

### **9. What is `sklearn.preprocessing`?**

It’s a **module** in scikit-learn with tools for:

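* Scaling (e.g., `StandardScaler`, `MinMaxScaler`)
* Encoding (e.g., `OneHotEncoder`, `OrdinalEncoder`)
* Normalization (e.g., `Normalizer`)

Imputation of missing values is handled by a separate module, `sklearn.impute` (e.g., `SimpleImputer`).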

---

### **10. What is a Test set?**

A **test set** is a portion of the dataset not seen by the model during training, used to assess its generalization.

---

### **11. How do we split data for model fitting (training and testing) in Python?**

Using **`train_test_split`** from `sklearn.model_selection`:

```python
from sklearn.model_selection import train_test_split
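# test_size=0.2 holds out 20% of the rows; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)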
```

---

### **12. How do you approach a Machine Learning problem?**

1. Understand the problem
2. Collect and clean data
3. Perform Exploratory Data Analysis (EDA)
4. Feature engineering
5. Choose and train a model
6. Evaluate the model
7. Tune hyperparameters
8. Deploy the model

---

### **13. Why do we have to perform EDA before fitting a model to the data?**

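**EDA (Exploratory Data Analysis)** surfaces problems that would otherwise silently distort the model, and it guides cleaning and feature choices. It helps you understand: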

* Data distribution
* Relationships between variables
* Missing values
* Outliers
* Trends and patterns
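
A few of these checks can be sketched with pandas. The DataFrame below is invented for illustration, and the 1.5 × IQR rule is just one common convention for flagging outliers, not the only one.

```python
import pandas as pd

# Toy data with one missing income and one suspiciously large income.
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "income": [40_000, 52_000, None, 61_000, 58_000, 950_000],
})

# Distribution summary: count, mean, std, min, quartiles, max per column.
print(df.describe())

# Missing values per column.
missing = df.isna().sum()
print(missing)

# Flag outliers with the 1.5 * IQR rule (a common heuristic).
q1 = df["income"].quantile(0.25)
q3 = df["income"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]
print(outliers)
```

Here the checks reveal one missing income and one extreme value, both of which would affect a model fitted on the raw data.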

---

### **14. How can you find correlation between variables in Python?**

Using **Pandas**:

```python
df.corr(numeric_only=True)  # Pearson correlation matrix of the numeric columns
```

Or **Seaborn heatmap**:

```python
import seaborn as sns
sns.heatmap(df.corr(numeric_only=True), annot=True)  # numeric columns only
```

---

### **15. What is causation? Explain difference between correlation and causation with an example.**

**Causation** means one variable **directly affects** another.
**Correlation** means two variables move together but **may not** be causally related.

**Example:**
Ice cream sales and drowning incidents both increase in summer. They're correlated, but one doesn't cause the other: hot weather is a third factor (a confounder) that drives both.
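
A small simulation can make the point. Everything here is synthetic: the coefficients, noise levels, and the helper `pearson_r` are assumptions chosen only to illustrate a confounder, not real-world figures.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Temperature is the hidden common cause.
temperature = [rng.uniform(10, 35) for _ in range(500)]

# Each series depends on temperature plus noise, never on the other series.
ice_cream_sales = [10 * t + rng.gauss(0, 20) for t in temperature]
drownings = [0.2 * t + rng.gauss(0, 1) for t in temperature]

r = pearson_r(ice_cream_sales, drownings)
print(f"correlation between ice cream sales and drownings: {r:.2f}")
```

The two series come out strongly correlated even though neither appears in the formula for the other.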

---

### **16. What is an Optimizer? What are different types of optimizers? Explain each with an example.**

An **optimizer** updates model parameters to minimize the loss.

**Types:**

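* **SGD** (stochastic gradient descent): Updates parameters in the direction opposite to the gradient, computed on one sample or a mini-batch at a time, using a fixed learning rate.
* **Adam**: Adapts the learning rate per parameter using running averages of the gradients and of their squares.
* **RMSProp**: Scales each parameter's learning rate by a moving average of its recent squared gradients.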

**Example with Keras:**

```python
model.compile(optimizer='adam', loss='mse')
```
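
To show what an optimizer does under the hood, here is a minimal pure-Python sketch of the SGD update rule fitting a single weight `w` in `y ≈ w * x`. The data, learning rate, and epoch count are illustrative assumptions; library optimizers add batching, momentum, and other refinements.

```python
import random

# Toy data generated from y = 2x, so the ideal weight is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0                 # parameter to learn
learning_rate = 0.01
rng = random.Random(0)  # fixed seed so the shuffling is repeatable
samples = list(zip(xs, ys))

for epoch in range(200):
    rng.shuffle(samples)              # "stochastic": visit samples in random order
    for x, y in samples:
        # Per-sample squared-error loss: (w*x - y)^2, gradient w.r.t. w below.
        grad = 2 * (w * x - y) * x
        w -= learning_rate * grad     # step against the gradient

print(f"learned w = {w:.4f}")
```

Each update nudges `w` toward the value that reduces the loss on the current sample; over many passes it settles near 2.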

---

### **17. What is `sklearn.linear_model`?**

A module in scikit-learn with linear models:

* `LinearRegression`
* `LogisticRegression`
* `Ridge`
* `Lasso`

---

### **18. What does `model.fit()` do? What arguments must be given?**

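Trains the model: it estimates the model's parameters from the training data. For supervised scikit-learn estimators, you must pass the feature matrix `X` and the target vector `y`; unsupervised estimators (e.g., clustering) need only `X`.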

```python
model.fit(X_train, y_train)
```

---

### **19. What does `model.predict()` do? What arguments must be given?**

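Makes predictions on new/unseen data using the parameters learned by `fit()`. The only required argument is the feature matrix `X`, with the same columns (in the same order) as the training data; no targets are passed.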

```python
y_pred = model.predict(X_test)
```
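
Putting `fit()` and `predict()` together, a minimal end-to-end sketch with `LinearRegression` might look like this. The tiny dataset is invented, and the variable name `X_new` is an assumption for illustration.

```python
from sklearn.linear_model import LinearRegression

# Toy training data following y = 2x.
X_train = [[1.0], [2.0], [3.0], [4.0]]   # 2D: one row per sample, one column per feature
y_train = [2.0, 4.0, 6.0, 8.0]           # 1D: one target per sample

model = LinearRegression()
model.fit(X_train, y_train)              # learns coef_ and intercept_

X_new = [[5.0], [6.0]]                   # unseen inputs, same number of columns
y_pred = model.predict(X_new)

print("coefficient:", model.coef_[0])
print("intercept:", model.intercept_)
print("predictions:", y_pred)
```

Note that `X` must be two-dimensional even with a single feature, which is why each value sits in its own inner list.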

---

### **20. What is feature scaling? How does it help in Machine Learning?**

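Feature scaling puts features on a comparable numeric range, either by standardizing (zero mean, unit variance) or by normalizing (e.g., to the range [0, 1]), so that features with large raw values do not dominate. It matters most for distance-based algorithms such as KNN and SVM, and it speeds up convergence for models trained with gradient descent. Tree-based models are largely insensitive to it.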
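
As a rough sketch of what standardization does, the following pure-Python example rescales two features with very different units. The feature values and the helper name `standardize` are made up; the population standard deviation is used here.

```python
import statistics

# Two features on very different scales.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
incomes = [20_000.0, 35_000.0, 50_000.0, 65_000.0, 80_000.0]

def standardize(values):
    """Return z-scores: subtract the mean, divide by the population std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

heights_scaled = standardize(heights_cm)
incomes_scaled = standardize(incomes)

print([round(z, 3) for z in heights_scaled])
print([round(z, 3) for z in incomes_scaled])
```

After scaling, both features have mean 0 and standard deviation 1, so neither dominates a distance calculation just because of its units.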

---

### **21. How do we perform scaling in Python?**

Using **`StandardScaler`**:

```python
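from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
# Fit on the training split only, then reuse its mean/std on the test split
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)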
```

Or **MinMaxScaler**:

```python
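from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()  # rescales each feature to [0, 1] by default
# Fit on the training split only, then reuse its min/max on the test split
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)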
```

---

### **22. Explain data encoding?**

**Data encoding** converts categorical variables into numerical format so ML algorithms can process them.

**Examples:**

* Label Encoding
* One-Hot Encoding
* Binary Encoding
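
Binary encoding is the least familiar of the three, so here is a minimal pure-Python sketch of the idea: give each category an integer, then spread that integer's binary digits across a few columns. The category list and the zero-based numbering are assumptions for illustration; dedicated libraries may number categories differently.

```python
categories = ["red", "green", "blue", "yellow", "purple"]

# Assign each distinct category an integer index (sorted for determinism).
ordered = sorted(set(categories))

# Number of bits needed to represent the largest index.
width = max(1, (len(ordered) - 1).bit_length())

# Map each category to the binary digits of its index.
binary_codes = {
    cat: [int(bit) for bit in format(index, f"0{width}b")]
    for index, cat in enumerate(ordered)
}

for cat, code in binary_codes.items():
    print(cat, code)
```

Five categories fit in three columns here, whereas one-hot encoding would need five; the saving grows with the number of categories.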

---

Let me know if you’d like flashcards, a quiz version, or a PDF of this summary!
