---
#-------------------->> Questions Answer <<--------------------
---

### **1. What is a parameter?**  
A **parameter** in machine learning is an internal value of the model that is learned from data during training rather than set by hand. The learned parameters determine how the model maps inputs to predictions.  

Example:  
In a **linear regression model (y = mx + c)**, **m (slope)** and **c (intercept)** are parameters that the model learns from the dataset.  
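
As a minimal sketch of what "learning parameters" means, the snippet below estimates `m` and `c` for `y = mx + c` using the closed-form least-squares formulas. The sample data and the helper name `fit_line` are assumptions made for illustration; libraries such as Scikit-Learn do this for you.

```python
def fit_line(xs, ys):
    """Estimate slope m and intercept c of y = m*x + c by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


# Hypothetical data generated from y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
m, c = fit_line(xs, ys)
print(f"learned m={m:.2f}, c={c:.2f}")
```

Here `m` and `c` are not typed in by the programmer; they are computed from the data, which is what makes them parameters.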

---

### **2. What is correlation?**  
**Correlation** measures the strength and direction of the relationship between two variables. It tells us how one variable tends to change as the other changes.  

The **Pearson correlation coefficient (r)** ranges from **-1 to 1**:  
- **+1** → Perfect positive linear correlation (both variables increase together).  
- **-1** → Perfect negative linear correlation (one increases as the other decreases).  
- **0** → No linear correlation (the variables may still be related in a nonlinear way).  
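
The three cases can be checked numerically. The following is a minimal sketch of the Pearson coefficient in plain Python; the helper name `pearson_r` and the sample values are assumptions chosen for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])  # y rises with x
r_neg = pearson_r(x, [10, 8, 6, 4, 2])  # y falls as x rises
r_zero = pearson_r(x, [4, 1, 0, 1, 4])  # y = (x - 3)^2: related, but not linearly
print(r_pos, r_neg, r_zero)
```

The last case is worth noting: `y` is fully determined by `x`, yet the coefficient is zero because the relationship is not linear.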

---

### **3. What does negative correlation mean?**  
A **negative correlation** means that as one variable increases, the other decreases.  

Example:  
- The more hours spent watching TV, the lower the exam score tends to be.  
- As fuel efficiency increases, fuel consumption for the same distance decreases.  

---

### **4. Define Machine Learning. What are the main components in Machine Learning?**  
**Machine Learning (ML)** is a field of AI in which computers learn patterns from data instead of being explicitly programmed with task-specific rules.  

**Main components of ML:**  
1. **Dataset** – The raw data for training and testing.  
2. **Features** – Input variables that influence the model’s prediction.  
3. **Model** – A mathematical function that maps input features to predictions.  
4. **Loss Function** – Measures the difference between predicted and actual values.  
5. **Optimizer** – Adjusts model parameters to minimize the loss.  
6. **Training Process** – The phase where the model learns patterns from data.  

---

### **5. How does loss value help in determining whether the model is good or not?**  
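The **loss value** measures how far the model’s predictions are from the actual results.  

- A **low loss value** means the predictions are close to the actual values on the data being evaluated.  
- A **high loss value** means the model fits that data poorly and needs improvement.  

A low loss on the training data alone is not enough: if the loss on held-out data is much higher, the model is overfitting.  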

Common **loss functions**:  
- **Mean Squared Error (MSE)** – Used in regression problems.  
- **Cross-Entropy Loss** – Used in classification tasks.  
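
To make the two loss functions concrete, here is a minimal sketch in plain Python. The function names and the sample predictions are assumptions for illustration, and the cross-entropy shown is the binary (two-class) form.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions (regression)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_cross_entropy(y_true, p_pred):
    """Average negative log-likelihood of 0/1 labels given predicted probabilities."""
    return -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, p_pred)
    ) / len(y_true)


# Regression: predictions close to the targets give a small MSE.
mse_good = mean_squared_error([3.0, 5.0, 7.0], [3.1, 4.9, 7.2])
mse_bad = mean_squared_error([3.0, 5.0, 7.0], [1.0, 8.0, 4.0])

# Classification: confident, correct probabilities give a small cross-entropy.
bce_good = binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8])
bce_bad = binary_cross_entropy([1, 0, 1], [0.2, 0.7, 0.3])

print(mse_good, mse_bad, bce_good, bce_bad)
```

In both cases the better set of predictions produces the lower loss, which is the property an optimizer relies on.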

---

### **6. What are continuous and categorical variables?**  
- **Continuous Variables:** Numeric values that can take any value within a range (e.g., height, temperature).  
- **Categorical Variables:** Variables with fixed categories (e.g., colors: Red, Blue, Green).  

---

### **7. How do we handle categorical variables in Machine Learning? What are the common techniques?**  
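Categorical variables must be converted into numerical form before most models can use them. Common techniques:  
1. **One-Hot Encoding** – Creates one binary (0/1) column per category.  
2. **Label Encoding** – Assigns an integer to each category. This implies an order, so it suits ordinal data or target labels.  
3. **Target Encoding** – Replaces each category with the mean target value for that category.  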

Example:  
```python
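import pandas as pd
from sklearn.preprocessing import OneHotEncoder

X = pd.DataFrame({"color": ["Red", "Blue", "Green"]})  # sample categorical column
encoder = OneHotEncoder()
X_encoded = encoder.fit_transform(X)  # sparse matrix: one 0/1 column per category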
```
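
To show what each of the three techniques listed above produces, here is a minimal plain-Python sketch on a tiny made-up dataset. The `colors` and `prices` values are hypothetical, and in practice target encoding should be computed on the training data only so that test targets do not leak into the features.

```python
from collections import defaultdict

# Hypothetical data: a categorical feature and a numeric target.
colors = ["Red", "Blue", "Green", "Blue", "Red"]
prices = [10.0, 20.0, 30.0, 40.0, 30.0]

# Label encoding: map each category to an integer (sorted for a stable order).
label_map = {cat: idx for idx, cat in enumerate(sorted(set(colors)))}
label_encoded = [label_map[c] for c in colors]

# One-hot encoding: one 0/1 indicator per category, in the same sorted order.
one_hot_encoded = [[int(c == cat) for cat in label_map] for c in colors]

# Target encoding: replace each category with the mean target for that category.
totals, counts = defaultdict(float), defaultdict(int)
for c, p in zip(colors, prices):
    totals[c] += p
    counts[c] += 1
target_map = {c: totals[c] / counts[c] for c in totals}
target_encoded = [target_map[c] for c in colors]

print(label_encoded)
print(one_hot_encoded)
print(target_encoded)
```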

---

### **8. What do you mean by training and testing a dataset?**  
- **Training Dataset:** Used to train the model.  
- **Testing Dataset:** Used to evaluate model performance on unseen data.  

A common split is **80% training, 20% testing**.  

---

### **9. What is sklearn.preprocessing?**  
`sklearn.preprocessing` is a Scikit-Learn module for **data preprocessing**, including:  
- Feature scaling (`StandardScaler`, `MinMaxScaler`)  
- Encoding categorical variables (`OneHotEncoder` and `OrdinalEncoder` for features, `LabelEncoder` for target labels)  

Example:  
```python
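import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])  # sample numeric features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # each column now has mean 0 and unit variance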
```

---

### **10. What is a Test set?**  
A **test set** is a subset of the dataset used to evaluate a trained model. It helps measure **generalization** performance.  

---

### **11. How do we split data for model fitting (training and testing) in Python?**  
We use `train_test_split` from Scikit-Learn:  
```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
This splits the data into **80% training** and **20% testing**. Setting `random_state=42` fixes the shuffle so the split is reproducible.  

---

### **12. How do you approach a Machine Learning problem?**  
**Steps to solve an ML problem:**  
1. Understand the problem.  
2. Collect and preprocess data.  
3. Perform **Exploratory Data Analysis (EDA)**.  
4. Select important features.  
5. Split data into training and testing sets.  
6. Choose a suitable model.  
7. Train the model.  
8. Evaluate model performance.  
9. Tune hyperparameters.  
10. Deploy and monitor the model.  
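
Steps 5 to 8 can be sketched end to end with Scikit-Learn. This is a minimal illustration on synthetic data, not a recipe for a real project: the data-generating formula, the choice of `LinearRegression`, and the metrics are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration): y = 3*x1 - 2*x2 + noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Step 5: hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Preprocessing: fit the scaler on the training set only, then reuse it.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Steps 6-7: choose and train a model.
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# Step 8: evaluate on data the model has not seen.
y_pred = model.predict(X_test_scaled)
test_mse = mean_squared_error(y_test, y_pred)
test_r2 = r2_score(y_test, y_pred)
print(f"test MSE={test_mse:.4f}, R^2={test_r2:.4f}")
```

Fitting the scaler on the training rows only keeps information from the test set out of the training process.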

---

### **13. Why do we have to perform EDA before fitting a model to the data?**  
EDA (Exploratory Data Analysis) helps in:  
- Understanding data distribution.  
- Identifying missing values.  
- Detecting outliers.  
- Finding correlations between variables.  
- Selecting important features for the model.  
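
A few of these checks can be sketched with pandas. The dataset below is hypothetical, and the 1.5 × IQR rule is only one common heuristic for flagging outliers.

```python
import pandas as pd

# Hypothetical dataset with one missing value and one extreme age.
df = pd.DataFrame(
    {
        "age": [22, 25, 27, 30, 120],
        "income": [30.0, 32.0, None, 35.0, 31.0],
    }
)

# Distribution: count, mean, std, min, quartiles, max per numeric column.
summary = df.describe()

# Missing values per column.
missing = df.isna().sum()

# Outliers in "age" using the 1.5 * IQR rule.
q1, q3 = df["age"].quantile(0.25), df["age"].quantile(0.75)
iqr = q3 - q1
outliers = df[(df["age"] < q1 - 1.5 * iqr) | (df["age"] > q3 + 1.5 * iqr)]

# Pairwise Pearson correlations between numeric columns.
correlations = df.corr()

print(summary, missing, outliers, correlations, sep="\n\n")
```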

---

### **14. What is correlation?**  
Correlation measures the strength and direction of the linear relationship between two variables. It can be **positive, negative, or zero** (see **question 2**).  

---

### **15. What does negative correlation mean?**  
A **negative correlation** means that as one variable increases, the other decreases.  

Example:  
- The more exercise you do, the less body fat you may have.  

---

### **16. How can you find correlation between variables in Python?**  
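Using the pandas `DataFrame.corr()` method:  
```python
import pandas as pd
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 5, 8]})  # sample data
print(df.corr())  # pairwise Pearson correlation by default
```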
This returns a **correlation matrix** showing relationships between variables.  

---

### **17. What is causation? Explain difference between correlation and causation with an example.**  
- **Correlation:** Two variables tend to move together. This alone does not show that one causes the other.  
- **Causation:** A change in one variable directly produces a change in the other.  

Example:  
- **Correlation without causation:** Ice cream sales and drowning rates increase together because both rise in hot weather; neither causes the other.  
- **Causation:** Smoking **causes** lung cancer.  
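
The ice cream example can be simulated to show how a hidden common cause produces correlation without causation. This is a minimal sketch with made-up numbers: the temperature range, the coefficients, and the noise levels are all assumptions.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures (the hidden common cause).
temperature = [10 + day for day in range(30)]

# Both quantities depend on temperature plus independent noise;
# neither is computed from the other.
ice_cream_sales = [2.0 * t + random.gauss(0, 3) for t in temperature]
drownings = [0.5 * t + random.gauss(0, 1) for t in temperature]

r = pearson_r(ice_cream_sales, drownings)
print(f"correlation between ice cream sales and drownings: {r:.2f}")
```

The two series are strongly correlated even though the code never uses one to compute the other; temperature drives both.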

---

### **18. What is an Optimizer? What are different types of optimizers? Explain each with an example.**  
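An **optimizer** is an algorithm that updates model parameters to minimize the loss.  

Types of optimizers:  
1. **Gradient Descent** – Computes the gradient of the loss over the entire training set and moves the weights a small step in the opposite direction.  
2. **Stochastic Gradient Descent (SGD)** – Performs the same update using a single random sample (or a small mini-batch) per step, which is cheaper but noisier.  
3. **Adam Optimizer** – Keeps running averages of the gradient and its square to give each parameter its own adaptive learning rate.  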

Example of Adam Optimizer in TensorFlow:  
```python
import tensorflow as tf
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
```
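
To contrast the first two optimizers in the list, here is a minimal plain-Python sketch that fits `y = mx + c` with batch gradient descent and with SGD. The data, learning rates, and step counts are assumptions picked so that both runs converge; Adam is omitted to keep the example short.

```python
import random

# Hypothetical noise-free data from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Batch gradient descent on mean squared error for y = m*x + c."""
    m, c = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the MSE, averaged over the whole dataset.
        grad_m = sum(2 * (m * x + c - y) * x for x, y in zip(xs, ys)) / n
        grad_c = sum(2 * (m * x + c - y) for x, y in zip(xs, ys)) / n
        m -= lr * grad_m
        c -= lr * grad_c
    return m, c


def stochastic_gradient_descent(xs, ys, lr=0.01, steps=5000, seed=0):
    """SGD: each update uses the gradient from one randomly chosen sample."""
    rng = random.Random(seed)
    m, c = 0.0, 0.0
    for _ in range(steps):
        i = rng.randrange(len(xs))
        error = m * xs[i] + c - ys[i]
        m -= lr * 2 * error * xs[i]
        c -= lr * 2 * error
    return m, c


m_gd, c_gd = gradient_descent(xs, ys)
m_sgd, c_sgd = stochastic_gradient_descent(xs, ys)
print(f"GD:  m={m_gd:.3f}, c={c_gd:.3f}")
print(f"SGD: m={m_sgd:.3f}, c={c_sgd:.3f}")
```

Both variants follow the negative gradient of the loss; they differ only in how much data each update looks at.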

---

### **19. What is sklearn.linear_model?**  
`sklearn.linear_model` provides linear models such as:  
- **Linear Regression** (`LinearRegression()`)  
- **Logistic Regression** (`LogisticRegression()`)  

Example:  
```python
from sklearn.linear_model import LinearRegression
model = LinearRegression()
```

---

### **20. What does model.fit() do? What arguments must be given?**  
`model.fit(X, y)` trains the model. `X` is the feature matrix of shape `(n_samples, n_features)` and `y` holds the target values; supervised models require both.  

Example:  
```python
model.fit(X_train, y_train)
```

---

### **21. What does model.predict() do? What arguments must be given?**  
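`model.predict(X)` returns predictions from a fitted model. The only required argument is `X`, which must have the same features (columns) the model was trained on.  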

Example:  
```python
y_pred = model.predict(X_test)
```
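
Putting `fit` and `predict` together, the following is a minimal runnable sketch with `LinearRegression`. The training values are hypothetical and chosen to lie exactly on `y = 2x + 1` so the learned parameters are easy to check.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data from y = 2x + 1; X must be 2-D: (n_samples, n_features).
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

model = LinearRegression()
model.fit(X_train, y_train)  # learns coef_ and intercept_

X_new = np.array([[5.0], [6.0]])  # unseen inputs with the same number of features
y_pred = model.predict(X_new)     # one prediction per row of X_new

print(model.coef_, model.intercept_, y_pred)
```

Calling `predict` before `fit` raises an error, because the parameters it relies on do not exist yet.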

---

### **22. What are continuous and categorical variables?**  
Same as **question 6**.  

---

### **23. What is feature scaling? How does it help in Machine Learning?**  
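Feature scaling brings numerical features onto a comparable scale so that features with large ranges do not dominate. It helps distance-based models (such as k-nearest neighbors) and speeds up gradient-based training; tree-based models are largely unaffected.  

Techniques:  
1. **Standardization:** `(X - mean) / std` (result has mean 0 and standard deviation 1)  
2. **Normalization (min-max scaling):** `(X - min) / (max - min)` (result lies between 0 and 1)  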
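
Both formulas are short enough to write out directly. This is a minimal plain-Python sketch; the function names and the sample column `heights_cm` are assumptions for illustration. It uses the population standard deviation, which is also what `StandardScaler` uses.

```python
import math


def standardize(values):
    """(x - mean) / std, using the population standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def min_max_normalize(values):
    """(x - min) / (max - min), mapping values onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
standardized = standardize(heights_cm)
normalized = min_max_normalize(heights_cm)
print(standardized)
print(normalized)
```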

---

### **24. How do we perform scaling in Python?**  
Using `StandardScaler`:  
```python
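from sklearn.preprocessing import StandardScaler
# X_train and X_test are assumed to come from the split in question 11.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std from the training set
X_test_scaled = scaler.transform(X_test)        # apply the same statistics to the test set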
```

---

### **25. Explain data encoding?**  
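Data encoding converts categorical data into a numerical format. Common techniques:  
- **One-Hot Encoding** (one binary column per category)  
- **Label Encoding** (one integer per category, typically for target labels)  
- **Ordinal Encoding** (integers that follow the natural order of the categories)  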

Example:  
```python
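import pandas as pd
from sklearn.preprocessing import OneHotEncoder

X = pd.DataFrame({"size": ["S", "M", "L"]})  # sample categorical column
encoder = OneHotEncoder()
X_encoded = encoder.fit_transform(X)  # sparse matrix: one 0/1 column per category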
```

---

