### 1. What is a parameter?

A **parameter** is a variable internal to the model whose value is estimated from data. In machine learning, parameters are the parts of the model learned from the training data, such as the weights and bias of a linear regression model.
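
As a minimal sketch of what "learned from data" means, the snippet below estimates the two parameters of a line `y = w*x + b` with the closed-form least-squares formulas. The toy data and the helper name `fit_line` are assumptions made for illustration only.

```python
# Toy data generated from y = 2x + 1 (values chosen for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Estimate slope w and intercept b of y = w*x + b by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# w and b are the model's parameters: nobody sets them by hand,
# they come out of the data.
w, b = fit_line(xs, ys)
print(f"learned weight w={w:.2f}, bias b={b:.2f}")
```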

### 2. What is correlation? What does negative correlation mean?

Correlation is a statistical measure that expresses the extent to which two variables are linearly related. It ranges from -1 to 1.

Negative correlation means that as one variable increases, the other variable tends to decrease. For example, if hours of study increase and the number of errors in a test decreases, they are negatively correlated.
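
To make the study-hours example concrete, here is a small sketch that computes the Pearson correlation coefficient by hand. The numbers and the helper name `pearson_r` are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data: more hours of study, fewer errors on the test.
hours_studied = [1, 2, 3, 4, 5]
test_errors = [9, 8, 6, 4, 2]

r = pearson_r(hours_studied, test_errors)
print(f"r = {r:.3f}")  # negative, and close to -1
```

A value near -1 indicates a strong negative linear relationship, a value near 0 indicates little linear relationship, and a value near 1 indicates a strong positive one.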

### 3. Define Machine Learning. What are the main components in Machine Learning?

Machine Learning is a subset of artificial intelligence that enables systems to learn from data and make predictions. Main components include:
- Data
- Model
- Loss function
- Optimizer
- Evaluation metrics

### 4. How does loss value help in determining whether the model is good or not?

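The loss value quantifies how far a model's predictions are from the actual values. A lower loss on held-out (validation or test) data generally indicates a better-performing model; a low training loss paired with a high validation loss suggests overfitting.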
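
As an illustration, the sketch below uses mean squared error (one common loss for regression) to compare two hypothetical sets of predictions against the same targets. The values and the helper name `mse` are assumptions for the example.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3.0, 5.0, 7.0]
preds_a = [2.5, 5.0, 7.5]  # hypothetical model A: close to the targets
preds_b = [1.0, 8.0, 4.0]  # hypothetical model B: far from the targets

loss_a = mse(y_true, preds_a)
loss_b = mse(y_true, preds_b)
print(f"model A loss: {loss_a:.3f}, model B loss: {loss_b:.3f}")
```

Model A has the lower loss, which matches the intuition that its predictions sit closer to the actual values.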

### 5. What are continuous and categorical variables?

- **Continuous variables** are numeric and can take any value within a range (e.g., height, weight).
- **Categorical variables** take one of a limited set of discrete values (e.g., color, gender).

### 6. How do we handle categorical variables in Machine Learning? What are the common techniques?

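Categorical values must be converted to numbers before most models can use them. Common techniques include:
- Label Encoding: maps each category to an integer.
- One-Hot Encoding: creates one binary column per category; suited to categories with no inherent order.
- Ordinal Encoding: maps categories to integers that follow a meaningful order (e.g., small < medium < large).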
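
The three techniques above can be sketched in plain Python to show what each one produces. This is an illustration of the idea rather than a library implementation; in practice scikit-learn offers `LabelEncoder`, `OneHotEncoder`, and `OrdinalEncoder` in `sklearn.preprocessing`. The sample values are made up.

```python
colors = ["red", "green", "blue", "green"]

# Label encoding: each distinct category gets an integer.
# Sorting the categories makes the mapping deterministic.
categories = sorted(set(colors))  # ['blue', 'green', 'red']
label_of = {cat: i for i, cat in enumerate(categories)}
label_encoded = [label_of[c] for c in colors]  # [2, 1, 0, 1]

# One-hot encoding: one binary column per category, exactly one 1 per row.
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]
# [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]

# Ordinal encoding: integers follow an order supplied by the caller.
sizes = ["small", "large", "medium"]
size_order = ["small", "medium", "large"]
ordinal_encoded = [size_order.index(s) for s in sizes]  # [0, 2, 1]

print(label_encoded, one_hot, ordinal_encoded)
```

One-hot encoding avoids implying an order between categories, at the cost of one extra column per category.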

### 7. What do you mean by training and testing a dataset?

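Training means fitting the model to one portion of the dataset (the training set). Testing means evaluating the fitted model on a separate portion (the test set) that it has not seen, to estimate how well it generalizes to new data.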

### 8. What is sklearn.preprocessing?

`sklearn.preprocessing` is a module in Scikit-learn used to prepare data before training. It includes functions for scaling, encoding, and transforming data.

### 9. What is a Test set?

The test set is a subset of the data, held out from training, that is used to assess the performance of a trained model on unseen examples.

### 10. How do we split data for model fitting (training and testing) in Python? How do you approach a Machine Learning problem?

Using `train_test_split` from `sklearn.model_selection`:
```python
from sklearn.model_selection import train_test_split

# X holds the features and y the target; test_size=0.2 holds out 20% of the rows.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
```

#### How do we approach an ML problem?

1. Define the problem
2. Collect data
3. Perform EDA
4. Preprocess data
5. Train model
6. Evaluate model
7. Tune hyperparameters
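
The steps above can be sketched end to end with scikit-learn. This is one possible outline, assuming the built-in Iris dataset as the problem and logistic regression as the model; EDA and hyperparameter tuning are left out to keep it short.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Steps 1-2: the problem is classifying iris species; the data ships with sklearn.
X, y = load_iris(return_X_y=True)

# Hold out 20% for testing; a fixed seed makes the split reproducible,
# and stratify keeps class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Steps 4-5: preprocess (scale) and train, chained in a single pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# Step 6: evaluate on data the model has not seen.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```

Putting the scaler inside the pipeline means it is fitted on the training data only, so no information from the test set leaks into preprocessing.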

### 11. Why do we have to perform EDA before fitting a model to the data?

EDA helps us understand data distribution, missing values, relationships, and outliers, which guide data preprocessing and model choice.
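
A few pandas calls cover most of these checks. The sketch below uses a small made-up table; the column names and the 1.5 × IQR outlier rule are choices for illustration, not the only way to do EDA.

```python
import pandas as pd

# Hypothetical data with one missing age and one extreme income.
df = pd.DataFrame(
    {
        "age": [22, 35, 58, None, 41],
        "income": [28000, 52000, 61000, 45000, 990000],
    }
)

# Distribution: count, mean, std, min, quartiles, and max per numeric column.
summary = df.describe()

# Missing values: number of NaN entries per column.
missing = df.isna().sum()

# Relationships: pairwise correlation between numeric columns.
correlations = df.corr()

# Outliers: flag incomes above Q3 + 1.5 * IQR (a common rule of thumb).
q1 = df["income"].quantile(0.25)
q3 = df["income"].quantile(0.75)
outliers = df[df["income"] > q3 + 1.5 * (q3 - q1)]

print(summary, missing, correlations, outliers, sep="\n\n")
```

Here the missing age suggests an imputation step, and the extreme income suggests checking for data-entry errors or using a model that is robust to outliers.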

### 14. How can you find correlation between variables in Python?

Using pandas, `DataFrame.corr()` returns the pairwise correlation matrix of the columns (Pearson by default):
```python
df.corr()
```

### 15. What is causation? Explain difference between correlation and causation with an example.

- **Causation** means a change in one variable directly produces a change in another.
- **Correlation** means two variables move together, which does not by itself imply that one causes the other.

Example: Ice cream sales and drowning deaths are correlated because both rise in summer, but neither causes the other; hot weather is the common factor behind both.

### 16. What is an Optimizer? What are different types of optimizers? Explain each with an example.

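An optimizer is an algorithm that iteratively adjusts model parameters to minimize the loss. Common optimizers:
- SGD (Stochastic Gradient Descent): moves parameters in the direction of the negative gradient, computed on a single sample or a mini-batch.
- Adam (Adaptive Moment Estimation): keeps running averages of the gradient and of its square, giving each parameter its own adaptive step size.
- RMSprop (Root Mean Square Propagation): divides each parameter's step by the square root of a running average of its squared gradients.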
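
As a rough sketch of what the simplest of these does, the loop below runs plain stochastic gradient descent (one sample per update, no shuffling, no momentum) to fit a single weight `w` in `y = w*x` under squared-error loss. The data, learning rate, and epoch count are assumptions picked for the example.

```python
# Toy data generated from y = 3x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = 0.0               # the parameter to learn, starting from an arbitrary guess
learning_rate = 0.01  # step size (a hyperparameter, chosen by hand here)

for epoch in range(200):
    for x, y in zip(xs, ys):
        # Gradient of the per-sample loss (w*x - y)**2 with respect to w.
        grad = 2 * (w * x - y) * x
        # SGD update: step against the gradient.
        w -= learning_rate * grad

print(f"learned w = {w:.4f}")
```

Adam and RMSprop follow the same loop but rescale each step using running statistics of past gradients, which often makes them less sensitive to the choice of learning rate.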

### 17. What is sklearn.linear_model?

It is a module in Scikit-learn that provides linear models such as `LinearRegression` and `LogisticRegression`.

### 18. What does model.fit() do? What arguments must be given?

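`model.fit(X, y)` trains the model using the feature matrix `X` and the target `y`. Supervised estimators require both arguments; unsupervised ones, such as clustering models, need only `X`.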

### 19. What does model.predict() do? What arguments must be given?

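`model.predict(X_test)` returns predictions for new data `X_test` from the trained model. The only required argument is the feature matrix, which must have the same features as the data passed to `fit`.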
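
A short sketch of `fit` followed by `predict`, assuming `LinearRegression` and toy data that follows `y = 2x + 1`. Note the shapes: `X` is two-dimensional (samples by features) and `y` is one-dimensional.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Four samples with one feature each: shape (n_samples, n_features).
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])  # one target value per sample

model = LinearRegression()
model.fit(X_train, y_train)  # learns coef_ and intercept_ from the data

# predict takes only features, in the same column layout used for fit.
X_new = np.array([[4.0], [5.0]])
predictions = model.predict(X_new)  # close to [9, 11]

print(model.coef_, model.intercept_, predictions)
```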

### 22. What is feature scaling? How does it help in Machine Learning?

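Feature scaling brings independent variables onto a comparable range. It helps distance-based algorithms such as SVM and KNN, where features with large magnitudes would otherwise dominate, and it helps gradient-based training converge faster.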

### 23. How do we perform scaling in Python?

Using `StandardScaler` (zero mean, unit variance) or `MinMaxScaler` (rescales to a fixed range, [0, 1] by default) from sklearn:
```python
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```
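
When the data is split into training and test sets, a common practice is to fit the scaler on the training set only and reuse its statistics for the test set, so that nothing about the test data influences preprocessing. A minimal sketch with made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales (values chosen for illustration).
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[4.0, 400.0]])

scaler = StandardScaler()
# fit_transform learns the mean and standard deviation from the training data.
X_train_scaled = scaler.fit_transform(X_train)
# transform reuses those training statistics; it does not refit on the test data.
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0))  # approximately 0 for each column
print(X_train_scaled.std(axis=0))   # approximately 1 for each column
print(X_test_scaled)
```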

### 25. Explain data encoding.

Data encoding transforms categorical variables into numerical format. Common methods include label and one-hot encoding.