**1. What is a parameter?**
-	Model parameters are learned from data and control predictions.
-	Hyperparameters are set before training and control the learning process.
-	Function parameters are code-level inputs used to configure models.
-	Statistical parameters describe data distributions and are estimated from data.
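
The difference between a hyperparameter and a learned model parameter can be shown in a few lines. This is a minimal sketch using scikit-learn's `Ridge` regressor on made-up, noise-free data (the data and the choice of `Ridge` are assumptions for illustration only): `alpha` is set by us before training, while the slope and intercept are estimated by `fit()`.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Toy data generated from y = 3*x + 2 (illustrative values only)
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = 3.0 * X.ravel() + 2.0

# alpha is a hyperparameter: we choose it before training starts.
# It is also a function parameter of the Ridge constructor.
model = Ridge(alpha=1e-6)

# fit() estimates the model parameters (slope and intercept) from the data
model.fit(X, y)
learned_slope = model.coef_[0]
learned_intercept = model.intercept_

# A statistical parameter estimated from the data: the sample mean of y
y_mean = y.mean()

print("Hyperparameter alpha:", model.alpha)
print("Learned slope:", round(learned_slope, 3))
print("Learned intercept:", round(learned_intercept, 3))
print("Sample mean of y:", y_mean)
```

With such a small `alpha` the learned values should land very close to the slope and intercept used to generate the data.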

**2. What is correlation?**
- Correlation is a statistical measure that shows the strength and direction of a relationship between two variables. It helps us understand whether, and how strongly, the values of two variables are related.
- Correlation tells us how strongly two variables are related. It can be positive, negative, or zero. The Pearson correlation coefficient (r) quantifies this relationship from -1 to +1. In machine learning, it’s useful during feature selection and EDA to understand variable relationships. However, it doesn’t capture nonlinear patterns and should not be confused with causation.
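
The three cases (positive, negative, and a nonlinear pattern that Pearson's r misses) can be checked with NumPy. The data below is made up for illustration, and `pearson_r` is a hypothetical helper name wrapping `np.corrcoef`.

```python
import numpy as np

# Symmetric x values so the nonlinear case is easy to reason about
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

r_positive = pearson_r(x, 2 * x + 1)      # perfectly linear, increasing
r_negative = pearson_r(x, -0.5 * x + 4)   # perfectly linear, decreasing
r_nonlinear = pearson_r(x, x ** 2)        # y depends on x exactly, but not linearly

print("Increasing line:", round(r_positive, 3))
print("Decreasing line:", round(r_negative, 3))
print("Parabola:", round(r_nonlinear, 3))
```

The parabola case gives a coefficient of essentially zero even though y is fully determined by x, which is why a low r should not be read as "no relationship".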


**3. What does negative correlation mean?**
- Negative correlation means that as one variable increases, the other decreases. There is an inverse relationship between the two variables.
- Negative correlation means that two variables move in opposite directions: when one increases, the other tends to decrease. It is represented by a correlation coefficient between -1 and 0, where -1 indicates a perfect inverse linear relationship. In machine learning, identifying such relationships helps in feature selection, understanding patterns in data, and improving model interpretability.

**4. Define Machine Learning. What are the main components in Machine Learning?**
- Machine Learning (ML) is a branch of Artificial Intelligence (AI) that focuses on creating systems that can learn from data and improve their performance over time without being explicitly programmed. Instead of writing rules manually, ML algorithms find patterns in data and make predictions or decisions based on it.
- Machine Learning is the science of enabling systems to learn from data and make predictions or decisions. Its main components include data (raw input), features (predictor variables), model (the learner), algorithm (training method), loss function (error calculator), training process, evaluation metrics, and inference (making predictions). Together, these elements work to build intelligent systems that improve over time.

**5. How does loss value help in determining whether the model is good or not?**
- The loss value is a key metric in machine learning that tells us how far off the model’s predictions are from the actual target values. It plays a central role in evaluating and improving the model’s performance during training.
- The loss value helps determine if a model is good by measuring how far its predictions deviate from actual outcomes. A lower loss means better predictions, while a higher loss indicates poor performance. It is crucial during training, as the model uses it to adjust its internal parameters and improve. Unlike accuracy, loss provides more sensitive, continuous feedback, which is especially useful when comparing models or detecting overfitting (for example, when training loss keeps falling while validation loss starts to rise).
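
To see why loss is more sensitive than accuracy, consider two hypothetical models that classify every sample correctly but with different confidence. The probabilities below are invented for illustration; binary cross-entropy is computed by hand with NumPy.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0])

# Predicted probabilities of class 1 from two hypothetical models
probs_confident = np.array([0.95, 0.05, 0.90, 0.85, 0.10])
probs_hesitant = np.array([0.60, 0.40, 0.55, 0.65, 0.45])

def accuracy(y, p, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    return np.mean((p >= threshold).astype(int) == y)

def binary_cross_entropy(y, p):
    """Average log loss for binary labels and predicted probabilities."""
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

acc_confident = accuracy(y_true, probs_confident)
acc_hesitant = accuracy(y_true, probs_hesitant)
loss_confident = binary_cross_entropy(y_true, probs_confident)
loss_hesitant = binary_cross_entropy(y_true, probs_hesitant)

print("Accuracy:", acc_confident, acc_hesitant)
print("Loss:", round(loss_confident, 3), round(loss_hesitant, 3))
```

Both models reach the same accuracy, but the loss separates them: the hesitant model is penalized for predictions that are correct only by a narrow margin.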

**6. What are continuous and categorical variables?**
- In machine learning and data analysis, variables (also called features or columns) are typically classified into two main types: continuous and categorical. Understanding this distinction is essential for data preprocessing, feature selection, and model building.
- Continuous variables are numerical and measurable values that can take any value within a range, like height or income. Categorical variables represent group labels or categories, like gender or education level. Continuous variables are used in regression and may need scaling, while categorical variables require encoding before being used in machine learning models.

**7. How do we handle categorical variables in Machine Learning? What are the common techniques?**
- Categorical variables must be converted into numerical format before feeding them into most machine learning models, because models work with numbers — not text or labels. This process is called encoding.
- To use categorical variables in machine learning, we must convert them into numbers. The most common techniques are label encoding (one integer per category, typically used for target labels), one-hot encoding (for unordered categories), and ordinal encoding (for ordered categories with a defined rank). Other techniques like frequency encoding, target encoding, and binary encoding help when dealing with high-cardinality data or when reducing dimensionality is important. The choice depends on the type of data and the model being used.
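
Frequency and target encoding are less familiar than one-hot encoding, so here is a minimal pandas sketch. The `city` and `purchased` columns are made up, and the target encoding shown is the naive version (a plain per-category mean, with no smoothing and no protection against target leakage), so treat it as an illustration rather than a production recipe.

```python
import pandas as pd

# Hypothetical data: a categorical feature and a binary target
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Mumbai", "Delhi", "Pune"],
    "purchased": [1, 0, 1, 0, 1, 0],
})

# Frequency encoding: replace each category with its share of the rows
city_freq = df["city"].value_counts(normalize=True)
df["city_freq"] = df["city"].map(city_freq)

# Naive target encoding: replace each category with the mean target value
city_target_mean = df.groupby("city")["purchased"].mean()
df["city_target"] = df["city"].map(city_target_mean)

print(df)
```

Both encodings produce a single numeric column regardless of how many distinct categories exist, which is why they are attractive for high-cardinality features.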

**8. What do you mean by training and testing a dataset?**
- In machine learning, training and testing a dataset refers to splitting your data into two parts so that the model can learn from one part and be evaluated on the other. This is essential for building models that generalize well to new, unseen data.
- Training means using one part of the data (the training set) to teach the machine learning model how to make predictions by learning patterns. Testing means evaluating the trained model on a separate part of the data (the test set) that it has not seen, to check its performance. This separation helps ensure the model is not just memorizing but can generalize to real-world situations.

**9. What is sklearn.preprocessing?**
- sklearn.preprocessing is a module in the Scikit-learn (sklearn) library that provides tools for preprocessing and transforming data before feeding it into a machine learning model. Preprocessing is a crucial step because raw data often contains issues like different scales, missing values, or non-numeric values.
- sklearn.preprocessing is a Scikit-learn module used to scale, encode, and transform raw data into a suitable format for machine learning models. It includes tools for scaling numerical features, encoding categorical variables, normalizing data, and creating polynomial features. Missing-value imputation is handled by the separate sklearn.impute module. Preprocessing ensures that the input data is consistent, properly scaled, and model-friendly, which directly affects model performance.
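
Two of the tools mentioned above, shown on a tiny made-up array: `Normalizer` rescales each row to unit length, and `PolynomialFeatures` adds squared and interaction terms. This is a sketch to show the shape of the API, not a recommended pipeline.

```python
import numpy as np
from sklearn.preprocessing import Normalizer, PolynomialFeatures

# Illustrative data: two samples with two features each
X = np.array([[3.0, 4.0],
              [1.0, 0.0]])

# Normalizer rescales every row (sample) to have Euclidean length 1
X_unit = Normalizer().fit_transform(X)

# Degree-2 polynomial features: x1, x2, x1^2, x1*x2, x2^2
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

print("Row-normalized:\n", X_unit)
print("Polynomial features:\n", X_poly)
```

Like most transformers in this module, both follow the same `fit` / `transform` / `fit_transform` pattern.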

**10. What is a Test set?**
- A test set is a portion of the dataset that is kept separate from the training process and is used only to evaluate the performance of a trained machine learning model.
- A test set is a reserved part of the dataset used to evaluate how well a machine learning model performs on new, unseen data. It is not used during training or tuning. The test set provides a realistic measure of model accuracy and generalization, making it a critical component of the ML workflow.
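
Because the test set must stay out of tuning as well as training, a common pattern is a three-way split: train, validation (for tuning), and test. The sketch below applies `train_test_split` twice to synthetic data; the 60/20/20 proportions are an assumption chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 50 samples, 2 features, binary labels
X = np.arange(100).reshape(50, 2)
y = np.arange(50) % 2

# First split: hold out 20% as the final test set
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Second split: 25% of the remaining 80% becomes the validation set
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=0
)

print("Train:", X_train.shape, "Validation:", X_val.shape, "Test:", X_test.shape)
```

Hyperparameters would be tuned against the validation set, and the test set would be used once, at the end, to report performance.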


**11. How do we split data for model fitting (training and testing) in Python?**

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Sample dataset
data = pd.DataFrame({
    'Age': [22, 25, 47, 52, 46, 56, 44, 36],
    'Salary': [21000, 25000, 47000, 52000, 46000, 56000, 44000, 36000],
    'Purchased': [0, 0, 1, 1, 1, 1, 0, 0]
})

X = data[['Age', 'Salary']]   # features
y = data['Purchased']         # target

# Splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

print("Training data shape:", X_train.shape)
print("Testing data shape:", X_test.shape)
```

```
Training data shape: (6, 2)
Testing data shape: (2, 2)
```



**12. How do you approach a Machine Learning problem?**
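- A typical approach is to understand the problem and the data, split the data into training and testing sets, train a model, and evaluate it on data it has not seen. As an example, we load the Iris flower dataset and split it into training and testing sets. Then we train a Random Forest model on the training data. Finally, we test the model on the held-out data and print the accuracy to see how well it predicts flower species.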

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import pandas as pd

# Load and prepare data
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```

```
Accuracy: 1.0
```


**13. Why do we have to perform EDA before fitting a model to the data?**
- Understand the data: You learn what each column means, what values are present, and what the target is.
- Detect missing or incorrect values: Models don’t work well with missing or wrongly formatted data. EDA helps identify these issues.
- Identify patterns and relationships: You can find which features are strongly related to the target, which helps in feature selection.
- Spot outliers or unusual data: Outliers can badly affect model performance, especially in regression tasks.
- Choose the right model or transformation: If your target is skewed or categorical, it might guide you toward using classification instead of regression or using log-transformations.
- Avoid garbage in, garbage out: If the input data is not clean and well-understood, the model will give poor results.
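
A few pandas calls cover most of these checks. The sketch below uses a made-up dataset with one missing salary and one implausibly large salary planted on purpose; the 1.5 × IQR rule for outliers is one common convention, not the only option.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value and an outlier added deliberately
df = pd.DataFrame({
    "Age": [22, 25, 47, 52, 46, 56, 44, 36],
    "Salary": [21000, 25000, 47000, np.nan, 46000, 560000, 44000, 36000],
    "Purchased": [0, 0, 1, 1, 1, 1, 0, 0],
})

# Missing values per column
missing_per_column = df.isna().sum()

# Summary statistics (count, mean, std, min, quartiles, max)
summary = df.describe()

# Outliers by the 1.5 * IQR rule
q1 = df["Salary"].quantile(0.25)
q3 = df["Salary"].quantile(0.75)
iqr = q3 - q1
outliers = df[(df["Salary"] < q1 - 1.5 * iqr) | (df["Salary"] > q3 + 1.5 * iqr)]

# Correlation of each feature with the target
corr_with_target = df.corr()["Purchased"].drop("Purchased")

print(missing_per_column)
print(summary)
print(outliers)
print(corr_with_target)
```

Here the missing-value count flags the `Salary` column and the IQR rule isolates the single planted outlier, both of which would need handling before fitting a model.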

**14. How can you find correlation between variables in Python?**

```python
import pandas as pd

# Sample dataset
data = pd.DataFrame({
    'Age': [22, 25, 47, 52, 46, 56, 44, 36],
    'Salary': [21000, 25000, 47000, 52000, 46000, 56000, 44000, 36000],
    'Experience': [1, 2, 20, 24, 20, 30, 18, 10]
})

# Calculate correlation matrix
correlation_matrix = data.corr()

print(correlation_matrix)
```

```
                 Age    Salary  Experience
Age         1.000000  0.999757    0.994507
Salary      0.999757  1.000000    0.992829
Experience  0.994507  0.992829    1.000000
```


**15. What is causation? Explain difference between correlation and causation with an example.**
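- Causation means that one variable directly affects another: if A causes B, then changing A will result in a change in B.
- Correlation shows that two variables move together, but not why.
- Causation is a cause-and-effect link: one thing actually changes the other.
- Example: ice cream sales and drowning incidents are positively correlated, yet neither causes the other; both rise in hot weather, which is the common cause. By contrast, pressing a car's accelerator causes its speed to increase.
- Models based on correlation can be useful for prediction, but we must be careful not to assume causation unless it is established (usually through controlled experiments or deeper causal analysis).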
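
A small simulation can make the distinction concrete. The sketch below assumes a made-up scenario in which temperature drives both ice cream sales and drowning incidents, with no direct link between the two; all coefficients and noise levels are invented. `residuals` is a hypothetical helper that removes the linear effect of temperature.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Temperature is the common cause (confounder) of both quantities
temperature = rng.normal(25, 5, n)
ice_cream_sales = 10 * temperature + rng.normal(0, 20, n)
drownings = 0.5 * temperature + rng.normal(0, 1, n)

# Raw correlation between the two effects
raw_r = np.corrcoef(ice_cream_sales, drownings)[0, 1]

def residuals(y, x):
    """Part of y that a straight-line fit on x does not explain."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Correlation after removing the effect of temperature from both
partial_r = np.corrcoef(
    residuals(ice_cream_sales, temperature),
    residuals(drownings, temperature),
)[0, 1]

print("Raw correlation:", round(raw_r, 3))
print("After controlling for temperature:", round(partial_r, 3))
```

The raw correlation is strong, but it largely disappears once temperature is accounted for, which is what we expect when a third variable causes both.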

**16. What is an Optimizer? What are different types of optimizers? Explain each with an example.**
- An optimizer is the algorithm that adjusts a model's weights during training to minimize the loss (error). It plays a key role in how well and how fast the model learns.
   1. SGD (Stochastic Gradient Descent): Updates the weights using the gradient of the loss computed on a single sample or a small batch. It is simple but can converge slowly.
   2. SGD with Momentum: Improves SGD by adding a fraction of the previous update to the current one. This speeds up learning and helps the optimizer avoid getting stuck in flat regions or shallow local minima.
   3. AdaGrad: Adapts the learning rate for each parameter based on its accumulated squared gradients. It works well with sparse data, but the effective learning rate keeps shrinking over time.
   4. RMSprop: Fixes AdaGrad’s shrinking learning rate by using a moving average of squared gradients instead of a running sum. It is useful for non-stationary objectives and is often used when training recurrent networks.
   5. Adam (Adaptive Moment Estimation): Combines the ideas of momentum and RMSprop. It adjusts learning rates automatically, is one of the most widely used optimizers, and works well as a default in many cases.
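
The update rules behind these five optimizers can be sketched in plain NumPy on a toy loss f(w) = (w - 3)², whose minimum is at w = 3. Everything here is illustrative: the function names, the learning rates, and the `minimise` driver are assumptions, and because the loss is deterministic, "SGD" reduces to ordinary gradient descent.

```python
import numpy as np

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def sgd(w, g, s, t, lr=0.1):
    return w - lr * g, s

def momentum(w, g, s, t, lr=0.1, beta=0.9):
    v = beta * s.get("v", 0.0) - lr * g       # keep part of the previous update
    return w + v, {"v": v}

def adagrad(w, g, s, t, lr=1.0, eps=1e-8):
    c = s.get("c", 0.0) + g * g               # running sum of squared gradients
    return w - lr * g / (np.sqrt(c) + eps), {"c": c}

def rmsprop(w, g, s, t, lr=0.05, rho=0.9, eps=1e-8):
    c = rho * s.get("c", 0.0) + (1 - rho) * g * g   # moving average instead of sum
    return w - lr * g / (np.sqrt(c) + eps), {"c": c}

def adam(w, g, s, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * s.get("m", 0.0) + (1 - b1) * g         # first moment (momentum)
    v = b2 * s.get("v", 0.0) + (1 - b2) * g * g     # second moment (RMSprop-style)
    m_hat = m / (1 - b1 ** t)                       # bias correction
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), {"m": m, "v": v}

def minimise(update, steps=300, w=0.0):
    """Run one optimizer for a fixed number of steps from the start point w."""
    state = {}
    for t in range(1, steps + 1):
        w, state = update(w, grad(w), state, t)
    return w

results = {fn.__name__: minimise(fn) for fn in (sgd, momentum, adagrad, rmsprop, adam)}
for name, w_final in results.items():
    print(f"{name:>8}: w = {w_final:.4f}")
```

All five should end near w = 3. RMSprop with a fixed learning rate tends to hover around the minimum rather than settle exactly on it, because its step size is normalized by the recent gradient magnitude.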

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import SGD, Adagrad, RMSprop, Adam

# Define a simple binary classifier with 10 input features
def create_model():
    model = Sequential()
    # An explicit Input layer replaces input_dim on the first Dense layer,
    # which recent Keras versions warn about
    model.add(Input(shape=(10,)))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    return model

# 1. SGD
model_sgd = create_model()
model_sgd.compile(optimizer=SGD(learning_rate=0.01), loss='binary_crossentropy', metrics=['accuracy'])

# 2. SGD with Momentum
model_momentum = create_model()
model_momentum.compile(optimizer=SGD(learning_rate=0.01, momentum=0.9), loss='binary_crossentropy', metrics=['accuracy'])

# 3. AdaGrad
model_adagrad = create_model()
model_adagrad.compile(optimizer=Adagrad(learning_rate=0.01), loss='binary_crossentropy', metrics=['accuracy'])

# 4. RMSprop
model_rmsprop = create_model()
model_rmsprop.compile(optimizer=RMSprop(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])

# 5. Adam
model_adam = create_model()
model_adam.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])
```


**17. What does model.fit() do? What arguments must be given?**
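- The model.fit() function trains a machine learning model on the training data. It takes the input data (X) and the labels (y) and adjusts the model's internal parameters (for example, the weights of a neural network) to minimize the loss.
- Required arguments: the training features and, for supervised models, the matching labels. Settings such as epochs and batch_size are optional arguments of the Keras fit() method; scikit-learn estimators do not take them.

```python
from sklearn.ensemble import RandomForestClassifier

# X_train and y_train are the training split created in the earlier cells
model = RandomForestClassifier()
model.fit(X_train, y_train)  # scikit-learn's fit() takes no epochs argument
```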
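
Which arguments `fit()` needs depends on the kind of estimator. As a minimal sketch on made-up data: a supervised model such as `LogisticRegression` needs both features and labels, while an unsupervised one such as `KMeans` needs only the features. In scikit-learn, `fit()` returns the estimator itself and stores what it learned in attributes ending with an underscore.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Illustrative data: two well-separated groups of points
X = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
              [8.0, 8.0], [9.0, 8.5], [8.5, 9.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Supervised: features and labels are both required
clf = LogisticRegression()
returned = clf.fit(X, y)

# Unsupervised: only the features are passed
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print("fit() returns the estimator itself:", returned is clf)
print("Learned coefficients:", clf.coef_)
print("Learned cluster centers:\n", kmeans.cluster_centers_)
```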

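**18. What does model.predict() do? What arguments must be given?**
- The model.predict() function uses a trained model to make predictions on new, unseen input data.

- The only required argument is the input features (for example, the test data), in the same format and with the same columns as the data used for training. It returns the model’s output: predicted class labels for scikit-learn classifiers, predicted values for regressors, and probabilities for a Keras model with a sigmoid or softmax output layer.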

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data            # Features
y = iris.target          # Target labels

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the model
model = RandomForestClassifier()
model.fit(X_train, y_train)  # model.fit() trains the model

# Make predictions
y_pred = model.predict(X_test)  # model.predict() returns predicted labels

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print("Predicted Labels:", y_pred)
print("Actual Labels:   ", y_test)
print("Accuracy:", accuracy)
```

```
Predicted Labels: [1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0]
Actual Labels:    [1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0]
Accuracy: 1.0
```
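
When probabilities are needed instead of hard labels, many scikit-learn classifiers also offer `predict_proba()`. The sketch below uses `LogisticRegression` on a small made-up dataset; note that both methods expect a 2-D array of shape (n_samples, n_features), even for a single feature.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative one-feature data: low values are class 0, high values are class 1
X_train = np.array([[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X_train, y_train)

# New inputs must be 2-D, with the same number of columns as the training data
X_new = np.array([[2.5], [8.5]])

labels = model.predict(X_new)               # hard class labels
probabilities = model.predict_proba(X_new)  # one column per class, rows sum to 1

print("Labels:", labels)
print("Probabilities:\n", probabilities)
```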


**21. What is feature scaling? How does it help in Machine Learning?**
- Feature scaling is the process of standardizing or normalizing the range of independent variables (features) so that they are on a similar scale.
- Many machine learning algorithms use distance, gradient descent, or weights to learn. If one feature has much larger values than others, it can dominate the learning process.
-	Helps in algorithms like:
	-	KNN (K-Nearest Neighbors)
	-	SVM (Support Vector Machine)
	-	Logistic Regression
	-	Neural Networks
	-	Gradient Descent–based models
-	Faster convergence in optimization
-	Better accuracy and stable training
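
As a minimal sketch of standardization with scikit-learn's `StandardScaler` (the age and salary values are made up): the scaler is fitted on the training data only, and the same learned mean and standard deviation are then applied to the test data, so no information from the test set leaks into training.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative data: age (tens) and salary (tens of thousands) on very different scales
X_train = np.array([[22, 21000], [25, 25000], [47, 47000], [52, 52000]], dtype=float)
X_test = np.array([[36, 36000]], dtype=float)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from training data
X_test_scaled = scaler.transform(X_test)        # reuse the training statistics

print("Training means after scaling:", X_train_scaled.mean(axis=0))
print("Training std after scaling:", X_train_scaled.std(axis=0))
print("Scaled test row:", X_test_scaled)
```

After scaling, each training column has mean 0 and standard deviation 1, so neither feature dominates distance or gradient computations because of its units.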

**22. Explain data encoding?**
- Data encoding is the process of converting categorical (non-numerical) values into a numerical format so that machine learning models can understand and use them.

- Most machine learning algorithms can’t work with text directly; they need numbers. So we encode values like "Male", "Female", "Red", "Green" as numbers.
- Types of encoding
  - Label Encoding: Assigns a unique integer to each label.
  - One-Hot Encoding: Turns each category into its own binary (0/1) column.
  - Ordinal Encoding: Converts ordered categories to numbers that preserve the order.
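
The three encoders are available in `sklearn.preprocessing`. The sketch below applies them to made-up values; note that scikit-learn intends `LabelEncoder` for target labels, and that it numbers classes in sorted order, while `OrdinalEncoder` lets us state the rank explicitly through its `categories` argument.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, OrdinalEncoder

colors = np.array(["Red", "Green", "Blue", "Green"])
sizes = np.array([["Low"], ["High"], ["Medium"], ["Low"]])

# Label encoding: one integer per class, assigned in sorted order
label_codes = LabelEncoder().fit_transform(colors)

# One-hot encoding: one binary column per category (Blue, Green, Red)
one_hot = OneHotEncoder().fit_transform(colors.reshape(-1, 1)).toarray()

# Ordinal encoding: integers follow the order we specify
ordinal = OrdinalEncoder(categories=[["Low", "Medium", "High"]]).fit_transform(sizes)

print("Label codes:", label_codes)
print("One-hot:\n", one_hot)
print("Ordinal:", ordinal.ravel())
```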
