1. What is a parameter?

A parameter is an internal coefficient of a model that is learned automatically during training. It defines how the input data is transformed into output predictions.

Every Machine Learning model works as a mathematical function. Parameters are the unknowns in this function that need to be estimated from data. During training, algorithms optimize parameters to minimize the loss function.

Examples:
In Linear Regression,
y=w⋅x+b

where w = slope (weight) and b = bias (intercept). Both are parameters.
In Logistic Regression, parameters are weights of features in the sigmoid function.
In Neural Networks, millions of parameters exist in the form of weights and biases.
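
To make the idea of "learned" parameters concrete, here is a minimal sketch using scikit-learn's LinearRegression. The toy data is an assumption for illustration: it is generated from y = 2x + 1, so training should recover a weight close to 2 and a bias close to 1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data generated from y = 2x + 1 (no noise).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = 2.0 * X.ravel() + 1.0

model = LinearRegression()
model.fit(X, y)           # training estimates the parameters from data

w = model.coef_[0]        # learned weight (slope)
b = model.intercept_      # learned bias (intercept)
print(f"w = {w:.2f}, b = {b:.2f}")
```

Nothing in the code sets w or b by hand; both are estimated by fit(), which is what separates parameters from hyperparameters.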

Importance:
Parameters represent the knowledge learned from data.
Well-trained parameters → accurate predictions.
Difference from Hyperparameters:
Parameters = learned by model (e.g., weights).
Hyperparameters = set manually before training (e.g., learning rate, number of layers).

2. What is correlation?

Definition:
Correlation is a statistical measure that expresses the strength and direction of a linear relationship between two variables.

Range:
+1 : Perfect positive correlation (variables rise together).
-1 : Perfect negative correlation (one rises, the other falls).
0 : No linear relationship.
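
For reference, the most common correlation measure with this range is the Pearson coefficient. The notes above do not name a specific coefficient, so treating it as Pearson's r is an assumption. For n paired observations (x_i, y_i) with means x̄ and ȳ:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

The numerator is positive when the two variables tend to be above or below their means together, and negative when they move in opposite directions.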

Applications in ML:
Detect redundant features (pairs with very high positive or negative correlation).
Feature selection (drop one variable from each highly correlated pair to avoid multicollinearity).

3. What does negative correlation mean?

Definition:
Negative correlation means there is an inverse relationship between two variables: as one increases, the other decreases.

Example: Number of Study Hours and Number of TV Hours
When the number of hours spent studying increases, the number of hours spent watching television usually decreases.
This relationship represents a negative correlation.

Interpretation in ML:
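If two features are strongly negatively correlated, they carry largely the same information with opposite sign. As with strong positive correlation, keeping both can cause multicollinearity in linear models, so one may be dropped or the two combined during feature engineering.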
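
The study-hours example can be checked numerically. The sketch below uses made-up numbers for seven students (an assumption, not real data) and computes the Pearson correlation with NumPy.

```python
import numpy as np

# Hypothetical daily hours for seven students (made-up numbers).
study_hours = np.array([1, 2, 3, 4, 5, 6, 7])
tv_hours = np.array([6, 5.5, 5, 3.5, 3, 2, 1])

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
r = np.corrcoef(study_hours, tv_hours)[0, 1]
print(f"Pearson r = {r:.2f}")  # negative: more study goes with less TV
```

A value close to -1 means the relationship is almost perfectly linear and inverse.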

4. Define Machine Learning. What are its main components?
Machine Learning is a branch of AI that allows systems to learn from data, identify patterns, and make predictions or decisions without being explicitly programmed.

Key Characteristics:
Data-driven (requires datasets).
Improves performance over time.
Learns patterns automatically.

Main Components:
Data : Raw information collected for training.
Features (X) : Independent input variables.
Target (y) : Output variable to predict.
Model/Algorithm : Mathematical/statistical representation used to map X → y.
Loss Function : Evaluates prediction error.
Optimizer : Algorithm to minimize loss (e.g., Gradient Descent).
Evaluation Metrics : Used to judge model quality (accuracy, F1 score, RMSE).

Example: Predicting house prices → Data = past house sales, Features = area, location, rooms, Target = price.

5. How does loss value determine if a model is good?
The loss function calculates how far predicted values are from actual values.

Interpretation:
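Small loss → model predictions are close to real values → good fit.
Large loss → poor predictions → bad fit.
"Small" and "large" are relative: compare the loss against the scale of the target or a simple baseline, and check it on unseen data, because a very low training loss can also indicate overfitting.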

Common Loss Functions:
Regression → Mean Squared Error (MSE), Mean Absolute Error (MAE).
Classification → Cross-Entropy Loss.

Example:
Predicted house price = ₹50,00,000
Actual = ₹51,00,000
Absolute error = ₹1,00,000 (about 2% of the actual price) → small relative to the price → good prediction.
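
The same house-price numbers give very different loss values depending on the loss function. This is a plain-Python sketch; the helper functions are written here for illustration and are not library calls.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted) squared."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

actual = [5_100_000]      # ₹51,00,000
predicted = [5_000_000]   # ₹50,00,000

mae = mean_absolute_error(actual, predicted)   # in rupees
mse = mean_squared_error(actual, predicted)    # in rupees squared
relative_error = mae / actual[0]               # fraction of the actual price

print(mae, mse, round(relative_error, 4))
```

MAE stays in the target's units, while MSE is in squared units and looks huge for the same prediction. Loss values are therefore only comparable between models that use the same loss function on the same data.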

6. Continuous vs Categorical Variables

Continuous Variables:
Can take any value within a range (infinitely many possible values).
Examples: Temperature, salary, height.
Measured on interval or ratio scales.
Categorical Variables:
Represent groups or categories.

Two types:
Nominal: No order (e.g., gender, blood group).
Ordinal: With order (e.g., education level).
In ML: Continuous values can be used directly (often after scaling); categorical values must be encoded as numbers.

7. Handling categorical variables
ML algorithms require numbers. Techniques:
Label Encoding : Converts categories into integers.
One-Hot Encoding : Creates binary variables.
Target Encoding : Uses mean of target per category.
Frequency Encoding : Uses frequency of categories.
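Impact: Choosing the wrong method can bias the model. For example, label encoding a nominal feature implies an order that does not exist, and target encoding computed on the full dataset leaks the target into the features.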
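
The four techniques can be sketched with pandas alone. The DataFrame, its column names (city, price) and its values are hypothetical; scikit-learn's encoders are an alternative for production pipelines.

```python
import pandas as pd

# Hypothetical toy data: 'city' is a nominal feature, 'price' is the target.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Mumbai", "Delhi", "Pune"],
    "price": [50, 80, 54, 95, 84, 52],
})

# Label encoding: one integer per category (codes follow alphabetical order).
df["city_label"] = df["city"].astype("category").cat.codes

# One-hot encoding: one binary column per category.
one_hot = pd.get_dummies(df["city"], prefix="city")

# Frequency encoding: share of rows that belong to each category.
df["city_freq"] = df["city"].map(df["city"].value_counts(normalize=True))

# Target encoding: mean of the target per category. In practice, compute
# these means on the training split only to avoid target leakage.
df["city_target"] = df["city"].map(df.groupby("city")["price"].mean())

print(df)
print(one_hot)
```

Note that label encoding turns Delhi, Mumbai and Pune into 0, 1 and 2, which a linear model would read as an ordering. For nominal features, one-hot encoding avoids that.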

8. Training vs Testing Dataset

Training Set: Used to fit the model and learn parameters.
Testing Set: Used to evaluate model generalization on unseen data.
Comparing performance on the two sets helps detect overfitting (memorizing training data but failing on new data); the test set itself does not prevent it.
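
The gap between training and testing performance can be seen in a small experiment. This sketch assumes synthetic data from make_classification with noisy labels and an unconstrained decision tree, a model chosen here only because it memorizes easily.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data; flip_y=0.2 assigns random labels to about 20% of samples.
X, y = make_classification(
    n_samples=500, n_features=10, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# A tree with no depth limit can memorize the training set, noise included.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)
print(f"train accuracy = {train_acc:.2f}, test accuracy = {test_acc:.2f}")
```

Perfect training accuracy alongside a clearly lower test accuracy is the typical signature of overfitting, and it only becomes visible because a test set was held out.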

9. What is sklearn.preprocessing?
A module in Scikit-learn providing tools for data transformation.
Functions:
Scaling → StandardScaler, MinMaxScaler.
Encoding → LabelEncoder (for target labels), OrdinalEncoder and OneHotEncoder (for features).
Normalization → Normalizer (rescales each sample to unit norm).
Purpose: Ensures all features are in a suitable format for ML models.

10. What is a Test set?
A subset of data kept aside for final model evaluation.
Purpose: Measures how well the trained model generalizes to unseen data.

11. Data Splitting in Python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

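test_size=0.2 → 20% of the data is held out for testing.
random_state=42 → fixes the shuffle so the split is reproducible.
Evaluating on data the model never saw during training gives a fair estimate of its performance.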

12. How do you approach an ML problem?

Steps:
Define the problem.
Collect and understand data.
Perform EDA.
Preprocess (missing values, encoding, scaling).
Split dataset.
Select algorithm.
Train model.
Evaluate using metrics.
Tune hyperparameters.
Deploy & monitor.
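
The middle steps of this workflow (preprocess, split, select, train, evaluate, tune) can be sketched in a few lines. The dataset (scikit-learn's bundled breast cancer data), the model and the grid of C values are assumptions chosen to keep the example self-contained.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Define the problem / collect data: binary classification on a bundled dataset.
X, y = load_breast_cancer(return_X_y=True)

# Split dataset: hold out 20% for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Preprocess + select algorithm: scaling and the model in one pipeline.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Train + tune hyperparameters: 5-fold cross-validation on the training data.
search = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Evaluate: score once on the untouched test set.
test_accuracy = accuracy_score(y_test, search.predict(X_test))
print(search.best_params_, f"test accuracy = {test_accuracy:.3f}")
```

Putting the scaler inside the pipeline means it is re-fitted on each cross-validation fold, so no information from validation or test data reaches the preprocessing step.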

13. Why do EDA before model fitting?

Detect missing data.
Check feature distributions.
Identify outliers.
Detect correlations between variables.
Helps in feature engineering.
Without EDA, the model might learn from biased or incomplete data.
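
A few pandas calls cover most of the checks listed above. The DataFrame below is hypothetical, with one missing value and one extreme value planted on purpose; the 1.5 × IQR rule is one common outlier heuristic, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value and one extreme value.
df = pd.DataFrame({
    "area_sqft": [850, 900, 1200, np.nan, 1100, 950, 9000],
    "rooms": [2, 2, 3, 3, 3, 2, 4],
})

missing_counts = df.isna().sum()   # missing values per column
summary = df.describe()            # count, mean, std, quartiles per column

# Flag outliers in one column with the 1.5 * IQR rule.
q1, q3 = df["area_sqft"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["area_sqft"] < q1 - 1.5 * iqr) | (df["area_sqft"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

correlations = df.corr()           # pairwise Pearson correlations

print(missing_counts, summary, outliers, correlations, sep="\n\n")
```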

14. Finding correlation in Python
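import pandas as pd
# df is assumed to be an existing pandas DataFrame
corr_matrix = df.corr(numeric_only=True)  # pairwise Pearson correlation of numeric columns
Visualization:
import seaborn as sns
import matplotlib.pyplot as plt
sns.heatmap(corr_matrix, annot=True, cmap="coolwarm")
plt.show()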

15. What is causation? Difference from correlation

Correlation: Two variables move together, but one does not necessarily cause the other.
Causation: A change in one variable directly produces a change in the other.

Example:
Ice cream sales and drowning incidents rise together (correlation).
Neither causes the other: hot weather increases both, so heat is the common cause (a confounding variable).
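
The example can be simulated. In this sketch all numbers are invented: temperature drives both quantities, and there is no direct link between ice cream sales and drownings. The residuals helper is written here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated data: temperature is the only common driver.
temperature = rng.normal(25, 5, n)
ice_cream_sales = 2.0 * temperature + rng.normal(0, 3, n)
drownings = 0.5 * temperature + rng.normal(0, 2, n)

# Raw correlation between the two effects.
raw_corr = np.corrcoef(ice_cream_sales, drownings)[0, 1]

def residuals(values, driver):
    """Remove the part of `values` explained by a straight-line fit on `driver`."""
    slope, intercept = np.polyfit(driver, values, 1)
    return values - (slope * driver + intercept)

# Correlation after removing the effect of temperature from both variables.
partial_corr = np.corrcoef(
    residuals(ice_cream_sales, temperature),
    residuals(drownings, temperature),
)[0, 1]

print(f"raw = {raw_corr:.2f}, after controlling for temperature = {partial_corr:.2f}")
```

The raw correlation is strong, but once temperature is accounted for it drops to roughly zero. That is what we would expect when the relationship comes from a common cause rather than from one variable influencing the other.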


16. Optimizers in ML
Definition: Algorithms that adjust model parameters to minimize loss.

Types:
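Gradient Descent → Updates weights in the direction opposite to the gradient (slope) of the loss function, using the full dataset for each step.
SGD : Stochastic Gradient Descent; updates using one sample (or a small mini-batch) at a time.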
Adam : Adaptive learning rate, widely used.
RMSProp : Uses moving average of squared gradients.
Importance: Helps in fast and efficient convergence.
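
Plain gradient descent is short enough to write by hand. The sketch below fits y = w·x + b with MSE loss on toy data generated from y = 2x + 1. The learning rate and number of steps are assumptions picked for this small problem.

```python
# Toy data from y = 2x + 1 (no noise).
xs = [i / 10 for i in range(11)]          # 0.0, 0.1, ..., 1.0
ys = [2.0 * x + 1.0 for x in xs]

w, b = 0.0, 0.0           # parameters, initialized at zero
learning_rate = 0.1       # hyperparameter, set before training
n = len(xs)

for step in range(5000):
    # Prediction error for every sample with the current parameters.
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]

    # Gradients of MSE = mean(error^2) with respect to w and b.
    grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
    grad_b = 2.0 / n * sum(errors)

    # Move against the gradient to reduce the loss.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"w = {w:.4f}, b = {b:.4f}")
```

SGD would compute the same gradients from one sample (or a mini-batch) per step instead of all samples. Adam and RMSProp additionally rescale each step using running statistics of past gradients.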

17. sklearn.linear_model
A scikit-learn module for linear models.
Includes:
LinearRegression
LogisticRegression
Ridge, Lasso, ElasticNet

18. model.fit()
Trains the model on given training data.

Arguments:
X_train → Features.
y_train → Labels.

19. model.predict()
Uses trained model to make predictions.
Arguments: X_test → Features for prediction.
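
A minimal sketch of fit() followed by predict(), using LogisticRegression. The data (hours studied versus pass or fail) is made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: hours studied -> fail (0) or pass (1).
X_train = np.array([[0], [1], [2], [3], [4], [5]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[1], [4]])

model = LogisticRegression()
model.fit(X_train, y_train)                   # learns coef_ and intercept_

predictions = model.predict(X_test)           # predicted class labels
probabilities = model.predict_proba(X_test)   # one column per class

print(predictions)
print(probabilities.round(2))
```

fit() needs both features and labels, while predict() needs only features. For classifiers, predict_proba() returns the class probabilities behind each predicted label.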

20. Feature Scaling
Process of adjusting values of features to a common scale.
It prevents features with large numeric ranges from dominating features with small ranges.
It speeds up convergence in gradient descent.

Methods:
Standardization (Z-score): rescales each feature to mean 0 and standard deviation 1.
Normalization (Min-Max): rescales each feature to the 0–1 range.
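
Written out, the two methods are as follows. Here μ and σ are the mean and standard deviation of the feature, and x_min and x_max are its smallest and largest values. These symbols are introduced here for illustration.

```latex
\text{Standardization:}\quad z = \frac{x - \mu}{\sigma}
\qquad\qquad
\text{Min-max normalization:}\quad x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
```

For example, a feature with μ = 50 and σ = 10 maps the value 70 to z = 2. With x_min = 0 and x_max = 100, the same value maps to x' = 0.7.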

21. Scaling in Python
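from sklearn.preprocessing import StandardScaler, MinMaxScaler

# X is assumed to be an existing numeric feature matrix.
# Standardization: zero mean and unit variance per feature.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Min-max normalization: each feature rescaled to the 0–1 range.
scaler = MinMaxScaler()
X_normalized = scaler.fit_transform(X)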
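
When the data is split into training and testing sets, a common practice is to fit the scaler on the training set only and reuse its statistics on the test set. The sketch below assumes random synthetic features, used only to make it runnable.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic feature matrix: 100 samples, 3 features (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(100, 3))

X_train, X_test = train_test_split(X, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from training data
X_test_scaled = scaler.transform(X_test)        # reuse them; do not re-fit

print(X_train_scaled.mean(axis=0).round(6), X_train_scaled.std(axis=0).round(6))
```

Calling fit_transform on the full dataset before splitting would let test-set statistics influence training, a form of data leakage.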

22. Data Encoding
Converting categorical variables into numeric values.
Methods: Label Encoding, One-Hot Encoding, Target Encoding.
Importance: Essential since ML algorithms require numerical input.