
**Q1) What is a parameter?**

In Machine Learning (ML), a parameter is a variable that the model learns from the training data.

These parameters define how the model makes predictions and are updated during the training process.

**Examples of Parameters in ML:**

- Weights (W) & Biases (b) in neural networks and linear regression models.

- Coefficients in logistic regression.

- Split features and thresholds chosen at each node of a decision tree.
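To make "learned from the training data" concrete, here is a minimal sketch that fits scikit-learn's LinearRegression and reads back the learned weight and bias. The synthetic data (generated from y = 2x + 1) and the variable names are assumptions made for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data generated from y = 2x + 1 (illustrative assumption)
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

model = LinearRegression()
model.fit(X, y)  # the parameters are estimated during this call

learned_weight = model.coef_[0]   # weight (W), expected to be close to 2
learned_bias = model.intercept_   # bias (b), expected to be close to 1
print(f"weight={learned_weight:.2f}, bias={learned_bias:.2f}")
```

The values in `coef_` and `intercept_` are parameters: they are not set by hand but estimated by `fit()`.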

**Q2) What is correlation?**

Correlation measures the strength and direction of the relationship between two variables, that is, how one variable tends to change as the other changes.

It shows whether an increase in one feature is associated with an increase or a decrease in another. It does not by itself show that one causes the other.

**What does negative correlation mean?**

A negative correlation means that when one variable increases, the other decreases.

The closer the correlation value is to -1, the stronger the inverse relationship.

**Q3) Define Machine Learning. What are the main components in Machine Learning?**

Machine Learning (ML) is a branch of artificial intelligence (AI) that enables computers to learn from data and make predictions or decisions without being explicitly programmed.

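It identifies patterns and relationships in data to improve performance over time.

The main components are the data (features and labels), the model, a loss function that measures prediction error, an optimizer that updates the model's parameters, and an evaluation step on held-out data.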

**Example**: A spam filter learns from past emails to classify new emails as spam or not.

**Q4) How does the loss value help in determining whether the model is good or not?**

The loss value is a numerical representation of how far the model's predictions are from the actual values.

It helps in determining whether the model is good or needs improvement.

A lower loss means better performance, while a higher loss indicates poor predictions.

A good model is one that generalizes well to unseen data, meaning it performs well not just on the training set but also on the test/validation set.

You can determine this using loss values, evaluation metrics, and visual analysis.
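As a small illustration of "lower loss means better performance", the sketch below compares two sets of predictions using mean squared error (regression) and log loss (classification) from scikit-learn. All numbers are made-up example values.

```python
from sklearn.metrics import log_loss, mean_squared_error

# Regression: mean squared error between actual and predicted values
y_true_reg = [3.0, 5.0, 7.0]
y_pred_good = [2.9, 5.1, 7.2]   # close to the actual values
y_pred_bad = [1.0, 8.0, 4.0]    # far from the actual values
mse_good = mean_squared_error(y_true_reg, y_pred_good)
mse_bad = mean_squared_error(y_true_reg, y_pred_bad)

# Classification: log loss is computed from predicted probabilities of class 1
y_true_clf = [0, 1, 1, 0]
proba_good = [0.1, 0.9, 0.8, 0.2]   # confident and correct
proba_bad = [0.6, 0.4, 0.3, 0.7]    # leaning toward the wrong class
ll_good = log_loss(y_true_clf, proba_good)
ll_bad = log_loss(y_true_clf, proba_bad)

print(f"MSE good={mse_good:.3f}, bad={mse_bad:.3f}")
print(f"Log loss good={ll_good:.3f}, bad={ll_bad:.3f}")
```

In both cases the better predictions give the smaller loss. Comparing the loss on the training set with the loss on a validation set is what reveals overfitting.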

In [None]:
from sklearn.metrics import accuracy_score

# Actual vs. Predicted labels for classification
y_true = [0, 1, 1, 0, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1, 0, 1]

# Calculate accuracy
accuracy = accuracy_score(y_true, y_pred)
print(f"Model Accuracy: {accuracy:.2f}")

Model Accuracy: 0.86


**Q5) What are continuous and categorical variables?**

**Continuous Variables** (Numeric Data)

Definition: Variables that can take an infinite number of values within a range.

Examples:
Height (e.g., 5.6 feet, 6.1 feet)
Weight (e.g., 65.5 kg, 72.3 kg)
Temperature (e.g., 36.5°C, 40.2°C)

Characteristics:
Can have decimal values (e.g., 5.75)
Measured, not counted
Often used in regression models

**Categorical Variables**

Definition: Variables that represent distinct groups or categories.

Examples:
Binary: Yes/No, Male/Female
Nominal (No order): Colors (Red, Blue, Green), Car Brands (Toyota, BMW)
Ordinal (Has order): Education Level (High School, Bachelor’s, Master’s)

**Characteristics**:

Represent categories or groups

Can be nominal (no order) or ordinal (ordered categories)

Used in classification models
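A minimal pandas sketch of how the two kinds of variables can be told apart in practice. The DataFrame and its column names are hypothetical; the ordinal column is given an explicit category order.

```python
import pandas as pd

# Hypothetical dataset mixing continuous and categorical columns
df = pd.DataFrame({
    "height_cm": [170.5, 172.2, 165.0],
    "weight_kg": [65.3, 72.8, 58.1],
    "color": ["Red", "Blue", "Green"],                           # nominal
    "education": ["Bachelor's", "High School", "Master's"],      # ordinal
})

# Make the order of the ordinal variable explicit
edu_order = ["High School", "Bachelor's", "Master's"]
df["education"] = pd.Categorical(df["education"], categories=edu_order, ordered=True)

# Numeric columns are treated as continuous, everything else as categorical
continuous_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

print("Continuous:", continuous_cols)
print("Categorical:", categorical_cols)
print("Lowest education level:", df["education"].min())
```

Note that a numeric dtype alone does not prove a variable is continuous (for example, a numeric ID code is still categorical), so the split above is only a first pass.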


**Q6) How do we handle categorical variables in Machine Learning? What are the common techniques?**

Handling categorical variables in ML:

One-Hot Encoding – Creates binary columns for each category (best for low-cardinality nominal data).

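Label Encoding – Assigns a unique integer to each category. The integers carry no meaningful order (scikit-learn's LabelEncoder assigns them alphabetically and is intended for target labels), so use Ordinal Encoding for ordered features.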

Ordinal Encoding – Respects category order (e.g., "Low" < "Medium" < "High").

Target Encoding – Replaces categories with target mean (risk of data leakage).

Frequency Encoding – Uses category occurrence count.

Hash Encoding – Converts categories into fixed-length numerical hashes (useful for high-cardinality data).
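Hash encoding is the least familiar technique in this list, so here is a minimal sketch using scikit-learn's FeatureHasher. The city names and the choice of 8 output columns are assumptions for illustration.

```python
from sklearn.feature_extraction import FeatureHasher

# Hypothetical high-cardinality feature (kept tiny here for readability)
cities = ["Dubai", "Abu Dhabi", "Dubai", "Sharjah"]

# Each sample must be an iterable of strings, so every city is wrapped in a list
hasher = FeatureHasher(n_features=8, input_type="string")
hashed = hasher.transform([[city] for city in cities]).toarray()

print(hashed.shape)   # one row per sample, a fixed number of columns
print(hashed)
```

The number of output columns stays fixed no matter how many distinct categories appear. The trade-off is that different categories can collide in the same column.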

**Q7) What do you mean by training and testing a dataset?**

In Machine Learning, training and testing a dataset refers to splitting data to evaluate model performance.

Training Dataset: Used to train the model by learning patterns and relationships.

Testing Dataset: Used to assess the model’s accuracy and generalization on unseen data.

Typically, data is split 80% for training and 20% for testing, but ratios can vary. Evaluating on held-out data helps detect overfitting and gives an estimate of how the model will perform on new data.

**Q8) What is sklearn.preprocessing?**

sklearn.preprocessing is a module in Scikit-Learn that provides tools for transforming and normalizing data before training ML models.

It helps improve model performance by scaling, encoding, and modifying features.

**Q9) What is a Test set?**

In machine learning, a test set is a dataset used to evaluate the performance of a trained model.

It consists of data that the model has never seen before during training, allowing for an unbiased assessment of how well the model generalizes to new data.

Key Points About the Test Set:
- Used for final evaluation – It helps determine the real-world effectiveness of the model.
- Separate from training & validation sets – Prevents data leakage and an over-optimistic performance estimate.
- Performance metrics – Common evaluation metrics include accuracy, precision, recall, F1-score, and RMSE.

Example Usage

Training Set – Used to train the model.

Validation Set – Used to fine-tune hyperparameters.

Test Set – Used to evaluate the final model’s accuracy and generalization ability.
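The three sets above can be produced by calling train_test_split twice. This is a minimal sketch with a toy dataset; the 60/20/20 proportions are an assumption, not a rule.

```python
from sklearn.model_selection import train_test_split

# Toy dataset: 100 samples with one feature and a binary target
X = [[i] for i in range(100)]
y = [i % 2 for i in range(100)]

# First hold out 20% of the data as the final test set
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Then split the remaining 80% into training and validation sets
# (0.25 of the remaining 80% equals 20% of the full dataset)
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=42
)

print(len(X_train), len(X_val), len(X_test))
```

The test set is only touched once, after hyperparameters have been chosen using the validation set.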

**Q10) How do we split data for model fitting (training and testing) in Python?** **How to Approach a Machine Learning Problem?**

In Python, we typically use scikit-learn's train_test_split() function to split data into training and test sets.

In [None]:
from sklearn.model_selection import train_test_split

# Sample dataset (features X, target y)
X = [[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# Splitting data (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print("Training set size:", len(X_train))
print("Test set size:", len(X_test))

Training set size: 8
Test set size: 2


To approach a machine learning problem, a structured workflow keeps the work efficient and the results reliable.

Step-by-Step Approach:
Step 1: Define the Problem

Understand the business problem.

Identify the type of ML problem (Classification, Regression, Clustering, etc.).

Step 2: Collect and Explore Data

Gather relevant data (CSV, databases, APIs).
Perform Exploratory Data Analysis (EDA): check for missing values, outliers, distributions.

Step 3: Preprocess the Data

Handle missing values, duplicates.
Encode categorical variables.
Normalize/Standardize numerical features if needed.

Step 4: Split Data

Use train_test_split() to create training and testing sets.
Sometimes, create a validation set for hyperparameter tuning.

Step 5: Choose a Model

Select a suitable algorithm (e.g., Decision Tree, SVM, Neural Networks).

Step 6: Train the Model

Fit the model using the training data.
Tune hyperparameters for better performance.

Step 7: Evaluate the Model

Use metrics like accuracy, precision, recall, RMSE, R² score to measure performance.
Check for overfitting/underfitting.

Step 8: Improve the Model

Try different feature engineering techniques.
Tune hyperparameters using GridSearchCV or RandomizedSearchCV.
Experiment with different models (e.g., Ensemble Learning).

Step 9: Deploy the Model

Save the model using joblib or pickle.
Deploy it as an API using Flask, FastAPI, or Django.

Step 10: Monitor & Maintain

Track performance on new data.

Update the model as data evolves.
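Steps 3 to 7 can be sketched compactly with a scikit-learn pipeline. This is an illustrative example under assumptions: the Iris dataset and Logistic Regression are stand-ins chosen for brevity, not a recommendation for any particular problem.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 2: collect data (a built-in toy dataset stands in for real data)
X, y = load_iris(return_X_y=True)

# Step 4: split the data, keeping class proportions equal in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Steps 3 and 5: preprocessing and model combined in one pipeline
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Step 6: train on the training set only
model.fit(X_train, y_train)

# Step 7: evaluate on data the model has not seen
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {test_accuracy:.2f}")
```

Putting the scaler inside the pipeline means it is fit on the training data only, which avoids leaking information from the test set.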

**Q11) Why do we have to perform EDA before fitting a model to the data?**

Exploratory Data Analysis (EDA) is a crucial step in Machine Learning because it helps you understand, clean, and prepare your data before fitting it into a model.

Here’s why it’s essential:

1. Understand the Data Structure
Identifies features (columns) and the target variable.
Reveals data types (numerical, categorical, text).
Helps decide which features to use in the model.

2. Detect & Handle Missing Values
Missing data can cause errors or bias in the model.
Common techniques: removal, imputation (mean, median, mode), or using algorithms that handle missing data.

3. Identify Outliers
Outliers can skew predictions and cause poor model performance.
Visualization tools like box plots and scatter plots help detect them.
Outliers can be removed or treated using transformations.

4. Detect Data Imbalance (for Classification)
Imbalanced datasets (e.g., 95% Class A, 5% Class B) lead to biased models.
Solution: Use oversampling (SMOTE), undersampling, or weighted loss functions.

5. Check Feature Correlations
Identifies highly correlated features, which may lead to multicollinearity.
Diagnostics such as the Variance Inflation Factor (VIF) help find redundant features to remove.

6. Choose the Right Data Transformations
Normalization & Standardization for numerical features (especially for distance-based models like KNN, SVM).
Encoding categorical variables (One-Hot Encoding, Label Encoding).

7. Select the Right Model & Feature Engineering
EDA guides feature selection, engineering, and model choice.
Example: If features are highly correlated, tree-based models such as Decision Trees are less affected by multicollinearity than Linear Regression.
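Several of the checks above take one line each in pandas. The sketch below uses a small made-up DataFrame (column names and values are assumptions) with one missing value, one implausible age, and an imbalanced target.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value, an outlier, and class imbalance
df = pd.DataFrame({
    "age": [25, 30, np.nan, 40, 45, 50, 200],
    "salary": [3000, 4000, 5000, 6000, 7000, 8000, 9000],
    "purchased": [0, 0, 0, 0, 0, 1, 0],
})

missing_counts = df.isna().sum()                                  # missing values per column
summary = df.describe()                                           # ranges and distributions
class_balance = df["purchased"].value_counts(normalize=True)      # class proportions
correlations = df.corr()                                          # pairwise feature correlations

# Flag outliers in "age" with the 1.5 * IQR rule
q1, q3 = df["age"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["age"] < q1 - 1.5 * iqr) | (df["age"] > q3 + 1.5 * iqr)]

print(missing_counts)
print(class_balance)
print(outliers)
```

The 1.5 * IQR rule is only one common convention for flagging outliers; box plots use the same rule visually.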


**Q12) What is correlation?**

Correlation is a statistical measure that describes the relationship between two variables. It shows whether and how strongly they move together.

Positive correlation: Both variables increase or decrease together.

Negative correlation: One variable increases while the other decreases.

No correlation: No clear relationship between the variables.

It’s usually measured using the Pearson correlation coefficient (r), ranging from -1 (perfect negative) to +1 (perfect positive), with 0 meaning no linear correlation.

**Q13) What does negative correlation mean?**

Negative correlation means that when one variable increases, the other decreases. They move in opposite directions.

For example:

The more you exercise, the less you weigh.

The more time spent on social media, the lower the grades (possibly).

**Q14) How can you find correlation between variables in Python?**

You can find the correlation between variables in Python using Pandas, NumPy, or SciPy. Here are some common methods:

Using corr() in Pandas (for DataFrames)

In [1]:
import pandas as pd

# Sample data
data = {'A': [1, 2, 3, 4, 5], 'B': [5, 4, 3, 2, 1]}
df = pd.DataFrame(data)

# Compute correlation
correlation = df.corr()
print(correlation)


     A    B
A  1.0 -1.0
B -1.0  1.0


Using corrcoef() in NumPy (for arrays)

In [2]:
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 4, 3, 2, 1])

correlation_matrix = np.corrcoef(x, y)
print(correlation_matrix)


[[ 1. -1.]
 [-1.  1.]]


Using pearsonr() in SciPy (also returns a p-value)

In [3]:
from scipy.stats import pearsonr

x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]

corr, _ = pearsonr(x, y)
print(f'Pearson correlation: {corr}')

Pearson correlation: -1.0


**Q15) What is causation? Explain difference between correlation and causation with an example.**


Causation means that one event directly causes another. If A causes B, changing A will directly result in a change in B.

Correlation vs. Causation

Correlation: Two variables are related but one does not necessarily cause the other.

Causation: One variable directly influences the other.

Example:
Correlation: Ice cream sales and drowning incidents increase in summer.

More ice cream does not cause drowning! They are related due to hot weather (a third factor).

Causation: Drinking alcohol and impaired driving ability.
More alcohol directly reduces driving ability.
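The ice cream example can be simulated. In this sketch (all numbers are invented for illustration), temperature drives both ice cream sales and drownings, and neither affects the other. The two are strongly correlated, but the correlation almost disappears once the effect of temperature is removed from both.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Temperature is the hidden third factor that drives both variables
temperature = rng.normal(25, 5, n)
ice_cream = 2.0 * temperature + rng.normal(0, 3, n)
drownings = 0.5 * temperature + rng.normal(0, 2, n)

# Raw correlation between the two effects
raw_corr = np.corrcoef(ice_cream, drownings)[0, 1]

def residuals(target, predictor):
    """Remove the linear effect of predictor from target."""
    slope, intercept = np.polyfit(predictor, target, 1)
    return target - (slope * predictor + intercept)

# Correlation after controlling for temperature (a partial correlation)
partial_corr = np.corrcoef(
    residuals(ice_cream, temperature),
    residuals(drownings, temperature),
)[0, 1]

print(f"Raw correlation: {raw_corr:.2f}")
print(f"After controlling for temperature: {partial_corr:.2f}")
```

A high raw correlation with a near-zero partial correlation is the signature of a confounding variable rather than a causal link.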

**Q16) What is an Optimizer? What are different types of optimizers? Explain each with an example.**

An optimizer is an algorithm used in machine learning and deep learning to adjust the parameters (weights and biases) of a model to minimize the loss function and improve performance.

It helps the model learn by updating weights efficiently to achieve better predictions.

Types of Optimizers

Optimizers can be broadly classified into two categories:

First-order optimizers (use gradients of the loss function)

Second-order optimizers (use second derivatives like Hessian matrices, which are computationally expensive)

The most commonly used optimizers are first-order optimizers based on Gradient Descent.

1. Gradient Descent (GD)
This is the simplest optimization algorithm that minimizes the loss function by iteratively updating the weights in the opposite direction of the gradient.

Types of Gradient Descent:
Batch Gradient Descent:

Computes gradient using the entire dataset.
Pros: Converges smoothly.
Cons: Slow for large datasets.
Example: Training a simple linear regression model on a small dataset.

Stochastic Gradient Descent (SGD):

Updates weights for each individual data point.
Pros: Faster, works well for large datasets.
Cons: High variance in updates, can be noisy.
Example: Training an image classification model on millions of images.

Mini-batch Gradient Descent:

Uses small batches of data instead of the whole dataset or a single data point.
Pros: Balances speed and stability.
Example: Training a deep learning model using batches of 32 or 64 samples.
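The three variants differ only in how many samples are used per update, which the following NumPy sketch makes explicit through a single batch_size argument. The function name, learning rates, and synthetic data are assumptions for illustration, not a reference implementation.

```python
import numpy as np

def gradient_descent(X, y, batch_size, lr=0.1, epochs=200, seed=0):
    """Fit y ~ X @ w + b by minimizing mean squared error."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        order = rng.permutation(n_samples)          # reshuffle every epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            error = X[idx] @ w + b - y[idx]
            grad_w = 2 * X[idx].T @ error / len(idx)   # dLoss/dw on this batch
            grad_b = 2 * error.mean()                  # dLoss/db on this batch
            w -= lr * grad_w                           # step against the gradient
            b -= lr * grad_b
    return w, b

# Synthetic noise-free data from y = 3x + 0.5
rng = np.random.default_rng(1)
X = rng.normal(size=(64, 1))
y = 3.0 * X[:, 0] + 0.5

w_batch, b_batch = gradient_descent(X, y, batch_size=len(X))     # batch GD
w_mini, b_mini = gradient_descent(X, y, batch_size=16)           # mini-batch GD
w_sgd, b_sgd = gradient_descent(X, y, batch_size=1, lr=0.01)     # stochastic GD

print(w_batch, b_batch)
print(w_mini, b_mini)
print(w_sgd, b_sgd)
```

All three recover a weight near 3 and a bias near 0.5 here. On real, noisy data the single-sample variant typically needs a smaller learning rate, which is why one is used above.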

2. Momentum-based Optimizer
Momentum helps the optimizer move faster by accumulating past gradients, reducing oscillations in the updates.

Example:
Used in deep learning models to speed up convergence in CNNs and RNNs.

3. Adaptive Learning Rate Optimizers
These adjust the learning rate dynamically during training.

(i) AdaGrad (Adaptive Gradient Algorithm)
Gives larger updates for infrequent parameters and smaller updates for frequent ones.
Pros: Good for sparse data (e.g., NLP).
Cons: Learning rate keeps decreasing over time.
Example:
Used in text-based applications like word embeddings.

(ii) RMSprop (Root Mean Square Propagation)
Solves AdaGrad's problem by maintaining an exponentially decaying average of squared gradients.
Pros: Works well for RNNs.
Cons: Requires careful tuning.
Example:
Used in speech recognition and NLP models.

(iii) Adam (Adaptive Moment Estimation)
Combines Momentum and RMSprop, making it one of the most widely used optimizers.
Pros: Works well for most deep learning problems.
Cons: Can sometimes generalize poorly.
Example:
Used in training deep neural networks for tasks like image recognition and NLP.

4. AdamW (Adam with Weight Decay)
An improved version of Adam that decouples weight decay from the gradient-based update instead of folding it into the gradient as L2 regularization.
Pros: Better generalization.
Example: Used in transformers like BERT for NLP tasks.
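To show how Adam combines momentum with an RMSprop-style adaptive step, here is a minimal NumPy sketch of the Adam update applied to a simple quadratic function. The function names, the test function, and the hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np

def adam_minimize(grad_fn, theta0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimize a function given its gradient, using the Adam update rule."""
    theta = np.asarray(theta0, dtype=float).copy()
    m = np.zeros_like(theta)   # moving average of gradients (momentum part)
    v = np.zeros_like(theta)   # moving average of squared gradients (RMSprop part)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)   # bias correction for the zero initialization
        v_hat = v / (1 - beta2**t)
        theta -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Gradient of f(theta) = (theta[0] - 3)^2 + 10 * (theta[1] + 1)^2
def grad(theta):
    return np.array([2 * (theta[0] - 3.0), 20 * (theta[1] + 1.0)])

theta_opt = adam_minimize(grad, theta0=[0.0, 0.0])
print(theta_opt)   # should approach the minimum at (3, -1)
```

Dividing by the square root of `v_hat` gives each parameter its own effective learning rate, so the steep and the shallow direction of this function are handled with the same `lr`.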


**Q17) What is sklearn.linear_model?**

sklearn.linear_model is a module in Scikit-Learn that provides various linear models for regression and classification tasks.

It includes popular algorithms like Linear Regression, Logistic Regression, Ridge, Lasso, ElasticNet, SGD (Stochastic Gradient Descent), and more.

In [4]:
from sklearn.linear_model import LinearRegression
model = LinearRegression()

In [5]:
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()

In [6]:
from sklearn.linear_model import Ridge, Lasso
ridge = Ridge(alpha=1.0)
lasso = Lasso(alpha=0.1)

In [7]:
from sklearn.linear_model import ElasticNet
elastic = ElasticNet(alpha=0.1, l1_ratio=0.5)

In [8]:
from sklearn.linear_model import SGDRegressor, SGDClassifier
sgd_reg = SGDRegressor()
sgd_clf = SGDClassifier()

When to Use sklearn.linear_model?

When working with structured/tabular data.

When assuming a linear relationship between input features and target.

When using regularization to prevent overfitting.

When dealing with large datasets (SGD works well with online learning).
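As an illustration of the regularization point, this sketch compares plain LinearRegression with Lasso on synthetic data in which only two of five features matter. The data and the value alpha=0.1 are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data: only the first two of five features influence the target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

print("OLS coefficients:  ", np.round(ols.coef_, 3))
print("Lasso coefficients:", np.round(lasso.coef_, 3))
```

Ordinary least squares assigns small non-zero weights to the irrelevant features, while the L1 penalty in Lasso sets them to zero and slightly shrinks the relevant ones.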

**Q18) What does model.fit() do? What arguments must be given?**

In machine learning, model.fit() is a method used to train a model using provided data. It adjusts the model’s parameters based on input data and corresponding target labels.

Functionality of model.fit()

When called on a neural network (e.g., in Keras), model.fit():

Feeds the input data to the model
Performs forward and backward passes (calculates loss and updates model weights using backpropagation)
Iterates over multiple epochs (passes over the entire dataset multiple times)
Monitors training metrics (e.g., loss, accuracy)

Arguments of model.fit()

The required arguments depend on the library (e.g., scikit-learn, TensorFlow/Keras). In scikit-learn, supervised estimators are trained with fit(X, y). Here’s how it works in Keras (TensorFlow):

Required Arguments
x: Input training data (numpy array, tensor, or dataset)
y: Target labels (for supervised learning)

Common Optional Arguments
epochs: Number of times the model will iterate over the dataset
batch_size: Number of samples per gradient update
validation_data: Data used for validation (tuple of (x_val, y_val))
shuffle: Whether to shuffle data before each epoch (True by default)
callbacks: List of functions to monitor and modify training (e.g., EarlyStopping)
verbose: Logging level (0 for silent, 1 for progress bar, 2 for one log per epoch)

In [9]:
from tensorflow import keras
import numpy as np

# Dummy dataset
x_train = np.random.rand(1000, 10)
y_train = np.random.randint(0, 2, size=(1000,))

# Define a simple model
model = keras.Sequential([
    keras.Input(shape=(10,)),  # declare the input shape explicitly
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)

Epoch 1/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 2s 23ms/step - accuracy: 0.4880 - loss: 0.7006 - val_accuracy: 0.5200 - val_loss: 0.6829
Epoch 2/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.5017 - loss: 0.7039 - val_accuracy: 0.5050 - val_loss: 0.6837
Epoch 3/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.5367 - loss: 0.6948 - val_accuracy: 0.5400 - val_loss: 0.6839
Epoch 4/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.5409 - loss: 0.6910 - val_accuracy: 0.5650 - val_loss: 0.6840
Epoch 5/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.5544 - loss: 0.6878 - val_accuracy: 0.5800 - val_loss: 0.6841
Epoch 6/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.5579 - loss: 0.6844 - val_accuracy: 0.5700 - val_loss: 0.6841
Epoch 7/10
25/25 ━━━━━━━━━━━━━━━━━━━━

<keras.src.callbacks.history.History at 0x7ff2e293d910>

**Q19) What does model.predict() do? What arguments must be given?**

The model.predict() method generates predictions from a trained machine learning model. It takes input data and returns the model's output, typically probabilities, regression values, or class labels (depending on the model type).

Functionality of model.predict()

Takes input data (without labels, since we're making predictions).
Performs a forward pass through the model.
Outputs predictions (e.g., probabilities, regression values, or class labels).

Arguments of model.predict()

In TensorFlow/Keras, the most commonly used arguments are:

Required
x: The input data (NumPy array, TensorFlow tensor, or dataset).

Optional
batch_size: Number of samples per batch for computation (defaults to 32 if unspecified).
verbose: Logging level (0 = silent, 1 = progress bar).
steps: Total number of batches to process before stopping (by default the whole input is used).
callbacks: List of callbacks to apply during prediction.


In [10]:
import numpy as np

# Dummy test data
x_test = np.random.rand(5, 10)

# Generate predictions
predictions = model.predict(x_test)

# Print results
print(predictions)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 148ms/step
[[0.42121094]
 [0.42983934]
 [0.5119582 ]
 [0.5330314 ]
 [0.47106582]]


If your model is a classification model, predictions will contain probabilities (e.g., for binary classification, values between 0 and 1).

For binary classification, you may convert probabilities to class labels like this:

In [11]:
predicted_labels = (predictions > 0.5).astype(int)
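For comparison, scikit-learn classifiers separate the two outputs: predict_proba() returns probabilities and predict() returns class labels directly. The sketch below uses a LogisticRegression on synthetic data; the data and variable names are assumptions and are unrelated to the Keras model above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

clf = LogisticRegression().fit(X_train, y_train)

X_new = rng.normal(size=(5, 2))                      # unlabeled inputs
probabilities = clf.predict_proba(X_new)[:, 1]       # probability of class 1
labels_from_threshold = (probabilities > 0.5).astype(int)
labels_from_predict = clf.predict(X_new)             # labels computed by the model

print(probabilities)
print(labels_from_threshold, labels_from_predict)
```

For a binary LogisticRegression, thresholding the class-1 probability at 0.5 gives the same labels as predict(). A different threshold can be chosen when false positives and false negatives have different costs.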

**Q20) What are continuous and categorical variables?**

Continuous vs. Categorical Variables
In data science and statistics, variables are classified based on the type of data they represent. The two main types are continuous and categorical variables.

1. Continuous Variables
A continuous variable can take an infinite number of values within a given range. These variables are measurable and often represent quantities.

Examples:
Height (e.g., 170.5 cm, 172.2 cm)
Weight (e.g., 65.3 kg, 72.8 kg)
Temperature (e.g., 36.6°C, 98.4°F)
Salary (e.g., $45,500.75, $60,100.20)

Key Characteristics:
✔ Can take decimal or fractional values
✔ Can be measured precisely
✔ Can be transformed (e.g., normalized, standardized)

2. Categorical Variables
A categorical variable represents a finite number of distinct groups or categories. These variables are not measurable but can be counted or labeled.

Types of Categorical Variables:

Nominal: Categories have no inherent order

Examples:
Gender (Male, Female, Other)
Blood Type (A, B, AB, O)
Eye Color (Blue, Green, Brown)

Ordinal: Categories have a meaningful order but no fixed numerical difference

Examples:
Education Level (High School, Bachelor's, Master's, PhD)
Customer Satisfaction (Low, Medium, High)
Economic Status (Low Income, Middle Income, High Income)

Key Characteristics:
✔ Represent distinct groups
✔ Cannot take fractional values
✔ Can be encoded as numbers (e.g., One-Hot Encoding, Label Encoding)

**Q21) What is feature scaling? How does it help in Machine Learning?**

Feature scaling ensures all numerical features have a similar range, improving model performance and training speed.

🔹 Why Use It?

Prevents large values from dominating smaller ones
Speeds up training (especially for Gradient Descent)
Improves accuracy in distance-based models (KNN, SVM, K-Means)

🔹 Common Methods:

Standardization (Z-score) – Rescales each feature to mean 0 and standard deviation 1; a common default for SVM and Linear Regression.

Min-Max Scaling – Scales between 0 and 1, best for Neural Networks, KNN.

Robust Scaling – Handles outliers, uses median & IQR.

🔹 Needed for: KNN, SVM, Neural Networks
🔹 Not needed for: Decision Trees, Random Forest

**Q22) How do we perform scaling in Python?**

Scaling in Python is done using scikit-learn's MinMaxScaler or StandardScaler:

MinMax Scaling (Normalization): Scales data to a fixed range (e.g., 0 to 1).

Standard Scaling (Z-score normalization): Centers data with mean 0 and standard deviation 1.
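A minimal sketch of both scalers applied to a small made-up array with two features on very different ranges (age and salary are hypothetical column meanings).

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Two features on very different scales (hypothetical age and salary)
X = np.array([[25, 3000], [35, 5000], [45, 7000], [55, 9000]], dtype=float)

# Min-Max scaling: (x - min) / (max - min), computed per column
minmax_scaled = MinMaxScaler().fit_transform(X)

# Standard scaling: (x - mean) / std, computed per column
standard_scaled = StandardScaler().fit_transform(X)

print(minmax_scaled)
print(standard_scaled)
```

In practice the scaler is usually fit on the training split only, and the test split is then transformed with the already fitted scaler via transform().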

**Q23) What is sklearn.preprocessing?**

sklearn.preprocessing is a module in scikit-learn that provides tools for scaling, normalizing, encoding, and transforming data before training machine learning models.

Common functions:

Scaling: StandardScaler, MinMaxScaler
Normalization: Normalizer
Encoding: LabelEncoder, OneHotEncoder
Imputation: SimpleImputer (note: this class lives in sklearn.impute, not in sklearn.preprocessing)

**Q24) How do we split data for model fitting (training and testing) in Python?**

In Python, we split data into training and testing sets using train_test_split from scikit-learn:

In [15]:
import pandas as pd
from sklearn.model_selection import train_test_split

# Sample dataset
data = {
    'Age': [25, 30, 35, 40, 45, 50, 55, 60, 65, 70],
    'Salary': [3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 11000, 12000],
    'Purchased': [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]  # Target variable (Binary classification)
}

# Convert to DataFrame
df = pd.DataFrame(data)

# Features (X) and Target (y)
X = df[['Age', 'Salary']]  # Independent variables
y = df['Purchased']         # Dependent variable

# Split into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Display the shapes of the resulting datasets
print("Training set size:", X_train.shape, y_train.shape)
print("Testing set size:", X_test.shape, y_test.shape)

Training set size: (8, 2) (8,)
Testing set size: (2, 2) (2,)


**Q25) Explain data encoding?**

Data encoding is the process of converting categorical data into numerical format so that machine learning models can process it.

Types of Data Encoding:

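1. Label Encoding
Converts categorical labels into integers (0, 1, 2, ...).
scikit-learn's LabelEncoder assigns the integers in alphabetical order and is intended for target labels, so it does not preserve a meaningful category order (use Ordinal Encoding for that).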

In [16]:
from sklearn.preprocessing import LabelEncoder

data = ['Low', 'Medium', 'High']
encoder = LabelEncoder()
encoded_data = encoder.fit_transform(data)
print(encoded_data)  # [1 2 0]: classes are sorted alphabetically (High=0, Low=1, Medium=2)


[1 2 0]


2. One-Hot Encoding

Converts categorical values into binary (0s and 1s) columns.
Suitable for nominal data (no order).

In [17]:
from sklearn.preprocessing import OneHotEncoder
import pandas as pd

df = pd.DataFrame({'Color': ['Red', 'Blue', 'Green']})
encoder = OneHotEncoder(sparse_output=False)
encoded_data = encoder.fit_transform(df)
print(encoded_data)

[[0. 0. 1.]
 [1. 0. 0.]
 [0. 1. 0.]]


3. Ordinal Encoding
Assigns numbers based on a predefined order (e.g., Low < Medium < High).


In [18]:
from sklearn.preprocessing import OrdinalEncoder

data = [['Low'], ['Medium'], ['High']]
encoder = OrdinalEncoder(categories=[['Low', 'Medium', 'High']])
encoded_data = encoder.fit_transform(data)
print(encoded_data)  # Output: [[0.], [1.], [2.]]

[[0.]
 [1.]
 [2.]]


4. Frequency Encoding
Replaces categories with their occurrence count in the dataset.

In [19]:
df = pd.DataFrame({'City': ['Dubai', 'Abu Dhabi', 'Dubai', 'Sharjah', 'Dubai']})
freq_encoding = df['City'].value_counts().to_dict()
df['City_Encoded'] = df['City'].map(freq_encoding)
print(df)

        City  City_Encoded
0      Dubai             3
1  Abu Dhabi             1
2      Dubai             3
3    Sharjah             1
4      Dubai             3


5. Target Encoding (for categorical features)
Replaces each category with the mean of the target variable for that category. The means should be computed on the training data only to avoid data leakage.
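A minimal pandas sketch of target encoding, in the same style as the frequency encoding cell above. The DataFrame and the column name City_TargetEncoded are made up for illustration.

```python
import pandas as pd

# Hypothetical training data: a categorical feature and a binary target
df = pd.DataFrame({
    "City": ["Dubai", "Abu Dhabi", "Dubai", "Sharjah", "Dubai", "Abu Dhabi"],
    "Purchased": [1, 0, 1, 0, 0, 1],
})

# Mean of the target for each category
city_means = df.groupby("City")["Purchased"].mean()

# Replace each category with its target mean
df["City_TargetEncoded"] = df["City"].map(city_means)
print(df)
```

Here the means are computed on the same rows they are applied to, which is fine for a demonstration but leaks target information in a real model. A safer variant computes the means on training folds only and applies them to held-out rows.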