##### 1. What is the definition of a target function ? In the sense of a real-life example, express the target function. How is a target function's fitness assessed ?

**Ans:**
The target function is the (usually unknown) mapping from inputs to outputs that a machine learning model tries to approximate. It is closely related to the objective (or fitness) function used in optimization problems, which scores how good a candidate solution is. In both settings it represents the desired relationship or behavior that a machine learning model or an optimization algorithm aims to learn or optimize.

In the context of a real-life example, let's consider a scenario where you want to build a model to predict housing prices based on various features such as location, size, number of bedrooms, etc. The target function, in this case, would be a function that accurately estimates the price of a house given its features. The target function maps the input (features of a house) to the desired output (the predicted price of the house).

The target function's fitness or performance is assessed by evaluating how well it accomplishes its objective. The specific assessment metric used depends on the problem at hand and the type of target function. In the case of the housing price prediction example, some common metrics to assess the target function's fitness could be:

Mean Squared Error (MSE): This metric measures the average squared difference between the predicted prices and the actual prices of the houses in a dataset. Lower MSE values indicate better fitness.

Root Mean Squared Error (RMSE): RMSE is the square root of MSE and provides a more interpretable measure of the average prediction error. Lower RMSE values indicate better fitness.

Mean Absolute Error (MAE): MAE calculates the average absolute difference between the predicted prices and the actual prices. It provides a measure of the average prediction error, regardless of the direction. Lower MAE values indicate better fitness.

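R-squared (R2): R2 represents the proportion of the variance in the housing prices that can be explained by the model. On the training data of a linear model with an intercept it lies between 0 and 1, but on held-out data it can be negative when the model predicts worse than simply using the mean price. Higher values indicate better fitness.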

The fitness assessment typically involves comparing the model's predictions to the ground truth values using the chosen metric(s), ideally on data that was not used for training. The target function's fitness is determined by how well it optimizes the chosen metric(s), that is, minimizing error measures such as MSE, RMSE, and MAE or maximizing R2, achieving accurate predictions or optimal solutions based on the specific problem domain.
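
As a rough illustration of the four metrics above, the following sketch computes them in plain Python. The function name `regression_metrics` and the house prices (in thousands) are made up for the example and are not taken from any real dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n        # mean squared error
    rmse = math.sqrt(mse)                       # same units as the target
    mae = sum(abs(e) for e in errors) / n       # mean absolute error

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    r2 = 1 - ss_res / ss_tot

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical actual and predicted house prices (in thousands)
actual = [200, 250, 300, 350]
predicted = [210, 240, 310, 340]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

Every prediction here is off by exactly 10, so MSE is 100 while RMSE and MAE are both 10, which shows why RMSE and MAE are easier to read in the units of the price.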

##### 2. What are predictive models, and how do they work? What are descriptive types, and how do you use them? Examples of both types of models should be provided. Distinguish between these two forms of models ?

**Ans:**
Predictive models aim to make predictions or forecasts about future events or outcomes based on available data. These models use historical data and patterns to learn relationships and make predictions on new, unseen data. They are commonly used in various fields, including finance, healthcare, marketing, and weather forecasting. Predictive models employ algorithms and statistical techniques to analyze data and generate predictions.

For example, in finance, a predictive model can be built to forecast stock prices based on historical stock market data, company financials, news sentiment, and other relevant factors. The model would learn patterns from the historical data and use them to predict future stock prices. The model's accuracy and performance can be evaluated based on its ability to make accurate predictions on new data.

Descriptive Models:
Descriptive models, on the other hand, aim to describe and summarize the relationships and patterns in data without making explicit predictions. They are used to gain insights, understand underlying patterns, and describe the characteristics of a dataset. Descriptive models are often used in exploratory data analysis, data visualization, and uncovering patterns or trends in large datasets.

For example, in marketing, a descriptive model can be built to analyze customer behavior and segment customers based on their preferences and purchasing patterns. The model would identify different customer segments based on data such as demographics, purchasing history, and online activity. These segments can then be used to develop targeted marketing strategies or personalized recommendations.

The main distinction between predictive and descriptive models lies in their objectives. Predictive models focus on making predictions about future outcomes, while descriptive models focus on summarizing and understanding existing data patterns. Predictive models require labeled data for training and evaluation, while descriptive models often use unsupervised learning techniques or exploratory data analysis to uncover patterns.

##### 3. Describe the method of assessing a classification model's efficiency in detail. Describe the various measurement parameters ?

**Ans:**
Assessing the efficiency of a classification model involves evaluating its performance and determining how well it can accurately classify instances into different classes. Several measurement parameters are commonly used to assess the performance of classification models. Let's discuss them in detail:

Accuracy: Accuracy is one of the most straightforward and commonly used metrics. It measures the proportion of correctly classified instances out of the total number of instances. Accuracy is calculated as (TP + TN) / (TP + TN + FP + FN), where TP is true positives, TN is true negatives, FP is false positives, and FN is false negatives. However, accuracy alone may not be sufficient if the classes are imbalanced or if misclassifying certain instances is more critical than others.

Precision: Precision focuses on the accuracy of positive predictions made by the model. It measures the proportion of true positive predictions out of all positive predictions. Precision is calculated as TP / (TP + FP). Precision is useful when the cost of false positives is high, and it is important to minimize the number of false positives.

Recall (Sensitivity/True Positive Rate): Recall, also known as sensitivity or true positive rate, measures the proportion of actual positive instances that are correctly classified as positive by the model. It is calculated as TP / (TP + FN). Recall is important when the cost of false negatives is high, and it is crucial to identify as many positive instances as possible.

F1 Score: The F1 score combines precision and recall into a single metric. It provides a balanced measure of the model's performance by taking their harmonic mean. The F1 score is calculated as 2 * (Precision * Recall) / (Precision + Recall). It is useful when both precision and recall need to be considered together.

Specificity (True Negative Rate): Specificity measures the proportion of actual negative instances that are correctly classified as negative by the model. It is calculated as TN / (TN + FP). Specificity is important when the cost of false positives is high, and it is crucial to minimize the number of false positives.

Area Under the ROC Curve (AUC-ROC): The ROC curve plots the true positive rate (sensitivity) against the false positive rate (1-specificity) at various classification thresholds. The AUC-ROC represents the overall performance of the classification model across all possible thresholds. A higher AUC-ROC value indicates better discrimination and classification performance.

These measurement parameters provide different perspectives on the performance of a classification model. The choice of which metric(s) to use depends on the specific problem, the relative importance of false positives and false negatives, and the balance between precision and recall required for the task.
![MM.png](attachment:MM.png)
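
The threshold-based formulas above can be sketched directly from the four confusion-matrix counts. This is a minimal example with invented counts; the function name `classification_metrics` is hypothetical, and AUC-ROC is left out because it needs predicted scores rather than counts.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute common metrics from confusion-matrix counts."""

    def safe_div(num, den):
        # Return 0.0 instead of failing when a denominator is empty
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)            # sensitivity / true positive rate
    specificity = safe_div(tn, tn + fp)       # true negative rate
    f1 = safe_div(2 * precision * recall, precision + recall)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Invented counts for a binary classifier evaluated on 100 instances
scores = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```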

##### 4. Describe :
1. In the sense of machine learning models, what is underfitting? What is the most common reason for underfitting ?
2. What does it mean to overfit? When is it going to happen?
3. In the sense of model fitting, explain the bias-variance trade-off.

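**Ans:** Short notes on each point:

**(1) In the sense of machine learning models, what is underfitting? What is the most common reason for underfitting?**

Underfitting occurs when a model is unable to capture the relationship between the input and output variables accurately, producing a high error rate on both the training set and unseen data. The most common reason is a model that is too simple for the data, for example a straight line fitted to a curved relationship; too few informative features, too little training, or overly strong regularization can also cause it.

**(2) What does it mean to overfit? When is it going to happen?**

Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data. This means that the noise or random fluctuations in the training data are picked up and learned as concepts by the model. It typically occurs when the model is too complex relative to the amount of training data, when the training data is small or noisy, or when training continues for too long. The telltale sign is a low training error combined with a much higher error on unseen data.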
![F.png](attachment:F.png)

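**(3) In the sense of model fitting, explain the bias-variance trade-off**

Bias is the difference between the model's average prediction and the correct value. A high-bias model makes strong simplifying assumptions, for example fitting a straight line to data that is not linear, so it gives a large error on both training and test data (underfitting). Variance is how much the model's predictions change when it is trained on a different sample of the data; a high-variance model fits the training data very closely, noise included, and does poorly on test data (overfitting). The trade-off is that making a model more flexible lowers bias but raises variance, while making it simpler does the opposite, so the goal is the level of complexity that minimizes the total error on unseen data.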
![B.png](attachment:B.png)
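
To make underfitting and overfitting concrete, here is a small sketch in plain Python that compares three models on the same data. Everything in it is an assumption for illustration: the data follows y = 2x with an alternating +1/-1 perturbation standing in for noise, the test targets are noise-free, and the three models (a constant, a least-squares line, and a memorizer that returns the nearest training target) are simply chosen as too simple, about right, and too flexible.

```python
def mse(model, xs, ys):
    """Mean squared error of a model (a function of x) on paired data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Training data: y = 2x plus a deterministic +1/-1 perturbation ("noise")
train_x = [float(i) for i in range(10)]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]

# Test points fall between the training points; targets are the true 2x
test_x = [i + 0.25 for i in range(9)]
test_y = [2 * x for x in test_x]

mean_x = sum(train_x) / len(train_x)
mean_y = sum(train_y) / len(train_y)


def constant_model(x):
    """Too simple (high bias): always predicts the mean training target."""
    return mean_y


# Ordinary least-squares line through the training data
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y)) / sum(
    (x - mean_x) ** 2 for x in train_x
)
intercept = mean_y - slope * mean_x


def linear_model(x):
    """Matches the true structure of the data."""
    return intercept + slope * x


def memorizing_model(x):
    """Too flexible (high variance): returns the nearest training target, noise included."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


results = {
    name: (mse(model, train_x, train_y), mse(model, test_x, test_y))
    for name, model in [
        ("constant", constant_model),
        ("linear", linear_model),
        ("memorizing", memorizing_model),
    ]
}

for name, (train_err, test_err) in results.items():
    print(f"{name:<11} train MSE = {train_err:.3f}   test MSE = {test_err:.3f}")
```

The constant model has a large error on both sets, the memorizer has zero training error but a clearly larger test error than the line, and the line sits in between on training error while doing best on the test points.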

##### 5. Is it possible to boost the efficiency of a learning model? If so, please clarify how ?

**Ans:** Yes. Building a machine learning model is not enough to get good predictions; you also have to measure its accuracy and validate it to make sure the results are reliable. Validation shows where the model falls short, which in turn guides how to improve it. Some ways of boosting the efficiency of a learning model are listed below:
1. Add more data samples.
2. Look at the problem differently: Looking at the problem from a new perspective can add valuable information to your model and help you uncover hidden relationships between the variables. Asking different questions may lead to better results and, eventually, better accuracy.
3. Add context to the data: More context often leads to a better understanding of the problem and, eventually, better performance of the model. Imagine we are selling a car, a BMW. That alone doesn’t give us much information about the car. But if we add the color, model, and distance traveled, we start to have a better picture of the car and its possible value.
4. Fine-tune the hyperparameters: finding good values usually takes some trial and error, trying different settings until one performs well on validation data.
5. Train the model using cross-validation.
6. Experiment with different algorithms.
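
The trial-and-error search over hyperparameters mentioned in the list can be sketched as follows. This is only an illustration under assumed conditions: the model is a tiny k-nearest-neighbors regressor written for the example, the data is an invented quadratic with a small alternating perturbation, and the candidate values of `k` are arbitrary.

```python
def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def validation_mse(train, valid, k):
    """Mean squared error on the validation set for a given k."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in valid) / len(valid)


# Invented 1-D data: y = x^2 with a small alternating perturbation
data = [(i / 2, (i / 2) ** 2 + (0.5 if i % 2 == 0 else -0.5)) for i in range(20)]

# Hold out every third point for validation; train on the rest
valid = data[1::3]
train = [point for i, point in enumerate(data) if i % 3 != 1]

# Try each candidate value and keep the one with the lowest validation error
candidates = [1, 3, 5]
scores = {k: validation_mse(train, valid, k) for k in candidates}
best_k = min(scores, key=scores.get)

print("validation MSE per k:", scores)
print("selected k:", best_k)
```

Selecting on a validation set rather than on the training set is the important design choice: on the training set, k = 1 would always look perfect because each point is its own nearest neighbor.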

##### 6. How would you rate an unsupervised learning model's success? What are the most common success indicators for an unsupervised learning model ?

**Ans:** In supervised learning, success is mostly rated by computing performance metrics such as accuracy, precision, recall, AUC, etc. on the training and holdout sets. Unsupervised learning is different: since there are no ground-truth labels for the patterns, we cannot directly compute accuracy by comparing actual and predicted outputs. There are nevertheless many evaluation metrics for measuring the performance of unsupervised learning algorithms after training.

Some of them are:

- Clustering: Jaccard similarity index, Rand index, purity, silhouette measure, sum of squared errors, etc.
- Association rule mining: lift, confidence
- Time series analysis: root mean square error, mean absolute error, mean absolute percentage error, etc.
- Autoencoders: reconstruction error
- Natural language processing (like sentiment analysis and text clustering): comparing the correlation between words after converting them to numerical vectors
- Principal component analysis: reconstruction error, scree plot
- Generative adversarial networks: discriminator functions
- Recurrent neural networks and LSTM (in numerical series): root mean square error, mean absolute error, mean absolute percentage error, etc.
- Recurrent neural networks and LSTM (in semantic series): word-to-vector correlation
- Anomaly detection (like DBSCAN, OPTICS): cohesion, separation, sum of squared errors, etc.
- Expectation-maximization problems: log-likelihood
- Survival analysis (Cox model 1): simple hazard ratio, R-squared
- Survival analysis (Cox model 2): two-group hazard ratio and Brier score, log-rank test, Somers’ rank correlation, time-dependent ROC AUC, power validation, etc.

A few other examples of such measures are:
- Silhouette coefficient.
- Calinski-Harabasz index.
- Dunn index.
- Xie-Beni score.
- Hartigan index.

##### 7. Is it possible to use a classification model for numerical data or a regression model for categorical data with a classification model? Explain your answer ?

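**Ans:** Yes, with suitable pre-processing. Categorical data generally takes a limited number of possible values, and those values need not be numerical; they can be textual. All machine learning models are mathematical models that need numbers to work with, which is one of the primary reasons categorical data must be pre-processed (encoded) before it is fed to a model. Once encoded, categorical features can be used by a regression model, and a classification model can use numerical features directly. A numerical target can also be turned into a classification problem by binning it into ranges such as low, medium, and high. What changes between the two model types is the output: a classifier predicts a class label and a regressor predicts a number.

If a categorical target variable needs to be encoded for a classification predictive modeling problem, then the LabelEncoder class can be used.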
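
The sketch below shows the idea behind label encoding and behind binning a numerical value into classes, using only plain Python. It is a dependency-free stand-in, not scikit-learn's `LabelEncoder` itself, although it follows the same convention of numbering the sorted class labels. The helper names, colors, prices, and bin edges are all invented for the example.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer code, in sorted order."""
    classes = sorted(set(values))
    return {category: code for code, category in enumerate(classes)}


# Categorical -> numerical: encode text labels as integers
colors = ["red", "green", "blue", "green"]
mapping = fit_label_encoder(colors)
encoded = [mapping[c] for c in colors]
print(mapping)   # codes follow alphabetical order of the labels
print(encoded)


def price_band(price, edges=(100, 300)):
    """Numerical -> categorical: 0 = low, 1 = medium, 2 = high (hypothetical edges)."""
    return sum(price >= edge for edge in edges)


# A numerical target binned into classes so a classifier could predict it
bands = [price_band(p) for p in [80, 150, 450]]
print(bands)
```

Integer codes imply an order, which is harmless for a target label but can mislead a model when used for unordered input features; one-hot encoding is the usual alternative in that case.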

##### 8. Describe the predictive modeling method for numerical values. What distinguishes it from categorical predictive modeling ?

**Ans:** Predictive modeling is a statistical technique that uses machine learning and data mining to predict and forecast likely future outcomes with the aid of historical and existing data. It works by analyzing current and historical data and projecting what it learns onto a model built to forecast likely outcomes. When the value being predicted is numerical, this is a regression problem: the model outputs a continuous number, such as a price.

Classification is the process of identifying the category or class label to which a new observation belongs. Prediction, in this sense, is the process of estimating a missing or unavailable numerical value for a new observation. That is the key difference between the two: categorical predictive modeling outputs a class label, while numerical predictive modeling outputs a number.

##### 9. Make quick notes on:
1. The process of holding out
2. Cross-validation by tenfold
3. Adjusting the parameters

**Ans:** Quick notes on each topic:
- **The process of holding out:**
The hold-out method splits the data into separate parts, using one part to train the model and the remaining parts to validate and test it. The hold-out method is used for both model evaluation and model selection.


- **Cross-validation by tenfold:**
10-fold cross-validation divides the training data at random into ten equal-sized folds and performs the fitting procedure ten times. Each fit trains on nine folds (90% of the data) and uses the remaining fold (10%) as a hold-out set for validation, so every observation is used for validation exactly once. The ten validation scores are then averaged.


- **Adjusting the parameters:**
Essentially another name for training: the selection of parameter values that are optimal in some desired sense (e.g. they minimize an objective function you choose over a dataset you choose). In a neural network, the parameters are the weights and biases.
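
The hold-out split and the tenfold procedure can be sketched as index bookkeeping in plain Python. The function names, the 80/20 split, the fixed random seed, and the sample count of 50 are assumptions made for the example.

```python
import random


def holdout_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and split it into (train, test) parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) once for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


samples = list(range(50))

train_part, test_part = holdout_split(samples)
print(len(train_part), "training samples,", len(test_part), "held out")

folds = list(k_fold_splits(len(samples), k=10))
for fold_number, (train_idx, valid_idx) in enumerate(folds, start=1):
    print(f"fold {fold_number}: train on {len(train_idx)}, validate on {len(valid_idx)}")
```

In a full workflow, a model would be fitted on `train_idx` and scored on `valid_idx` inside the loop, and the ten scores averaged.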

##### 10. Define the following terms: 
1. Purity vs. Silhouette width
2. Boosting vs. Bagging
3. The eager learner vs. the lazy learner

**Ans:** Short notes on each pair:

- **Purity vs. Silhouette width:**
    - Purity is a measure of the extent to which clusters contain a single class. To calculate it, count for each cluster the number of data points from the most common class in that cluster, sum these counts over all clusters, and divide by the total number of data points. It requires known class labels.
    - The silhouette width measures how similar a point is to its own cluster compared with the nearest other cluster, and needs no class labels. Its value lies between -1 and 1, with a value close to 1 indicating a point that is well matched to its cluster; the average over all points summarizes the quality of the clustering.


- **Boosting vs. Bagging:**
    - Bagging is a way to decrease the variance of the prediction by generating additional training sets from the original dataset through sampling with replacement, training a model on each, and combining their predictions.
    - Boosting is an iterative technique that adjusts the weight of each observation based on the previous classification, so that later models focus on the observations earlier models got wrong.


- **The eager learner vs. the lazy learner:**
    - A lazy learner delays abstracting from the data until it is asked to make a prediction; k-nearest neighbors is a typical example.
    - An eager learner abstracts away from the data during training and uses this abstraction to make predictions, rather than directly comparing queries with instances in the dataset; decision trees are a typical example.
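
The purity calculation described above (count the most common class in each cluster, then divide the total by the number of points) can be sketched in a few lines. The function name and the toy cluster assignments are invented for the example.

```python
from collections import Counter


def purity(cluster_ids, class_labels):
    """Fraction of points that belong to the majority class of their cluster."""
    members = {}
    for cluster, label in zip(cluster_ids, class_labels):
        members.setdefault(cluster, []).append(label)

    # For each cluster, count the points from its most common class
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(class_labels)


# Two clusters of three points each, with one point of the "wrong" class in each
cluster_ids = [0, 0, 0, 1, 1, 1]
class_labels = ["a", "a", "b", "b", "b", "a"]
mixed_score = purity(cluster_ids, class_labels)

# A clustering that matches the classes exactly
perfect_score = purity([0, 0, 1, 1], ["a", "a", "b", "b"])

print(f"mixed clusters:   {mixed_score:.3f}")
print(f"perfect clusters: {perfect_score:.3f}")
```

Purity rises toward 1 as the number of clusters grows (one point per cluster is trivially pure), so it is best compared between clusterings with the same number of clusters.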