# Missing Values
**Missing values**, in the context of machine learning, are entries that are absent or unavailable for certain observations or variables in a dataset. They can arise for a variety of reasons, such as data collection errors, measurement failures, or information that was simply never recorded.

Missing values can degrade the performance of machine learning models, and many algorithms cannot handle them directly. It is therefore important to address them before training a model. Two common approaches are imputation, where missing values are replaced with estimates based on other values in the dataset, and deletion, where observations or variables with missing values are removed if they are few relative to the size of the dataset.
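
Before choosing a strategy, it usually helps to measure how much data is missing. The following is a minimal sketch using pandas. The `DataFrame`, its column names, and its values are hypothetical and invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25, np.nan, 47, 31],
    "income": [50000, 62000, np.nan, np.nan],
    "city": ["Lima", None, "Quito", "Bogota"],
})

# Number of missing entries in each column.
missing_counts = df.isna().sum()

# Fraction of missing entries in each column (between 0 and 1).
missing_fraction = df.isna().mean()

print(missing_counts)
print(missing_fraction)
```

Columns with a small fraction of missing entries are typical candidates for imputation or row deletion, while a column that is mostly empty may be a candidate for removal.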

### Most common strategies for dealing with missing values
1. **Deletion of observations or variables:** This strategy involves removing rows or columns that contain missing values. This may be appropriate if the number of observations or variables with missing values is small compared to the total size of the dataset, and if the deletion does not introduce significant bias in the remaining data.
2. **Imputation:** Imputation involves estimating missing values based on the available information in the dataset. Some common imputation techniques include:
    - **Mean or median:** Replace missing values with the mean or median of the corresponding variable.
    - **Most frequent value:** Replace missing values with the most frequent value (mode) of the variable.
    - **Imputation with predictive models:** Use predictive models (such as regression, KNN, or decision trees) to predict missing values based on other variables in the dataset.
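3. **Missing value indicators:** Instead of (or in addition to) imputing, the fact that a value is missing can be passed to the model as information. One option is to add a binary indicator feature that flags whether the original value was missing. Another is to encode missing entries with a sentinel value outside the variable's valid range (such as -1 for a non-negative variable). Some algorithms can also accept NaN directly and learn how to handle it during training.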
4. **Advanced techniques:** More advanced techniques include multiple imputation, where several plausible values are generated for each missing entry to account for the uncertainty of the estimate, and methods specific to time series data.
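
The first three strategies in the list above can be sketched with pandas and scikit-learn. This is an illustrative example under stated assumptions, not a recommended pipeline: the toy data, the column names, and the choice of `n_neighbors=2` are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer, SimpleImputer

# Hypothetical numeric toy dataset; names and values are illustrative only.
df = pd.DataFrame({
    "age": [20.0, np.nan, 40.0, 30.0],
    "income": [30.0, 50.0, np.nan, 40.0],
})

# 1. Deletion: drop every row that contains at least one missing value.
dropped = df.dropna(axis=0)

# 2a. Mean imputation: replace each missing entry with its column mean.
mean_imputer = SimpleImputer(strategy="mean")
mean_imputed = pd.DataFrame(mean_imputer.fit_transform(df), columns=df.columns)

# 2b. Model-based imputation: estimate each missing entry from the
#     nearest rows, measured on the features that are observed.
knn_imputer = KNNImputer(n_neighbors=2)
knn_imputed = pd.DataFrame(knn_imputer.fit_transform(df), columns=df.columns)

# 3. Missing value indicators: impute with the median and append one
#    binary column per feature that had missing values during fit.
flag_imputer = SimpleImputer(strategy="median", add_indicator=True)
with_flags = flag_imputer.fit_transform(df)

print(dropped)
print(mean_imputed)
print(knn_imputed)
print(with_flags)
```

In a real project the imputers would normally be fitted on the training split only and then applied to validation and test data, so that statistics from held-out data do not leak into training.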

<div class="alert alert-block">
<b>💡:</b> The choice of missing value handling strategy depends on the specific context of the problem, the nature of the data, and the potential impact on the analysis or machine learning model. It is important to carefully evaluate the different options and their implications before making a decision.
</div>