# Data Preprocessing for Financial Data

This notebook implements the steps for preprocessing financial data as outlined in the `data_preprocessing.py` script. The process includes:

1. Importing necessary libraries.
2. Loading data from a CSV file.
3. Processing columns and initializing a dictionary for outliers.
4. Detecting outliers using the IQR method.

## 1. Importing Necessary Libraries

```python
import pandas as pd
import numpy as np
```

## 2. Loading Data

The script loads data from a CSV file `data_stock.csv` located in the `data/raw_data` directory.

```python
data = pd.read_csv("data/raw_data/data_stock.csv")
```
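The loading step can be tried out without the real file. The sketch below is illustrative only: `load_prices` is a hypothetical helper, the in-memory sample is made up, and it assumes the first column holds dates, as the rest of this notebook implies.

```python
import io
import pandas as pd

def load_prices(source):
    """Read a price CSV, parsing the first column as dates (assumed layout)."""
    return pd.read_csv(source, parse_dates=[0])

# Made-up sample standing in for data/raw_data/data_stock.csv
sample_csv = io.StringIO(
    "Date,AAA,BBB\n"
    "2024-01-02,10.0,20.0\n"
    "2024-01-03,10.5,\n"
)
sample = load_prices(sample_csv)

# Count missing values per column before computing any statistics
missing_per_column = sample.isna().sum()
```

Checking `missing_per_column` early is useful because missing values affect the quartile calculation in the outlier step.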

## 3. Processing Columns

Here we retrieve the column names, skipping the first column (assumed to be the date column), and initialize a dictionary to store the outliers found in each column.

```python
column_names = data.columns[1:]  # Start from the second column
outliers_data = {}
```
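Slicing with `data.columns[1:]` relies on the date being the first column and on every other column being numeric. As a hedged alternative, the sketch below compares positional slicing with selection by dtype on a made-up frame; the column names and the `Note` column are hypothetical.

```python
import pandas as pd

# Made-up frame: a date column, two price columns, and one text column
toy = pd.DataFrame({
    "Date": ["2024-01-02", "2024-01-03"],
    "AAA": [10.0, 10.5],
    "BBB": [20.0, 21.0],
    "Note": ["ok", "ok"],
})

# Positional slicing keeps everything after the first column
positional = list(toy.columns[1:])

# Selecting by dtype keeps only the numeric columns
numeric_only = list(toy.select_dtypes(include="number").columns)
```

If the raw file only contains a date column followed by price columns, both approaches give the same result.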

## 4. Detecting Outliers

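The script uses the interquartile range (IQR) method to identify outliers in each column. With IQR = Q3 - Q1, a value is conventionally flagged as an outlier when it falls below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR.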

```python
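for column_name in column_names:
    ticker_data = data[column_name].dropna()  # NaNs would make the percentiles NaN
    # Quartiles and interquartile range for this column
    q1, q3 = np.percentile(ticker_data, [25, 75])
    iqr = q3 - q1
    # Conventional 1.5 * IQR fences (assumed; the script's multiplier is not shown here)
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers_data[column_name] = ticker_data[(ticker_data < lower) | (ticker_data > upper)]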
```
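To make the IQR rule concrete, here is a minimal worked example on a made-up series. The helper name `iqr_outliers` and the multiplier `k=1.5` are assumptions for illustration, not necessarily what `data_preprocessing.py` uses.

```python
import pandas as pd

def iqr_outliers(series, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = series.dropna()
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return values[(values < lower) | (values > upper)]

# Made-up prices with one obvious spike
toy_prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 100.0])
flagged = iqr_outliers(toy_prices)
```

For this series, Q1 = 11.25 and Q3 = 13.75, so IQR = 2.5 and the fences sit at 7.5 and 17.5. Only the value 100.0 (at index 5) falls outside them.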

## Conclusion

After completing these steps, you will have preprocessed the financial data and identified the outliers in each column. The outliers are written to a file called `outliers.csv`.
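The export step is not shown in this notebook, so the following is only a sketch of one way the outlier dictionary could be flattened before saving. The helper `outliers_to_frame` and the long-format layout (`column`, `row`, `value`) are assumptions, and the input is a made-up stand-in for `outliers_data`.

```python
import pandas as pd

def outliers_to_frame(outliers_data):
    """Stack per-column outlier Series into one long table (hypothetical layout)."""
    rows = []
    for column_name, series in outliers_data.items():
        for idx, value in series.items():
            rows.append({"column": column_name, "row": idx, "value": value})
    return pd.DataFrame(rows, columns=["column", "row", "value"])

# Made-up input standing in for the dictionary built in the detection loop
example = {
    "AAA": pd.Series([100.0], index=[5]),
    "BBB": pd.Series([], dtype=float),  # a column with no outliers
}
outliers_frame = outliers_to_frame(example)
```

Calling `outliers_frame.to_csv("outliers.csv", index=False)` would then write the table to disk. A long format keeps columns with different numbers of outliers in a single rectangular file.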