# **Preparing Indices**
*Notebook created Dec. 17, 2025*

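This notebook documents the processing of intraseasonal oscillation (ISO) index files in preparation for evaluating the relationship between the SWM and the Madden-Julian Oscillation (MJO) and Boreal Summer Intraseasonal Oscillation (BSISO). All processing is performed locally.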

## **Sources**
| Source | MJO | BSISO | Period | Temporal Resolution | File Extension |
| :--- | :---: | :---: | :---: | :---: | :---: |
| [Bureau of Meteorology](https://www.bom.gov.au/climate/mjo/#tabs=Monitoring) | <span style="color:green">**Yes**</span> | No | 1974 – Present (Realtime) | Daily | .txt |
| [APEC Climate Center](https://apcc21.org/prediction/bsiso/moni?lang=en) | No | <span style="color:green">**Yes**</span> | 1981 – Present (Realtime) | Daily | .dat |
| [Bimodal ISO Index (Kikuchi)](http://iprc.soest.hawaii.edu/users/kazuyosh/Bimodal_ISO.html) | <span style="color:green">**Yes**</span> | <span style="color:green">**Yes**</span> | 1979 – 2020 | Daily | .txt |

## **Pre-Processing**
- **Objective:** Convert the raw .dat and .txt index files into .csv files.
1. Convert each index file to .csv via Excel.
2. Save the converted .csv files in the `02_processed` folder.
- Files:
  - `BSISO_APEC.csv`
  - `BSISO_Kikuchi.csv`
  - `MJO_BoM.csv`
  - `MJO_Kikuchi.csv`
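
The conversion in this notebook is done by hand in Excel. As a scripted alternative, a minimal sketch is shown here, assuming the raw files are plain whitespace-delimited text. The function name, the `header` and `skip_lines` parameters, and the sample rows are all hypothetical; the real .dat and .txt files may have preamble lines or a different column layout.

```python
import csv
import tempfile
from pathlib import Path


def whitespace_table_to_csv(infile, outfile, header=None, skip_lines=0):
    """Convert a whitespace-delimited text file to .csv (hypothetical helper).

    header:     optional list of column names to write as the first row
    skip_lines: number of leading lines to ignore (assumed preamble)
    Returns the number of data rows written.
    """
    with open(infile, "r", encoding="utf-8") as src:
        lines = src.read().splitlines()[skip_lines:]

    # Split each non-empty line on runs of spaces or tabs
    rows = [line.split() for line in lines if line.strip()]

    with open(outfile, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


# Usage with made-up sample rows (not real index values)
tmp_dir = Path(tempfile.mkdtemp())
raw_file = tmp_dir / "sample_index.txt"
raw_file.write_text("1979 5 1 1.23 4\n\n1979 5 2 0.87 5\n", encoding="utf-8")

csv_file = tmp_dir / "sample_index.csv"
n_rows = whitespace_table_to_csv(
    raw_file, csv_file, header=["year", "month", "day", "nrm", "phase"]
)
converted = csv_file.read_text(encoding="utf-8").splitlines()
```

A script like this would make the conversion repeatable when the realtime sources are updated, but the Excel output should be checked against it before switching.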

## **Processing**
- **Objective #1:** Trim data and standardize headers.
   - MJO: `year`, `month`, `day`, `nrm`, `phase`
   - BSISO: `year`, `month`, `day`, `nrm`, `phase`
- **Objective #2:** Keep only days where the normalized amplitude (`amplitude` or `nrm`) is greater than 1; days with amplitude ≤ 1 are filtered out.
   - Track the number of days that pass the filter, per phase.
1. Save filtered .csv files in the `03_final` folder.
2. Save figures in the `06_figures` folder.

### ***Objective 1***
Trim data and standardize headers (via Excel).
- MJO: `year`, `month`, `day`, `nrm`, `phase`
- BSISO: `year`, `month`, `day`, `nrm`, `phase`
- Period: May to October, 1979 to 2020
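
This step is done in Excel here. For reference, the same trimming could be sketched in pandas as below. `HEADER_MAP`, `standardize`, and the demo table are hypothetical; only the `Amp(nrm)` header is taken from this notebook's code, and the raw column names in the other files may differ.

```python
import pandas as pd

# Target headers from Objective 1
STANDARD_COLS = ["year", "month", "day", "nrm", "phase"]

# Hypothetical mapping from raw headers to standard ones; extend per file
HEADER_MAP = {"Amp(nrm)": "nrm"}


def standardize(df, header_map=HEADER_MAP):
    """Rename headers, keep the standard columns, and trim to May-Oct 1979-2020."""
    out = df.rename(columns=header_map)
    # Normalize remaining headers: strip whitespace and lowercase
    out.columns = [str(c).strip().lower() for c in out.columns]

    missing = [c for c in STANDARD_COLS if c not in out.columns]
    if missing:
        raise KeyError(f"Missing columns after renaming: {missing}")

    out = out[STANDARD_COLS]
    # between() is inclusive on both ends by default
    in_period = out["year"].between(1979, 2020) & out["month"].between(5, 10)
    return out[in_period].reset_index(drop=True)


# Made-up rows for illustration only
demo = pd.DataFrame({
    "Year": [1978, 1979, 1979, 2021],
    "Month": [6, 4, 5, 7],
    "Day": [1, 30, 1, 1],
    "Amp(nrm)": [1.2, 0.8, 1.5, 2.0],
    "Phase": [1, 2, 3, 4],
    "extra": [0, 0, 0, 0],
})
clean = standardize(demo)
```

Raising on missing columns is a deliberate choice in this sketch, so that a file with unexpected headers fails loudly instead of being silently trimmed.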

<span style="color:red">Troublesome files: `BSISO_APEC.csv`</span>

### ***Objective 2***
Keep only days where the normalized amplitude (`amplitude` or `nrm`) is greater than 1; days with amplitude ≤ 1 are filtered out.
- Track the number of days that pass the filter, per phase.
- Perform EDA on filtered data.

Test files: `BSISO_Kikuchi.csv`, `MJO_Kikuchi.csv`

#### **Reminders**
- Don't forget labels and legends.
- Color coding:
  - BSISO: `RdBu`, `coolwarm_r`
  - MJO: `RdBu_r`, `coolwarm`
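
A minimal plotting sketch that applies these reminders, assuming matplotlib is used for the figures. The `CMAPS` dictionary, the `plot_phase_counts` helper, and the per-phase counts are hypothetical and for illustration only; they are not results.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Colormap per index, taken from the first option listed in the reminders
CMAPS = {"BSISO": "RdBu", "MJO": "RdBu_r"}


def plot_phase_counts(counts, index_name):
    """Bar chart of days per phase; counts is a dict of phase -> number of days."""
    phases = sorted(counts)
    cmap = plt.get_cmap(CMAPS[index_name])
    # Spread the phases evenly across the colormap range [0, 1]
    colors = [cmap(i / max(len(phases) - 1, 1)) for i in range(len(phases))]

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(
        [str(p) for p in phases],
        [counts[p] for p in phases],
        color=colors,
        label=f"{index_name} days with nrm > 1",
    )
    # Labels and legend, per the reminder above
    ax.set_xlabel("Phase")
    ax.set_ylabel("Number of days")
    ax.set_title(f"{index_name}: days per phase")
    ax.legend()
    return fig, ax


# Made-up counts, not real data
example_counts = {1: 120, 2: 95, 3: 80, 4: 110, 5: 130, 6: 90, 7: 70, 8: 105}
fig, ax = plot_phase_counts(example_counts, "BSISO")
```

To keep a figure, `fig.savefig(...)` could point at the `06_figures` folder.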

In [1]:
# Filtering BSISO_Kikuchi.csv and MJO_Kikuchi.csv (Test)
import pandas as pd
from pathlib import Path

PROJECT_ROOT = Path(r"C:\Users\Nitro 5\Documents\MS\Thesis\GitHub\MS_Thesis_SWM")
DATA_DIR = PROJECT_ROOT / "01_data" / "02_processed"

files = [
    "BSISO_Kikuchi.csv",
    "MJO_Kikuchi.csv"
]

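def filter_file(infile, outfile):
    """Keep May-October days in 1979-2020 with normalized amplitude > 1."""
    df = pd.read_csv(infile)

    # Date filtering: May to October, 1979 to 2020
    df = df[
        (df["year"] >= 1979)
        & (df["year"] <= 2020)
        & (df["month"].isin([5, 6, 7, 8, 9, 10]))
    ]

    # Amplitude filtering: keep days with normalized amplitude > 1.
    # The column is named "Amp(nrm)" in these test files; if the headers
    # have been standardized (Objective 1), this would need to be "nrm".
    df = df[df["Amp(nrm)"] > 1]

    # Output
    df.to_csv(outfile, index=False)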

for fname in files:
    infile = DATA_DIR / fname
    outfile = DATA_DIR / fname.replace("Kikuchi.csv", "K_filtered.csv")

    filter_file(infile, outfile)
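
The per-phase tracking from Objective 2 is not covered by the cell above. One possible sketch is shown here, assuming standardized `nrm` and `phase` columns; the function name and the demo rows are hypothetical. It counts both the days kept (amplitude > 1) and the days dropped in each phase.

```python
import pandas as pd


def count_days_per_phase(df, amp_col="nrm", phase_col="phase", threshold=1.0):
    """Return a table of kept and dropped day counts per phase.

    A day is kept when its normalized amplitude is strictly greater
    than the threshold; all other days count as dropped.
    """
    active = df[amp_col] > threshold
    summary = pd.DataFrame({
        "kept": df.loc[active, phase_col].value_counts(),
        "dropped": df.loc[~active, phase_col].value_counts(),
    })
    # Phases absent from one column come out as NaN; treat them as zero
    return summary.fillna(0).astype(int).sort_index()


# Made-up rows for illustration only
demo = pd.DataFrame({
    "phase": [1, 1, 2, 3, 3, 3],
    "nrm": [1.4, 0.6, 2.1, 0.9, 1.0, 1.7],
})
counts = count_days_per_phase(demo)
```

For the unstandardized test files, `amp_col="Amp(nrm)"` would be passed instead, and the function should be run on the date-trimmed data before the amplitude filter is applied, since the filtered files no longer contain the dropped days.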