# METAR Data Cleaning

### Step 0: Importing **Libraries**

We import pandas, `os`, and our own METAR cleaning module, `metar_cleaning`.

In [1]:
import pandas as pd
import os
from metar_cleaning import metar_cleaning

### Step 1: Importing **METAR Data**

Let's import the METAR data that we pulled from the AVWX API. Each CSV file in the directory is read into its own DataFrame.

Example file name: ```METAR_9_2024.csv```
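If the month and year encoded in a file name are needed later (for example, to check that no month is missing), a small parser can extract them. This is a minimal sketch that assumes every file follows the `METAR_<month>_<year>.csv` pattern suggested by the single example name above; `parse_metar_filename` is a hypothetical helper and not part of `metar_cleaning`.

```python
import re

# Assumed pattern: METAR_<month>_<year>.csv, inferred from the one example file name.
FILENAME_PATTERN = re.compile(r"^METAR_(\d{1,2})_(\d{4})\.csv$")


def parse_metar_filename(filename):
    """Return (month, year) for a matching file name, or None if it does not match."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None
    month, year = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        return None  # Reject impossible month numbers
    return month, year


print(parse_metar_filename("METAR_9_2024.csv"))  # (9, 2024)
print(parse_metar_filename("notes.csv"))         # None
```

Returning `None` instead of raising keeps the helper easy to use as a filter when the directory also holds unrelated files.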

In [None]:
# Define the directory containing the CSV files
directory = "path/to/csv/directory"

# Dictionary mapping a cleaned-up file name to its DataFrame
dataframes = {}

# Loop through all CSV files in the directory (sorted for a reproducible load order)
for filename in sorted(os.listdir(directory)):
    if filename.endswith(".csv"):  # Skip anything that is not a CSV file
        file_path = os.path.join(directory, filename)

        # Build the dictionary key: drop the extension, then replace spaces and dashes
        df_name = os.path.splitext(filename)[0]
        df_name = df_name.replace(" ", "_").replace("-", "_")

        # Read the CSV file into a DataFrame
        dataframes[df_name] = pd.read_csv(file_path)

        print(f"Loaded {filename} into DataFrame: {df_name}")

# Display the names of the loaded DataFrames
print("\nLoaded DataFrames:", list(dataframes.keys()))

We concatenate all the METAR Data into one DataFrame:

In [5]:
# Concatenate all METAR DataFrames row-wise; ignore_index=True renumbers the rows from 0
metar_data_df = pd.concat(dataframes, axis=0, ignore_index=True)

len(metar_data_df)

2472515
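Because `ignore_index=True` discards the dictionary keys, the combined DataFrame no longer records which file each row came from. If that provenance matters, one option is to keep the keys as a column. The sketch below uses toy data with made-up column names (`station`, `raw`) and a hypothetical `source_file` column; the real METAR columns may differ.

```python
import pandas as pd

# Toy stand-ins for the per-file DataFrames; column names are illustrative only.
example_frames = {
    "METAR_8_2024": pd.DataFrame({"station": ["KJFK", "KLAX"], "raw": ["a", "b"]}),
    "METAR_9_2024": pd.DataFrame({"station": ["KJFK"], "raw": ["c"]}),
}

# Passing a dict makes its keys the outer index level; names= labels both levels.
combined = (
    pd.concat(example_frames, names=["source_file", "row"])
    .reset_index(level="source_file")  # Move the file name into a regular column
    .reset_index(drop=True)            # Replace the leftover per-file row index
)

# Sanity check: no rows were lost or duplicated by the concatenation.
assert len(combined) == sum(len(df) for df in example_frames.values())

rows_per_file = combined["source_file"].value_counts().to_dict()
print(rows_per_file)
```

Comparing `rows_per_file` against the expected number of reports per month is a quick way to spot a truncated or duplicated download.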

### Step 2: **Cleaning** METAR Data

We apply the `metar_cleaning` function to the METAR data. Each progress bar in the output corresponds to one pass over the rows. Run the cell, sit back, and relax.

In [3]:
metar_cleaned_df = metar_cleaning(metar_data_df)

Cleaning METAR data...: 100%|██████████| 71342/71342 [00:00<00:00, 839646.99it/s]
Cleaning METAR data...: 100%|██████████| 71342/71342 [00:00<00:00, 94244.46it/s] 
Cleaning METAR data...: 100%|██████████| 71342/71342 [00:08<00:00, 8701.40it/s] 
Cleaning METAR data...: 100%|██████████| 71340/71340 [00:00<00:00, 838980.76it/s]
Cleaning METAR data...: 100%|██████████| 71340/71340 [00:00<00:00, 926531.97it/s]
Cleaning METAR data...: 100%|██████████| 71340/71340 [00:01<00:00, 40469.66it/s]
Cleaning METAR data...: 100%|██████████| 71340/71340 [00:00<00:00, 371558.06it/s]
Cleaning METAR data...: 100%|██████████| 71340/71340 [00:01<00:00, 53615.78it/s]
Cleaning METAR data...: 100%|██████████| 71340/71340 [00:00<00:00, 83827.37it/s]

Cleaning completed
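The progress-bar totals above change between passes (71342, then 71340), which hints that some rows may be dropped along the way. A quick before-and-after comparison makes such changes explicit. The sketch below is an assumption-laden illustration: `summarize_cleaning` is a hypothetical helper, and the toy `dropna` and `assign` calls only stand in for whatever `metar_cleaning` actually does.

```python
import pandas as pd


def summarize_cleaning(before, after):
    """Report row and column differences between two DataFrames."""
    return {
        "rows_before": len(before),
        "rows_after": len(after),
        "rows_dropped": len(before) - len(after),
        "columns_added": sorted(set(after.columns) - set(before.columns)),
        "columns_removed": sorted(set(before.columns) - set(after.columns)),
    }


# Toy data; the real cleaning step is metar_cleaning, not the calls below.
before = pd.DataFrame(
    {"raw": ["x", "y", None], "station": ["KJFK", "KLAX", "KSFO"]}
)
after = before.dropna(subset=["raw"]).assign(wind_speed=[10, 5])

summary = summarize_cleaning(before, after)
print(summary)
```

In the notebook, the same helper could be called as `summarize_cleaning(metar_data_df, metar_cleaned_df)`.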

Finally, we save the cleaned METAR data as a new `.csv` file. Passing `index=False` keeps the row index out of the file.

In [None]:
metar_cleaned_df.to_csv('path/to/csv/directory/metar_data_cleaned.csv', index=False)
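After writing a large file, it can be worth reading it back once to confirm that the shape and columns survived the round trip. The following is a minimal sketch on toy data with made-up columns, written to a temporary directory so it does not touch the real output path.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for metar_cleaned_df; column names are illustrative only.
cleaned = pd.DataFrame({"station": ["KJFK", "KLAX"], "temperature": [21.0, 18.5]})

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "metar_data_cleaned.csv")
    cleaned.to_csv(out_path, index=False)  # Same call pattern as in the cell above
    reloaded = pd.read_csv(out_path)       # Read back before the directory is removed

same_shape = reloaded.shape == cleaned.shape
same_columns = list(reloaded.columns) == list(cleaned.columns)
print(same_shape, same_columns)
```

Note that CSV does not store dtypes, so columns such as timestamps come back as plain strings unless they are parsed again on load.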