# Data Assimilation with GPS Data

This project demonstrates the Ensemble Kalman Filter (EnKF) on real GPS data recorded by two phones tracking the same object (me driving my car around a parking lot). It is meant as a hands-on walk through the obstacles of real-world data assimilation. The first is handling asynchronous, irregularly spaced observations that do not line up with the time steps of the state model, along with obtaining a reasonable initial observation error covariance matrix $\mathbf{R}$. The second is determining the accuracy of the model; in my case, I was able to build an approximate "true" solution by mapping out my route with an online tool.

<img src="../img/routes.png" height="300" width="300"/>
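
To make the role of $\mathbf{R}$ concrete, here is a minimal sketch of one stochastic (perturbed-observation) EnKF analysis step. It is not necessarily the implementation used later in this project. The function name `enkf_analysis`, the four-component state `[x, y, vx, vy]`, the linear observation operator `H`, and the 5 m observation standard deviation behind `R` are all assumptions chosen for illustration.

```python
import numpy as np

def enkf_analysis(ensemble, y, H, R, rng):
    """One stochastic EnKF analysis step (illustrative sketch).

    ensemble : (n_state, n_members) forecast ensemble
    y        : (n_obs,) observation vector
    H        : (n_obs, n_state) linear observation operator
    R        : (n_obs, n_obs) observation error covariance
    rng      : numpy.random.Generator used to perturb the observations
    """
    n_members = ensemble.shape[1]

    # Ensemble anomalies in state space and in observation space
    A = ensemble - ensemble.mean(axis=1, keepdims=True)
    HA = H @ A

    # Sample covariances: P H^T and the innovation covariance H P H^T + R
    PHt = A @ HA.T / (n_members - 1)
    S = HA @ HA.T / (n_members - 1) + R

    # Kalman gain K = P H^T S^{-1}, computed with a solve (S is symmetric)
    K = np.linalg.solve(S, PHt.T).T

    # Each member assimilates its own noisy copy of the observation
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=n_members).T
    perturbed_obs = y[:, None] + noise

    return ensemble + K @ (perturbed_obs - H @ ensemble)

# Hypothetical setup: state is [x, y, vx, vy] in metres and metres per second
rng = np.random.default_rng(0)
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])   # only position is observed
R = np.eye(2) * 5.0**2                  # assumed 5 m GPS standard deviation
forecast = rng.normal(loc=[[0.0], [0.0], [1.0], [0.0]], scale=10.0, size=(4, 50))
obs = np.array([30.0, -20.0])           # made-up GPS fix

analysis = enkf_analysis(forecast, obs, H, R, rng)
```

A smaller `R` pulls the analysis ensemble harder toward the observation, which is why a sensible initial estimate of it matters.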

## Plotting the data

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

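def plot_gps_data(file_path, data_format):
    """Scatter-plot the GPS fixes stored in a CSV file.

    data_format selects the column names: 'madds_format' expects
    'lat'/'long', 'marios_format' expects 'latitude'/'longitude'.
    """
    # Load data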
    data = pd.read_csv(file_path)
    
    # Check the data format and extract coordinates
    if data_format == 'madds_format':
        # Assuming 'lat' and 'long' columns for Maddison's phone data
        latitudes = data['lat']
        longitudes = data['long']
    elif data_format == 'marios_format':
        # Assuming 'latitude' and 'longitude' columns for Mario's phone data
        latitudes = data['latitude']
        longitudes = data['longitude']
    else:
        raise ValueError("Data format not recognized")
    
    # Create a scatter plot
    plt.figure(figsize=(10, 6))
    plt.scatter(longitudes, latitudes, alpha=0.7, marker='o', s=20, label=f'Data from {data_format}')
    plt.title(f'GPS Data Visualization for {data_format}')
    plt.xlabel('Longitude')
    plt.ylabel('Latitude')
    plt.grid(True)
    plt.legend()
    plt.show()

# Paths to the datasets
madds_file_path = '/mnt/data/True_Run_maddisons_phone.csv'
marios_file_path = '/mnt/data/true_run_marios_phone.csv'

# Plotting each dataset
plot_gps_data(madds_file_path, 'madds_format')
plot_gps_data(marios_file_path, 'marios_format')
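
Since both phones recorded the same drive, it can also help to draw them on a single set of axes. The following is a minimal sketch under that assumption. The helper `plot_gps_overlay` and the `COLUMNS` lookup are hypothetical names introduced here, reusing the column names from `plot_gps_data` above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Column names for the two formats handled by plot_gps_data above
COLUMNS = {
    'madds_format': ('lat', 'long'),
    'marios_format': ('latitude', 'longitude'),
}

def plot_gps_overlay(datasets):
    """Overlay several GPS tracks on one set of axes.

    datasets : list of (DataFrame, data_format) pairs
    Returns the matplotlib Axes so the caller can show or save the figure.
    """
    fig, ax = plt.subplots(figsize=(10, 6))
    for data, data_format in datasets:
        if data_format not in COLUMNS:
            raise ValueError("Data format not recognized")
        lat_col, lon_col = COLUMNS[data_format]
        ax.scatter(data[lon_col], data[lat_col], alpha=0.7, s=20,
                   label=f'Data from {data_format}')
    ax.set_title('GPS Data from Both Phones')
    ax.set_xlabel('Longitude')
    ax.set_ylabel('Latitude')
    ax.grid(True)
    ax.legend()
    return ax

# Example usage with the paths defined above:
# ax = plot_gps_overlay([
#     (pd.read_csv(madds_file_path), 'madds_format'),
#     (pd.read_csv(marios_file_path), 'marios_format'),
# ])
# plt.show()
```

Plotting both tracks together makes any systematic offset between the two phones visible at a glance.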


## Interpolating the data with splines