variant 1
This script assumes that both CSV files have columns named lon1, lat1, lon2 and lat2, which hold the start and end coordinates of each drift displacement in decimal degrees. The haversine function calculates the great-circle length of each displacement in kilometers. The script then calculates the average displacement in each file, as one example of how you might compare the two sets of displacements. A different comparison may suit your specific requirements better.

In [1]:

import pandas as pd
import numpy as np



In [2]:
def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians 
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = np.sin(dlat/2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2)**2
    c = 2 * np.arcsin(np.sqrt(a)) 
    r = 6371  # Mean radius of the earth in kilometers (about 3959 miles)
    return c * r

In [None]:
# read the csv files
df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')

# calculate displacement (km) for each set of coordinates in df1 and df2;
# haversine is built from NumPy functions, so it accepts whole columns
# and no row-wise apply is needed
df1['displacement'] = haversine(df1['lon1'], df1['lat1'], df1['lon2'], df1['lat2'])
df2['displacement'] = haversine(df2['lon1'], df2['lat1'], df2['lon2'], df2['lat2'])

# compare the displacements
# for example, compute the average displacement in each file
average_displacement_df1 = df1['displacement'].mean()
average_displacement_df2 = df2['displacement'].mean()

print('Average displacement in file 1:', average_displacement_df1)
print('Average displacement in file 2:', average_displacement_df2)

variant 2

This script defines a function calculate_distance that calculates the Euclidean distance, in projected meters, between two points in a specific coordinate system (EPSG:3857, Web Mercator). The function uses GDAL's osr module to transform the input coordinates (assumed to be in EPSG:4326) to this coordinate system. Note that Web Mercator stretches distances by roughly 1/cos(latitude), so the results match ground distances only near the equator.
The script then reads the CSV files, calculates the displacement for each set of coordinates in both dataframes, and computes and prints the average displacement in each dataframe.

In [None]:
import pandas as pd
from osgeo import osr

def calculate_distance(lon1, lat1, lon2, lat2):
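    """
    Calculate the Euclidean distance (in projected meters) between two
    points after transforming them from EPSG:4326 to EPSG:3857.
    """
    source = osr.SpatialReference()
    source.ImportFromEPSG(4326)  # Coordinate system of the points
    # GDAL 3 and later read EPSG:4326 as (lat, lon) by default; request (lon, lat).
    # This call does not exist in GDAL 2.x, where the line should be removed.
    source.SetAxisMappingStrategy(osr.OAMS_TRADITIONAL_GIS_ORDER)

    target = osr.SpatialReference()
    target.ImportFromEPSG(3857)  # Web Mercator; projected distances are inflated away from the equator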

    transform = osr.CoordinateTransformation(source, target)

    point1 = transform.TransformPoint(lon1, lat1)  # returns a tuple with x, y, z coordinates
    point2 = transform.TransformPoint(lon2, lat2)

    distance = ((point2[0]-point1[0])**2 + (point2[1]-point1[1])**2)**0.5  # Euclidean distance

    return distance

# read the csv files
df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')

# calculate displacement for each set of coordinates in df1
df1['displacement'] = df1.apply(lambda row: calculate_distance(row['lon1'], row['lat1'], row['lon2'], row['lat2']), axis=1)

# calculate displacement for each set of coordinates in df2
df2['displacement'] = df2.apply(lambda row: calculate_distance(row['lon1'], row['lat1'], row['lon2'], row['lat2']), axis=1)

# compare the displacements
# for example, compute the average displacement in each file
average_displacement_df1 = df1['displacement'].mean()
average_displacement_df2 = df2['displacement'].mean()

print('Average displacement in file 1:', average_displacement_df1)
print('Average displacement in file 2:', average_displacement_df2)
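
To get a feel for how far EPSG:3857 distances can drift from great-circle distances, the following sketch compares the two for a one-degree step in longitude at several latitudes. It does not use GDAL: it assumes the standard spherical Web Mercator formulas with a sphere radius of 6378.137 km, and the function names are hypothetical helpers written for this comparison only.

```python
import numpy as np

R_SPHERE_KM = 6371.0      # mean Earth radius used for the great-circle distance
R_MERCATOR_KM = 6378.137  # sphere radius used by EPSG:3857

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * R_SPHERE_KM * np.arcsin(np.sqrt(a))

def web_mercator_km(lon, lat):
    """Forward spherical Web Mercator projection, returning (x, y) in km."""
    lon, lat = np.radians(lon), np.radians(lat)
    x = R_MERCATOR_KM * lon
    y = R_MERCATOR_KM * np.log(np.tan(np.pi / 4 + lat / 2))
    return x, y

def mercator_distance_km(lon1, lat1, lon2, lat2):
    """Euclidean distance in km between the two projected points."""
    x1, y1 = web_mercator_km(lon1, lat1)
    x2, y2 = web_mercator_km(lon2, lat2)
    return np.hypot(x2 - x1, y2 - y1)

# One degree of longitude at the equator, at 60 N and at 80 N
for lat in (0.0, 60.0, 80.0):
    great_circle = haversine_km(0.0, lat, 1.0, lat)
    projected = mercator_distance_km(0.0, lat, 1.0, lat)
    print(f'lat {lat:4.0f}: haversine {great_circle:7.2f} km, '
          f'EPSG:3857 {projected:7.2f} km, ratio {projected / great_circle:.2f}')
```

For drift data at high latitudes, where the ratio grows quickly, the haversine result from variant 1 or a projection suited to the study area is likely the safer choice.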


variant 3
To calculate the displacement along the x (east-west) and y (north-south) axes separately, subtract the projected x and y coordinates of the start point from those of the end point. The script below modifies the previous one to do this:

In [None]:
import pandas as pd
from osgeo import osr

def calculate_displacement(lon1, lat1, lon2, lat2):
    """
    Calculate the displacement along x and y axes.
    """
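    source = osr.SpatialReference()
    source.ImportFromEPSG(4326)  # Coordinate system of the points
    # GDAL 3 and later read EPSG:4326 as (lat, lon) by default; request (lon, lat).
    # This call does not exist in GDAL 2.x, where the line should be removed.
    source.SetAxisMappingStrategy(osr.OAMS_TRADITIONAL_GIS_ORDER)

    target = osr.SpatialReference()
    target.ImportFromEPSG(3857)  # Web Mercator; projected displacements are inflated away from the equator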

    transform = osr.CoordinateTransformation(source, target)

    point1 = transform.TransformPoint(lon1, lat1)  # returns a tuple with x, y, z coordinates
    point2 = transform.TransformPoint(lon2, lat2)

    displacement_x = point2[0] - point1[0]  # Displacement along x-axis
    displacement_y = point2[1] - point1[1]  # Displacement along y-axis

    return displacement_x, displacement_y

# read the csv files
df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')

# calculate displacement for each set of coordinates in df1
df1[['displacement_x', 'displacement_y']] = df1.apply(lambda row: calculate_displacement(row['lon1'], row['lat1'], row['lon2'], row['lat2']), axis=1, result_type='expand')

# calculate displacement for each set of coordinates in df2
df2[['displacement_x', 'displacement_y']] = df2.apply(lambda row: calculate_displacement(row['lon1'], row['lat1'], row['lon2'], row['lat2']), axis=1, result_type='expand')

# compare the displacements
# for example, compute the average displacement in each file
average_displacement_x_df1 = df1['displacement_x'].mean()
average_displacement_y_df1 = df1['displacement_y'].mean()
average_displacement_x_df2 = df2['displacement_x'].mean()
average_displacement_y_df2 = df2['displacement_y'].mean()

print('Average x-displacement in file 1:', average_displacement_x_df1)
print('Average y-displacement in file 1:', average_displacement_y_df1)
print('Average x-displacement in file 2:', average_displacement_x_df2)
print('Average y-displacement in file 2:', average_displacement_y_df2)
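
Once the x and y components are available, you may also want the magnitude and direction of each displacement. The sketch below shows one way to derive them, assuming columns named displacement_x and displacement_y as in the script above; the DataFrame values are made up, and the names magnitude and direction_deg are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical displacement components in projected meters (invented values)
df = pd.DataFrame({
    'displacement_x': [3000.0, 0.0, -1000.0],
    'displacement_y': [4000.0, 2000.0, 0.0],
})

# Length of each displacement vector
df['magnitude'] = np.hypot(df['displacement_x'], df['displacement_y'])

# Direction measured clockwise from grid north (+y axis), in the range [0, 360)
df['direction_deg'] = np.degrees(
    np.arctan2(df['displacement_x'], df['displacement_y'])
) % 360

print(df)
```

Because the components come from EPSG:3857, the magnitudes carry the same latitude-dependent stretching as the components themselves.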
