Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: A Script to convert the shoreline positions along each transect to lat lon #206

Closed
2320sharon opened this issue Dec 6, 2023 · 9 comments
Assignees
Labels
enhancement New feature or request V2 for version 2 of coastseg

Comments

@2320sharon
Copy link
Collaborator

During our last meeting it was requested that a script be built that could convert the shoreline positions on each transect into latitude and longitude points. This would make it easier to create 2D shorelines from the shoreline positions on each of the transects.

Pseudo code

  1. Read a geojson file with the transect geometries and IDs
  2. Read the csv file containging the shoreline transect intersections
  3. For each transect
    1. Get the Lat, Lon for the start and end point from the geojson file for that transect ID
    2. Transform the shoreline point to lat lon
    3. save the shoreline position
  4. Save the new shoreline positions as lat lon along each transect to a new csv file (or whatever file format we decide on)
@2320sharon 2320sharon added enhancement New feature or request V2 for version 2 of coastseg labels Dec 6, 2023
@2320sharon 2320sharon self-assigned this Dec 6, 2023
@mlundine
Copy link

Found some code I had that does something similar. This is sort of scratch code but take a look for the general idea I have.

image

You would need a geojson of transects and a csv with all of the intersections (in UTM coordinates), somehow coded to assign the intersections to a particular transect.

def gb(x1, y1, x2, y2):
    angle = degrees(atan2(y2 - y1, x2 - x1))
    bearing = angle
    #bearing = (90 - angle) % 360
    return bearing

for transect_id in transect_ids:
    ##Get the angle of the transect
    transect_df = gpd.read_file(transect_path)
    transect_df = transect_df.reset_index()
    transect = transect_df.iloc[transect_id]
    transect_intersections = intersections['transect_id']
    first, last = transect['geometry'].boundary
    if switch_dir == False:
        angle = radians(gb(first.x, first.y, last.x, last.y))
    else:
        angle = radians(gb(last.x, last.y, first.x, first.y))
    ##Compute Projected X,Y positions
    northings = firstY + transect_intersections*np.sin(angle)
    eastings = firstX + transect_intersections*np.cos(angle)

The firstY and firstX are the northing and easting (in UTM) of the first intersection (temporally) along a particular transect.
Something to discuss is whether or not simple trigonometry is acceptable.

If this makes sense I can go ahead and apply the idea to the CoastSeg outputs.

@2320sharon
Copy link
Collaborator Author

Thanks for posting this. Yes give this a shot and lets see how it goes. If this works it means we can give this solution to the TCA team soon

@dbuscombe-usgs
Copy link
Member

Looks right .... ?

In geodesy, this is called the Direct Problem, e.g. https://notebook.community/OSGeo-live/CesiumWidget/GSOC/notebooks/Numerical%20Cartography/The%20Geodesic%20Problem

the method I was thinking about looks a bit like this and is good for very short distances under ~10 km - there are formulas that are more accurate for longer distances - the Vincenty, Haversine, Lambert, etc

from math import asin, atan2, cos, degrees, radians, sin

def get_point_at_distance(lat1, lon1, d, bearing, R=6371):
    """
    lat: initial latitude, in degrees
    lon: initial longitude, in degrees
    d: target distance from initial
    bearing: (true) heading in degrees
    R: optional radius of sphere, defaults to mean radius of earth

    Returns new lat/lon coordinate {d}km from initial, in degrees
    """
    lat1 = radians(lat1)
    lon1 = radians(lon1)
    a = radians(bearing)
    lat2 = asin(sin(lat1) * cos(d/R) + cos(lat1) * sin(d/R) * cos(a))
    lon2 = lon1 + atan2(
        sin(a) * sin(d/R) * cos(lat1),
        cos(d/R) - sin(lat1) * sin(lat2)
    )
    return (degrees(lat2), degrees(lon2),)

Example

lat = 52.20472
lon = 0.14056
distance = 15
bearing = 90
lat2, lon2 = get_point_at_distance(lat, lon, distance, bearing)

@dbuscombe-usgs
Copy link
Member

For us, the bearing is always zero

@dbuscombe-usgs
Copy link
Member

Scratch code is great, btw - fewer dependencies!!

@2320sharon
Copy link
Collaborator Author

2320sharon commented Mar 1, 2024

Here is the update from today. Credit for the this solution goes to @mlundine

# load modules
import os
import numpy as np
import pandas as pd
import geopandas as gpd
import pandas as pd
import datetime
import shapely
from math import degrees, atan2, radians

def gb(x1, y1, x2, y2):
    angle = degrees(atan2(y2 - y1, x2 - x1))
    bearing = angle
    #bearing = (90 - angle) % 360
    return bearing

def arr_to_LineString(coords):
    """
    Makes a line feature from a list of xy tuples
    inputs: coords
    outputs: line
    """
    points = [None]*len(coords)
    i=0
    for xy in coords:
        points[i] = shapely.geometry.Point(xy)
        i=i+1
    line = shapely.geometry.LineString(points)
    return line

def transect_timeseries_to_wgs84(transect_timeseries_merged_path,
                                 config_gdf_path):
    """
    Edits the transect_timeseries_merged.csv or tidally_corrected_timeseries_merged.csv
    so that there are additional columns with lat (shore_y) and lon (shore_x).
    inputs:
    transect_timeseries_merged_path (str): path to the transect_timeseries_merged.csv
    config_gdf_path (str): path to the config_gdf geojson
    """
    
    ##Load in data, make some new paths
    timeseries_data = pd.read_csv(transect_timeseries_merged_path)
    config_gdf = gpd.read_file(config_gdf_path)
    transects = config_gdf[config_gdf['type']=='transect']
    new_gdf_wgs84_path = os.path.join(os.path.dirname(config_gdf_path), 'intersections_gdf_wgs84.geojson')
    new_gdf_utm_path = os.path.join(os.path.dirname(config_gdf_path), 'intersections_gdf_utm.geojson')
    
    ##Gonna do this in UTM to keep the math simple...problems when we get to longer distances (10s of km)
    org_crs = transects.crs
    utm_crs = transects.estimate_utm_crs()
    transects_utm = transects.to_crs(utm_crs)

    ##need some placeholders
    shore_x_vals = [None]*len(timeseries_data)
    shore_y_vals = [None]*len(timeseries_data)
    timeseries_data['shore_x'] = shore_x_vals
    timeseries_data['shore_y'] = shore_y_vals

    dates = [None]*len(timeseries_data['dates'])
    points = [None]*len(timeseries_data['dates'])

    k=0
    ##first loop over all transects
    for i in range(len(transects_utm)):
        transect = transects_utm.iloc[i]
        transect_id = transect['id']
        first = transect.geometry.coords[0]
        last = transect.geometry.coords[1]
        idx = timeseries_data['transect_id'].str.contains(transect_id)
        ##in case there is a transect in the config_gdf that doesn't have any intersections
        if np.any(idx):
            timeseries_data_filter = timeseries_data[idx]
        else:
            continue
        idxes = timeseries_data_filter.index
        distances = timeseries_data_filter['cross_distance']
        j=0
        ##now loop over all cross-shore distances in a particular transect
        for distance in distances:
            idx = idxes[j]
            if distance<0:
                angle = radians(gb(first[0], first[1], last[0], last[1])+180)
            else:
                angle = radians(gb(first[0], first[1], last[0], last[1]))
            shore_x_utm = first[0]+distance*np.cos(angle)
            shore_y_utm = first[1]+distance*np.sin(angle)
            point_utm = shapely.Point(shore_x_utm, shore_y_utm)

            ##lazy conversion to WGS84
            point_gdf_utm = gpd.GeoDataFrame({'geometry':[point_utm]},
                                             crs=utm_crs)
            point_gdf_wgs84 = point_gdf_utm.to_crs(org_crs)
            shore_x_wgs84 = point_gdf_wgs84.geometry.x[0]
            shore_y_wgs84 = point_gdf_wgs84.geometry.y[0]
            
            date = timeseries_data['dates'].iloc[idx]
            dates[k] = date
            
            points[k] = (shore_x_wgs84, shore_y_wgs84)
            timeseries_data.loc[idx,'shore_x'] = shore_x_wgs84
            timeseries_data.loc[idx, 'shore_y'] = shore_y_wgs84
            j=j+1
            k=k+1


    ##I want these as geojson too, saving as UTM and WGS84
    new_df_wgs84 = pd.DataFrame({'dates':dates,
                                 'points':points}
                                )
    new_dates = np.unique(dates)
    new_lines = [None]*len(np.unique(dates))
    for i in range(len(new_lines)):
        date = new_dates[i]
        new_df_filter = new_df_wgs84[new_df_wgs84['dates']==date]
        new_line = arr_to_LineString(new_df_filter['points'])
        new_lines[i] = new_line
        new_dates[i] = date

    
    new_gdf_wgs84 = gpd.GeoDataFrame({'dates':new_dates,
                                      'geometry':new_lines},
                                     crs=org_crs)
    new_gdf_utm = new_gdf_wgs84.to_crs(utm_crs)
    new_gdf_utm.to_file(new_gdf_utm_path)
    new_gdf_wgs84.to_file(new_gdf_wgs84_path)
    timeseries_data.to_csv(transect_timeseries_merged_path)
    
# Code to test the script
# transect_timeseries_merged_path = r"C:\development\doodleverse\coastseg\CoastSeg\sessions\rsa3_model_extract\transect_time_series_merged.csv"
# config_gdf_path = r"C:\development\doodleverse\coastseg\CoastSeg\sessions\rsa3_model_extract\config_gdf.geojson"
# # tests on raw transects
# transect_timeseries_to_wgs84(transect_timeseries_merged_path, config_gdf_path)
# # test  tidally corrected
# transect_timeseries_merged_path = r"C:\development\doodleverse\coastseg\CoastSeg\sessions\rsa3_model_extract\transect_time_series_tidally_corrected.csv"
# transect_timeseries_to_wgs84(transect_timeseries_merged_path, config_gdf_path)

Tasks

  • Drop the new unnamed columns formed by this function
  • See if we can speed up the function

@2320sharon
Copy link
Collaborator Author

I made the following code to export the shore_x and shore_y columns along with their transect_id as points to a geojson file, just in case users want the shorelines as point as well. This covers issue #220

import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

def export_to_geojson(csv_file_path, output_file_path, x_col, y_col, id_col):
    """
    Export specified columns from a CSV file to a GeoJSON format, labeled by a unique identifier.
    
    Parameters:
    - csv_file_path: str, path to the input CSV file.
    - output_file_path: str, path for the output GeoJSON file.
    - x_col: str, column name for the x coordinates (longitude).
    - y_col: str, column name for the y coordinates (latitude).
    - id_col: str, column name for the unique identifier (transect id).
    
    Returns:
    - str, path for the created GeoJSON file.
    """
    # Load the data from the CSV file
    data = pd.read_csv(csv_file_path)
    
    # Convert to GeoDataFrame
    gdf = gpd.GeoDataFrame(
        data, 
        geometry=[Point(xy) for xy in zip(data[x_col], data[y_col])], 
        crs="EPSG:4326"
    )
    
    # Keep only necessary columns
    gdf = gdf[[id_col, 'geometry']].copy()
    
    # Export to GeoJSON
    gdf.to_file(output_file_path, driver='GeoJSON')
    
    # Return the path to the output file
    return output_file_path

Test Code

csv_file_path = '/path/to/your/input.csv'
output_file_path = '/path/to/your/output.geojson'
x_col = 'shore_x'
y_col = 'shore_y'
id_col = 'transect_id'

export_to_geojson(csv_file_path, output_file_path, x_col, y_col, id_col)

@mlundine
Copy link

mlundine commented Mar 4, 2024

# load modules
import os
import numpy as np
import pandas as pd
import geopandas as gpd
import pandas as pd
import datetime
import shapely

def transect_timeseries_to_wgs84(transect_timeseries_merged_path,
                                 config_gdf_path):
    """
    Edits the transect_timeseries_merged.csv or transect_timeseries_tidally_corrected.csv
    so that there are additional columns with lat (shore_y) and lon (shore_x).
    inputs:
    transect_timeseries_merged_path (str): path to the transect_timeseries_merged.csv
    config_gdf_path (str): path to the config_gdf.geojson
    """
    
    ##Load in data, make some new paths
    timeseries_data = pd.read_csv(transect_timeseries_merged_path)
    edit_path = os.path.splitext(transect_timeseries_merged_path)
    config_gdf = gpd.read_file(config_gdf_path)
    transects = config_gdf[config_gdf['type']=='transect']
    new_gdf_shorelines_wgs84_path = os.path.join(os.path.dirname(config_gdf_path), 'intersections_gdf_wgs84.geojson')
    new_gdf_shorelines_utm_path = os.path.join(os.path.dirname(config_gdf_path), 'intersections_gdf_utm.geojson')
    
    ##Gonna do this in UTM to keep the math simple...problems when we get to longer distances (10s of km)
    org_crs = transects.crs
    utm_crs = transects.estimate_utm_crs()
    transects_utm = transects.to_crs(utm_crs)

    ##need some placeholders
    shore_x_vals = [None]*len(timeseries_data)
    shore_y_vals = [None]*len(timeseries_data)
    timeseries_data['shore_x'] = shore_x_vals
    timeseries_data['shore_y'] = shore_y_vals

    ##make an empty gdf to hold points
    size = len(timeseries_data)
    transect_ids = [None]*size
    dates = [None]*size
    points = [None]*size
    points_gdf_utm = gpd.GeoDataFrame({'geometry':points,
                                      'dates':dates,
                                      'id':transect_ids},
                                      crs=utm_crs)

    ##loop over all transects
    for i in range(len(transects_utm)):
        transect = transects_utm.iloc[i]
        transect_id = transect['id']
        first = transect.geometry.coords[0]
        last = transect.geometry.coords[1]
        
        idx = timeseries_data['transect_id'].str.contains(transect_id)
        ##in case there is a transect in the config_gdf that doesn't have any intersections
        ##skip that transect
        if np.any(idx):
            timeseries_data_filter = timeseries_data[idx]
        else:
            continue

        idxes = timeseries_data_filter.index
        distances = timeseries_data_filter['cross_distance']

        angle = np.arctan2(last[1] - first[1], last[0] - first[0])

        shore_x_utm = first[0]+distances*np.cos(angle)
        shore_y_utm = first[1]+distances*np.sin(angle)
        points_utm = [shapely.Point(xy) for xy in zip(shore_x_utm, shore_y_utm)]

        #conversion from utm to wgs84, put them in the transect_timeseries csv and utm gdf
        dummy_gdf_utm = gpd.GeoDataFrame({'geometry':points_utm},
                                         crs=utm_crs)
        dummy_gdf_wgs84 = dummy_gdf_utm.to_crs(org_crs)

        points_wgs84 = [shapely.get_coordinates(p) for p in dummy_gdf_wgs84.geometry]
        points_wgs84 = np.array(points_wgs84)
        points_wgs84 = points_wgs84.reshape(len(points_wgs84),2)
        x_wgs84 = points_wgs84[:,0]
        y_wgs84 = points_wgs84[:,1]
        timeseries_data.loc[idxes,'shore_x'] = x_wgs84
        timeseries_data.loc[idxes,'shore_y'] = y_wgs84
        dates = timeseries_data['dates'].loc[idxes]
        points_gdf_utm.loc[idxes,'geometry'] = points_utm
        points_gdf_utm.loc[idxes,'dates'] = dates
        points_gdf_utm.loc[idxes,'id'] = [transect_id]*len(dates)
        
    ##get points as wgs84 gdf
    points_gdf_wgs84 = points_gdf_utm.to_crs(org_crs)
    
    ##Need to loop over unique dates to make shoreline gdf from points
    new_dates = np.unique([points_gdf_utm['dates']])
    new_lines = [None]*len(np.unique(new_dates))
    for i in range(len(new_lines)):
        date = new_dates[i]
        points_filter = points_gdf_wgs84[points_gdf_wgs84['dates']==date]

        new_line = shapely.LineString(points_filter['geometry'])
        new_lines[i] = new_line
        new_dates[i] = date
    
    new_gdf_shorelines_wgs84 = gpd.GeoDataFrame({'dates':new_dates,
                                                 'geometry':new_lines},
                                                crs=org_crs)

    ##convert to utm, save wgs84 and utm geojsons
    new_gdf_shorelines_utm = new_gdf_shorelines_wgs84.to_crs(utm_crs)
    new_gdf_shorelines_utm.to_file(new_gdf_shorelines_utm_path)
    new_gdf_shorelines_wgs84.to_file(new_gdf_shorelines_wgs84_path)

    ##dropping that extra index column and saving new csv
    timeseries_data.drop(timeseries_data.columns[[0]],axis=1,inplace=True)
    timeseries_data.to_csv(edit_path)

@mlundine
Copy link

mlundine commented Mar 4, 2024

Vectorized one of the loops so much faster now and drops the extra index column.

2320sharon added a commit that referenced this issue Mar 26, 2024
2320sharon added a commit that referenced this issue Mar 26, 2024
…files containing transect shoreline intersections
2320sharon added a commit that referenced this issue Mar 28, 2024
2320sharon added a commit that referenced this issue Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request V2 for version 2 of coastseg
Projects
None yet
Development

No branches or pull requests

3 participants