Validate no duplicate/empty station names in observation point files (+same for obs crossections) #483

Open
arthurvd opened this issue Mar 27, 2023 · 1 comment
Labels
domain: validation priority: high type: enhancement Improvements to existing functionality

Comments

@arthurvd
Member

Is your feature request related to a problem? Please describe.

Additional request: validate that there are no duplicate observation point names
Originally posted by @veenstrajelmer in #459 (comment)

Describe the solution you'd like
A (Pydantic) validator that verifies there are no duplicate names among all observation points that were read. This should work for a single XYNModel and a single ObservationPointModel.
Open Question (@veenstrajelmer): when reading multiple obs files, names must actually be unique across all of them. Agree?
If so, that would mean we (also) need some root validation in the FMModel.Output class.
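
The uniqueness check discussed here could be sketched in plain Python as follows. This is a minimal sketch, not hydrolib-core code: the function name `find_duplicate_names`, the dict-of-lists input and the file labels are hypothetical, and in practice the logic would presumably be called from a Pydantic validator on the model(s) concerned.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_names(names_per_file: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return each name that occurs more than once, mapped to the files it occurs in.

    `names_per_file` maps a (hypothetical) file label to the names read from it.
    Passing one entry checks a single file; passing several checks across files.
    """
    occurrences: Dict[str, List[str]] = defaultdict(list)
    for file_label, names in names_per_file.items():
        for name in names:
            occurrences[name].append(file_label)
    # Keep only the names that were seen more than once.
    return {name: files for name, files in occurrences.items() if len(files) > 1}


duplicates = find_duplicate_names(
    {
        "obs_a.xyn": ["Den Haag", "Den Helder"],
        "obs_b.xyn": ["Den Helder", "Hoek van Holland"],
    }
)
print(duplicates)
```

Returning the offending names together with the files they came from, rather than a plain boolean, would make it easier to build a useful validation error message.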

The same discussion applies to PolyFile + ObservationCrossSectionModel.

@arthurvd arthurvd added the type: enhancement Improvements to existing functionality label Mar 27, 2023
@arthurvd arthurvd added this to To do in HYDROLIB-core via automation Mar 27, 2023
@veenstrajelmer
Collaborator

veenstrajelmer commented Mar 27, 2023

Hi Arthur, indeed, names must be unique across all files, to avoid duplicate names in the output. The xy-coordinates are not relevant for this check. The check should also catch accidentally identical station names, such as two stations both ending up with the name Den in this example:

1.5 2.3 Den Haag
1.2 3.4 Den Helder
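
One way such an accidental clash could arise is when a reader splits each line on every whitespace and keeps only the third token as the name. The sketch below is purely illustrative: the helper `parse_xyn_line` is hypothetical and does not claim to reflect how hydrolib-core or D-Flow FM parse these files. It contrasts that behaviour with limiting the split so the full name survives.

```python
from typing import Tuple


def parse_xyn_line(line: str, keep_full_name: bool = True) -> Tuple[float, float, str]:
    """Parse one 'x y name' line (hypothetical helper, for illustration only)."""
    if keep_full_name:
        # Split on the first two whitespace runs only, so spaces in the name are kept.
        x, y, name = line.split(maxsplit=2)
    else:
        # Naive variant: the name is cut off at its first space.
        x, y, name = line.split()[:3]
    return float(x), float(y), name.strip()


lines = ["1.5 2.3 Den Haag", "1.2 3.4 Den Helder"]
full_names = [parse_xyn_line(line)[2] for line in lines]
truncated_names = [parse_xyn_line(line, keep_full_name=False)[2] for line in lines]
print(full_names)
print(truncated_names)
```

With the naive variant both stations end up with the same name, which is the kind of clash a duplicate-name validator would need to report.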

What is also important for both polyfiles and obsfiles is that the names are not empty strings and, indeed, that they are not duplicated. I think they are also not allowed to start with a number. A small code example shows that hydrolib-core does not reject any of these faulty inputs:

import hydrolib.core.dflowfm as hcdfm

polyfile_obj = hcdfm.PolyFile()
for ipol in range(2):
    # The three assignments below are alternative faulty inputs; only the last one takes effect.
    name = 'duplicate_name' #TODO: not allowed to use identical polyline names in 1 file, but this is not caught by hydrolib-core
    name = '' #TODO: when providing name='' it will result in an invalid plifile, but this is not caught by hydrolib-core
    name = '1' #TODO: starting the name with a number is probably not allowed by FM, but this is not caught by hydrolib-core
    
    pointsobj_list = [{'x': -68.35, 'y': 12.2+ipol, 'data': []},
                      {'x': -68.35, 'y': 12.3+ipol, 'data': []},
                      {'x': -68.35, 'y': 12.4+ipol, 'data': []},
                      {'x': -68.35, 'y': 12.5+ipol, 'data': []}]
    
    # TODO: would be convenient if n_rows and n_columns were derived from the points automatically
    polyobject = hcdfm.PolyObject(metadata={'name':name,'n_rows':4,'n_columns':2}, points=pointsobj_list)
    polyfile_obj.objects.append(polyobject)
    
print('names of polylines in the polyfile')
print([x.metadata.name for x in polyfile_obj.objects])
polyfile_obj.save('test_pli.pli')

There might be more requirements for the polyline names, like:

  • not allowed to start with a numeric character
  • not allowed to contain a dash or other non-alphanumeric character. Names may contain spaces, though, so other such characters might be allowed as well. Still good to check.
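
The per-name rules listed above could be checked with a small helper like the one below. This is a minimal sketch under assumptions: the function `name_problems` is hypothetical, and the rule set it encodes (must start with a letter; only letters, digits, spaces and underscores allowed) is a guess that still needs to be confirmed against what D-Flow FM actually accepts.

```python
from typing import List


def name_problems(name: str) -> List[str]:
    """Return a list of problems found in one name (empty list means no problems).

    The rules are assumptions, not confirmed D-Flow FM requirements.
    """
    if not name.strip():
        return ["name is empty"]
    problems: List[str] = []
    if not name[0].isalpha():
        problems.append("name does not start with a letter")
    # Assumed allowed set: letters, digits, spaces and underscores.
    bad_chars = sorted({c for c in name if not (c.isalnum() or c in " _")})
    if bad_chars:
        problems.append(f"name contains disallowed characters: {bad_chars}")
    return problems


for candidate in ["duplicate_name", "", "1", "Den Haag", "obs-1"]:
    print(repr(candidate), name_problems(candidate))
```

Collecting all problems per name, instead of stopping at the first one, lets a validator report everything that is wrong with a file in a single pass.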

Furthermore, for PolyObject specifically, it would be great if n_rows and n_columns were derived from the provided points, but this could be a follow-up issue.
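
Deriving the dimensions could look roughly like the sketch below. It works on plain dicts shaped like the points in the example above; the helper `derive_dimensions` is hypothetical, and it assumes that n_columns counts x and y, an optional z, and every entry in data.

```python
from typing import Any, Dict, List, Tuple


def derive_dimensions(points: List[Dict[str, Any]]) -> Tuple[int, int]:
    """Return (n_rows, n_columns) for a list of point dicts.

    Assumes each point has 'x' and 'y', an optional 'z' and an optional 'data' list.
    """
    if not points:
        return 0, 0

    def width(point: Dict[str, Any]) -> int:
        has_z = point.get("z") is not None
        return 2 + (1 if has_z else 0) + len(point.get("data", []))

    widths = {width(point) for point in points}
    if len(widths) != 1:
        raise ValueError(f"points have inconsistent column counts: {sorted(widths)}")
    return len(points), widths.pop()


points = [
    {"x": -68.35, "y": 12.2, "data": []},
    {"x": -68.35, "y": 12.3, "data": []},
    {"x": -68.35, "y": 12.4, "data": []},
    {"x": -68.35, "y": 12.5, "data": []},
]
n_rows, n_columns = derive_dimensions(points)
print(n_rows, n_columns)
```

Raising an error on inconsistent column counts, rather than silently picking one, keeps the derived metadata from masking a malformed point list.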

In dfm_tools there is now the function dfm_tools.hydrolib_helpers.validate_polyline_names(), which checks for duplicate names and for names starting with a non-alphabetic character. That function can be removed again once this issue is fixed.

@rhutten rhutten removed this from To do in HYDROLIB-core Apr 10, 2024
@rhutten rhutten added this to To do in HYDROLIB-core via automation Apr 18, 2024
@veenstrajelmer veenstrajelmer changed the title from "Validate no duplicate station names in observation point files (+same for obs crossections)" to "Validate no duplicate/empty station names in observation point files (+same for obs crossections)" Jul 9, 2024