unique index on id_source, entity_source_pk_value not wanted #25

Open
DonovanMaillard opened this issue May 9, 2022 · 2 comments

@DonovanMaillard

This unique index can cause problems, for example if we import SINP data from INPN into the instance: a single source would then contain data from several external sources, possibly with the same entity_source_pk_value.
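To make the collision concrete, here is a minimal sketch of what happens when two records from different providers share the same source and pk value. It uses an in-memory SQLite database as a stand-in for the real PostgreSQL one; the table name `synthese_demo`, the index name `uq_source_pk`, and the `origin` column are hypothetical and only serve the illustration.

```python
import sqlite3


def insert_rows(rows):
    """Insert rows into a demo table carrying the unique index.

    Returns the rows rejected by the index.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: only the two indexed columns mirror the issue.
    conn.execute(
        "CREATE TABLE synthese_demo ("
        "id_source INTEGER, entity_source_pk_value TEXT, origin TEXT)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX uq_source_pk "
        "ON synthese_demo (id_source, entity_source_pk_value)"
    )
    rejected = []
    for row in rows:
        try:
            conn.execute("INSERT INTO synthese_demo VALUES (?, ?, ?)", row)
        except sqlite3.IntegrityError:
            # Same (id_source, entity_source_pk_value) pair already present.
            rejected.append(row)
    conn.close()
    return rejected


# Two distinct observations, aggregated under the same source (id_source=1),
# that happen to carry the same pk value in their original databases.
rows = [(1, "42", "provider A"), (1, "42", "provider B")]
rejected = insert_rows(rows)
```

With these rows, the second observation is refused even though it is a different record from a different provider.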

@DonovanMaillard
Author

I think it will also cause problems on the ORB database, because gn2pg will work on data that has already been aggregated into thematic databases.

For example, invertebrate data published by the thematic pole will come from several sources, have non-unique pk values, and all be attached to the same source, "pole invertébrés", in the ORB database.
I think it would be more effective to rely on the uuid field.

@lpofredc
Member

This is a complex issue with conflicting expectations, as mentioned in question #31.

It could perhaps be solved by working on the triggering rules, preferably using the uuid as the data identifier or, if no uuid is available, the source/entity_source_pk_value pair.
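The fallback rule described above could be sketched as follows. This is only an illustration of the idea, not GN2PG's actual trigger logic: the function name `record_key` and the `uuid` parameter name are assumptions made for the example.

```python
from uuid import UUID


def record_key(id_source, entity_source_pk_value, uuid=None):
    """Return the identifier used to decide whether two records are the same.

    Prefer the uuid when one is available; otherwise fall back to the
    (id_source, entity_source_pk_value) pair.
    """
    if uuid:
        # Normalise the textual form so that case differences do not matter.
        return ("uuid", str(UUID(str(uuid))))
    return ("source_pk", id_source, str(entity_source_pk_value))


# Same source and pk value, but different uuids: treated as distinct records.
key_a = record_key(1, 42, "11111111-1111-1111-1111-111111111111")
key_b = record_key(1, 42, "22222222-2222-2222-2222-222222222222")

# No uuid available: the source/pk pair is the only identifier left.
key_c = record_key(1, 42)
key_d = record_key(1, "42")
```

With such a key, records aggregated under one source are only merged when they share a uuid, and the source/pk pair is used solely for data that carries no uuid.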

Does this raise the question of how the entity_source_pk_value field is used? In GN2PG's SQL examples, the entity_source_pk_value of the destination database is the id_synthese of the source database. So there are normally no duplicates, but without a UUID it is harder to trace the data back to its original source.
