Reimplement "preserve uploaded metadata" logic #11463

Open
1 of 5 tasks
etj opened this issue Sep 11, 2023 · 6 comments
Labels
4.1.x Epic master regression Issues related to regressions.

Comments

@etj
Contributor

etj commented Sep 11, 2023

Regressions

Improvements

  • Cannot save or preview metadata after setting preserve metadata #11468

  • Allow the metadata editing when uploaded metadata is preserved #11466
    The whole concept of "preserving uploaded metadata" is confusing, as shown by Errors in metadata upload #11448 and by the current behaviour: when "Metadata uploaded preserve" is selected, an error is issued when the "update" button is pressed (or when the update is triggered automatically, see Errors in metadata upload #11448).

    The proposal is to clearly state that the preservation only applies to the uploaded XML document; this means that:

    • metadata editing should still be possible, since the metadata editor contains some fields that cannot be set using values from the uploaded document
    • when saving data from the metadata form, the new values should only be stored in the DB, and the XML document should not be regenerated. This could cause a misalignment between the DB fields (which are used in the CSW filtering) and the returned XML, but this change is up to the editing user, who should be notified of the "preserve" flag when saving metadata. At the moment an error is always returned (and it is badly handled, see Errors in metadata upload #11448).
  • Implement metadata preservation at creation time #11464
    When the new importer was implemented, some configuration pages presented during the uploading stages were removed.
    In one of these pages the user was asked if the uploaded metadata (if any) should be preserved.
    Now, if the user wants the metadata to be preserved, they need to go through these steps (some of which are currently broken):

    1. upload the data and metadata (to have some fields automatically populated)
    2. go to the "edit metadata" page and tell GeoNode to preserve the metadata
    3. upload the metadata again

    The proposal is to:

    • change the client part so that it shows a "preserve metadata" checkbox if an XML file is included in the upload list
      e.g.
      image
    • change the backend in order to also handle the preserve metadata boolean.
  • Fix label for Metadata upload preserve #11465

    • "Metadata uploaded preserve" should be "Preserve uploaded metadata".
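As a rough illustration of the creation-time proposal above (offer the option only when an XML file is in the upload list, then pass the boolean to the backend), here is a minimal sketch. `has_metadata_file`, `UploadOptions` and `build_upload_options` are hypothetical names invented for this example and are not GeoNode APIs; only the flag names `metadata_uploaded` and `metadata_uploaded_preserve` appear elsewhere in this thread.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Iterable


def has_metadata_file(filenames: Iterable[str]) -> bool:
    """Return True if any file in the upload list has an .xml extension."""
    return any(PurePath(name).suffix.lower() == ".xml" for name in filenames)


@dataclass(frozen=True)
class UploadOptions:
    # Hypothetical container for per-upload options; not a GeoNode class.
    metadata_uploaded: bool
    metadata_uploaded_preserve: bool


def build_upload_options(filenames: Iterable[str], preserve_requested: bool) -> UploadOptions:
    """Derive the flags a backend could store for the newly created resource."""
    uploaded = has_metadata_file(list(filenames))
    # The preserve flag is only meaningful when a metadata file is present.
    return UploadOptions(
        metadata_uploaded=uploaded,
        metadata_uploaded_preserve=uploaded and preserve_requested,
    )
```

The same check could drive the client side: the checkbox would be shown only when `has_metadata_file` is true for the selected files.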
@etj etj added regression Issues related to regressions. master 4.1.x labels Sep 11, 2023
@gannebamm
Contributor

Does the proposal prevent users from changing metadata if the original metadata was uploaded and preserved?

IMHO, the whole feature only makes sense if the metadata document can be used completely. One example is #10342 as part of the contacts. However, there are other ISO fields which are not yet recognized and properly understood in the upload process. I think an upload metadata feature with preservation will confuse users if the GeoNode metadata diverges from the metadata used in the upload.

@etj
Contributor Author

etj commented Sep 11, 2023

Does the proposal prevent users from changing metadata if the original metadata was uploaded and preserved?

No:
image

Part of the proposal is to only warn users when they edit metadata while the uploaded metadata document is marked as preserved. We should allow saving the updated info, probably after asking for a confirmation.
The warning will tell the user "don't do it unless you know what you are doing": this should not be confusing at all. Furthermore, it will allow power users to perform the editing they really need.
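A minimal sketch of the save flow described here, under stated assumptions: `Resource`, `SaveResult`, `render_xml` and `save_metadata_form` are hypothetical stand-ins written for this example, not GeoNode code. The first save attempt on a preserved resource returns a warning instead of an error; a confirmed save updates the DB values and leaves the uploaded XML untouched.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, Optional

PRESERVE_WARNING = (
    "The uploaded metadata document is marked as preserved: "
    "changes are stored in the DB only and the XML is not regenerated."
)


@dataclass
class Resource:
    # Toy stand-in for a resource; only the two flag names come from the thread.
    metadata_xml: str
    metadata_uploaded: bool = False
    metadata_uploaded_preserve: bool = False
    db_fields: Dict[str, str] = field(default_factory=dict)


@dataclass
class SaveResult:
    saved: bool
    warning: Optional[str] = None


def render_xml(fields: Dict[str, str]) -> str:
    # Hypothetical placeholder for the template-based XML generation.
    root = ET.Element("metadata")
    for name, value in sorted(fields.items()):
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


def save_metadata_form(resource: Resource, new_values: Dict[str, str],
                       confirmed: bool = False) -> SaveResult:
    preserved = resource.metadata_uploaded and resource.metadata_uploaded_preserve
    if preserved and not confirmed:
        # Ask for confirmation instead of failing with an error.
        return SaveResult(saved=False, warning=PRESERVE_WARNING)
    resource.db_fields.update(new_values)
    if not preserved:
        # Only regenerate the XML when the uploaded document is not preserved.
        resource.metadata_xml = render_xml(resource.db_fields)
    return SaveResult(saved=True, warning=PRESERVE_WARNING if preserved else None)
```

A caller would show the warning, then repeat the call with `confirmed=True` once the user accepts it.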

@giohappy
Contributor

change the client part, so to show a checkbox "preserve metadata" if an xml file is included in the upload list

The way this should be implemented is with a cog icon as previously proposed for the implementation of more upload options.
The icon will open a modal window.
This way:

  • We have room for additional options (if needed)
  • the options can be set per resource

image

@giohappy giohappy added the Epic label Sep 11, 2023
@giohappy giohappy added this to the 4.2.0 milestone Sep 11, 2023
@giohappy giohappy removed this from the 4.2.0 milestone Oct 13, 2023
@gannebamm
Contributor

This feature is still confusing to me. As far as I understand, it possibly leads to two diverging 'truths' about the dataset's metadata.
If I remember correctly, the pycsw response will use the DB values, right? However, the stored XML will be used for some requests, and it can differ from the DB values?
If a user downloads the ISO metadata, it will send the uploaded preserved XML, but if a user requests Dublin Core (or, on our fork, DataCite), will it use DB values?

I do not understand the business/use case of this functionality. Therefore we do not use it.

@ridoo
Contributor

ridoo commented Mar 22, 2024

@gannebamm it seems that the feature bypasses the CSW XML generation:

    # generate an XML document (GeoNode's default is ISO)
    if instance.metadata_uploaded and instance.metadata_uploaded_preserve:
        md_doc = etree.tostring(dlxml.fromstring(instance.metadata_xml))
    else:
        md_doc = catalogue.catalogue.csw_gen_xml(instance, settings.CATALOG_METADATA_TEMPLATE)
    try:
        csw_anytext = catalogue.catalogue.csw_gen_anytext(md_doc)
    except Exception as e:
        LOGGER.exception(e)
        csw_anytext = ""
    resources.update(metadata_xml=md_doc, csw_wkt_geometry=instance.geographic_bounding_box, csw_anytext=csw_anytext)

But I may be wrong here
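To look at the branch quoted above in isolation, here is a small self-contained sketch. It is an approximation, not the GeoNode implementation: it swaps in the standard library's `xml.etree.ElementTree` for the `etree`/`dlxml` calls in the snippet (the stdlib parser is not hardened against malicious XML), and `build_md_doc` and `generate_from_db` are hypothetical names, the latter standing in for the template-based generation.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def build_md_doc(
    metadata_xml: str,
    metadata_uploaded: bool,
    metadata_uploaded_preserve: bool,
    generate_from_db: Callable[[], str],
) -> str:
    """Pick the XML document to store, mirroring the two branches of the snippet."""
    if metadata_uploaded and metadata_uploaded_preserve:
        # Parse and re-serialize the stored document: the content is kept,
        # and nothing is regenerated from the DB values.
        return ET.tostring(ET.fromstring(metadata_xml), encoding="unicode")
    # Otherwise the document is produced from the DB values.
    return generate_from_db()
```

With both flags set, the DB-based generator is never called, which matches the reading that the preserved document bypasses the XML generation.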

@giohappy
Contributor

If a user downloads the ISO metadata, it will send the uploaded preserved XML, but if a user requests Dublin core (or on our fork data-cite), will it use DB values?

@gannebamm correct, that's what happens. There are clients that prepare metadata with external editors; they just want to attach the external XML file to a GeoNode resource and get it back unaltered on download. Of course this will create a divergence because:

  • the XML parser doesn't map all the elements and attributes to the DB fields. Only a few of them are mapped (they're listed here)
  • the Dublin Core output is always generated from the DB and so, given the previous point, its contents will differ from the ISO output (which is the original XML file)

There isn't an easy solution, of course. Even if the parser were extended, there are field values that can hardly be mapped to the DB, in particular related fields that reference GeoNode users, etc.
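The divergence described in this comment can be shown with a toy example. Everything here is an assumption made for illustration: the XML is a simplified made-up document (not ISO 19139), `MAPPED_FIELDS` is an invented subset rather than the real list of mapped elements, and the "Dublin Core-like" output is just a dictionary built from the DB values.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Toy uploaded document; real ISO metadata is namespaced and far richer.
UPLOADED_XML = """<metadata>
  <title>Rivers</title>
  <abstract>River network</abstract>
  <lineage>Digitized from 1:50k maps</lineage>
</metadata>"""

# Hypothetical subset of elements that the parser maps to DB fields.
MAPPED_FIELDS = ("title", "abstract")


def parse_to_db(xml_text: str) -> Dict[str, str]:
    """Copy only the mapped elements into the DB representation."""
    root = ET.fromstring(xml_text)
    return {
        name: root.findtext(name)
        for name in MAPPED_FIELDS
        if root.findtext(name) is not None
    }


def unmapped_elements(xml_text: str) -> List[str]:
    """List the top-level elements that never reach the DB."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root if child.tag not in MAPPED_FIELDS]


def dublin_core_like(db: Dict[str, str]) -> Dict[str, str]:
    """Build an output from DB values only, as the Dublin Core output is."""
    return {"dc:title": db.get("title", ""), "dc:description": db.get("abstract", "")}
```

In this toy case the preserved document still carries `lineage`, while anything generated from the DB cannot, because that element was never mapped.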

4 participants