Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IQSS/9387 - Support protected system metadata #9388

Merged

Conversation

qqmyers
Copy link
Member

@qqmyers qqmyers commented Feb 14, 2023

What this PR does / why we need it: Provides a mechanism to restrict metadata edits in specified blocks to those who know the secret key for that block. API only in this PR. See issue for use cases.

Which issue(s) this PR closes:

Closes #9387

Special notes for your reviewer:
Implementation was discussed in tech hours. As part of the work I created some utility methods to determine which blocks have changes when an update is made. These should be more efficient that keeping track of all the individual fields that have changed. I'm not sure if these could be useful elsewhere, e.g. in the version table where indicating just which blocks have changes might be a cheaper option when there are many versions, etc.

Also -

  • this PR removes the unused AbstractDatasetCommand.registrationRequired flag. (I new harvested flag was needed to avoid checking harvested datasets for system metadata, which led me to check on the use of the registrationRequired flag. So removing it is cleanup and avoids having more variant constructors with two optional flags.)
  • this PR fixes some getOrCreateEditVersion logic in the semantic api calls. Prior to this PR, getOrCreateEditVersion was called before a check was made to see if a draft existed, which made that check always true. This is completely unrelated to the system metadata and could be a separate PR if needed. (It is local to those specific calls).

Suggestions on how to test this:
DANS has been testing this PR and @PaulBoon can probably provide more details. In essence, adding a key for a given metadata block should make that block not appear in the edit metadata display and cause all API calls for creating/updating a dataset/version where metadata in this block is changed should now require the extra key parameter for changes to succeed. (see the doc changes). All other blocks and functionality should be unaffected.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:
included

Additional documentation:
included

these methods were always using the update command (because they checked
for isDraft after always creating a draft version). However, since the
Create command assumes a new dsv not associated with the dataset yet,
they also shouldn't use it. This commit reorders the isDraft check where
needed and removes the logic to decide whether to use the Update or
Create commands (since Update should always be used). This should be a
big no-op since the old code always used Update 'accidentally').
needed for system Metadata check in CreateDatasetVersionCommand.
@qqmyers qqmyers added the GDCC: DANS related to GDCC work for DANS label Feb 14, 2023
@qqmyers qqmyers added this to the 5.14 milestone Feb 14, 2023
@coveralls
Copy link

coveralls commented Feb 14, 2023

Coverage Status

coverage: 20.388% (+0.02%) from 20.369% when pulling 4130dfd on GlobalDataverseCommunityConsortium:DANS/system_metadata into 13794a9 on IQSS:develop.

@PaulBoon
Copy link
Contributor

PaulBoon commented Feb 15, 2023

About setting up the testing, activating the feature, for example set '1234' on the citation metadata block.
I managed this in two ways, the first one is via a properties file.

  • Create META-INF/microprofile-config.properties file in the exploded war and to add the required setting, e.g.
    dataverse.metadata.block-system-metadata-keys={"citation":"1234"}.

The second and preferred one is via the asadmin command.

  • /usr/local/payara5/glassfish/bin/asadmin set-config-property --propertyName=dataverse.metadata.block-system-metadata-keys --propertyValue='{"citation":"1234"}' --source=domain --sourceName=domain1
    This is persisted in the domain.xml file as:
    <property name="payara.microprofile.dataverse.metadata.block-system-metadata-keys" value="{&quot;citation&quot;:&quot;1234&quot;}"></property>.
    @poikilotherm Not sure why it won’t persists if I use application and Dataverse as the source and sourceName, but I had to use the domain. Also we probably have to secure it like we do with the database password.

Removing is done by editing the properties file or with delete-config-property.

@poikilotherm
Copy link
Contributor

poikilotherm commented Feb 15, 2023

Hey @PaulBoon sorry to say this, but it looks like you're doing this in a much more complicated way than necessary.

  1. You should not edit values in META-INF/microprofile-config.properties from the WAR. This is solely meant to carry defaults added at build time.
  2. I very much believe you don't need to go for these type of properties. Simply use "system properties" (also called JVM options), e. g. using asadmin create-system-properties (when going for JVM options via asadmin don't forget the -D, which makes it a system property).
  3. Or, for testing / in containers: you can opt for setting an environment variable DATAVERSE_METADATA_BLOCK_SYSTEM_METADATA_KEYS
  4. Obviously, there are also other source. Find them at Payara Docs. Personally, I find the directory source pretty promising for testing, as this works by simply writing to a file named like the property. (Disclaimer: I added this to Payara 😉 )

HTH 😄

Comment on lines 43 to 45

# METADATA SETTINGS
#dataverse.metadata.block-system-metadata-keys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you add this here, IMHO we should only add things that we have a default for. (Yes I am aware that there is a similar thing for OAI above - yet it looks like people start to mistake this file as editable for configuration, which is not how it's meant to be used)

Comment on lines 231 to 233
String smdbString = JvmSettings.METADATA_BLOCK_SYSTEM_METADATA_KEYS.lookupOptional().orElse(null);
if (smdbString != null) {
JsonObject systemMetadataBlocks = JsonUtil.getJsonObject(smdbString);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a converter instead of fiddling with the String itself?

It's not very hard to do and makes code here much nicer to read and write. 😉 https://download.eclipse.org/microprofile/microprofile-config-3.0/microprofile-config-spec-3.0.html#_adding_custom_converters

JsonObject systemMetadataBlocks = JsonUtil.getJsonObject(smdbString);
HttpServletRequest httpServletRequest = getRequest().getHttpServletRequest();
if (httpServletRequest != null) {
String mdKey = httpServletRequest.getParameter(MetadataBlockServiceBean.SYSTEM_MD_KEY);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make sense to move the parameter extraction to DataverseRequest, adding a method there and avoid a bit more if/else stuff by returning an Optional<>?

if (httpServletRequest != null) {
String mdKey = httpServletRequest.getParameter(MetadataBlockServiceBean.SYSTEM_MD_KEY);
for (MetadataBlock mdb : changedMDBs) {
if (systemMetadataBlocks.containsKey(mdb.getName())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thought: it has advantages to use a JSON object here, yes. What would also work: create a JvmSetting with a placeholder at the end. Less complexity, no costly JSON parsing.

Asking for dataverse.metadata.block-system-metadata-keys.citation would return the key String if set for the citation block. You can access this variable setting via the same JvmSettings.lookupOptional(), but you need to pass the name of the block as the var arg.

@mreekie mreekie added this to Ready for Review ⏩ in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) via automation May 15, 2023
@mreekie mreekie moved this from Ready for Review ⏩ to ▶ SPRINT READY in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) May 15, 2023
@cmbz cmbz moved this from ▶ SPRINT READY to This Sprint 🏃‍♀️ 🏃 in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) Jun 21, 2023
@cmbz cmbz moved this from This Sprint 🏃‍♀️ 🏃 to Ready for Review ⏩ in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) Jun 22, 2023
@landreev landreev self-requested a review June 29, 2023 14:48
@landreev landreev self-assigned this Jun 29, 2023
@landreev landreev moved this from Ready for Review ⏩ to In Review 🔎 in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) Jun 29, 2023
Copy link
Contributor

@landreev landreev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this was already heavily reviewed, by @poikilotherm and @PaulBoon, with all the feedback having been addressed. So mine is a review of the reviews for the most part. Everything appears to make sense, so I'm marking it ready for QA to be included in v5.14.

IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from In Review 🔎 to Ready for QA ⏩ Jun 29, 2023
@landreev landreev removed their assignment Jun 29, 2023
@landreev landreev mentioned this pull request Jun 30, 2023
Conflicts:
	src/main/java/edu/harvard/iq/dataverse/SettingsWrapper.java
	src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AbstractCreateDatasetCommand.java
	src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateNewDatasetCommand.java
@kcondon
Copy link
Contributor

kcondon commented Jul 6, 2023

Issues found:
[X]1. Need to restart payara for system metadata block setting to be enforced using example (just an observation, not sure whether all mp sources need restart) [KC] Doc updated.
[X] 2. Templates feature continues allowing editing system metadata block fields but cannot create a dataset using that template, throws ui error and server log error saying needs key. [KC] Still an issue, can see and modify system protected block in create template, though it seems fixed in edit an existing template (doesn't show protected block if exists). Note that if the existing template was created using protected block, it still attempts to update that block, resulting in error. Could be a separate issue or out of scope/doc. [KC] Fixed
[X]3. Can create a dataset using native api that contains system metadata block entries without a key, see all fields example. [KC} The endpoint now honors the system block but I'm finding the key in the header doesn't work, only as an arg. [KC] Fixed example to use : as key/value separator in header example, = as separator in arg example
[x] 4. Making citation block a system block comes with some constraints: a. cannot use UI to create or edit a dataset due to required citation block fields on save and b. cannot edit any other block in ui or api without key due to automatic citation block update on save field like deposit date. [KC] Doc updated.
[x] 5. Unclear how to reverse/undo the system metadatablock with the example provided. Maybe make it a bit more explicit -in the case of this example, asadmin delete-password-alias dataverse.metadata.block-system-metadata-keys.codemeta , then restart payara for it to take effect. [KC] Doc Updated
[X]6. The semantic api example should state it is changing the title metadata and add ?replace=true to the command, otherwise it fails against an existing dataset already containing a title with an unclear message: status":"ERROR","code":400,"message":"Bad Request. [KC] This works now, thanks. Was wondering though, aside from showing the various key options, does the example make sense? It is updating a field, title, in the citation block but passing a key to the codeMeta20 block when it isn't needed. Sorry I didn't catch that sooner. [KC] Doc example updated
[X]7. The semantic api example shows two command line methods of passing the block key. The header example using -H fails but the second example, passing as an argument ?md.key, works. [KC] Using the examples that update title, if I set citation block as protected, this endpoint does not seem to require keys, it works without passing them. [KC] Was issue where with changed example to codemeta20 block, the keys were non existent so it failed silently. Example was updated and works.
[X]8. Was not able to get the native api edit metadata endpoint to work using the example in the guide so could not confirm it enforces system block. However, the update metadata endpoint does enforce it. [KC] I am seeing some weird behavior here: initially enforces key on protected block, then once passed, can execute same update without key. If I then use a different named file, it prompts for key again. Changed or same values doesn't seem to matter. I also discovered a curlism: if pass a key using -H as in api key, if no space between key:value, it doesn't need quotes but if there is a space, there does need to be quotes. I think in our guides I've seen examples of both. This is not the explanation of the above behavior since not quoting results in an error when there is a space. [KC] The intermittent behavior was due to unchanged content on update.

@qqmyers
Copy link
Member Author

qqmyers commented Jul 10, 2023

  1. Need to restart payara for system metadata block setting to be enforced using example (just an observation, not sure whether all mp sources need restart)
  • noted in docs
  1. Templates feature continues allowing editing system metadata block fields but cannot create a dataset using that template, throws ui error and server log error saying needs key.
  • fixed, protected blocks won't show in template now (FWIW: I don't see an api call to create a template yet, so nothing to fix there)
  1. Can create a dataset using native api that contains system metadata block entries without a key, see all fields example.
  • fixed
  1. Making citation block a system block comes with some constraints: a. cannot use UI to create or edit a dataset due to required citation block fields on save and b. cannot edit any other block in ui or api without key due to automatic citation block update on save field like deposit date.
  • noted in docs
  1. Unclear how to reverse/undo the system metadatablock with the example provided. Maybe make it a bit more explicit -in the case of this example, asadmin delete-password-alias dataverse.metadata.block-system-metadata-keys.codemeta , then restart payara for it to take effect.
  • added example to docs
  1. The semantic api example should state it is changing the title metadata and add ?replace=true to the command, otherwise it fails against an existing dataset already containing a title with an unclear message: status":"ERROR","code":400,"message":"Bad Request.
  • changed in docs
  1. The semantic api example shows two command line methods of passing the block key. The header example using -H fails but the second example, passing as an argument ?md.key, works.
  • I found the opposite - the header worked and query param didn't. Investigating, I found that the query param is case sensitive, so I noted that in the docs and changed the examples to use the same case everywhere (fwiw: also noticed that the codemeta block is actually named codeMeta20 so I changed the example to that)
  1. Was not able to get the native api edit metadata endpoint to work using the example in the guide so could not confirm it enforces system block. However, the update metadata endpoint does enforce it.
  • can't reproduce this - possible an issue with block name and/or case sensitivity?

@kcondon kcondon merged commit 26351af into IQSS:develop Jul 11, 2023
9 of 10 checks passed
IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from QA ✅ to Done 🚀 Jul 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GDCC: DANS related to GDCC work for DANS Size: 30 A percentage of a sprint. 21 hours. (formerly size:33)
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

Feature Request/Idea: Support protected metadata managed by external applications
6 participants