DOCK-2523: Fix formatting of conceptDOI #5894

Merged: 17 commits, Jun 3, 2024

@hyunnaye hyunnaye commented May 24, 2024

Description
This PR removes the doi/ prefix from conceptDOI.

Review Instructions
Go to https://qa.dockstore.org/workflows/github.com/kathy-t/ghapps-single-workflow:master?tab=info and verify that the link works correctly.

Issue
https://ucsc-cgl.atlassian.net/browse/DOCK-2523
https://ucsc-cgl.atlassian.net/browse/DOCK-2527

Security and Privacy

If there are any concerns that require extra attention from the security team, highlight them here and check the box when complete.

  • Security and Privacy assessed

e.g. Does this change...

  • Any user data we collect, or data location?
  • Access control, authentication or authorization?
  • Encryption features?

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Check that you pass the basic style checks and unit tests by running mvn clean install
  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.
  • Follow the existing JPA patterns for queries, using named parameters, to avoid SQL injection
  • If you are changing dependencies, check the Snyk status check or the dashboard to ensure you are not introducing new high/critical vulnerabilities
  • Assume that inputs to the API can be malicious, and sanitize and/or check for Denial of Service type values, e.g., massive sizes
  • Do not serve user-uploaded binary images through the Dockstore API
  • Ensure that endpoints that only allow privileged access enforce that with the @RolesAllowed annotation
  • Do not create cookies, although this may change in the future
  • If this PR is for a user-facing feature, create and link a documentation ticket for this feature (usually in the same milestone as the linked issue). Style points if you create a documentation PR directly and link that instead.

@hyunnaye hyunnaye self-assigned this May 24, 2024
codecov bot commented May 24, 2024

Codecov Report

Attention: Patch coverage is 0%; 6 lines in your changes are missing coverage. Please review.

Project coverage is 70.05%. Comparing base (e6eea04) to head (c89dd09).

Current head c89dd09 differs from pull request most recent head b2fd097

Please upload reports for the commit b2fd097 to get more accurate results.

Files Patch % Lines
.../io/dockstore/webservice/helpers/ZenodoHelper.java 0.00% 6 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             develop    #5894      +/-   ##
=============================================
- Coverage      73.61%   70.05%   -3.57%     
+ Complexity      5298     4955     -343     
=============================================
  Files            374      371       -3     
  Lines          19404    19192     -212     
  Branches        2021     2012       -9     
=============================================
- Hits           14284    13444     -840     
- Misses          4150     4765     +615     
- Partials         970      983      +13     
Flag Coverage Δ
bitbuckettests 27.15% <0.00%> (+0.10%) ⬆️
hoverflytests ?
integrationtests 49.65% <0.00%> (-7.32%) ⬇️
languageparsingtests 11.08% <0.00%> (-0.01%) ⬇️
localstacktests 21.70% <0.00%> (+0.06%) ⬆️
toolintegrationtests 30.50% <0.00%> (+0.11%) ⬆️
unit-tests_and_non-confidential-tests 28.45% <0.00%> (+2.40%) ⬆️
workflowintegrationtests 38.70% <0.00%> (+4.48%) ⬆️


@hyunnaye hyunnaye marked this pull request as ready for review May 24, 2024 15:50
@@ -184,6 +184,9 @@ protected static String extractDoiFromDoiUrl(String doiUrl) {
try {
URI uri = new URI(doiUrl);
doi = StringUtils.stripStart(uri.getPath(), "/");
if (doi.startsWith("doi/")) {
Member

Could you trace/test this?
It isn't clear to me whether this would fix concept DOIs only going forward or whether this is in a part of code that is called for retrieving past concept DOIs as well.

Contributor Author

@hyunnaye hyunnaye May 24, 2024

I think this would fix only future DOIs, so we would need a database migration to fix the existing ones. I didn't realize the migrations happen in this repo, so I'll add that to this PR.

Contributor

It was like this already, but I think you should take advantage of the URI class features instead of converting the URI back to a string and manipulating it. Something like:

URI base = new URI("https://doi.org/doi");
URI uri = new URI(doiUrl);
// relativize returns a URI, so take its path to get the DOI string
final String doi = base.relativize(uri).getPath();
...

But if we're expecting paths that don't start with doi, then the above won't work as is.

Contributor

But if we're expecting paths that don't start with doi, then the above won't work as is.

It seems like we only call this for the concept DOI url which we retrieve from the parent_doi link of the published record.

String conceptDoiUrl = publishedDeposit.getLinks().get("parent_doi");
String conceptDoi = extractDoiFromDoiUrl(conceptDoiUrl);

This link has the form https://zenodo.org/doi/10.5281/zenodo.11093964 so we can't assume that the URL starts with https://doi.org/doi
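
To make the point above concrete, here is a minimal sketch of pulling the DOI out of a link shaped like the parent_doi example. The class and method names are hypothetical and this is not the Dockstore implementation; it only assumes the link has the /doi/<DOI> path form quoted above. It also shows why relativizing against https://doi.org/doi does not help for a zenodo.org link: URI.relativize hands back the given URI unchanged when the hosts differ.

```java
import java.net.URI;

class ConceptDoiFromUrl {
    // Hypothetical helper for illustration only; not the Dockstore code.
    static String extractDoi(String doiUrl) {
        // URI.create throws IllegalArgumentException on a malformed URL.
        String path = URI.create(doiUrl).getPath();
        if (path == null) {
            return ""; // opaque URIs have no path
        }
        // Drop the leading slash, then the "doi/" segment Zenodo puts in front of the DOI.
        String doi = path.startsWith("/") ? path.substring(1) : path;
        if (doi.startsWith("doi/")) {
            doi = doi.substring("doi/".length());
        }
        return doi;
    }

    // Relativizing against a doi.org base: the given URI comes back unchanged
    // whenever its scheme or authority differs from the base.
    static String relativizeAgainstDoiOrg(String doiUrl) {
        URI base = URI.create("https://doi.org/doi");
        return base.relativize(URI.create(doiUrl)).toString();
    }

    public static void main(String[] args) {
        String zenodoLink = "https://zenodo.org/doi/10.5281/zenodo.11093964";
        System.out.println(extractDoi(zenodoLink));
        System.out.println(relativizeAgainstDoiOrg(zenodoLink));
    }
}
```

With the zenodo.org link, extractDoi yields the bare DOI while the relativize variant returns the full URL, which is the failure mode described in the comment.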

Contributor

The conceptdoi appears to be included in raw form in the Zenodo response. Could we retrieve it from there and avoid the URL manipulation? https://zenodo.org/api/records/11093965

Contributor

If we do stick with extracting the concept DOI from the link url, and we need a cue as to where the Zenodo DOI begins in a given string, Zenodo DOIs always start with 10.5281/zenodo. in the current Zenodo DOI generation scheme.
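
As a rough sketch of that cue-based approach: the helper below searches a string for the 10.5281/zenodo. prefix and returns the DOI starting there. The class and method names are hypothetical, and the code assumes the part after the prefix is a run of digits, as in the 11093964 example in this thread; that assumption would need checking against Zenodo's actual scheme.

```java
import java.util.Optional;

class ZenodoDoiLocator {
    // Prefix quoted in the comment above; holds only for the current Zenodo DOI scheme.
    static final String ZENODO_DOI_PREFIX = "10.5281/zenodo.";

    // Hypothetical helper: returns the first Zenodo DOI found in the text, if any.
    static Optional<String> findZenodoDoi(String text) {
        int start = text.indexOf(ZENODO_DOI_PREFIX);
        if (start < 0) {
            return Optional.empty();
        }
        int idStart = start + ZENODO_DOI_PREFIX.length();
        int end = idStart;
        // Assumption: the record id after the prefix is numeric.
        while (end < text.length() && Character.isDigit(text.charAt(end))) {
            end++;
        }
        if (end == idStart) {
            return Optional.empty(); // prefix present but no id after it
        }
        return Optional.of(text.substring(start, end));
    }

    public static void main(String[] args) {
        System.out.println(findZenodoDoi("https://zenodo.org/doi/10.5281/zenodo.11093964"));
        System.out.println(findZenodoDoi("doi/10.5281/zenodo.11093964"));
    }
}
```

This works for both the link form and the misformatted doi/ form without caring what precedes the DOI, at the cost of tying the code to the current prefix.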

Contributor Author

I changed it so we use the conceptdoi from the zenodo response.

Contributor

@kathy-t kathy-t left a comment

The ticket has a 1.15.x milestone, should the base be changed to the hotfix branch?

@denis-yuen
Member

The ticket has a 1.15.x milestone, should the base be changed to the hotfix branch?

I think it's worth considering depending on how invasive this all ends up being.
Let's see what the migration ends up looking like.

Member

@denis-yuen denis-yuen left a comment

Not sure if latest version has been uploaded

Member

@denis-yuen denis-yuen left a comment

One question for clarification

<sql dbms="postgresql">
UPDATE workflow
SET conceptdoi = REGEXP_REPLACE(conceptdoi, 'doi/', '', 'g')
WHERE archived=false AND conceptdoi IS NOT NULL;
Contributor

Why only when archived = false? If there's a reason for it, you might want to add a comment.

Contributor

@svonworl svonworl May 29, 2024

FWIW, the "frozenness" of archived entries is enforced via a trigger that makes an update fail when the archived flag is true both before and after the update. Essentially, the trigger causes updates to an archived entry to throw if they don't unarchive it. So, if we found that we did need to make changes to archived entries during migration, one approach might be to disable the trigger, make the update, and then re-enable the trigger afterwards. https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-DESC-DISABLE-ENABLE-TRIGGER

Contributor Author

Disabled and reenabled the trigger in my new commit
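
For reference, the REGEXP_REPLACE(conceptdoi, 'doi/', '', 'g') in the migration is unanchored and global, so it removes every occurrence of doi/ in the value, not only a leading one. The sketch below mirrors that in Java with String.replaceAll and compares it with an anchored variant. The class and method names are hypothetical, the second input value is made up for illustration, and Java regex semantics are only standing in for PostgreSQL's here. For values of the form seen in this thread, both variants give the same result.

```java
class ConceptDoiMigrationSketch {
    // Mirrors the migration's unanchored, global replacement.
    static String stripEveryOccurrence(String conceptDoi) {
        return conceptDoi.replaceAll("doi/", "");
    }

    // Alternative for comparison: only strips a leading "doi/".
    static String stripLeadingOnly(String conceptDoi) {
        return conceptDoi.replaceFirst("^doi/", "");
    }

    public static void main(String[] args) {
        String stored = "doi/10.5281/zenodo.11093964";
        System.out.println(stripEveryOccurrence(stored));
        System.out.println(stripLeadingOnly(stored));

        // Made-up value, only to show where the two variants diverge.
        String contrived = "doi/10.1234/doi/x";
        System.out.println(stripEveryOccurrence(contrived));
        System.out.println(stripLeadingOnly(contrived));
    }
}
```

Both variants leave an already-clean value untouched, so rerunning the replacement on migrated rows would be harmless.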

@@ -191,4 +191,11 @@
<modifyDataType columnName="topicmanual" newDataType="varchar(250)" tableName="tool"/>
<modifyDataType columnName="topicmanual" newDataType="varchar(250)" tableName="workflow"/>
</changeSet>
<changeSet author="hyunnaye" id="fixConceptDOIs">
<sql dbms="postgresql">
UPDATE workflow
Contributor

Are you only updating the workflow table because it's the only one that has this issue? If somebody creates a DOI in app tool, service, or notebook between now and the 1.16 release, that won't introduce the same misformatted conceptdoi?

Contributor Author

oops, I added them for the other tables

Member

@denis-yuen denis-yuen left a comment

one question, may not be blocker

ALTER TABLE apptool ENABLE TRIGGER update_archived_apptool;
</sql>
<sql dbms="postgresql">
ALTER TABLE service DISABLE TRIGGER update_archived_service;
Member

Have these migrations been tested on qa?
Asking because I thought migrations are run via the dockstore user
https://github.com/dockstore/compose_setup/blob/5c59a56af0cecaea4887bd805aba7277ab7e58b8/templates/init_migration.sh.template

I don't recall if the dockstore user has the ability to disable the frozen trigger

Contributor

It looks like we've done it in the past, as the dockstore user, so I think it's OK (but definitely test). You should be able to test locally if it's easier, if you have your postgres and dockstore DB users set up correctly.

Contributor Author

I tested it locally and the migrations work :)

sonarcloud bot commented May 31, 2024

Quality Gate failed

Failed conditions
B Maintainability Rating on New Code (required ≥ A)

@hyunnaye hyunnaye requested a review from coverbeck May 31, 2024 17:46
@hyunnaye hyunnaye merged commit c0aeab8 into develop Jun 3, 2024
14 of 15 checks passed
@hyunnaye hyunnaye deleted the feature/feature/DOCK-2523 branch June 3, 2024 18:34
5 participants