
Globus Bug Fixes #10345

Merged
merged 16 commits into IQSS:develop from GlobalDataverseCommunityConsortium:globusstore on Mar 13, 2024

Conversation

qqmyers
Member

@qqmyers qqmyers commented Feb 27, 2024

Several issues have been discovered in testing the new Globus capabilities in prep for the MOC demo and DCM 2024 meeting:

  • Remote stores: errors in creating exports and a Dataset getThumbnail() failure, both caused by trying to treat a Dataset storageIdentifier as a path when, unlike a DataFile's, it isn't one.
  • The obsolete/undocumented dataverse-inaccessible setting was still used in one place; it has been replaced with the documented files-not-accessible-by-dataverse setting.
  • Previewers/external tools were shown even when files are not accessible to Dataverse; previewers/file tools now only show if the files are accessible or the tool requires auxFileAccess (aux files are accessible via the backing store).
  • The direct and Globus upload APIs only checked the file size/quota limits after the file transfer had happened. The direct upload API now won't send signed URLs for files that are too big, and the Globus API will report both the fileSizeLimit and remainingQuota to the caller. (Since Globus doesn't enforce size limits, we still have to trust that users won't try transferring files that they've been told will be rejected/not added to the dataset.)
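The pre-transfer check described in the last bullet can be sketched as follows (a minimal sketch; `UploadPolicy` and `canIssueUrl` are hypothetical names, not Dataverse's actual API): refuse to issue a signed URL when the declared file size exceeds either the per-file limit or the remaining quota.

```java
// Hypothetical sketch of a pre-transfer size/quota check.
// Names (UploadPolicy, canIssueUrl) are illustrative, not Dataverse's API.
class UploadPolicy {
    private final long fileSizeLimit;   // max bytes per file; -1 = unlimited
    private final long remainingQuota;  // bytes left in the user's quota; -1 = unlimited

    UploadPolicy(long fileSizeLimit, long remainingQuota) {
        this.fileSizeLimit = fileSizeLimit;
        this.remainingQuota = remainingQuota;
    }

    /** Only issue a signed URL when the declared size fits both limits. */
    boolean canIssueUrl(long declaredSize) {
        if (fileSizeLimit >= 0 && declaredSize > fileSizeLimit) {
            return false;  // file exceeds the per-file limit
        }
        if (remainingQuota >= 0 && declaredSize > remainingQuota) {
            return false;  // file would exceed the remaining quota
        }
        return true;
    }
}
```

For Globus, where the server cannot block the transfer itself, the same numbers (fileSizeLimit, remainingQuota) are reported to the caller instead, as the bullet above notes.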

PR is draft until the latest round of testing is done.

Which issue(s) this PR closes:

Closes #10344
Closes #10231

Special notes for your reviewer:

Suggestions on how to test this:

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

@coveralls

coveralls commented Feb 27, 2024

Coverage Status

coverage: 20.585% (+0.01%) from 20.574%
when pulling 3be8cac on GlobalDataverseCommunityConsortium:globusstore
into e26fabe on IQSS:develop.

@qqmyers qqmyers marked this pull request as ready for review March 1, 2024 18:03
@qqmyers qqmyers added the Size: 3 A percentage of a sprint. 2.1 hours. label Mar 1, 2024
@qqmyers qqmyers modified the milestones: 6.1.1, 6.2 Mar 1, 2024
@landreev landreev self-requested a review March 4, 2024 01:37
@landreev landreev self-assigned this Mar 4, 2024
@landreev
Contributor

landreev commented Mar 4, 2024

Testing with the NESE Demo Tape pool (a managed store with files inaccessible to Dataverse), the following in the log must be coming from the calculateDetails() method in the Globus service bean, so it appears that it attempts to calculate the checksum once the upload is complete. It's not fatal: it gives up after 3 tries and sets the hash to "Not available in Dataverse", as it should. But is it a problem that it even tries, on a store with files-not-accessible-by-dataverse?

[2024-03-04T15:32:48.367-0500] [Payara 6.2023.8] [SEVERE] [] [] [tid: _ThreadID=1770 _ThreadName=pool-217-thread-1] [timeMillis: 1709584368367] [levelValue: 1000] [[
  java.lang.NullPointerException: Cannot invoke "java.io.InputStream.read(byte[])" because "in" is null
	at edu.harvard.iq.dataverse.util.FileUtil.calculateChecksum(FileUtil.java:735)
	at edu.harvard.iq.dataverse.globus.GlobusServiceBean.calculateDetails(GlobusServiceBean.java:1138)
	at edu.harvard.iq.dataverse.globus.GlobusServiceBean.lambda$calculateDetailsAsync$11(GlobusServiceBean.java:1112)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
]]

[2024-03-04T15:32:48.368-0500] [Payara 6.2023.8] [INFO] [] [edu.harvard.iq.dataverse.globus.GlobusServiceBean] [tid: _ThreadID=1770 _ThreadName=pool-217-thread-1] [timeMillis: 1709584368368] [levelValue: 800] [[
  Cannot invoke "java.io.InputStream.read(byte[])" because "in" is null]]

(The same NullPointerException, with an identical stack trace and log messages, repeats twice more at 15:32:53 and 15:32:58 before the retries stop.)
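The behavior above could be avoided with a guard like the following (a sketch only; the `filesNotAccessible` flag and method shape are assumptions modeled on the documented files-not-accessible-by-dataverse setting and the log output, not the real calculateDetails() code): skip the checksum attempt entirely when the store's files are not readable by Dataverse, instead of opening a null stream and retrying.

```java
// Sketch: skip checksum calculation on stores whose files Dataverse cannot read.
// The flag mirrors the files-not-accessible-by-dataverse setting; the method
// shape is illustrative, not the actual GlobusServiceBean.calculateDetails().
class ChecksumGuard {
    static final String NOT_AVAILABLE = "Not available in Dataverse";

    static String checksumOrPlaceholder(boolean filesNotAccessible,
                                        java.util.function.Supplier<String> calculate) {
        if (filesNotAccessible) {
            // Don't open a stream at all: it would be null and NPE on read().
            return NOT_AVAILABLE;
        }
        return calculate.get();  // normal path: actually compute the checksum
    }
}
```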

@landreev
Contributor

landreev commented Mar 5, 2024

Testing a remote managed store upload workflow: aside from the cast-to-int problem mentioned on Slack, the issue persists where the transfer completes successfully on the Globus side, but Dataverse then fails to save the datafile on its end.
The failure happens inside the /addFiles API, with the following exception stack trace:

java.lang.NullPointerException: Cannot invoke "java.lang.Long.longValue()" because the return value of "edu.harvard.iq.dataverse.FileMetadata.getVersion()" is null
	at edu.harvard.iq.dataverse.util.json.JsonPrinter.json(JsonPrinter.java:672)
	at edu.harvard.iq.dataverse.util.json.JsonPrinter.json(JsonPrinter.java:656)
	at edu.harvard.iq.dataverse.util.json.JsonPrinter.jsonFileMetadatas(JsonPrinter.java:526)
	at edu.harvard.iq.dataverse.util.json.JsonPrinter.jsonDataFileList(JsonPrinter.java:486)
	at edu.harvard.iq.dataverse.datasetutility.AddReplaceFileHelper.getSuccessResultAsJsonObjectBuilder(AddReplaceFileHelper.java:1904)
	at edu.harvard.iq.dataverse.datasetutility.AddReplaceFileHelper.addFiles(AddReplaceFileHelper.java:2091)
	at edu.harvard.iq.dataverse.api.Datasets.addFilesToDataset(Datasets.java:4055)
... 

The above is likely a symptom, not the cause, of the failure: the exception is thrown in AddReplaceFileHelper.getSuccessResultAsJsonObjectBuilder(), probably called after a failure to save that went undetected.
Attached is a fine-level server log, starting from the confirmation from Globus that the upload completed.

Looking at the following lines: https://github.com/GlobalDataverseCommunityConsortium/dataverse/blob/6d3769f2a26b70e7108c18b92f39c1838ad8cd28/src/main/java/edu/harvard/iq/dataverse/globus/GlobusServiceBean.java#L820-L852 it looks like when a failure occurs during this /addFiles API call, it is not recorded as such (countError is not incremented). Also, the edit lock is left on the dataset, because the removeDatasetLock() call is inside the catch() block, and no exception is thrown in this case.
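The fix pattern implied here can be sketched like this (hypothetical names, not the actual GlobusServiceBean code): count the failure even when no exception is thrown, and release the lock in a finally block so the dataset is never left locked.

```java
// Sketch: ensure the edit lock is always released and failures are counted,
// even when the /addFiles call fails without throwing. Names are illustrative.
class AddFilesOutcome {
    int countSuccess = 0;
    int countError = 0;

    void processAddFilesResponse(boolean ok, Runnable removeDatasetLock) {
        try {
            if (ok) {
                countSuccess++;
            } else {
                countError++;  // previously only incremented when an exception was caught
            }
        } finally {
            removeDatasetLock.run();  // previously only reached inside catch()
        }
    }
}
```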

It's not immediately clear to me from the server log why the actual failure in /addFiles occurs, but it has to be something simple, no?

server_log.txt

@poikilotherm
Contributor

/push-image thx


github-actions bot commented Mar 6, 2024

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:globusstore
ghcr.io/gdcc/configbaker:globusstore

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

@landreev
Contributor

I have confirmed that the above (datafile not created, dataset stuck with an "edit in progress" lock) is happening consistently, not just on my own dev instance but also on other servers, like dataverse-internal, with the same exception stack trace.
I'm wondering if it's something very straightforward, like some necessary fix missing from the PR branch. Or maybe some JVM option or setting that I am missing in my config?

@landreev
Contributor

It finally occurred to me to check, but it looks like it's not just the Globus uploads!
Direct uploads via DVUploader are similarly broken (same exception), i.e. /addFiles appears to be broken in develop (not just in this branch).

This is from beta, which is running the develop branch build:

PROCESSING(F): /Users/landreev/NetBeansProjects/dataverse/test_685.txt
               Does not yet exist on server.
              UPLOADED as: MD5:ab3b010ddaaf87bc09944ad210e5d6eb
CURRENT TOTAL: 1 files :685 bytes
Error response when processing files in Individual files on command line : Bad Request
{"status":"ERROR","message":"Cannot invoke \"java.lang.Long.longValue()\" because the return value of \"edu.harvard.iq.dataverse.FileMetadata.getVersion()\" is null"}

@landreev
Contributor

landreev commented Mar 12, 2024

It's really simple too: all it appears to be is that this JsonPrinter line in 6.1:

.add("version", fmd.getVersion())

is working because the JsonObjectBuilder .add methods are overridden by our own NullSafeJsonBuilder; in develop we somehow lost that, and the .add methods from the native class are used instead.

@landreev
Contributor

(there was a wrong code line copy-and-pasted in the last comment; corrected)

@landreev
Contributor

It is a very recent change, and it is this line, 660 in JsonPrinter.json:

   JsonObjectBuilder builder = jsonObjectBuilder();

jsonObjectBuilder() creates a NullSafeJsonBuilder, but it is cast back to the native JsonObjectBuilder. ☹️
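The mechanism can be reproduced in miniature (a simplified sketch; `Builder` and `NullSafeBuilder` stand in for jakarta.json's JsonObjectBuilder and Dataverse's NullSafeJsonBuilder): overload resolution happens at compile time against the variable's declared type, so a null-safe overload defined only on the subclass is bypassed as soon as the reference is typed as the base interface, and the primitive overload unboxes the null Long with an NPE.

```java
// Simplified illustration of the trap, not Dataverse's actual classes.
// The interface only has a primitive add(String, long); the null-safe builder
// adds an overload taking a boxed Long. Which one the compiler picks depends
// on the variable's DECLARED type, so assigning back to the interface type
// silently re-enables unboxing NPEs on null values.
interface Builder {
    Builder add(String name, long value);  // unboxing a null Long here -> NPE
}

class NullSafeBuilder implements Builder {
    final java.util.Map<String, Long> values = new java.util.HashMap<>();

    public NullSafeBuilder add(String name, long value) {
        values.put(name, value);
        return this;
    }

    // This overload is only visible through the NullSafeBuilder static type.
    public NullSafeBuilder add(String name, Long value) {
        if (value != null) {
            values.put(name, value);  // skip nulls instead of throwing
        }
        return this;
    }
}
```

With `NullSafeBuilder b = ...`, `b.add("version", (Long) null)` picks the null-safe Long overload; with `Builder b = ...`, only `add(String, long)` is visible, and the null is unboxed at the call site, throwing NullPointerException, which matches the stack trace above.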

@landreev
Contributor

With the fix for the main issue merged into develop (#10366), and with the fix for file sizes > 2GB committed, I can now upload files - yay.
(screenshot: Screen Shot 2024-03-12 at 6.19.34 PM)
I'm inclined to go ahead and merge this PR, with the acknowledgement that everything about Globus is rather convoluted, and I have been testing one simple workflow: uploading files to one specific endpoint (NESE test/demo tape) using the Borealis dataverse-globus app. So, apologies if I missed something in the process.

landreev and others added 3 commits March 12, 2024 19:15
A few cosmetic/trivial additions to the guide, if you want them. 
Things like the dataset id query parameters missing from the curl example lines... that are really obvious, and it may be safe to assume that only serious developers are going to read this guide... so, up to you.
cosmetic additions to globus-api.rst
@landreev landreev merged commit 5caf436 into IQSS:develop Mar 13, 2024
11 of 12 checks passed
@landreev landreev removed their assignment Mar 13, 2024
@landreev
Contributor

Just noticed the "size 3" label - lol.

@landreev landreev added Size: 30 A percentage of a sprint. 21 hours. (formerly size:33) and removed Size: 3 A percentage of a sprint. 2.1 hours. labels Mar 14, 2024
@scolapasta scolapasta modified the milestones: 6.1.1, 6.2 Apr 1, 2024
Projects
Status: Done 🧹
Development

Successfully merging this pull request may close these issues.

Direct Upload: missing error check for checksum
Previewers should not show when files are not accessible
5 participants