IQSS/6993-use-single-AWS-client-per-store #6994

qqmyers · 2020-06-17T19:53:01Z

What this PR does / why we need it: Refactors the S3AccessIO class so that it no longer creates a new AWS client for each instance and instead keeps track of one AWS client per store.
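For readers who want a picture of the "one client per store" idea, here is a minimal sketch. It is not the code from this PR: `FakeS3Client`, `StoreClientRegistry`, `clientFor`, and the store ids are all hypothetical names, and the fake client stands in for a real AWS client so the example runs without the AWS SDK. It assumes the per-store lookup can be a static map keyed by store id.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an AWS S3 client; the real client type is not shown here.
final class FakeS3Client {
    final String storeId;

    FakeS3Client(String storeId) {
        this.storeId = storeId;
    }
}

// Illustrative per-store registry (assumed names, not Dataverse code).
final class StoreClientRegistry {
    // One shared client per store id; ConcurrentHashMap keeps lookups thread-safe.
    private static final Map<String, FakeS3Client> CLIENTS = new ConcurrentHashMap<>();
    // Counts how many clients were actually constructed.
    static final AtomicInteger CREATED = new AtomicInteger();

    private StoreClientRegistry() {
    }

    // Returns the cached client for the store, building it only on first use.
    static FakeS3Client clientFor(String storeId) {
        return CLIENTS.computeIfAbsent(storeId, id -> {
            CREATED.incrementAndGet();
            return new FakeS3Client(id);
        });
    }

    public static void main(String[] args) {
        FakeS3Client a1 = clientFor("s3-a");
        FakeS3Client a2 = clientFor("s3-a");
        FakeS3Client b = clientFor("s3-b");
        System.out.println(a1 == a2); // same store id, so the same instance is reused
        System.out.println(a1 != b);  // a different store gets its own client
        System.out.println(CREATED.get());
    }
}
```

With this shape, any number of access objects for the same store share one client, which is where a memory saving would plausibly come from.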

Which issue(s) this PR closes:

Closes #6993

Special notes for your reviewer:

Suggestions on how to test this: This should do nothing :-) - behavior should be the same (except perhaps lower memory use). Since it's S3-related, testing should focus on getting/putting files to S3. Since the change is about keeping track of S3 stores correctly, testing should probably use a couple of stores and verify that reads/writes still go to the correct store/bucket.
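The multi-store check suggested above can be sketched in memory as follows. This only illustrates the shape of the test and is not a test from the Dataverse code base: `InMemoryStore`, `MultiStoreRoutingCheck`, the store ids, and the bucket naming are assumptions made up for the example, and no real S3 calls are made.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory "store": a bucket name plus its objects. Not Dataverse or AWS code.
final class InMemoryStore {
    final String bucket;
    final Map<String, byte[]> objects = new ConcurrentHashMap<>();

    InMemoryStore(String bucket) {
        this.bucket = bucket;
    }
}

final class MultiStoreRoutingCheck {
    // storeId -> store; stands in for the per-store client lookup.
    private static final Map<String, InMemoryStore> STORES = new ConcurrentHashMap<>();

    private MultiStoreRoutingCheck() {
    }

    static InMemoryStore storeFor(String storeId) {
        // The bucket name is derived from the store id only for this illustration.
        return STORES.computeIfAbsent(storeId, id -> new InMemoryStore(id + "-bucket"));
    }

    static void put(String storeId, String key, String content) {
        storeFor(storeId).objects.put(key, content.getBytes(StandardCharsets.UTF_8));
    }

    // Returns the stored content, or null if the key is absent from that store.
    static String get(String storeId, String key) {
        byte[] data = storeFor(storeId).objects.get(key);
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        put("storeA", "file1", "alpha");
        put("storeB", "file1", "beta");
        // The same key in two stores must neither collide nor cross over.
        System.out.println(get("storeA", "file1"));
        System.out.println(get("storeB", "file1"));
        System.out.println(get("storeA", "missing"));
    }
}
```

A real QA pass would do the equivalent against two configured S3 stores and inspect the buckets directly.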

Does this PR introduce a user interface change? If mockups are available, please link/include them here: No

Is there a release notes update needed for this change?: No

Additional documentation:

coveralls · 2020-06-17T20:06:13Z

Coverage decreased (-0.01%) to 19.543% when pulling 10c2fb7 on GlobalDataverseCommunityConsortium:IQSS/6993 into fa33c7a on IQSS:develop.

landreev

Looks good.
@qqmyers could you please sync up the branch with develop? - There are no conflicts, but updating the version in pom.xml would help with QA deployments.
Everything appears to work. I was also curious to measure how much exactly we are saving per access call... Got strange results, then realized my experiment was faulty... Then decided not to hold the PR on account of it. So maybe I can instead help with that during QA (since it really is a QA task).

qqmyers · 2020-08-20T19:20:47Z

@landreev - I'll be curious as well w.r.t. any memory/performance improvement - could be fun with 1K+ files in a dataset.

In any case - note/reminder for testing - #6995 depends on this PR (includes it) so this should probably be tested/merged first.

qqmyers added 30 commits April 11, 2020 15:02

cleanup - add explicit type

a018055

add multipart upload api calls and add to S3StorageIO class

d314431

also refactor S3StorageIO to re-use single client per store, use more static methods

Merge remote-tracking branch 'IQSS/develop' into IQSS/6763

dbbd569

update error msgs

61448a8

try test update

cd7e345

Restore driverId check in getMainKey to pass tests

8b88b39

Nominally useful if code ever changes the storageIdentifier of the dvObject after the S3AccessIO instance is created, but real code shouldn't do that :-)

remove unused imports

7b91c07

Merge remote-tracking branch 'IQSS/develop' into IQSS/6763

6b03884

report the exception message when can't access bucket

e707519

typo on comparison

dab4af6

IQSS/6829 - account for file failures

d322809

remove ~duplicate code

6ca407c

IQSS-6829 - avoid race with 2+ files uploading

b0557aa

and the dataset pid being set multiple times.

IQSS/6829 - create mode uses DatasetPage - initialize identifier there

beb419e

for directUpload case

move directUploadEnabled to systemconfig

8400c20

used in DatasetPage and editFilesFragment.xhtml

typo/fix method calls

6f9baed

set minimum part size

f841d59

add convenience urls

f844914

add dv_status=temp tag in multipart uploads

db00c6f

return no content for delete call

f473fab

add request for upload in parts method

088efb0

handle net:ERR_NETWORK_CHANGED errors w/o 'undefined' error

be27169

remove draft code from other branch

ef6ef13

handle dataverse change - keep GlobalId in direct upload case

a666a7b

IQSS/6881 delete temp files when cancelling dataset create

4bb1b66

IQSS 6881 cleanup temp files on cancel when creating dataset

45245b3

get/set for uploadInProgress

e2d7207

actually initialize uploadInProgress

56bd6a7

limit processing on cancelCreate

40f2cd1

catch all files in direct upload cancel + debug statements

dc3f130

qqmyers added 15 commits May 11, 2020 14:55

Merge remote-tracking branch 'IQSS/develop' into IQSS/6763

a2b47ec

Conflicts: src/main/webapp/resources/js/fileupload.js

Merge remote-tracking branch 'IQSS/develop' into IQSS/6763

1963f09

Conflicts: src/main/java/edu/harvard/iq/dataverse/DatasetPage.java src/main/webapp/resources/js/fileupload.js

add a delay for tiny files

dbecab9

more changes to fix concurrency issues

e1993b2

bug - requests for URL weren't queued after initial 4

a1c2c92

update version console entry

a3b775b

make it easier to adjust delay

b4f7178

decrease to 100 ms delay

0a546d9

keeping the cancel delay at 1 second - any queued uploads need to finish.

correct save button selector for disbale during cancel

c438219

fix selector

6bae264

update console version info

37ab57a

Merge remote-tracking branch 'IQSS/develop' into IQSS/6763

6d6d2b0

Conflicts: src/main/java/edu/harvard/iq/dataverse/api/Datasets.java

Merge remote-tracking branch 'IQSS/develop' into IQSS/6763

d4ae322

removing whitespace changes

050b59d

removing mp-upload-specific changes

66d84c1

qqmyers mentioned this pull request Jun 17, 2020

IQSS/6763-multi-part upload API calls #6995

Merged

djbrooke added this to Code Review 🦁 in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) via automation Jul 15, 2020

djbrooke assigned landreev Jul 15, 2020

Merge remote-tracking branch 'IQSS/develop' into IQSS/6993

a792363

landreev approved these changes Aug 20, 2020

IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from Code Review 🦁 to QA 🔎✅ Aug 20, 2020

Merge remote-tracking branch 'IQSS/develop' into IQSS/6993

10c2fb7

kcondon assigned kcondon and unassigned landreev Aug 24, 2020

kcondon merged commit cdbb7f2 into IQSS:develop Aug 24, 2020

IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) automation moved this from QA 🔎✅ to Done 🚀 Aug 24, 2020

djbrooke added this to the 5.1 milestone Aug 25, 2020

qqmyers mentioned this pull request Aug 27, 2020

S3 failures aren't informative in the log #7237

Closed
