Skip to content

[5.x] fix s3 zarr performance issue#761

Merged
lesserwhirls merged 4 commits intoUnidata:maint-5.xfrom
haileyajohnson:s3-perf
Jul 13, 2021
Merged

[5.x] fix s3 zarr performance issue#761
lesserwhirls merged 4 commits intoUnidata:maint-5.xfrom
haileyajohnson:s3-perf

Conversation

@haileyajohnson
Copy link
Copy Markdown
Contributor

@haileyajohnson haileyajohnson commented Jul 13, 2021

Description of Changes

  • get length and last modified properties off s3object instead of from head request
  • fail immediately in isValidFile in other iosps
  • Result: time to build Zarr/CDM object down from 2+minutes to 6 seconds

PR Checklist

  • Indicate the version associated with this PR in the Title
    (e.g. "[5.x]: This is my PR title")
  • Link to any issues that the PR addresses
  • Add labels, especially if the PR should be ported to other versions
    (these labels start with "port: ")
  • Open as a draft PR
    until ready for review
  • Make sure GitHub tests pass
  • Mark PR as "Ready for Review"

This change is Reviewable

@Category(NotPullRequest.class)
public void bucketAndKeyOsdc() throws IOException {
long lastModified = 1611593614000L;
long lastModified = 1619161328000L;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this tests is still going through the same codepath (i.e. getting last modified from headObjectResponse) but the lastmodified seems to have changed

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we should disable the last modified portion of the tests. It seems like the last modified time changes occasionally on all platforms, but I'm not sure why...maybe shifting disks on the object store or data centers? It has happened most often on the Open Science Data Cloud.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that

Copy link
Copy Markdown
Member

@lesserwhirls lesserwhirls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the JDK 8 test actually finishes, this looks great to me!

Reviewed 8 of 9 files at r1, 1 of 1 files at r2.
Dismissed @haileyajohnson from a discussion.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @haileyajohnson)


cdm/s3/src/test/java/thredds/inventory/s3/TestMFileS3.java, line 111 at r1 (raw file):

Previously, haileyajohnson wrote…

I can do that

👍

@haileyajohnson haileyajohnson marked this pull request as ready for review July 13, 2021 21:50
@lesserwhirls
Copy link
Copy Markdown
Member

woohoo!

@lesserwhirls lesserwhirls merged commit 04f7db0 into Unidata:maint-5.x Jul 13, 2021
@haileyajohnson haileyajohnson deleted the s3-perf branch August 2, 2021 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants