Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add write*Blob option to replace existing blob #31729

Merged
merged 3 commits into from Jul 3, 2018

Conversation

ywelsch
Copy link
Contributor

@ywelsch ywelsch commented Jul 2, 2018

Adds a new parameter to the BlobContainer#write*Blob methods to specify whether the existing file should be overridden or not. For some metadata files in the repository, we actually want to replace the current file. This is currently implemented through an explicit blob delete and then a fresh write. In case of using a cloud provider (S3, GCS, Azure), this results in 2 API requests instead of just 1. This change will therefore allow us to achieve the same functionality using less API requests.

@ywelsch ywelsch added >enhancement :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v7.0.0 v6.4.0 labels Jul 2, 2018
@ywelsch ywelsch requested a review from tlrx July 2, 2018 12:31
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

assert blobSize <= LARGE_BLOB_THRESHOLD_BYTE_SIZE : "large blob uploads should use the resumable upload method";
final ByteArrayOutputStream baos = new ByteArrayOutputStream(Math.toIntExact(blobSize));
Streams.copy(inputStream, baos);
try {
final Storage.BlobTargetOption[] targetOptions = failIfAlreadyExists ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a static method that returns the appropriate options depending on failIfAlreadyExists?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do so, but that method cannot be shared between writeBlobResumable and writeBlobMultipart (as in one case these are of type BlobWriteOption and in the other case these are BlobTargetOption). Do you still think these should get a static method (just to have one caller)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed that they didn't have the same return types. Let's keep what you already did. Sory for the noise.

@@ -91,7 +91,7 @@ public InputStream readBlob(String blobName) throws IOException {
}

@Override
public void writeBlob(String blobName, InputStream inputStream, long blobSize) throws IOException {
public void writeBlob(String blobName, InputStream inputStream, long blobSize, boolean failIfAlreadyExists) throws IOException {
SocketAccess.doPrivilegedIOException(() -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S3BlobContainer does not respect the BlobContainer API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but that's not a surprise? It can't respect it because of its weak consistency guarantees

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should document this here

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@ywelsch ywelsch merged commit 2bb4f38 into elastic:master Jul 3, 2018
ywelsch added a commit that referenced this pull request Jul 3, 2018
Adds a new parameter to the BlobContainer#write*Blob methods to specify whether the existing file
should be overridden or not. For some metadata files in the repository, we actually want to replace
the current file. This is currently implemented through an explicit blob delete and then a fresh write.
In case of using a cloud provider (S3, GCS, Azure), this results in 2 API requests instead of just 1.
This change will therefore allow us to achieve the same functionality using less API requests.
dnhatn added a commit that referenced this pull request Jul 4, 2018
* 6.x:
  Fix not waiting for Netty ThreadDeathWatcher in IT (#31758) (#31789)
  [Docs] Correct default window_size (#31582)
  S3 fixture should report 404 on unknown bucket (#31782)
  [ML] Limit ML filter items to 10K (#31731)
  Fixture for Minio testing (#31688)
  [ML] Return statistics about forecasts as part of the jobsstats and usage API (#31647)
  [DOCS] Add missing get mappings docs to HLRC (#31765)
  [DOCS] Starting Elasticsearch (#31701)
  Fix coerce validation_method in GeoBoundingBoxQueryBuilder (#31747)
  Painless: Complete Removal of Painless Type (#31699)
  Consolidate watcher setting update registration (#31762)
  [DOCS] Adds empty 6.3.1 release notes page
  ingest: Introduction of a bytes processor (#31733)
  [test] don't run bats tests for suse boxes (#31749)
  Add analyze API to high-level rest client (#31577)
  Implemented XContent serialisation for GetIndexResponse (#31675)
  [DOCS] Typos
  DOC: Add examples to the SQL docs (#31633)
  Add support for AWS session tokens (#30414)
  Watcher: Reenable start/stop yaml tests (#31754)
  JDBC: Fix stackoverflow on getObject and timestamp conversion (#31735)
  Support multiple system store types (#31650)
  Add write*Blob option to replace existing blob (#31729)
  Split CircuitBreaker-related tests (#31659)
  Painless: Add Context Docs (#31190)
  Docs: Remove missing reference
  Migrate scripted metric aggregation scripts to ScriptContext design (#30111)
  Watcher: Fix chain input toXcontent serialization (#31721)
  Remove _all example (#31711)
  rest-high-level: added get cluster settings (#31706)
  Docs: Match the examples in the description (#31710)
  [Docs] Correct typos (#31720)
  Extend allowed characters for grok field names (#21745) (#31653) (#31722)
  [DOCS] Check for Windows and *nix file paths (#31648)
  [ML] Validate ML filter_id (#31535)
  Fix gradle4.8 deprecation warnings (#31654)
  Update numbers to reflect 4-byte UTF-8-encoded characters (#27083)
dnhatn added a commit that referenced this pull request Jul 4, 2018
* master:
  [ML] Rate limit established model memory updates (#31768)
  [Docs] Correct default window_size (#31582)
  S3 fixture should report 404 on unknown bucket (#31782)
  Detach Transport from TransportService (#31727)
  [ML] Limit ML filter items to 10K (#31731)
  [ML] Return statistics about forecasts as part of the jobsstats and usage API (#31647)
  Fixture for Minio testing (#31688)
  [DOCS] Add missing get mappings docs to HLRC (#31765)
  [DOCS] Starting Elasticsearch (#31701)
  Painless: Complete Removal of Painless Type (#31699)
  Fix not waiting for Netty ThreadDeathWatcher in IT (#31758)
  Consolidate watcher setting update registration (#31762)
  Build: re-enabled bwc (#31769)
  ingest: Introduction of a bytes processor (#31733)
  Fix coerce validation_method in GeoBoundingBoxQueryBuilder (#31747)
  Add analyze API to high-level rest client (#31577)
  [DOCS] Typos
  DOC: Add examples to the SQL docs (#31633)
  Add support for AWS session tokens (#30414)
  Watcher: Reenable start/stop yaml tests (#31754)
  Implemented XContent serialisation for GetIndexResponse (#31675)
  JDBC: Fix stackoverflow on getObject and timestamp conversion (#31735)
  resolveHasher defaults to NOOP (#31723)
  Account for XContent overhead in in-flight breaker
  Split CircuitBreaker-related tests (#31659)
  Add write*Blob option to replace existing blob (#31729)
  Painless: Add Context Docs (#31190)
  Watcher: Fix chain input toXcontent serialization (#31721)
  Docs: Match the examples in the description (#31710)
  rest-high-level: added get cluster settings (#31706)
  [Docs] Correct typos (#31720)
  Clean up double semicolon code typos (#31687)
  [DOCS] Check for Windows and *nix file paths (#31648)
  [ML] Validate ML filter_id (#31535)
  Revert long lines
  Fix TransportChangePasswordActionTests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement v6.4.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants