Sidestep multipart upload failures on GCS #70

Merged
merged 1 commit on Apr 24, 2017

@ljfranklin
Contributor

ljfranklin commented Apr 10, 2017

  • Attempting to upload a file larger than 5 MB (the default multipart
    upload size) to Google Cloud Storage returns:
error running command: InvalidArgument: Invalid argument.
  status code: 400, request id:
  • This PR avoids the issue by disabling multipart uploads when the GCS
    endpoint is used. I didn't have time to investigate the root cause or
    an actual fix.
  • I didn't add an integration test, as requiring a GCS account in CI
    seemed like a lot of overhead for this change alone, but let me know
    if you'd rather have integration coverage around this.
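For readers who want a feel for the workaround, here is a rough sketch of the endpoint check it implies. This is not the code from the PR: `isGCSEndpoint`, `allowMultipart`, and the choice of `storage.googleapis.com` as the marker for GCS are all assumptions made for illustration, and the sketch uses only the Go standard library rather than the AWS SDK.

```go
package main

import (
	"net/url"
	"strings"
)

// gcsHost is the host used here to recognize GCS. Treating this host as
// the marker for "the GCS endpoint" is an assumption of this sketch.
const gcsHost = "storage.googleapis.com"

// isGCSEndpoint (hypothetical helper) reports whether a configured
// S3-compatible endpoint points at Google Cloud Storage. An empty
// endpoint is taken to mean the default AWS endpoint.
func isGCSEndpoint(endpoint string) bool {
	if endpoint == "" {
		return false
	}
	// Endpoints may be configured with or without a scheme; add one so
	// url.Parse puts the host in the right field.
	if !strings.Contains(endpoint, "://") {
		endpoint = "https://" + endpoint
	}
	u, err := url.Parse(endpoint)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	return host == gcsHost || strings.HasSuffix(host, "."+gcsHost)
}

// allowMultipart (hypothetical helper) is the switch described above:
// multipart uploads stay enabled everywhere except on GCS.
func allowMultipart(endpoint string) bool {
	return !isGCSEndpoint(endpoint)
}
```

With a check like this, the uploader would only need to consult `allowMultipart` once, when it is constructed, and could leave the S3 code path untouched.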
concourse-bot commented Apr 10, 2017

Hi there!

We use Pivotal Tracker to provide visibility into what our team is working on. A story for this issue has been automatically created.

The current status is as follows:

  • #143387325 Sidestep multipart upload failures on GCS

This comment, as well as the labels on the issue, will be automatically updated as the status in Tracker changes.

ljfranklin added a commit to ljfranklin/bosh-bot that referenced this pull request Apr 10, 2017

Member

vito commented Apr 24, 2017

Seems reasonable, thanks!

I did some investigation and there are a couple of confusing factors between S3 and GCS. I may be wrong, but I figured I'd document my thought process:

  1. AWS "multipart" is really for uploading separate chunks of the same blob in parallel. GCS "multipart" is for uploading the data along with the file's metadata in a single request.
  2. The GCS docs happen to give 5 MB as the threshold above which you should not try to do it all in one request.
  3. The AWS SDK also happens to have a 5 MB threshold above which it attempts a parallel upload. This is unrelated to GCS's "multipart" and relies on an API that GCS doesn't support.

So, two different definitions of "multipart" and two different uses of the same threshold (5 MB), for totally unrelated purposes. Got it.
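
To make the S3-side threshold concrete, here is a small sketch of the decision as I understand it from the points above: a body that fits in one part goes out as a single request, and anything larger is split into parts. `planUpload`, `uploadPlan`, and `defaultPartSize` are hypothetical names for illustration, not AWS SDK identifiers, and the 5 MB value is taken from the discussion rather than read from the SDK.

```go
package main

// defaultPartSize mirrors the 5 MB figure discussed above. It is an
// assumption of this sketch, not a value read from the AWS SDK.
const defaultPartSize int64 = 5 * 1024 * 1024

// uploadPlan (hypothetical type) describes how a body would be sent.
type uploadPlan struct {
	Multipart bool  // true when the body is split into several parts
	Parts     int64 // number of requests carrying body data
}

// planUpload (hypothetical helper) decides between a single request and
// a chunked upload. Passing multipartEnabled=false models the workaround
// of turning multipart off for GCS.
func planUpload(size, partSize int64, multipartEnabled bool) uploadPlan {
	if partSize <= 0 {
		partSize = defaultPartSize
	}
	// A body that fits in one part, or any body when multipart is
	// disabled, is sent as a single request.
	if !multipartEnabled || size <= partSize {
		return uploadPlan{Multipart: false, Parts: 1}
	}
	// Otherwise round up to the number of parts needed.
	parts := (size + partSize - 1) / partSize
	return uploadPlan{Multipart: true, Parts: parts}
}
```

Under these assumptions a 12 MB file would be planned as three parts against S3, while the same file with multipart disabled would go out as one request, which is the path the workaround forces for GCS.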

@vito vito merged commit 6968022 into concourse:master Apr 24, 2017

1 check passed

ci/pivotal-cla Thank you for signing the Contributor License Agreement!

@vito vito added this to the v3.0.0 milestone May 13, 2017

cappyzawa pushed a commit to cappyzawa/s3-resource that referenced this pull request Feb 5, 2018

Merge pull request concourse#70 from ljfranklin/PR-gcs-multipart
Sidestep multipart upload failures on GCS