storage: a way to specify Content-Length during upload #601

Closed
gmiroshnykov opened this issue May 15, 2015 · 6 comments
Labels: api: storage (Issues related to the Cloud Storage API); type: question (Request for information or clarification. Not an issue.)

Comments

@gmiroshnykov

The Cloud Storage docs say that you have to specify a Content-Length header for upload requests, but I guess this is not a hard requirement, since gcloud-node does not do that (as far as I can see) and provides an API for streaming uploads of unknown length using Transfer-Encoding: chunked.

I'd like to be able to explicitly set a known Content-Length in advance.

This might help catch the kind of programmer mistake (which I am guilty of) where you pipe an HTTP request into a stream returned by file.createWriteStream(), but accidentally abort the HTTP request prematurely. In that case CRC32 / MD5 checksums won't save you, because the bytes that were sent are intact and the upload was technically complete, but logically incomplete.
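
To illustrate the failure mode, here is a minimal sketch of a client-side length check. It assumes the truncated source still ends the stream normally, and `expectLength` is a hypothetical helper, not part of gcloud-node. It passes data through unchanged and fails if the total byte count differs from the expected one.

```javascript
const { Transform } = require('node:stream');

// Hypothetical helper (not a gcloud-node API): forwards every chunk untouched
// and errors if the number of bytes seen does not equal `expectedBytes`.
function expectLength(expectedBytes) {
  let seen = 0;
  return new Transform({
    transform(chunk, _encoding, callback) {
      seen += chunk.length;
      if (seen > expectedBytes) {
        callback(new Error(`received more than ${expectedBytes} bytes`));
        return;
      }
      callback(null, chunk);
    },
    // Called when the writable side ends: this is where truncation shows up.
    flush(callback) {
      if (seen !== expectedBytes) {
        callback(new Error(`expected ${expectedBytes} bytes, received ${seen}`));
        return;
      }
      callback();
    },
  });
}
```

It would sit between the incoming response and the upload stream, for example `pipeline(originResponse, expectLength(n), uploadStream, cb)`, where `n` comes from the origin's Content-Length header.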

@jgeewax added the "type: question" and "api: storage" labels on May 15, 2015
@jgeewax added this to the Storage Future milestone on May 15, 2015
@ryanseys
Contributor

Would you like to set X-Upload-Content-Length or Content-Length?

From the docs:

> X-Upload-Content-Length. Set to the number of bytes of upload data to be transferred in subsequent requests. If the length is unknown at the time of this request, you can omit this header.
>
> Content-Length. Set to the number of bytes provided in the body of this initial request. Not required if you are using chunked transfer encoding.

POST /upload/storage/v1/b/myBucket/o?uploadType=resumable HTTP/1.1
Host: www.googleapis.com
Authorization: Bearer your_auth_token
Content-Length: 38
Content-Type: application/json; charset=UTF-8
X-Upload-Content-Type: image/jpeg
X-Upload-Content-Length: 2000000

{
  "name": "myObject"
}
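
As a rough sketch of how the two headers differ, the following builds the headers for an initial resumable-upload request like the one above. The function name `resumableInitHeaders` and its parameters are made up for illustration. Content-Length is derived from the JSON body of this request, while X-Upload-Content-Length describes the data sent later and is omitted when unknown.

```javascript
// Hypothetical helper: builds headers and body for the initial request of a
// resumable upload, following the documentation excerpt quoted above.
function resumableInitHeaders(metadata, uploadContentType, totalBytes) {
  const body = JSON.stringify(metadata);
  const headers = {
    'Content-Type': 'application/json; charset=UTF-8',
    // Size in bytes of THIS request's body (the JSON metadata).
    'Content-Length': Buffer.byteLength(body, 'utf8'),
    'X-Upload-Content-Type': uploadContentType,
  };
  if (typeof totalBytes === 'number') {
    // Size in bytes of the object data sent in SUBSEQUENT requests.
    headers['X-Upload-Content-Length'] = totalBytes;
  }
  return { headers, body };
}
```

Note that the Content-Length value depends on how the JSON is serialized, so it should always be computed from the exact bytes that are sent.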

@gmiroshnykov
Author

It's not about resumable uploads, it's about simple ("multipart") uploads, so X-Upload-Content-Length is not applicable.

I'd like to be able to set Content-Length and avoid chunked transfer encoding.

@stephenplusplus
Contributor

Sorry that 3 months have gone by without any action on this issue. Requests are indeed going through with chunked encoding, and there isn't currently a way to disable that. We could turn it off when you provide a Content-Length, then add on the length of the other part of the request (the metadata), so that the upload only succeeds if all of the data is present.

I think we should definitely support that, but we just need to figure out how and what makes the most sense. The nice thing about streaming uploads is the fact that chunked encoding works to flush data as it's brought through the pipeline. By making a one-off upload, all of that data has to be stored in memory until the request ends. However, that's kind of expected, since you'll probably only want to use a single HTTP request for those smaller sized uploads.
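
The arithmetic mentioned above (the data length plus the length of the metadata part) could look roughly like this. This is a sketch that assumes generic multipart/related framing with a caller-supplied boundary; the function names are hypothetical and the exact framing the library uses may differ.

```javascript
// Hypothetical helpers: compute the text that surrounds the object data in a
// two-part multipart/related body (metadata part, then data part).
function multipartEnvelope(metadata, contentType, boundary) {
  const head =
    `--${boundary}\r\n` +
    'Content-Type: application/json; charset=UTF-8\r\n\r\n' +
    `${JSON.stringify(metadata)}\r\n` +
    `--${boundary}\r\n` +
    `Content-Type: ${contentType}\r\n\r\n`;
  const tail = `\r\n--${boundary}--`;
  return { head, tail };
}

// Total Content-Length of the request: envelope bytes plus the data bytes.
function multipartContentLength(metadata, contentType, boundary, dataBytes) {
  const { head, tail } = multipartEnvelope(metadata, contentType, boundary);
  return Buffer.byteLength(head, 'utf8') + dataBytes + Buffer.byteLength(tail, 'utf8');
}
```

With the total known up front, the head, the data stream, and the tail can be written in order without buffering the data itself.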

What do you think about letting file.createWriteStream continue to chunk, but allowing bucket.upload to set the Content-Length if the file provided is < 5 MB?

> This might help catching the kind of programmer mistake (which I am guilty of) when you're piping an HTTP request into a stream returned by file.createWriteStream(), but accidentally aborting the HTTP request prematurely.

How would you see this working, since file.createWriteStream() needs to know the content length of the incoming data up front in order to disable chunking?

Sorry again about the delay. Thanks for bringing this up!

@gmiroshnykov
Author

Hey, thanks and sorry for the delay on my part.

> The nice thing about streaming uploads is the fact that chunked encoding works to flush data as it's brought through the pipeline. By making a one-off upload, all of that data has to be stored in memory until the request ends.

Actually, you can do streaming uploads without chunked encoding or in-memory buffering: just pipe the source stream into the request and you'll be fine. But obviously you have to know the exact size of your request body in advance, so that you can send a Content-Length header.
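
A minimal sketch of that idea with Node's built-in http module (plain HTTP, no gcloud-node involved; `streamWithKnownLength` is a hypothetical name). Setting Content-Length up front disables Node's default chunked encoding, and the body is still piped through without being buffered.

```javascript
const http = require('node:http');

// Streams `source` as the body of a POST request. Because Content-Length is
// set explicitly, Node does not use Transfer-Encoding: chunked.
function streamWithKnownLength(url, source, contentLength) {
  return new Promise((resolve, reject) => {
    const req = http.request(
      url,
      {
        method: 'POST',
        headers: {
          'Content-Type': 'application/octet-stream',
          'Content-Length': contentLength,
        },
      },
      (res) => {
        res.resume(); // drain the response body
        res.on('end', () => resolve(res.statusCode));
      }
    );
    req.on('error', reject);
    // If the source fails midway, tear the request down instead of ending it.
    source.on('error', (err) => req.destroy(err));
    source.pipe(req);
  });
}
```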

> What do you think about letting file.createWriteStream continue to chunk, but allow bucket.upload to set the Content-Length if the file provided is < 5 MB?

That might be a good idea in itself, but won't help my particular use case.

I am (was) writing a proxy server that is backed by GCS. On cache miss, I did an HTTP request to the origin server and got an instance of http.IncomingMessage with a known Content-Length header.
Then I used file.createWriteStream() and piped the origin response into it (to populate the cache). This way there was no in-memory buffering: I just piped one HTTP GET response into another HTTP POST request. So there was no need for chunked encoding or multipart uploads, but I couldn't pass Content-Length along and was instead forced to use chunked encoding and resumable uploads anyway, because that was the only API available when using gcloud-node (at least at that time).

> How would you see this working, since file.createWriteStream() needs to know the content length of the incoming data up front in order to disable chunking?

I've started looking through the code and it looks like you've changed a bunch of things (great work btw!) and my knowledge is now outdated, plus my original project is done and it works as is, but let me try to suggest a sensible API change:

It looks like you allow setting a Content-Type via metadata.contentType here. You could also accept metadata.contentLength if it's provided. You'll probably have to ignore it for resumable uploads.
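
Something along these lines, purely as a hypothetical sketch of the decision logic (neither `planUpload` nor the returned shape exists in the library): use a fixed length only when metadata.contentLength is a valid byte count and the upload is not resumable.

```javascript
// Hypothetical sketch of the proposed behaviour, not the library's API.
function planUpload(metadata = {}, resumable = false) {
  const length = metadata.contentLength;
  const lengthKnown = Number.isSafeInteger(length) && length >= 0;
  if (resumable || !lengthKnown) {
    // Resumable uploads ignore contentLength; unknown lengths must be chunked.
    return { chunked: true };
  }
  // Simple upload with a known size: chunked encoding can be disabled.
  return { chunked: false, dataBytes: length };
}
```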

I hope that made sense :)

Thanks for your work!

@stephenplusplus
Contributor

Thank you for the nice words and for sharing the use case! I put together a PR to support this, please take a look if you're still interested: #853.

@stephenplusplus
Contributor

As mentioned here, I'm currently stuck on how to proceed. Feel free to share thoughts here if this feature is really in demand, but for now, I'm going to consider this a micro-optimization that adds too much complexity to the library to support.
