
HTTP GET "Range" header not supported#304

Merged
shino merged 4 commits into master from ss-get-object-range
Feb 19, 2013

Conversation

shino (Contributor) commented Feb 14, 2013

We don't currently support the HTTP "Range" header, e.g.:

GET http://test.s3.amazonaws.com/foofoo5.txt HTTP/1.1
Host: test.s3.amazonaws.com
Accept-Encoding: identity
Authorization: AWS NP3ZEHK_H9MBSHEP2XWS:5REBtA1MXLk+t5QHMz4Kx5mBzVQ=
x-amz-date: Fri, 19 Oct 2012 22:28:45 +0000
Range: bytes=1-
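For reference, the open-ended byte range that s3cmd sends here ("from byte 1 to the end") can be parsed as sketched below. This is a minimal illustration, not Riak CS code; `parse_byte_range` is a hypothetical helper name:

```python
def parse_byte_range(header_value, resource_len):
    """Parse a single HTTP/1.1 byte-range spec such as 'bytes=1-' or
    'bytes=4-8' into an inclusive (start, end) pair, or None when the
    header is unparsable, unsatisfiable, or lists multiple ranges."""
    if not header_value.startswith("bytes="):
        return None
    spec = header_value[len("bytes="):]
    if "," in spec:          # multiple ranges: a server MAY ignore them
        return None
    first, _, last = spec.partition("-")
    if first:                # 'bytes=4-8' or open-ended 'bytes=1-'
        start = int(first)
        end = int(last) if last else resource_len - 1
    else:                    # suffix range 'bytes=-5': the final 5 bytes
        start = max(resource_len - int(last), 0)
        end = resource_len - 1
    if start > end or start >= resource_len:
        return None          # unsatisfiable: a server should answer 416
    return start, min(end, resource_len - 1)

# The request above asks for everything from byte 1 onward of an
# 11-byte resource:
print(parse_byte_range("bytes=1-", 11))   # (1, 10)
```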

To reproduce via s3cmd:

  1. Store a file of any size, call it s3://buck/foo-object
  2. Create a partial local copy: echo bar > foo-object
  3. s3cmd get --continue s3://buck/foo-object

The output is:

[...]
DEBUG: Requesting Range: 4 .. end
DEBUG: Response: {'status': 500, 'headers': {'content-length': '170', 'server': 'MochiWeb/1.1 WebMachine/1.9.0 (someone had painted it blue)', 'last-modified': 'Sat, 20 Oct 2012 03:26:52 GMT', 'etag': '"78b295896cccae248b9ef825147a642f"', 'date': 'Fri, 19 Oct 2012 22:33:17 GMT', 'content-type': 'text/html'}, 'reason': 'Internal Server Error'}
DEBUG: S3Error: 500 (Internal Server Error)
DEBUG: HttpHeader: content-length: 170
DEBUG: HttpHeader: server: MochiWeb/1.1 WebMachine/1.9.0 (someone had painted it blue)
DEBUG: HttpHeader: last-modified: Sat, 20 Oct 2012 03:26:52 GMT
DEBUG: HttpHeader: etag: "78b295896cccae248b9ef825147a642f"
DEBUG: HttpHeader: date: Fri, 19 Oct 2012 22:33:17 GMT
DEBUG: HttpHeader: content-type: text/html
ERROR: S3 error: 500 (Internal Server Error): 

And the error from the CS node is:

17:33:17.672 [error] webmachine error: path="/test/foofoo5.txt"
[{webmachine_request,range_parts,[{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['con
tent-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[[resource_module|riak_cs_wm_key],['content-type',116,101,120,116,47,112,108,97,105,110]],[],[['content-encoding',105,100,101,110,116,105,116,121]],[],[],[],[],[],[],[],[]}}},undefined,"127.0.0.1",{wm_reqdata,'GET',http,{1,1},"127.0.0.1",{wm_reqstate,#Port<0.6013>,{dic
t,3,16,16,8,80,48,{[],...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...},...],...},...]

Required PRs

Implementation TODOs:

  • GET an object that was uploaded via multipart upload

@ghost ghost assigned shino Jan 22, 2013
shino (Contributor) commented Jan 26, 2013

Memo: AWS S3 responds with the whole resource when multiple ranges are requested.

cf. RFC 2616, which says "A server MAY ignore the Range header":

14.35.2 Range Retrieval Requests

A server MAY ignore the Range header. However, HTTP/1.1 origin
servers and intermediate caches ought to support byte ranges when
possible, since Range supports efficient recovery from partially
failed transfers, and supports efficient partial retrieval of large
entities.

s3curl command outputs for a single range and for two ranges are shown below.
Irrelevant parts are omitted; hamandeggs.txt contains just the one line hamandeggs.

single range (bytes=4-8)

  • Response headers include Content-Length, Accept-Ranges, and Content-Range
  • Content-Type is the resource's own
  • Response body is cut down to the specified byte range
$ s3curl.pl --id s3_id_here -- -v -H "Range: bytes=4-8" \
            http://***.s3.amazonaws.com/hamandeggs.txt
> GET /hamandeggs.txt HTTP/1.1
< HTTP/1.1 307 Temporary Redirect
> GET /hamandeggs.txt HTTP/1.1
> User-Agent: curl/7.27.0
> Host: ***.s3-ap-northeast-1.amazonaws.com
> Accept: */*
> Range: bytes=4-8
>
< HTTP/1.1 206 Partial Content
< Accept-Ranges: bytes
< Content-Range: bytes 4-8/11
< Content-Type: text/plain
< Content-Length: 5
< Server: AmazonS3
<
ndegg
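The 206 response above can be checked arithmetically: for the 11-byte body of hamandeggs.txt, the inclusive range bytes=4-8 covers five bytes. A small sketch (not AWS code) reproducing the headers:

```python
body = b"hamandeggs\n"          # 11 bytes, as served from hamandeggs.txt
start, end = 4, 8               # from "Range: bytes=4-8" (inclusive bounds)

part = body[start:end + 1]      # Python slices exclude the end index
content_range = f"bytes {start}-{end}/{len(body)}"

print(part.decode())            # ndegg
print(content_range)            # bytes 4-8/11
print(len(part))                # 5 -> Content-Length of the 206 response
```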

two ranges (bytes=1-3,5-8)

  • Response headers include Content-Length and Accept-Ranges,
  • but do not include Content-Range
  • Content-Type is the resource's own
  • Response body is not cut down; the whole resource is returned
    (i.e. the same as without Range)
$ s3curl.pl --id s3_id_here -- -v -H "Range: bytes=1-3,5-8" \
     http://***.s3.amazonaws.com/hamandeggs.txt
> GET /hamandeggs.txt HTTP/1.1
< HTTP/1.1 307 Temporary Redirect
> GET /hamandeggs.txt HTTP/1.1
> Host: ***.s3-ap-northeast-1.amazonaws.com
> Accept: */*
> Range: bytes=1-3,5-8
>
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Content-Type: text/plain
< Content-Length: 11
< Server: AmazonS3
<
hamandeggs
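The observed behavior (one valid range yields 206, several ranges are ignored and the whole body comes back as 200, an unsatisfiable range yields 416) can be summarized in a decision sketch. `handle_range` and its return shape are hypothetical names for illustration; full header validation is omitted:

```python
def handle_range(range_header, body):
    """Return (status, headers, payload) for a GET, mimicking the
    observed S3 behavior: single valid range -> 206 Partial Content,
    multiple ranges -> 200 with the whole body, unsatisfiable -> 416."""
    n = len(body)
    base = {"Accept-Ranges": "bytes", "Content-Length": str(n)}
    if range_header is None or not range_header.startswith("bytes="):
        return 200, base, body
    spec = range_header[len("bytes="):]
    if "," in spec:                      # multiple ranges: serve the whole resource
        return 200, base, body
    first, _, last = spec.partition("-")
    start = int(first) if first else max(n - int(last), 0)
    end = min(int(last), n - 1) if (first and last) else n - 1
    if start > end or start >= n:
        return 416, {"Content-Range": f"bytes */{n}"}, b""
    part = body[start:end + 1]
    headers = {"Accept-Ranges": "bytes",
               "Content-Range": f"bytes {start}-{end}/{n}",
               "Content-Length": str(len(part))}
    return 206, headers, part

# Two ranges: the header is ignored, the full body is returned.
status, hdrs, payload = handle_range("bytes=1-3,5-8", b"hamandeggs\n")
print(status)                            # 200
```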

Comment thread on src/riak_cs_wm_object.erl (outdated):
Shouldn't this also use known_length_stream instead of stream? We don't know how many CS blocks the requested range will span.

kellymclaughlin (Contributor) commented:

Verified that single range requests succeed using this branch and the related webmachine and mochiweb branches. Also, specifying multiple ranges does have the expected result of returning a 200 response code along with the entire object body.

shino (Contributor) commented Feb 15, 2013

@kellymclaughlin Thank you very much for the review and confirmation.

I pushed changes according to your review comment.

Related change in Webmachine:
basho/webmachine@e68015b

And one bug fix to mochiweb:
basho/mochiweb@af52646

Comment thread on rebar.config (outdated):
Just be sure to set this to point to master prior to merging.

shino (Contributor) commented Feb 18, 2013

Rebased on master 3627d0c and force-pushed, in order to incorporate the MP (multipart) bug fix.

Now it's possible to issue ranged GETs for objects uploaded via multipart upload.

@ghost ghost assigned kellymclaughlin Feb 18, 2013
kellymclaughlin (Contributor) commented:

+1 to merge

@ghost ghost assigned shino Feb 18, 2013
Shunichi Shinohara added 4 commits February 19, 2013 12:49
shino added a commit that referenced this pull request Feb 19, 2013
Add support of HTTP GET "Range" header

* Respond normal (non-range) resource for multiple range requests
* Respond with 416 for invalid range requests
@shino shino merged commit b6b38a3 into master Feb 19, 2013
shino (Contributor) commented Feb 19, 2013

Merged after rebase.
