Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix inaccurate s3 rewrite rule. #1040

Merged
merged 17 commits into from
Jan 13, 2015
Merged

Conversation

ksauzz
Copy link
Contributor

@ksauzz ksauzz commented Jan 6, 2015

Riak CS doesn't handle some specific characters in object name well.

see: https://gist.github.com/ksauzz/9f0dc13c9b8cb0f2a85e
related to #910

This PR includes:
  • Fix s3 rewrite rule which results in losing backward compability.
  • Introduce riak_cs_s3_rewrite_legacy to keep backward compatibility.
  • Add regression test for riak_cs_s3_rewrite_legacy.
Topics I'd like to discuss
  • Is using test-python reasonable for regression test ? At least, boto test saved me from my bad codes.
How access log format changes after this patches.

before

172.17.42.1 - - [07/Jan/2015:08:27:07 +0000] "PUT /buckets/test/objects/path1%2Fpath2%2Fte%2Bst.txt HTTP/1.1" 200 0 "" ""

after

127.0.0.1 - - [07/Jan/2015:17:28:23 +0900] "PUT /buckets/test/objects/path1%2Fpath2%2Fte%252Bst.txt HTTP/1.1" 200 0 "" ""
TODO:
  • Update releasenote about description the change of rewrite rule, and how to keep backward compatibility.

error({external_client_tests, Reason})
end.

execute_cmd(Cmd, Opts) ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend just using rt_cs_dev:cmd/2 instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like duplicate code ^^;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rt_cs_dev:cmd/2 doesn't look suitable for this case because it doesn't handle exit code, or output log until the command finishes.I'd like to extract these function to rtcs module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also in master branch, rt_cs_dev:cmd/2 does.

@ksauzz ksauzz changed the title WIP: fix inaccurate s3 rewrite rule. Fix inaccurate s3 rewrite rule. Jan 7, 2015

-module(legacy_s3_rewrite_test).

%% @doc `riak_test' module for testing object get behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a stale comment from copy source.

@kuenishi
Copy link
Contributor

kuenishi commented Jan 7, 2015

I believe you should update riak_cs_config:api/0 to handle riak_cs_s3_rewrite_legacy, or it does no harm to riak_cs_wm_common:maybe_create_user/6 ?

@kuenishi
Copy link
Contributor

kuenishi commented Jan 7, 2015

I also believe it is a good opportunity to write eqc tests on rewrite. If you can't afford time for now, I can do it later.


-define(TEST_BUCKET, "legacy-s3-rewrite-test").

confirm() ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be good if this test includes the exact case where the problem happened, like:

  • put key "foo bar" with content "foo bar"
  • put key "foo+bar" with content "foo+bar"
  • read "foo bar" and "foo+bar" and verify they have different content

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's about testing legacy code, so I'd say this is enough to test old behaviour.

@kuenishi
Copy link
Contributor

kuenishi commented Jan 7, 2015

Memo: to preserve old path translation behaviour, users should update app.config as {rewrite_module, riak_cs_s3_rewrite_legacy} from default when updating from 1.5.3 or older.

@kuenishi
Copy link
Contributor

kuenishi commented Jan 7, 2015

After upgraded to the latest version and not to preserve old odd behaviour, new key for the old object put as "foo+bar", is "foo bar".

@shino shino added this to the 1.5.4 milestone Jan 7, 2015
@shino shino added the Bug label Jan 7, 2015
@shino
Copy link
Contributor

shino commented Jan 8, 2015

By removing URL decode at rewrite module, bucket part of path style
access is never decoded.

For example, using GET Bucket request (client can URL encode any characters in path part, %74 is t):

s3curl.pl --id cs --get -- -x 127.0.0.1:15018 -i -s 'http://s3.amazonaws.com/%74est2'

Current release/1.5 branch responds with 200 OK and contents:

HTTP/1.1 200 OK
Server: Riak CS
Date: Thu, 08 Jan 2015 02:14:39 GMT
Content-Type: application/xml
Content-Length: 253

<?xml version="1.0" encoding="UTF-8"?><ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>test2</Name><Prefix></Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter></Delimiter><IsTruncated>false</IsTruncated></ListBucketResult>%

This branch (at commit d1c3d02) responds with error:

HTTP/1.1 404 Object Not Found
Server: Riak CS
Date: Thu, 08 Jan 2015 02:14:08 GMT
Content-Type: application/xml
Content-Length: 185

<?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><Message>The specified bucket does not exist.</Message><Resource>/%74est2</Resource><RequestId></RequestId></Error>%

@kuenishi
Copy link
Contributor

kuenishi commented Jan 9, 2015

As for @shino 's comment, we'll leave it for bucket names bug just fix object keys. If legacy behaviour preferred, just use legacy rewrite module.

@shino
Copy link
Contributor

shino commented Jan 9, 2015

@kuenishi The bug was fixed by 40609c5 (if I understand correctly).

@kuenishi
Copy link
Contributor

kuenishi commented Jan 9, 2015

ah 🙏

@kuenishi
Copy link
Contributor

kuenishi commented Jan 9, 2015

All riak_test has passed as usual. test-riak_cs is still screwed, although.

@kuenishi kuenishi closed this Jan 9, 2015
@kuenishi kuenishi reopened this Jan 9, 2015
@kuenishi
Copy link
Contributor

kuenishi commented Jan 9, 2015

the AA

@kuenishi
Copy link
Contributor

quote state     RAW    escaped(1)  escaped(2)    example

                 *                               'baz/foo bar'        'baz/foo+bar'
client app       +--------+                      'baz/foo%20bar'      'baz/foo%2Bbar'
                          |
HTTP                      |
                          |
webmachine                |
before rewrite            |
rewrite                   +------------+         'baz%2Ffoo%2520bar'  'baz%2Ffoo%252Bbar'
after rewrite                          |
                                       |
extract_key      +---------------------+
                 |
                 v
                 *                               'baz/foo bar'        'baz/foo+bar'

@shino
Copy link
Contributor

shino commented Jan 13, 2015

Note: The above diagram is only on objects part of the path in case of "virtual host style" access. buckets part in "path style" and subresource part are different, not doubly escaped at any point of time.

borshop added a commit that referenced this pull request Jan 13, 2015
Fix inaccurate s3 rewrite rule.

Reviewed-by: kuenishi
@ksauzz
Copy link
Contributor Author

ksauzz commented Jan 13, 2015

@borshop merge

@shino
Copy link
Contributor

shino commented Jan 13, 2015

This also fixed #977 which described the case of % character and subsequent [0-9a-zA-Z]{2} as well as + sign (in client application layer).

@ksauzz ksauzz deleted the bugfix/rewriting-path-inaccurately branch January 13, 2015 06:58
@Basho-JIRA
Copy link

Reopening to get on sprint backlog - will close again shortly.

_[posted via JIRA by Deborah Rakow]_

@Basho-JIRA
Copy link

Re-closing after adding to sprint

_[posted via JIRA by Deborah Rakow]_

kuenishi added a commit that referenced this pull request Jan 15, 2015
kuenishi added a commit that referenced this pull request Jan 16, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants