Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for S3 CORS configuration policies #63

Merged
merged 7 commits into from
Feb 6, 2018

Conversation

kherock
Copy link
Collaborator

@kherock kherock commented Mar 20, 2017

(Close #59 if you decide to merge this as this tracks off those changes)

This uses Express's cors package to support customizing what headers the S3 server exposes/allows. It shouldn't affect existing configurations as the --cors option doubles as a boolean to enable the basic CORS config and as a file path to a full XML configuration.

@kherock kherock changed the title Support for S3 CORS configuration policy Support for S3 CORS configuration policies Mar 20, 2017
lib/index.js Outdated
this.app = new App(this.options);
}

for (var option in defaultOptions) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like this for loop for dynamically adding methods.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't really see the point in having all those setters in the first place. I guess I just felt better by sticking them all in a loop so they weren't wasting so much space. I'll leave them alone when I fix this up.

Copy link
Collaborator

@n1ru4l n1ru4l Jan 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we can drop those since they can be set via the options of the constructor anyway... 😅

lib/index.js Outdated
for (var option in defaultOptions) {
S3rver.prototype['set' + option.charAt(0).toUpperCase() + option.slice(1)] = function (value) {
this.options[option] = value;
this.app = new App(this.options);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a new app instance after every set option?

@n1ru4l
Copy link
Collaborator

n1ru4l commented Jan 24, 2018

@specialkk Could you please rebase, add tests and explain your motivations behind creating a new app instance on every set option?

@kherock
Copy link
Collaborator Author

kherock commented Jan 24, 2018

Whoa, we have a maintainer again!

I actually haven't really looked at this since I messed with it almost a year ago. Looking at it now I think it might have been a crappy shortcut to get it the way I needed it to work.

I'll probably end up rewriting most what I've done here when I rebase, hopefully things will seem a bit tidier. Hopefully I'll have some time this week.

kherock added a commit to kherock/s3rver that referenced this pull request Jan 25, 2018
@kherock
Copy link
Collaborator Author

kherock commented Jan 25, 2018

All right, I'm all done, lemme know what you think. I also had time to stick in a test for range requests!

I did kind of go crazy with the tests file but I think it's better now. I made some adjustments to make things slightly more compatible with Windows (you can actually see what individual tests are failing now). I also have the main test suite and the static site suite reuse the same client and server instance across tests. Lastly I moved the tests you added recently to the top so that they match mocha's execution order.

@kherock
Copy link
Collaborator Author

kherock commented Jan 25, 2018

I rebased, everything is let/const now. pls approve 😅

test/test.js Outdated
const buckets = ['bucket1', 'bucket2', 'bucket3', 'bucket4', 'bucket5', 'bucket6'];
let s3rver;
let s3rver, s3Client;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not do this, each variable in one line.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right. I usually only do single line when I'm only declaring variables without initializing them, I'll make sure I didn't do it anywhere else.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will definitely have to setup eslint and prettier :D

@n1ru4l
Copy link
Collaborator

n1ru4l commented Jan 25, 2018

Is there not possibility to make the tests work on windows? There must be some kind of a workaround?

@kherock
Copy link
Collaborator Author

kherock commented Jan 25, 2018

It's been a mostly ignored issue since Windows Explorer itself sometimes has issues recursively deleting directories with files that haven't been completely released from their locks. It might be able to be worked around by trying to remove the directory again after waiting a bit since the OS is still working on releasing the locks. I haven't had any major issues with normal end use.

Based on what I've read it's probably that Node.js needs to use better suited syscalls for whatever it's trying to do. Windows obviously supports it since running everything under Linux Subsystem works just fine.

@kherock
Copy link
Collaborator Author

kherock commented Jan 28, 2018

I've recently done a major refactor of the filesystem store that fixes all tests on Windows! I'll submit a PR once this gets merged in.

@n1ru4l
Copy link
Collaborator

n1ru4l commented Jan 28, 2018

@leontastic Any final words?

@leontastic
Copy link
Collaborator

leontastic commented Jan 29, 2018

@specialkk I appreciate the PR! I am reviewing your code right now, I appreciate your patience as I read through all of it.

There are many concerns being handled here besides merely CORS configuration, which makes the intent of your changes less transparent (harder to make accurate and complete release notes) and increases the cognitive load of performing a review. I would encourage you to split your work into multiple PRs labelled with individual concerns (the smaller the better!) in the future for easier maintenance, so that we can merge your work in more quickly and other contributors can parse your work easily in retrospect.

Great job nonetheless, I'll be back with review comments within the next day or two.

Copy link
Collaborator

@leontastic leontastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concerns are regarding the way CORS configurations are being loaded, and the significant changes you've made to the test suite. I won't enforce any specific changes if you can address my feedback with comments.

Thanks @specialkk!

lib/index.js Outdated
@@ -40,18 +21,38 @@ S3rver.prototype.resetFs = function (callback) {
});
}

S3rver.prototype.callback = function () {
return new App(this.options);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you want to be able to get an express middleware out of an existing S3rver instance here. Could you name the function something more descriptive than callback? Something like getMiddleware would make it more obvious what this function does.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the new is necessary, return App(this.options) should suffice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also be great to see some tests and documentation for this feature. I can imagine many users wanting to use this module as an express middleware.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha it's called .callback() because I'm anticipating it eventually being a Koa wrapper. Ideally once we do Koa, we can also add a getter for .middleware so that the app can be mounted directly. And of course Express users can still mount the result of .callback().

I think it would be good to expand the running s3rver programmatically section in the README that explains these things.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I prefer middleware because this library is an Express app in its current state. But I won't argue with you on any of the semantics here, as long as you document it I will accept it.

I can see how naming it middleware could be confusing to Koa users (who could be expecting a Koa middleware) while callback would be confusing to Express users (who would expect it to be called "middleware")  – so I think making both semantics available to users by aliasing getMiddleware or a middleware getter to this function is a good idea.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the alias getMiddleware

Just to clarify the middleware getter would be the array of middleware that helpers like koa-mount look for when mounting subapps.

const request = require('request');
const should = require('should');
const util = require('util');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 nice I love it when imports are ordered alphabetically

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot more of that coming 🙃

I tried to control myself and not lose too much focus

lib/app.js Outdated
({ CORSConfiguration } = parsed);
});
} else {
({ CORSConfiguration } = options.cors);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if options.cors is not an object here? I believe it's set to false by default in the CLI.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you mention it, we should probably have at least a couple tests for using the actual CLI, seeing that's probably how most people use this project.

Anyway I think I forgot to test that, I think you're right in that it will be set to false which can cause problems. I will have to make some changes there. I also found the way I have the default config specified kind of awkward since it can be either an object or a string. Do you think it would be better to use the path to the sample config xml as a default value instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about testing the actual CLI, I don't really have any ideas on how to write those tests right now though. Definitely need to check for false and null as opposed to all falsey values because I'd assume you'd want to set the default when the user merely leaves options.cors undefined.

I think using object by default is fine, users would need to do more searching to find the default config XML file rather than seeing it directly in the source code. However it would be great to see the default CORS configuration XML included or linked in the documentation so people can quickly copy and paste it into their own projects if they want to.

server.close = (callback) => {
const { close } = Object.getPrototypeOf(server);
return close.call(server, () => {
app.logger.unhandleExceptions();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

test/test.js Outdated

it('should merge default options with provided options', function () {
const s3rver = new S3rver({
hostname: 'testhost',
indexDocument: 'index.html',
errorDocument: '',
directory: '/tmp/s3rver_test_directory',
directory: tmpDir,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If directory is already set on defaultOptions is this needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I didn't really touch that section so I didn't think about that. I can fix that.

lib/index.js Outdated
CORSConfiguration: {
CORSRule: [{
AllowedOrigin: ['*'],
AllowedMethod: ['GET'],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under this default configuration, would POST/DELETE requests be denied by clients which make preflight requests and find only GET listed in the returned Access-Control-Allow-Methods headers?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that's correct? If you think that's problematic/confusing I'm ok with disabling CORS by default. I think most people needing to enable CORS are going to have their own config anyway.

In case you're wondering, this is just copy and pasted from the result of parsing the XML config you get after creating a new bucket.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to be certain about what these defaults do. I don't think most users will be expecting CORS protection to be enabled by default (this could be a point of frustration for them) but at the same time since most users are using this for integration testing it should resemble the real-world S3 as closely as possible. I'm partial towards the latter so I support leaving it as is.

@@ -445,6 +591,30 @@ describe('S3rver Tests', function () {
});
});

it('should get partial image from a bucket with a range request', function (done) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have liked to see this in a separate PR. However @specialkk thank you for adding this test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added that test because I'm the one that added that feature without ever creating a test for it! 😛

test/test.js Outdated
let s3rver;
let s3Client;

before('Start s3rver', function (done) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above regarding isolated test cases: @specialkk Can you please justify reusing dependencies across test cases, or roll back this change?

lib/app.js Outdated
res.header('Access-Control-Allow-Methods', req.headers['access-control-request-method']);
} else if (CORSConfiguration && CORSConfiguration.CORSRule) {
CORSConfiguration.CORSRule.some((rule) => {
// The Express CORS middleware is sync, so this finishes before next() is called
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a clearer comment would be:

we only use 'cors' to append headers to 'res'
since CORS middleware is sync, call 'next' immediately after running middleware for each rule

lib/app.js Outdated
exposedHeaders: rule.ExposeHeader,
maxAge: (rule.MaxAgeSeconds || [])[0]
})(req, res, () => {});
return !!res.getHeader('Access-Control-Allow-Origin');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the Access-Control-Allow-Origin header is added by cors, this return statement will interrupt the loop and skip all subsequent rules (in other words, it only processes the configuration for the first origin). When I read https://docs.aws.amazon.com/AmazonS3/latest/dev/cors.html it seems that configs for multiple origins could be specified and cors should only be run for the configuration matching the request origin, if it exists, rather than just the first rule. This deviates from the S3 documentation you reference as I understand it. Please correct me if I've misunderstood!

Copy link
Collaborator Author

@kherock kherock Jan 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read the first sentence of the section at the bottom!

How Does Amazon S3 Evaluate the CORS Configuration On a Bucket?
When Amazon S3 receives a preflight request from a browser, it evaluates the CORS configuration for the bucket and uses the first CORSRule rule that matches the incoming browser request to enable a cross-origin request. For a rule to match, the following conditions must be met:

The cors middleware will only set headers when the conditions outlined there match (origin, exposed headers, etc.), so I think I'm still correct in my understanding?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the exit condition here should be whether the allowed origin matches rule.AllowedOrigin === req.get('origin'), not whether the Access-Control-Allow-Origin header was set. As long as the first rule has any AllowedOrigin here then the header will be set, hence the logic you've added here will use the rules for the first rule rather than the first rule matching the origin of the incoming browser request.

To be clearer about the intent this logic (such that there's no ambiguity as to what's going on here and make it obvious if I'm misinterpreting here), I think it would be better to use a lodash.find on the CORSRule list to explicitly retrieve the first rule matching the request origin and running cors with that rule, rather than putting it in a .some call and returning your exit condition (really no different from breaking out of a for loop).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of misspoke, the exit condition is testing that Access-Control-Allow-Origin is a truthy value (the cors middleware sets it to false if it doesn't match).

This is only basic support for CORS configurations, I offloaded most of the implementation to the cors middleware which is a bit imprecise compared to the real S3 response. If you need me to support everything involved in matching a CORSRule (such as partial wildcard *.example.com or http://*.example.com) I can see about that. I'd probably just ditch the express middleware entirely at that point.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I wouldn't have known that without knowing cors internals, I'd appreciate a comment on the expectations for the exit condition since they are opaque side effects of an external library. Honestly I'm OK with this as-is, but of course any extra work to support wildcards would be amazing too. Thank you for clarifying.

@leontastic
Copy link
Collaborator

@specialkk No more feedback from me, you have either addressed all my concerns or described the additional changes you are OK with making. Thank you so much for the replies and your contribution! Look forward to seeing the new changes and approving and merging this PR.

@leontastic
Copy link
Collaborator

@specialkk If you are interested in becoming a collaborator (given that you've made some big contributions already) maybe request an invitation from @jamhall in #77, I'd be happy to have you on board given the diligence you show in your contributions and discussion.

@jamhall
Copy link
Owner

jamhall commented Feb 1, 2018

@specialkk - I am absolutely for the idea that @specialkk becomes a collaborator. @specialkk - let me know if you would like this. Thank you to everyone for your hard work! :)

@kherock
Copy link
Collaborator Author

kherock commented Feb 1, 2018

@jamhall I'm happy to help - send me the invitation when you can!

@jamhall
Copy link
Owner

jamhall commented Feb 1, 2018

@specialkk - Sent! :)

Adds S3rver.prototype.callback() which can be passed to http.createServer or treated as a normal Express app.
This also removes the workaround added when using _.merge to merge the passed in options with defaults.
The 'cors' option can now be set to a raw XML string containing an S3 CORS policy.
Amazon's default CORS policy is now applied by default. All documented aspects of the configuration file including wildcard origins and headers are supported.
Updated README to note that Windows isn't completely supported yet.
Winston's logger instance is now accessible as app.logger and is closed with the server. This also cleans up a several tests (mainly the static site suite) that needed some touch-ups.
@kherock
Copy link
Collaborator Author

kherock commented Feb 1, 2018

Apologies for the push spam, there were some small things I kept forgetting to include (and a couple commit message typos -_- ).

I managed to get a rather complete implementation for CORS configuration done. It supports wildcard characters in request origins and allowed headers, and I also added in several error responses for OPTIONS requests. I also dialed back my changes to the tests file a ton. The diff should be much smaller now.

@kherock
Copy link
Collaborator Author

kherock commented Feb 4, 2018

FYI @leontastic this is all done, I assume the 👍 means you approve but I'll still let you have the final word on if this can be merged.

Copy link
Collaborator

@leontastic leontastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@specialkk Thanks for going the extra mile to roll out a custom CORS handler instead of the cors package. I'm willing to merge this as-is if you can address my question about the OPTIONS request – the rest is up to you if you feel like polishing these changes. Thanks for your patience!

});
```

### s3rver.callback() ⇒ `function (req, res)`
*Also aliased as* **s3rver.getMiddleware()**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

| cors | `string` \| `Buffer` | [S3 Sample policy](test/resources/cors_sample_policy.xml) | Raw XML string or Buffer of CORS policy |
| indexDocument | `string` | | Index document for static web hosting |
| errorDocument | `string` | | Error document for static web hosting |
| removeBucketsOnClose | `boolean` | `false` | Remove all bucket data on server close |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎖 hero

}

// S3 only checks if CORS is enabled *after* checking the existence of access control headers
if (!CORSConfiguration) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the user does not specify a CORS configuration, but their client still sends a preflight OPTIONS request? The user would get a 403 where I think we would expect it to be lenient (we want to make integration tests easier wherever we allow users to omit preconditions). Should there be a check to determine if a config was specified before running this logical branch under if (req.method === 'OPTIONS')?

If this is how S3 behaves when no CORS config is specified, or if we have a default applied when none is specified, then leave a note and keep this as-is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 403 response is correct based on what I was testing, also documented here. I'm not sure if it would get in the way of integration tests since the default sample policy should be lenient enough.

res.header('Vary', 'Origin, Access-Control-Request-Headers, Access-Control-Request-Method');
}
}
next();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this middleware, it looks fantastic and fairly easy to extend/maintain

port: 4578,
hostname: 'localhost',
silent: false,
cors: fs.readFileSync(path.resolve(__dirname, '../test/resources/cors_sample_policy.xml')),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For peace of mind that future collaborators don't update test/resources for other purposes and break this, I think the default CORS policy should be specified in a folder outside of test or as an in-memory object. @specialkk Do you have an opinion on this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to make it in-memory since I have it linked in the README. I honestly only have it there because it felt out of place throwing it in the lib folder. Keeping it at root level would make enough sense if the test/resources location seems too volatile.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll keep an eye out, tests should fail if the config gets moved/deleted

if (err) return done(err);

response.statusCode.should.equal(200);
response.headers.should.have.property('access-control-allow-origin', origin);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@specialkk Thank you for going the extra mile to verify the actual test conditions pass.

});
});
});

describe('S3rver Tests with Static Web Hosting', function () {
let s3Client;
let s3rver;
beforeEach(function (done) {

beforeEach('Reset site bucket', resetTmpDir);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 wonderful


done();
before('Initialize bucket', function (done) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should add a cleanup step to this test suite? So as to not pollute the contents of the tmp directory for other suites (even if the others currently clean up the tmp directory before running)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say no, only because other tests should be cleaning up the directory they're using before running anyway, otherwise I'd consider that a form of dependence on other tests. Additionally we shouldn't have to be too concerned with the intermediate test files left behind in the tmpdir since the OS should clean them up on its own.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think tests should clean up after themselves, but we can revisit this in a future test cleanup.

}
describe('S3rver CORS Policy Tests', function () {
const bucket = 'foobars';
let s3Client;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also share S3rver as a dependency across tests so you can add an after hook for cleanup, and afterEach hook to close the server so that you don't need to do it inside each individual test.

s3Client.putObject(params, function (err, data) {
if (err) return done(err);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adhering to the linting rules as they are... they're not pretty but updating them is for a future PR.

@leontastic leontastic merged commit 16b231a into jamhall:master Feb 6, 2018
@leontastic
Copy link
Collaborator

@specialkk Thanks for your work on this huge PR, looking forward to maintaining this repo with you going forward.

@kherock
Copy link
Collaborator Author

kherock commented Feb 6, 2018

@leontastic I really appreciate all the positive/constructive feedback, I'm glad this finally got added! You're doing a great job with the repo.

@leontastic leontastic mentioned this pull request Feb 8, 2018
@kherock kherock deleted the cors branch February 13, 2018 23:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants