-
Notifications
You must be signed in to change notification settings - Fork 252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite the entire notion of named overrides (breaking change) #383
Conversation
We ran into a bug where overrides weren't working as expected. The user expectation is that an override should override a config value for whatever the current configuration for the current request. But, as implemented, they actually were just forks of a base config..regardless of whether that base config changed during runtime. So, overrides no work that way and we don't currently support any way of using an alternate config. This might be added back later if it is useful. While making the changes related to overrides it became obvious that caching was a huge source of complexity. So, this also removes all caching. A quick local benchmark shows that a non-cached config takes about .18ms to build out a set of headers, so it shouldn't be a big concern.
The current code stored named configs in class instance hash. Since the only config that should ever exist is now the default config (all other cases are handled by dynamic overrides and not configs), we remove all vestiages of the configuration hash lookup.
While the normal expected use case for an override is to apply it to a request, there are uses cases where we might want to apply an override to a non-request context. For example, we have a script that runs outside of the main rails app to generate some static configs we copy/paste into some static files. It would look something like this: ``` _, static_csp_policy = SecureHeaders::CSP.make_header(SecureHeaders::Configuration.get.apply_overrides(:static_file_policy).csp, nil) ... %q(<meta http-equiv="Content-Security-Policy" content="#{static_csp+policy"}) ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some misc context for the changes.
@@ -18,6 +17,7 @@ | |||
require "secure_headers/view_helper" | |||
require "useragent" | |||
require "singleton" | |||
require "secure_headers/configuration" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The configuration now references some of the individual header objects..so it had to move.
@@ -52,29 +52,9 @@ def opt_out? | |||
HTTPS = "https".freeze | |||
CSP = ContentSecurityPolicy | |||
|
|||
ALL_HEADER_CLASSES = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all the caching and stuff has been removed I needed up moving some of the "make me a header" logic down into the actual Configuration
object. So, these classes are defined in https://github.com/twitter/secureheaders/pull/383/files#diff-5b9420dafe46fa9c6c8ac3415ab06102R91. Let me know if you hate it.
@@ -168,39 +148,26 @@ def opt_out_of_all_protection(request) | |||
def header_hash_for(request) | |||
prevent_dup = true | |||
config = config_for(request, prevent_dup) | |||
headers = config.cached_headers | |||
config.validate_config! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were only validating the config during the initial Configuration.defautl { ... }
. But, we never did any checks when an override is applied. So, before we actually send out the header, we validate again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was somewhat intentional, for "performance" reasons. A goal was to make it hard to do the wrong thing at runtime. But I'm fine with this too.
user_agent = UserAgent.parse(request.user_agent) | ||
headers = config.generate_headers(user_agent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made the generate_headers
ignorant to HTTPS. We could send additional data into make_headers
to let it be smart about HTTPS.
end | ||
|
||
if !config.csp_report_only.opt_out? && config.csp_report_only.modified? | ||
headers = update_cached_csp(config.csp_report_only, headers, user_agent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay for killing off this update_cached_csp
business 😛
@@ -48,7 +46,7 @@ def validate_config!(config) | |||
# | |||
# Returns a String of quoted values that are comma separated. | |||
def make_header_value(types) | |||
types.map { |t| "\"#{t}\""}.join(", ") | |||
types.map { |t| "\"#{t}\"" }.join(", ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linter told me that a few locations needed a small style update.
def merge(new_hash) | ||
ContentSecurityPolicy.combine_policies(self.to_h, new_hash) | ||
new_config = self.dup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oreoshake - I know we had chatted about the expectations of merge
. But, given that merge!
was overwriting existing values, it seems like merge
should to the same (or merge!
should do the opposite). It seems like append
should be used for any kind of "combine these policies" use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should move as far away from treating this like a hash as possible. Using verbs like append/combine/overwrite would be clearer. But we'll have to alias and deprecate first.
@@ -18,7 +18,7 @@ module SecureHeaders | |||
# (pulled from README) | |||
config = { | |||
# "meta" values. these will shape the header, but the values are not included in the header. | |||
report_only: true, # default: false | |||
report_only: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm now validating report_only
inside the actual config object. So, we can't go setting report_only: true
unless we are going to be validating it with a ContentSecurityPolicyReportOnlyConfig
.
@@ -60,6 +60,24 @@ module SecureHeaders | |||
expect(hash["X-Frame-Options"]).to be_nil | |||
end | |||
|
|||
it "Overrides the current default config if default config changes during request" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test exercises the bug in #380.
@@ -116,7 +131,7 @@ module SecureHeaders | |||
firefox_request = Rack::Request.new(request.env.merge("HTTP_USER_AGENT" => USER_AGENTS[:firefox])) | |||
|
|||
# append an unsupported directive | |||
SecureHeaders.override_content_security_policy_directives(firefox_request, {plugin_types: %w(flash)}) | |||
SecureHeaders.override_content_security_policy_directives(firefox_request, {plugin_types: %w(application/pdf)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are now validating the config before we generate headers we need to use a legit value for plugin_types
.
My personal opinion: |
I'm definitely not looking for an opportunity to add a caching bug if I don't need to. 👍 |
I would love some input from people more concerned about performance. |
@oreoshake some super hacky quick benchmarking on my local machine says that a cached set of headers takes 30ms for 1000 invocations (.03ms/call) vs. 170ms for 1000 invocations uncached (.17ms/call). So, it definitely is more expensive. But, I have no clue if .17ms is something to sweat over (so says 10K libraries that each add on their own .17ms ). |
Emphasis added. Anecdotally, non-caching code that generates headers is pretty common and does well on high traffic websites. I think it's more important for lightweight APIs with static policies ( |
Once this is in a usable state (perhaps it already is?) I'm happy to run it within our CI suite to confirm correctness on our end and put it through some benchmarks. When we're happy with that, I can roll this out to our fleet and keep you posted on any performance changes. On a side note, I ❤️ reviewing PRs that actually use the description field to convey the intention and context. It's amazing. Thanks @ptoomey3 🙇 ! |
@jacobbednarz - I think this code could use a bit more review..but it does currently pass all the CI tests. So, it should work more or less for testing/benchmarking/etc.
Thanks! I tend to veer toward the side of over-explaining my changes 😄 |
@jacobbednarz did you notice the branch name? Patrick wants to jump to secure_headers, and I quote:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think this is a big improvement. I haven't touched the caching code since it was written and I still am trying to understand the details, but I don't have any major issues with the code. I'm happy to give it another pass too if you want to make more changes.
I'd love feedback from people whose code might break or use cases that become unsupported. This is a breaking change indeed, but my guess is that this won't affect many.
I guess this is a good time to get on a soapbox and say "because CSP can limit the functionality of your website, your test coverage would surely pick up any unexpected changes in CSP...right?"
@@ -168,39 +148,26 @@ def opt_out_of_all_protection(request) | |||
def header_hash_for(request) | |||
prevent_dup = true | |||
config = config_for(request, prevent_dup) | |||
headers = config.cached_headers | |||
config.validate_config! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was somewhat intentional, for "performance" reasons. A goal was to make it hard to do the wrong thing at runtime. But I'm fine with this too.
end | ||
hash[header_name] = value | ||
if request.scheme != HTTPS | ||
HTTPS_HEADER_CLASSES.each do |klass| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
attr_reader :cached_headers, :csp, :cookies, :csp_report_only, :hpkp, :hpkp_report_host | ||
# The list of attributes that must respond to a `make_header` method | ||
HEADERABLE_ATTRIBUTES = (CONFIG_ATTRIBUTES - [:cookies]).freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤣 "headerable"
@csp = new_csp | ||
when ContentSecurityPolicyConfig | ||
@csp = new_csp | ||
when Hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about removing this "could be an object, could be a hash" support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meh..I think choosing one or the other feels awkward in different places. For example, during initial config it feels really natural to do:
Configuration.configure do |config|
config.csp = {
default_src: %w(https: 'self')
}
end
But, it also feels super natural to do
Configuration.override(:footer) do |config|
config.csp = config.csp.merge(...)
So, I guess I tend to vote to leave it as is.
HEADERABLE_ATTRIBUTES.each do |attr| | ||
klass = CONFIG_ATTRIBUTES_TO_HEADER_CLASSES[attr] | ||
header_name, value = klass.make_header(instance_variable_get("@#{attr}"), user_agent) | ||
if header_name && value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we raise an exception if one of these is nil
? That would be a clear indicator of a bug but I don't want to bring someone's site down because of it.
def merge(new_hash) | ||
ContentSecurityPolicy.combine_policies(self.to_h, new_hash) | ||
new_config = self.dup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should move as far away from treating this like a hash as possible. Using verbs like append/combine/overwrite would be clearer. But we'll have to alias and deprecate first.
@@ -201,6 +201,7 @@ module ClassMethods | |||
# Returns a default policy if no configuration is provided, or a | |||
# header name and value based on the config. | |||
def make_header(config, user_agent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is user_agent required here but optional elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no good reason..only because we always pass it when it is used. But, for consistency it would be fine...eventually this is passed to a method that has a user_agent = OTHER
default value. I'll make it match the others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should have it (and the others) match the user_agent = OTHER
pattern though.
I've managed to get some performance metrics out of our application for this change. For these I've used derailed_benchmarks against the existing gem and the local branch that @ptoomey3 has pushed up. Current
Proposed
There is a bit of a difference here (especially double the allocated memory) so I think it's worth confirming whether this acceptable for others. This isn't a deal breaker for me but might be slightly concerning for the Hubbers. As for real requests, there is little difference to the end user response time (using
|
…n ContentSecurityPolicy
We don't expect public callers to actually access the default config directly. So, we hide it behind a private class method. There is only one caller in the entire codebase that is not a test. It does make tests a bit uglier, but is probably worth it for the benefits of hiding it from the public API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok..another round of changes. Mostly I just changed things such that there is no public was to fetch the default configuration object directly. It is now hidden behind a private class method and the only internal caller of it does the hacky Configuration.send(:default_config)
approach for accessing it. Gross yes, but Ruby doesn't give us great package level visibility primitives.
config = new(&block) | ||
add_noop_configuration | ||
add_configuration(DEFAULT_CONFIG, config) | ||
if defined?(@default_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the spirit of preserving the integrity of the default config I added this. It is my assumption that a caller should only ever being calling this configuration block a single time, correct? If it is called more than once it implies that we could be running into strange runtime behavior, since the call order could go something like:
Configuration.default
...
some runtime override
...
Configuration.default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is my assumption that a caller should only ever being calling this configuration block a single time, correct?
There were guards in place to prevent doing exactly that. Not sure if they survived the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But don't the tests do a bunch of Configuration.default {...}
between tests? That implies it was ok to reconfigure the default config a bunch of times, no?
# ever be called by internal callers (or tests) that know the semantics | ||
# of ensuring that the default config is never mutated and is dup(ed) | ||
# before it is used in a request. | ||
def default_config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This private class method makes tests uglier, but it seems worthwhile to preserve hiding the actual default config from public callers.
@@ -13,6 +13,7 @@ class ContentSecurityPolicy | |||
FALLBACK_VERSION = ::UserAgent::Version.new("0") | |||
|
|||
def initialize(config = nil, user_agent = OTHER) | |||
user_agent ||= OTHER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are defaulting user_agent
to nil
in various places I thought it made sense to also override user_agent
if nil
ends up being passed.
@@ -3,6 +3,11 @@ | |||
|
|||
module SecureHeaders | |||
describe PolicyManagement do | |||
before(:each) do | |||
reset_config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call this in more tests now that it is an exception to reconfigure the default configuration.
end | ||
|
||
describe "#combine_policies" do | ||
before(:each) do | ||
reset_config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For groups of tests that want to play with the default configuration we need to reset it so we don't get an AlreadyConfiguredError
@@ -183,7 +205,8 @@ module SecureHeaders | |||
non_default_source_additions = ContentSecurityPolicy::NON_FETCH_SOURCES.each_with_object({}) do |directive, hash| | |||
hash[directive] = %w("http://example.org) | |||
end | |||
combined_config = ContentSecurityPolicy.combine_policies(Configuration.get.csp.to_h, non_default_source_additions) | |||
default_policy = Configuration.send(:default_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup..this makes a handful of tests uglier, but I don't like making private internal data more visible simply for the sake of tests.
def clear_configurations | ||
@configurations = nil | ||
def clear_default_config | ||
remove_instance_variable(:@default_config) if defined?(@default_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't enough to nil
out the default config anymore since we are checking defined?(@default_config)
when we throw an AlreadyConfiguredError
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look at the latest round of changes and this is still looking good. None of the changes are objectionable and 👍 on somewhat hiding things
If we want to add caching back in for default policies, I think we should address that in a separate PR.
We'll need an upgrade guide, but for now we can just bake this on github.com and document each pain point.
@oreoshake - If you would take a quick read through the 6.0.0 upgrade docs in c0750da just let me know if there is something I missed, some idiom I didn't follow, or some more detailed guidance you think we should provide. |
All the unit tests for GitHub pass with these changes. But, I did notice I had refactor a few one-off call sites (used for local scripts used to generate CSP values for static files) in a less than glamorous way: Before: SecureHeaders::CSP.make_header(
SecureHeaders::Configuration.get(:static_file_policy).csp
) After: SecureHeaders::CSP.make_header(
SecureHeaders::Configuration.send(:default_config).dup.override(:static_file_policy).csp
) That is a pretty gross, but is admittedly a fringe usage of this whole library. So, we can either just accept that very few folks will want to do this OR we could add back in a method for fetching the default config. But, unlike the prior |
So hiding |
I would just be hiding the |
Hmm I guess without any actual information on how people are using this it seems that obvious breakage would be better than surprising breakage. 👍 to |
Ok..I added a |
I deployed this PR on a lab deployment of github.com and it looked good. The only change I had to make across the entire codebase was: before: SecureHeaders::Configuration.get(:static_file_policy).csp after SecureHeaders::Configuration.dup.override(:static_file_policy).csp As was noted somewhere above, we use the above in a script to manually generate a static file policy that we copy/paste into our static 404, 500, etc files. It is a pretty obscure usage of the library and I don't expect just about anyone is directly operating on configurations like this. Besides the above, no other changes were needed to get our tests passing and a deployment operational. I'm guessing that means most consumer of the library won't need to change any code🤞. |
Nice! Let's let this bake for a while in production. In the meantime, I'll cut a deprecation release for the upcoming API changes. |
I added a test and did a branch deploy of the PR I originally had deployed that unearthed #380. Everything seems to work correctly. I'll probably do another deploy to take a closer look/test at all endpoints that declare overrides later. |
I'm going to merge this in to get moving on a new release with a proper alpha since it seems to work even if not yet fully in production. |
Released in 6.0.0.alpha01 |
This PR is very likely not yet ready for a full review. But, I wanted to push up the work in progress to get a sense of whether the direction of it is palatable, since I am introducing some relatively foundational changes that will definitely break some existing uses.
#380 outlines a bug related to the current implementation of named overrides. There is an underlying assumption that named override is more or less a "fork" of a parent configuration. The parent configuration, at the point the override is defined, is duplicated and then the changes in the override are applied. The resulting configuration is statically stored in a class instance variable and can be retrieved when it is requested with
Configuration.get(:named_override
). Unfortunately, this model is directly at odds with how configurations work at runtime. Imagine the following scenario:script-src
Currently, the above scenario results in the appended
script-src
from not showing up. The reason is that, when we call the override, it applies the config that was generated when the override was defined. That config is simply a copy of the original default config with the overridden directives applied. So, the dynamic changes made to the default config are lost.In order to fix this we, unfortunately, need to redefine the entire notion of an override. Most callers would assume that an override will override the "current configuration" at runtime. As a result, we can't statically computer the override at the point it is defined. Instead, we need to treat overrides more like how we treat named appends. We store the provided block for an override in a hash that maps the name to the
Proc
. Then, when the override is requested, weinstance_eval
theProc
with the current configuration for the request.Also, during this PR it became evident that trying to cache the generated configs was incredibly complex and prone to error. This PR removes the caching and just computes the headers on demand for each request. We already hit this case a fair bit if one ever used per-request overrides anyway, since the cached values would get invalidated depending on which code path you went through on a given request. The computation overhead of making a header is relatively low and we don't expect it to be a concern. If there is a concern we could optimize the base case and only precompute the default config. But, if any changes are applied, then the precomputation is lost. As I'm typing this it sounds like that could be a nice win, since so few folks probably use dynamic policies. Thoughts?
fixes #380
All PRs: