-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Define Rack::Builder::Config, and support middleware.rackup configuring middleware based on server configuration #1720
Conversation
I generally like this idea, I'm not strongly in favour of any particular approach, but I would like to explore a bit more about how this works in practice. The idea of a config object is nice since it isolates the responsibilities, but it also introduces additional complexities. However, I think the net gain for users is that it simplifies more complex construction of the application and provides a very well defined interface for the kinds of things people should depend on. The only thing I wonder about is whether we should allow config to contain user specific keys/values. I imagine that such a model could be really useful but at the expense of opening up the design to more complexity. However, the advantage would be that users could depend on a configuration context for their own usage rather than having multiple sources of truth for configuration. |
@jeremyevans do you mind rebasing this on master? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great. Just some points for us to discuss and iron out.
CHANGELOG.md
Outdated
@@ -34,6 +34,10 @@ All notable changes to this project will be documented in this file. For info on | |||
|
|||
- [[CVE-2020-8184](https://nvd.nist.gov/vuln/detail/CVE-2020-8184)] Do not allow percent-encoded cookie name to override existing cookie names. BREAKING CHANGE: Accessing cookie names that require URL encoding with decoded name no longer works. ([@fletchto99](https://github.com/fletchto99)) | |||
|
|||
### Removed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add a note regarding the addition of the new configuration class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can do that. Note that most users won't be using it, only middleware/server authors.
README.rdoc
Outdated
@@ -77,7 +77,6 @@ middleware: | |||
* Rack::Files, for serving static files. | |||
* Rack::Head, for returning an empty body for HEAD requests. | |||
* Rack::Lint, for checking conformance to the \Rack API. | |||
* Rack::Lock, for serializing requests using a mutex. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We didn't end up removing this right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, I'll restore it.
lib/rack/builder.rb
Outdated
# Config stores settings on what the server supports, such as whether it | ||
# is multithreaded. | ||
class Config | ||
def initialize(multithread: true, reentrant: multithread) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should have a more generic interface **options
so that different servers can pack different information into this configuration.
If we choose to do this, maybe we should also expose attr :options
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea here is that server authors can subclass this class to provide a richer API if they need to. Currently, the only use for the configuration is for concurrency. The issue with **options
is that it turns typos into silent failures.
lib/rack/builder.rb
Outdated
|
||
# Re-entrancy is a feature of event-driven servers which may perform non-blocking operations. When | ||
# an operation blocks, that particular request may yield and another request may enter the application stack. | ||
def reentrant?; @reentrant; end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you feel about this configuration option name?
I started wondering if we should call it multifiber
or even replace multithread
with parallel?
and reentrant?
with concurrent?
. multithread?
sounds like an implementation detail.
I think we should try to iron this interface out as it's going to be important going forward that it means exactly what we want it to mean for the servers that need to support it. We might want to add a table showing potential values for these configuration options w.r.t. servers and their respective configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If multithread?
is an implementation detail, so is multifiber?
. parallel?
is not accurate on CRuby, since Ruby code does not execute in parallel on CRuby (though that is also an implementation detail). Replacing reentrant?
with concurrent?
seems like a good idea to me, so I'll make that change.
I would like to get rid of multithread?
, but I think it it is necessary until we drop Ruby 2.7 support. That is because on Ruby 2.7 and below, you cannot use Rack::Lock in the concurrent but not multithread case, since it uses Mutex internally.
lib/rack/builder.rb
Outdated
builder = self.new(config: config) | ||
|
||
# Create a top level scope with self as the builder instance: | ||
binding = TOPLEVEL_BINDING.eval('->(builder){builder.instance_eval{binding}}').call(builder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not particularly important, but in another project I moved TOPLEVEL_BINDING.eval('->(builder){builder.instance_eval{binding}}')
into a separate constant, i.e.
# Top level of file:
module Rack; end
Rack::BUILDER_CONTEXT = ->(builder){builder.instance_eval{binding}}
# ...
binding = BUILDER_CONTEXT.call(builder)
@eregon this should be the same thing right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems fine to me. I'd like the constant to be under Rack::Builder
. So I plan to use Rack::Builder::EVAL_CONTEXT
?
lib/rack/lock.rb
Outdated
@mutex.unlock | ||
@env[RACK_MULTITHREAD] = @old_rack_multithread | ||
def self.rackup(config, app) | ||
if config.multithread? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fiber scheduler now correctly handles locks so this might be relevant no matter what? i.e. we might prefer to use reentrant?
as it's currently defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do this for concurrent?
, but I think only on Ruby 3+. I'll modify the code to handle that.
b67c697
to
4846141
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ioquatix I pushed a new commit and tried to respond to all of the points you raised.
CHANGELOG.md
Outdated
@@ -34,6 +34,10 @@ All notable changes to this project will be documented in this file. For info on | |||
|
|||
- [[CVE-2020-8184](https://nvd.nist.gov/vuln/detail/CVE-2020-8184)] Do not allow percent-encoded cookie name to override existing cookie names. BREAKING CHANGE: Accessing cookie names that require URL encoding with decoded name no longer works. ([@fletchto99](https://github.com/fletchto99)) | |||
|
|||
### Removed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can do that. Note that most users won't be using it, only middleware/server authors.
README.rdoc
Outdated
@@ -77,7 +77,6 @@ middleware: | |||
* Rack::Files, for serving static files. | |||
* Rack::Head, for returning an empty body for HEAD requests. | |||
* Rack::Lint, for checking conformance to the \Rack API. | |||
* Rack::Lock, for serializing requests using a mutex. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, I'll restore it.
lib/rack/builder.rb
Outdated
# Config stores settings on what the server supports, such as whether it | ||
# is multithreaded. | ||
class Config | ||
def initialize(multithread: true, reentrant: multithread) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea here is that server authors can subclass this class to provide a richer API if they need to. Currently, the only use for the configuration is for concurrency. The issue with **options
is that it turns typos into silent failures.
lib/rack/builder.rb
Outdated
|
||
# Re-entrancy is a feature of event-driven servers which may perform non-blocking operations. When | ||
# an operation blocks, that particular request may yield and another request may enter the application stack. | ||
def reentrant?; @reentrant; end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If multithread?
is an implementation detail, so is multifiber?
. parallel?
is not accurate on CRuby, since Ruby code does not execute in parallel on CRuby (though that is also an implementation detail). Replacing reentrant?
with concurrent?
seems like a good idea to me, so I'll make that change.
I would like to get rid of multithread?
, but I think it it is necessary until we drop Ruby 2.7 support. That is because on Ruby 2.7 and below, you cannot use Rack::Lock in the concurrent but not multithread case, since it uses Mutex internally.
lib/rack/builder.rb
Outdated
builder = self.new(config: config) | ||
|
||
# Create a top level scope with self as the builder instance: | ||
binding = TOPLEVEL_BINDING.eval('->(builder){builder.instance_eval{binding}}').call(builder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems fine to me. I'd like the constant to be under Rack::Builder
. So I plan to use Rack::Builder::EVAL_CONTEXT
?
lib/rack/lock.rb
Outdated
@mutex.unlock | ||
@env[RACK_MULTITHREAD] = @old_rack_multithread | ||
def self.rackup(config, app) | ||
if config.multithread? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do this for concurrent?
, but I think only on Ruby 3+. I'll modify the code to handle that.
lib/rack/builder.rb
Outdated
@@ -31,6 +38,30 @@ module Rack | |||
# You can use +map+ to construct a Rack::URLMap in a convenient way. | |||
|
|||
class Builder | |||
# Config stores settings on what the server supports, such as whether it | |||
# is multithreaded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it would make sense for us to explain that we expect severs may sub-class this to provide additional details?
Also, considering this usage, does it make more sense to provide a generic options hash?
Otherwise, I imagine we might have:
def rackup(config, app)
if config.respond_to?(:puma_thing)
puma_thing = config.puma_thing
...
While I'm okay with the general idea of a strong interface, I'm a little concerned about how we should use it if puma/falcon/thin/unicorn start adding server-specific interfaces.
We should also consider the case where Rack is an interface rather than a gem. To this end, it feels like options
is a stronger contender since it's simpler. But it's also less pleasant to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, I prefer your design, but I think we need to consider some specific usage scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the configuration would only be used for Rack::Lock. I'm not aware of any other use case, so making it more flexible and prone to silent failures doesn't seem like a good trade-off. If you have ideas for how this will be used by servers, that information would definitely be helpful. That said, I'm not strongly against the options hash approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rack for the most part tries to separate "Rack the SPEC" and "Rack the GEM". This configuration interface touches both worlds. I personally think Rack would benefit from richer interfaces, but I also appreciate the original design which was servers could follow "Rack the SPEC" without depending on "Rack the GEM". IIRC, Passenger does not pull in "Rack the GEM" and instead just follows the spec. In that case, @FooBarWidget would need a bespoke implementation of Rack::Builder::Config
but it's not specified anywhere except for the implementation in the GEM. Maybe this is a longer term project - we should define config.ru
- i.e. use
, run
, the context for evaluation, the builder and config interfaces, etc.
Frankly it feels a lot harder to define an interface like Rack::Builder::Config
- the building blocks we've used for "Rack the SPEC" has been Hash, Array, String, and other simple types. In any such spec, I guess we would leave the actual class as anonymous and just define the interface, e.g.
Your application may respond to `rackup` in which case it will be invoked with a configuration object which contains at least the following interface:
multithread? -> true | false
concurrent? -> true | false
I guess this all hinges on how much we define config.ru
and Rack::Builder
as "SPEC" vs "GEM". cc @tenderlove your input would be useful too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for the middleware users and server implementers using hashes for this config would be better. It is easier to check for a hash value than it is to check if the config object respond to some method the only some server configs provide.
For what I get of this implementation, the only reason why a hash as config would not work right now is because the object allows servers to implement the rackup
method, but do we see them needing this kind of feature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Rack::Builder
directly has the same issues that using Rack::Builder::Config
does in terms of SPEC. I agree from a SPEC perspective, rack avoids custom objects. However, config.ru
was never part of SPEC (which is only regarding the rack protocol), so I'm not sure it applies.
The issue with using a plain hash is that it won't have the equivalent of Config#rackup
, so middleware cannot pass the configuration to other middleware loaded by that middleware. However, maybe there are no such middleware that actually need that. Switching to a plain hash seems fine, I can work on that change if that's the direction we want to go.
However, maybe we should rethink the idea of middleware autoconfiguration based on configuration parameters? Is it really needed? Maybe we can just have Rack::Lock
always lock, and users who know their apps are not concurrency-safe can use it. If they use it on a non-concurrent webserver (e.g. Unicorn), it will use a lock when it doesn't need to, but uncontested mutex is not slow, so there is no real problem with doing so. What are your thoughts on abandoning the idea of middleware configuration? We would still remove rack.multithread
/rack.multiprocess
/rack.run_once
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are your thoughts on abandoning the idea of middleware configuration? We would still remove
rack.multithread
/rack.multiprocess
/rack.run_once
.
I'm positive for that idea. To be fair the only middleware that I saw which uses this is Rack::Lock, which Rails doesn't include anymore by default and uses it own config config.allow_concurrency
to configure it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we still expose Builder#options
for server configuration/specific details at load time. Otherwise there is no per-builder side channel for server configuration/details at all. We don't need to handle the rackup
model in the same PR, if it's too hard, we can drop it. Users can write use MyMiddleware.rackup(self)
if they want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the need for a per-builder side channel. The complexity isn't worth it, IMO. Looks like @rafaelfranca agrees. So unless another committer is in favor of adding it, we should probably just commit the removal of rack.multithread
/rack.multiprocess
/rack.run_once
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let's do that to start with.
These variables generally come too late to be useful. Removed `Rack::Lock` which depends on these variables.
This adds Rack::Builder::Config, which stores information on the server's configuration, such as whether it is multithreaded or supports reentrancy. Middleware can use the configuration by defining a rackup method in addition to a new method. If the rackup method is defined, it is called instead of the new method, with the configuration as the first argument and with the remaining arguments and block the same as what would be passed to new. In cases where the server's configuration indicates the middleware is not needed, the middleware rackup method can be just return the app itself. To handle the very rare case where a middleware would want to delegate to other middleware in certain server configurations, and doesn't know whether the other middleware supports the rackup method or not, The configuration object supports a rackup method, which will call rackup on the middleware if defined, or new otherwise. So if middleware A wants to use external middleware B in a certain server configuration, middleware A's rackup method could be something like: def self.rackup(config, app) if config.multithread? config.rackup(MiddlewareB, app) else new(app) end end The configuration rackup method is also used internally to implement the builder. This readds Rack::Lock, using the rackup method, which only uses the Rack::Lock middleware if the configuration indicates the server is multithreaded. The advantage of this approach is that it doesn't require exposing the entire builder API to the middleware, it only exposes the server configuration, which is all the middleware should need to appropriately configure itself.
Avoids the need to use TOPLEVEL_BINDING.
… on Ruby 2 Ruby 2 Mutex only works for multiple threads, not multiple fibers. Such apps are probably still broken at runtime unless they are using their own locks, but this matches the Rack 2 behavior.
This correctly doesn't release the lock until after the body is closed. Restore the related tests as well. Just make changes to avoid use of rack.multithread.
4846141
to
a8a0459
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great.
When using Rack >= 3.0.0 you get this error message on boot an app with Flipper UI : ``` warning: Rack::File is deprecated and will be removed in Rack 3.1 ``` This PR detects if `Rack::Files` is defined and tries to use it instead. Source : rack/rack#1720
These constants were removed from Rack in rack/rack#1720
These constants were removed from Rack in rack/rack#1720
This is an alternative approach to #1718. Instead of creating a new
builder for every middleware that responds to
rackup
, we have asingle configuration object that stores the server's configuration,
such as whether it is multithreaded. If middleware.rackup is defined,
we call it with the configuration object as the first object, and the
remaining arguments and block that would have been passed to new.
The middleware.rackup method can return the app itself if it not
needed, call new with the remaining arguments if it is needed, or
potentially other actions for more complex cases.
To handle the very rare case where a middleware would want to
delegate to other middleware in certain server configurations, and
doesn't know whether the other middleware supports the rackup
method or not, The configuration object supports a rackup method,
which will call rackup on the middleware if defined, or new
otherwise. So if middleware A wants to use external middleware B
in a certain server configuration, middleware A's rackup method
could be something like:
The configuration rackup method is also used internally to implement
the builder.
This readds Rack::Lock, using the rackup method, which only uses
the Rack::Lock middleware if the configuration indicates the
server is multithreaded.
The advantage of this approach is that it doesn't require exposing
the entire builder API to the middleware, it only exposes the
server configuration, which is all the middleware should need to
appropriately configure itself.