New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speedup ActionView::OutputBuffer #45614
Conversation
879c5d6
to
6ca8332
Compare
fragment = output_buffer.slice!(pos..-1) | ||
fragment = output_buffer.to_s.slice!(pos..-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this change is quite right. Previously this was actually mutating the output buffer (it's a really strange thing to do). @seejohnrun and I ran into this as well when experimenting with modifying outputbuffer, and tried replacing what this method does capture
https://github.com/jhawthorn/rails/pull/1/files#diff-7e8f6bc4873643df271aa007b513b993a8e424964b76e4eec2c1d156d9108ac3R252 (but I don't remember if it works)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look more into it tomorrow, but I was under the impression that mutating the buffer was simply an unneeded side effect. I'm pretty sure that buffer instance is no longer used after that slice!
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok so I was wrong. This slice!
is to handle formats such as .builder
, as it doesn't access @output_buffer
, it is initialized once with it.
So in cache do
we can't just substitute the buffer, we have to mutate it.
That's really unfortunate.
I'm tempted to add some stack to the buffer, the problem is that it's not clear to be wether @output_buffer
is guaranteed to be an OutputBuffer
(or StreamBuffer
) as there are some tests using ActiveSupport::SafeBuffer
. I'd have to dig into that as well to see wether it's legit or not.
I like this change. I think the speedup is more avoiding the indirection than the concatenation itself (I don't think anything called by I don't know if this happens in practice but here's another behaviour change we should guard against:
Probably worth adding an |
John: one of many reasons we're hoping to avoid concatenating onto a non-basic string is that YJIT is significantly faster when the receiver is a string. That's not the only reason we're doing it, but it's a side benefit :-) |
As I said, I'm in favour of this change. But I don't think that's true. String concatenation onto a subclass can be more or less the same speed. EDIT: # frozen_string_literal: true
require 'benchmark/ips'
class SafeBuffer < String
end
class OutputBuffer < SafeBuffer
def <<(value)
return self if value.nil?
super(value.to_s)
end
end
str = +""
safe_buffer = SafeBuffer.new
output_buffer = OutputBuffer.new
safe_buffer_encoded = SafeBuffer.new("")
output_buffer_encoded = OutputBuffer.new("")
puts "Encodings:"
pp({str:, safe_buffer:, output_buffer:, safe_buffer_encoded:, output_buffer_encoded:}.transform_values(&:encoding))
Benchmark.ips do |x|
x.report("String") { str << "" }
x.report("SafeBuffer") { safe_buffer << "" }
x.report("OutputBuffer") { output_buffer << "" }
x.report("SafeBuffer (same encoding)") { safe_buffer_encoded << "" }
x.report("OutputBuffer (same encoding)") { output_buffer_encoded << "" }
x.compare!(order: :baseline)
end
|
Okay. I don't think I have permissions on that repo. But at a minimum we would need to build more infrastructure for YJIT to make that true. So with current YJIT as it exists, I'm pretty certain that's true. |
benchmark attached, but again, I am in favour of this change. I also don't think YJIT needs extra infrastructure for that to be true, because Ruby does not special case rb_cString for concatenation (just T_STRING). Even if it did (it doesn't) we are going to want it because Let's continue the YJIT part of this discussion elsewhere because I think this is a good change anyways. |
@jhawthorn this is the optimization I'm talking about: https://github.com/ruby/ruby/blob/85ea46730deff70172a9f50172f0011a7401f371/vm_insnhelper.c#L5374-L5380 static VALUE
vm_opt_ltlt(VALUE recv, VALUE obj)
{
if (SPECIAL_CONST_P(recv)) {
return Qundef;
}
else if (RBASIC_CLASS(recv) == rb_cString &&
BASIC_OP_UNREDEFINED_P(BOP_LTLT, STRING_REDEFINED_OP_FLAG)) {
if (LIKELY(RB_TYPE_P(obj, T_STRING))) {
return rb_str_buf_append(recv, obj);
} else {
return rb_str_concat(recv, obj);
}
}
else if (RBASIC_CLASS(recv) == rb_cArray &&
BASIC_OP_UNREDEFINED_P(BOP_LTLT, ARRAY_REDEFINED_OP_FLAG)) {
return rb_ary_push(recv, obj);
}
else {
return Qundef;
}
} String subclasses are explictly excluded, and it's particularly true for |
Good catch. |
6ca8332
to
d0fb46f
Compare
d0fb46f
to
9343e07
Compare
MRI has a lot of optimizations for string concatenation that are only available when concatenating into a `String` instance. Using a `String` subclass disable these optimizations. The difference is even more important with YJIT. So ideally we want the buffer not to be a String subclass. Luckily, the Action View buffer is for internal use only, so we can replace inheritance by composition without much work. Benchmark: ``` ActionView::OutputBuffer: 147644.2 i/s optimized buffer: 228001.4 i/s - 1.54x (± 0.00) faster ``` Source: https://gist.github.com/casperisfine/7199579a138e268fda71d6a91366af49 NB: That 50% faster figure is to be contextualized, it can radically change from one template to the other, but is always faster.
9343e07
to
4e8e828
Compare
rails/rails#45614 broke our Global Output Buffer
* Pin CI to non-broken rails main rails/rails#45614 broke our Global Output Buffer * I don't know why standard failed here but 🤷 * add changelog
In rails/rails#45614 I modified `ActionView::OutputBuffer` to no longer be a subclass of `ActiveSupport::SafeBuffer` and as such it's not recommended to use `SafeBuffer` for buffering views anymore. Additionally in rails/rails#45731 I start doing some changes so that we stop mutating `@output_buffer`. So template should no longer assign `@output_buffer` themselves.
In rails/rails#45614 I modified `ActionView::OutputBuffer` to no longer be a subclass of `ActiveSupport::SafeBuffer` and as such it's not recommended to use `SafeBuffer` for buffering views anymore. Additionally in rails/rails#45731 I start doing some changes so that we stop mutating `@output_buffer`. So template should no longer assign `@output_buffer` themselves.
We ran into a pretty significant performance regression with this change. In our app, we usually are able to call However, with this change Here's a benchmark demonstrating the issue: # frozen_string_literal: true
require 'rails'
require 'action_view'
require 'benchmark/ips'
class BenchView < ActionView::Base
def compiled_method_container
self.class
end
def self.compiled_method_container
self
end
end
view = BenchView.new(
ActionView::LookupContext,
{},
nil,
)
locals = { }
erb_template = <<~ERB
Hello! #{"a"*1000}
ERB
template = ActionView::Template.new(
erb_template,
"hello template",
ActionView::Template::Handlers::ERB.new,
virtual_path: "hello",
locals: locals.keys,
)
def render_many(view, locals, template)
buffer = ActionView::OutputBuffer.new
1000.times do
template.render(view, locals, buffer)
end
end
Benchmark.ips do |x|
x.report("render_many") { render_many(view, locals, template) }
end Before this change
After this change
We can't run this change in production as-is. The good news is that I think we might be able to fail-forward by avoiding calling |
Thanks @jhawthorn I'll comment on #45756 |
In rails/rails#45614 I modified `ActionView::OutputBuffer` to no longer be a subclass of `ActiveSupport::SafeBuffer` and as such it's not recommended to use `SafeBuffer` for buffering views anymore. Additionally in rails/rails#45731 I start doing some changes so that we stop mutating `@output_buffer`. So template should no longer assign `@output_buffer` themselves.
Rails 7.1 has sped up its internal OutputBuffer implementation (what handles << concatenation in templates), which we rely on tricking a bit. rails/rails#45614 Then a follow up PR the `to_s` call was moved to after calling `view._run` rails/rails#45756 However, we rely on the return value being a Hash (which I'm not sure how to fix yet). So since `target!` is the return value of our templates, we can wrap that and ensure `to_s` doesn't actually `to_s`, which Rails 7.1 will call for us. On Rails < 7.1 we must call the compatibility `unwrap_target!` method ourselves to get our inner Hash return value. This also adds a Rails version matrix so we can detect these things before release in the future.
* Fix Rails 7.1 compatibility. Rails 7.1 has sped up its internal OutputBuffer implementation (what handles << concatenation in templates), which we rely on tricking a bit. rails/rails#45614 Then a follow up PR the `to_s` call was moved to after calling `view._run` rails/rails#45756 However, we rely on the return value being a Hash (which I'm not sure how to fix yet). So since `target!` is the return value of our templates, we can wrap that and ensure `to_s` doesn't actually `to_s`, which Rails 7.1 will call for us. On Rails < 7.1 we must call the compatibility `unwrap_target!` method ourselves to get our inner Hash return value. This also adds a Rails version matrix so we can detect these things before release in the future. * Allow us to run tests in multiple Rails versions * Yep, I'm calling it, don't think it's worth it anymore * Fix enum invocation for Rails 6.1 tests
* rails-7.1: Rails 7.1 no longer populates redirect body (rails/rails#44554). Calling silence on ActiveSupport::Deprecation is deprecated and will be removed from Rails (use Rails.application.deprecators.silence instead) Deprecator setting has been deprecated. run db:drop, db:create and db:migrate in a separate commands (probably due to rails/rails#49349) Override and revert rails/rails#46699 for now, move test database from /storage back to /db Rails 7.1 replaces config.cache_classes with config.enable_reloading in template environment/test.rb Add Rails 7.1 test gem file. to_default_s is deprecated and will be removed from Rails 7.2 (use to_s instead) ActionView::OutputBuffer refactored by rails/rails#45614 (Rails 7.1) See rails/rails#36020
* Pin CI to non-broken rails main rails/rails#45614 broke our Global Output Buffer * I don't know why standard failed here but 🤷 * add changelog
* Pin CI to non-broken rails main rails/rails#45614 broke our Global Output Buffer * I don't know why standard failed here but 🤷 * add changelog
MRI has a lot of optimizations for string concatenation that are only available when concatenating into a
String
instance.Using a
String
subclass disable these optimizations.The difference is even more important in Ruby 3.2 where both YJIT and the VM noticeably improve
String#<<
performance.So ideally we want the buffer not to be a String subclass. Luckily, the Action View buffer is for internal use only, so we can replace inheritance by composition without much work.
Benchmark:
Source: https://gist.github.com/casperisfine/7199579a138e268fda71d6a91366af49
NB: That 50% faster figure is to be contextualized, it can radically change
from one template to the other, but is always faster.
cc @jhawthorn @matthewd @noahgibbs @tenderlove