Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Beyond Ludicrous Speed #21057

Merged
merged 22 commits into from Jul 30, 2015
Merged

Conversation

schneems
Copy link
Member

@schneems schneems commented Jul 29, 2015

I've been working on quite a few performance improvements to Rails and the Rails ecosystem, but this is by far the single biggest set of improvements I've been able to make. Those other changes have maybe shaved off a few objects here and there, and maybe 1 or 2% faster request time if I'm extremely lucky.

This change shaves off 34,299 objects and 3,457,318 bytes (3.29 MiB) allocated on every request. This is a 29% decrease in objects and a 23% decrease in memory allocated per request. Taking us past ludicrous speed...

So just how much does this improve overall request time? To measure I used http://www.codetriage.com as an example app and https://github.com/schneems/derailed_benchmarks to generate load. I hit the app with 1,000 requests with my patch and then 1,000 requests against Rails master. I repeated several times until I saw a fairly stable standard deviation.

The average time for 1,000 request against master was 221.88 seconds and against this patch was 195.401 seconds. This gives us a...

11.9 % speed improvement

Where does that speed improvement come from? The big theme is not allocating objects when we don't need to. I used $ derailed exec perf:objects which uses memory_profiler to look for large pockets of allocated memory against http://www.codetriage.com. You can get an in-depth explanation of each change in the commit along with a benchmark of objects and memory saved. The largest area I was able to make gains on was link generation, by removing extraneous arrays, duplicated hashes, and getting rid of intermediate objects whenever possible. I measured this speedup directly:

# $ bin/rails console

require 'benchmark/ips'
repo = Repo.first

Benchmark.ips do |x|
  x.report("url_for speed") {
    app.url_for(repo)
  }
end

This showed 2,865 iterations per second on master and 4,143 iterations per second on my patch. Which is a 44% speed improvement in url generation.

These performance improvements preserve existing interfaces and behaviors, all tests pass.

@rafaelfranca
Copy link
Member

rafaelfranca commented Jul 29, 2015

If you want just to run tests you can always push to rails/rails since you have commit access.

@schneems schneems force-pushed the schneems/journey-formatter-objects branch from fb782aa to 6776ce7 Compare Jul 30, 2015
schneems added 11 commits Jul 30, 2015
When a symbol is passed in, we call `to_s` on it which allocates a string. The two hardcoded symbols that are used internally are `:to_partial_path` and `:to_model`.

This change buys us 71,136 bytes of memory and 1,777 fewer objects per request.
In `apply_inflections` a string is down cased and some whitespace stripped in the front (which allocate strings). This would normally be fine, however `uncountables` is a fairly small array (10 elements out of the box) and this method gets called a TON. Instead we can keep an array of valid regexes for each uncountable so we don't have to allocate new strings.

This change buys us 325,106 bytes of memory and 3,251 fewer objects per request.
The request.script_name is dup-d which allocates an extra string. It is most commonly an empty string "". We can save a ton of string allocations by checking first if the string is empty, if so we can use a frozen empty string instead of duplicating an empty string.

This change buys us 35,714 bytes of memory and 893 fewer objects per request.
Micro optimization: `reverse.drop_while` is slower than `reverse_each.drop_while`. This doesn't save any object allocations.

Second, `keys_to_keep` is typically a very small array. The operation `parameterized_parts.keys - keys_to_keep` actually allocates two arrays. It is quicker (I benchmarked) to iterate over each and check inclusion in array manually.

This change buys us 1774 fewer objects per request
Most routes have a `route.path.requirements[key]` of `/[-_.a-zA-Z0-9]+\/[-_.a-zA-Z0-9]+/` yet every time this method is called a new regex is generated on the fly with `/\A#{DEFAULT_INPUT}\Z/`. OBJECT ALLOCATIONS BLERG!

This change uses a special module that implements `===` so it can be used in a case statement to pull out the default input. When this happens, we use a pre-generated regex.

This change buys us 1,643,465 bytes of memory and 7,990 fewer objects per request.
In handle_positional_args `Array#-=` is used which allocates a new array. Instead we can iterate through and delete elements, modifying the array in place.

Also `Array#take` allocates a new array. We can build the same by iterating over the other element.

This change buys us 106,470 bytes of memory and 2,663 fewer objects per request.
When generating a url with `url_for` the hash of arguments passed in, is dup-d and merged a TON. I wish I could clean this up better, and might be able to do it in the future. This change removes one dup, since it's literally right after we just dup-d the hash to pass into this constructor.

This may be a breaking, change but the tests pass...so :shipit: we can revert if it causes problems

This change buys us 205,933 bytes of memory and 887 fewer objects per request.
When an unknonwn key is passed to the hash in `PRE_CONTENT_STRINGS` it returns nil, when you call "#{nil}" it allocates a new empty string. We can get around this allocation by using a default value `Hash.new { "".freeze }`. We can avoid the `to_sym` call by pre-populating the hash with a symbol key in addition to a string key.

We can freeze some strings when using Array#* to reduce allocations.

Array#join can take frozen strings.

This change buys us 86,600 bytes of memory and 1,857 fewer objects per request.
No idea why on earth this hash key isn't already optimized by MRI, but it isn't. 💩

This change buys us 74,077 bytes of memory and 1,852 fewer objects per request.
The instrument method creates new strings, the most common action to instrument is "!render_template` so we can detect when that action is occurring and use a frozen string instead.

This change buys us 113,714 bytes of memory and 1,790 fewer objects per request.
content_tag's first argument is will generate a string with an html tag so `:a` will generate: `<a></a>`. When this happens, the symbol is implicitly `to_s`-d so a new string is allocated. We can get around that by using a frozen string instead which

This change buys us 74,236 bytes of memory and 1,855 fewer objects per request.
@schneems schneems force-pushed the schneems/journey-formatter-objects branch from 6776ce7 to 7fc36cb Compare Jul 30, 2015
@schneems schneems changed the title Running tests Beyond Ludicrous Speed Jul 30, 2015
@schneems
Copy link
Member Author

schneems commented Jul 30, 2015

Build is green, for some reason it's not showing up on the PR https://travis-ci.org/rails/rails/builds/73305474

@jeremy
Copy link
Member

jeremy commented Jul 30, 2015

Nice work @schneems !

@vipulnsward
Copy link
Member

vipulnsward commented Jul 30, 2015

🏇

@arthurnn
Copy link
Member

arthurnn commented Jul 30, 2015

Squash! =)

@rafaelfranca
Copy link
Member

rafaelfranca commented Jul 30, 2015

:shipit:

@rafaelfranca
Copy link
Member

rafaelfranca commented Jul 30, 2015

Really nice work!

@arthurnn
Copy link
Member

arthurnn commented Jul 30, 2015

❤️ ❤️ ❤️ ❤️ ❤️

@@ -33,6 +33,7 @@ def generate(name, options, path_parameters, parameterize = nil)
defaults = route.defaults
required_parts = route.required_parts
parameterized_parts.delete_if do |key, value|
next if defaults[key].nil?
Copy link
Member

@jeremy jeremy Jul 30, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combining control flow (next as early-return) and a boolean expression feels a little awkward. How about folding the not-nil check into the expression?

Copy link
Member Author

@schneems schneems Jul 30, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It becomes a bit long for my tastes, also requires a bang expression, which I try to avoid:

!defaults[key].nil? && value.to_s == defaults[key].to_s && !required_parts.include?(key)

You make the call. Which do you prefer?

Copy link
Member

@matthewd matthewd Jul 30, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really considered its merits (or checked it's right 😉), but here's a less negative spelling of that:

parameterized_parts.keep_if do |key, value|
  defaults[key].nil? ||
    value.to_s != defaults[key].to_s ||
    required_parts.include?(key)
end

Copy link
Member

@jeremy jeremy Jul 30, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could flip to keep_if to simplify the expression a bit.

Copy link
Member Author

@schneems schneems Jul 30, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! action patch tests pass with that for me. seems good, somehow removing the bangs makes it okay to be on one line (for me). I pushed this change. Will wait on the whole suite to run again to be sure.

@pixeltrix
Copy link
Contributor

pixeltrix commented Jul 30, 2015

@schneems well done 👏

@@ -33,6 +33,7 @@ def generate(name, options, path_parameters, parameterize = nil)
defaults = route.defaults
required_parts = route.required_parts
parameterized_parts.delete_if do |key, value|
next if defaults[key].nil?
value.to_s == defaults[key].to_s && !required_parts.include?(key)
Copy link
Member

@jeremy jeremy Jul 30, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is value ever nil?

Copy link
Member Author

@schneems schneems Jul 30, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checked via actionpack tests and with my own app, value is never nil.

@schneems schneems force-pushed the schneems/journey-formatter-objects branch from 7fc36cb to 61dae88 Compare Jul 30, 2015
When `defaults[key]` in `generate` in the journey formatter is called, it often returns a `nil` when we call `to_s` on a nil, it allocates an empty string. We can skip this check when the default value is nil.

This change buys us 35,431 bytes of memory and 887 fewer objects per request.

Thanks to @matthewd for help with the readability
@007lva
Copy link

007lva commented Aug 5, 2015

👏 👏 👏

@connorshea
Copy link
Contributor

connorshea commented Aug 6, 2015

Another reason to look forward to Rails 5.0! :shipit:

@toreriklinnerud
Copy link

toreriklinnerud commented Aug 6, 2015

Thanks for this - great work that will benefit everyone using Rails! ❤️

@davekapp
Copy link

davekapp commented Aug 6, 2015

Amazing work. Congrats to you and everyone else and I owe you a 🍺 sometime. :)

end

# Move 'index' action from options to recall
def normalize_action!
if @options[:action] == 'index'
if @options[:action] == 'index'.freeze
Copy link

@Johnius Johnius Aug 6, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's a stupid question. But why do you freeze strings?

Copy link
Member Author

@schneems schneems Aug 6, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This explains the freeze method for strings 
http://tmm1.net/ruby21-fstrings/

This article explains why you might want I to use frozen strings for performance with some benchmarks http://www.sitepoint.com/unraveling-string-key-performance-ruby-2-2/

Copy link
Contributor

@dmitry dmitry Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it better to have a constant defined in a class/module?

Copy link
Member Author

@schneems schneems Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out https://github.com/rails/rails/pull/21057/files#r36532115 for why we're not doing that

@radar
Copy link
Contributor

radar commented Aug 7, 2015

Excellent work @schneems! 🎉 🍻

@illtellyoulater
Copy link

illtellyoulater commented Aug 7, 2015

@schneems thanks for making Rails so much better, one more time! 👏 👏 👏

@@ -164,7 +164,7 @@ def deconstantize
#
# <%= link_to(@person.name, person_path) %>
# # => <a href="/person/1-donald-e-knuth">Donald E. Knuth</a>
def parameterize(sep = '-')
def parameterize(sep = '-'.freeze)
Copy link

@kgrz kgrz Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's better to have a single place that has all the frozen strings? Sort of like boot.rb, but that will have the calls to freeze for these tokens (- here, :: in the next file etc.

Copy link
Contributor

@bquorning bquorning Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’d still have to call freeze when you want to use the frozen string, so there is no point in freezing them up front.

>> '-'.freeze.object_id
=> 70222534974420
>> '-'.object_id
=> 70222535341760
>> '-'.freeze.object_id
=> 70222534974420
>> '-'.object_id
=> 70222535533680

Copy link

@kgrz kgrz Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, TIL, thank you!

Copy link

@kgrz kgrz Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bquorning That said, I remember a way I did that in a different project. That was by using constants in cases like these. For example, def parameterize(sep = DASH_TOKEN). That way, the object_ids will be same no matter where this token gets used, IIRC.

Copy link
Contributor

@bquorning bquorning Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, if you assign the frozen strings to constants up front, you can re-use the same instance all over the place. Plus, they won't be GC'ed.

Copy link
Contributor

@bquorning bquorning Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To quote @schneems from rack/rack#737:

“While we could certainly go overboard and pre-define ALL strings as constants, that would be pretty gnarly to work with. This patch goes after the largest of the low hanging fruit.”

Copy link

@kgrz kgrz Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. More cleaner, IMO. That was what I wanted to convey when I first posted the comment. Evidently, I wasn't clear. Sorry about that.

Copy link

@kgrz kgrz Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me clarify my first comment again:

I meant that it might be nice to have these tokens like -, ::, '' etc to, say, DASH, DOUBLE_COLON, EMPTY_STRING etc defined in a separate file that's loaded upfront and those constants used everywhere instead of multiple "-".freeze calls all over the code.

Copy link
Member Author

@schneems schneems Aug 7, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifying frozen strings as constants is slower. Constants work internally as a global hash, every time you call a constant Ruby has to do a hash lookup. If you use String#freeze inline, you don't have to do the lookup and your code is faster. Also ''.freeze is shorter than EMPTY_STRING.

require 'benchmark/ips'

HELLO = "hello".freeze
Benchmark.ips do |x|
  x.report("freeze")   { "hello".freeze + "world" }
  x.report("constant") { HELLO + "world" }
end
Calculating -------------------------------------
              freeze   100.611k i/100ms
            constant    99.036k i/100ms
-------------------------------------------------
              freeze      3.630M (± 7.9%) i/s -     18.110M
            constant      3.470M (±11.6%) i/s -     17.034M

@nitinstp23
Copy link

nitinstp23 commented Aug 7, 2015

Nice work @schneems 🍻

@arteezy
Copy link

arteezy commented Aug 7, 2015

@schneems bravo 👏

@marcgg
Copy link
Contributor

marcgg commented Aug 7, 2015

👍 👍 👍 Thanks a lot for this

@arun057
Copy link

arun057 commented Aug 7, 2015

Nice work @schneems

@coderxin
Copy link

coderxin commented Aug 10, 2015

@schneems nice work! 👏

@kennym
Copy link
Contributor

kennym commented Aug 10, 2015

@schneems this is awesome :-) 👍

@dmitry
Copy link
Contributor

dmitry commented Aug 10, 2015

Still a question, why not to use constants, instead of forzen? Because of the readability?

@matthewd
Copy link
Member

matthewd commented Aug 10, 2015

@dmitry there is no reason to use constants. This is how you spell an immutable string literal in ruby; we want immutable string literals, so that's what we're doing.

(Also, @schneems has already pointed out that constants may be slower.)

@dmitry
Copy link
Contributor

dmitry commented Aug 10, 2015

@matthewd thanks for pointing out.

Interestingly on my computer (with 2.1.5 and 2.2.2 rubies) this benchmark produces almost the same results for constant/freeze strings. But readability of freeze is better, except the times when it's repeating many times in a code, and you would like to inspect them with IDE or grep.

@garysweaver
Copy link
Contributor

garysweaver commented Aug 10, 2015

👍 👍 Nice work!

@@ -75,13 +75,21 @@ def parameterize(string, sep = '-')
# Turn unwanted chars into the separator
parameterized_string.gsub!(/[^a-z0-9\-_]+/i, sep)
unless sep.nil? || sep.empty?
re_sep = Regexp.escape(sep)
if sep == "-".freeze
Copy link
Contributor

@heliocola heliocola Aug 24, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schneems the "sep" variable in the definition of the parameterize function is not using the .freeze when sep is not present in the call. Will this if catch or miss the case when somebody call parameterize("string")?
If the comparison takes in consideration the string (and not the object id), this probably won't be a problem... Does it make sense?
I noticed you used the sep = "-".freeze in the parameterize function on file 'activesupport/lib/active_support/core_ext/string/inflections.rb'

Copy link
Member Author

@schneems schneems Aug 24, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what is wrong. Are you concerned that "-".freeze does not == "-" ? It does

puts "-".freeze == "-"
true
puts "-" == "-".freeze
true

Copy link
Contributor

@heliocola heliocola Aug 24, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparison was the main thing I thought about.
If you call it without the second parameter it will have a different string "-" created but the comparison will work.

@alejandrodevs
Copy link

alejandrodevs commented Oct 1, 2015

@schneems Nice work!

@lucascaton
Copy link
Contributor

lucascaton commented Dec 14, 2015

@schneems Nice one! ツ

@swapab
Copy link

swapab commented Feb 16, 2016

Is this PR available in rails 3.2.22 or 3.2.22.1 ?

@dmitry
Copy link
Contributor

dmitry commented Feb 16, 2016

@swapnilabnave no, it's only available for v5.0.0.beta2 v5.0.0.beta1.1 v5.0.0.beta1. You can see that by checking merge commit: 5373bf2

@swapab
Copy link

swapab commented Feb 16, 2016

@dmitry Thanks!

@Chizuru-Maxienne
Copy link

Chizuru-Maxienne commented Feb 16, 2016

Keep up the good work! @schneems 👏 👏 👍 🍰

@tiagoovieira
Copy link

tiagoovieira commented Feb 22, 2016

🔝

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet