Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix caching of missed translations #45572

Merged

Conversation

fatkodima
Copy link
Member

Fixes #45571.

When using Object.new as a default, we get another object when it is fetched from the cache store - https://github.com/ruby-i18n/i18n/blob/5963031f2cc52847a6c431be2b2b38229b1793d2/lib/i18n/backend/cache.rb#L89

So I used some random number as a default.

@rails-bot rails-bot bot added the actionview label Jul 12, 2022
MISSING_TRANSLATION = Object.new
MISSING_TRANSLATION = -1327903252
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea! Though I am somewhat wary of choosing magic numbers. Class.new also seems to work, but I'm not sure if it works with all cache stores (but maybe that doesn't matter, given the use case).

Another possibility would be to choose a sufficiently unlikely string (e.g. a string of unprintable characters), and change equal? to ==.

But I don't know how any of these options affect performance both when using I18n::Backend::Cache and not. And I'm not sure if performance is even a practical concern when using I18n::Backend::Cache — see #45571 (comment).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class.new also seems to work

I think Class.new also would not work (same as Object.new), at least in redis/memcached, where objects are serialized/deserialized.

Another possibility would be to choose a sufficiently unlikely string

I am not sure this is better than just the number.

But I don't know how any of these options affect performance both when using I18n::Backend::Cache and not.

I am not sure this will give a performance boost for memory/redis/memcached cache stores, but for slower ones (like https://github.com/svenfuchs/i18n-active_record) will give.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at least in redis/memcached, where objects are serialized/deserialized.

Objects are serialized for MemoryStore too, hence the current issue. 😉 But yes, the implementation may be different.

I am not sure this is better than just the number.

Generally speaking, using a random anything concerns me because nothing prevents someone else from "randomly" using that thing too. Another concern, though, is that we'd still be relying on equal? to return true for a deserialized object. For MRI, that's true for smaller numbers (like the one you've chosen), but do you know if that is part of the Ruby spec?

If not, I think using == would be safer. And if we're using ==, I think using an intentional string like "\0\a" or "\aMISSING" or similar makes sense. Although if benchmarks show a significant difference, then I am open to using a number. But I would feel better about using a larger number (or, in the negative, smaller).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got your point, agreed. Changed to the string.
I haven't measured, but for sure there will be only a minor difference, if any.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't measured, but for sure there will be only a minor difference, if any.

I18n.translate can have a surprising performance difference depending on the type of its arguments. I would encourage you to benchmark this. 😉

Additionally, whichever constant we choose, would mind also applying it here and here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark.ips do |x|
  x.report('string') do
    translate('hello')
  end
  x.report('number') do
    old_translate('hello')
  end
  x.report('string - missing') do
    translate('missing')
  end
  x.report('number - missing') do
    old_translate('missing')
  end
  x.compare!
end

With caching in memory:

Warming up --------------------------------------
              string     3.115k i/100ms
              number     3.201k i/100ms
    string - missing     1.133k i/100ms
    number - missing     1.181k i/100ms
Calculating -------------------------------------
              string     31.112k (± 2.8%) i/s -    155.750k in   5.010104s
              number     30.473k (± 7.0%) i/s -    153.648k in   5.068716s
    string - missing     11.525k (± 1.7%) i/s -     57.783k in   5.015333s
    number - missing     11.810k (± 1.9%) i/s -     60.231k in   5.101684s

Comparison:
              string:    31112.2 i/s
              number:    30473.5 i/s - same-ish: difference falls within error
    number - missing:    11810.3 i/s - 2.63x  (± 0.00) slower
    string - missing:    11524.5 i/s - 2.70x  (± 0.00) slower

Without caching:

Warming up --------------------------------------
              string     7.175k i/100ms
              number     7.015k i/100ms
    string - missing     1.421k i/100ms
    number - missing     1.441k i/100ms
Calculating -------------------------------------
              string     65.531k (±11.8%) i/s -    322.875k in   5.007789s
              number     68.760k (± 6.7%) i/s -    343.735k in   5.034387s
    string - missing     13.428k (±13.9%) i/s -     66.787k in   5.104037s
    number - missing     14.066k (± 5.8%) i/s -     70.609k in   5.039283s

Comparison:
              number:    68760.0 i/s
              string:    65531.5 i/s - same-ish: difference falls within error
    number - missing:    14066.2 i/s - 4.89x  (± 0.00) slower
    string - missing:    13428.1 i/s - 5.12x  (± 0.00) slower

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to see how the results compared with Object.new, so I adapted my benchmark script from #44299:

Benchmark script
require "benchmark/ips"

class BaseModel
  extend ActiveModel::Translation
end

BlogPost = Class.new(BaseModel)

NAMED_CLASS = Class.new

[true, false].each do |missing|
  [false, true].each do |caching|
    I18n.backend = Class.new(I18n::Backend::Simple).new
    if caching
      I18n.backend.class.include(I18n::Backend::Cache)
      I18n.cache_store = ActiveSupport::Cache.lookup_store(:memory_store)
    end

    if !missing
      I18n.backend.store_translations(:en, activemodel: { models: { blog_post: "Blerg" } })
    end

    puts ""
    puts "--- missing: #{missing}, caching: #{caching} ".ljust(72, "-")
    puts ""

    [Object.new, NAMED_CLASS, "\aMISSING\a", 2**30, 2**60].each do |sentinel|
      ActiveModel::Name.send(:remove_const, :MISSING_TRANSLATION)
      ActiveModel::Name.const_set(:MISSING_TRANSLATION, sentinel)

      Benchmark.ips do |x|
        x.report(sentinel.inspect) { BlogPost.model_name.human }
      end
    end
  end
end

The results I got appear to be different than yours:

--- missing: true, caching: false --------------------------------------

#<Object:0x00007f376...> 27.906k (± 1.3%) i/s -    140.550k in   5.037437s
         NAMED_CLASS     27.876k (± 0.8%) i/s -    140.913k in   5.055406s
       "\aMISSING\a"     13.512k (± 1.1%) i/s -     68.100k in   5.040470s
          1073741824     27.889k (± 1.0%) i/s -    139.850k in   5.015039s
 1152921504606846976     27.768k (± 1.2%) i/s -    139.850k in   5.037131s

--- missing: true, caching: true ---------------------------------------

#<Object:0x00007f376...> 21.390k (± 1.0%) i/s -    108.579k in   5.076757s
         NAMED_CLASS     23.524k (± 1.3%) i/s -    119.187k in   5.067600s
       "\aMISSING\a"     26.392k (± 0.9%) i/s -    132.300k in   5.013182s
          1073741824     27.862k (± 1.1%) i/s -    140.300k in   5.036137s
 1152921504606846976     27.827k (± 1.0%) i/s -    139.450k in   5.011886s

--- missing: false, caching: false -------------------------------------

#<Object:0x00007f2c4...> 18.859k (± 1.4%) i/s -     94.600k in   5.017336s
         NAMED_CLASS     18.952k (± 0.8%) i/s -     96.186k in   5.075477s
       "\aMISSING\a"     18.905k (± 1.4%) i/s -     96.084k in   5.083430s
          1073741824     18.827k (± 0.9%) i/s -     95.268k in   5.060580s
 1152921504606846976     18.734k (± 1.3%) i/s -     95.064k in   5.075230s

--- missing: false, caching: true --------------------------------------

#<Object:0x00007f376...> 24.845k (± 1.1%) i/s -    124.350k in   5.005617s
         NAMED_CLASS     26.684k (± 1.0%) i/s -    136.017k in   5.097946s
       "\aMISSING\a"     26.510k (± 1.0%) i/s -    134.844k in   5.087075s
          1073741824     26.715k (± 1.1%) i/s -    134.500k in   5.035266s
 1152921504606846976     26.728k (± 1.1%) i/s -    133.950k in   5.012120s

If you run my script, do you see similar results? If so, what is causing the discrepancy with the results you posted?

In particular, I would like to point out that when "missing: true" (which is common enough for e.g. model_name.human), caching in almost every case appears to be slower than not caching while using MISSING_TRANSLATION = Object.new. However, caching does appear to provide a significant speed-up when "missing: false". Those are the kind of results I was looking for in #45571 (comment).

On the plus side, it looks like there is a strong case for using a large integer for MISSING_TRANSLATION. 😃

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was kinda obvious that the number should be faster than the string 😄
I am seeing similar results with your benchmark. My benchmark was very simple and on the view context.
There is a very little probability someone will use a random 32-bits integer, but lets use a bigger integer, since it shows the same results.

And thank you for the patience and detailed comments ❤️! I hope this will be merged this time 😄

@fatkodima fatkodima force-pushed the fix-cached-missing-translations branch 2 times, most recently from 42346f9 to 4c03eca Compare July 14, 2022 16:38
@fatkodima fatkodima force-pushed the fix-cached-missing-translations branch from 4c03eca to 85be794 Compare July 15, 2022 18:19
@jonathanhefner jonathanhefner force-pushed the fix-cached-missing-translations branch from 85be794 to 3822172 Compare July 15, 2022 19:49
@jonathanhefner jonathanhefner merged commit 9994d38 into rails:main Jul 15, 2022
@jonathanhefner
Copy link
Member

jonathanhefner commented Jul 15, 2022

To wind this down, I pushed a commit to replace 1152921504606846976 with 2**60. I did not want future readers to wonder whether the number was significant (or random). Also, it occured to me that, if caching is involved, all three MISSING_TRANSLATION values should be kept in sync, and that's easier to verify with 2**60. (If you want to submit a PR with a more robust strategy or even just more test coverage, I will gladly take a look!)

Thank you, @fatkodima! 🙌

(Backported to 7-0-stable.)

@fatkodima
Copy link
Member Author

Thanks, lgtm 👍

@fatkodima fatkodima deleted the fix-cached-missing-translations branch July 15, 2022 20:09
jonathanhefner added a commit that referenced this pull request Jul 24, 2022
Fix caching of missed translations

(cherry picked from commit 9994d38)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants