Since v3.5.4 it is no longer possible to call Hashie::Mash[]= more than once with the same key #413

Closed
BRMatt opened this Issue Feb 23, 2017 · 5 comments

Comments

Projects
None yet
3 participants
@BRMatt

BRMatt commented Feb 23, 2017

Since 3.5.4 it is no longer possible to assign a value to a key more than once:

require 'hashie'

foo = Hashie::Mash.new

foo['foobar'] = Array.new
foo['foobar'] = Array.new

produces:

/Users/mattbutton/.gem/ruby/2.2.2/gems/hashie-3.5.4/lib/hashie/mash.rb:344:in `method': undefined method `foobar' for class `Hashie::Mash' (NameError)
        from /Users/mattbutton/.gem/ruby/2.2.2/gems/hashie-3.5.4/lib/hashie/mash.rb:344:in `log_built_in_message'
        from /Users/mattbutton/.gem/ruby/2.2.2/gems/hashie-3.5.4/lib/hashie/mash.rb:151:in `custom_writer'
        from bug.rb:6:in `<main>'

The issue is that fc4f6e2 changed the code that logs warnings to use respond_to? to check if the call to Hashie::Mash[]= would override a method that exists on Hashie::Mash, or the thing that's subclassing it.

The documentation for Object#respond_to? says:

If the method is not defined, respond_to_missing? method is called and the result is returned.

Unfortunately Hashie::Mash overrides respond_to_missing? to return true if a key with that name exists in the hash.

When the warning logger is invoked it calls self#method(method_key) to try and look up details about the method. This raises an error as the method does not exist (Hashie::Mash does not actually define methods on the object, it implements dynamic method lookup using #method_missing)

This is made worse by the fact that you can't disable warnings globally for all Hashie::Mash objects, only subclasses. Unfortunately some fairly prominent users of Hashie::Mash do not subclass it.

I think the previous version of this worked because methods.include? excluded the "magic" methods that Hashie::Mash will respond to.

@BRMatt BRMatt changed the title from v3.5.4 introduces severe breaking change for Hashie::Mash to Since v3.5.4 it is no longer possible to call Hashie::Mash[]= more than once with the same key Feb 23, 2017

@michaelherold

This comment has been minimized.

Show comment
Hide comment
@michaelherold

michaelherold Feb 23, 2017

Contributor

Oy vey. Thanks for reporting this. This has been a major thorn in our side. I'll see what I can do to mitigate the problem.

If you're digging into it, we'd be happy to accept a PR with a failing test or a fix!

Contributor

michaelherold commented Feb 23, 2017

Oy vey. Thanks for reporting this. This has been a major thorn in our side. I'll see what I can do to mitigate the problem.

If you're digging into it, we'd be happy to accept a PR with a failing test or a fix!

@dblock dblock referenced this issue in dblock/code.dblock.org Feb 23, 2017

Closed

Document the Hashie::Mash logging saga #46

@dblock

This comment has been minimized.

Show comment
Hide comment
@dblock

dblock Feb 23, 2017

Contributor

Oh wow when will we get over this saga? :( Thanks for helping out @michaelherold, LMK if you need anything. I'm making myself a note do blog about this once it's over ;)

Contributor

dblock commented Feb 23, 2017

Oh wow when will we get over this saga? :( Thanks for helping out @michaelherold, LMK if you need anything. I'm making myself a note do blog about this once it's over ;)

@BRMatt

This comment has been minimized.

Show comment
Hide comment
@BRMatt

BRMatt Feb 23, 2017

If you're digging into it, we'd be happy to accept a PR with a failing test or a fix!

Sorry I didn't submit one with this issue, I was a bit tired after investigating this!

I might be able to, but I might be missing a bit of context about this feature. Is this only supposed to warn about methods that are defined on the mash or it's subclasses? If the performance overhead was coming from the overhead of include? could we cache memoize Set.new(methods) and call include? on that?

BRMatt commented Feb 23, 2017

If you're digging into it, we'd be happy to accept a PR with a failing test or a fix!

Sorry I didn't submit one with this issue, I was a bit tired after investigating this!

I might be able to, but I might be missing a bit of context about this feature. Is this only supposed to warn about methods that are defined on the mash or it's subclasses? If the performance overhead was coming from the overhead of include? could we cache memoize Set.new(methods) and call include? on that?

@michaelherold

This comment has been minimized.

Show comment
Hide comment
@michaelherold

michaelherold Feb 23, 2017

Contributor

Yeah, it's supposed to throw a warning when you set a key that collides with a built-in method. I think the cache is probably going to be the way to go. I have a failing test and will try to carve out some time to work on it tonight. Otherwise I'll look before I head in to work in the morning.

Contributor

michaelherold commented Feb 23, 2017

Yeah, it's supposed to throw a warning when you set a key that collides with a built-in method. I think the cache is probably going to be the way to go. I have a failing test and will try to carve out some time to work on it tonight. Otherwise I'll look before I head in to work in the morning.

michaelherold added a commit to michaelherold/hashie that referenced this issue Feb 24, 2017

Don't log when overwriting Mash keys
When we switched to using `#respond_to?` to detecting whether to log
a Mash collision, we started reporting when we were overwriting keys
that already exist in the Mash. This is a poor experience because it
causes extra warnings (as in #414) or, in the worst case, causes an
"undefined method" error (as in #413).

This change fixes that problem and benchmarks to ensure we're not
appreciably regressing performance. The results of two benchmarks are
below:

```
bundle exec ruby benchmark/mash_method_access.rb:

Warming up --------------------------------------
  before    92.456k i/100ms
Calculating -------------------------------------
  before      1.290M (± 4.4%) i/s -      6.472M in   5.028183s

Pausing here -- run Ruby again to measure the next benchmark...

Warming up --------------------------------------
    after    92.941k i/100ms
Calculating -------------------------------------
    after      1.326M (± 5.4%) i/s -      6.692M in   5.060756s

Comparison:
   after:  1326239.2 i/s
  before:  1289624.0 i/s - same-ish: difference falls within error
```

and

```
within spec/integrations/omniauth,
bundle exec rake perf:ips

Warming up --------------------------------------
  before     1.260k i/100ms
Calculating -------------------------------------
  before     13.114k (± 4.2%) i/s -     66.780k in   5.101689s

Pausing here -- run Ruby again to measure the next benchmark...

Warming up --------------------------------------
    after     1.299k i/100ms
Calculating -------------------------------------
    after     13.149k (± 4.0%) i/s -     66.249k in   5.046630s

Comparison:
   after:    13148.9 i/s
  before:    13113.8 i/s - same-ish: difference falls within error
```

Closes #413
Closes #414
@michaelherold

This comment has been minimized.

Show comment
Hide comment
@michaelherold

michaelherold Feb 24, 2017

Contributor

I tried using a Set to memoize the method set on initialization. Hashie performance stayed the same, but it caused a 50% performance regression in OmniAuth. I ended up going a different way, as seen in #415.

Here are the benchmark results with the Set change, for reference:

Mash Benchmark

Warming up --------------------------------------
              before    91.812k i/100ms
Calculating -------------------------------------
              before      1.312M (± 4.4%) i/s -      6.610M in   5.046857s

Pausing here -- run Ruby again to measure the next benchmark...

Warming up --------------------------------------
               after    90.721k i/100ms
Calculating -------------------------------------
               after      1.297M (± 5.3%) i/s -      6.532M in   5.051305s

Comparison:
              before:  1312343.0 i/s
               after:  1296853.0 i/s - same-ish: difference falls within error

OmniAuth Benchmark

Warming up --------------------------------------
              before     1.309k i/100ms
Calculating -------------------------------------
              before     13.202k (± 4.4%) i/s -     66.759k in   5.066731s

Pausing here -- run Ruby again to measure the next benchmark...

Warming up --------------------------------------
               after   672.000  i/100ms
Calculating -------------------------------------
               after      6.876k (± 4.4%) i/s -     34.944k in   5.092162s
Contributor

michaelherold commented Feb 24, 2017

I tried using a Set to memoize the method set on initialization. Hashie performance stayed the same, but it caused a 50% performance regression in OmniAuth. I ended up going a different way, as seen in #415.

Here are the benchmark results with the Set change, for reference:

Mash Benchmark

Warming up --------------------------------------
              before    91.812k i/100ms
Calculating -------------------------------------
              before      1.312M (± 4.4%) i/s -      6.610M in   5.046857s

Pausing here -- run Ruby again to measure the next benchmark...

Warming up --------------------------------------
               after    90.721k i/100ms
Calculating -------------------------------------
               after      1.297M (± 5.3%) i/s -      6.532M in   5.051305s

Comparison:
              before:  1312343.0 i/s
               after:  1296853.0 i/s - same-ish: difference falls within error

OmniAuth Benchmark

Warming up --------------------------------------
              before     1.309k i/100ms
Calculating -------------------------------------
              before     13.202k (± 4.4%) i/s -     66.759k in   5.066731s

Pausing here -- run Ruby again to measure the next benchmark...

Warming up --------------------------------------
               after   672.000  i/100ms
Calculating -------------------------------------
               after      6.876k (± 4.4%) i/s -     34.944k in   5.092162s

michaelherold added a commit to michaelherold/hashie that referenced this issue Feb 24, 2017

Don't log when overwriting Mash keys
When we switched to using `#respond_to?` to detecting whether to log
a Mash collision, we started reporting when we were overwriting keys
that already exist in the Mash. This is a poor experience because it
causes extra warnings (as in #414) or, in the worst case, causes an
"undefined method" error (as in #413).

This change fixes that problem and benchmarks to ensure we're not
appreciably regressing performance. The results of two benchmarks are
below:

```
bundle exec ruby benchmark/mash_method_access.rb:

Warming up --------------------------------------
  before    92.456k i/100ms
Calculating -------------------------------------
  before      1.290M (± 4.4%) i/s -      6.472M in   5.028183s

Pausing here -- run Ruby again to measure the next benchmark...

Warming up --------------------------------------
    after    92.941k i/100ms
Calculating -------------------------------------
    after      1.326M (± 5.4%) i/s -      6.692M in   5.060756s

Comparison:
   after:  1326239.2 i/s
  before:  1289624.0 i/s - same-ish: difference falls within error
```

and

```
within spec/integrations/omniauth,
bundle exec rake perf:ips

Warming up --------------------------------------
  before     1.260k i/100ms
Calculating -------------------------------------
  before     13.114k (± 4.2%) i/s -     66.780k in   5.101689s

Pausing here -- run Ruby again to measure the next benchmark...

Warming up --------------------------------------
    after     1.299k i/100ms
Calculating -------------------------------------
    after     13.149k (± 4.0%) i/s -     66.249k in   5.046630s

Comparison:
   after:    13148.9 i/s
  before:    13113.8 i/s - same-ish: difference falls within error
```

Closes #413
Closes #414

@dblock dblock closed this in #415 Feb 24, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment