Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hashie Breakup - Major Perf boost #773

Closed
wants to merge 1 commit into from

Conversation

schneems
Copy link
Contributor

Hashie, we've had some great times...but I think we need to break up. Your ties to omniauth are expensive, you're too high maintenance. We needed to take it to the next level but things aren't moving fast enough. It's just not working out...

I benched a Rails request with memory_profiler and found that Hashie allocates a TON of objects on each request:

allocated memory by gem
-----------------------------------
rack-1.5.2 x 6491
actionpack-4.1.2 x 4888
hashie-3.3.1 x 4735             <==============
activesupport-4.1.2 x 3292
omniauth-1.2.2 x 1387
actionview-4.1.2 x 1107
ruby-2.1.2/lib x 1097
railties-4.1.2 x 925
activerecord-4.1.2 x 440
warden-1.2.3 x 200
codetriage-ko1-test-app/app x 40
other x 40

This is one, request! I was curious if it was possible to remove hashie from Omniauth, and if so, would the result be better, faster, and use less memory. This PR does just that.

Patch Implementation

This patch removes Hashie as a dependency and replaces its usage with a custom data structure OmniStruct based off of OpenStruct. I think we can even remove this structure eventually in favor of some honest to goodness PORO configuration options, but we will have to deprecate some interfaces and rev versions so...one step at a time.

Patch Benchmarks

It uses less memory:

allocated memory by gem
-----------------------------------
rack-1.5.2 x 6491
actionpack-4.1.2 x 4888
activesupport-4.1.2 x 3292
ruby-2.1.2/lib x 1337      <=========
omniauth/lib x 1267
actionview-4.1.2 x 1107
railties-4.1.2 x 925
activerecord-4.1.2 x 440
warden-1.2.3 x 200
codetriage-ko1-test-app/app x 40
other x 40

Our structure uses 4735 - 1337 # => 3398 fewer objects (per request 0_o). I benchmarked a request with and without the patch and this was the result.

Without patch:


Calculating ——————————————————
                 ips        54 i/100ms
-------------------------------------------------
                 ips      547.3 (±4.2%) i/s -       2754 in   5.041327s 

With patch:


Calculating ——————————————————
                 ips        55 i/100ms
-------------------------------------------------
                 ips      576.7 (±4.2%) i/s -       2915 in   5.063895s 

Our patch improves speed by:

(576.7 - 547.3)/547.3 * 100
# => 5.37 # %

This isn't 5% speed increase in Omniauth, this is a 5% speed increase in the whole rails app request.

Implementation considerations

Most tests pass with no modification and most interfaces remain unchanged. The biggest change is that there are no more predicate methods so use skip_info instead of skip_info?. These were never explicitly tested. Also, if we move to proper configuration objects we can put these back in where they make sense, some custom defined predicates still remain such as valid? on `AuthHash because they were manually defined.

There was one test that needed a to_hash called on them because of the recursive nature of the OmniStruct (if you set a value as a hash, it will store it as an OmniStruct so we can do method call chaining). I think in this case the change is fine because the resultant object can be accessed the same way (keys and methods). This behavior was never explicitly tested for, only implicitly in this one test, which leads me to believe that it is an acceptable change.

Now what?

The case for removing hashie is pretty compelling, let's :shipit: or discuss implementation. Afterwards we can talk about a plan to move omniauth towards real honest to goodness PORO config objects.

@mbleigh
Copy link
Contributor

mbleigh commented Sep 29, 2014

At the very least this will require a major version bump, because people have built thousands of applications depending on OmniAuth's AuthHash etc. as a hash-like structure, not a struct. I personally use query methods in my own applications.

I'm not saying let's not merge this, I think it's probably fine, but authentication is not typically a code path that needs to be heavily optimized. A 5% performance gain is great and all, but a given user is authenticating what, maybe once a week? Once a month? I worry about introducing such sweeping API changes for a noticeable but perhaps not-super-important perf bump.

@schneems
Copy link
Contributor Author

To be clear these perf tests were not while logging in, they were on the index page of codetriage.com. This speed up seems to be on all pages in an app.

Hashie, we've had some great times...but I think we need to break up. Your ties to omniauth are expensive, you're too high maintenance. We needed to take it to the next level but things aren't moving fast enough. It's just not working out...

I benched a Rails request with [memory_profiler](https://github.com/SamSaffron/memory_profiler) and found that Hashie allocates a TON of objects on each request:

```
allocated memory by gem
-----------------------------------
rack-1.5.2 x 6491
actionpack-4.1.2 x 4888
hashie-3.3.1 x 4735             <==============
activesupport-4.1.2 x 3292
omniauth-1.2.2 x 1387
actionview-4.1.2 x 1107
ruby-2.1.2/lib x 1097
railties-4.1.2 x 925
activerecord-4.1.2 x 440
warden-1.2.3 x 200
codetriage-ko1-test-app/app x 40
other x 40
```

This is one, request! I was curious if it was possible to remove hashie from Omniauth, and if so, would the result be better, faster, and use less memory. This PR does just that.

## Patch Implementation

This patch removes Hashie as a dependency and replaces its usage with a custom data structure `OmniStruct` based off of `OpenStruct`. I think we can even remove this structure eventually in favor of some honest to goodness PORO configuration options, but we will have to deprecate some interfaces and rev versions so...one step at a time.

## Patch Benchmarks

It uses less memory:


```
allocated memory by gem
-----------------------------------
rack-1.5.2 x 6491
actionpack-4.1.2 x 4888
activesupport-4.1.2 x 3292
ruby-2.1.2/lib x 1337      <=========
omniauth/lib x 1267
actionview-4.1.2 x 1107
railties-4.1.2 x 925
activerecord-4.1.2 x 440
warden-1.2.3 x 200
codetriage-ko1-test-app/app x 40
other x 40
```

Our structure uses `4735 - 1337 # => 3398` fewer objects (per request 0_o). I benchmarked a request with and without the patch and this was the result.


Without patch:


```

Calculating ——————————————————
                 ips        54 i/100ms
-------------------------------------------------
                 ips      547.3 (±4.2%) i/s -       2754 in   5.041327s 
```


With patch:


```

Calculating ——————————————————
                 ips        55 i/100ms
-------------------------------------------------
                 ips      576.7 (±4.2%) i/s -       2915 in   5.063895s 
```


Our patch improves speed by:

```
(576.7 - 547.3)/547.3 * 100
# => 5.37 # %
```

This isn't `5%` speed increase in Omniauth, this is a `5%` speed increase in the __whole rails app request__.

## Implementation considerations

Most tests pass with no modification and most interfaces remain unchanged. The biggest change is that there are no more predicate methods so use `skip_info` instead of `skip_info?`. These were never explicitly tested. Also, if we move to proper configuration objects we can put these back in where they make sense, some custom defined predicates still remain such as `valid?` on `AuthHash because they were manually defined.

There was one test that needed a `to_hash` called on them because of the recursive nature of the OmniStruct (if you set a value as a hash, it will store it as an OmniStruct so we can do method call chaining). I think in this case the change is fine because the resultant object can be accessed the same way (keys and methods). This behavior was never explicitly tested for, only implicitly in this one test, which leads me to believe that it is an acceptable change.

## Now what?

The case for removing hashie is pretty compelling, let's :shipit: or discuss implementation then :shipit:. Afterwards we can talk about a plan to move omniauth towards real honest to goodness PORO config objects.
@soopa
Copy link

soopa commented Sep 30, 2014

👏

schneems added a commit to schneems/omniauth that referenced this pull request Oct 1, 2014
This patch isn't nearly as invasive as omniauth#773. Also it was tested against itself, not in a full Rails app. I'm contributing the perf scripts I wrote in another PR.

Notice that by memorizing methods, not only are we reducing objects created in omniauth, but also in hashie due to fewer method calls to various omniauth options.


## Bench

Before patch

```
Calculating -------------------------------------
                 ips       470 i/100ms
-------------------------------------------------
                 ips     4877.1 (±10.1%) i/s -      24440 in   5.061663s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       467 i/100ms
-------------------------------------------------
                 ips     4840.9 (±14.5%) i/s -      23817 in   5.068649s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       465 i/100ms
-------------------------------------------------
                 ips     4799.5 (±11.8%) i/s -      23715 in   5.019191s
```

After Patch

```
$ be rake perf:ips
Calculating -------------------------------------
                 ips       631 i/100ms
-------------------------------------------------
                 ips     6269.1 (±11.8%) i/s -      30919 in   5.014849s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       615 i/100ms
-------------------------------------------------
                 ips     6207.8 (±10.3%) i/s -      31365 in   5.106172s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       522 i/100ms
-------------------------------------------------
                 ips     6155.8 (±9.3%) i/s -      30798 in   5.048455s
```



## Memory Bench

100 requests without patch


```
allocated objects by gem
-----------------------------------
hashie-3.3.1 x 6901
rack-1.5.2 x 5300
omniauth/lib x 3637 <====================
ruby-2.1.2/lib x 301
other x 300
```

100 requests with patch 

```
allocated objects by gem
-----------------------------------
rack-1.5.2 x 4800
hashie-3.3.1 x 4701
omniauth/lib x 1037 <====================
ruby-2.1.2/lib x 301
other x 300
```

Results in `(3637  - 1037 )/ 3637.0 * 100  # => 71 %` less objects.
schneems added a commit to schneems/omniauth that referenced this pull request Oct 1, 2014
This patch isn't nearly as invasive as omniauth#773. Also it was tested against itself, not in a full Rails app. I'm contributing the perf scripts I wrote in another PR.

Notice that by memorizing methods, not only are we reducing objects created in omniauth, but also in hashie due to fewer method calls to various omniauth options.


## Bench

Before patch

```
Calculating -------------------------------------
                 ips       470 i/100ms
-------------------------------------------------
                 ips     4877.1 (±10.1%) i/s -      24440 in   5.061663s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       467 i/100ms
-------------------------------------------------
                 ips     4840.9 (±14.5%) i/s -      23817 in   5.068649s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       465 i/100ms
-------------------------------------------------
                 ips     4799.5 (±11.8%) i/s -      23715 in   5.019191s
```

After Patch

```
$ be rake perf:ips
Calculating -------------------------------------
                 ips       631 i/100ms
-------------------------------------------------
                 ips     6269.1 (±11.8%) i/s -      30919 in   5.014849s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       615 i/100ms
-------------------------------------------------
                 ips     6207.8 (±10.3%) i/s -      31365 in   5.106172s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       522 i/100ms
-------------------------------------------------
                 ips     6155.8 (±9.3%) i/s -      30798 in   5.048455s
```



## Memory Bench

100 requests without patch


```
allocated objects by gem
-----------------------------------
hashie-3.3.1 x 6901
rack-1.5.2 x 5300
omniauth/lib x 3637 <====================
ruby-2.1.2/lib x 301
other x 300
```

100 requests with patch 

```
allocated objects by gem
-----------------------------------
rack-1.5.2 x 4800
hashie-3.3.1 x 4701
omniauth/lib x 1037 <====================
ruby-2.1.2/lib x 301
other x 300
```

Results in `(3637  - 1037 )/ 3637.0 * 100  # => 71 %` less objects.
schneems added a commit to schneems/omniauth that referenced this pull request Oct 1, 2014
This patch isn't nearly as invasive as omniauth#773. Also it was tested against itself, not in a full Rails app. I'm contributing the perf scripts I wrote in another PR.

Notice that by memorizing methods, not only are we reducing objects created in omniauth, but also in hashie due to fewer method calls to various omniauth options.


## Bench

Before patch

```
Calculating -------------------------------------
                 ips       470 i/100ms
-------------------------------------------------
                 ips     4877.1 (±10.1%) i/s -      24440 in   5.061663s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       467 i/100ms
-------------------------------------------------
                 ips     4840.9 (±14.5%) i/s -      23817 in   5.068649s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       465 i/100ms
-------------------------------------------------
                 ips     4799.5 (±11.8%) i/s -      23715 in   5.019191s
```

After Patch

```
$ be rake perf:ips
Calculating -------------------------------------
                 ips       631 i/100ms
-------------------------------------------------
                 ips     6269.1 (±11.8%) i/s -      30919 in   5.014849s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       615 i/100ms
-------------------------------------------------
                 ips     6207.8 (±10.3%) i/s -      31365 in   5.106172s
2.1.2  ~/documents/projects/omniauth (master)
$ be rake perf:ips
Calculating -------------------------------------
                 ips       522 i/100ms
-------------------------------------------------
                 ips     6155.8 (±9.3%) i/s -      30798 in   5.048455s
```



## Memory Bench

100 requests without patch


```
allocated objects by gem
-----------------------------------
hashie-3.3.1 x 6901
rack-1.5.2 x 5300
omniauth/lib x 3637 <====================
ruby-2.1.2/lib x 301
other x 300
```

100 requests with patch 

```
allocated objects by gem
-----------------------------------
rack-1.5.2 x 4800
hashie-3.3.1 x 4701
omniauth/lib x 1037 <====================
ruby-2.1.2/lib x 301
other x 300
```

Results in `(3637  - 1037 )/ 3637.0 * 100  # => 71 %` less objects.
@schneems
Copy link
Contributor Author

schneems commented Oct 1, 2014

Even after #774 we're leaving a good amount of perf on the table. There's likely some other big easy-ish wins. We can look at what the middleware is doing when it is not being used (no callback or auth path). It's fine if we're slower on the paths omniauth is registered at, but for every other request, omniauth should basically be invisible from a perf perspective. Even with #744, hashie is still expensive.

I agree with the major version bump, and if we're going to do that we might as well use custom objects instead of struct or hash types. The real kicker is the dual nature of data access, that flexibility comes at a cost. If we did that, we could see even more drastic savings than this PR.

In my ideal world I would get the top 10 omniauth strategies, build object types that work for those cases out of the box. After that iterate over the next 10-20 strategies by pointing the omniauth dependency at a branch and running the tests, modifying the strategies so that they pass and put together a guide on how to update strategies (common problems/pitfalls).

That all being said, it seems more worthwhile to look at non-api breaking improvements. Only then consider a major version rev project.

Thanks for the look. Going to close this, as it was perhaps more of a proof of concept, and could potentially be used to base future custom config object work.

@schneems schneems closed this Oct 1, 2014
@j-manu
Copy link

j-manu commented Dec 18, 2014

You are linking to #744 instead of #774

@schneems
Copy link
Contributor Author

Fixed

@darrencauthon
Copy link

Since this was linked to @schneems' blog post that rips Hashie as a "horrible" choice that should never be used in production, I wanted to share one thought I had while the page was loading."

"I bet this PR has more code without Hashie than with Hashie."

There are different types of ways to measure code. Milliseconds to execute a method is one. Lines of code to implement a method is another. Time to implement a solution is even another!

When I've chosen to use Hashie, it was to lower the lines of code and speed up development. If I were to drop Hashie in those cases, I'd end up with something just like this... a PR with many more additions than subtractions. Sure, it's faster, but that performance boost is not free.

@schneems
Copy link
Contributor Author

Thanks for reading the article. This PR should be left for technical discussion about this specific code or the intent. This isn't the place for general discussion. Maybe http://www.reddit.com/r/ruby/comments/2pkzec/hashie_considered_harmful_an_ode_to_hash_and/ would be a better forum. I'll be happy to reply but don't want to clog up this issue.

@darrencauthon
Copy link

@schneems Well, you are right... I should be more specific to this pull request.

I'd feel worse about omniaith with this pull request than without. It may perform better, but I feel that it's taking on a chunk of technical debt. The form of this debt is OmniStruct, the new 70-line, untested class that inherits from a ruby base class (usually a big no-no, right?)

The class may be tested passively through the existing tests, and often that's fine when refactoring code. However, when the refactor takes the form of a large class with its own responsibilities, I think it becomes necessary to test it in isolation. In the form that OmniStruct is in currently, I would not trust that it is fully covered. In a class with revision, I'd consider this unsafe.

The missing tests are even more important when considering the fact that hashie's business is not omniauth's business. I bet hashie has tests around the fine details of these behaviors because that's why hashie exists. Omniauth's focus is on its own business. Omniauth's needs only a subset of hashie behavior, but any detail that it pulls from hashie should be managed with extreme caution. I don't see that caution in this code, and I especially don't see it in the author's comments. I see someone who is focused on a tiny bit of performance and personal values, with no mention of the costs.

Choices in programming are usually a trade-off. You get something, but you give something away. A reasoned approach to changes involved acknowledging both sides. I raise a red flag when I see the person making the changes focused on one.

I know this PR has been closed, but moving forward I implore the authors to treat it seriously.

@schneems
Copy link
Contributor Author

Responses to your comments in turn.

I'd feel worse about omniaith with this pull request than without.

Agreed, that's why we closed it. This PR didn't go far enough. It tried to support the current interface, in its (almost) entirety when the right way to do it would be to introduce proper config with real honest to goodness objects and methods. I talked about this #773 (comment)

I feel that it's taking on a chunk of technical debt.

The current config object being based on Hashie has TREMENDOUS technical debt, which is why this PR was so difficult and why the OmniStruct class can't be any smaller. This was my core argument in the article. Again, as i said before we shouldn't replicate all of Hashie's features, we should define an actual interface and use it.

Said this earlier "we might as well use custom objects instead of struct or hash types"

I bet this PR has more code without Hashie than with Hashie.

Nope. This PR has 112 lines added and 32 lines deleted (net of 80 lines added). Removing the Hashie dependency (only mash) is around 213 lines (without comments). Hashie depends on Hashie::Hash which has another 56 lines pretty inspect at 19 lines and stringify keys at another 50 lines. For a grand total of (213+56+19+50) 338 lines. This makes the change (80 - 338) NEGATIVE 258 lines. 💥

Now i know what you're thinking, hashie is a dependency, we don't get to count those lines. Well...when you use a third party library you become a maintainer of that code, Michael actually wrote the first version of Hashie and was quite literally the maintainer of the code. Even if that wasn't the case, if one of those 338 lines breaks something in omniauth, it's the maintainer of omniauth's job to find it, report it or fix it. It also increases the surface area that the maintainer needs to know.

Let's take a look. If you're using Hashie::Mash you need to know a bunch of methods:

Hashie::Mash.instance_methods.count
# => 176

And it doesn't include all those magic methods added by method_missing.

OmniAuth::OmniStruct.instance_methods.count
# => 71

In comparison, my proposed spike has less than half the methods (but the same method missing behavior). By introducing Hashie into your project you need to know more.

So we've got fewer lines of code to maintain and a much more defined interface (176 - 71 = 105 fewer methods).

I encourage people to use and share libraries but you're kidding yourself if you don't think you introduce technical debt into your project every dependency you add. When Rails 2 to Rails 3 broke a ton of gems and simultaneously a bunch of gem authors decided they didn't want to maintain those gems anymore. Guess what...every single dependency in your app became your responsibility. If it's code that is running on your server, it doesn't matter how it got there, it's your responsibility.

Time to implement a solution is even another! [measure of code]

I completely agree here. When Omniauth passes your Rails app a config object and you accidentally call a method on it that doesn't exist and no exceptions get raised there is a very real cost to the time you will spend fixing your Rails code. An explicit interface would have made the problem so obvious it would have taken seconds to see there was an exception. If i could accurately measure the times i've misconfigured an omniauth strategy and spent hours debugging, I would gladly list them here.

I think what you're getting at, is that you don't want to maintain things like hashes with indifferent access and deep_merge! methods. I think you shouldn't have to. A proper configuration object isn't based off of a hash, so it doesn't have to worry about symbol or string access and or any merge problems. By needing these types of behaviors for configuration, it's a red flag...i shouldn't have had to write that code. It shouldn't be needed.

hashie's business is not omniauth's business.

I disagree. Omniauth must know about how hashie works to perform correctly. When you touch code that hashie touches, you must know how to use hashie. Omniauth uses hashie, therefore you must know how hashie works to work in the omniauth codebase.

Omniauth's needs only a subset of hashie behavior, but any detail that it pulls from hashie should be managed with extreme caution.

The problem here is we pass this config object that is based on Hashie to 3rd party libraries and "strategies", even if it only needs a subset of the behavior, it is giving all the behavior to all the libraries. So while this code doesn't call a specific method, a third party library could. If we were going to actually go forwards, we would need to deprecate like crazy see my comment and example code here: zmoazeni/harvested#83 (comment)

I see someone who is focused on a tiny bit of performance and personal values, with no mention of the costs.

The performance gains here are anything but tiny. Five percent of total rails application request speed isn't tiny, especially considering that Omniauth shouldn't be doing anything on those requests. I struggle for months on code in Rails internals to get just 1~2% faster on an overall request. 5% gain on one relatively small code change is HUGE. While i emphasized performance in the pull request, i should have talked about the biggest benefits or rather the "costs".

The "costs" here provide us with a cleaner interface, less total lines of code to maintain and fewer methods to have to understand due to removing Hashie as a dependency.

Choices in programming are usually a trade-off. You get something, but you give something away. A reasoned approach to changes involved acknowledging both sides. I raise a red flag when I see the person making the changes focused on one.

What i think you meant here was "You seem to be only focussing on the performance, how does this impact usability? Anything else this gives up?" The point of the article is that removing hashie makes your code faster, cleaner, and easier to use. You're giving up flexibility, but it's a false flexibility. The flexibility provided by Hashie also introduces problems.

However, when the refactor takes the form of a large class with its own responsibilities, I think it becomes necessary to test it in isolation. In the form that OmniStruct is in currently, I would not trust that it is fully covered. In a class with revision, I'd consider this unsafe.

Totally agree. What you might not realize is that it took me 16 hours to write this pull request and get all the tests working. I did it on my flight to and from RubyKaigi in Japan. This doesn't include all the time I spent writing the debugging and benchmarking tools. That was just to get the existing tests to pass. After nearly two straight (business) days of coding, i'm more interested in throwing the idea over the fence and seeing what comes of it than shooting for 100% on rcov. I can assure you that had anything merged into master I would have tested the crap out of it.

Telling a closed pull request it needs tests is not helpful.

that performance boost is not free.

You're right it's not free. It took me 16 hours to write that patch, and as we've mentioned it's not even the final solution. If hashie had never been introduced in the first place, then it would have been like getting a 5% performance boost for from the start. Did i mention we would also get a better interface and easier code to work with? Removing Hashie is a win, win, win. Because when I win and you win, we all win.

@calebhearth
Copy link

All other arguments aside, it comes down to this: 5% less time spent on every request to applications using Rack. I wouldn't be surprised, too, if more configuration-heavy strategies saw larger speed improvements, but that's conjecture.

Did i mention we would also get a better interface and easier code to work with?

This as an added benefit is great too.

This, or the discussed extension to scratch-building a configuration system, should pretty clearly (to me) be the way forward for Omniauth.

@darrencauthon
Copy link

@schneems I won't keep this going forever, especially since it's closed. But a couple points:

Now i know what you're thinking, hashie is a dependency, we don't get to count those lines.
when you use a third party library you become a maintainer of that code

You're equivocating. I pointed that this PR is built on a a large, complicated, untested class -- a class which wouldn't have to exist if Hashie was used. This is a fact, and a matter of basic math.

I've asserted that this fact, alone, shows that there is a cost to removing Hashie. Or to look at it from the other side, Hashie provides a value at a cost. I can't convey developer experience in writing, and I doubt I have to with you... but good programming requires judging the costs/benefits of multiple options, and then making the best decision.

I'm raising my red flag because I don't see any acknowledgment of this cost/benefit. Hey, I have to do this with myself all the time. I'll start on a new feature or "enhancement" for all sorts of reasons... but honestly, deep-in-my-heart, I know that it's because "I just want to" or "I'd love to use this new tech" or "I just think it's better." Sometimes the cost for these things is small, and sometimes it's high. (New tech is almost always very high.) So I sleep on PRs, I get input from others, and I try to be mindful of myself.

I'm concerned about the true purpose behind this PR and this effort. That concern would not exist if you weren't on such a hard-line position. Just say something to the effect of, "I think Hashie is a very expensive dependency, and I think the performance and API improvements are worth the cost of removing it." See, then you make sense, and you're demonstrating good programming. But what I see is: "HASHIE SHOULD NEVER BE USED IN PRODUCTION!!!", untested code, and arguments that more code is less code.

I mean, gosh... imagine counting code like you say. I need a simple CRUD app for managing a list of my cats. Just their names and breeds. Well, obviously, I can't use Rails, because... gosh, my one-model Rails application would have hundreds of thousands of lines of code! But nobody counts LOC this way, for many good reasons that I won't get into during the Merry Christmas season.

After nearly two straight (business) days of coding, i'm more interested in throwing the idea over the fence and seeing what comes of it than shooting for 100% on rcov.

That's not the coverage I mean. If I'm testing a method like this

def do_something
  a
  b
  c
end

and I test for a, then b and c will be "covered" in the sense that the code is hit during the test run. They'll show up as covered in a simple code coverage check. That doesn't mean b and c are "covered" in the sense that they're actually tested, though... and that's the only sense that I care about.

Without running the tests, I'd bet OmniStruct is covered well in the first sense. But in order to actually be tested is would need detailed tests at this level: https://github.com/intridea/hashie/blob/f42367e30e6a868ce117b143729b02d5f2c0a279/spec/hashie/mash_spec.rb

Anyway, I just wanted to offer my thoughts, which I did. I'm going to do more Christmas-y things now and stop bugging you. Merry Christmas!

@schneems
Copy link
Contributor Author

Maybe you missed my statement earlier

Telling a closed pull request it needs tests is not helpful.

This is conversation is neither appropriate or productive. You're attacking me in a very thinly veiled manner, you're arguing against a CLOSED PULL REQUEST, and generally taking up the time I could be spending contributing to the community. I don't need to be :trollface:-d right now. I'm unsubscribing from the issue. Have a great end of the year 😁

@zmoazeni
Copy link

That concern would not exist if you weren't on such a hard-line position.

@darrencauthon You seem to just be playing devils advocate because @schneems is very clear about his position. @schneems has been over-the-top explicit why he holds his opinion. If you have any specific concerns to his points, I suggest you raise them instead. Your current comments are just vague theorizing about tradeoffs.

Frankly, @schneems has done more than enough to convince me of his stance. To see the forest through the trees here.

I pointed that this PR is built on a a large, complicated, untested class

The raw PR numbers are +112 / -32. The largest class addition is 72 lines. We're hardly talking about massive unmaintainable classes here, let alone "large" and "complicated"

... people have built thousands of applications depending on OmniAuth's AuthHash ...

... This speed up seems to be on all pages in an app.

The crux of this argument is removing Hashie and replacing it with either a Hash, OpenStruct, or a well defined PORO. He is not proposing a C-extension or cryptic ruby code where one wrong move will easily invalidate the performance gain. And then boom, the performance of thousands of apps have been improved by ~5% for free (from their perspective).

I think that bears repeating: A change to remove a dependency and use idiomatic Ruby code improves the performance of thousands of apps by ~5%.

Without any concrete arguments, I don't know why this effort isn't met with more appreciation for @schneems's hard work investigation and then proposing a proof of concept. I know I am convinced.

@zmoazeni
Copy link

Sorry @schneems for dragging you back in. You posted your comment seconds before I posted mine.

@stevenhaddox
Copy link

For what little it's worth, as a builder of a several strategies and a huge advocate of OmniAuth for many years, I'm very supportive of removing Hashie for performance benefits sake. I'd be more than happy to rewrite my strategies to be able to pass on that kind of performance increase to any application using my libraries.

Also, huge thanks to schneems for putting in 16+ hours of effort on a very thoroughly investigated alternative and well thought out PR. Hopefully this leads to big gains in OmniAuth in it's next major release.

@dblock
Copy link
Contributor

dblock commented Dec 30, 2014

#774 resolved most the performance problem without removing Hashie::Mash; I got confused by the long thread that followed that so I am re-adding it here for anyone still reading this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants