restrict to user_levels that have an associated script_level #14639

Bjvanminnen · 2017-04-25T17:58:45Z

fix for https://app.honeybadger.io/projects/3240/faults/33387488

This HB alert is curious for a few reasons.
(1) It popped up after yesterday's DTP, but this code has been in place for some time.
(2) One of the instances I investigated had what I believe to be bad data, namely a UserLevel where the associated script_id was for a script that does not contain this level. I'm not sure how that came about.

That said, this did all reveal what I believe to be a legit bug.

If you're on a script and you answer a question within a level group, we'll make milestone requests for both the particular level and the level group. If the LevelGroup goes through first, then you end up in a situation where your most recent UserLevel points to a Level (the multi inside the LevelGroup) that has no script_level.

Solution is to filter to user_levels that will allow us to get back to a script_level (so that line 628/633 doesn't result in nil)

Bjvanminnen · 2017-04-25T18:09:39Z

Assigned Ram to review, because he reviewed my last PR related to next_unpassed_progression_level. Asher, because he often has good ideas these sorts of things.

ashercodeorg

Have we verified that there are no extra DB hits as a result of this change? In particular, I'm concerned about a N+1 query fanout. Though we cache the script hierarchy, I think this only happens when you start with a cached object, not when you access a (say) level from a user_level.

If you are hitting the DB, a hack might be to map the user_level.level_id to Level.cache_find(level_id). Though that is pretty hacky.

ashercodeorg · 2017-04-25T19:07:09Z

dashboard/app/models/user.rb

@@ -619,9 +619,14 @@ def user_level_locked?(script_level, level)
  # script that hasn't yet been passed, starting its search at the last level we submitted
  def next_unpassed_progression_level(script)
    user_levels_by_level = user_levels_by_level(script)
+    user_levels = user_levels_by_level.values.flatten.select do |user_level|


It seems simpler to do user_levels.where(script: script) than user_levels_by_level(script).values.flatten if you want to simplify pre-existing code. Aside, I don't understand why the flatten is necessary.

One of the cases I was looking at was an instance where user_level.script_id was actually for a different script than any of the user_levels. I must admit, I dont understand this scenario:

irb(main):062:0> UserLevel.find(985612048).script_id => 142 irb(main):063:0> UserLevel.find(985612048).level.script_levels.map{|sl| sl.script_id} => [230]

Oh, I think I misunderstood your comment.

That userlevel's level used to be in script 142, but was recently removed: 135b583

One benefit to this change is that it will now work properly if your last progress was on a level that has since been removed from the script.

Ahhh....that helps explain why we just started hitting this error.

ashercodeorg · 2017-04-25T19:08:20Z

dashboard/app/models/user.rb

+    user_levels = user_levels_by_level.values.flatten.select do |user_level|
+      # we want to ignore levels that dont have an associated script_level (such
+      # as multis within a LevelGroup)
+      user_level.level.script_levels.where(script_id: script.id).present?


What is the difference between present? and any? here?

I didn't know the right ruby method here. I can probably switch to any?

ashercodeorg · 2017-04-25T19:09:31Z

dashboard/app/models/user.rb


    # Find the user level that we've most recently had progress on
-    user_level = user_levels_by_level.values.flatten.sort_by!(&:updated_at).last
+    user_level = user_levels.sort_by!(&:updated_at).last


Can we avoid a sort by doing something like max_by {|user_level| user_level.updated_at}?

Bjvanminnen · 2017-04-25T23:57:07Z

dashboard/app/models/user.rb

+      index_by(&:level_id)
+
+    # Find the user_level that we've most recently had progress on
+    user_level = user_levels_by_level.values.max_by(&:updated_at)


New approach. Instead of calling script_levels on each level (which was a db hit each time), we start out by only getting user_levels for which we have a script_level. The call to script.script_levels should be cached, as is the call to level_ids on each script_level.

Now, on line 636 we're guaranteed to have last_script_level by non-nil, and thus our return will be non-nil

It's also worth noting that this method is likely failing silently in many cases in production right now. Most of the time, when we fail to find a last_script_level, we're okay because there is at least one unpassed progression level elsewhere in the script. In the case where that is not true, this method returns nil, and then we get HB errors.

I'll defer to Asher on the performance implications of adding all the level ids to the db query, but my gut feeling is that adapting your previous approach to use the cache would be more efficient (and doesn't seem that hacky to me).

The previous approaches cache problem was not where we thought. user_level.level and Level.find_cache(user_level.level_id) both end up hitting the cache. However, with both of those, when you end up doing level.script_levels we do not have a cache to hit.

This new approach should be much better at avoiding non-cached db hits.

My gut agrees with @balderdash, I think the other approach may be slightly more efficient. That said, I'm fairly convinced this method is efficient enough, so I don't think further changes are necessary. And I really like how it reads.

If you remind me to monitor the DB after this goes in, I'll look at whether this is worth optimizing. Regardless, I'm convinced it won't hurt production noticeably (as the non-cached script_level previously would have).

balderdash · 2017-04-26T00:40:00Z

dashboard/app/models/user.rb

@@ -620,8 +620,16 @@ def user_level_locked?(script_level, level)
  def next_unpassed_progression_level(script)
    user_levels_by_level = user_levels_by_level(script)


You no longer need this line now, right?

Correct. Removed it locally, but forgot to push that :(

balderdash · 2017-04-26T00:56:23Z

dashboard/app/models/user.rb

+      index_by(&:level_id)
+
+    # Find the user_level that we've most recently had progress on
+    user_level = user_levels_by_level.values.max_by(&:updated_at)


I'll defer to Asher on the performance implications of adding all the level ids to the db query, but my gut feeling is that adapting your previous approach to use the cache would be more efficient (and doesn't seem that hacky to me).

ashercodeorg

LGTM (with my comments).

ashercodeorg · 2017-04-26T11:54:10Z

dashboard/app/models/user.rb

+    # some of our user_levels may be for levels within level_groups, or for levels
+    # that are no longer in this script. we want to ignore those, and only look
+    # user_levels that have matching script_levels
+    levels = script.script_levels.map(&:level_ids).flatten


nit: Is there any advantage to or need for removing duplicates? AFAICT, there is no advantage and no need. But I'll ask the question anyways.

I should hope there aren't duplicate levels within the same script?

Is this not possible (a typical use case) with level swapping? Though I imagine the level would only be active in one script_level, I could see it existing (as part of level swapping) in many. Right?

You may be right. I know little about (and often forget about) this feature.

I guess in the case that a level appears in multiple script_levels, and it's the one you have the most recent progress on, we'll always treat the first of the script_levels containing the level as your current stage - even if that level is not currently active for the first script_level.

we could instead make this something like:
levels = script.script_levels.map{|script_level| script_level.oldest_active_level.id}
which I think would be more resilient to this?

And I also could be wrong, I'm not certain. As I said, I'm not sure there is any advantage to or need for dealing with this. My vote would be to get the fix in as-is, with a TODO to consider this edge case.

ashercodeorg · 2017-04-26T12:01:24Z

dashboard/app/models/user.rb

+      index_by(&:level_id)
+
+    # Find the user_level that we've most recently had progress on
+    user_level = user_levels_by_level.values.max_by(&:updated_at)


This could be more succinctly written as user_level = user_levels.where(script_id: script.id, level: levels).max_by(&:updated_at).

We end up needing the full list of user_levels_by_level later in the method so that we can get the set of user_levels for a particular script_level. So I think it's the difference between doing a traversal of user_levels to get the max updated at (my current code) or a second (cached) DB query, that's presumably still going to need to do a traversal to get the max.

Given that we do need user_levels_by-level later, I would have thought my approach would have been better (though it probably doesnt make a big difference)?

Ah, I missed that usage. Yes, keep as you had it.

ashercodeorg · 2017-04-26T12:06:05Z

dashboard/app/models/user.rb

+      index_by(&:level_id)
+
+    # Find the user_level that we've most recently had progress on
+    user_level = user_levels_by_level.values.max_by(&:updated_at)


My gut agrees with @balderdash, I think the other approach may be slightly more efficient. That said, I'm fairly convinced this method is efficient enough, so I don't think further changes are necessary. And I really like how it reads.

If you remind me to monitor the DB after this goes in, I'll look at whether this is worth optimizing. Regardless, I'm convinced it won't hurt production noticeably (as the non-cached script_level previously would have).

ashercodeorg · 2017-04-26T12:06:57Z

dashboard/test/models/user_test.rb

+      updated_at: Time.now
+    )
+
+    assert(user_level1.updated_at < user_level2.updated_at)


This seems unnecessary.

Sometimes I see value in asserting what's important about the test data I've constructed. Yes, it should never fail, but it also makes it so that if I'm debugging why a later assertion did fail, I can be certain about this particular assumption.

ashercodeorg · 2017-04-26T12:07:28Z

dashboard/test/models/user_test.rb

+    assert(user_level1.updated_at < user_level2.updated_at)
+
+    next_script_level = user.next_unpassed_progression_level(script)
+    puts next_script_level


This should be removed.

ashercodeorg · 2017-04-26T12:07:44Z

dashboard/test/models/user_test.rb

+
+    next_script_level = user.next_unpassed_progression_level(script)
+    puts next_script_level
+    assert !next_script_level.nil?


This should be refute_nil next_script_level.

restrict to user_levels that have an associated script_level

20bb53a

Bjvanminnen requested review from balderdash and ashercodeorg April 25, 2017 18:08

ashercodeorg suggested changes Apr 25, 2017

View reviewed changes

filter initial user_levels to those that have script_levels

6d2af6c

Bjvanminnen commented Apr 25, 2017

View reviewed changes

balderdash reviewed Apr 26, 2017

View reviewed changes

ashercodeorg approved these changes Apr 26, 2017

View reviewed changes

balderdash approved these changes Apr 26, 2017

View reviewed changes

address some code review

e260bdb

Bjvanminnen merged commit 262d9cd into staging Apr 26, 2017

Bjvanminnen deleted the honeybadger33387488 branch April 26, 2017 20:13

Bjvanminnen mentioned this pull request May 3, 2017

handle empty script #14828

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

restrict to user_levels that have an associated script_level #14639

restrict to user_levels that have an associated script_level #14639

Bjvanminnen commented Apr 25, 2017

Bjvanminnen commented Apr 25, 2017

ashercodeorg left a comment

ashercodeorg Apr 25, 2017

Bjvanminnen Apr 25, 2017

Bjvanminnen Apr 25, 2017

balderdash Apr 25, 2017

Bjvanminnen Apr 25, 2017

ashercodeorg Apr 25, 2017

Bjvanminnen Apr 25, 2017

ashercodeorg Apr 25, 2017

Bjvanminnen Apr 25, 2017

Bjvanminnen Apr 25, 2017

balderdash Apr 26, 2017

Bjvanminnen Apr 26, 2017

ashercodeorg Apr 26, 2017

balderdash Apr 26, 2017

Bjvanminnen Apr 26, 2017

balderdash Apr 26, 2017

ashercodeorg left a comment

ashercodeorg Apr 26, 2017

Bjvanminnen Apr 26, 2017

ashercodeorg Apr 26, 2017

Bjvanminnen Apr 26, 2017

ashercodeorg Apr 26, 2017

ashercodeorg Apr 26, 2017

Bjvanminnen Apr 26, 2017

ashercodeorg Apr 26, 2017

ashercodeorg Apr 26, 2017

ashercodeorg Apr 26, 2017

Bjvanminnen Apr 26, 2017

ashercodeorg Apr 26, 2017

ashercodeorg Apr 26, 2017

		@@ -620,8 +620,16 @@ def user_level_locked?(script_level, level)
		def next_unpassed_progression_level(script)
		user_levels_by_level = user_levels_by_level(script)

restrict to user_levels that have an associated script_level #14639

restrict to user_levels that have an associated script_level #14639

Conversation

Bjvanminnen commented Apr 25, 2017

Bjvanminnen commented Apr 25, 2017

ashercodeorg left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ashercodeorg left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment