Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

restrict to user_levels that have an associated script_level #14639

Merged
merged 3 commits into from
Apr 26, 2017

Conversation

Bjvanminnen
Copy link
Contributor

fix for https://app.honeybadger.io/projects/3240/faults/33387488

This HB alert is curious for a few reasons.
(1) It popped up after yesterday's DTP, but this code has been in place for some time.
(2) One of the instances I investigated had what I believe to be bad data, namely a UserLevel where the associated script_id was for a script that does not contain this level. I'm not sure how that came about.

That said, this did all reveal what I believe to be a legit bug.

If you're on a script and you answer a question within a level group, we'll make milestone requests for both the particular level and the level group. If the LevelGroup goes through first, then you end up in a situation where your most recent UserLevel points to a Level (the multi inside the LevelGroup) that has no script_level.

Solution is to filter to user_levels that will allow us to get back to a script_level (so that line 628/633 doesn't result in nil)

@Bjvanminnen
Copy link
Contributor Author

Assigned Ram to review, because he reviewed my last PR related to next_unpassed_progression_level. Asher, because he often has good ideas these sorts of things.

Copy link
Contributor

@ashercodeorg ashercodeorg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we verified that there are no extra DB hits as a result of this change? In particular, I'm concerned about a N+1 query fanout. Though we cache the script hierarchy, I think this only happens when you start with a cached object, not when you access a (say) level from a user_level.

If you are hitting the DB, a hack might be to map the user_level.level_id to Level.cache_find(level_id). Though that is pretty hacky.

@@ -619,9 +619,14 @@ def user_level_locked?(script_level, level)
# script that hasn't yet been passed, starting its search at the last level we submitted
def next_unpassed_progression_level(script)
user_levels_by_level = user_levels_by_level(script)
user_levels = user_levels_by_level.values.flatten.select do |user_level|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems simpler to do user_levels.where(script: script) than user_levels_by_level(script).values.flatten if you want to simplify pre-existing code. Aside, I don't understand why the flatten is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the cases I was looking at was an instance where user_level.script_id was actually for a different script than any of the user_levels. I must admit, I dont understand this scenario:

irb(main):062:0> UserLevel.find(985612048).script_id
=> 142
irb(main):063:0> UserLevel.find(985612048).level.script_levels.map{|sl| sl.script_id}
=> [230]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think I misunderstood your comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That userlevel's level used to be in script 142, but was recently removed: 135b583

One benefit to this change is that it will now work properly if your last progress was on a level that has since been removed from the script.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh....that helps explain why we just started hitting this error.

user_levels = user_levels_by_level.values.flatten.select do |user_level|
# we want to ignore levels that dont have an associated script_level (such
# as multis within a LevelGroup)
user_level.level.script_levels.where(script_id: script.id).present?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between present? and any? here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know the right ruby method here. I can probably switch to any?


# Find the user level that we've most recently had progress on
user_level = user_levels_by_level.values.flatten.sort_by!(&:updated_at).last
user_level = user_levels.sort_by!(&:updated_at).last
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid a sort by doing something like max_by {|user_level| user_level.updated_at}?

index_by(&:level_id)

# Find the user_level that we've most recently had progress on
user_level = user_levels_by_level.values.max_by(&:updated_at)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New approach. Instead of calling script_levels on each level (which was a db hit each time), we start out by only getting user_levels for which we have a script_level. The call to script.script_levels should be cached, as is the call to level_ids on each script_level.

Now, on line 636 we're guaranteed to have last_script_level by non-nil, and thus our return will be non-nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also worth noting that this method is likely failing silently in many cases in production right now. Most of the time, when we fail to find a last_script_level, we're okay because there is at least one unpassed progression level elsewhere in the script. In the case where that is not true, this method returns nil, and then we get HB errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to Asher on the performance implications of adding all the level ids to the db query, but my gut feeling is that adapting your previous approach to use the cache would be more efficient (and doesn't seem that hacky to me).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous approaches cache problem was not where we thought. user_level.level and Level.find_cache(user_level.level_id) both end up hitting the cache. However, with both of those, when you end up doing level.script_levels we do not have a cache to hit.

This new approach should be much better at avoiding non-cached db hits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My gut agrees with @balderdash, I think the other approach may be slightly more efficient. That said, I'm fairly convinced this method is efficient enough, so I don't think further changes are necessary. And I really like how it reads.

If you remind me to monitor the DB after this goes in, I'll look at whether this is worth optimizing. Regardless, I'm convinced it won't hurt production noticeably (as the non-cached script_level previously would have).

@@ -620,8 +620,16 @@ def user_level_locked?(script_level, level)
def next_unpassed_progression_level(script)
user_levels_by_level = user_levels_by_level(script)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You no longer need this line now, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Removed it locally, but forgot to push that :(

index_by(&:level_id)

# Find the user_level that we've most recently had progress on
user_level = user_levels_by_level.values.max_by(&:updated_at)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to Asher on the performance implications of adding all the level ids to the db query, but my gut feeling is that adapting your previous approach to use the cache would be more efficient (and doesn't seem that hacky to me).

Copy link
Contributor

@ashercodeorg ashercodeorg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (with my comments).

# some of our user_levels may be for levels within level_groups, or for levels
# that are no longer in this script. we want to ignore those, and only look
# user_levels that have matching script_levels
levels = script.script_levels.map(&:level_ids).flatten
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Is there any advantage to or need for removing duplicates? AFAICT, there is no advantage and no need. But I'll ask the question anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should hope there aren't duplicate levels within the same script?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not possible (a typical use case) with level swapping? Though I imagine the level would only be active in one script_level, I could see it existing (as part of level swapping) in many. Right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may be right. I know little about (and often forget about) this feature.

I guess in the case that a level appears in multiple script_levels, and it's the one you have the most recent progress on, we'll always treat the first of the script_levels containing the level as your current stage - even if that level is not currently active for the first script_level.

we could instead make this something like:
levels = script.script_levels.map{|script_level| script_level.oldest_active_level.id}
which I think would be more resilient to this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I also could be wrong, I'm not certain. As I said, I'm not sure there is any advantage to or need for dealing with this. My vote would be to get the fix in as-is, with a TODO to consider this edge case.

index_by(&:level_id)

# Find the user_level that we've most recently had progress on
user_level = user_levels_by_level.values.max_by(&:updated_at)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be more succinctly written as user_level = user_levels.where(script_id: script.id, level: levels).max_by(&:updated_at).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We end up needing the full list of user_levels_by_level later in the method so that we can get the set of user_levels for a particular script_level. So I think it's the difference between doing a traversal of user_levels to get the max updated at (my current code) or a second (cached) DB query, that's presumably still going to need to do a traversal to get the max.

Given that we do need user_levels_by-level later, I would have thought my approach would have been better (though it probably doesnt make a big difference)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed that usage. Yes, keep as you had it.

index_by(&:level_id)

# Find the user_level that we've most recently had progress on
user_level = user_levels_by_level.values.max_by(&:updated_at)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My gut agrees with @balderdash, I think the other approach may be slightly more efficient. That said, I'm fairly convinced this method is efficient enough, so I don't think further changes are necessary. And I really like how it reads.

If you remind me to monitor the DB after this goes in, I'll look at whether this is worth optimizing. Regardless, I'm convinced it won't hurt production noticeably (as the non-cached script_level previously would have).

updated_at: Time.now
)

assert(user_level1.updated_at < user_level2.updated_at)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes I see value in asserting what's important about the test data I've constructed. Yes, it should never fail, but it also makes it so that if I'm debugging why a later assertion did fail, I can be certain about this particular assumption.

assert(user_level1.updated_at < user_level2.updated_at)

next_script_level = user.next_unpassed_progression_level(script)
puts next_script_level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed.


next_script_level = user.next_unpassed_progression_level(script)
puts next_script_level
assert !next_script_level.nil?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be refute_nil next_script_level.

@Bjvanminnen Bjvanminnen merged commit 262d9cd into staging Apr 26, 2017
@Bjvanminnen Bjvanminnen deleted the honeybadger33387488 branch April 26, 2017 20:13
@Bjvanminnen Bjvanminnen mentioned this pull request May 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants