-
Notifications
You must be signed in to change notification settings - Fork 481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
restrict to user_levels that have an associated script_level #14639
Conversation
Assigned Ram to review, because he reviewed my last PR related to next_unpassed_progression_level. Asher, because he often has good ideas these sorts of things. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we verified that there are no extra DB hits as a result of this change? In particular, I'm concerned about a N+1
query fanout. Though we cache the script hierarchy, I think this only happens when you start with a cached object, not when you access a (say) level
from a user_level
.
If you are hitting the DB, a hack might be to map the user_level.level_id
to Level.cache_find(level_id)
. Though that is pretty hacky.
dashboard/app/models/user.rb
Outdated
@@ -619,9 +619,14 @@ def user_level_locked?(script_level, level) | |||
# script that hasn't yet been passed, starting its search at the last level we submitted | |||
def next_unpassed_progression_level(script) | |||
user_levels_by_level = user_levels_by_level(script) | |||
user_levels = user_levels_by_level.values.flatten.select do |user_level| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems simpler to do user_levels.where(script: script)
than user_levels_by_level(script).values.flatten
if you want to simplify pre-existing code. Aside, I don't understand why the flatten
is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the cases I was looking at was an instance where user_level.script_id was actually for a different script than any of the user_levels. I must admit, I dont understand this scenario:
irb(main):062:0> UserLevel.find(985612048).script_id
=> 142
irb(main):063:0> UserLevel.find(985612048).level.script_levels.map{|sl| sl.script_id}
=> [230]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I think I misunderstood your comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That userlevel's level used to be in script 142, but was recently removed: 135b583
One benefit to this change is that it will now work properly if your last progress was on a level that has since been removed from the script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh....that helps explain why we just started hitting this error.
dashboard/app/models/user.rb
Outdated
user_levels = user_levels_by_level.values.flatten.select do |user_level| | ||
# we want to ignore levels that dont have an associated script_level (such | ||
# as multis within a LevelGroup) | ||
user_level.level.script_levels.where(script_id: script.id).present? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between present?
and any?
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know the right ruby method here. I can probably switch to any?
dashboard/app/models/user.rb
Outdated
|
||
# Find the user level that we've most recently had progress on | ||
user_level = user_levels_by_level.values.flatten.sort_by!(&:updated_at).last | ||
user_level = user_levels.sort_by!(&:updated_at).last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid a sort by doing something like max_by {|user_level| user_level.updated_at}
?
dashboard/app/models/user.rb
Outdated
index_by(&:level_id) | ||
|
||
# Find the user_level that we've most recently had progress on | ||
user_level = user_levels_by_level.values.max_by(&:updated_at) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New approach. Instead of calling script_levels on each level (which was a db hit each time), we start out by only getting user_levels for which we have a script_level. The call to script.script_levels
should be cached, as is the call to level_ids on each script_level.
Now, on line 636 we're guaranteed to have last_script_level by non-nil, and thus our return will be non-nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also worth noting that this method is likely failing silently in many cases in production right now. Most of the time, when we fail to find a last_script_level
, we're okay because there is at least one unpassed progression level elsewhere in the script. In the case where that is not true, this method returns nil, and then we get HB errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll defer to Asher on the performance implications of adding all the level ids to the db query, but my gut feeling is that adapting your previous approach to use the cache would be more efficient (and doesn't seem that hacky to me).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous approaches cache problem was not where we thought. user_level.level
and Level.find_cache(user_level.level_id)
both end up hitting the cache. However, with both of those, when you end up doing level.script_levels
we do not have a cache to hit.
This new approach should be much better at avoiding non-cached db hits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My gut agrees with @balderdash, I think the other approach may be slightly more efficient. That said, I'm fairly convinced this method is efficient enough, so I don't think further changes are necessary. And I really like how it reads.
If you remind me to monitor the DB after this goes in, I'll look at whether this is worth optimizing. Regardless, I'm convinced it won't hurt production noticeably (as the non-cached script_level
previously would have).
dashboard/app/models/user.rb
Outdated
@@ -620,8 +620,16 @@ def user_level_locked?(script_level, level) | |||
def next_unpassed_progression_level(script) | |||
user_levels_by_level = user_levels_by_level(script) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You no longer need this line now, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Removed it locally, but forgot to push that :(
dashboard/app/models/user.rb
Outdated
index_by(&:level_id) | ||
|
||
# Find the user_level that we've most recently had progress on | ||
user_level = user_levels_by_level.values.max_by(&:updated_at) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll defer to Asher on the performance implications of adding all the level ids to the db query, but my gut feeling is that adapting your previous approach to use the cache would be more efficient (and doesn't seem that hacky to me).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (with my comments).
dashboard/app/models/user.rb
Outdated
# some of our user_levels may be for levels within level_groups, or for levels | ||
# that are no longer in this script. we want to ignore those, and only look | ||
# user_levels that have matching script_levels | ||
levels = script.script_levels.map(&:level_ids).flatten |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is there any advantage to or need for removing duplicates? AFAICT, there is no advantage and no need. But I'll ask the question anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should hope there aren't duplicate levels within the same script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not possible (a typical use case) with level swapping? Though I imagine the level would only be active in one script_level, I could see it existing (as part of level swapping) in many. Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may be right. I know little about (and often forget about) this feature.
I guess in the case that a level appears in multiple script_levels, and it's the one you have the most recent progress on, we'll always treat the first of the script_levels containing the level as your current stage - even if that level is not currently active for the first script_level.
we could instead make this something like:
levels = script.script_levels.map{|script_level| script_level.oldest_active_level.id}
which I think would be more resilient to this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I also could be wrong, I'm not certain. As I said, I'm not sure there is any advantage to or need for dealing with this. My vote would be to get the fix in as-is, with a TODO to consider this edge case.
dashboard/app/models/user.rb
Outdated
index_by(&:level_id) | ||
|
||
# Find the user_level that we've most recently had progress on | ||
user_level = user_levels_by_level.values.max_by(&:updated_at) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be more succinctly written as user_level = user_levels.where(script_id: script.id, level: levels).max_by(&:updated_at)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We end up needing the full list of user_levels_by_level later in the method so that we can get the set of user_levels for a particular script_level. So I think it's the difference between doing a traversal of user_levels to get the max updated at (my current code) or a second (cached) DB query, that's presumably still going to need to do a traversal to get the max.
Given that we do need user_levels_by-level later, I would have thought my approach would have been better (though it probably doesnt make a big difference)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I missed that usage. Yes, keep as you had it.
dashboard/app/models/user.rb
Outdated
index_by(&:level_id) | ||
|
||
# Find the user_level that we've most recently had progress on | ||
user_level = user_levels_by_level.values.max_by(&:updated_at) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My gut agrees with @balderdash, I think the other approach may be slightly more efficient. That said, I'm fairly convinced this method is efficient enough, so I don't think further changes are necessary. And I really like how it reads.
If you remind me to monitor the DB after this goes in, I'll look at whether this is worth optimizing. Regardless, I'm convinced it won't hurt production noticeably (as the non-cached script_level
previously would have).
updated_at: Time.now | ||
) | ||
|
||
assert(user_level1.updated_at < user_level2.updated_at) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes I see value in asserting what's important about the test data I've constructed. Yes, it should never fail, but it also makes it so that if I'm debugging why a later assertion did fail, I can be certain about this particular assumption.
dashboard/test/models/user_test.rb
Outdated
assert(user_level1.updated_at < user_level2.updated_at) | ||
|
||
next_script_level = user.next_unpassed_progression_level(script) | ||
puts next_script_level |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed.
dashboard/test/models/user_test.rb
Outdated
|
||
next_script_level = user.next_unpassed_progression_level(script) | ||
puts next_script_level | ||
assert !next_script_level.nil? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be refute_nil next_script_level
.
fix for https://app.honeybadger.io/projects/3240/faults/33387488
This HB alert is curious for a few reasons.
(1) It popped up after yesterday's DTP, but this code has been in place for some time.
(2) One of the instances I investigated had what I believe to be bad data, namely a UserLevel where the associated script_id was for a script that does not contain this level. I'm not sure how that came about.
That said, this did all reveal what I believe to be a legit bug.
If you're on a script and you answer a question within a level group, we'll make milestone requests for both the particular level and the level group. If the LevelGroup goes through first, then you end up in a situation where your most recent UserLevel points to a Level (the multi inside the LevelGroup) that has no script_level.
Solution is to filter to user_levels that will allow us to get back to a script_level (so that line 628/633 doesn't result in nil)