-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update user listen history query to use new indexed table #2185
Conversation
9256907
to
a3890cb
Compare
a3890cb
to
fe91ef8
Compare
@@ -371,7 +371,6 @@ def format_offset(args, max_offset=MAX_LIMIT): | |||
def get_default_max(value, default, max=None): | |||
if not isinstance(value, int): | |||
return default | |||
elif max is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think new linter asked for this change.
# bundle peripheral info into track results | ||
tracks = populate_track_metadata(session, track_ids, tracks, current_user_id) | ||
|
||
if args.get("with_users", False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with_users is always true and user data is already populated due to model relationship.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we miss a step where we need to migrate existing user history into this table?
tracks = helpers.query_result_to_list(sorted_track_results) | ||
|
||
# bundle peripheral info into track results | ||
tracks = populate_track_metadata(session, track_ids, tracks, current_user_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can shave a round trip to the DB by just having populate_track_metadata grab the raw track themselves
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah I'll see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I don't see a way we can avoid querying for track data. Do you see a potential optimization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah i think you're right here. In the future, probably we could rewrite populate_track_metadata to be join based rather than 2 round trips
"with_users": True, | ||
} | ||
track_history = get_track_history(get_tracks_args) | ||
get_tracks_args = GetUserListeningHistoryArgs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while I agree with your change (that we should be using current user id), that's not how this worked before and it makes this endpoint semantically kind of weird because you could specify user_id
and it effectively does nothing.
can we keep that existing behavior and pass both current_user_id and user_id through to the queries file?
To get the behavior you added, we could just do
if (user_id != current_user_id) return []
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup will do 👍
Like this? https://github.com/AudiusProject/audius-protocol/pull/2133/files It's not in prod yet so I'm gonna wait until it's populated before merging this. |
tracks = helpers.query_result_to_list(sorted_track_results) | ||
|
||
# bundle peripheral info into track results | ||
tracks = populate_track_metadata(session, track_ids, tracks, current_user_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah i think you're right here. In the future, probably we could rewrite populate_track_metadata to be join based rather than 2 round trips
75de877
to
e9f9b0b
Compare
Description
Query user_listening_history table directly instead of plays. This will enable pruning old plays. The only difference between the response should be deduplication.
Tests
Tested against prod snapshot. Confirmed de-duplication.
Remote box pointed to prod snapshot can take from 2 seconds to 30s. About the same as master pointed to prod snapshot.. This approach avoids querying plays directly but does query track. Not sure if there's a better way.
How will this change be monitored?
Manually checking /history.