DCJ-708: Refactor query performance #2412
    void testCancelDarCollectionAsChair_ChairHasDatasets() {
      User user = new User();
      user.setEmail("email");
      user.setUserId(RandomUtils.nextInt(1, 10));
why a random user id for testing?
The only reason is to have something to query on. The actual value is irrelevant.
       * @return List of Dataset IDs
       */
      public List<Integer> findDatasetIdsByDACUser(User user) {
        return datasetDAO.findDatasetIdsByDACUserId(user.getUserId());
why change from user email to user id?
The original code used string comparison, which can be flaky depending on casing, etc. In theory, both should work, but I feel like the integer primary key is more stable than the email.
Yes, typically you want to query by the integer primary key because it will be indexed, which makes for fast searches.
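To make the casing point concrete, here is a minimal in-memory sketch. The `Membership` class, the sample rows, and both helper methods are hypothetical and are not part of this PR or the project's schema. It only illustrates why an exact string match on email can miss rows that an integer id match finds; the indexing benefit mentioned above is a database concern and is not modeled here.

```java
import java.util.List;
import java.util.stream.Collectors;

public class DacLookupSketch {

  // Hypothetical row linking a DAC member to a dataset; not the project's schema.
  static final class Membership {
    final int userId;
    final String email;
    final int datasetId;

    Membership(int userId, String email, int datasetId) {
      this.userId = userId;
      this.email = email;
      this.datasetId = datasetId;
    }
  }

  // Exact string match: "Chair@example.org" and "chair@example.org" are different keys.
  static List<Integer> datasetIdsByEmail(List<Membership> rows, String email) {
    return rows.stream()
        .filter(r -> r.email.equals(email))
        .map(r -> r.datasetId)
        .collect(Collectors.toList());
  }

  // Integer match: there is no casing or whitespace to normalize.
  static List<Integer> datasetIdsByUserId(List<Membership> rows, int userId) {
    return rows.stream()
        .filter(r -> r.userId == userId)
        .map(r -> r.datasetId)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // The same user (id 1) was stored with two different email casings.
    List<Membership> rows = List.of(
        new Membership(1, "Chair@example.org", 10),
        new Membership(1, "chair@example.org", 11),
        new Membership(2, "member@example.org", 12));

    System.out.println(datasetIdsByEmail(rows, "chair@example.org")); // [11]
    System.out.println(datasetIdsByUserId(rows, 1)); // [10, 11]
  }
}
```

The email lookup silently drops dataset 10 because of the capital "C", while the id lookup returns both datasets.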
rjohanek left a comment
this looks great! so much more efficient!
looks good, it might be worth describing the performance increase in the description 👍
Addresses
Ticket: https://broadworkbench.atlassian.net/browse/DCJ-708
Summary
In this PR, we fix a long-running query bug that caused the instance to crash. For the user, this showed up as a never-ending loading screen followed by a redirect back to their console with an error message. The root cause is that populating a large list of datasets is very expensive; it led to out-of-memory errors (OOMs) and made the pod crash and restart.
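As a rough illustration of why working with IDs is cheaper than populating full dataset objects, the sketch below checks whether a chair has any dataset in a collection using only integer IDs. The class name, method name, and sample IDs are hypothetical; this is one possible shape under those assumptions, not the code in this PR.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DatasetIdCheckSketch {

  // Hypothetical helper: true if any dataset ID in the collection is one the chair's DAC owns.
  // Only integers are held in memory; no full dataset objects are populated.
  static boolean chairHasAnyDataset(List<Integer> chairDatasetIds,
                                    List<Integer> collectionDatasetIds) {
    Set<Integer> owned = new HashSet<>(chairDatasetIds);
    return collectionDatasetIds.stream().anyMatch(owned::contains);
  }

  public static void main(String[] args) {
    System.out.println(chairHasAnyDataset(List.of(1, 2, 3), List.of(3, 9))); // true
    System.out.println(chairHasAnyDataset(List.of(1, 2), List.of(9))); // false
  }
}
```

In a sketch like this, the ID list would come from something like `findDatasetIdsByDACUser`, so the memory cost scales with the number of IDs and not with the size of each dataset record.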
Fixes:
Have you read CONTRIBUTING.md lately? If not, do that first.