Skip to content

Conversation

@rushtong
Copy link
Contributor

@rushtong rushtong commented Oct 16, 2024

Addresses

Ticket: https://broadworkbench.atlassian.net/browse/DCJ-708

Summary

In this PR, we fix a long-running query bug that caused the instance to crash. This manifested in a never-ending loading screen for the user and an eventual redirect back to their console with an error message. The root of the problem is that it is very expensive to populate a large list of datasets which caused OOMs and the pod to crash/restart.

Fixes:

  • Rename service/dao methods to be more reflective of what they are actually doing
  • Query only for the Dataset ID instead of the full dataset object
    • This allows the server to complete the response instead of OOM-ing and crashing the instance
  • Update tests to match new signatures.

Have you read CONTRIBUTING.md lately? If not, do that first.

  • Label PR with a Jira ticket number and include a link to the ticket
  • Label PR with a security risk modifier [no, low, medium, high]
  • PR describes scope of changes
  • Get a minimum of one thumbs worth of review, preferably two if enough team members are available
  • Get PO sign-off for all non-trivial UI or workflow changes
  • Verify all tests go green
  • Test this change deployed correctly and works on dev environment after deployment

@rushtong rushtong marked this pull request as ready for review October 16, 2024 14:23
@rushtong rushtong requested a review from a team as a code owner October 16, 2024 14:23
void testCancelDarCollectionAsChair_ChairHasDatasets() {
User user = new User();
user.setEmail("email");
user.setUserId(RandomUtils.nextInt(1, 10));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a random user id for testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a random user id for testing?

The only reason is to have something to query on. The actual value is irrelevant.

* @return List of Dataset IDs
*/
public List<Integer> findDatasetIdsByDACUser(User user) {
return datasetDAO.findDatasetIdsByDACUserId(user.getUserId());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change from user email to user id?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code was using string comparison which can sometimes be flaky based on casing, etc. In theory, both should work, but I feel like the integer primary key is more stable than than the email.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, typically you want to query by the integer primary key because it will be indexed, which makes for fast searches.

Copy link
Contributor

@rjohanek rjohanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks great! so much more efficient!

Copy link
Contributor

@fboulnois fboulnois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, it might be worth describing the performance increase in the description 👍

* @return List of Dataset IDs
*/
public List<Integer> findDatasetIdsByDACUser(User user) {
return datasetDAO.findDatasetIdsByDACUserId(user.getUserId());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, typically you want to query by the integer primary key because it will be indexed, which makes for fast searches.

@rushtong rushtong merged commit 6edb7af into develop Oct 16, 2024
@rushtong rushtong deleted the gr-DCJ-708-refactor-query branch October 16, 2024 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants