Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify pending object and clear expired items from pending queue #3135

Merged
merged 18 commits into from
May 2, 2024

Conversation

raychu86
Copy link
Contributor

@raychu86 raychu86 commented Feb 28, 2024

Motivation

This PR unifies the pending and callbacks objects in the Pending struct for better tracking/access.

This PR also adds additional logic to properly clear the items from the Pending queue. Previously, we would clear the expired callbacks, but not clear the items from the pending queue; this leave an artifact in the queue for every unique element that was not removed explicitly.

In addition, we were only clearing items when an insert, remove, or num_callbacks function was called for that particular item. So if for any reason, the nodes are unable to complete the pending request, the item/callbacks will stay in the queue until the node is restarted.

The new changes will properly clear the expired items and periodically clear the pending queue of the expired callbacks and items. This should mitigate the incremental memory growth and also the redundant requests issue.

The CALLBACK_EXPIRATION_IN_SECS was also updated to prevent truncation, which should address the case where the callback expired but the fetch timeout hasn't.

Related PRs

https://github.com/AleoHQ/snarkOS/pull/3088

Copy link
Collaborator

@ljedrz ljedrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left 1 last comment with an additional, optional perf improvement

@raychu86 raychu86 changed the title Clear expired items from pending queue Unify pending object and clear expired items from pending queue Mar 11, 2024
@raychu86 raychu86 requested a review from ljedrz March 11, 2024 23:01
node/bft/src/helpers/pending.rs Outdated Show resolved Hide resolved
node/bft/src/helpers/pending.rs Show resolved Hide resolved
pub fn clear_expired_callbacks(&self) {
let items = self.pending.read().keys().copied().collect::<Vec<T>>();
let now = OffsetDateTime::now_utc().unix_timestamp();
for item in items.into_iter() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once this loop is done, it might make sense to call .shrink_to_fit on self.pending, as long as we don't call this method too often

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ljedrz We call clear_expired_callbacks quite often - on an interval, so I'd imagine we'd need some way to determine when it'd be appropriate to call shrink_to_fit.

Copy link
Collaborator

@ljedrz ljedrz Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall having seen this collection in the past heap profiles, but I'll keep it in mind, then 👍.

let item = item.into();

// Acquire the pending lock.
let mut pending = self.pending.write();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lock is acquired in a loop; it might make sense to tweak the API so that it's done just once for the duration of the clear_expired_callbacks loop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved in ProvableHQ@964599b.

This also removes a use of collect.

@ljedrz Can you review the new change?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍; we can now consider if we still need the clear_expired_callbacks_for_item method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still use the clear_expired_callbacks_for_item on individual calls to num_sent_requests, num_callbacks, and insert. Primarily to clear the expired callbacks in order to fetch the correct output number.

node/bft/src/sync/mod.rs Outdated Show resolved Hide resolved
node/bft/src/worker.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@ljedrz ljedrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few new comments and suggestions.

Copy link
Collaborator

@niklaslong niklaslong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending Lukasz's comments. Left a few small suggestions.

raychu86 and others added 4 commits April 30, 2024 08:54
Co-authored-by: ljedrz <ljedrz@users.noreply.github.com>
Signed-off-by: Raymond Chu <14917648+raychu86@users.noreply.github.com>
@ghostant-1017
Copy link
Contributor

LGTM

@howardwu howardwu merged commit c4e1c89 into mainnet-staging May 2, 2024
24 checks passed
@howardwu howardwu deleted the adjust-timeouts branch May 2, 2024 00:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants