Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accounts_loading: use load account with cache for filtering #1408

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

HaoranYi
Copy link

@HaoranYi HaoranYi commented May 17, 2024

Problem

account_matches_owner doesn't put missed account into read-only cache, which can cause one additional read-only cache misse in subsequent transaction accounts loading.

Summary of Changes

Replace account_matches_owner with regular accounts load.

Fixes #

@HaoranYi HaoranYi changed the title use load account with cache for filtering accounts_loading: use load account with cache for filtering May 17, 2024
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.7%. Comparing base (20ee70c) to head (64cef8c).
Report is 8 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #1408     +/-   ##
=========================================
- Coverage    82.7%    82.7%   -0.1%     
=========================================
  Files         871      871             
  Lines      369894   369992     +98     
=========================================
+ Hits       306158   306226     +68     
- Misses      63736    63766     +30     

@HaoranYi
Copy link
Author

image

image

blue: load_account put into cache
red: accounts_match_owner

blue has better load_time and less cache misses.

@HaoranYi HaoranYi requested a review from Lichtso May 17, 2024 17:18
@HaoranYi HaoranYi marked this pull request as ready for review May 17, 2024 17:18
@jeffwashington
Copy link

another impl here would be to modify account_matches_owners to check owner using the read cache, but if it is not in the read cache, to look on disk as it does today. This change does cause us to put the entry IN the read cache. It also returns an arc to the data, which isn't necessary since we're just checking owners (This is what getting an AccountSharedData does.). Do we want the account in the read cache? If so, then it is cheap to get AccountSharedData to just check owner. If it we don't want it in the read cache, then it is unnecessarily expensive to add to the read cache just to check the owner (if it is not in the read cache already). Do we have metrics on how often account_matches_owners is called for entries IN the write or read caches?

@Lichtso
Copy link

Lichtso commented May 17, 2024

This change does cause us to put the entry IN the read cache.
Do we want the account in the read cache?

That is intentional, we wanted to runt the counter experiment of #1157 (comment).

It also returns an arc to the data, which isn't necessary since we're just checking owners

That is acting as a pre-fetch for the subsequent actual load in transaction account loading.

Copy link

@jeffwashington jeffwashington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
Sounds like the side effect of putting these accounts in the read cache (if they aren't already in the write cache) is desired.

@brooksprumo
Copy link

account_matches_owner doesn't leverage read_only cache and always goes to the underlying storage, which is inefficient.

When I look at the impl for accounts_matches_owner(), it looks to me that we do check the read cache:

pub fn account_matches_owners(
&self,
ancestors: &Ancestors,
account: &Pubkey,
owners: &[Pubkey],
) -> Result<usize, MatchAccountOwnerError> {
let (slot, storage_location, _maybe_account_accesor) = self
.read_index_for_accessor_or_load_slow(ancestors, account, None, false)
.ok_or(MatchAccountOwnerError::UnableToLoad)?;
if !storage_location.is_cached() {
let result = self.read_only_accounts_cache.load(*account, slot);
if let Some(account) = result {
return if account.is_zero_lamport() {
Err(MatchAccountOwnerError::NoMatch)
} else {
owners
.iter()
.position(|entry| account.owner() == entry)
.ok_or(MatchAccountOwnerError::NoMatch)
};
}
}
let (account_accessor, _slot) = self
.retry_to_get_account_accessor(
slot,
storage_location,
ancestors,
account,
None,
LoadHint::Unspecified,
)
.ok_or(MatchAccountOwnerError::UnableToLoad)?;
account_accessor.account_matches_owners(owners)
}

but we do not put the loaded account into the read cache if it's not there. Is the problem this PR is solving actually to put the account into the read cache?

@Lichtso
Copy link

Lichtso commented May 23, 2024

Is the problem this PR is solving actually to put the account into the read cache?

Yes, that is the idea. Though, I suspect it will make little difference because in the subsequent transaction account loading we put programs in the read cache, so in the next TX batch they are already in.

@brooksprumo
Copy link

Though, I suspect it will make little difference because in the subsequent transaction account loading we put programs in the read cache, so in the next TX batch they are already in.

I see. Is this PR still the right direction then? The results from @HaoranYi indicate an improvement. Is that still accurate?

@HaoranYi
Copy link
Author

HaoranYi commented May 23, 2024

The profiler from @alessandrod shows that account_match_owner causes 2/3 of all the page faults. We had thought that the page faults were due to not using read cache. Apparently, that's not the reason. Maybe the page faults are due to accounts_index.
@alessandrod When you have time, can you run your profiler with this PR to see if the page faults are reduced?

@HaoranYi
Copy link
Author

HaoranYi commented May 23, 2024

Is the problem this PR is solving actually to put the account into the read cache?

Yes, that is the idea. Though, I suspect it will make little difference because in the subsequent transaction account loading we put programs in the read cache, so in the next TX batch they are already in.

Putting into the cache after checking owners helps the account loading later in the same TX batch. Instead of two read-cache misses, we only have one miss. That's probably why read cache miss is lower with this PR.

Copy link

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@alessandrod
Copy link

@alessandrod When you have time, can you run your profiler with this PR to see if the page faults are reduced?

Yep planning to do this today, please don't merge yet

@alessandrod
Copy link

Screenshot 2024-05-23 at 11 24 16 am

As you can see the cursed account_matches_owners is 59% of execution. About 3/4 is AppendVec::account_matches_owners - notice AppendVec - meaning we're hitting disk. The rest is the index.

if !storage_location.is_cached() {
this will only look inside the read cache, not the write cache. If the account is not in the read cache, it'll hit disk. Since we pass all accounts to account_matches_owners, we're probably hitting disk for anything that is in the write cache.

We should probably just remove this cursed function. Load all the accounts, then filter the programs later once you have the accounts.

@HaoranYi
Copy link
Author

HaoranYi commented May 23, 2024

We should probably just remove this cursed function. Load all the accounts, then filter the programs later once you have the accounts.

That's similar to #1157. The only difference is that we don't put program accounts into read cache in #1157.
Should we update #1157 to put program accounts into read cache too?

@HaoranYi
Copy link
Author

Performance comparison for a longer time frame 24h
load time and replay time

red: master; blue: this PR.

image
image

@alessandrod
Copy link

We should probably just remove this cursed function. Load all the accounts, then filter the programs later once you have the accounts.

That's similar to #1157. The only difference is that we don't put program accounts into read cache in #1157. Should we update #1157 to put program accounts into read cache too?

Yeah that sounds better than keeping account_matches_owners

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants