Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Speed up state res in rare case we don't have all events #16116

Merged
merged 4 commits into from Aug 18, 2023

Conversation

erikjohnston
Copy link
Member

If we don't have all the auth events in a room then not all state events will have a chain cover index. Even so, we can still use the chain cover index on the events that do have it, rather than bailing and using the slower functions.

This situation should not arise for newly persisted rooms, as we check we have the full auth chain for each event, but can happen for existing rooms.

c.f. #15245

If we don't have all the auth events in a room then not all state events
will have a chain cover index. Even so, we can still use the chain cover
index on the events that do have it, rather than bailing and using the
slower functions.

This situation should not arise for newly persisted rooms, as we check
we have the full auth chain for each event, but can happen for existing
rooms.

c.f. #15245
for event_id in state_set:
chain_id, seq_no = chain_info[event_id]
for state_id in state_set:
chain_id, seq_no = chain_info[state_id]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just to make mypy happy, as the changes above caused the type of event_id to change :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grr at how we don't get Rust's shadowing rules in Python

@erikjohnston erikjohnston marked this pull request as ready for review August 15, 2023 15:35
@erikjohnston erikjohnston requested a review from a team as a code owner August 15, 2023 15:35
@reivilibre reivilibre self-assigned this Aug 17, 2023
Copy link
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quite intricate and tricky but I think I get it

synapse/storage/databases/main/event_federation.py Outdated Show resolved Hide resolved
# We pull out those events with their auth events, which gives us enough
# information to construct the auth chain of an event up to auth events
# that have the chain cover index.
sql = """
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I'm following along properly, this pulls out all (event_id, authing_event_id) pairs with event_id being in the set of chain-cover-uncalculated events for this room,

then annotates these (event_id, authing_event_id) pairs with whether the auth event has a chain cover...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then annotates these (event_id, authing_event_id) pairs with whether the auth event has a chain cover...

I'm not sure what you mean by "annotate" here? We look at all the events we've pulled out to see if they are indexed and mark them as such?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'annotate' as in labels that keypair with a bool value in a map ;-)

Comment on lines +686 to +698
processing = set(auth_ids)
to_add = set()
while processing:
auth_id = processing.pop()
to_add.add(auth_id)

sub_auth_ids = event_to_auth_ids.get(auth_id)
if sub_auth_ids is None:
continue

processing.update(sub_auth_ids - to_add)

event_id_to_partial_auth_chain[event_id] = to_add
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so event_id_to_partial_auth_chain[event_id] is the transitive auth-event closure of event_id?

We say 'partial auth chain' here... what's the partial in respect to?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't pull out the full auth chain, we stop when the auth events are indexed.

synapse/storage/databases/main/event_federation.py Outdated Show resolved Hide resolved
Comment on lines +625 to +627
This modifies `state_sets` so that they only include events that have a
chain cover index, and returns a set of event IDs that are part of the
auth difference.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, here's the hidden mut keyword :p

Took me a moment to notice that modifying state_sets is part of the result but not sure I have a better suggestion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's icky, but here we are. I mostly wanted to avoid needlessly copying the state sets when its easier to just mutate tehm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, reasonable. Half tempted to suggest a _mut suffix convention for this type of thing but probably gets overbearing soon enough. Meh, it will do.

Comment on lines +139 to +140
# degree, we "fork" execution and run the algorithm for each node in the
# zero degree.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. for each node that has no remaining incoming connections, we give it a chance to be the next in the list?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, indeed.

T = TypeVar("T")


def get_all_topologically_sorted_orders(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at what point do you start writing tests for your test helpers haha, but looks good to me — if two humans agree on it, it must be right

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, quite

Comment on lines +177 to +181
for ordering in all_topological_orderings:
ordering.reverse()

for idx in range(len(ordering)):
graph_subsets.add(frozenset(ordering[:idx]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so in each topological ordering then we just choose an arbitrary cut-off point, 'forking' as you put it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup!

return new_paths


def get_all_topologically_consistent_subsets(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test helper also LGTM

for event_id in state_set:
chain_id, seq_no = chain_info[event_id]
for state_id in state_set:
chain_id, seq_no = chain_info[state_id]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grr at how we don't get Rust's shadowing rules in Python

erikjohnston and others added 2 commits August 18, 2023 12:47
Co-authored-by: reivilibre <oliverw@matrix.org>
@erikjohnston
Copy link
Member Author

Thanks for the review @reivilibre! Do the answers make sense?

Copy link
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

answers make sense, thanks!

@erikjohnston erikjohnston merged commit bd558a6 into develop Aug 18, 2023
37 checks passed
@erikjohnston erikjohnston deleted the erikj/fix_state branch August 18, 2023 14:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants