Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Reduce replication traffic due to reflected cache stream POSITION #16557

Merged
merged 3 commits into from Oct 27, 2023

Conversation

erikjohnston
Copy link
Member

This was exacerbated by #16473

synapse/replication/tcp/resource.py Show resolved Hide resolved
Comment on lines 210 to 211
# echo each write, and b) nothing cares if a given
# worker's caches stream position lags.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that workers don't need to track other workers cache stream positions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is nothing uses the global "persisted upto position" for the caches stream, as everything that uses the caches streams only care about each streams position, not any sort of "combined" position. The reason we care about that for other streams is there is some code that still thinks that all streams can be considered as a single integer (e.g. federation and application sending code).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

global "persisted upto position" for the caches stream

Just to refresh my memory -- this is the minimum persisted position across all workers, correct?

everything that uses the caches streams only care about each streams position

Because the caches stream gets blasted to everyone and they just clear their caches but we don't really care about tracking it? Do we even care about each stream's position in it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note I've just pushed a commit that changes things a bit)

global "persisted upto position" for the caches stream

Just to refresh my memory -- this is the minimum persisted position across all workers, correct?

Yup.

everything that uses the caches streams only care about each streams position

Because the caches stream gets blasted to everyone and they just clear their caches but we don't really care about tracking it? Do we even care about each stream's position in it?

There are two usages of the caches stream:

  1. Reliably receiving updates about cache invalidations that happened on other workers. We use the stream position here purely to detect when gaps happened.
  2. We sometimes want to wait for the caches stream to get to a particular position (this where we do a HTTP request and we want to wait for the action to replicate to the requesting worker after receiving a response).

@erikjohnston erikjohnston force-pushed the erikj/reduce_caches_position_traffic branch from 4d8735a to 22b2cca Compare October 26, 2023 14:37
@erikjohnston erikjohnston marked this pull request as ready for review October 27, 2023 08:46
@erikjohnston erikjohnston requested a review from a team as a code owner October 27, 2023 08:46
Copy link
Contributor

@clokep clokep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fairly certain this is OK.

@erikjohnston erikjohnston merged commit 0680d76 into develop Oct 27, 2023
41 checks passed
@erikjohnston erikjohnston deleted the erikj/reduce_caches_position_traffic branch October 27, 2023 11:51
erikjohnston added a commit that referenced this pull request Oct 27, 2023
Follow on from / actually correctly does #16557
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants