Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Store state groups separately from events #2784

Merged
merged 15 commits into from Feb 6, 2018

Conversation

erikjohnston
Copy link
Member

We want to be able to calculate and use state groups on workers other than the master process. To make this easier we store new state groups immediately from the state handler, rather than later when the event is stored.

This means that the state group IDs have to be generated in the DB rather than in process, since multiple workers will potentially be generating them.

@erikjohnston
Copy link
Member Author

Oh, I suppose we should rename StateReadStore, as it now also writes rather than just reads?

@richvdh
Copy link
Member

richvdh commented Feb 5, 2018

Oh, I suppose we should rename StateReadStore, as it now also writes rather than just reads?

yes please

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the tests at all; generally much of this is mysterious to me :/

@@ -1923,11 +1923,14 @@ def do_auth(self, origin, event, context, auth_events):
logger.warn("Failed auth resolution for %r because %s", event, e)
raise e

def _update_context_for_auth_events(self, context, auth_events,
@defer.inlineCallbacks
def _update_context_for_auth_events(self, event, context, auth_events,
event_key):
"""Update the state_ids in an event context after auth event resolution
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you mention that this will also store the state group?


def run_create(cur, database_engine, *args, **kwargs):
if isinstance(database_engine, PostgresEngine):
cur.execute("CREATE SEQUENCE state_group_id_seq")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you not need to initialise this? (see ab8567a which I did when playing with this)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, yes. I've cherry-picked that.

where_clause="type='m.room.member'",
)

def _have_persisted_state_group_txn(self, txn, state_group):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks to be redundant now?

@@ -744,7 +744,7 @@ def _persist_events_txn(self, txn, events_and_contexts, backfilled,

# Insert into the state_groups, state_groups_state, and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is misleading now

@@ -982,7 +982,7 @@ def _update_outliers_txn(self, txn, events_and_contexts):
# insert into the state_group, state_groups_state and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

synapse/state.py Outdated
context.state_group = yield self.store.store_state_group(
event.event_id,
event.room_id,
prev_group=entry.prev_group,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this entry.prev_group rather than context.prev_group as calculated above? (any idea when entry.state_group is falsy so we do the prev_group thing?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally the logic in compute_event_context is mysterious :/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this entry.prev_group rather than context.prev_group as calculated above?

That was a typo tbh.

(any idea when entry.state_group is falsy so we do the prev_group thing?)

entry.state_group is falsey if the resolved state doesn't already have a state group. This would typically happen when the prev events for the event point to two different state groups.

generally the logic in compute_event_context is mysterious :/

I've tried to add some comments

@@ -229,6 +236,14 @@ def compute_event_context(self, event, old_state=None):
else:
context.current_state_ids = context.prev_state_ids

context.state_group = yield self.store.store_state_group(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to create a new state_group if it's not a state event?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. Basically, we've been explicitly told the state at this point (old_state) and we need to persist that. This e.g. happens when we're backfilling and we're told the state at this oldest event. We could try and be clever and check if we've already persisted a state group with identical state, but we're not that clever atm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -979,8 +978,7 @@ def _update_outliers_txn(self, txn, events_and_contexts):
# an outlier in the database. We now have some state at that
# so we need to update the state_groups table with that state.

# insert into the state_group, state_groups_state and
# event_to_state_groups tables.
# insert into event_to_state_groups tables.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

synapse/state.py Outdated
defer.returnValue(context)

if old_state:
# We already have the state, so we don't to calculate it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't to

synapse/state.py Outdated
defer.returnValue(context)

if old_state:
# We already have the state, so we don't to calculate it.
# Lets just correctly fill out the context and create a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's

synapse/state.py Outdated
@@ -208,15 +215,26 @@ def compute_event_context(self, event, old_state=None):
context.current_state_ids = {}
context.prev_state_ids = {}
context.prev_state_events = []
context.state_group = self.store.get_next_state_group()

context.state_group = yield self.store.store_state_group(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as per #synapse-dev: we don't currently store a state group here, and it's not obvious to me that the state group we will store will be meaningful.

@@ -229,6 +236,14 @@ def compute_event_context(self, event, old_state=None):
else:
context.current_state_ids = context.prev_state_ids

context.state_group = yield self.store.store_state_group(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

# Insert into the state_groups, state_groups_state, and
# event_to_state_groups tables.
self._store_mult_state_groups_txn(txn, events_and_contexts)
# Insert into event_to_state_groups tables.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/tables/table/. Or just s/tables//

@richvdh richvdh assigned erikjohnston and unassigned richvdh Feb 6, 2018
@erikjohnston erikjohnston assigned richvdh and unassigned erikjohnston Feb 6, 2018
synapse/state.py Outdated
)
# We don't store state for outliers, so we don't generate a state
# froup for it.
context.state_group = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. the state_group field on EventContext is documented as 'state_group (int): state group id', so if it can now be None as well as an int that needs updating, please.

Args:
event_id (str): The event ID for which the state was calculated
room_id (str)
prev_group (str|None): A previous state group for the room, optional.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't state groups ints?

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really reviewed the tests so I'll trust you not to be a crank ;)

@richvdh richvdh removed their assignment Feb 6, 2018
@richvdh
Copy link
Member

richvdh commented Feb 6, 2018

oh btw please squash this, we don't need 15 "fix the comments" commits to pick through and as already established the commits don't stand on their own merits

@erikjohnston
Copy link
Member Author

I haven't really reviewed the tests so I'll trust you not to be a crank ;)

Just be grateful I didn't rage delete them :p

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants