Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State key NID mappings can be missing #2094

Open
kegsay opened this issue Jan 19, 2022 · 1 comment
Open

State key NID mappings can be missing #2094

kegsay opened this issue Jan 19, 2022 · 1 comment
Labels
C-Roomserver T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Fix-With-Monolith This issue can be resolved with a monolith-only arch

Comments

@kegsay
Copy link
Member

kegsay commented Jan 19, 2022

`error="found 33449 users but only have state key nids for 33447 of them"``

Caused by roomserver code in storage: JoinedUsersSetInRooms:

func (d *Database) JoinedUsersSetInRooms(ctx context.Context, roomIDs []string) (map[string]int, error) {
	roomNIDs, err := d.RoomsTable.BulkSelectRoomNIDs(ctx, roomIDs)
	if err != nil {
		return nil, err
	}
	userNIDToCount, err := d.MembershipTable.SelectJoinedUsersSetForRooms(ctx, roomNIDs)
	if err != nil {
		return nil, err
	}
	stateKeyNIDs := make([]types.EventStateKeyNID, len(userNIDToCount))
	i := 0
	for nid := range userNIDToCount {
		stateKeyNIDs[i] = nid
		i++
	}
	nidToUserID, err := d.EventStateKeysTable.BulkSelectEventStateKey(ctx, stateKeyNIDs)
	if err != nil {
		return nil, err
	}
	if len(nidToUserID) != len(userNIDToCount) {
		return nil, fmt.Errorf("found %d users but only have state key nids for %d of them", len(userNIDToCount), len(nidToUserID))
	}

It shouldn't be possible to have NIDs in the membership table but not in the state keys table, but apparently it is.

@kegsay
Copy link
Member Author

kegsay commented Jan 19, 2022

So the place where InsertMembership is called is in membershipUpdaterTxn which is called just after targetUserNID, err = d.assignStateKeyNID(ctx, txn, targetUserID) when NewMembershipUpdater is made. However, assignStateKeyNID hits a cache first:

func (d *Database) assignStateKeyNID(
	ctx context.Context, txn *sql.Tx, eventStateKey string,
) (types.EventStateKeyNID, error) {
	if eventStateKeyNID, ok := d.Cache.GetRoomServerStateKeyNID(eventStateKey); ok {
		return eventStateKeyNID, nil
	}
	// Check if we already have a numeric ID in the database.
	eventStateKeyNID, err := d.EventStateKeysTable.SelectEventStateKeyNID(ctx, txn, eventStateKey)
	if err == sql.ErrNoRows {
		// We don't have a numeric ID so insert one into the database.
		eventStateKeyNID, err = d.EventStateKeysTable.InsertEventStateKeyNID(ctx, txn, eventStateKey)
		if err == sql.ErrNoRows {
			// We raced with another insert so run the select again.
			eventStateKeyNID, err = d.EventStateKeysTable.SelectEventStateKeyNID(ctx, txn, eventStateKey)
		}
	}
	if err == nil {
		d.Cache.StoreRoomServerStateKeyNID(eventStateKey, eventStateKeyNID)
	}
	return eventStateKeyNID, err
}

If it was possible for it to be in the cache but not the database, this would cause this error. Given this is a rare condition to hit, I wouldn't be surprised if the race logic here doesn't work as intended. If that were the case then the flow could be:

  • Insert state key, get back NID 7.
  • Insert state key, get back NID 8.
  • Cache state key -> 7.
  • Cache state key -> 8.
  • The control flow for 8 bails out (possibly due to conflicts from 7 or some other reason)
  • The control flow for 7 continues to success.
  • Subsequent calls for this state key will return a state key 8 which isn't in the database.

S7evinK added a commit that referenced this issue Nov 11, 2022
This should fix #2696 and possibly other related issues regarding
missing user NIDs.
(#2094?)
@kegsay kegsay added T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. C-Roomserver and removed bug labels Dec 5, 2022
@kegsay kegsay added the X-Fix-With-Monolith This issue can be resolved with a monolith-only arch label Feb 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-Roomserver T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Fix-With-Monolith This issue can be resolved with a monolith-only arch
Projects
None yet
Development

No branches or pull requests

1 participant