@MadLittleMods MadLittleMods commented Aug 27, 2025

Remove sentinel logcontext where we log in setup, start, and exit.

Instead of having one giant PR that removes every place we use the sentinel logcontext, I've decided to tackle this piecemeal. This PR covers the logs you see if you just start up Synapse and exit, with no requests or activity in between.

Part of #18905 (Remove sentinel logcontext where we log in Synapse)

Prerequisite for #18868. Logging with the sentinel logcontext means we won't know which server the log came from.

Why

Ideally, nothing from the Synapse homeserver would be logged against the `sentinel`
logcontext as we want to know which server the logs came from. In practice, this is not
always the case yet, especially outside of request handling.
Global things outside of Synapse (e.g. Twisted reactor code) should run in the
`sentinel` logcontext. It's only when it calls into application code that a logcontext
gets activated. This means the reactor should be started in the `sentinel` logcontext,
and any time an awaitable yields control back to the reactor, it should reset the
logcontext to be the `sentinel` logcontext. This is important to avoid leaking the
current logcontext to the reactor (which would then get picked up and associated with
the next thing the reactor does).

(docs updated in #18900)
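
To make the leak described above concrete, here is a minimal toy sketch in plain Python. It does not use Synapse or Twisted; `current_context`, `set_context`, the two handlers, and `run_reactor` are all hypothetical stand-ins for the real logcontext machinery and the reactor loop.

```python
SENTINEL = "sentinel"

# Toy stand-in for the "current logcontext" (hypothetical, not Synapse's API).
current_context = SENTINEL


def set_context(name):
    """Switch the current context and return the previous one."""
    global current_context
    previous = current_context
    current_context = name
    return previous


def leaky_handler():
    # Application code activates its own context...
    set_context("GET-1")
    # ...but returns to the "reactor" without restoring the sentinel.


def clean_handler():
    previous = set_context("GET-1")
    try:
        pass  # application work would happen here
    finally:
        # Reset before control passes back to the "reactor".
        set_context(previous)


def run_reactor(handlers):
    """Toy reactor: call each handler, then log something of its own."""
    lines = []
    for handler in handlers:
        handler()
        lines.append(f"{current_context} - reactor did something unrelated")
    return lines
```

With `leaky_handler`, the reactor's own log line ends up attributed to `GET-1`; with `clean_handler` it is attributed to `sentinel`, which is the behavior the docs ask for.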

Testing strategy

  1. Run Synapse normally and with `daemonize: true`: `poetry run synapse_homeserver --config-path homeserver.yaml`
  2. Execute some requests
  3. Shut down the server
  4. Look for any bad log entries in your homeserver logs:
    • `Expected logging context sentinel but found main`
    • `Expected logging context main was lost`
    • `Expected previous context`
    • `utime went backwards!`/`stime went backwards!`
    • `Called stop on logcontext POST-0 without recording a start rusage`
  5. Look for any logs coming from the `sentinel` context

With these changes, you should only see the following logs (not from Synapse) using the sentinel context if you start up Synapse and exit:

homeserver.log

2025-09-10 14:45:39,924 - asyncio - 64 - DEBUG - sentinel - Using selector: EpollSelector

2025-09-10 14:45:40,562 - twisted - 281 - INFO - sentinel - Received SIGINT, shutting down.

2025-09-10 14:45:40,562 - twisted - 281 - INFO - sentinel - (TCP Port 9322 Closed)
2025-09-10 14:45:40,563 - twisted - 281 - INFO - sentinel - (TCP Port 8008 Closed)
2025-09-10 14:45:40,563 - twisted - 281 - INFO - sentinel - (TCP Port 9093 Closed)
2025-09-10 14:45:40,564 - twisted - 281 - INFO - sentinel - Main loop terminated.
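
The log checks from the testing strategy can be roughly automated. The sketch below is only an illustration: it assumes the `timestamp - logger - line - level - context - message` layout shown in the sample above (your logging config may differ), and `scan`, `BAD_MESSAGES`, and `LINE_RE` are names made up for this example.

```python
import re

# Substrings taken from the "bad log entries" list in the testing strategy.
BAD_MESSAGES = (
    "Expected logging context sentinel but found main",
    "Expected logging context main was lost",
    "Expected previous context",
    "utime went backwards!",
    "stime went backwards!",
    "without recording a start rusage",
)

# Assumed layout: "<date> <time> - <logger> - <lineno> - <level> - <context> - <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+ \S+) - (?P<logger>\S+) - (?P<lineno>\d+) - "
    r"(?P<level>\w+) - (?P<context>\S+) - (?P<message>.*)$"
)


def scan(lines):
    """Return (bad_lines, sentinel_entries) found in an iterable of log lines."""
    bad, sentinel = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if any(marker in line for marker in BAD_MESSAGES):
            bad.append(line)
        match = LINE_RE.match(line)
        if match and match["context"] == "sentinel":
            sentinel.append((match["logger"], match["message"]))
    return bad, sentinel
```

A possible use is `bad, sentinel = scan(open("homeserver.log"))`, then checking that `bad` is empty and that every logger in `sentinel` is something outside Synapse, such as `asyncio` or `twisted`.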

Dev notes

Logcontexts

Whenever we yield to the Twisted reactor (event loop), we need to set the sentinel log context so log contexts don't leak and apply to the next task.


Synapse log context docs: docs/log_contexts.md



The make_deferred_yieldable(...) function is a way of doing so, but it is equivalent to using with PreserveLoggingContext():, i.e. it clears the logcontext before awaiting (and so before execution passes back to the reactor) and restores the old context once the awaitable completes (execution passes from the reactor back to the code).

-- #18357 (comment)
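
As a rough illustration of the "clear before yielding, restore afterwards" pattern described in that quote, here is a toy model in plain Python. It is not Synapse's implementation: `logging_context`, `preserve_logging_context`, and `wait_on_reactor` are hypothetical stand-ins, and a module-level variable plays the role of the current logcontext.

```python
import contextlib

SENTINEL = "sentinel"
_current = SENTINEL  # toy "current logcontext"


def current_context():
    return _current


@contextlib.contextmanager
def logging_context(name):
    """Toy analogue of activating a named logcontext."""
    global _current
    saved, _current = _current, name
    try:
        yield
    finally:
        _current = saved


@contextlib.contextmanager
def preserve_logging_context():
    """Toy analogue: drop to the sentinel, restore the old context on exit."""
    global _current
    saved, _current = _current, SENTINEL
    try:
        yield
    finally:
        _current = saved


def wait_on_reactor(observed):
    # Stand-in for "execution passes back to the reactor".
    observed.append(current_context())


def application_code():
    observed = []
    with logging_context("main"):
        observed.append(current_context())  # application code runs under "main"
        with preserve_logging_context():
            wait_on_reactor(observed)  # the "reactor" sees the sentinel
        observed.append(current_context())  # "main" is back after the wait
    return observed
```

In this model, `application_code()` records `main`, then `sentinel` while waiting, then `main` again, so nothing leaks to whatever the reactor does next.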


Expected logging context sentinel but found main
Expected logging context main was lost
backwards

This is part of tracing though:

There was no active span when trying to log. Did you forget to start one or did a context slip?

sentinel spots

PreserveLoggingContext
clock.looping_call

Todo

Previous todo list

I've since decided to tackle this piecemeal and added these notes to #18905

  • Ensure usages of run_as_background_process use make_deferred_yieldable if they wait on the result
  • Ensure usages of run_in_background use make_deferred_yieldable if they wait on the result

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

```
2025-08-26 18:40:27,996 - my.synapse.linux.server - synapse.app.homeserver - 187 - WARNING - main - Starting daemon.
2025-08-26 18:40:27,996 - my.synapse.linux.server - synapse.app.homeserver - 181 - WARNING - atexit - Stopping daemon.
```
Running a normal server (`daemonize: false`):

```
poetry run synapse_homeserver --config-path homeserver.yaml
```

Bad logs being seen:

```
PreserveLoggingContext: Expected logging context sentinel but found main
```

```
LoggingContext: Expected logging context main was lost
```
Comment on lines -188 to -191
# make sure that we run the reactor with the sentinel log context,
# otherwise other PreserveLoggingContext instances will get confused
# and complain when they see the logcontext arbitrarily swapping
# between the sentinel and `run` logcontexts.
@MadLittleMods MadLittleMods Aug 27, 2025


This comment was added in 067b00d#diff-6e21d6a61b2f6b6f2d4ce961991ba7f27e83605f927eaa4b19d2c46a975a96c1R460-R463 (matrix-org/synapse#2027)

I can't tell exactly what it's referring to, but the one spot where I was seeing `Expected logging context sentinel but found main` (around `run_as_background_process(...)` usage) has been fixed (#18870 (comment)).

Reproduction:

  1. poetry run synapse_homeserver --config-path homeserver.yaml
  2. Ctrl + C to stop the server
  3. Notice LoggingContext: Expected logging context main was lost in the logs

Comment on lines -193 to -195
# We also need to drop the logcontext before forking if we're daemonizing,
# otherwise the cputime metrics get confused about the per-thread resource usage
# appearing to go backwards.

This comment was added in matrix-org/synapse#5609 and accurately describes a real problem.

But we can be far more precise about what to do here. We only need to stop the current log context right before forking the process and start it again right after. This can be accomplished by moving the `PreserveLoggingContext` so it wraps only the `os.fork()` call (search for `os.fork()` below).


Previously, this caused the sentinel context to be used for a whole bunch of logs (basically anything outside of a request). We now get the main context used 💪

Reproduction instructions:

 1. `poetry run synapse_homeserver --config-path homeserver.yaml`
 1. `curl http://localhost:8008/_matrix/client/versions`
 1. Stop Synapse (`Ctrl + c`)

Notice the bad log:

```
synapse.logging.context - WARNING - sentinel - LoggingContext: Expected logging context main was lost
```
…text rules"

This reverts commit 675d94a.

Things get stuck with this, see #18870 (comment)
Reproduction:

 1. `poetry run synapse_homeserver --config-path homeserver.yaml`
 1. Ctrl + C to stop the server
 1. Notice `LoggingContext: Expected logging context main was lost` in the logs
MadLittleMods added a commit that referenced this pull request Sep 9, 2025
So downstream usage doesn't need to use
`PreserveLoggingContext()` or `make_deferred_yieldable`

Spawning from #18870
and #18357 (comment)
MadLittleMods added a commit that referenced this pull request Sep 10, 2025
…kground_process(...)` (#18900)

Also adds a section in the docs explaining the `sentinel` logcontext.

Spawning from #18870


### Testing strategy

 1. Run Synapse normally and with `daemonize: true`: `poetry run synapse_homeserver --config-path homeserver.yaml`
 1. Execute some requests
 1. Shut down the server
 1. Look for any bad log entries in your homeserver logs:
    - `Expected logging context sentinel but found main`
    - `Expected logging context main was lost`
    - `Expected previous context`
    - `utime went backwards!`/`stime went backwards!`
    - `Called stop on logcontext POST-0 without recording a start rusage`
    - `Background process re-entered without a proc`

Twisted trial tests:

 1. Run full Twisted trial test suite.
 1. Check the logs for `Test starting with non-sentinel logging context ...`
@MadLittleMods MadLittleMods changed the title Remove sentinel context where we log Remove sentinel context where we log in setup, start and exit Sep 10, 2025
@MadLittleMods MadLittleMods changed the title Remove sentinel context where we log in setup, start and exit Remove sentinel logcontext where we log in setup, start and exit Sep 10, 2025
@MadLittleMods MadLittleMods marked this pull request as ready for review September 10, 2025 20:48
@MadLittleMods MadLittleMods requested a review from a team as a code owner September 10, 2025 20:48
@anoadragon453 anoadragon453 left a comment


Looks like it does the right things, and I learned more about logcontexts in the process!

Nice to see errors actually disappear after these changes.

@MadLittleMods MadLittleMods merged commit 84d6425 into develop Sep 16, 2025
44 checks passed
@MadLittleMods MadLittleMods deleted the madlittlemods/remove-sentinel-context branch September 16, 2025 22:15
@MadLittleMods

Thanks for the review @anoadragon453 🐄
