Use monotonic clock where possible for metrics #5498

Merged
richvdh merged 1 commit into develop from rav/fix_clock_reversal Jun 24, 2019

@richvdh (Member) commented Jun 19, 2019

Fixes intermittent errors observed on Apple hardware which were caused by time.clock() appearing to go backwards when called from different threads.

Also fixes a bug where database activity times were logged as 1/1000 of their correct ratio due to confusion between milliseconds and seconds.

@richvdh richvdh requested a review from matrix-org/synapse-core Jun 19, 2019

@richvdh richvdh added this to In progress in Homeserver Task Board via automation Jun 19, 2019


 def loop():
     curr = self._current_txn_total_time
     prev = self._previous_txn_total_time
     self._previous_txn_total_time = curr

-    time_now = self._clock.time_msec()
+    time_now = monotonic_time()

@richvdh (Member, Author) commented Jun 19, 2019

So yes: this function turned out to be passing milliseconds into a function that expected seconds.
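
The effect of that unit mismatch can be illustrated with a minimal sketch. It assumes the logged figure is transaction time divided by the elapsed window; the name `activity_ratio` and its parameters are hypothetical and are not taken from the Synapse codebase.

```python
import math


def activity_ratio(txn_seconds, window_start, window_end, unit_in_seconds=1.0):
    """Return the fraction of a time window spent in database transactions.

    txn_seconds: transaction time accumulated during the window, in seconds.
    window_start, window_end: timestamps bounding the window.
    unit_in_seconds: length of one timestamp unit (1.0 for seconds,
        0.001 for milliseconds). Hypothetical parameter, for illustration.
    """
    window_seconds = (window_end - window_start) * unit_in_seconds
    return txn_seconds / window_seconds


# Two seconds of transaction time over a ten-second window.
correct = activity_ratio(2.0, 100.0, 110.0)

# The same window with millisecond timestamps and no conversion: the
# denominator is 1000 times too large, so the ratio is 1000 times too small.
buggy = activity_ratio(2.0, 100000.0, 110000.0)

# Declaring the timestamp unit restores the expected value.
fixed = activity_ratio(2.0, 100000.0, 110000.0, unit_in_seconds=0.001)

assert math.isclose(buggy * 1000, correct)
assert math.isclose(fixed, correct)
```

Switching the timestamp source to a clock that returns seconds, as the diff above does, removes the need for any conversion factor.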

@richvdh (Member, Author) commented Jun 20, 2019

This is failing CI because it's based on current master. That is deliberate: should we need to, it can then be rolled into a 1.0.1 release, which will make it easier to build the release.

@erikjohnston (Member) left a comment

Do we not want to do this other places where we use time.time() to get a duration? Or is this only a problem due to threading? Are the metrics (that get run on a separate thread) affected?

@richvdh (Member, Author) commented Jun 21, 2019

> Do we not want to do this other places where we use time.time() to get a duration?

Probably. I was hoping just to fix the known problems, rather than go spelunking in the whole codebase.

> Or is this only a problem due to threading?

Well, I think it's more of a problem due to threading, but obviously if we're measuring a duration we should probably be using a monotonic clock rather than a wall-clock, to avoid situations where clocks get changed etc.
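
As a rough illustration of the wall-clock concern, here is a minimal Python 3 sketch. The helpers `timed` and `fake_clock` are hypothetical names introduced for this example, not part of the PR. The fake clock stands in for a wall clock that is stepped backwards during a measurement; `time.monotonic()` is the standard-library clock that cannot go backwards.

```python
import time


def timed(func, *args, clock=time.monotonic, **kwargs):
    """Call func and return (result, elapsed) as measured by `clock`."""
    start = clock()
    result = func(*args, **kwargs)
    return result, clock() - start


def fake_clock(readings):
    """Return a clock function that yields the given readings in order."""
    iterator = iter(readings)
    return lambda: next(iterator)


# Simulated wall clock that is set back while the call is in progress:
# the measured duration comes out negative.
result, wall_elapsed = timed(sum, [1, 2, 3], clock=fake_clock([1000.0, 995.5]))

# The default monotonic clock never reports a negative duration.
_, mono_elapsed = timed(sum, [1, 2, 3])

assert wall_elapsed < 0
assert mono_elapsed >= 0
```

Note that `time.clock()`, mentioned in the PR description, was removed from the standard library in Python 3.8, so new code would need `time.monotonic()` or `time.perf_counter()` in any case.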

> Are the metrics (that get run on a separate thread) affected?

whichnow?

@richvdh (Member, Author) commented Jun 24, 2019

As discussed, let's leave digging for other problems for now.

@richvdh richvdh merged commit c753c09 into develop Jun 24, 2019

20 of 22 checks passed

ci/circleci: sytestpy2merged Your tests failed on CircleCI
ci/circleci: sytestpy2postgresmerged Your tests failed on CircleCI
buildkite/synapse Build #2244 passed (21 minutes, 53 seconds)
buildkite/synapse/check-sample-config Passed (1 minute, 11 seconds)
buildkite/synapse/isort Passed (23 seconds)
buildkite/synapse/newspaper-newsfile Passed (12 seconds)
buildkite/synapse/packaging Passed (18 seconds)
buildkite/synapse/pep-8 Passed (54 seconds)
buildkite/synapse/pipeline Passed (9 seconds)
buildkite/synapse/python-2-dot-7-slash-postgres-9-dot-4 Passed (15 minutes, 50 seconds)
buildkite/synapse/python-2-dot-7-slash-postgres-9-dot-5 Passed (15 minutes, 59 seconds)
buildkite/synapse/python-2-dot-7-slash-sqlite Passed (5 minutes, 9 seconds)
buildkite/synapse/python-2-dot-7-slash-sqlite-slash-old-deps Passed (6 minutes, 55 seconds)
buildkite/synapse/python-3-dot-5-slash-postgres-9-dot-4 Passed (16 minutes, 21 seconds)
buildkite/synapse/python-3-dot-5-slash-postgres-9-dot-5 Passed (18 minutes, 53 seconds)
buildkite/synapse/python-3-dot-5-slash-sqlite Passed (5 minutes, 43 seconds)
buildkite/synapse/python-3-dot-6-slash-sqlite Passed (5 minutes, 50 seconds)
buildkite/synapse/python-3-dot-7-slash-postgres-11 Passed (16 minutes, 31 seconds)
buildkite/synapse/python-3-dot-7-slash-postgres-9-dot-5 Passed (16 minutes, 34 seconds)
buildkite/synapse/python-3-dot-7-slash-sqlite Passed (5 minutes, 53 seconds)
ci/circleci: sytestpy3merged Your tests passed on CircleCI!
ci/circleci: sytestpy3postgresmerged Your tests passed on CircleCI!

Homeserver Task Board automation moved this from In progress to Done Jun 24, 2019

@richvdh richvdh deleted the rav/fix_clock_reversal branch Jun 24, 2019

hawkowl added a commit that referenced this pull request Jul 5, 2019

Merge tag 'v1.1.0' into shhs
Synapse 1.1.0 (2019-07-04)
==========================

As of v1.1.0, Synapse no longer supports Python 2, nor Postgres version 9.4.
See the [upgrade notes](UPGRADE.rst#upgrading-to-v110) for more details.

This release also deprecates the use of environment variables to configure the
docker image. See the [docker README](https://github.com/matrix-org/synapse/blob/release-v1.1.0/docker/README.md#legacy-dynamic-configuration-file-support)
for more details.

No changes since 1.1.0rc2.

Synapse 1.1.0rc2 (2019-07-03)
=============================

Bugfixes
--------

- Fix regression in 1.1rc1 where OPTIONS requests to the media repo would fail. ([\#5593](#5593))
- Removed the `SYNAPSE_SMTP_*` docker container environment variables. Using these environment variables prevented the docker container from starting in Synapse v1.0, even though they didn't actually allow any functionality anyway. ([\#5596](#5596))
- Fix a number of "Starting txn from sentinel context" warnings. ([\#5605](#5605))

Internal Changes
----------------

- Update github templates. ([\#5552](#5552))

Synapse 1.1.0rc1 (2019-07-02)
=============================

As of v1.1.0, Synapse no longer supports Python 2, nor Postgres version 9.4.
See the [upgrade notes](UPGRADE.rst#upgrading-to-v110) for more details.

Features
--------

- Added possibility to disable local password authentication. Contributed by Daniel Hoffend. ([\#5092](#5092))
- Add monthly active users to phonehome stats. ([\#5252](#5252))
- Allow expired user to trigger renewal email sending manually. ([\#5363](#5363))
- Statistics on forward extremities per room are now exposed via Prometheus. ([\#5384](#5384), [\#5458](#5458), [\#5461](#5461))
- Add --no-daemonize option to run synapse in the foreground, per issue #4130. Contributed by Soham Gumaste. ([\#5412](#5412), [\#5587](#5587))
- Fully support SAML2 authentication. Contributed by [Alexander Trost](https://github.com/galexrt) - thank you! ([\#5422](#5422))
- Allow server admins to define implementations of extra rules for allowing or denying incoming events. ([\#5440](#5440), [\#5474](#5474), [\#5477](#5477))
- Add support for handling pagination APIs on client reader worker. ([\#5505](#5505), [\#5513](#5513), [\#5531](#5531))
- Improve help and cmdline option names for --generate-config options. ([\#5512](#5512))
- Allow configuration of the path used for ACME account keys. ([\#5516](#5516), [\#5521](#5521), [\#5522](#5522))
- Add --data-dir and --open-private-ports options. ([\#5524](#5524))
- Split public rooms directory auth config in two settings, in order to manage client auth independently from the federation part of it. Obsoletes the "restrict_public_rooms_to_local_users" configuration setting. If "restrict_public_rooms_to_local_users" is set in the config, Synapse will act as if both new options are enabled, i.e. require authentication through the client API and deny federation requests. ([\#5534](#5534))
- The minimum TLS version used for outgoing federation requests can now be set with `federation_client_minimum_tls_version`. ([\#5550](#5550))
- Optimise devices changed query to not pull unnecessary rows from the database, reducing database load. ([\#5559](#5559))
- Add new metrics for number of forward extremities being persisted and number of state groups involved in resolution. ([\#5476](#5476))

Bugfixes
--------

- Fix bug processing incoming events over federation if call to `/get_missing_events` fails. ([\#5042](#5042))
- Prevent more than one room upgrade happening simultaneously on the same room. ([\#5051](#5051))
- Fix a bug where running synapse_port_db would cause the account validity feature to fail because it didn't set the type of the email_sent column to boolean. ([\#5325](#5325))
- Warn about disabling email-based password resets when a reset occurs, and remove warning when someone attempts a phone-based reset. ([\#5387](#5387))
- Fix email notifications for unnamed rooms with multiple people. ([\#5388](#5388))
- Fix exceptions in federation reader worker caused by attempting to renew attestations, which should only happen on master worker. ([\#5389](#5389))
- Fix handling of failures fetching remote content to not log failures as exceptions. ([\#5390](#5390))
- Fix a bug where deactivated users could receive renewal emails if the account validity feature is on. ([\#5394](#5394))
- Fix missing invite state after exchanging 3PID invites over federation. ([\#5464](#5464))
- Fix intermittent exceptions on Apple hardware. Also fix bug that caused database activity times to be under-reported in log lines. ([\#5498](#5498))
- Fix logging error when a tampered event is detected. ([\#5500](#5500))
- Fix bug where clients could tight loop calling `/sync` for a period. ([\#5507](#5507))
- Fix bug with `jinja2` preventing Synapse from starting. Users who had this problem should now simply need to run `pip install matrix-synapse`. ([\#5514](#5514))
- Fix a regression where homeservers on private IP addresses were incorrectly blacklisted. ([\#5523](#5523))
- Fixed m.login.jwt using unregistered user_id and added pyjwt>=1.6.4 as a jwt conditional dependency. Contributed by Pau Rodriguez-Estivill. ([\#5555](#5555), [\#5586](#5586))
- Fix a bug that would cause invited users to receive several emails for a single 3PID invite in case the inviter is rate limited. ([\#5576](#5576))

Updates to the Docker image
---------------------------

- Add ability to change the Docker container's [timezone](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) with the `TZ` variable. ([\#5383](#5383))
- Update docker image to use Python 3.7. ([\#5546](#5546))
- Deprecate the use of environment variables for configuration, and make the use of a static configuration the default. ([\#5561](#5561), [\#5562](#5562), [\#5566](#5566), [\#5567](#5567))
- Increase default log level for docker image to INFO. It can still be changed by editing the generated log.config file. ([\#5547](#5547))
- Send synapse logs to the docker logging system, by default. ([\#5565](#5565))
- Open the non-TLS port by default. ([\#5568](#5568))
- Fix failure to start under docker with SAML support enabled. ([\#5490](#5490))
- Use a sensible location for data files when generating a config file. ([\#5563](#5563))

Deprecations and Removals
-------------------------

- Python 2.7 is no longer a supported platform. Synapse now requires Python 3.5+ to run. ([\#5425](#5425))
- PostgreSQL 9.4 is no longer supported. Synapse requires Postgres 9.5+ for Postgres support. ([\#5448](#5448))
- Remove support for cpu_affinity setting. ([\#5525](#5525))

Improved Documentation
----------------------

- Improve README section on performance troubleshooting. ([\#4276](#4276))
- Add information about how to install and run `black` on the codebase to code_style.rst. ([\#5537](#5537))
- Improve install docs on choosing server_name. ([\#5558](#5558))

Internal Changes
----------------

- Add logging to 3pid invite signature verification. ([\#5015](#5015))
- Update example haproxy config to a more compatible setup. ([\#5313](#5313))
- Track deactivated accounts in the database. ([\#5378](#5378), [\#5465](#5465), [\#5493](#5493))
- Clean up code for sending federation EDUs. ([\#5381](#5381))
- Add a sponsor button to the repo. ([\#5382](#5382), [\#5386](#5386))
- Don't log non-200 responses from federation queries as exceptions. ([\#5383](#5383))
- Update Python syntax in contrib/ to Python 3. ([\#5446](#5446))
- Update federation_client dev script to support `.well-known` and work with python3. ([\#5447](#5447))
- SyTest has been moved to Buildkite. ([\#5459](#5459))
- Demo script now uses python3. ([\#5460](#5460))
- Synapse can now handle RestServlets that return coroutines. ([\#5475](#5475), [\#5585](#5585))
- The demo servers talk to each other again. ([\#5478](#5478))
- Add an EXPERIMENTAL config option to try and periodically clean up extremities by sending dummy events. ([\#5480](#5480))
- Synapse's codebase is now formatted by `black`. ([\#5482](#5482))
- Some cleanups and sanity-checking in the CPU and database metrics. ([\#5499](#5499))
- Improve email notification logging. ([\#5502](#5502))
- Fix "Unexpected entry in 'full_schemas'" log warning. ([\#5509](#5509))
- Improve logging when generating config files. ([\#5510](#5510))
- Refactor and clean up Config parser for maintainability. ([\#5511](#5511))
- Make the config clearer in that email.template_dir is relative to Synapse's root directory, not the `synapse/` folder within it. ([\#5543](#5543))
- Update v1.0.0 release changelog to include more information about changes to password resets. ([\#5545](#5545))
- Remove non-functioning check_event_hash.py dev script. ([\#5548](#5548))
- Synapse will now only allow TLS v1.2 connections when serving federation, if it terminates TLS. As Synapse's allowed ciphers were only able to be used in TLSv1.2 before, this does not change behaviour. ([\#5550](#5550))
- Logging when running GC collection on generation 0 is now at the DEBUG level, not INFO. ([\#5557](#5557))
- Reduce the amount of stuff we send in the docker context. ([\#5564](#5564))
- Point the reverse links in the Purge History contrib scripts at the intended location. ([\#5570](#5570))