Cleanups and sanity-checking in cpu and db metrics #5499

Merged
merged 7 commits into from Jun 24, 2019

Conversation

2 participants
@richvdh
Member

commented Jun 19, 2019

There are three commits here which are probably worth reviewing individually:

  • Remove some dead code (_get_event_counters)
  • Simplify PerformanceCounters interface
  • Sanity-checks that clocks don't go backwards (will either suggest that #3548 is fixed, or rapidly prove that it's not).

richvdh added some commits Jun 19, 2019

Remove unused _get_event_counters
This has been redundant since cdb3757.
Sanity-checking for metrics updates
Check that our clocks go forward.
Simplify PerformanceCounters.update interface
we already have the duration for the update, so may as well use it rather than
passing extra params around and recalculating it.
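
A minimal sketch of what a duration-based `update` interface could look like. This is not the Synapse implementation: the class body, the `counters` attribute, and the key names are assumptions made for illustration. The caller has already measured the duration, so it passes it in directly rather than passing start and end times for the counter to recompute.

```python
class PerformanceCounters:
    """Hypothetical sketch: per-key call counts and cumulative durations."""

    def __init__(self):
        # Maps key -> (number of calls, total seconds spent).
        self.counters = {}

    def update(self, key, duration_secs):
        # The caller supplies the measured duration; nothing is recalculated here.
        count, total = self.counters.get(key, (0, 0.0))
        self.counters[key] = (count + 1, total + duration_secs)


# Usage: record two database calls under the same (made-up) key.
perf = PerformanceCounters()
perf.update("select_event", 0.5)
perf.update("select_event", 0.25)
```

Keeping the duration as the only timing argument means there is a single place where the interval is computed, which is also the natural place to sanity-check it.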

@richvdh richvdh force-pushed the rav/cleanup_metrics branch from 0a5774c to ae4d97b Jun 19, 2019

@richvdh richvdh requested a review from matrix-org/synapse-core Jun 19, 2019

@richvdh richvdh added this to In progress in Homeserver Task Board via automation Jun 19, 2019

@codecov

commented Jun 20, 2019

Codecov Report

Merging #5499 into develop will decrease coverage by 0.01%.
The diff coverage is 60%.

@@             Coverage Diff             @@
##           develop    #5499      +/-   ##
===========================================
- Coverage    63.16%   63.15%   -0.02%     
===========================================
  Files          328      328              
  Lines        35907    35869      -38     
  Branches      5902     5896       -6     
===========================================
- Hits         22682    22654      -28     
+ Misses       11609    11599      -10     
  Partials      1616     1616
@codecov

commented Jun 20, 2019

Codecov Report

Merging #5499 into develop will increase coverage by 0.01%.
The diff coverage is 53.57%.

@@            Coverage Diff             @@
##           develop   #5499      +/-   ##
==========================================
+ Coverage    63.18%   63.2%   +0.01%     
==========================================
  Files          328     328              
  Lines        35923   35933      +10     
  Branches      5902    5905       +3     
==========================================
+ Hits         22699   22712      +13     
+ Misses       11609   11603       -6     
- Partials      1615    1618       +3
if utime_delta < 0:
    raise ValueError("utime went backwards! %f < %f" % (
        current.ru_utime, self.usage_start.ru_utime,
    ))

@erikjohnston

erikjohnston Jun 20, 2019

Member

What's the effect of raising here? Will this just take out metrics? Do we want to log.error but return 0 instead?

@richvdh

richvdh Jun 21, 2019

Author Member

No, it will take out whatever is running at this point.

My concern with log.error is that it's never going to get noticed, and if it is, we won't be able to figure out why it is happening.

richvdh added some commits Jun 24, 2019

Avoid raising exceptions in metrics
Sentry will catch the errors if they happen, so that should be good enough, and
won't make things explode if we hit the error condition.
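
The approach described in this commit message might look roughly like the sketch below. It is an illustration under assumptions, not the merged code: `cpu_time_delta` and the `Usage` stand-in for a `resource.getrusage()` result are hypothetical names. The idea is to log at error level (which an error reporter such as Sentry can pick up through its logging integration) and clamp the delta to zero, rather than raising and taking out whatever code happens to be running.

```python
import logging
from collections import namedtuple

logger = logging.getLogger(__name__)

# Hypothetical stand-in for the fields of a resource.getrusage() result.
Usage = namedtuple("Usage", ["ru_utime", "ru_stime"])


def cpu_time_delta(start, current):
    """Return (utime_delta, stime_delta), clamping to zero if a clock went backwards."""
    utime_delta = current.ru_utime - start.ru_utime
    stime_delta = current.ru_stime - start.ru_stime

    if utime_delta < 0:
        # Log loudly instead of raising, so the caller keeps running.
        logger.error(
            "utime went backwards! %f < %f", current.ru_utime, start.ru_utime
        )
        utime_delta = 0.0

    if stime_delta < 0:
        logger.error(
            "stime went backwards! %f < %f", current.ru_stime, start.ru_stime
        )
        stime_delta = 0.0

    return utime_delta, stime_delta
```

For example, `cpu_time_delta(Usage(2.0, 1.0), Usage(1.5, 1.25))` logs one error for `utime` and returns a zero user-time delta alongside the normal system-time delta.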

@richvdh richvdh requested a review from erikjohnston Jun 24, 2019

@richvdh richvdh merged commit e59a8cd into develop Jun 24, 2019

19 checks passed

buildkite/synapse Build #2315 passed (21 minutes, 25 seconds)
buildkite/synapse/check-sample-config Passed (1 minute, 10 seconds)
buildkite/synapse/check-style Passed (1 minute, 42 seconds)
buildkite/synapse/isort Passed (46 seconds)
buildkite/synapse/newspaper-newsfile Passed (12 seconds)
buildkite/synapse/packaging Passed (16 seconds)
buildkite/synapse/pipeline Passed (2 seconds)
buildkite/synapse/python-3-dot-5-slash-postgres-9-dot-5 Passed (18 minutes, 9 seconds)
buildkite/synapse/python-3-dot-5-slash-sqlite Passed (6 minutes, 2 seconds)
buildkite/synapse/python-3-dot-5-slash-sqlite-slash-old-deps Passed (7 minutes, 9 seconds)
buildkite/synapse/python-3-dot-6-slash-sqlite Passed (5 minutes, 46 seconds)
buildkite/synapse/python-3-dot-7-slash-postgres-11 Passed (18 minutes, 3 seconds)
buildkite/synapse/python-3-dot-7-slash-postgres-9-dot-5 Passed (18 minutes, 13 seconds)
buildkite/synapse/python-3-dot-7-slash-sqlite Passed (6 minutes, 21 seconds)
buildkite/synapse/sytest-python-3-dot-5-slash-postgres-9-dot-6-slash-monolith Passed (8 minutes, 9 seconds)
buildkite/synapse/sytest-python-3-dot-5-slash-postgres-9-dot-6-slash-workers Soft failed (exit status 1)
buildkite/synapse/sytest-python-3-dot-5-slash-sqlite-slash-monolith Passed (7 minutes, 24 seconds)
codecov/patch 53.57% of diff hit (target 0%)
codecov/project 63.2% (target 0%)

Homeserver Task Board automation moved this from In progress to Done Jun 24, 2019

@richvdh richvdh deleted the rav/cleanup_metrics branch Jun 24, 2019

@richvdh richvdh referenced this pull request Jul 3, 2019

Closed

utime went backwards! #5608

hawkowl added a commit that referenced this pull request Jul 5, 2019

Merge tag 'v1.1.0' into shhs
Synapse 1.1.0 (2019-07-04)
==========================

As of v1.1.0, Synapse no longer supports Python 2, nor Postgres version 9.4.
See the [upgrade notes](UPGRADE.rst#upgrading-to-v110) for more details.

This release also deprecates the use of environment variables to configure the
docker image. See the [docker README](https://github.com/matrix-org/synapse/blob/release-v1.1.0/docker/README.md#legacy-dynamic-configuration-file-support)
for more details.

No changes since 1.1.0rc2.

Synapse 1.1.0rc2 (2019-07-03)
=============================

Bugfixes
--------

- Fix regression in 1.1rc1 where OPTIONS requests to the media repo would fail. ([\#5593](#5593))
- Removed the `SYNAPSE_SMTP_*` docker container environment variables. Using these environment variables prevented the docker container from starting in Synapse v1.0, even though they didn't actually allow any functionality anyway. ([\#5596](#5596))
- Fix a number of "Starting txn from sentinel context" warnings. ([\#5605](#5605))

Internal Changes
----------------

- Update github templates. ([\#5552](#5552))

Synapse 1.1.0rc1 (2019-07-02)
=============================

As of v1.1.0, Synapse no longer supports Python 2, nor Postgres version 9.4.
See the [upgrade notes](UPGRADE.rst#upgrading-to-v110) for more details.

Features
--------

- Added possibility to disable local password authentication. Contributed by Daniel Hoffend. ([\#5092](#5092))
- Add monthly active users to phonehome stats. ([\#5252](#5252))
- Allow expired user to trigger renewal email sending manually. ([\#5363](#5363))
- Statistics on forward extremities per room are now exposed via Prometheus. ([\#5384](#5384), [\#5458](#5458), [\#5461](#5461))
- Add --no-daemonize option to run synapse in the foreground, per issue #4130. Contributed by Soham Gumaste. ([\#5412](#5412), [\#5587](#5587))
- Fully support SAML2 authentication. Contributed by [Alexander Trost](https://github.com/galexrt) - thank you! ([\#5422](#5422))
- Allow server admins to define implementations of extra rules for allowing or denying incoming events. ([\#5440](#5440), [\#5474](#5474), [\#5477](#5477))
- Add support for handling pagination APIs on client reader worker. ([\#5505](#5505), [\#5513](#5513), [\#5531](#5531))
- Improve help and cmdline option names for --generate-config options. ([\#5512](#5512))
- Allow configuration of the path used for ACME account keys. ([\#5516](#5516), [\#5521](#5521), [\#5522](#5522))
- Add --data-dir and --open-private-ports options. ([\#5524](#5524))
- Split public rooms directory auth config in two settings, in order to manage client auth independently from the federation part of it. Obsoletes the "restrict_public_rooms_to_local_users" configuration setting. If "restrict_public_rooms_to_local_users" is set in the config, Synapse will act as if both new options are enabled, i.e. require authentication through the client API and deny federation requests. ([\#5534](#5534))
- The minimum TLS version used for outgoing federation requests can now be set with `federation_client_minimum_tls_version`. ([\#5550](#5550))
- Optimise devices changed query to not pull unnecessary rows from the database, reducing database load. ([\#5559](#5559))
- Add new metrics for number of forward extremities being persisted and number of state groups involved in resolution. ([\#5476](#5476))

Bugfixes
--------

- Fix bug processing incoming events over federation if call to `/get_missing_events` fails. ([\#5042](#5042))
- Prevent more than one room upgrade happening simultaneously on the same room. ([\#5051](#5051))
- Fix a bug where running synapse_port_db would cause the account validity feature to fail because it didn't set the type of the email_sent column to boolean. ([\#5325](#5325))
- Warn about disabling email-based password resets when a reset occurs, and remove warning when someone attempts a phone-based reset. ([\#5387](#5387))
- Fix email notifications for unnamed rooms with multiple people. ([\#5388](#5388))
- Fix exceptions in federation reader worker caused by attempting to renew attestations, which should only happen on master worker. ([\#5389](#5389))
- Fix handling of failures fetching remote content to not log failures as exceptions. ([\#5390](#5390))
- Fix a bug where deactivated users could receive renewal emails if the account validity feature is on. ([\#5394](#5394))
- Fix missing invite state after exchanging 3PID invites over federation. ([\#5464](#5464))
- Fix intermittent exceptions on Apple hardware. Also fix bug that caused database activity times to be under-reported in log lines. ([\#5498](#5498))
- Fix logging error when a tampered event is detected. ([\#5500](#5500))
- Fix bug where clients could tight loop calling `/sync` for a period. ([\#5507](#5507))
- Fix bug with `jinja2` preventing Synapse from starting. Users who had this problem should now simply need to run `pip install matrix-synapse`. ([\#5514](#5514))
- Fix a regression where homeservers on private IP addresses were incorrectly blacklisted. ([\#5523](#5523))
- Fixed m.login.jwt using unregistered user_id and added pyjwt>=1.6.4 as jwt conditional dependencies. Contributed by Pau Rodriguez-Estivill. ([\#5555](#5555), [\#5586](#5586))
- Fix a bug that would cause invited users to receive several emails for a single 3PID invite in case the inviter is rate limited. ([\#5576](#5576))

Updates to the Docker image
---------------------------
- Add ability to change the Docker container's [timezone](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) with the `TZ` variable. ([\#5383](#5383))
- Update docker image to use Python 3.7. ([\#5546](#5546))
- Deprecate the use of environment variables for configuration, and make the use of a static configuration the default. ([\#5561](#5561), [\#5562](#5562), [\#5566](#5566), [\#5567](#5567))
- Increase default log level for docker image to INFO. It can still be changed by editing the generated log.config file. ([\#5547](#5547))
- Send synapse logs to the docker logging system, by default. ([\#5565](#5565))
- Open the non-TLS port by default. ([\#5568](#5568))
- Fix failure to start under docker with SAML support enabled. ([\#5490](#5490))
- Use a sensible location for data files when generating a config file. ([\#5563](#5563))

Deprecations and Removals
-------------------------

- Python 2.7 is no longer a supported platform. Synapse now requires Python 3.5+ to run. ([\#5425](#5425))
- PostgreSQL 9.4 is no longer supported. Synapse requires Postgres 9.5 or above for Postgres support. ([\#5448](#5448))
- Remove support for cpu_affinity setting. ([\#5525](#5525))

Improved Documentation
----------------------
- Improve README section on performance troubleshooting. ([\#4276](#4276))
- Add information about how to install and run `black` on the codebase to code_style.rst. ([\#5537](#5537))
- Improve install docs on choosing server_name. ([\#5558](#5558))

Internal Changes
----------------

- Add logging to 3pid invite signature verification. ([\#5015](#5015))
- Update example haproxy config to a more compatible setup. ([\#5313](#5313))
- Track deactivated accounts in the database. ([\#5378](#5378), [\#5465](#5465), [\#5493](#5493))
- Clean up code for sending federation EDUs. ([\#5381](#5381))
- Add a sponsor button to the repo. ([\#5382](#5382), [\#5386](#5386))
- Don't log non-200 responses from federation queries as exceptions. ([\#5383](#5383))
- Update Python syntax in contrib/ to Python 3. ([\#5446](#5446))
- Update federation_client dev script to support `.well-known` and work with python3. ([\#5447](#5447))
- SyTest has been moved to Buildkite. ([\#5459](#5459))
- Demo script now uses python3. ([\#5460](#5460))
- Synapse can now handle RestServlets that return coroutines. ([\#5475](#5475), [\#5585](#5585))
- The demo servers talk to each other again. ([\#5478](#5478))
- Add an EXPERIMENTAL config option to try and periodically clean up extremities by sending dummy events. ([\#5480](#5480))
- Synapse's codebase is now formatted by `black`. ([\#5482](#5482))
- Some cleanups and sanity-checking in the CPU and database metrics. ([\#5499](#5499))
- Improve email notification logging. ([\#5502](#5502))
- Fix "Unexpected entry in 'full_schemas'" log warning. ([\#5509](#5509))
- Improve logging when generating config files. ([\#5510](#5510))
- Refactor and clean up Config parser for maintainability. ([\#5511](#5511))
- Make the config clearer in that email.template_dir is relative to the Synapse's root directory, not the `synapse/` folder within it. ([\#5543](#5543))
- Update v1.0.0 release changelog to include more information about changes to password resets. ([\#5545](#5545))
- Remove non-functioning check_event_hash.py dev script. ([\#5548](#5548))
- Synapse will now only allow TLS v1.2 connections when serving federation, if it terminates TLS. As Synapse's allowed ciphers were only able to be used in TLSv1.2 before, this does not change behaviour. ([\#5550](#5550))
- Logging when running GC collection on generation 0 is now at the DEBUG level, not INFO. ([\#5557](#5557))
- Reduce the amount of stuff we send in the docker context. ([\#5564](#5564))
- Point the reverse links in the Purge History contrib scripts at the intended location. ([\#5570](#5570))