Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate opentracing contexts through EDUs #5852

Merged
merged 42 commits into from Aug 22, 2019

Conversation

@JorikSchellekens
Copy link
Contributor

commented Aug 14, 2019

Opentracing contexts are included in edu's to facilitate device list tracing.

@JorikSchellekens JorikSchellekens force-pushed the joriks/opentracing_edus branch from be7a08b to e63fae2 Aug 14, 2019
@JorikSchellekens JorikSchellekens requested a review from richvdh Aug 14, 2019
@JorikSchellekens JorikSchellekens force-pushed the joriks/opentracing_edus branch from e63fae2 to 7674ae0 Aug 14, 2019
@JorikSchellekens JorikSchellekens changed the title Joriks/opentracing edus Propogate opentracing contexts through EDUs Aug 14, 2019
@richvdh richvdh changed the title Propogate opentracing contexts through EDUs Propagate opentracing contexts through EDUs Aug 16, 2019
@JorikSchellekens JorikSchellekens force-pushed the joriks/opentracing_edus branch from 27423c3 to 23d3e05 Aug 16, 2019
Copy link
Member

left a comment

looks much nicer now that it is split up a bit, but a few comments

docs/opentracing.rst Outdated Show resolved Hide resolved
docs/opentracing.rst Outdated Show resolved Hide resolved
edu_type: "type"
content: {
context: {
...opentracing_context

This comment has been minimized.

Copy link
@richvdh

richvdh Aug 19, 2019

Member

can you flesh this out a little more with an example?

This comment has been minimized.

Copy link
@JorikSchellekens

JorikSchellekens Aug 20, 2019

Author Contributor

An example of how it's used or what it contains? I can't make an example of the content because that's completely vendor specific and if we ever change to a different tracing solution I don't think anyone will remember to come back and fix this.

This comment has been minimized.

Copy link
@richvdh

richvdh Aug 22, 2019

Member

what it contains. Isn't a consistent format for this critical to make sure that different servers can pass contexts around? In any case, I think it getting out of date is a risk I'd be prepared to take, especially if we said "EDU format (assuming we are using Jaeger):" above it, or something.

This comment has been minimized.

Copy link
@JorikSchellekens

JorikSchellekens Aug 22, 2019

Author Contributor

Eh, I see the concern. I'm not going to fill in the values there but I will instruct people to inject directly into context. That makes the most sense for opentracing in general.

This comment has been minimized.

Copy link
@JorikSchellekens

JorikSchellekens Aug 22, 2019

Author Contributor

I don't think the idea is to let other servers implement their own opentracing and send it to us. I see this as a project to trace what's going on inside synapses we trust. If we want it to be an optional thing that other servers implement maybe it should be spec'd but then our spec depends on opentracing's.

This comment has been minimized.

Copy link
@JorikSchellekens

JorikSchellekens Aug 22, 2019

Author Contributor

Hows that?

This comment has been minimized.

Copy link
@richvdh

richvdh Aug 22, 2019

Member

I see this as a project to trace what's going on inside synapses we trust.

sure, but if we change the impl, we (or whoever is running the synapses in question) have to make sure that all those synapses get updated.

Hows that?

perfect.

docs/opentracing.rst Outdated Show resolved Hide resolved
changelog.d/5852.misc Outdated Show resolved Hide resolved
synapse/logging/opentracing.py Outdated Show resolved Hide resolved
synapse/storage/devices.py Show resolved Hide resolved
synapse/storage/devices.py Show resolved Hide resolved
@JorikSchellekens JorikSchellekens force-pushed the joriks/opentracing_edus branch from 7befd0a to 8c38cc6 Aug 20, 2019
@JorikSchellekens JorikSchellekens requested a review from richvdh Aug 20, 2019
Copy link
Member

left a comment

getting there.

I remain unconvinced about using a general context key: the problem is that it just means nothing to me as a concept, so I'm struggling to envisage what other stuff we might put in there. Perhaps if you have some examples, that might help me understand?

docs/opentracing.rst Outdated Show resolved Hide resolved
edu_type: "type"
content: {
context: {
...opentracing_context

This comment has been minimized.

Copy link
@richvdh

richvdh Aug 22, 2019

Member

what it contains. Isn't a consistent format for this critical to make sure that different servers can pass contexts around? In any case, I think it getting out of date is a risk I'd be prepared to take, especially if we said "EDU format (assuming we are using Jaeger):" above it, or something.

synapse/federation/sender/per_destination_queue.py Outdated Show resolved Hide resolved
synapse/federation/sender/per_destination_queue.py Outdated Show resolved Hide resolved
synapse/federation/units.py Outdated Show resolved Hide resolved
synapse/storage/devices.py Outdated Show resolved Hide resolved
synapse/storage/devices.py Show resolved Hide resolved
synapse/storage/devices.py Show resolved Hide resolved
@richvdh

This comment has been minimized.

Copy link
Member

commented Aug 22, 2019

(also: please can you merge develop into your branch, to fix the conflict?)

@JorikSchellekens

This comment has been minimized.

Copy link
Contributor Author

commented Aug 22, 2019

The general context key isn't something I have any plans for. It was a change @erikjohnston requested. It was suggested that some metrics for a given context might be put in there as well, I don't really see what else could go in there either but it seemed like it's a good place to put context related information in. There really is no vision for what's going in there other than the fact that it could be useful. If the word context itself is confusing in this.. context.. then it means that data therein links the Edu by some chain of references to an originating event.

@erikjohnston

This comment has been minimized.

Copy link
Member

commented Aug 22, 2019

If people feel strongly about not having a context key, then I'm happy to drop it tbh.

@JorikSchellekens JorikSchellekens force-pushed the joriks/opentracing_edus branch from 2680c99 to a2bf2e4 Aug 22, 2019
@JorikSchellekens

This comment has been minimized.

Copy link
Contributor Author

commented Aug 22, 2019

I'll rename it to opentracing_context

@JorikSchellekens JorikSchellekens force-pushed the joriks/opentracing_edus branch from a57c42a to a205e91 Aug 22, 2019
@JorikSchellekens JorikSchellekens requested a review from richvdh Aug 22, 2019
@JorikSchellekens JorikSchellekens force-pushed the joriks/opentracing_edus branch from 11a781b to 8b43172 Aug 22, 2019
Copy link
Member

left a comment

I think we're there, modulo the nitpicks below! Please can you fix them up then squash-merge?

docs/opentracing.rst Outdated Show resolved Hide resolved
docs/opentracing.rst Show resolved Hide resolved
docs/opentracing.rst Outdated Show resolved Hide resolved
JorikSchellekens and others added 3 commits Aug 22, 2019
Docs
Co-Authored-By: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
@JorikSchellekens JorikSchellekens merged commit 8767b63 into develop Aug 22, 2019
19 checks passed
19 checks passed
buildkite/synapse Build #3615 passed (12 minutes, 56 seconds)
Details
buildkite/synapse/check-sample-config Passed (1 minute, 35 seconds)
Details
buildkite/synapse/check-style Passed (1 minute, 27 seconds)
Details
buildkite/synapse/isort Passed (59 seconds)
Details
buildkite/synapse/newspaper-newsfile Passed (45 seconds)
Details
buildkite/synapse/packaging Passed (54 seconds)
Details
buildkite/synapse/pipeline Passed (10 seconds)
Details
buildkite/synapse/python-3-dot-5-slash-postgres-9-dot-5 Passed (11 minutes, 1 second)
Details
buildkite/synapse/python-3-dot-5-slash-sqlite Passed (4 minutes, 54 seconds)
Details
buildkite/synapse/python-3-dot-5-slash-sqlite-slash-old-deps Passed (6 minutes, 46 seconds)
Details
buildkite/synapse/python-3-dot-6-slash-sqlite Passed (4 minutes, 14 seconds)
Details
buildkite/synapse/python-3-dot-7-slash-postgres-11 Passed (10 minutes, 38 seconds)
Details
buildkite/synapse/python-3-dot-7-slash-postgres-9-dot-5 Passed (10 minutes, 34 seconds)
Details
buildkite/synapse/python-3-dot-7-slash-sqlite Passed (4 minutes)
Details
buildkite/synapse/sytest-python-3-dot-5-slash-postgres-9-dot-6-slash-monolith Passed (6 minutes, 2 seconds)
Details
buildkite/synapse/sytest-python-3-dot-5-slash-postgres-9-dot-6-slash-workers Passed (7 minutes, 3 seconds)
Details
buildkite/synapse/sytest-python-3-dot-5-slash-sqlite-slash-monolith Passed (4 minutes, 59 seconds)
Details
codecov/patch 58.9% of diff hit (target 0%)
Details
codecov/project 63.55% (target 0%)
Details
@JorikSchellekens JorikSchellekens deleted the joriks/opentracing_edus branch Aug 22, 2019
anoadragon453 added a commit that referenced this pull request Oct 3, 2019
Synapse 1.4.0 (2019-10-03)
==========================

Bugfixes
--------

- Redact `client_secret` in server logs. ([\#6158](#6158))

Synapse 1.4.0rc2 (2019-10-02)
=============================

Bugfixes
--------

- Fix bug in background update that adds last seen information to the `devices` table, and improve its performance on Postgres. ([\#6135](#6135))
- Fix bad performance of censoring redactions background task. ([\#6141](#6141))
- Fix fetching censored redactions from DB, which caused APIs like initial sync to fail if it tried to include the censored redaction. ([\#6145](#6145))
- Fix exceptions when storing large retry intervals for down remote servers. ([\#6146](#6146))

Internal Changes
----------------

- Fix up sample config entry for `redaction_retention_period` option. ([\#6117](#6117))

Synapse 1.4.0rc1 (2019-09-26)
=============================

Note that this release includes significant changes around 3pid
verification. Administrators are reminded to review the [upgrade notes](UPGRADE.rst#upgrading-to-v140).

Features
--------

- Changes to 3pid verification:
  - Add the ability to send registration emails from the homeserver rather than delegating to an identity server. ([\#5835](#5835), [\#5940](#5940), [\#5993](#5993), [\#5994](#5994), [\#5868](#5868))
  - Replace `trust_identity_server_for_password_resets` config option with `account_threepid_delegates`, and make the `id_server` parameteter optional on `*/requestToken` endpoints, as per [MSC2263](matrix-org/matrix-doc#2263). ([\#5876](#5876), [\#5969](#5969), [\#6028](#6028))
  - Switch to using the v2 Identity Service `/lookup` API where available, with fallback to v1. (Implements [MSC2134](matrix-org/matrix-doc#2134) plus `id_access_token authentication` for v2 Identity Service APIs from [MSC2140](matrix-org/matrix-doc#2140)). ([\#5897](#5897))
  - Remove `bind_email` and `bind_msisdn` parameters from `/register` ala [MSC2140](matrix-org/matrix-doc#2140). ([\#5964](#5964))
  - Add `m.id_access_token` to `unstable_features` in `/versions` as per [MSC2264](matrix-org/matrix-doc#2264). ([\#5974](#5974))
  - Use the v2 Identity Service API for 3PID invites. ([\#5979](#5979))
  - Add `POST /_matrix/client/unstable/account/3pid/unbind` endpoint from [MSC2140](matrix-org/matrix-doc#2140) for unbinding a 3PID from an identity server without removing it from the homeserver user account. ([\#5980](#5980), [\#6062](#6062))
  - Use `account_threepid_delegate.email` and `account_threepid_delegate.msisdn` for validating threepid sessions. ([\#6011](#6011))
  - Allow homeserver to handle or delegate email validation when adding an email to a user's account. ([\#6042](#6042))
  - Implement new Client Server API endpoints `/account/3pid/add` and `/account/3pid/bind` as per [MSC2290](matrix-org/matrix-doc#2290). ([\#6043](#6043))
  - Add an unstable feature flag for separate add/bind 3pid APIs. ([\#6044](#6044))
  - Remove `bind` parameter from Client Server POST `/account` endpoint as per [MSC2290](matrix-org/matrix-doc#2290). ([\#6067](#6067))
  - Add `POST /add_threepid/msisdn/submit_token` endpoint for proxying submitToken on an `account_threepid_handler`. ([\#6078](#6078))
  - Add `submit_url` response parameter to `*/msisdn/requestToken` endpoints. ([\#6079](#6079))
  - Add `m.require_identity_server` flag to /version's unstable_features. ([\#5972](#5972))
- Enhancements to OpenTracing support:
  - Make OpenTracing work in worker mode. ([\#5771](#5771))
  - Pass OpenTracing contexts between servers when transmitting EDUs. ([\#5852](#5852))
  - OpenTracing for device list updates. ([\#5853](#5853))
  - Add a tag recording a request's authenticated entity and corresponding servlet in OpenTracing. ([\#5856](#5856))
  - Add minimum OpenTracing for client servlets. ([\#5983](#5983))
  - Check at setup that OpenTracing is installed if it's enabled in the config. ([\#5985](#5985))
  - Trace replication send times. ([\#5986](#5986))
  - Include missing OpenTracing contexts in outbout replication requests. ([\#5982](#5982))
  - Fix sending of EDUs when OpenTracing is enabled with an empty whitelist. ([\#5984](#5984))
  - Fix invalid references to None while OpenTracing if the log context slips. ([\#5988](#5988), [\#5991](#5991))
  - OpenTracing for room and e2e keys. ([\#5855](#5855))
  - Add OpenTracing span over HTTP push processing. ([\#6003](#6003))
- Add an admin API to purge old rooms from the database. ([\#5845](#5845))
- Retry well-known lookups if we have recently seen a valid well-known record for the server. ([\#5850](#5850))
- Add support for filtered room-directory search requests over federation ([MSC2197](matrix-org/matrix-doc#2197), in order to allow upcoming room directory query performance improvements. ([\#5859](#5859))
- Correctly retry all hosts returned from SRV when we fail to connect. ([\#5864](#5864))
- Add admin API endpoint for setting whether or not a user is a server administrator. ([\#5878](#5878))
- Enable cleaning up extremities with dummy events by default to prevent undue build up of forward extremities. ([\#5884](#5884))
- Add config option to sign remote key query responses with a separate key. ([\#5895](#5895))
- Add support for config templating. ([\#5900](#5900))
- Users with the type of "support" or "bot" are no longer required to consent. ([\#5902](#5902))
- Let synctl accept a directory of config files. ([\#5904](#5904))
- Increase max display name size to 256. ([\#5906](#5906))
- Add admin API endpoint for getting whether or not a user is a server administrator. ([\#5914](#5914))
- Redact events in the database that have been redacted for a week. ([\#5934](#5934))
- New prometheus metrics:
  - `synapse_federation_known_servers`: represents the total number of servers your server knows about (i.e. is in rooms with), including itself. Enable by setting `metrics_flags.known_servers` to True in the configuration.([\#5981](#5981))
  - `synapse_build_info`: exposes the Python version, OS version, and Synapse version of the running server. ([\#6005](#6005))
- Give appropriate exit codes when synctl fails. ([\#5992](#5992))
- Apply the federation blacklist to requests to identity servers. ([\#6000](#6000))
- Add `report_stats_endpoint` option to configure where stats are reported to, if enabled. Contributed by @Sorunome. ([\#6012](#6012))
- Add config option to increase ratelimits for room admins redacting messages. ([\#6015](#6015))
- Stop sending federation transactions to servers which have been down for a long time. ([\#6026](#6026))
- Make the process for mapping SAML2 users to matrix IDs more flexible. ([\#6037](#6037))
- Return a clearer error message when a timeout occurs when attempting to contact an identity server. ([\#6073](#6073))
- Prevent password reset's submit_token endpoint from accepting trailing slashes. ([\#6074](#6074))
- Return 403 on `/register/available` if registration has been disabled. ([\#6082](#6082))
- Explicitly log when a homeserver does not have the `trusted_key_servers` config field configured. ([\#6090](#6090))
- Add support for pruning old rows in `user_ips` table. ([\#6098](#6098))

Bugfixes
--------

- Don't create broken room when `power_level_content_override.users` does not contain `creator_id`. ([\#5633](#5633))
- Fix database index so that different backup versions can have the same sessions. ([\#5857](#5857))
- Fix Synapse looking for config options `password_reset_failure_template` and `password_reset_success_template`, when they are actually `password_reset_template_failure_html`, `password_reset_template_success_html`. ([\#5863](#5863))
- Fix stack overflow when recovering an appservice which had an outage. ([\#5885](#5885))
- Fix error message which referred to `public_base_url` instead of `public_baseurl`. Thanks to @aaronraimist for the fix! ([\#5909](#5909))
- Fix 404 for thumbnail download when `dynamic_thumbnails` is `false` and the thumbnail was dynamically generated. Fix reported by rkfg. ([\#5915](#5915))
- Fix a cache-invalidation bug for worker-based deployments. ([\#5920](#5920))
- Fix admin API for listing media in a room not being available with an external media repo. ([\#5966](#5966))
- Fix list media admin API always returning an error. ([\#5967](#5967))
- Fix room and user stats tracking. ([\#5971](#5971), [\#5998](#5998), [\#6029](#6029))
- Return a `M_MISSING_PARAM` if `sid` is not provided to `/account/3pid`. ([\#5995](#5995))
- `federation_certificate_verification_whitelist` now will not cause `TypeErrors` to be raised (a regression in 1.3). Additionally, it now supports internationalised domain names in their non-canonical representation. ([\#5996](#5996))
- Only count real users when checking for auto-creation of auto-join room. ([\#6004](#6004))
- Ensure support users can be registered even if MAU limit is reached. ([\#6020](#6020))
- Fix bug where login error was shown incorrectly on SSO fallback login. ([\#6024](#6024))
- Fix bug in calculating the federation retry backoff period. ([\#6025](#6025))
- Prevent exceptions being logged when extremity-cleanup events fail due to lack of user consent to the terms of service. ([\#6053](#6053))
- Remove POST method from password-reset `submit_token` endpoint until we implement `submit_url` functionality. ([\#6056](#6056))
- Fix logcontext spam on non-Linux platforms. ([\#6059](#6059))
- Ensure query parameters in email validation links are URL-encoded. ([\#6063](#6063))
- Fix a bug which caused SAML attribute maps to be overridden by defaults. ([\#6069](#6069))
- Fix the logged number of updated items for the `users_set_deactivated_flag` background update. ([\#6092](#6092))
- Add `sid` to `next_link` for email validation. ([\#6097](#6097))
- Threepid validity checks on msisdns should not be dependent on `threepid_behaviour_email`. ([\#6104](#6104))
- Ensure that servers which are not configured to support email address verification do not offer it in the registration flows. ([\#6107](#6107))

Updates to the Docker image
---------------------------

- Avoid changing `UID/GID` if they are already correct. ([\#5970](#5970))
- Provide `SYNAPSE_WORKER` envvar to specify python module. ([\#6058](#6058))

Improved Documentation
----------------------

- Convert documentation to markdown (from rst) ([\#5849](#5849))
- Update `INSTALL.md` to say that Python 2 is no longer supported. ([\#5953](#5953))
- Add developer documentation for using SAML2. ([\#6032](#6032))
- Add some notes on rolling back to v1.3.1. ([\#6049](#6049))
- Update the upgrade notes. ([\#6050](#6050))

Deprecations and Removals
-------------------------

- Remove shared-secret registration from `/_matrix/client/r0/register` endpoint. Contributed by Awesome Technologies Innovationslabor GmbH. ([\#5877](#5877))
- Deprecate the `trusted_third_party_id_servers` option. ([\#5875](#5875))

Internal Changes
----------------

- Lay the groundwork for structured logging output. ([\#5680](#5680))
- Retry well-known lookup before the cache expires, giving a grace period where the remote well-known can be down but we still use the old result. ([\#5844](#5844))
- Remove log line for debugging issue #5407. ([\#5860](#5860))
- Refactor the Appservice scheduler code. ([\#5886](#5886))
- Compatibility with v2 Identity Service APIs other than /lookup. ([\#5892](#5892), [\#6013](#6013))
- Stop populating some unused tables. ([\#5893](#5893), [\#6047](#6047))
- Add missing index on `users_in_public_rooms` to improve the performance of directory queries. ([\#5894](#5894))
- Improve the logging when we have an error when fetching signing keys. ([\#5896](#5896))
- Add support for database engine-specific schema deltas, based on file extension. ([\#5911](#5911))
- Update Buildkite pipeline to use plugins instead of buildkite-agent commands. ([\#5922](#5922))
- Add link in sample config to the logging config schema. ([\#5926](#5926))
- Remove unnecessary parentheses in return statements. ([\#5931](#5931))
- Remove unused `jenkins/prepare_sytest.sh` file. ([\#5938](#5938))
- Move Buildkite pipeline config to the pipelines repo. ([\#5943](#5943))
- Remove unnecessary return statements in the codebase which were the result of a regex run. ([\#5962](#5962))
- Remove left-over methods from v1 registration API. ([\#5963](#5963))
- Cleanup event auth type initialisation. ([\#5975](#5975))
- Clean up dependency checking at setup. ([\#5989](#5989))
- Update OpenTracing docs to use the unified `trace` method. ([\#5776](#5776))
- Small refactor of function arguments and docstrings in` RoomMemberHandler`. ([\#6009](#6009))
- Remove unused `origin` argument on `FederationHandler.add_display_name_to_third_party_invite`. ([\#6010](#6010))
- Add a `failure_ts` column to the `destinations` database table. ([\#6016](#6016), [\#6072](#6072))
- Clean up some code in the retry logic. ([\#6017](#6017))
- Fix the structured logging tests stomping on the global log configuration for subsequent tests. ([\#6023](#6023))
- Clean up the sample config for SAML authentication. ([\#6064](#6064))
- Change mailer logging to reflect Synapse doesn't just do chat notifications by email now. ([\#6075](#6075))
- Move last-seen info into devices table. ([\#6089](#6089))
- Remove unused parameter to `get_user_id_by_threepid`. ([\#6099](#6099))
- Refactor the user-interactive auth handling. ([\#6105](#6105))
- Refactor code for calculating registration flows. ([\#6106](#6106))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.