Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

age logic for events is broken #8429

Open
richvdh opened this issue Sep 30, 2020 · 6 comments
Open

age logic for events is broken #8429

richvdh opened this issue Sep 30, 2020 · 6 comments
Labels
T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Consult-Clients Need to investigate if this change breaks clients

Comments

@richvdh
Copy link
Member

richvdh commented Sep 30, 2020

There is a bunch of code inside synapse which tries to add an unspecced age field to events.

The intention of such a field is to try to mitigate problems with incorrectly-set clocks. Rather than saying "this event was created at 12:00", we can say "this event was created 10 minutes ago"; for certain applications (notably VoIP signalling), that information is more useful. It would be more useful still if it was actually specced, but that's a side-issue.

Considering a flow from client to client over federation, I think the intention of the implementation is this:

  • Server A receives event from sending client. It records its best estimate of the timestamp the client created the event (typically the time it received the request from the client) in a field in the event called age_ts.
  • Server A creates a transaction including the event to send to server B. Just before it sends that transaction over the wire, it removes age_ts from the event and replaces it with a raw age field giving the number of milliseconds that have elapsed since age_ts.
  • Server B receives the transaction, and recreates age_ts based on the time it received the transaction.
  • Server B formats the event for a client, and replaces age_ts with age.

Assuming that a given server's clock is consistently inaccurate, it is kinda plausible, though obviously it neglects any delays between formatting transactions and them being received at the destination.

However, I suspect it doesn't actually work. The C-S code in synapse stores age_ts in the unsigned property of the event (which is probably correct no, we have internal_metadata for this sort of thing); however, the logic in the federation sender which tries to replace age_ts actually looks for the property in the top level of the event rather than in unsigned. It therefore leaves age_ts in place in unsigned. The federation receiver logic will accept an age in either unsigned or at the top level, and replaces it with an age_ts at the top level. The C-S code doesn't know to strip it out of there, which means it leaks out to the C-S API (see matrix-org/matrix-spec-proposals#2685).

This has clearly been broken for ages, and nobody has really noticed. Given none of it is specced, I am inclined to say that we should strip it all out. (Update 2023/01/17: it's specced nowadays.)

@DMRobertson
Copy link
Contributor

There is a bunch of code inside synapse which tries to add an unspecced age field to events (see matrix-org/matrix-spec-proposals#684, matrix-org/matrix-spec-proposals#2685).

AFAICS this is specced. See https://spec.matrix.org/v1.2/rooms/v1/#event-format through https://spec.matrix.org/v1.2/rooms/v9/#event-format. The UnsignedData type definition mentions an (optional?) age field.

@DMRobertson
Copy link
Contributor

Some spec history:

@richvdh
Copy link
Member Author

richvdh commented May 25, 2022

AFAICS this is specced. See https://spec.matrix.org/v1.2/rooms/v1/#event-format through https://spec.matrix.org/v1.2/rooms/v9/#event-format. The UnsignedData type definition mentions an (optional?) age field.

Hrm, it seems like it might have got a bit more specced since this issue was first opened. Note that https://spec.matrix.org/v1.1/client-server-api/ doesn't mention age outside of examples.

But yes, it seems pretty clear now.

@squahtx squahtx added the T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. label May 26, 2022
@BillCarsonFr
Copy link
Member

BillCarsonFr commented Jun 27, 2022

FTR, this caused a bug on android, where verification requests were discarded as their age couldn't be determined (no age field other federation).

It's not really clear if the field is mandatory. As per spec a timestamp field is mandatory for verification requests over to_devices

Required when sent as a to-device message. The POSIX timestamp in milliseconds for when the request was made. If the request is in the future by more than 5 minutes or more than 10 minutes in the past, the message should be ignored by the receiver.

Can't find any reference for verification via room messages, but the code was expected to use the agefrom unsigned to check for validity.

@DMRobertson DMRobertson added X-Consult-Clients Need to investigate if this change breaks clients and removed z-maintenance labels Aug 25, 2022
@DMRobertson
Copy link
Contributor

It's not really clear if the field is mandatory.

The spec does not mark age as Required, so you should treat it as optional. (Though I sympathise and would like it if the spec said "Optional" explicitly.)

As per spec a timestamp field is mandatory for verification requests over to_devices.
Can't find any reference for verification via room messages

I don't think it would make sense to do this on a room-by-room basis?

Also, it's really confusing that we have both an age and a timestamp.

I think I'm leaning towards "we should remove this from the spec and Synapse", unless anyone can explain how this is useful for clients?

@realtyem
Copy link
Contributor

Here is some information about where this gets touched meaningfully(AFAICT). FTR, I couldn't tease this behavior out of Complement anywhere during transmission, however I was able to find it exposed just before a simple_upsert so I believe it's possible this is being written to the database(More below). I don't believe after it is corrected for that it will need a test specifically in Complement/Sytest or in unit tests for a regression. Fun fact: The bit of code that causes this is almost 9 years old.

  • federation/federation_server.py, in FederationServer._handle_pdus_in_txn() here
  • events/__init__.py, in EventBase.get_pdu_json() here
  • events/utils.py,
    • in prune_event_dict() here
    • in serialize_event() here
  • events/builder.py, in create_local_event_from_event_dict() here
  • federation/sender/transaction_manager.py, in TransactionManager.send_new_transaction().json_data_cb() here
  • A whole smattering of unit tests

The ones marked with checks are the ones I suspect. The rest are easy to discount, as they look to be doing what they are 'supposed' to do(provided that age_ts is meant to be in unsigned and not internal_metadata).

(RECEIVING)

  • In federation/federation_server.py(FederationServer._handle_pdus_in_txn()), it appears that a PDU comes in for processing and:
    1. If an unsigned.age field is found:
      • It is copied to top-level age
      • It is not removed from unsigned
    2. If an age at top-level is found
      • It is copied to a top-level age_ts
      • The top-level age is then deleted.
    3. This means the top-level age_ts is created here and none of this occurs if the unsigned.age field isn't found to begin with as a top-level age is otherwise never created(As far as I could find/tell).

(SENDING)

  • In transaction_manager.py(TransactionManager.send_new_transaction().json_data_cb()),
    1. If age_ts is found at top-level
      • age_ts is mutated/copied from top-level into unsigned.age(Potentially overwriting any existing value, however I don't think it does)
      • The top-level age_ts is then deleted.
  • However, json_data_cb() gets the data it operates on from EventBase.get_pdu_json(). (for context, when EventBase.get_pdu_json() is called from the TransactionManager.send_new_transaction(), no 'timestamp' is passed in, so this conditional never gets called, which is why I don't believe it ever gets overwritten. Two ships passing in the night.) There:
    1. IF a time is passed in, AND IF an unsigned.age_ts exists:
      • Then an unsigned.age is formed
      • And unsigned.age_ts is deleted

Here is a log line demonstrating the age_ts key just before it's passed into the SQL transaction(insert_received_event_to_staging()):

From a Complement run, so not important information to worry about redacting.
synapse_main | 2023-05-15 23:18:44,627 - synapse.federation.federation_server - 1144 - DEBUG - PUT-12-!OoKQsmzESNgiMrRVMV:hs1-$-cAu52xwZ-S9G3CqO7HxZequ8wHa2Al36Gz4NHeG0zI - JASON: FederationServer: _handle_received_pdu(): Dumping pdu.get_dict() before upsert: {
synapse_main |     "auth_events": [
synapse_main |         "$z_uT1ir_fwmtkc6EWl1QEGroqigmMWHBMxpjANv83rc",
synapse_main |         "$cnH3NF73iBuNiy37FlWO7yl77SsQcdAMsUWim5qat54",
synapse_main |         "$LhmXyqX__hJqygcrJf5YQWKgTo-cqHp4TkrN3cYlvEI"
synapse_main |     ],
synapse_main |     "content": {
synapse_main |         "avatar_url": null,
synapse_main |         "displayname": "Charlie",
synapse_main |         "membership": "join"
synapse_main |     },
synapse_main |     "depth": 8,
synapse_main |     "hashes": {
synapse_main |         "sha256": "LpWqHgZD8vMgUsUxTU5yJ3TBIBP8DQZWBGmen0SMfhQ"
synapse_main |     },
synapse_main |     "origin": "hs3",
synapse_main |     "origin_server_ts": 1684192724017,
synapse_main |     "prev_events": [
synapse_main |         "$_Jx0eNdPpMEZnRij0LyQ0oKBFK60cjiRrqnHdQzeVvs"
synapse_main |     ],
synapse_main |     "room_id": "!OoKQsmzESNgiMrRVMV:hs1",
synapse_main |     "sender": "@charlie:hs3",
synapse_main |     "state_key": "@charlie:hs3",
synapse_main |     "type": "m.room.member",
synapse_main |     "age_ts": 1684192724247,
synapse_main |     "signatures": {
synapse_main |         "hs3": {
synapse_main |             "ed25519:a_QSmP": "OPbGWoTO5uR31cOiixd4AY+8VVdTIVtNdBeLduIfe4dxityvqczLRh/ETw72JGf4O76zTpmgNXMdUk8qmlkeDw"
synapse_main |         }
synapse_main |     },
synapse_main |     "unsigned": {
synapse_main |         "age": 12
synapse_main |     }
synapse_main | }

I was able to format it with json.dumps() for easier staring at. It goes into insert_received_event_to_staging if you wish to find this in an existing log line.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Consult-Clients Need to investigate if this change breaks clients
Projects
None yet
Development

No branches or pull requests

6 participants