Initial formal proposal for the AS API #5
ara4n
added some commits
Dec 30, 2014
|
Overall this feels well thought out and seems like a great start. Few notes:
and
Rooms are not currently namespaced at all, how do you propose namespacing them? If you mean room
The C-S API doesn't rely on timestamps for operational matters.
This seems to be a direct contradiction of one of the key selling points of AS:
The justification for using C-S API is so that ASes can reuse existing CS SDKs. The |
|
Q: Rooms are not currently namespaced at all, how do you propose namespacing them? If you mean room

A: I left it open for discussion, but the options i had in mind were registering lists of virtual room aliases (#matrix:matrix.org, #matrix-dev:matrix.org), regexp of aliases (#matrix.:matrix.org), or a whole vhost (.:irc.matrix.org) to the AS (for the example of an IRC gateway). You would also need to support subscribing to room IDs as well as aliases to intercept events for non-aliased rooms (but obviously you wouldn't use these to refer to named rooms on the AS)

TODO: meaning the CS API needs to support massaged timestamps

A. I'm simply asking that we let ASes override origin_server_ts in their sent events (or alternatively we ask all clients to be aware of a new origin_as_ts field, but that wouldn't be backwards compatible and feels unwieldy). It doesn't impact signing.

Q. "or the HS must delegate conversation storage entirely to the AS using a Storage API (not defined here) which allows the existing conversation store to back the HS, complete with all necessary Matrix metadata (e.g. hashes, signatures, federation DAG, etc)" This seems to be a direct contradiction of one of the key selling points of AS:

A. I think we are crosswired. I deliberately give two models of handling history provided by the AS - either replicating it into the HS when the "virtual" room is created (thus avoiding any confusion for the AS implementer, but limiting the amount of history available)... or alternatively go the whole hog and replace sqlite with their existing convo store (unspecced, scary, but necessary if you don't want to end up with two convo stores). So i don't think this is a contradiction; i will try to fix the wording assuming we are aligned.

Q. "The sending user ID must be explicitly specified, as it cannot be inferred from the access_token, which will be the same for all AS requests." The justification for using C-S API is so that ASes can reuse existing CS SDKs. The

A. I am suggesting extending the CS API slightly so that AS's can say access_token=secret-as-master-key&userid=@matthew:matrix.org in order to inject events on behalf of @matthew:matrix.org. Thus the AS doesn't have to spend its life doing login dances and maintaining userid->accesstoken mappings as we have currently in the various clientside bots and bridges. Obviously this gives ASes god privileges (modulo e2e crypto protection), but this is deliberate - ASes do useful things on behalf of the users in that HS, and are trusted. Unless of course we want users to have to opt-in somehow to them.... |
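To make the masquerading suggestion concrete, here is a minimal sketch of how an AS might build such a request URL. Only the `access_token` and `userid` query parameters come from the comment above; the base URL `https://hs.example.org`, the `/send` path and the helper name `masquerade_url` are placeholders invented for illustration, not part of any agreed API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical values, for illustration only.
HS_BASE = "https://hs.example.org"
AS_TOKEN = "secret-as-master-key"


def masquerade_url(path: str, user_id: str, as_token: str = AS_TOKEN) -> str:
    """Build a URL carrying the shared AS token plus the user to act as."""
    # urlencode percent-escapes '@' and ':' so the user ID survives transport.
    query = urlencode({"access_token": as_token, "userid": user_id})
    return f"{HS_BASE}{path}?{query}"


url = masquerade_url("/send", "@matthew:matrix.org")

# What the HS would see after parsing the query string.
params = parse_qs(urlsplit(url).query)
acting_as = params["userid"][0]
```

The same token is reused for every user, so the AS needs no per-user login or userid->accesstoken table; the cost, as noted above, is that the token holder can act as anyone on the HS.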
|
whilst i think of it: this needs to spell out how you handle bridging non-virtual users into the AS domain - eg having @matthew:matrix.org send stuff to IRC via Arathorn rather than m-Arathorn... or in a Lync gateway, how to get traffic to route back and forth correctly to the real @matthew:matrix.org user. Also need to spell out what happens with group chats which are anchored in the AS domain. We worked through this all with the XMPP gateway use case; just need to incorporate it correctly here. |
|
|
Can you think of a use case where the AS should do linearisation itself? If not, i think the HS should always do it. What sort of message modifications are you thinking of that the CS API doesn't already provide? If the AS goes down without unregistering we'd want the HS to queue and retransmit events. |
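As a rough illustration of the "queue and retransmit" behaviour suggested above, the sketch below keeps undelivered events in order and retries the whole batch until the AS acknowledges it. The class name `AsEventQueue`, the `push` callback and the event shape are all assumptions made for this example; the proposal does not define them.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Event = Dict[str, str]


class AsEventQueue:
    """Hypothetical HS-side buffer of events awaiting delivery to one AS."""

    def __init__(self, push: Callable[[List[Event]], bool]) -> None:
        self._push = push  # returns True only if the AS acknowledged the batch
        self._pending: Deque[Event] = deque()

    def enqueue(self, event: Event) -> None:
        self._pending.append(event)

    def flush(self) -> bool:
        """Try to deliver everything pending, in order; keep it all on failure."""
        if not self._pending:
            return True
        if self._push(list(self._pending)):
            self._pending.clear()
            return True
        return False

    def pending_count(self) -> int:
        return len(self._pending)


# Simulate an AS that is down for the first attempt and back for the second.
state = {"up": False}
delivered: List[Event] = []


def fake_push(batch: List[Event]) -> bool:
    if not state["up"]:
        return False
    delivered.extend(batch)
    return True


queue = AsEventQueue(fake_push)
queue.enqueue({"event_id": "$1"})
queue.enqueue({"event_id": "$2"})

first_attempt = queue.flush()       # AS is down: nothing is lost
pending_after_failure = queue.pending_count()

state["up"] = True
second_attempt = queue.flush()      # AS is back: batch delivered in order
```

A real HS would also need backoff between attempts and a way for the AS to deduplicate a batch it received but failed to acknowledge; neither is shown here.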
|
This is exactly what OAuth2 should be used for. Without that, any AS wields far too much power if they control all the users on the HS. It would be laughable to expect someone like Google to let an XMPP bridge have control over ALL of their users without an opt-in method like OAuth2. |
|
say i implement an AS that acts as a content indexer and search engine (eg with elastic search) (oh! look! another use case). It needs to subscribe to all rooms to sniff their events. Would we really want users to have to explicitly opt into this? Perhaps there's a difference between submitting traffic on behalf of a real user, and just spying passively on traffic... |
|
Indeed there is a difference between passively observing (which hey, if you don't want that, use a different HS or use E2E) and actively masquerading as a user. |
|
How does this all work in the presence of end to end encryption? To send a message as a user would the AS need their private keys? How can the AS sniff the content of events? |
|
for e2e you either invite the AS into the room and so they get the room key (or however we do it) or all bets are off. masquerading is certainly out of the question |
|
This looks mostly good to me. I would be interested in the reliability mechanisms of the HTTP push transport, as I think those will have a bearing on whether we want to support clustering of AS servers or not. Do we want to expose the HTTP push transport for use by clients? |
ara4n
added some commits
Jan 5, 2015
|
Erik: the HTTP push transport would be usable by clients if the HS implemented it (as AS API extensions are not compulsory for an HS). But it'd make little sense to do so. |
|
I've updated the draft with all the feedback above. The structure of the doc is a bit weird, and the use cases haven't been fully fleshed out due to lack of time, but all the info should be there. |
Can we not provide a trade-off and just allow them to implement the event content? The HS would ask for the content which it would then sign/continue as normal.
Use case for regexpping room IDs? They are opaque strings. By all means we may be selecting some existing room IDs to bridge with, but that is a far cry from the regex for room aliases, user IDs, etc. This should be made clearer. |
|
On 14/01/2015 15:17, Kegsay wrote:
Where would the HS store the signatures and all the matrix-specific
Both !.*:matrix.org (for a vhost), and to support aliases as well as IDs. |
|
The room ID domains are purely for namespacing the opaque ID; we shouldn't be regexpping that at all.
This just seems like an HS implementation problem. If the HS caches signatures (and given events are immutatatatable, why wouldn't it?) then this problem disappears.
I don't think that is as unwieldy as you are making it out to be. Swap a json blob in a database with a URL to poke for the json blob? |
Kegsay
referenced this pull request
Jan 14, 2015
Merged
Proposal for the HTTP API for Application Services #7
I don't think we should do this. If the user doesn't exist on Matrix then we should be 404ing/failing, not silently passing it through in the hopes that One Day they will become mapped. This also fails to account for application service namespacing (e.g. you have two very popular IRC ASes, with two different namespaces for users, which do you attempt to talk to?). See my HTTP proposal for info on this. |
|
As per real-life discussion, I strongly think that 3rd party users should be mapped to a unique user-id in Matrix - i.e. Arathorn on irc.freenode.net should always be referred to as @/irc/irc.freenode.net/Arathorn:matrix.org or whatever the convention is. This decouples the virtual user's identity from the AS which is performing the bridging. And you should expect problems if you try to run two ASes against the HS bridging the same chunk of 3rd party namespace into Matrix.

Meanwhile, the actual suggested behaviour is that if I'm on Matrix and want to talk to a 3PID like Arathorn-on-irc.freenode.net, i start off by checking for a 3PID mapping in my identity server. If the ID server has a mapping from Arathorn-on-irc.freenode.net to @arathorn:matrix.org, then we go and use that to talk to them. Otherwise, we can choose to go via a bridge (typically our local HS's AS, but the client could be configured to use another one like matrix.freenode.net, resulting in a virtual user ID of @/irc/irc.freenode.net/Arathorn:matrix.freenode.net) - and we'll end up talking to them via the gateway. |
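The lookup order described in this comment can be sketched as follows: try the identity server's 3PID mapping first, and fall back to a virtual user ID on a bridge only if no mapping exists. The `@/protocol/network/nick:domain` shape is taken from the examples above ("or whatever the convention is"); the function names and the dict standing in for an identity server are assumptions for illustration.

```python
from typing import Dict, Tuple

ThirdPartyId = Tuple[str, str, str]  # (protocol, network, nick)


def virtual_user_id(tpid: ThirdPartyId, bridge_domain: str) -> str:
    """Format a virtual user ID using the convention from the examples above."""
    protocol, network, nick = tpid
    return f"@/{protocol}/{network}/{nick}:{bridge_domain}"


def resolve(
    tpid: ThirdPartyId,
    identity_mappings: Dict[ThirdPartyId, str],
    bridge_domain: str,
) -> str:
    """Prefer a real Matrix ID from the ID server; otherwise go via a bridge."""
    mapped = identity_mappings.get(tpid)
    if mapped is not None:
        return mapped
    return virtual_user_id(tpid, bridge_domain)


arathorn = ("irc", "irc.freenode.net", "Arathorn")

# Case 1: the (stand-in) identity server knows a real Matrix user.
with_mapping = resolve(arathorn, {arathorn: "@arathorn:matrix.org"}, "matrix.org")

# Case 2: no mapping, and the client is configured to use another bridge.
via_bridge = resolve(arathorn, {}, "matrix.freenode.net")
```

Because the virtual ID depends only on the third-party identity and the bridge's domain, it stays stable regardless of which AS implementation is doing the bridging behind that domain.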
Adding a bit more justification for this:
If you namespace ASes user IDs:
If you don't namespace ASes user IDs:
We may still want to consider a layer of indirection (e.g. "here is a random URI, please give user ID") rather than regexpping the URI to form an ugly looking user ID, or we may want slightly nicer conversion rules (this will depend on #3 ) |
Kegsay
added a commit
that referenced
this pull request
Jan 15, 2015
When would this ever happen? If the non-matrix user sends a message, the AS will make a room and invite the matrix user to it. If the matrix user sends a message, Also, can you justify in the general API doc why the lazy-loading of user IDs is absolutely required? (e.g. so you don't need to create every MSISDN evar). It's briefly touched on but not explained why it is important. |
|
This is describing the HS receiving an invite (or join) from matrix for a virtual room which needs to be lazy-created by the AS. So: if your HS is hooked up to an IRC bridge AS which has registered control of the #/irc/irc.freenode.net/ namespace, then if the HS receives an invite/join from a Matrix user on that HS to participate in #/irc/irc.freenode.net/randomchannel, it's possible the HS knows nothing about this room yet, as it's never been accessed before. In this scenario, the HS sees that namespace is delegated to the AS, so queries the AS as to whether that room should exist and be lazy-created. The AS checks if that channel exists on the IRC server, and if so, the AS then goes and creates the room on the HS via the AS/CS API - creating the room with the right alias, name, permissions, history, state, etc and inserting the appropriate virtual users to represent the ones in the IRC channel. Once this is done, the HS continues on with the normal interaction with the new lazy-created room.

The reason why we need lazy-loading of user IDs is because another AS might be PSTN-gatewaying the entire MSISDN space - e.g. supporting @.tel.447700900123:whoever.com style user IDs. The space of possible valid MSISDNs is obviously huge: 10^13 or so. So when you start a conversation with one of these virtual users, they must be lazily provisioned by the AS. (For instance, the AS could reject the invite if it can't route to the given MSISDN. Alternatively, it provisions an account on the HS for that user's name, with appropriate profile info based on what it knows about that MSISDN and metadata about how it's bridging it). It's insane to think that the AS would preprovision accounts for all possible MSISDNs on the HS that it could gateway through to. |
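The lazy room creation flow described above can be sketched with two in-memory stand-ins. Everything here is hypothetical scaffolding: the class and method names, the room ID format and the direct method calls (which would be HTTP requests in practice) are assumptions; only the order of steps follows the comment.

```python
import re
from typing import Dict, Iterable


class Homeserver:
    """Toy HS that delegates one alias namespace to an application service."""

    def __init__(self, namespace_regex: str, domain: str) -> None:
        self.namespace = re.compile(namespace_regex)
        self.domain = domain
        self.rooms: Dict[str, str] = {}  # alias -> room ID
        self.app_service = None

    def create_room(self, alias: str) -> str:
        room_id = f"!lazy{len(self.rooms)}:{self.domain}"
        self.rooms[alias] = room_id
        return room_id

    def join(self, user_id: str, alias: str) -> str:
        if alias not in self.rooms:
            # Unknown room: only ask the AS if the alias is in its namespace.
            if self.app_service is None or not self.namespace.match(alias):
                raise LookupError(f"no such room: {alias}")
            if not self.app_service.on_room_query(self, alias):
                raise LookupError(f"AS refused to create: {alias}")
        return self.rooms[alias]


class IrcBridgeAs:
    """Toy AS that knows which IRC channels actually exist."""

    def __init__(self, channels: Iterable[str]) -> None:
        self.channels = set(channels)

    def on_room_query(self, hs: Homeserver, alias: str) -> bool:
        localpart = alias[1:].split(":", 1)[0]
        channel = localpart.rsplit("/", 1)[-1]
        if channel not in self.channels:
            return False
        hs.create_room(alias)  # stands in for room creation via the AS/CS API
        return True


hs = Homeserver(r"#/irc/irc\.freenode\.net/.*", "matrix.org")
hs.app_service = IrcBridgeAs({"randomchannel"})

room_id = hs.join("@matthew:matrix.org", "#/irc/irc.freenode.net/randomchannel:matrix.org")

try:
    hs.join("@matthew:matrix.org", "#/irc/irc.freenode.net/nosuchchannel:matrix.org")
    missing_rejected = False
except LookupError:
    missing_rejected = True
```

The same query-then-provision pattern would apply to virtual users such as MSISDN-based IDs: the HS never needs to hold the full space of possible identifiers, only the ones someone has actually tried to reach.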
|
Would it be helpful for the HS to include display name info for users and rooms when pushing events to the AS? It would save the AS from having to track what the current display name is for a user. |
|
We have an open problem in general on how to specify 3PID originators in If I've logged into my Matrix app with my MSISDN, and send a message to In general we don't want to leak Matrix IDs if users are identifying |
|
Also, these virtual user IDs are getting uglier and uglier. Is there so: the virtual user id for the foreign matthew@arasphere.net user on a |
I like this, but what namespace does the AS claim then? An alternative would be to do as I suggested a few comments up:
though I prefer your hash idea since it avoids adding another API and another round trip. |
|
Some application services may want exclusive rights on the namespaces they've claimed in the regex, e.g. to stop humans creating a room alias / user in the AS namespace. Other application services may not want exclusive rights but just want to sniff around (e.g. logging ASes). This concept isn't covered in the general API. |
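One way to picture the exclusive versus non-exclusive distinction raised here is a per-namespace flag that the HS consults when a human tries to register an identifier. This is only a sketch of the idea: the `Namespace` type, the `exclusive` field name and both helper functions are invented for illustration and do not appear in the general API draft.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Namespace:
    """Hypothetical namespace claim made by an application service."""

    regex: str
    exclusive: bool

    def matches(self, identifier: str) -> bool:
        return re.fullmatch(self.regex, identifier) is not None


def human_may_register(identifier: str, namespaces: List[Namespace]) -> bool:
    """Humans are blocked only by namespaces claimed exclusively."""
    return not any(ns.exclusive and ns.matches(identifier) for ns in namespaces)


def any_as_interested(identifier: str, namespaces: List[Namespace]) -> bool:
    """Events are routed to an AS whether or not its claim is exclusive."""
    return any(ns.matches(identifier) for ns in namespaces)


claims = [
    Namespace(regex=r"@/irc/.*:matrix\.org", exclusive=True),  # IRC bridge
    Namespace(regex=r"@.*", exclusive=False),                  # logging AS
]

irc_user_blocked = not human_may_register(
    "@/irc/irc.freenode.net/Arathorn:matrix.org", claims
)
alice_allowed = human_may_register("@alice:matrix.org", claims)
alice_observed = any_as_interested("@alice:matrix.org", claims)
```

Keeping the flag per namespace rather than per AS would let one service own its virtual users exclusively while still passively observing a wider set.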
Kegsay
added some commits
Feb 5, 2015
|
Sometimes application services need to create rooms (e.g. when lazy loading from room aliases). Created rooms need to have a user that created them, so federation works (as it relies on an entry existing in |
ara4n commented Dec 30, 2014
I thought more about the requirements for application services over Christmas, and here's a slightly high-level but fairly concrete proposal of what I think they should look like. This largely ignores policy servers and the Storage API idea, in favour of building on our current client-side gateway bots by giving them privileges and making them run as proper server-side services.
This hopefully decouples the AS API problem entirely from the CS v2 API discussion.
https://github.com/matrix-org/matrix-doc/blob/application-services/drafts/application_services.rst