MSC3360: Server Status #3360
Looks good! However, it seems there's no MSC about signaling server decommissioning? If there were to be such a feature, I think this MSC could include it. In that case the endpoint would be used not only by the server's clients but also by other servers, along the lines of: this instance is down for good, don't bother calling me again ;). It would be possible to arrange this response even with a static web server. For this to work, the endpoint would need to be available to the world (or at the very least to its peers), not just to the clients of the server. Maybe its effects could be determined by the other server: maybe clear the decommissioning status if it sees inbound federation traffic from the server?
I hear it's a real-world issue that decommissioned servers get incoming requests for long periods of time after having been terminated. Probably not a very severe issue, but an issue nevertheless. It also consumes resources on other servers to keep contacting it.
This is an interesting idea. I'm wondering if that wouldn't be better served by some additional information under the well-known instead (which is typically already handled by a front-end returning a static response). Since the server is decommissioned, it feels odd to me personally to have an API endpoint for it. I believe servers periodically check it.
I do foresee some potential issues with that though. Let's suppose you have "my-awesome-domain.com", hosted a Matrix deployment on it and eventually decommissioned it, signalling that to the federation using a hypothetical "hey this deployment is permanently gone". Now eventually you let the domain registration lapse because you no longer have any use for it, and someone else registers it and in turn wants to run a Matrix deployment on it again. If this "server is permanently gone" thing is stored by all servers it has previously talked to, the new owner of the domain won't ever be able to successfully federate with those servers again. If it's not stored but periodically checked, then once the domain changes hands the new owner would need to continue to publish this information if they don't want to get random Matrix traffic, which also feels odd.
This feels more like we need to clarify something in the s2s spec as to how, and for how long, homeservers should back off, and at which point it's OK for them to give up entirely?
Also, if servers don't want to receive any traffic any more, wouldn't it be sufficient for the server admin to force all users on their homeserver to leave all rooms before decommissioning the deployment? Since they'd no longer be participating in any room, there should be no reason for anyone to continue to federate with them.
### Retrieval of status events: `GET /_matrix/client/r0/server/status`
For best reliability it would be better if this could be completely removed from the server domain. For example, https://www.githubstatus.com/ is used by GitHub. Maybe it makes sense to advertise the preferred URL to the client, with the client expected to cache that URL until it gets a newer one?
The obvious downside is that this doesn't help a new client. In that case maybe this URL could be used as a fallback if no better URL is known (or the better URL is returning an error).
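To make the fallback idea concrete, here is a minimal sketch of how a client might pick a status URL. It assumes the client keeps a previously advertised URL in `cached_url` and otherwise falls back to the endpoint from the heading above; `fetch_status`, `DEFAULT_STATUS_PATH` and the injected `fetch` callable are hypothetical names for illustration, not part of the proposal.

```python
from typing import Callable, Optional

# Path taken from the heading under review; treating it as the fallback
# location is an assumption of this sketch.
DEFAULT_STATUS_PATH = "/_matrix/client/r0/server/status"


def fetch_status(
    homeserver_base: str,
    cached_url: Optional[str],
    fetch: Callable[[str], dict],
) -> dict:
    """Try the cached advertised URL first, then the homeserver's own endpoint."""
    candidates = []
    if cached_url:
        candidates.append(cached_url)
    candidates.append(homeserver_base.rstrip("/") + DEFAULT_STATUS_PATH)

    last_error: Optional[Exception] = None
    for url in candidates:
        try:
            return fetch(url)
        except Exception as exc:  # any failure moves on to the next candidate
            last_error = exc
    raise RuntimeError("no status endpoint reachable") from last_error
```

Passing the HTTP call in as `fetch` keeps the sketch independent of any particular HTTP library; a real client would supply its own request function there.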
I agree, and I've been trying to figure out a way to make that happen. The most obvious place to me would be to add this to either the client well-known response, or capabilities. Something like an `m.server_status_endpoint`. I'm not sure what's most appropriate, though I'm leaning towards the capabilities endpoint.
This does open up a somewhat interesting can of worms if someone were to point to a different location, say `myhomeserverstatus.com/matrix_server_status`, but someone else gains control of that domain (a lapsed registration for example). Not sure what to do about that.
> but someone else gains control of that domain

I don't think this is worth worrying about. You can already make this argument about having the Matrix server and the federation domain be on different domains. In practice it is the site operator's choice, and if they choose to use multiple domains they need to be committed to maintaining them. Plus the downside isn't that bad: it just shows an informational message and can always be undone by the homeserver operator (once they get around to it).
I'm struggling a bit with where to properly put this. The capabilities endpoint is pretty clear that we shouldn't advertise support for unstable features, so it would fit better in `/versions`'s `unstable_features` object. However, that one is only expected to be a `feature: boolean` mapping, which wouldn't let us point to a URL.
That brings me back to putting it in the well-known instead, which sort of feels appropriate for this since it allows us to direct C2S vs S2S requests at different hosts, but it feels weird to use it to offload a single endpoint.
So, where do we put this/how do we advertise it?
I think well-known sounds reasonable. It may even be preferred since it is often on different infrastructure than the server itself. It means that even first-time users or fresh clients could find the status page, which is a very nice property.
Mulling this over a bit, I was initially thinking about extending the `.well-known/matrix/client` response. But it kept feeling inappropriate to redirect a single endpoint that way, potentially opening the door to that discovery document growing and growing.
So I find myself wondering if maybe it's preferable to have something like `.well-known/matrix/server-status` instead that would return an `{m.status_endpoint: ""}`?
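As a rough illustration of what consuming such a discovery document could look like on the client side, the sketch below parses a response body and extracts `m.status_endpoint`. The function name `parse_status_discovery` is hypothetical, and rejecting anything that is not an absolute `https` URL is an assumption of this sketch rather than something the thread has settled.

```python
import json
from typing import Optional
from urllib.parse import urlparse


def parse_status_discovery(body: str) -> Optional[str]:
    """Return the advertised status endpoint, or None if the document is unusable."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(doc, dict):
        return None

    endpoint = doc.get("m.status_endpoint")
    if not isinstance(endpoint, str) or not endpoint:
        return None

    # Assumption: only absolute https URLs are accepted.
    parsed = urlparse(endpoint)
    if parsed.scheme != "https" or not parsed.netloc:
        return None
    return endpoint
```

Returning `None` for every malformed case lets the caller fall back to the homeserver's own status endpoint without distinguishing between failure modes.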
I like `.well-known/matrix/server-status`.
I suggested `.well-known/matrix/status` in #spec before arriving at this MSC; I think both could enhance each other.
Regardless, I think this concern could/should be noted under the "Potential Issues" section, or at least the drawback of this endpoint being on the same domain should be noted.
(Other than that, this is a solid proposal imo 👍)
Signed-off-by: Daniele Sluijters <daenney@users.noreply.github.com>
### Retrieval of status events: `GET /_matrix/client/r0/server/status`
I believe it'd help if a suggested value for caching/re-request time is also noted here.
I'm suggesting:
- 6 hours in normal conditions
- every 5 minutes when connection issues are present
(With exponential backoff to a 30-minute maximum if the endpoints don't work, to hold off stampeding herds.)
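The suggested timings could be sketched as a single function that decides how long a client waits before its next status request. The constants come from the list above; the function name `next_poll_delay`, the `consecutive_failures` counter, and the choice to start the backoff at 5 minutes and double it on each failure are assumptions of this sketch.

```python
NORMAL_INTERVAL = 6 * 60 * 60   # 6 hours, in seconds
DEGRADED_INTERVAL = 5 * 60      # 5 minutes, in seconds
MAX_BACKOFF = 30 * 60           # 30 minutes, in seconds


def next_poll_delay(connection_issues: bool, consecutive_failures: int) -> int:
    """Seconds to wait before the next status request."""
    if consecutive_failures > 0:
        # The status endpoint itself is failing: back off exponentially,
        # doubling from 5 minutes (an assumed starting point), capped at 30.
        backoff = DEGRADED_INTERVAL * 2 ** (consecutive_failures - 1)
        return min(backoff, MAX_BACKOFF)
    return DEGRADED_INTERVAL if connection_issues else NORMAL_INTERVAL
```

A real client would probably also add random jitter to each delay so that many clients recovering at once do not all retry at the same moment.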
This proposal aims to provide a channel through which a client can get status information about its homeserver so that it can provide more useful context to users when problems occur or provide advance notice for upcoming maintenance.