
MSC3062: Bot verification #3062

Draft
wants to merge 4 commits into base: old_master

Conversation

uhoreg (Member) commented Mar 12, 2021

uhoreg changed the title from MSCxxxx: Bot verification to MSC3062: Bot verification on Mar 12, 2021
uhoreg added the e2e, kind:feature (MSC for not-core and not-maintenance stuff), and proposal (A matrix spec change proposal) labels on Mar 12, 2021
@@ -0,0 +1,106 @@
# MSC3062: Bot verification

What is the purpose of this? What is the motivation on a security model level?

Contributor

If you can verify the bot is being run by a trusted party (and therefore trust the device), you can detect if the homeserver admin creates a malicious device to decrypt messages to the bot.

Basically the bot authenticates by saying "I own the content on this domain", and then I trust the bot if I trust that domain? In that case, the "Proposal" section might benefit from being split into a user-centric description and the low-level protocol messages.

identity given in this way matches the expected identity, or record that the
given identity is associated with the human's Matrix ID.

## Potential issues

Contributor

Perhaps note that Punycode has caused security issues relating to humans verifying URLs, and that clients need to be careful when decoding it.

Member Author

Yes, that is sort of implied by the part of the "Security considerations" section that says it depends on "the human being able to distinguish a trusted URL from an untrusted URL", but it may be worth calling this out as an example, along with other similar tricks (such as "paypaI", where the last character is an upper-case "i" rather than a lower-case "L").
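
For illustration only, here is a minimal sketch of the kind of care a client might take when displaying such a URL: decode any Punycode labels and show both forms rather than silently rendering a lookalike. The helper name and warning format are invented here, and a real client would want a maintained IDNA/UTS-46 library plus a confusable-character check rather than Python's built-in (IDNA 2003) codec.

```python
from urllib.parse import urlsplit

def display_host(url: str) -> str:
    """Decode a Punycode (xn--) host for display, flagging the decoded form."""
    host = urlsplit(url).hostname or ""
    try:
        # Python's built-in codec implements IDNA 2003; enough for a sketch.
        decoded = host.encode("ascii").decode("idna")
    except UnicodeError:
        return host  # leave hosts we cannot decode untouched
    if decoded != host:
        # The host contained encoded labels: show both forms so the human can
        # see that e.g. "xn--bcher-kva.example" is really "bücher.example".
        return f"{decoded} (Punycode: {host})"
    return decoded

print(display_host("https://xn--bcher-kva.example/verify"))
# -> bücher.example (Punycode: xn--bcher-kva.example)
```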

turt2live added the needs-implementation label (This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP.) on Jun 8, 2021
turt2live self-requested a review on July 31, 2021 05:58
Comment on lines +24 to +28
The human's client displays the URL to the human, to allow them to verify that
the URL looks legitimate (e.g. that it belongs to a domain that the human
trusts to be associated with the bot's operators). If the human accepts the
URL, the client makes a POST request to the URL with the request body being a
JSON object with the following properties:
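
Purely to make the shape of this step concrete (the quoted range cuts off before the property list), here is a rough sketch of such a POST, using the body fields that appear in the curl example further down this thread (`transaction_id`, `nonce`, `from_device`, `keys`); the actual schema is whatever the MSC text defines, not this sketch.

```python
import requests  # third-party HTTP client, used here only for brevity

def post_attestation(url: str, transaction_id: str, nonce: int,
                     from_device: str, keys: dict) -> int:
    """POST the verification payload to the bot-supplied URL (after the human
    has accepted it) and return the HTTP status code for the client to act on."""
    body = {
        "transaction_id": transaction_id,
        "nonce": nonce,
        "from_device": from_device,
        # map of key ID -> public key that the client wants to attest to
        "keys": keys,
    }
    resp = requests.post(url, json=body, timeout=10)
    return resp.status_code
```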
Member

There are a few issues with using webservers:

  1. Most bots don't have them today.
  2. They can be load balanced or serverless (AWS Lambda), and trying to direct that request through to the right backend involves either breaking the load balancing/underlying tech or exposing internal infrastructure externally (i.e. returning https://bot.nyc3.i2.example.org for a bot running in a NYC3 DC as instance 2). This then means exposing DDoS endpoints that can bring down targeted parts of the infrastructure, or allowing scraping for infrastructure details.
  3. Because the backend can't be guaranteed to be the bot, the user is actually verifying a provider/website with an arbitrary backend. For larger providers (like if Discord were to switch wholesale to Matrix) they may very well end up with a backend service that handles all of these verification requests without ever actually talking to the bot. This means the bot could still be malicious while the service provider hides that detail.

I don't really have alternatives at the moment, but the use of HTTP for verification doesn't feel like a safe route. Possibly for bots it might be sane enough to verify based purely off the ability to establish an Olm session?

  1. This is an annoying requirement. But it seems like the only widely-available trust source today is DNS names and the web PKI. The only alternatives I can think of are email signing certs and DNSSEC. Both of the others seem harder for the average bot to manage.
  2. I'm not really sure I understand your point. If you are load-balanced, you likely want to be able to do this verification on any backend, so this seems reasonable. If you are sharding in a way that makes this impossible, you have likely already solved this problem in order to route incoming events.
  3. I'm not sure what you mean by this. You "know" it is the bot because it gave you the URL. If you were given a URL that the bot doesn't control, then they have no way to get the provided information and do the verification. (Some exceptions may be using a pastebin to receive the "upload" and retrieve it later; I've raised another comment about this.)

One alternative would be:

The bot sticks a public key in DNS (not web-compatible) or at a well-known HTTP endpoint. This key is downloaded and used to encrypt data sent to the bot. If the bot can read the data, it is assumed to control that domain, and that can be used as an identity.

The benefit here is that the data hosted on HTTP is static, which makes it far easier to host. You just need to keep it up with a fresh TLS cert.
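
A minimal sketch of that alternative, assuming an invented well-known path (`/.well-known/matrix-bot-key`), a Curve25519 key published as base64 in JSON, and libsodium sealed boxes; none of these details come from the MSC, they only illustrate the idea of encrypting something that only the domain's key holder can read.

```python
import base64
import os
import requests
from nacl.public import PublicKey, SealedBox  # PyNaCl (libsodium)

WELL_KNOWN_PATH = "/.well-known/matrix-bot-key"  # invented for this sketch

def fetch_bot_key(domain: str) -> PublicKey:
    """Download the public key that the bot's operator published on their domain."""
    resp = requests.get(f"https://{domain}{WELL_KNOWN_PATH}", timeout=10)
    resp.raise_for_status()
    return PublicKey(base64.b64decode(resp.json()["curve25519_key"]))

def make_challenge(domain: str) -> tuple[bytes, bytes]:
    """Encrypt a random nonce to the published key. Only whoever controls the
    corresponding private key (and therefore, presumably, the domain) can
    decrypt it and echo the nonce back, which is what the client then checks."""
    nonce = os.urandom(32)
    sealed = SealedBox(fetch_bot_key(domain)).encrypt(nonce)
    return nonce, sealed
```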

- `keys`: a map of key ID to public key for each key that the client wants to
attest to

The HTTPS server responds with an HTTP code of:

We should require more than just a 200. Otherwise the bot could point its verification URL at a pastebin service or similar, as long as it finds a service that:

  1. Returns a 200 for JSON posts.
  2. Makes that data available.

For example, this can almost be done with pastebin.com. The only problem is that it requires `Content-Type: application/x-www-form-urlencoded`. (Although I don't actually see an explicit requirement for `application/json` in this MSC.) This works by using the bot-controlled `transaction_id` to pass the required parameters. You can imagine that this would be even easier if the endpoint just accepted raw content.

```
% curl -iX POST --data-binary '{"transaction_id":"=bar&api_dev_key=REDACTED&api_option=paste&api_paste_code=","nonce":123,"from_device":"devid","keys":{"a":1}}' -HContent-Type:application/x-www-form-urlencoded "https://pastebin.com/api/api_post.php"
HTTP/2 200
date: Sat, 28 Aug 2021 16:37:43 GMT
content-type: text/html; charset=UTF-8
x-custom-api-dev-id: 362667
set-cookie: pastebin_posted=REDACTED; expires=Sat, 28-Aug-2021 17:37:43 GMT; Max-Age=3600; path=/; HttpOnly
cf-cache-status: DYNAMIC
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 685ef7407fc02962-ORD

https://pastebin.com/SrNSQGBV
```

Member Author

The user is supposed to verify that the URL looks legitimate.

Also, the attacker wouldn't be able to get the data that was POSTed without the URL that the pastebin returned.

- the Matrix ID of the human, followed by `|`,
- the device ID of the human, followed by `|`,
- the Matrix ID of the bot, followed by `|`,
- the `transaction_id`, followed by `|`,

IIUC the `transaction_id` is not forbidden from including the `|` character; however, I don't think this is exploitable. But maybe it would be a good idea to require escaping anyway, just to be extra sure.
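
A small sketch of what such escaping could look like (the backslash scheme below is just one obvious choice, not something the MSC specifies): escape the escape character first, then the separator, so that joining with `|` stays unambiguous regardless of what the `transaction_id` contains.

```python
def escape_field(value: str) -> str:
    """Escape backslashes before the separator so the escaping is reversible."""
    return value.replace("\\", "\\\\").replace("|", "\\|")

def join_fields(*fields: str) -> str:
    """Build the |-separated string from escaped fields."""
    return "|".join(escape_field(f) for f in fields)

# A transaction_id of "abc|def" contributes "abc\|def" to the joined string,
# so it can no longer masquerade as an extra field boundary.
print(join_fields("@alice:example.org", "ALICEDEVICE", "@bot:example.com", "abc|def"))
```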
