urlencoded attribute/values have incomplete semantics #10

Closed
bblfish opened this issue Nov 26, 2015 · 25 comments

@bblfish

bblfish commented Nov 26, 2015

Webmention's urlencoded form communication mechanism has a problem when crossing contexts (e.g. between servers owned by different organisations or individuals). This is also known as cross-origin communication, and special rules apply to it in web browsers (e.g. CORS).

For most html Forms on the web this has not been a problem, as the organisation writing the page containing the form is the same as the one writing the program that parses the POSTed name/value pairs and does something with that information. But WebMention is designed to cross contexts.

The problem with urlencoded key/value pair forms when crossing origins is that the keys lack a clear interpretation across namespace boundaries. The same keys, e.g. source and target as proposed by the current WebMention spec, can have different meanings in different contexts. Since agents can come to a resource from any other server (we are in a p2p global information space after all), there has to be a way for the client to be clear, when posting something, about the meaning of what is being posted, so that client and server agree.

As an example of what could go wrong: the army could quite plausibly set up a "join the army" form, and by accident use exactly the same parameter names as webmention. Some people could, by mistake or for profit, publish links pointing to those forms, thereby leading a lot of people to join the army against their will when they actually only wanted to send someone a ping message. This was known as taking the king's shilling.

This becomes prevalent when we are building User Agents that follow relations around Web Origins, say following a distributed social network, as these simple web agents won't be able to take the context, aesthetics and meaning of the page into account before acting. These agents therefore need to know before posting what the meaning of the content they will POST is from the point of view of the receiver.

  1. One way to do this is to make sure the mime type of what is POSTed has a well understood interpretation.
    • Semantic Pingback seems to suggest that adding RDFa to the form does the job of specifying the meaning of the form. (Is that actually specified somewhere?)
    • Using Activity Streams has been proposed as they have their own mime type
  2. Another way is to extend urlencoding by turning attributes into URLs, as proposed by @melvincarvalho. This could work in that it could be argued that the meaning of urlencoded forms until now was always a client/server relation and that the parameters were therefore always interpreted relative to the form's base URL. But see the limitations discussed in issue #11: clarifying URLEncoded form meaning.
  3. The other way to do this is for the resource receiving the urlencoded form (an endpoint in webmention, a Container in LDP) to specify the interpretation of what will be sent by mapping it into a well-defined semantics (e.g. RDF). This could be done by the server providing an appropriate Link header. This is considered in more detail in issue #11: clarifying URLEncoded form meaning.

All of these answers make it easier to integrate with the SoLiD, if only for the simple reason that it then becomes possible for a POST to an LDPC to create a resource that can return a number of different representations.

@bblfish changed the title from "Lack of context WebMention" to "Lack of context in WebMention" on Nov 26, 2015
@rhiaro
Member

rhiaro commented Nov 26, 2015

In order for someone to accidentally join the army (or buy a product, sign a contract, etc) by sending a webmention using the current spec:
a) the person must create a document linking to the domain of the suspect service.
b) discover their script which handles such POST requests which has been explicitly delegated as the webmention endpoint by the service (so the service would have to be actively malicious).
c) post anything other than the two URLs required by the spec in the source and target, because I don't see how you can enter into some binding contract without posting your name, some kind of authentication, credit card number, etc...

That just doesn't seem like a realistic scenario at all.

Nevertheless, I appreciate the desire for unique properties by those who are storing/mixing their incoming mentions alongside data from all manner of other sources and wish to keep them globally unique in that context. I think noting the default namespace in the spec for those who wish to use it should be adequate. Ie. if you get a post to your webmention endpoint (which you have described as your webmention endpoint in order for people to post to it in the first place) you know what to prefix source and target with if they come in un-prefixed.
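The default-namespace idea could be sketched roughly as follows. This is only an illustration: the namespace URI and the `qualify` helper are hypothetical, and the spec under discussion does not define either.

```python
from urllib.parse import parse_qsl

# Hypothetical default namespace; not defined by the webmention spec.
DEFAULT_NS = "http://w3c.org/social/WebMention#"


def qualify(body: str, default_ns: str = DEFAULT_NS) -> dict:
    """Map each urlencoded key to a globally unique name."""
    qualified = {}
    for key, value in parse_qsl(body):
        # Keys that already look like absolute URLs are left untouched;
        # bare keys such as "source" get the default namespace prepended.
        if "://" not in key:
            key = default_ns + key
        qualified[key] = value
    return qualified
```

A receiver that stores mentions next to data from other sources could run incoming bodies through such a function before merging them.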

@bblfish
Author

bblfish commented Nov 26, 2015

@rhiaro, the army example is meant to help provide a simple example to focus your attention on security issues. It's not meant to be realistic, though it is not as unrealistic as you may think, if you read a bit of history, or if you follow the news. Or you only need to go to an auction and raise your hand at the wrong moment to understand that meaning goes beyond what you intended to mean.

Authentication is not covered by this spec, but it is not excluded either. So you can't argue that because you have not thought about authenticated webmentions that it won't be used that way.

Then what if in my profile I point my enemy to the army signup service? The resource itself has to be explicit about how it interprets incomplete information. Otherwise who else will do it?

@sandhawke
Contributor

@bblfish, am I right in understanding that you're making a slippery slope argument here? You're not saying there is a weakness in webmention itself, but rather your claim is that if folks adopt webmention they will in the future be more likely to build systems that have this kind of army-sign-up vulnerability. Is that right?

@bblfish
Author

bblfish commented Nov 26, 2015

@sandhawke there is a weakness in webmention, yes, since attribute value pairs lack a context of interpretation. The application/x-www-form-urlencoded mime type does not provide the required context. On the human-readable web this context was provided by the natural language placed around the form, together with the fact that the form is usually only sent to the server from which it came, known as the "Same Origin Policy". Here we are interpreting information from a link relation in a way that makes it programmable, and so we cannot rely on the human intuition the document web makes use of. Context therefore needs to be made explicit.

ActivityStreams does this with its mime type.
I proposed for the IndieWeb Folks in issue #11 that they add a Link header to the endpoint

200 Ok
Link: <http://w3c.org/social/WebMention>; rel="urlencoded"

I also propose that a machine readable transformation be provided so as to allow more such protocols to be developed and worked with automatically. This would allow all three communities to work together.
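On the client side, checking for such a Link header might look like the sketch below. It assumes the client has already obtained the endpoint's response headers through some earlier request; the `rel="urlencoded"` relation and the URI are the ones proposed in this comment, not part of any spec, and the regular expression is a simplified parser that does not cover every form a Link header can take.

```python
import re

# Interpretation URI proposed in this thread; illustrative only.
WEBMENTION_SEMANTICS = "http://w3c.org/social/WebMention"

# Simplified pattern: <uri>; rel="value" (quotes around the value optional).
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def declared_interpretation(link_header: str):
    """Return the URI advertised with rel="urlencoded", or None."""
    for uri, rel in LINK_RE.findall(link_header):
        # A rel value may hold several space-separated relation types.
        if "urlencoded" in rel.split():
            return uri
    return None


def safe_to_post(link_header: str) -> bool:
    """True only if the endpoint declares the webmention interpretation."""
    return declared_interpretation(link_header) == WEBMENTION_SEMANTICS
```

An agent following links automatically would call `safe_to_post` on the header value and skip the POST when it returns False.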

@sandhawke
Contributor

Sorry @bblfish I'm not seeing the actual security hole here. Can you give me a minimal Alice and Bob scenario, where some real harm is done to Alice because of how webmention is designed?

@bblfish
Author

bblfish commented Nov 26, 2015

Bob gets to know Alice at a party and she gives him her card. Someone has altered her webmention endpoint to point to the "join the military" web form and the country is in a state of war (not difficult to imagine if you follow the presidential elections in the US and the news). Bob wants to say hi to Alice, but instead says hi to the military and is recruited.

@sandhawke
Contributor

Not yet convinced, but can I modify this to: Alice and Mallory meet at a party. Alice gets Mallory's business card. Later, Alice posts about the party and mentions meeting Mallory. Following proper etiquette she links to Mallory, and ends up doing a webmention. Unbeknownst to Alice, Mallory actually set his rel=webmention endpoint to be the army recruiter, so now Alice's system actually does a webmention POST to the army recruiter. (We've skipped the need for the system intrusion in your scenario.) So you're saying that the army might have a sign-up form where the field "source" is the URL of a page belonging to the person submitting the form, that's the way they're identified, and the act of posting to the form constitutes a legally binding contract? Is that the attack?

@dissolve

To simplify it more, remember that anyone can send a webmention for anyone else. So really it's just that Alice posted data linking to anyone who has a webmention endpoint. Someone can send a webmention on her behalf without her knowing.

@bblfish
Author

bblfish commented Nov 26, 2015

@sandhawke the point is that posting two key value pairs can have many different meanings in different contexts. So anyone is free to work out a better example. If, on the other hand, the endpoint returned a representation with the following type of header, then this type of issue would not arise.

200 Ok
Link: <http://w3c.org/social/WebMention>; rel="urlencoded"
...

It's really quite simple. It's also why an LDP Container - see example 2 - returns a type header

Link: <http://www.w3.org/ns/ldp#BasicContainer>; rel="type",

This tells the agent posting that there are no further consequences to posting other than that the content of what he sends will be published at a new URL.

@sandhawke
Contributor

@dissolve I think you're going in another direction. I think the supposed hole here is that webmention gets you to POST somewhere blindly. It's a POST done without any credentials, so some would say it can't do anything bad, but I think bblfish is imagining that maybe sometimes doing a POST even without credentials could be taken as a commitment. I think there may be occasional poorly designed systems where that's true.

@bblfish
Author

bblfish commented Nov 26, 2015

@sandhawke in issue #11 I also point out that webmention may be tied in with authentication in the future, and that can't be ruled out. Some people may for example enable it to avoid spam, and I don't see why we should close that possibility from the get go.

@sandhawke
Contributor

Example: a survey with two submit buttons, VOTE YES and VOTE NO, which have different endpoints and just count the number of unique IP addresses which post anything to each of those addresses. Webmention could be used to subvert that vote. I'm not sure this is big enough to be worth worrying about, but if it is, the only fix I can think of requires another round trip. That round trip is something we might want anyway to allow conneg like @dret wants.

@dissolve

In that case it's poor design (especially since there is no mechanism for multiple endpoints). Also the poll is not checking the source to verify the fact that it's a vote for THAT poll.

@sandhawke
Contributor

@bblfish how can a header possibly help, since the agent doing the POST won't see the header until after it's completed the POST?

@sandhawke
Contributor

@dissolve yes, I said this only mattered for "poorly designed systems", as far as I can tell

@bblfish
Author

bblfish commented Nov 26, 2015

@sandhawke I answered on issue #11 as your question is more precisely related to that. The thread here concerns not just that possible answer but the whole issue of the problem of creating a resource where the mime type does not make the meaning of the content sent explicit, as is the case for application/x-www-form-urlencoded content. Attribute value pairs can clearly have different meanings in different forms. It's easy to prove. You can create a form for one endpoint, and I can easily build one that uses the same attributes to do something completely different. And in many cases neither of the endpoints would be wrong with regard to their interpretation of the sent content.
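A minimal sketch of that claim, with two made-up handlers: both parse the same application/x-www-form-urlencoded body without error, yet each attaches a different meaning to `source` and `target`. The handler names, URLs and the second endpoint's behaviour are hypothetical.

```python
from urllib.parse import parse_qs

# One and the same request body, as a webmention sender would produce it.
BODY = "source=https%3A%2F%2Falice.example%2Fpost&target=https%3A%2F%2Fbob.example%2F"


def webmention_handler(body: str) -> str:
    """Reads the pair as: the page at source links to target."""
    fields = parse_qs(body)
    return f"{fields['source'][0]} links to {fields['target'][0]}"


def redirect_handler(body: str) -> str:
    """Hypothetical unrelated endpoint that reuses the same field names."""
    fields = parse_qs(body)
    return f"redirect visitors from {fields['source'][0]} to {fields['target'][0]}"
```

Nothing in the body or its mime type tells the sender which of the two readings the receiving endpoint will apply.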

@sandhawke
Contributor

@bblfish in that case, there's no problem. If there is ambiguity in a contract, you can't hold people to your chosen interpretation. Since application/x-www-form-urlencoded is entirely semantically ambiguous, each post is entirely ambiguous, and thus there's no danger here.

@bblfish
Author

bblfish commented Nov 26, 2015

@sandhawke the form content is ambiguous in a global context, but not in a local context. In the local context you are meant to read the form before you send it. That's everyone's experience using any shopping service on the web.

The problem is that WebMention, if it is to be interesting, is aiming to build a global context that is programmable (i.e. it cannot rely on human contextual awareness as normal shopping services can). If WebMention is not interested in such an application, then Semantic Pingback certainly is.

@kevinmarks
Contributor

As webmention is defined, it is just a notification that one URL (source) linked to another (target), which is verified by reading the source URL and interpreting it. The context is provided there. It need not be sent by either party; it is an unsourced but verifiable assertion.
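The verification step can be sketched as below. This is a simplification under stated assumptions: it works on HTML that has already been fetched, matches the target by exact string comparison, and does not resolve relative URLs. A real receiver would fetch the source URL itself and apply whatever matching rules the spec requires. The class and function names are made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href values of <a> and <link> elements."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "link"):
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.add(value)


def source_mentions_target(source_html: str, target: str) -> bool:
    """True if the source document contains a link to exactly `target`."""
    collector = LinkCollector()
    collector.feed(source_html)
    return target in collector.hrefs
```

Because the receiver checks the source document itself, the identity of whoever sent the POST does not matter for this check.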

What this thought experiment does illustrate is the danger of extending the webmention protocol beyond this, as discussed in #1, #2 and #4 where additional (unverifiable by an external link) context is sent along with the source and target URLs. This could encourage implementors to accept such additional context at face value, and use the existence of a link from source to target as confirming that.

As this issue links to the Royal Scots Grenadiers, and it is authored by Henry, a webmention endpoint that accepted a 'property' parameter of 'join_the_army', as in #1, or an as:join verb, as in #4, could be misled into asserting that Henry was volunteering.

Thus this is a good argument for rejecting such amendments to the webmention protocol.

@sandhawke
Contributor

@bblfish you've made an unjustified leap. Clicking on a form can legally bind someone because it indicates agreement. That clicking might under the covers result in an HTTP POST, true. But the POST is not the agreement. If someone activates a mechanism other than that form but which causes the POST, they are in no way obligated. It is of course trivial to make a form on one site which causes a POST to another. If you want a POST on your site to mean something serious, you must have good xss security so you can make the case that the POST was really caused by that person clicking on your form. And if you have that security, webmention can't cause you any difficulty. Conclusion: there's no hole here.

@sandhawke
Contributor

@kevinmarks I'm pretty sure you're making a polemic, but if not, I believe you've misread Sarven's proposal.

@melvincarvalho

Donald wishes to say good night to his wife. He uses a webmention form to post to his wife's endpoint, pointing to the message "Good night, dear".

However Donald happens to be logged in to his work account which is connected to a drone system. By mistake the webmention is routed to a form which is used to target drone strikes. By adding "target" of his wife, the AI enabled drone system is able to deduce that Donald requires a strike to be carried out against the target.

If the weapons system had a bit more context, it could instead ignore the request, which would be semantically something more like:

[] a :WebMention ; :source <s> ; :target <t> .

Or it could be inferred from a mime type header or some other mechanism.

So we should probably do that in linked data by adding rdfs type. Hopefully I've understood the problem correctly ...

@bblfish
Author

bblfish commented Nov 27, 2015

yes @melvincarvalho that is a good example.

If @sandhawke is really serious about the importance of anonymity here, then I suggest that he make a proposal that requires that WebMention MUST be anonymous.

I am certainly interested in non anonymous WebMentions to have some protection against spammers of all types.

@melvincarvalho

@bblfish got it. So this is less a security problem with webmention itself, but rather, a problem with urlencoded forms, in general.

Are there any existing W3C specs that have standardized the passing around of form encoded variables?

If not, perhaps standardizing on a JSON version, a la activity streams, solves this problem.

@bblfish changed the title from "Lack of context in WebMention" to "URL encoding has incomplete semantics" on Nov 30, 2015
@bblfish changed the title from "URL encoding has incomplete semantics" to "urlencoded attribute/values have incomplete semantics" on Nov 30, 2015
@aaronpk
Member

aaronpk commented Dec 1, 2015

These arguments seem to be mostly poorly contrived examples, and it looks like a few of the more concrete issues have already been broken out into separate threads.

@aaronpk closed this as completed on Dec 1, 2015

7 participants