Proxies must treat target signals independently from client requests #114
Do you mean the oblivious request resource -> oblivious proxy resource hop? The target can't communicate with the proxy; anything it sends is encapsulated (which is why Tiru wanted a new header field that breaks out of that encapsulation).

The most obvious example (see Erik's comment) is that a particular request is identified as abusive somehow. Letting the proxy know, so that it might treat that client differently, is the entire goal of that process. And I don't think that there is any inherent risk of that being tied to an individual request. Though I think that this is unlike the CDN reputation case, where a set of attributes is forwarded along with a request; the goal here is to have the proxy do the extra work of offloading requests by blocking or slowing them. I guess that slowing requests might have an observable effect through the timestamps.

There is a second piece, which is not request-related. A server might become overloaded and need to tell the proxy to slow down in the aggregate. A holistic signal makes sense there.
Surely the target can communicate with the proxy -- it just adds a header, just like Tiru did.
Yes, I understand the goal, but I'm saying that goal runs contrary to the competing privacy goal of OHTTP, which is to maintain client<>request unlinkability. As a simple example, consider the following scenario.

There are three clients C1, C2, and C3 sending requests to a target through a proxy in rounds. In each round, the target can guess which request corresponds to which client with some probability. Absent any information about the client, the target guesses correctly with probability 1/3. Now let's assume the proxy flags one request from one client Ci in round j with the shadowban bit. The target, in response, decides it's going to apply some rate limit or otherwise ask the proxy to treat the client differently. And let's say it does that by banning Ci from sending a request in round j+1. Now, in round j+1, the target sees only two requests. If all requests were unlinkable, the probability that the target could correctly guess the client for a request in round j and a request in round j+1 would be 1/9 (= 1/3 * 1/3). But that's not the case anymore, since the target knows the requests in j+1 correspond to the unflagged requests in round j.

This is all pretty far-fetched, but I think the core idea is simply that the proxy allowed the target to partition the anonymity set by applying per-client actions at the request of the target. Balancing abuse and privacy here seems pretty challenging.
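The partitioning argument above can be checked numerically. This is a hedged illustration, not part of any draft: the 1/2 guessing probability in round j+1 is an assumption about how the target exploits the shrunken anonymity set.

```python
from fractions import Fraction

# Three clients send one request per round through the proxy.
# Absent any linking information, the target guesses the sender of
# a given request correctly with probability 1/3.
p_round_j = Fraction(1, 3)

# If requests were truly unlinkable across rounds, guessing the
# sender of one request in round j AND one in round j+1 succeeds
# with probability 1/3 * 1/3 = 1/9.
p_unlinkable = p_round_j * p_round_j

# Now suppose the proxy bans the flagged client Ci from round j+1.
# The target sees only two requests in round j+1 and knows they come
# from the two unflagged clients of round j, so a per-request guess
# there succeeds with probability 1/2 (an assumed guessing strategy).
p_round_j1_partitioned = Fraction(1, 2)
p_partitioned = p_round_j * p_round_j1_partitioned

print(p_unlinkable)   # 1/9
print(p_partitioned)  # 1/6
```

The joint guessing probability rises from 1/9 to 1/6, which is the sense in which the per-client action partitioned the anonymity set.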
The response from the target will be encapsulated, so new headers will be hidden from the proxy. What Tiru was suggesting was that we teach the response resource to recognize a header that is not encapsulated, but copied to the outer envelope instead.
Isn't this exactly the sort of thing that we're warning about when we say that every bit can split the client population in half? That speaks to a similar restriction on per-response entropy as exists on requests. The question is whether this is orthogonal (and additive) or correlated (and therefore not able to divide the client population again).
Yes, I'm saying this header is not encapsulated. 👍
Well, sort of. Perhaps restricting per-response entropy will help, but I think this needs more analysis. In any case, if the proxy does not treat client requests independently of what information it learns from the target, then things start becoming correlated. I'd prefer we not just overlook this relationship as we reason about (a) what information proxies (and targets) can send to targets (and proxies, respectively), and (b) how the recipient of that entity uses the information. As of now, I think applying per-client restrictions to requests upon being signaled from the target is harmful (see scenario described above).
We updated draft-rdb-ohai-feedback-to-proxy to handle both the server-overload scenario and the scenario of malicious clients attacking the server. The latter case is tricky, as it can potentially be abused to identify a client. We proposed the following parameter values to address this attack:

1: Indicates that RateLimit fields are applicable to all the clients
2: Indicates that RateLimit fields are applicable only to the client that sent the request
The second value of the "ohttp-target" parameter seems to be the problematic case here, as described earlier in this issue. @tireddy2, do you think it's safe (from a client privacy perspective) for the proxy to change its behavior on a per-client basis?
Yes, the second value can potentially be abused by the target. However, the proxy does not immediately act on the second value to rate-limit the traffic from the client; instead, it starts maintaining a count of responses to the client with the "ohttp-target" parameter set to 2 (potential malicious requests) and responses without the parameter (legitimate requests). If the client has a high ratio of malicious requests to legitimate requests, the proxy can shadowban requests from the offending client for a certain duration. Do you think the proposed mechanism is safe enough to protect client privacy, or can it possibly be abused by the target?
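The counting scheme described above could be sketched roughly as follows. The class name, threshold values, and method names are all illustrative assumptions, not taken from draft-rdb-ohai-feedback-to-proxy.

```python
from collections import defaultdict

class ProxyAbuseCounter:
    """Hypothetical per-client tally of flagged vs. unflagged responses,
    used by the proxy before deciding to shadowban a client."""

    def __init__(self, ratio_threshold=0.5, min_responses=10):
        self.flagged = defaultdict(int)    # responses with ohttp-target=2
        self.unflagged = defaultdict(int)  # responses without the parameter
        self.ratio_threshold = ratio_threshold
        self.min_responses = min_responses

    def record(self, client_id, target_flagged):
        """Record one response to client_id from the target."""
        if target_flagged:
            self.flagged[client_id] += 1
        else:
            self.unflagged[client_id] += 1

    def should_shadowban(self, client_id):
        """Shadowban only after enough responses, and only when the
        ratio of flagged to total responses exceeds the threshold."""
        total = self.flagged[client_id] + self.unflagged[client_id]
        if total < self.min_responses:
            return False  # not enough evidence to act on a single flag
        return self.flagged[client_id] / total > self.ratio_threshold
```

The point of the `min_responses` floor is that a single flagged response never triggers per-client treatment, which is the property the comment above is relying on.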
So I'm seeing some fairly positive things in the new version of the feedback draft. What I think we should do is first take this discussion to the mailing list, but then talk about whether just providing rate limiting signals is the right approach or whether a more generic communication path needs to be established. Rate limiting does allow us to address a good number of the use cases we have identified so far, but it could be constraining if we find that request resources need to be updated piecemeal to enable new use cases later. A more generic signaling scheme might relieve some of that pressure. It might be that all we need is a negotiation system whereby the request resource signals what it is willing to keep outside of encapsulation and the target then uses those. Given likely deployment scenarios, that might work even without signaling in some cases, but a more explicit scheme might be better for interoperability.
No, I don't, as demonstrated above.
I don't think the question is whether rate limits are sufficient for the use cases we care about. As described via example above, the relevant question here seems to be whether the signals from target to proxy -- whatever they may be, and however they may be sent -- can be used to further partition the client anonymity set.
Fair. So what does a system that looks good look like? Do we have to frame differential treatment in terms of things that the client accepts, just like we do for added information? Because we might frame differential treatment as being roughly equated to adding information to requests.
This would be a reasonable framing, yeah. Something something "if you misuse this proxy, you run the risk of revealing information to the target"?
In your example of three clients, the proxy does not rate-limit the requests from client C1 based on a single feedback signal from the target. The proxy will block traffic from C1 only after it sees such a feedback signal for multiple requests from C1. Assuming the threshold is 10 requests, the probability of the target producing 10 successive feedback signals for C1 by chance is (1/3)^10 ≈ 0.000017. The mechanism only works if the target sees an attack pattern or garbage data in multiple requests from C1. A legitimate client will never send requests which are linkable, unlike a malicious client, which will send malformed requests.
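The threshold arithmetic above can be checked directly; (1/3)^10 appears to be the source of the quoted probability figure, though that reading is an assumption.

```python
# If the target's chance of correctly singling out C1 in any one round
# is 1/3, the chance of doing so in 10 successive rounds by luck alone
# is (1/3)**10, i.e. roughly 1.7e-5.
p_single = 1 / 3
p_ten_successive = p_single ** 10
print(f"{p_ten_successive:.6f}")  # 0.000017
```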
@tireddy2 Well, this is different from the example I proposed. Nevertheless, I'm not convinced this changes the situation in any meaningful way. I think we need to be clear about the risks here, and stating them in the way @martinthomson proposed seems like a good way to do that.
Agreed, I was trying to show the proposed mechanism does not adversely impact the privacy of legitimate clients. |
Sounds good, we can introduce a new header (e.g., h=header1:header2:header3) for the request resource to signal that it will keep header1, header2 and header3 outside of the encapsulation so the target can decide to use those headers. RateLimit-Limit can be one of the headers to start with. |
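A proxy implementation would then need to parse that list to know which headers it may leave outside the encapsulation. This is a purely illustrative sketch: the `h=header1:header2:header3` syntax comes from the comment above, but the function name and parsing details are assumptions.

```python
def parse_unencapsulated_headers(value: str) -> list[str]:
    """Parse a hypothetical 'h=header1:header2:header3' value into the
    list of header names the request resource will keep outside the
    encapsulation; anything else yields an empty list."""
    if not value.startswith("h="):
        return []
    return [name for name in value[2:].split(":") if name]

# Starting with RateLimit-Limit, as suggested above.
allowed = parse_unencapsulated_headers("h=RateLimit-Limit")
print(allowed)  # ['RateLimit-Limit']
```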
Sorry, but I don't think this is true. I'm saying that we've not demonstrated this to be the case, and the burden is on us to do so if we are to recommend it in the spec. |
Sure, the onus will in any case be on draft-rdb-ohai-feedback-to-proxy to prove that it is privacy-preserving.
Updated draft based on the discussion with the authors of Oblivious HTTP protocol draft.
Something that came up in Vienna during the rate-limiting discussion is how these signals are consumed and acted upon by proxies. We know that the proxy sending additional bits per client might allow the target to partition the anonymity set to detrimental effect. But we haven't fully explored how target->proxy signals, like rate limits, are consumed and applied.
Generally speaking, I think that proxies need to act on any target->proxy signals independently of client requests, as otherwise a target can use this signal to try to partition the anonymity set. That means, for example, that a proxy should not apply rate limits to just one misbehaving client, but should apply limits uniformly across all client requests.
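The uniform-application principle can be sketched as a shared token bucket. All names here are illustrative assumptions, not from any draft; the point is only that the target's signal changes one rate shared by everyone, and the client identity is deliberately ignored when admitting requests.

```python
import time

class UniformRateLimiter:
    """Token-bucket limiter shared by ALL clients of a target, so a
    target->proxy rate signal cannot single out one client."""

    def __init__(self, requests_per_second: float):
        self.rate = requests_per_second
        self.tokens = requests_per_second
        self.last = time.monotonic()

    def apply_target_signal(self, new_rate: float):
        # The target's signal adjusts the shared rate for everyone,
        # never a per-client rate: the anonymity set stays intact.
        self.rate = new_rate

    def allow(self, client_id: str) -> bool:
        # client_id is intentionally unused in the decision.
        now = time.monotonic()
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A per-client variant of `allow` would re-open exactly the partitioning attack described earlier in this thread, which is why the decision ignores `client_id`.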