Define cross-site versus same-site privacy risks #8

Closed
bslassey opened this issue Aug 17, 2021 · 9 comments · Fixed by #35

Comments

@bslassey
Collaborator

Fingerprinting in general and IP addresses in particular can be used to identify users both across sites and within a single website. IP Privacy and anti-fraud and abuse solutions will vary greatly based on which of these privacy risks we are attempting to address.

My suggestion is to focus on preventing cross-site re-identification and tracking, but to keep same-site re-identification and tracking out of scope for this document. WDYT?

@chris-wood
Collaborator

chris-wood commented Aug 25, 2021

I think this is right. Sites that want to re-identify users can do so with appropriately scoped signals such as first-party cookies. This leads me to think same-site re-identification is out of scope for a document about IP address privacy.

@bakkot

bakkot commented Aug 30, 2021

It seems reasonable to say that the focus of this document is to provide alternatives for the use cases served by cross-site re-identification, but I think it's important to consider the effects of IP privacy on same-site re-identification as well.

(For context, I work on an anti-abuse product at Shape Security which does exactly this sort of same-site re-identification.)

> Sites that want to re-identify users can do so with appropriately scoped signals such as first-party cookies.

Cookies are opt-in, so that's not particularly viable as an anti-abuse mechanism, particularly if account takeover or denial of service is in scope.

@chris-wood
Collaborator

> Cookies are opt-in, so that's not particularly viable as an anti-abuse mechanism, particularly if account takeover or denial of service is in scope.

To clarify, does this mean that an on-by-default signal is needed for this use case? Is it possible to request the signal, rather than have it always be sent?

@bakkot

bakkot commented Aug 30, 2021

Attackers must not be able to opt out of sending the signal. Or rather, real users need to opt out so infrequently that outright blocking anyone who does not send it is acceptable. Cookies don't work here because any first-time visitor will lack cookies for the site, which means you can't simply block anyone who lacks cookies.

So if the "request" is out-of-band, such that it might be dropped by the network for real users (or not complete before they click submit, etc), that doesn't work. If the request is in-band, so that all real users are guaranteed to have gotten the request, that works.

(At least for some use cases, though not all. Consider DDoS: that frequently takes the form of "fetch this resource from the server". If a user might fetch that resource before having ever visited the site, they're not going to have been in a position to receive any "requests" for additional signals from the server, so only an on-by-default signal is of any use.)
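The first-time-visitor problem described above can be sketched as follows. This is a minimal illustration, not any product's actual logic: the `Visit` record, the `default_signal` field, and the sample values are all hypothetical. It compares a "block if missing" rule keyed on cookies with the same rule keyed on a signal that clients send by default.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Visit:
    label: str
    is_real_user: bool
    cookie: Optional[str]          # present only if the client visited before and kept it
    default_signal: Optional[str]  # hypothetical on-by-default signal


def real_users_blocked(visits: List[Visit], attribute: str) -> List[str]:
    """Labels of real users rejected by a 'block if the attribute is missing' rule."""
    return [
        v.label
        for v in visits
        if v.is_real_user and getattr(v, attribute) is None
    ]


# Hypothetical traffic sample.
visits = [
    Visit("first-time real user", True, None, "sig-a"),
    Visit("returning real user", True, "cookie-b", "sig-b"),
    Visit("attacker discarding cookies", False, None, "sig-c"),
]

# Blocking on a missing cookie rejects a legitimate first-time visitor.
cookie_rule_false_positives = real_users_blocked(visits, "cookie")

# Blocking on a missing on-by-default signal rejects no real user in this sample.
signal_rule_false_positives = real_users_blocked(visits, "default_signal")
```

In this sample the cookie rule cannot distinguish the first-time real user from the attacker who discards cookies, which is why cookies alone cannot gate access.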

@chris-wood
Collaborator

> So if the "request" is out-of-band, such that it might be dropped by the network for real users (or not complete before they click submit, etc), that doesn't work. If the request is in-band, so that all real users are guaranteed to have gotten the request, that works.

Agreed. Something in-band seems like a suitable replacement that might address the first-time-visitor problem of cookies.

> (At least for some use cases, though not all. Consider DDoS: that frequently takes the form of "fetch this resource from the server". If a user might fetch that resource before having ever visited the site, they're not going to have been in a position to receive any "requests" for additional signals from the server, so only an on-by-default signal is of any use.)

Could the server not block the response to the client's request while it challenges the client to present the signal? e.g.,

C -> S: GET index.html
S -> C: 400, "challenge"
C -> S: GET index.html, with suitable signal in header
S -> C: 200, index.html
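A minimal sketch of this exchange, simulated in-process rather than over a real network. The header name `X-Client-Signal`, the `is_valid_signal` placeholder, and the `Request`/`Response` types are assumptions made for illustration; the thread does not specify how the signal is carried or verified.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

SIGNAL_HEADER = "X-Client-Signal"  # hypothetical header name, not from the thread


@dataclass
class Request:
    path: str
    headers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str


def is_valid_signal(value: str) -> bool:
    # Placeholder check; a real signal would need actual verification.
    return value == "valid-signal"


def handle(request: Request) -> Response:
    """Server side: withhold the resource until a valid signal is presented."""
    signal = request.headers.get(SIGNAL_HEADER)
    if signal is None or not is_valid_signal(signal):
        return Response(400, "challenge")
    return Response(200, f"contents of {request.path}")


def fetch(path: str, signal: str) -> Tuple[Response, int]:
    """Client side: retry once with the signal if challenged.

    Returns the final response and the number of round trips used.
    """
    response = handle(Request(path))
    round_trips = 1
    if response.status == 400 and response.body == "challenge":
        response = handle(Request(path, {SIGNAL_HEADER: signal}))
        round_trips += 1
    return response, round_trips
```

In this sketch a client that arrives without the signal always pays two round trips for its first fetch, and the scheme only works if the client knows how to answer the challenge.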

@bakkot

bakkot commented Aug 31, 2021

> Could the server not block the response to the client's request while it challenges the client to present the signal?

In principle, yes, at least if browsers are set up to automatically respond to such requests. (Currently if a server responds with a 400 to an <img src="">, the browser is not going to retry that request with more headers, so this would require a change to browsers.)

In practice, asking everyone to pay an additional round-trip to fetch resources seems like quite a high cost, which is likely to be untenable in many applications.

Anyway, this is getting somewhat off topic for the original thread; do you want to open a new issue to continue this discussion (or should I)?

@chris-wood
Collaborator

Forking to a new issue seems fine =) I think we're teasing out the tradeoffs piece by piece. Please go ahead and file one and we can continue there!

@sysrqb
Collaborator

sysrqb commented Nov 3, 2021

> Fingerprinting in general and IP addresses in particular can be used to identify users both across sites and within a single website. IP Privacy and anti-fraud and abuse solutions will vary greatly based on which of these privacy risks we are attempting to address.
>
> My suggestion is to focus on preventing cross-site re-identification and tracking, but to keep same-site re-identification and tracking out of scope for this document. WDYT?

Coming back to this so that my position is clear: I believe same-site re-identification (linkability) is in scope, because otherwise we leave a gap in the entire discussion about IP address privacy. Addressing same-site linkability may be out of scope for some IP Privacy solutions, but I see this document as one of the most important places to capture cross-site and same-site re-identification as use cases of IP addresses, and whether that usage can be replaced.

@sysrqb
Collaborator

sysrqb commented Nov 3, 2021

> I think this is right. Sites that want to re-identify users can do so with appropriately scoped signals such as first-party cookies. This leads me to think same-site re-identification is out of scope for a document about IP address privacy.

Following on from the above, I have this opinion because of a simple example: a user opens a browser in incognito/private-browsing mode (PBM) and visits a web site. After some time, they completely close the browser. Later, they open the browser in incognito/PBM again and visit the same web site.

In general, I believe an average person would assume their first and second visits are not linkable (assuming they don't log in or provide any identifiable information). In reality they are linkable, for multiple reasons, but that is not the expectation, and a document describing IP address privacy considerations shouldn't ignore this.
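One of those reasons can be sketched in a few lines: if the client's IP address is unchanged between the two private-browsing sessions, a site can link them without any cookie. The log entries below are invented for illustration (the addresses come from the ranges reserved for documentation), and grouping by exact IP address is only the simplest possible heuristic.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical server-side log; no cookies survive between PBM sessions.
visits = [
    {"session": "first PBM visit", "ip": "203.0.113.7", "cookie": None},
    {"session": "second PBM visit", "ip": "203.0.113.7", "cookie": None},
    {"session": "unrelated visitor", "ip": "198.51.100.4", "cookie": None},
]


def link_by_ip(entries: List[dict]) -> Dict[str, List[str]]:
    """Group sessions by source IP and keep groups with more than one session."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        groups[entry["ip"]].append(entry["session"])
    return {ip: sessions for ip, sessions in groups.items() if len(sessions) > 1}


linked = link_by_ip(visits)
```

Here `linked` associates the two private-browsing visits with each other even though neither carried a cookie.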

bslassey added a commit to bslassey/draft-ip-address-privacy that referenced this issue Jul 28, 2023