Merge pull request #38 from IRTF-PEARG/refactor36

DavidSchinazi · web-flow · commit c88801d90b15 · 2023-12-12T15:00:32.000-08:00
Refactor to separate uses of IP
diff --git a/draft-irtf-pearg-ip-address-privacy-considerations.md b/draft-irtf-pearg-ip-address-privacy-considerations.md
@@ -112,7 +112,8 @@ informative:
 
 This document provides an overview of privacy considerations related to user IP
 addresses. It includes an analysis of some current use cases for tracking of
-user IP addresses, mainly in the context of anti-abuse. It discusses the
+user IP addresses, grouping them into two categories: personalization and
+anti-abuse. This document also discusses the
 privacy issues associated with such tracking and provides input on mechanisms
 to improve the privacy of this existing model. It then captures requirements
 for proposed 'replacement signals' for IP addresses from this analysis. In
@@ -130,17 +131,18 @@ related to user IP addresses (informally, IP privacy). The draft is likely to
 evolve significantly over time and may well split into multiple drafts as
 content is added.
 
-Tracking of IP addresses is common place on the Internet today, and is
-particularly widely used in the context of anti-abuse, e.g. anti-fraud, DDoS
-management, and child protection activities. IP addresses are currently used in
-determining "reputation" {{!RFC5782}} in conjunction with other signals to
+Tracking of IP addresses is commonplace on the Internet today, and falls
+roughly into two broad categories. The first is personalization, the tailoring
+of content for a given user. The second is anti-abuse: e.g. anti-fraud, DDoS
+management, and child protection activities. The latter includes uses of IP
+addresses to determine "reputation" {{!RFC5782}} in conjunction with other signals to
 protect against malicious traffic, since these addresses are usually a
 relatively stable identifier of a request's origin. Servers use these
 reputations in determining whether or not a given packet, connection, or flow
 likely corresponds to malicious traffic. In addition, IP addresses are used in
 investigating past events and attributing responsibility.
 
-However, identifying the activity of users based on IP addresses has clear
+Personalizing content based on the user's IP address has clear
 privacy implications ({{WEBTRACKING1}}, {{WEBTRACKING2}}), e.g. user
 fingerprinting and cross-site identity linking. Many technologies exist today
 that allow users to obfuscate their external IP address to avoid such tracking,
@@ -151,8 +153,9 @@ Relay {{APPLEPRIV}}, Gnatcatcher {{GNATCATCHER}}, and Oblivious technologies
 
 General consideration about privacy for Internet protocols can be found in
 {{!RFC6973}}. This document builds upon {{!RFC6973}} and more specifically
-attempts to capture the following aspects of the tension between valid use
-cases for user identification and the related privacy concerns, including:
+attempts to capture the following aspects of the tension between use of IP
+addresses to prevent abuse, and some users' desire to prevent overzealous
+personalization:
 
 * An analysis of the current use cases, attempting to categorize/group such use
   cases where commonalities exist.
@@ -225,11 +228,59 @@ Consumption:
 : An interaction where one party primarily receives information from other
 parties.
 
-# IP address tracking
 
-## IP address use cases
+# Mitigations for IP address tracking
 
-### Anti-abuse {#antiabuse}
+The ability to track individual people by IP address has been well understood
+for decades. Due to the prevalence of systems that profile users using their IP
+addresses, countermeasures have been developed. Commercial VPNs and Tor are the
+most common methods of mitigating IP address-based tracking.
+
+- Commercial VPNs offer a layer of indirection between the user and the
+  destination; however, if the VPN endpoint's IP address is static then this
+  simply substitutes one address for another. In addition, commercial VPNs
+  replace tracking across sites with a single company that may track their
+  users' activities.
+
+- Tor is another mitigation option due to its dynamic path selection and
+  distributed network of relays; however, its current design suffers from
+  degraded performance. In addition, correct application integration is
+  difficult and not common.
+
+- Address anonymization (e.g. {{GNATCATCHER}} and similar):
+
+  - {{GNATCATCHER}} is a single-hop proxy system providing more protection
+    against third-party tracking than a traditional commercial VPN. However,
+    its design maintains the industry-standard reliance on IP addresses for
+    anti-abuse purposes and it provides near backwards compatibility for select
+    services that submit to periodic audits.
+
+  - {{APPLEPRIV}} iCloud Private Relay is described as using two proxies
+    between the client and server, and it would provide a level of protection
+    somewhere between a commercial VPN and Tor.
+
+- Recent interest has resulted in new protocols such as Oblivious DNS over
+  HTTPS ([ODoH]({{?I-D.pauly-dprive-oblivious-doh}})) and Oblivious HTTP
+  ([OHTTP]({{?I-D.thomson-ohai-ohttp}})). While they both prevent tracking by
+  individual parties, they are not intended for the general-purpose web
+  browsing use case.
+
+- The use of temporary addresses is another way to limit IP address-based
+  tracking. Changing addresses over time reduces the window of time during
+  which it is possible to easily correlate network activity when the same
+  address is employed for multiple transactions by the same host. Temporary
+  addresses have been introduced only for IPv6, as an extension of its
+  Stateless Address Autoconfiguration mechanism ({{?RFC8981}}). However, since the
+  network prefix remains the same, in many cases it remains possible to
+  identify a cellular user or a household.
+
+# Accepted Uses of IP Addresses
+
+The mitigations described above are often designed to prevent unwanted uses of
+IP addresses such as profiling users. However, they often prevent other uses of
+IP addresses that users did not necessarily want or intend to disrupt.
+
+## Anti-abuse {#antiabuse}
 
 IP addresses are a passive identifier used in defensive operations. They allow
 correlating requests, attribution, and recognizing numerous attacks, including:
@@ -248,7 +299,7 @@ correlating requests, attribution, and recognizing numerous attacks, including:
 Malicious activity recognized by one service provider may be shared with other
 services {{!RFC5782}} as a way of limiting harm.
 
-### DDoS and Botnets
+## DDoS and Botnets
 
 Cyber-attackers can leverage the good reputation of an IP address to carry out
 specific attacks that wouldn't work otherwise. Main examples are Distributed
@@ -259,7 +310,7 @@ to the attackers trigger (i.e., spoofed packets). Similarly botnets may use
 spoofed addresses in order to gain access and attack services that otherwise
 would not be reachable.
 
-### Multi-platform threat models
+## Multi-platform threat models
 
 As siloed (single-platform) abuse defenses improve, abusers have moved to
 multi-platform threat models. For example, a public discussion platform with a
@@ -274,15 +325,15 @@ addresses are commonly used to investigate, understand and communicate these
 cross-platform threats. There are very few alternatives for cross-platform
 signals.
 
-### Rough Geolocation
+## Rough Geolocation
 
 A rough geolocation can be inferred from a client's IP address, which is
 commonly known as either IP-Geo or Geo-IP. This information can have several
 useful implications. When abuse extends beyond attacks in the digital space, IP
 addresses may help identify the physical location of real-world harm, such as
 child exploitation.
 
-#### Legal compliance
+## Legal compliance
 
 Legal and regulatory compliance often needs to take the jurisdiction of the
 client into account. This is especially important in cases where regulations
@@ -291,13 +342,13 @@ universally). Because Geo-IP is often bound to the IP addresses a given ISP
 uses, and ISPs tend to operate within national borders, Geo-IP tends to be a
 good fit for server operators to comply with local laws and regulations
 
-#### Contractual obligations
+## Contractual obligations
 
 Similar to legal compliance, some content and media has licensing terms that
 are valid only for certain locations. The rough geolocation derived from IP
 addresses allow this content to be hosted on the web.
 
-#### Locally relevant content
+## Locally relevant content
 
 Rough geolocation can also be useful to tailor content to the client's location
 simply to improve their experience. A search for "coffee shop" can include
@@ -307,9 +358,9 @@ brick and mortar stores near the user and a news site can surface locally
 relevant news stories that wouldn't be as interesting to visitors from other
 locations.
 
-## Implications of IP addresses
+# Implications of IP addresses
 
-### Next-User Implications
+## Next-User Implications
 
 When an attacker uses IP addresses with "good" reputations, the collateral
 damage poses a serious risk to legitimate service providers, developers, and
@@ -318,7 +369,7 @@ temporal abuse, and legitimate users may be affected by blocklists as a result.
 This unintended impact may hurt the reputation of a service or an end user
 {{!RFC6269}}.
 
-### Privacy Implications
+## Privacy Implications
 
 IP addresses are sent in the clear throughout the packet journey over the
 Internet. As such, any observer along the path can pick it up and use it for
@@ -352,7 +403,7 @@ about user, device, and network that can be obtained via the IP address.
   which, in turn, could be the subject of further requests for subscriber
   information.
 
-### Cross-site vs Same-site
+## Cross-site vs Same-site
 
 In a web context, IP Addresses can be used to link a user's activity both
 within a single site and across multiple sites. Users may want to have a single
@@ -377,7 +428,7 @@ discussion uses the web and browsers as a concrete example, but this
 generalizes to other contexts such as linking user identity across VoIP
 solutions, DNS resolvers, video streaming platforms etc.
 
-## IP Privacy Protection and Law
+# IP Privacy Protection and Law
 
 Various countries, in the last decade, have adopted, or updated, laws that aim
 at protecting citizens privacy, which includes IP addresses. Very often, these
@@ -408,50 +459,6 @@ state, IP addresses may not be considered as personally identifiable
 information {{IP2009}}.
 
 
-## Mitigations for IP address tracking
-
-The ability to track individual people by IP address has been well understood
-for decades. Commercial VPNs and Tor are the most common methods of mitigating
-IP address-based tracking.
-
-- Commercial VPNs offer a layer of indirection between the user and the
-  destination, however if the VPN endpoint's IP address is static then this
-  simply substitutes one address for another. In addition, commercial VPNs
-  replace tracking across sites with a single company that may track their
-  users' activities.
-
-- Tor is another mitigation option due to its dynamic path selection and
-  distributed network of relays, however its current design suffers from
-  degraded performance. In addition, correct application integration is
-  difficult and not common.
-
-- Address anonymization (e.g. {{GNATCATCHER}} and similar):
-
-  - {{GNATCATCHER}} is a single-hop proxy system providing more protection
-    against third-party tracking than a traditional commercial VPN. However,
-    its design maintains the industry-standard reliance on IP addresses for
-    anti-abuse purposes and it provides near backwards compatibility for select
-    services that submit to periodic audits.
-
-  - {{APPLEPRIV}} iCloud Private Relay is described as using two proxies
-    between the client and server, and it would provide a level of protection
-    somewhere between a commercial VPN and Tor.
-
-- Recent interest has resulted in new protocols such as Oblivious DNS
-  ([ODoH]({{?I-D.pauly-dprive-oblivious-doh}})) and Oblivious HTTP
-  ([OHTTP]({{?I-D.thomson-ohai-ohttp}})). While they both prevent tracking by
-  individual parties, they are not intended for the general-purpose web
-  browsing use case.
-
-- The use of temporary addresses is another way to limit IP address-based
-  tracking. Changing addresses over time reduces the window of time during
-  which it is possible to easily correlate network activity when the same
-  address is employed for multiple transactions by the same host. Temporary
-  addresses have been introduced only for IPv6, as an extension of its
-  Stateless Address Configuration mechanism ({{?RFC8981}}). However, since the
-  network prefix remains the same, in many cases it remains possible to
-  identify a cellular user or a household.
-
 # Replacement signals for IP addresses
 
 Fundamentally, the current ecosystem operates by making the immediate peer of a