Private Measurement Important Dimensions for Attribution

This document identifies the critical design dimensions the group is considering while evaluating designs for a private measurement API, focusing on the high-level attribution measurement use-case. It also analyzes a set of key use-cases whose support differs depending on the choices made within these dimensions.

This document tracks the Design Dimensions that have general agreement from the group.

Glossary

Major dimensions

These dimensions drastically change the architecture of the API and the nature of its output. Different choices on these axes will:

  • Likely lead to entirely different overall designs
  • More likely lead to non-interoperable designs
  • Directly impact minor dimensions

| Dimension | Description | Where do existing proposals stand? |
| --- | --- | --- |
| Server mediated | Whether the API makes use of a server to compute the output | PCM: No<br>E-ARA: No<br>A-ARA: Yes<br>IPA: Yes<br>SKAN: Yes |
| Privacy definition of API output | Differential privacy, information theoretic (e.g. entropy), k-anonymity | PCM: information theoretic<br>E-ARA: local DP + information theoretic<br>A-ARA: central DP + information theoretic<br>IPA: central DP<br>SKAN: Combination of information theoretic + k-anonymity |
| On-device / off-device attribution | Whether join/attribution occurs on-device or in a server | PCM: device<br>E-ARA: device<br>A-ARA: device<br>IPA: server<br>SKAN: device |
| Where is budgeting applied | Where are user contributions bounded within the above scope? | PCM: on-device<br>E-ARA: on-device<br>A-ARA: hybrid on-device and in servers<br>IPA: servers<br>SKAN: on-device |
| Scope of attribution | Do we support attribution across channels / ad-tech? | PCM: No<br>E-ARA: Yes - within a given ad-network<br>A-ARA: Yes - within a given ad-network<br>IPA: Yes<br>SKAN: Yes |
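
To make the "Privacy definition of API output" and "Where is budgeting applied" rows more concrete, the following is a minimal sketch of a central-DP style aggregate: each user's contribution is clipped to a cap (the bounding step) and Laplace noise is added once to the sum (the server-side step). It is not the mechanism of any proposal listed above. The function names, the one-value-per-user assumption, and the `cap` and `epsilon` parameters are all hypothetical and chosen only for illustration.

```python
import random


def clip_contribution(value: float, cap: float) -> float:
    """Bound a single user's contribution to the range [0, cap]."""
    return max(0.0, min(value, cap))


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_sum(values, cap: float, epsilon: float, rng: random.Random) -> float:
    """Sum clipped contributions and add Laplace noise with scale cap / epsilon.

    Assumption: each user contributes exactly one value, so adding or
    removing a user changes the clipped sum by at most `cap`.
    """
    clipped_total = sum(clip_contribution(v, cap) for v in values)
    return clipped_total + laplace_noise(cap / epsilon, rng)


# Hypothetical conversion values from three users; the third is clipped to 4.0.
rng = random.Random(0)
print(noisy_sum([1.0, 2.0, 10.0], cap=4.0, epsilon=1.0, rng=rng))
```

In a local-DP design the noise would instead be added on each device before anything is reported, which generally costs more utility for the same epsilon.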

Minor, but important dimensions

These design dimensions may critically impact some use-cases, but they do not drastically change the architecture of the API. It is possible to consider some differences in these dimensions while maintaining interoperability. Additionally, some of these dimensions are influenced entirely by some of the major dimensions.

| Dimension | Description | Where do existing proposals stand? |
| --- | --- | --- |
| Scope of privacy budgeting | If applicable, the axes along which privacy budgeting applies: time epoch, site, campaign, delegate | PCM: Per device/source-destination site pair. Limited in rate by reporting delay and user interaction.<br>E-ARA: Per epoch, source ⇔ destination site pair, device<br>A-ARA: Per epoch, source ⇔ destination site pair, device<br>IPA: Per week/epoch, per site, per match key<br>SKAN: Per device / source app |
| Cross device & device graph | Can events on device A be linked to events on device B? How is the device graph maintained and used? | PCM: No<br>E-ARA: Only with cross-device, same-vendor sync (w/ archived proposal)<br>A-ARA: Only with cross-device, same-vendor sync (w/ archived proposal)<br>IPA: Yes. Graph is maintained by sites setting<br>SKAN: No |
| Same device cross environment | Can events on device A be linked to events on device A across different applications? | PCM: App → Web/SFSafariViewController, Web → App<br>E-ARA: Partial w/ platform support<br>A-ARA: Partial w/ platform support<br>IPA: Yes<br>SKAN: No |
| Security guarantees of the agg infra | What kind of aggregation service would we want to support? What security properties would it need to have? | PCM: N/A<br>E-ARA: N/A<br>A-ARA: TEE w/ multi-party key holder (previously two-party MPC)<br>IPA: three-party MPC<br>SKAN: Trusted platform-owned servers (app store) |
| Stance on third party measurement providers / delegation | Can multiple third parties measure the same events? How are they restricted? Can third party code even invoke the API? | PCM: No, disallowed in iframes, etc.<br>E-ARA: Each pair of source ⇔ destination sites can delegate to a limited number of delegates.<br>A-ARA: Each pair of source ⇔ destination sites can delegate to a limited number of delegates.<br>IPA: Sites can apportion their budget across multiple delegates.<br>SKAN: No |
| Allowed input events | Clicks vs. views vs. avails vs. events from outside the platform/browser. Other restrictions on events (verifiable views/clicks, particular conversions, etc.) | PCM: Clicks<br>E-ARA: Clicks + Views/Avails<br>A-ARA: Clicks + Views/Avails<br>IPA: Clicks / Views / Avails + Offline Events<br>SKAN: Views only except for StoreKit ads |
| Time delay before reporting from the client | Does the privacy of the mechanism rely on a delay on the client before the report is sent to the report collector? | PCM: Yes: 24-48h delay<br>E-ARA: Yes: 2d1h or 7d1h or N+1h delay<br>A-ARA: Yes: 1h delay, but reducible to 0 with minor tweaks<br>IPA: No delay<br>SKAN: Yes: 24-48h delay, SKAN 4.0 has multiple windows |
| Prior configuration | What sort of a priori arrangements or commitments need to be made by parties (e.g. publishing a configuration file or committing to use a specific helper service)? Which parties need to make this commitment? | PCM: none<br>E-ARA: none<br>A-ARA: none<br>IPA: report collectors make a weekly commitment to a helper / delegate<br>SKAN: Pre-registration of ad network for source apps, ad networks need to enroll with Apple |
| Structural support (in theory) for more sophisticated privacy mechanisms, to optimize utility | E.g. for differential privacy: advanced composition, matrix mechanism, above threshold / sparse vector technique, etc. | PCM: No<br>E-ARA: Minimal (local only)<br>A-ARA: Partial<br>IPA: In principle, yes<br>SKAN: No |
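
The "Scope of privacy budgeting" row can be read as a choice of key under which contributions are metered. The sketch below illustrates that idea with a ledger keyed by epoch and source ⇔ destination site pair, roughly the shape the ARA rows describe. It is an assumption-laden illustration, not the accounting used by any proposal: `BudgetKey`, `BudgetLedger`, `epoch_of`, and the capacity and cost values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetKey:
    """Hypothetical budgeting scope: one budget per epoch and site pair."""
    epoch: int
    source_site: str
    destination_site: str


def epoch_of(timestamp_s: int, epoch_length_s: int) -> int:
    """Map a timestamp in seconds to the index of the epoch containing it."""
    return timestamp_s // epoch_length_s


class BudgetLedger:
    """Tracks how much of a fixed per-key budget has been spent."""

    def __init__(self, capacity: float) -> None:
        self.capacity = capacity
        self._spent = defaultdict(float)

    def try_consume(self, key: BudgetKey, cost: float) -> bool:
        """Spend `cost` from the key's budget; refuse if it would overdraw."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if self._spent[key] + cost > self.capacity:
            return False
        self._spent[key] += cost
        return True

    def remaining(self, key: BudgetKey) -> float:
        """Budget still available under this key."""
        return self.capacity - self._spent[key]


WEEK_S = 7 * 24 * 3600
ledger = BudgetLedger(capacity=1.0)
key = BudgetKey(epoch_of(10, WEEK_S), "publisher.example", "shop.example")
print(ledger.try_consume(key, 0.5), ledger.remaining(key))
```

Changing the fields of `BudgetKey` (for example, adding a delegate or dropping the destination site) is what moves a design along this dimension; the ledger logic itself stays the same.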

Key use-cases that differ

A number of important measurement use-cases are affected by the above design dimensions. The following is a non-exhaustive list, along with the dimensions that affect each one.

| Use-case | Proposal Support Notes | Key related dimension(s) |
| --- | --- | --- |
| View-through / Conversion-Lift / Avails / return-on-ad-spend aggregate measurement | PCM: No, only supports click events.<br>E-ARA: Yes<br>A-ARA: Yes<br>IPA: Yes<br>SKAN: Yes | Allowed input events<br>Privacy definition of API output |
| Delegating ad serving & measurement to a service-provider | PCM: No<br>E-ARA: Yes<br>A-ARA: Yes<br>IPA: Yes<br>SKAN: Yes | Stance on third party measurement providers / delegation<br>Scope of privacy budgeting |
| Optimization, e.g. predicted conversion rate (pCVR), or predicted value (pValue) to optimize return on ad spend | PCM: No<br>E-ARA: Partial (no pValue support)<br>A-ARA: Potentially in future<br>IPA: Potentially in future<br>SKAN: No | Privacy definition of API output<br>Time delay before reporting from the client<br>Security guarantees of the agg infra<br>Trigger breakdown support<br>Server mediated |
| “Real time” budget steering / monitoring (for CPA billing) | PCM: No<br>E-ARA: No<br>A-ARA: Partial<br>IPA: Partial<br>SKAN: No | Privacy definition of API output<br>Server mediated<br>Time delay before reporting from the client |
| Cross environment attribution. Attribution works across {App, Web, Webview, Offline} contexts, including across devices. | PCM: App → Web/SFSVC, Web → App<br>E-ARA: Same-device across app and web<br>A-ARA: Same-device across app and web<br>IPA: Potentially everything<br>SKAN: No | Server mediated<br>On-device / off-device attribution<br>Cross device & device graph |
| Attribution models other than last-touch (rules-based / data driven) | PCM: No<br>ARA: Partial (priority based)<br>IPA: Potentially in the future<br>SKAN: No<br>Data-driven: No for all | |
| Spam / fraud filtering | PCM: Token-based (online filtering)<br>E-ARA: Partial (online source filtering)<br>A-ARA: Partial (online trigger filtering, token-based, early stage)<br>IPA: online & offline source and trigger filtering<br>SKAN: Device attestation | On-device / off-device attribution<br>Privacy definition of API output<br>Time delay before reporting from the client |
| Multiple triggers per source. Marketers want to report accurately not only on sales, but also on sessions, baskets, etc. | PCM: max of 1 attributed conversion per source event<br>E-ARA: Views: max of 1 attributed conversion per source event; Clicks: more than 1<br>A-ARA: more than 1<br>IPA: more than 1<br>SKAN: No (just a single install) | Privacy definition of API output<br>Server mediated |
| Trigger breakdown support. What options are available for managing different types of attribution on the trigger side? | PCM: each trigger carries 4 bits of information<br>E-ARA: 3 bits (click) or 1 bit (view/avail) w/ noise<br>A-ARA: Potential to construct rich keys based on both source and trigger side data<br>IPA: None in current proposal except value, but potential in the future to construct rich keys based on trigger side data<br>SKAN: 6 bit value; SKAN 4.0 has hierarchical source identifiers (2, 3 or 4 digits) and hierarchical conversion values (6-bit or [low, medium, high]); the level of information available is based on the size of the event pool. | Privacy definition of API output<br>Server mediated |
| Flexible attribution windows. Whether there is a fixed or configurable period for attribution events. | PCM: Fixed<br>E-ARA: Flexible<br>A-ARA: Flexible<br>IPA: Flexible<br>SKAN: In SKAN 4.0 | Privacy definition of API output<br>Time delay before reporting from the client<br>On-device / off-device attribution |
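
Several rows above (attribution models other than last-touch, flexible attribution windows) turn on how a trigger is matched to a source. The following is a minimal sketch of a priority-based variant of last-touch with a configurable window, in the spirit of the "Partial (priority based)" entry. It does not reproduce any proposal's actual matching rules: the `Source` type, the `attribute` function, and the tie-breaking order are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Source:
    """Hypothetical source event (e.g. a click or a view)."""
    source_id: str
    timestamp_s: int
    priority: int = 0


def attribute(
    sources: Sequence[Source], trigger_time_s: int, window_s: int
) -> Optional[Source]:
    """Pick the source credited for a trigger, or None if none qualifies.

    A source is eligible if it occurred at or before the trigger and no more
    than `window_s` seconds earlier. Among eligible sources, the highest
    priority wins; ties are broken by the most recent timestamp, so equal
    priorities reduce to plain last-touch.
    """
    eligible = [
        s for s in sources if 0 <= trigger_time_s - s.timestamp_s <= window_s
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda s: (s.priority, s.timestamp_s))


sources = [
    Source("view-1", timestamp_s=0),
    Source("click-1", timestamp_s=50, priority=1),
    Source("view-2", timestamp_s=100),
]
winner = attribute(sources, trigger_time_s=120, window_s=200)
print(winner.source_id if winner else None)
```

Whether this logic runs on the device or on a server is the "On-device / off-device attribution" dimension; the choice affects which events are visible to the matcher, not the shape of the rule.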