# Attribution Measurement Workflows

This notebook contains three attribution measurement workflows for digital advertising campaigns, informed by the [Attribution Data Matching Protocol (ADMaP) specification](https://iabtechlab.com/admap/).

**_The current version of this document specifies only the plaintext versions of the workflows. These serve as a reference for ongoing work to integrate the corresponding privacy-enhancing technology (PET) into each workflow._**

## Common Dependencies

In [73]:
import random
import uuid
import faker

## Example ADMaP-Compatible Data

A shared pool of keys (email addresses) is generated below. Each simulated data set that follows samples its keys from this pool, so the data sets partially overlap.

In [117]:
random.seed(123)
faker.Faker.seed(123)
fake = faker.Faker()
emails = [fake.email() for _ in range(15)]

### Publisher Engagement Events

In [118]:
es = random.sample(emails, 10)
campaigns = ['A', 'B', 'C']
regions = ['NA', 'LATAM', 'EMEA', 'APAC']
ages = ['18-24', '25-44', '45+']
events = [
    [
        random.randint(1, 1),     # Space ID (always 1 in this example)
        es[i],                    # Key
        'click',                  # Type
        random.choice(campaigns), # Campaign Name
        random.choice(regions),   # Event Region
        random.choice(ages)       # Event User's Age Range
    ]
    for i in range(10)
]
events

[[1, 'adamskayla@example.com', 'click', 'B', 'NA', '18-24'],
 [1, 'mayala@example.com', 'click', 'B', 'EMEA', '45+'],
 [1, 'robersonnancy@example.com', 'click', 'A', 'NA', '25-44'],
 [1, 'davisdouglas@example.org', 'click', 'C', 'APAC', '18-24'],
 [1, 'mcintyredominique@example.org', 'click', 'B', 'APAC', '18-24'],
 [1, 'boonedebbie@example.net', 'click', 'A', 'LATAM', '18-24'],
 [1, 'matthew61@example.net', 'click', 'B', 'APAC', '45+'],
 [1, 'ameyer@example.com', 'click', 'B', 'APAC', '18-24'],
 [1, 'alexander86@example.net', 'click', 'B', 'APAC', '18-24'],
 [1, 'rhonda97@example.com', 'click', 'A', 'APAC', '45+']]

### Advertiser Conversions

In [119]:
es = random.sample(emails, 10)
names = ['Purchase', 'Subscription']
conversions = [
    [
        random.randint(1, 1), # Space ID (always 1 in this example)
        es[i],                # Key
        random.choice(names)  # Event Name
    ]
    for i in range(10)
]
conversions

[[1, 'xmonroe@example.com', 'Purchase'],
 [1, 'davisdouglas@example.org', 'Subscription'],
 [1, 'alexander86@example.net', 'Purchase'],
 [1, 'matthew61@example.net', 'Subscription'],
 [1, 'boonedebbie@example.net', 'Purchase'],
 [1, 'aimee73@example.net', 'Subscription'],
 [1, 'rhonda97@example.com', 'Purchase'],
 [1, 'mcintyredominique@example.org', 'Purchase'],
 [1, 'matthew02@example.org', 'Purchase'],
 [1, 'mayala@example.com', 'Purchase']]

## Privacy-Preserving Aggregation of Conversions with $k$-anonymity

The workflow below joins the publisher engagement events with the advertiser conversions on their keys and counts the conversions for each distinct campaign in the overlap. The join retains the event region and age range alongside the campaign.

In [128]:
join = [
    [spaceid_e, type_e, campaign_e, region_e, age_e, name_c]
    for (spaceid_e, key_e, type_e, campaign_e, region_e, age_e) in events
    for (spaceid_c, key_c, name_c) in conversions
    if key_e == key_c
]
aggregate = {
    campaign: sum([1 for (_, _, c, _, _, _) in join if campaign == c])
    for [_, _, campaign, _, _, _] in join
}
aggregate

{'B': 4, 'C': 1, 'A': 2}
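
The plaintext workflow above does not yet enforce $k$-anonymity. One common way to apply it to an aggregate is to withhold any group whose count falls below a threshold $k$. The following is a minimal sketch under that assumption. The helper `suppress_small_groups` and the choice `k = 2` are illustrative and are not taken from the ADMaP specification. The counts are copied from the output above so that the cell runs on its own.

```python
def suppress_small_groups(counts, k):
    """Return only the groups whose count is at least k (hypothetical helper)."""
    return {group: n for (group, n) in counts.items() if n >= k}

# Per-campaign conversion counts copied from the aggregate shown above.
counts = {'B': 4, 'C': 1, 'A': 2}

# Illustrative threshold (an assumption); a suitable k depends on the deployment.
k = 2
released = suppress_small_groups(counts, k)
released
```

With `k = 2`, campaign C (one conversion) is withheld and the released aggregate is `{'B': 4, 'A': 2}`.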

## Privacy-Preserving Aggregation of Conversions with Differential Privacy

The workflow below joins the publisher engagement events with the advertiser conversions on their keys and counts the conversions for each distinct campaign in the overlap. The join retains only the Space ID, event type, campaign, and conversion name.

In [129]:
join = [
    [spaceid_e, type_e, campaign_e, name_c]
    for (spaceid_e, key_e, type_e, campaign_e, region_e, age_e) in events
    for (spaceid_c, key_c, name_c) in conversions
    if key_e == key_c
]
aggregate = {
    campaign: sum([1 for (s, t, c, n) in join if campaign == c])
    for [_, _, campaign, _] in join
}
aggregate

{'B': 4, 'C': 1, 'A': 2}
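
The plaintext workflow above releases exact counts. A standard way to make such counts differentially private is the Laplace mechanism, which adds noise drawn from a Laplace distribution with scale `sensitivity / epsilon` to each count. The sketch below assumes that each key contributes to at most one campaign count (sensitivity 1) and uses an illustrative `epsilon = 1.0`. The helper names, the parameter values, and the separate seeded generator are assumptions made for illustration and are not taken from the ADMaP specification. The counts are copied from the output above so that the cell runs on its own.

```python
import random

def laplace_noise(rng, scale):
    """Sample Laplace(0, scale) noise as the difference of two
    independent exponential samples, each with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Add independent Laplace noise to each count (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    return {
        group: n + laplace_noise(rng, scale)
        for (group, n) in counts.items()
    }

# Per-campaign conversion counts copied from the aggregate shown above.
counts = {'B': 4, 'C': 1, 'A': 2}

# A separate generator keeps the notebook's global random state untouched.
# The seed and epsilon are illustrative assumptions.
rng = random.Random(0)
noisy = noisy_counts(counts, epsilon=1.0, rng=rng)
noisy
```

Smaller values of `epsilon` give a larger noise scale and therefore stronger privacy at the cost of accuracy. This sketch shows only the arithmetic: the `random` module is not a cryptographically secure noise source, so it is unsuitable for a real release.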

## Matching Encrypted User PII and Filtered Aggregation of Conversions with HE

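The workflow below joins the publisher engagement events with the advertiser conversions on their keys and counts, for each distinct campaign in the overlap, only the conversions named `'Purchase'`. Campaigns that appear in the overlap but have no matching purchases are reported with a count of zero.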

In [127]:
join = [
    [spaceid_e, type_e, campaign_e, name_c]
    for (spaceid_e, key_e, type_e, campaign_e, region_e, age_e) in events
    for (spaceid_c, key_c, name_c) in conversions
    if key_e == key_c
]
aggregate = {
    campaign: sum([1 for (s, t, c, n) in join if campaign == c and n == 'Purchase'])
    for [_, _, campaign, _] in join
}
aggregate

{'B': 3, 'C': 0, 'A': 2}