
# Pseudonyms for the EUDI Wallet

Background is detailed [here](https://github.com/AltmannPeter/webuild-architecture/blob/main/webuild-drafts/pseudonyms.md).

This notebook only illustrates pseudonym generation with a worked example; many details needed for a complete pseudonym solution are left out.

## Alias-based Pseudonyms

Recap:

* Users select an alias to use when authenticating to a service
* Requires two local mappings:
  1. Bijection between an index and a user-chosen alias for the pseudonym
  2. A dictionary mapping each RP service to an alias set (or, optionally, a pseudonym set)
* Issuer-supplied seeds enable reidentification (preferred approach pre-ZKP)
* A user-supplied seed needs to be communicated, e.g., using OID4VCI flow [H.5.](https://openid.net/specs/openid-4-verifiable-credential-issuance-1_0.html#name-wallet-initiated-issuance-du)
* Pseudonym computed as `HMAC(key=nym_seed, data=index)`
* Issuer generates pseudonyms for the indices in the range $[0, n-1]$ and includes all $n$ pseudonyms as a selectively disclosable array
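
The last two bullets can be sketched as follows. This is a minimal illustration, not a normative encoding: it assumes the index is serialized as minimal big-endian bytes (as `int_to_bytes` does in the cells below), that each array element follows the SD-JWT array-element pattern (a `[salt, value]` disclosure whose SHA-256 digest is placed in a `{"...": digest}` placeholder), and it uses only the standard library. The names `pseudonym_for_index` and `build_sd_array` are hypothetical helpers introduced here.

```python
import base64
import hashlib
import hmac as std_hmac  # aliased so it does not shadow cryptography's hmac module
import json
import secrets


def b64url(data: bytes) -> str:
    """Base64url without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def pseudonym_for_index(nym_seed: bytes, idx: int) -> str:
    """HMAC-SHA256(key=nym_seed, data=idx), idx as minimal big-endian bytes."""
    length = max(1, (idx.bit_length() + 7) // 8)
    mac = std_hmac.new(nym_seed, idx.to_bytes(length, "big"), hashlib.sha256)
    return b64url(mac.digest())


def build_sd_array(nym_seed: bytes, n: int):
    """Return (placeholders for the credential, disclosures for the holder)."""
    placeholders, disclosures = [], []
    for idx in range(n):
        salt = b64url(secrets.token_bytes(16))  # fresh random salt per element
        payload = json.dumps([salt, pseudonym_for_index(nym_seed, idx)])
        disclosure = b64url(payload.encode("utf-8"))
        digest = b64url(hashlib.sha256(disclosure.encode("ascii")).digest())
        placeholders.append({"...": digest})
        disclosures.append(disclosure)
    return placeholders, disclosures


# Demo with a random seed and n = 4 (both values are arbitrary assumptions)
placeholders, disclosures = build_sd_array(secrets.token_bytes(32), n=4)
print(json.dumps(placeholders, indent=2))
```

The holder keeps `disclosures` and reveals only the element whose index corresponds to the alias chosen for a given service.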

In [None]:
%%capture
!pip install bidict

In [None]:
import secrets
from cryptography.hazmat.primitives import hashes, hmac
import base64
import json
from dataclasses import dataclass
from bidict import bidict

In [None]:
KEY_LENGTH = 32

def int_to_bytes(n: int) -> bytes:
  if n == 0:
    return b"\x00"
  length = (n.bit_length() + 7) // 8
  return n.to_bytes(length, "big")

def int_from_bytes(b: bytes) -> int:
  return int.from_bytes(b, "big")

def base64url(x):
  return base64.urlsafe_b64encode(x).rstrip(b"=").decode("utf-8")

def sha256_digest(data: bytes) -> bytes:
  digest = hashes.Hash(hashes.SHA256())
  digest.update(data)
  return digest.finalize()

def generate_disclosure(a):
  json_bytes = json.dumps(a, ensure_ascii=False).encode("utf-8")
  return base64url(json_bytes)

def generate_sd(x):
  digest = hashes.Hash(hashes.SHA256())
  digest.update(bytes(x, 'ascii'))
  return base64url(digest.finalize())

In [None]:
class NymRegistry:
  def __init__(self):
    self._site_registry: dict[str, set[str]] = {}  # Each site can have multiple aliases
    self._alias_registry = bidict() # A bijection between idx and alias

  def register(self, site: str, alias: str):
    if site not in self._site_registry:
      self._site_registry[site] = set()
    self._site_registry[site].add(alias)

    if alias not in self._alias_registry.inv:
      idx = max(self._alias_registry, default=-1) + 1 # incrementing idx
      self._alias_registry[idx] = alias

  def get_nym_record(self, site: str) -> set[str] | None:
    return self._site_registry.get(site)

In [None]:
class Client:
  def __init__(self, family_name: str, given_name: str, personal_number: str):
    self.family_name = family_name
    self.given_name = given_name
    self.personal_number = personal_number

    self._nym_seed = secrets.token_bytes(KEY_LENGTH) # Likely issuer supplied
    self._nyms = NymRegistry()

  def add_site(self, site: str, alias: str) -> None:
    self._nyms.register(site, alias)

  def get_site_record(self, site) -> set | None:
    return self._nyms.get_nym_record(site)

  def get_alias_idx(self, alias: str) -> int | None:
    # Reverse lookup (alias -> idx); returns None for an unknown alias
    return self._nyms._alias_registry.inv.get(alias)

  def _generate_pseudonym(self, idx: int) -> str:
    h = hmac.HMAC(self._nym_seed, hashes.SHA256())
    h.update(int_to_bytes(idx))
    pseudonym = h.finalize()
    return base64url(pseudonym)

  def present_nym_seed(self) -> str:
    return self._nym_seed.hex()

  def check_nym_usage(self, alias) -> list:
    return [s for s, a in self._nyms._site_registry.items() if alias in a]

In [None]:
# Create an instance of client
client = Client(
    family_name="Doe",
    given_name="John",
    personal_number="199001011234"
)

# Add some sites
client.add_site('google.com', 'l33tMouse')
client.add_site('google.com', 'l00tMouse')
client.add_site('developers.google.com', 'l33tMouse')
client.add_site('developers.google.com', 'l8Mouse')
client.add_site('apple.com', 'n00bMouse')
client.add_site('apple.com', 'l00tMouse')

In [None]:
print(f'nym_seed: {client.present_nym_seed()}')
print("Pseudonym generation using idx=0: ", client._generate_pseudonym(0))
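# Note: in SD-JWT the {"...": ...} array placeholder carries the digest of the
# disclosure (see generate_sd); this cell prints the disclosure itself instead.
value = generate_disclosure(["salt-123", client._generate_pseudonym(0)])
print(f"_sd element of idx=0: {json.dumps({'...': value})}")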
print(f"Devices can sync alias mapping: {dict(client._nyms._alias_registry)}")
print(f"Devices can sync sites mapping: {client._nyms._site_registry}")
print(f"The alias set for google.com: {client.get_site_record('google.com')}")

nym_seed: f637f3cf9f9f0bdb45fe4dabb41d928263267f46dd5af9c7b5e2ab86afd4787c
Pseudonym generation using idx=0:  Kq59Z8FUTGAH2m8CQjTMqyn5Eb4KWsOdFK2VPWGOMLw
_sd element of idx=0: {"...": "WyJzYWx0LTEyMyIsICJLcTU5WjhGVVRHQUgybThDUWpUTXF5bjVFYjRLV3NPZEZLMlZQV0dPTUx3Il0"}
Devices can sync alias mapping: {0: 'l33tMouse', 1: 'l00tMouse', 2: 'l8Mouse', 3: 'n00bMouse'}
Devices can sync sites mapping: {'google.com': {'l33tMouse', 'l00tMouse'}, 'developers.google.com': {'l33tMouse', 'l8Mouse'}, 'apple.com': {'n00bMouse', 'l00tMouse'}}
The alias set for google.com: {'l33tMouse', 'l00tMouse'}
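
The recap states that issuer-supplied seeds enable reidentification. One way this could work is sketched below, assuming the seed holder keeps a table of seeds per wallet holder and knows the array size $n$: it recomputes `HMAC(key=nym_seed, data=index)` for each candidate and compares. The function `reidentify`, the holder identifiers, and the table layout are hypothetical and only serve the illustration; the block uses the standard library so it runs on its own.

```python
import base64
import hashlib
import hmac as std_hmac  # aliased so it does not shadow cryptography's hmac module
import secrets


def pseudonym_for_index(nym_seed: bytes, idx: int) -> str:
    """HMAC-SHA256(key=nym_seed, data=idx), idx as minimal big-endian bytes."""
    length = max(1, (idx.bit_length() + 7) // 8)
    mac = std_hmac.new(nym_seed, idx.to_bytes(length, "big"), hashlib.sha256)
    return base64.urlsafe_b64encode(mac.digest()).rstrip(b"=").decode("ascii")


def reidentify(pseudonym: str, seeds_by_holder: dict[str, bytes], n: int):
    """Return (holder_id, idx) for a known pseudonym, or None if nothing matches."""
    for holder_id, seed in seeds_by_holder.items():
        for idx in range(n):
            candidate = pseudonym_for_index(seed, idx)
            if std_hmac.compare_digest(candidate, pseudonym):
                return holder_id, idx
    return None


# Hypothetical seed table with two holders
seeds = {"holder-a": secrets.token_bytes(32), "holder-b": secrets.token_bytes(32)}
observed = pseudonym_for_index(seeds["holder-b"], 2)
print(reidentify(observed, seeds, n=4))
```

The search costs one HMAC per holder and index, so a deployment would more likely index stored pseudonyms than scan; the linear scan is used here only to keep the sketch short.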


## Directed pseudonyms

Recap:

* Pseudonyms are site specific
* Requires `rp_identifier` and `ps_context` as input to pseudonym generation
* Issuer-supplied seeds enable reidentification (preferred approach pre-ZKP)
* A user-supplied seed needs to be communicated, e.g., using OID4VCI flow [H.5.](https://openid.net/specs/openid-4-verifiable-credential-issuance-1_0.html#name-wallet-initiated-issuance-du)
* Pseudonym computed as `HMAC(key=nym_seed, data=rp_identifier || ps_context)`
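
A note on the concatenation `rp_identifier || ps_context`: with plain byte concatenation, different input pairs can produce the same HMAC input, for example `("google.com", "abc")` and `("google.coma", "bc")`. The sketch below shows one possible way to avoid this, by prefixing each part with its length. This encoding is an assumption made for illustration and is not part of the scheme described here; it yields different pseudonym values than plain concatenation. The helper names are hypothetical.

```python
import base64
import hashlib
import hmac as std_hmac  # aliased so it does not shadow cryptography's hmac module


def length_prefixed(*parts: str) -> bytes:
    """Encode each part as a 4-byte big-endian length followed by its UTF-8 bytes."""
    out = b""
    for part in parts:
        raw = part.encode("utf-8")
        out += len(raw).to_bytes(4, "big") + raw
    return out


def directed_pseudonym(nym_seed: bytes, rp_identifier: str, ps_context: str = "") -> str:
    """HMAC-SHA256 over the length-prefixed (rp_identifier, ps_context) pair."""
    data = length_prefixed(rp_identifier, ps_context)
    mac = std_hmac.new(nym_seed, data, hashlib.sha256)
    return base64.urlsafe_b64encode(mac.digest()).rstrip(b"=").decode("ascii")


seed = bytes(32)  # fixed all-zero seed, for demonstration only

# Plain concatenation cannot tell these two pairs apart ...
assert "google.com" + "abc" == "google.coma" + "bc"
# ... while the length-prefixed encoding keeps them distinct.
assert length_prefixed("google.com", "abc") != length_prefixed("google.coma", "bc")

print(directed_pseudonym(seed, "google.com", "abc"))
print(directed_pseudonym(seed, "google.coma", "bc"))
```

A fixed separator that cannot occur in either input would serve the same purpose; length prefixes avoid having to restrict the allowed characters.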

In [None]:
class Issuer:
  def __init__(self):
    self._nym_seed = secrets.token_bytes(KEY_LENGTH)

  def present_nym_seed(self) -> str:
    return base64url(self._nym_seed)


class PseudonymService:
  def __init__(self, nym_seed: bytes):
    self._nym_seed = nym_seed
    self.ps_context = None
    self.rp_identifier = None

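  def get_pseudonym(self) -> str:
    # Both inputs must be set first; an empty ps_context is a valid value
    if self.rp_identifier is None or self.ps_context is None:
      raise ValueError("call set_rp() and set_ps_context() before get_pseudonym()")
    h = hmac.HMAC(self._nym_seed, hashes.SHA256())
    # Successive update() calls are equivalent to HMAC over rp_identifier || ps_context
    h.update(self.rp_identifier.encode('utf-8'))
    h.update(self.ps_context.encode('utf-8'))
    # Reset the inputs so they are not reused for the next request
    self.ps_context = None
    self.rp_identifier = None
    return base64url(h.finalize())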

  def set_rp(self, rp_identifier: str):
    self.rp_identifier = rp_identifier

  def set_ps_context(self, ps_context: str):
    self.ps_context = ps_context

In [None]:
issuer = Issuer()
print(f'Issuer supplied nym_seed: {issuer.present_nym_seed()}')
print(f"Issuer generated disclosure: {generate_disclosure(['salt', issuer.present_nym_seed()])}")

Issuer supplied nym_seed: Rhmdin18qpaaB-9N_L8qCHk5ApAktAxTPZ_D_GwszuQ
Issuer generated disclosure: WyJzYWx0IiwgIlJobWRpbjE4cXBhYUItOU5fTDhxQ0hrNUFwQWt0QXhUUFpfRF9Hd3N6dVEiXQ


In [None]:
pseudonym_service = PseudonymService(nym_seed=issuer._nym_seed)

In [None]:
pseudonym_service.set_rp('google.com')
pseudonym_service.set_ps_context('')
print(f'Global pseudonym for google.com: \t\t\t\t{pseudonym_service.get_pseudonym()}')

pseudonym_service.set_rp('google.com')
pseudonym_service.set_ps_context('colab.research')
print(f'Pseudonym for colab.research.google.com: \t\t{pseudonym_service.get_pseudonym()}')

pseudonym_service.set_rp('google.com')
pseudonym_service.set_ps_context('abc123')
print(f'Pseudonym for google.com + context "abc123": \t{pseudonym_service.get_pseudonym()}')

pseudonym_service.set_rp('apple.com')
pseudonym_service.set_ps_context('')
print(f'Global pseudonym for apple.com: \t\t\t\t{pseudonym_service.get_pseudonym()}')

Global pseudonym for google.com: 				PZ7YF6RMQE7wn9gR35G0-raG0tkMLbAoFUWpXGFmuwg
Pseudonym for colab.research.google.com: 		xAI8dWqfXboUXm0h5Qzstdgxzv7I9V0e0SrkxJS-EX4
Pseudonym for google.com + context "abc123": 	reBT_UKhzw7j4WHlBz-rGTIkOxPwM4lxNxCLEuPj-uU
Global pseudonym for apple.com: 				GOn_V2GTaXsGeS1QluIYWXenUt-C3yYHWtlzbFdAW-0


The pseudonym can be communicated in a pseudonym attestation that includes information about:

1. The pseudonym value
2. The corresponding PID
3. The `rp_identifier` and `ps_context` used as input to pseudonym generation
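
The three items above could be carried in a payload like the one sketched below. The field names (`pseudonym`, `pid_reference`, `rp_identifier`, `ps_context`), the opaque PID reference, and both helper functions are hypothetical; a real attestation would also be signed and follow a defined credential format, which this sketch leaves out. The pseudonym is computed with plain concatenation, matching `PseudonymService.get_pseudonym` above, and the check shows how a party holding `nym_seed` can recompute it from the attested inputs.

```python
import base64
import hashlib
import hmac as std_hmac  # aliased so it does not shadow cryptography's hmac module
import json


def directed_pseudonym(nym_seed: bytes, rp_identifier: str, ps_context: str) -> str:
    """HMAC-SHA256(key=nym_seed, data=rp_identifier || ps_context)."""
    data = rp_identifier.encode("utf-8") + ps_context.encode("utf-8")
    mac = std_hmac.new(nym_seed, data, hashlib.sha256)
    return base64.urlsafe_b64encode(mac.digest()).rstrip(b"=").decode("ascii")


def build_attestation(nym_seed: bytes, pid_reference: str,
                      rp_identifier: str, ps_context: str) -> dict:
    """Assemble an unsigned, illustrative attestation payload."""
    return {
        "pseudonym": directed_pseudonym(nym_seed, rp_identifier, ps_context),
        "pid_reference": pid_reference,  # hypothetical opaque link to the PID
        "rp_identifier": rp_identifier,
        "ps_context": ps_context,
    }


def verify_attestation(nym_seed: bytes, attestation: dict) -> bool:
    """Recompute the pseudonym from the attested inputs and compare."""
    expected = directed_pseudonym(
        nym_seed, attestation["rp_identifier"], attestation["ps_context"]
    )
    return std_hmac.compare_digest(expected, attestation["pseudonym"])


seed = bytes(32)  # fixed all-zero seed, for demonstration only
attestation = build_attestation(seed, "pid-123", "google.com", "abc123")
print(json.dumps(attestation, indent=2))
print(verify_attestation(seed, attestation))
```

Only a party that knows `nym_seed` can run this check, which is consistent with the recap's point that issuer-supplied seeds enable reidentification.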