# Exercise 1 - Signing Simply with RSA

This first exercise shows how to prove the validity of a credential using an RSA signature.  
Then we create a simple version of Selective Disclosure (SD) to hide some of the data.  
Finally we'll fix our errors in the simple SD implementation.

To understand these concepts in more detail, please refer to our blog post on [E-ID infrastructure](https://c4dt.epfl.ch/article/the-swiss-confederation-e-id-public-sandbox-trust-infrastructure-part-2/).

## Sections

1. Basic E-ID example using RSA cryptographic scheme
2. Selective Disclosure using hashing
3. Discussion: security of this scheme, and introduction to unlinkability
4. Coding exercise: protect the hashes

---

## 1. Basic E-ID example using RSA cryptographic scheme

In this first section you'll learn how to protect a credential with a signature.
The following elements are involved:

- **credential**: an object holding personal data
- **issuer**: a trusted entity who can sign a credential
- **holder**: the person described by the personal data
- **verifier**: an entity that wants to learn parts of the holder's personal data

### Definition: Verifiable Credentials
A verifiable credential, in its simplest form, is a signed string of data. An issuer issues a credential by signing a specific string of data and then sharing that string together with a cryptographic signature, which proves that the string was authorized/issued by this specific issuer.

In [None]:
// We start by creating a typical E-ID credential object that we will use throughout this exercise
const birthDate = new Date("1993-08-01T00:00:00")
const ID_DATA = {
    name: "Jack Sparrow",
    timeOfBirth: birthDate.getTime(),
    profession: "IT Manager"
}

### Issuer

The issuer has a public/private key pair.
We suppose that its public key is known to everybody through an appropriate Public Key Infrastructure (PKI).
If a holder wants to use their E-ID, they first need to get a verified credential from the issuer.
This verified credential is simply the credential + a signature from the issuer.

In [None]:
import * as crypto from 'crypto';

// For an issuer to be able to start issuing Verifiable Credentials, it first needs
// to have its own cryptographic key pair. 
// Issuers will sign the data using their private key. 
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
    modulusLength: 4096,
    publicKeyEncoding: {
      type: 'spki',
      format: 'pem'
    },
    privateKeyEncoding: {
      type: 'pkcs8',
      format: 'pem',
    }
});

// Create a signature over the hash of the data
const message = JSON.stringify(ID_DATA)
const signer = crypto.createSign('SHA256');
signer.update(message);
const signature = signer.sign(privateKey, 'base64');

console.log("The signed message is:", message);
console.log("\n The signature is:", signature);

### Issuer →  Holder

After the signature is created, the data is transferred to the holder along with the signature.

In this case, the holder can only do one thing with this data: share the whole data string along with the signature.
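
The hand-over from issuer to holder can be sketched as a small JSON envelope. This is a minimal sketch and not part of the original exercise: the `SignedCredential` type, its field names, the `issueCredential` helper, and the 2048-bit demo key are all assumptions made for illustration.

```typescript
import * as crypto from 'node:crypto';

// Hypothetical envelope type: the field names are illustrative, not a standard.
interface SignedCredential {
    message: string;    // the exact JSON string that was signed
    signature: string;  // base64-encoded RSA signature over `message`
}

// Demo key pair (2048 bits only to keep the example fast).
const demoKeys = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

// Issuer side: serialize the data, sign it, and wrap both parts together.
function issueCredential(data: object, key: crypto.KeyObject): SignedCredential {
    const message = JSON.stringify(data);
    const signature = crypto.createSign('SHA256').update(message).sign(key, 'base64');
    return { message, signature };
}

// What travels from the issuer to the holder is a single string.
const wire = JSON.stringify(
    issueCredential({ name: "Jack Sparrow", profession: "IT Manager" }, demoKeys.privateKey)
);

// Holder side: parse the envelope and check it against the issuer's public key.
const held: SignedCredential = JSON.parse(wire);
const holderCheck = crypto.createVerify('SHA256')
    .update(held.message)
    .verify(demoKeys.publicKey, held.signature, 'base64');
console.log("Holder received a valid credential:", holderCheck);
```

In this sketch the holder keeps `message` exactly as received. Re-serializing the parsed object could change the string and thereby invalidate the signature.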

### Holder →  Verifier

Once the data is sent from holder to verifier, the verifier can verify that information as follows:

In [None]:
// The verifier receives the "message" and the "signature".
// We suppose it has a copy of the issuer's public key using some kind of
// Public Key Infrastructure (PKI).

let verifier = crypto.createVerify('SHA256');
verifier.update(message)
console.log("The signature is correct:", verifier.verify(publicKey, signature, 'base64'))

### Exercises

Make sure that the signature verification fails in the following cases:

1. the message is different from the message used in the signing process
2. the public key is different from the issuer's public key

---

## 2. Selective Disclosure using hashing

What if the holder of the credential wants to share only their name and profession, but not their `timeOfBirth`?
The current implementation doesn't allow for that, so we need to change it.
The simplest solution is for the issuer to sign the hashes of all fields; the holder then sends these signed hashes to the verifier, together with the plaintext of only those fields they want to disclose.



### Issuer

In [None]:
// One way to implement selective disclosure is to hash every value, so the 
// holder of the credential can decide which values they want to share.

function hashValue(value: string): string {
  const hash = crypto.createHash('sha256');
  hash.update(value);
  return hash.digest('hex');
}

// The object to be signed only contains the hashes of the actual data of
// the credential.
const ID_DATA_HASHED = {
    name: hashValue(ID_DATA['name']),
    timeOfBirth: hashValue(ID_DATA['timeOfBirth'].toString()),
    profession: hashValue(ID_DATA['profession'])
}

// As before, the issuer creates a signature of the hash of the hashed fields.
const message = JSON.stringify(ID_DATA_HASHED);
const signer = crypto.createSign('SHA256');
signer.update(message);
const signature = signer.sign(privateKey, 'base64');

console.log(message);
console.log("----------------------------");
console.log(signature);

### Issuer → Holder

Now the issuer will send the original data, the hashed data, and the signature to the holder.

### Holder

If the holder wants to disclose their name and profession, but keep the time of birth private, they will do the following:

In [None]:
// The holder chooses which fields they want to disclose.
// This data is then sent to the verifier, together with the originally
// signed data.
const HOLDER_DISCLOSED_DATA = {
    name: ID_DATA['name'],
    profession: ID_DATA['profession']
}

### Holder → Verifier

Now the holder will send the following data to the verifier:

- `ID_DATA_HASHED`
- `signature`
- `HOLDER_DISCLOSED_DATA`

This means that the verifier only sees the hash of `timeOfBirth`, not its value.

### Verifier

The verifier now wants to make sure that the data they got from the holder is correct.

In [None]:
// First, the verifier has to check that the signature on the hashes is correct:
let verifier = crypto.createVerify('SHA256');
verifier.update(message)
console.log("Signature verification:", verifier.verify(publicKey, signature, 'base64'))

// Now, the verifier can compare the disclosed data with the hashed values in the credential.
// If they are equal, and the hash function is cryptographically secure, the verifier can be convinced that the data is correct.
const RETRIEVED_DATA = JSON.parse(message);
for (const [key, value] of Object.entries(HOLDER_DISCLOSED_DATA)){
    console.log("Verifying", key, ":", hashValue(value), "==", RETRIEVED_DATA[key], "?");
    if (hashValue(value) !== RETRIEVED_DATA[key]) {
        throw new Error(`Reconstructed data in key ${key} is not the same as the hashed counterpart`);
    }
}

// Since we've already verified the signature on the hashed message above,
// and have now verified that the hashes of the revealed values match the signed hashes,
// we can trust the revealed data.
console.log("Hashed values are equal, so the following is verified:", HOLDER_DISCLOSED_DATA);
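
The two checks above can also be combined into a single function that rejects the disclosed data as soon as the signature check fails. This is a minimal sketch: the `verifyPresentation` helper, the local `sha256Hex` function (assumed equivalent to `hashValue`), and the 2048-bit demo key are hypothetical names introduced for illustration, not part of the exercise.

```typescript
import * as crypto from 'node:crypto';

// Same idea as hashValue: hex-encoded SHA-256 of a string.
const sha256Hex = (value: string): string =>
    crypto.createHash('sha256').update(value).digest('hex');

// Hypothetical helper: returns true only if the signature over `signedHashes`
// is valid AND every disclosed field hashes to the signed value.
function verifyPresentation(
    signedHashes: string,
    signatureB64: string,
    disclosed: Record<string, string>,
    issuerKey: crypto.KeyObject,
): boolean {
    const signatureOk = crypto.createVerify('SHA256')
        .update(signedHashes)
        .verify(issuerKey, signatureB64, 'base64');
    if (!signatureOk) return false; // hashes with a bad signature prove nothing

    const hashes: Record<string, string> = JSON.parse(signedHashes);
    return Object.entries(disclosed).every(
        ([key, value]) => hashes[key] === sha256Hex(value),
    );
}

// Demo data: the issuer signs the hashes of two fields.
const presKeys = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
const presHashes = JSON.stringify({
    name: sha256Hex("Jack Sparrow"),
    profession: sha256Hex("IT Manager"),
});
const presSignature = crypto.createSign('SHA256')
    .update(presHashes)
    .sign(presKeys.privateKey, 'base64');

const acceptsGenuine = verifyPresentation(presHashes, presSignature, { name: "Jack Sparrow" }, presKeys.publicKey);
const rejectsForged = verifyPresentation(presHashes, presSignature, { name: "Will Turner" }, presKeys.publicKey);
console.log("Genuine disclosure accepted:", acceptsGenuine);
console.log("Forged disclosure accepted:", rejectsForged);
```

Returning a boolean instead of only printing the signature result makes it harder to accidentally continue with unverified hashes.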

### Exercises

1. Print the hashes using `hashValue(value)` and compare each output visually to `RETRIEVED_DATA[key]`
2. Change the disclosed fields and make sure it still runs

---

## 3. Discussion: security of this scheme, and introduction to unlinkability

### Security of hashed values

To keep some anonymity, the fields which the holder decides not to disclose are still shared with the verifier, but only in hashed form.
How secure is this?
For example, if the holder selectively discloses the `profession`, what can the verifier do with the other fields?

### Unlinkability

One of the big problems in present-day advertising is that even if you visit different websites, the advertising industry correlates these visits into a single user profile.
This allows data brokers to sell your profile not only for ads, but also for influence campaigns and geo-tracking.
Ad companies are not the only ones involved: data brokers also sell these profiles to governments, or even to private individuals!
For this reason, multiple presentations of a credential should be **unlinkable**: verifiers should not be able to tell that they come from the same holder.

Does the current scheme guarantee unlinkability of the holder towards different verifiers?

---

## 4. Coding exercise: protect the hashes

Right now, a verifier can recover the `timeOfBirth` from its hash by hashing every possible date and comparing the result, which reveals someone's exact date of birth.

1. Create a way to hack the `timeOfBirth`. How can this be made faster?
2. How would you hack the `name`, or `profession`?
3. Reimplement the communication between the holder and the verifier, modifying the hash function so that the undisclosed fields can no longer be guessed easily
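
As a possible starting point for the first task, the date-guessing loop could look roughly like the sketch below. It assumes that `timeOfBirth` is always a local midnight timestamp, as in `ID_DATA`; the `guessTimeOfBirth` and `sha256Hex` names and the chosen year range are hypothetical and not part of the exercise.

```typescript
import * as crypto from 'node:crypto';

// Same idea as hashValue: hex-encoded SHA-256 of a string.
const sha256Hex = (value: string): string =>
    crypto.createHash('sha256').update(value).digest('hex');

// Hypothetical attacker helper: try every local midnight from fromYear to toYear
// and return the date whose timestamp hashes to `targetHash`, or null if none does.
function guessTimeOfBirth(targetHash: string, fromYear: number, toYear: number): Date | null {
    let day = 0;
    // The Date constructor normalizes day overflow, so `1 + day` walks the calendar.
    let candidate = new Date(fromYear, 0, 1);
    while (candidate.getFullYear() <= toYear) {
        if (sha256Hex(candidate.getTime().toString()) === targetHash) {
            return candidate;
        }
        day++;
        candidate = new Date(fromYear, 0, 1 + day);
    }
    return null;
}

// The verifier only sees this hash, as in ID_DATA_HASHED.
const leakedHash = sha256Hex(new Date("1993-08-01T00:00:00").getTime().toString());

const recovered = guessTimeOfBirth(leakedHash, 1900, 2025);
console.log("Recovered date of birth:", recovered ? recovered.toDateString() : "not found");
```

The search space is only a few tens of thousands of candidates, which is why an unprotected hash offers little protection for a field like this.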