# Homomorphic Facial Recognition with TenSEAL

This notebook demonstrates how to perform facial recognition while preserving privacy using Homomorphic Encryption (HE). The key privacy benefit is that the server can run the facial comparison on encrypted data without ever having access to the plaintext facial embeddings.

## How Homomorphic Encryption Protects Privacy

1. **Traditional Approach (Privacy Risk)**:
   - Facial embeddings are sent to a server in plain text
   - Server can see and store the actual facial features
   - Server can potentially misuse or leak the sensitive biometric data

2. **Homomorphic Approach (Privacy-Preserving)**:
   - Facial embeddings are encrypted before being sent to the server
   - Server can perform computations on encrypted data without seeing the actual values
   - Only the final result (here, the squared distance that decides match/no match) is decrypted by the client
   - Server cannot access or misuse the sensitive biometric data
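
To make "computing on encrypted data" concrete, here is a minimal sketch of the homomorphic property itself. It deliberately uses a toy Paillier cryptosystem rather than CKKS, because Paillier fits in a few lines of standard-library Python; the tiny primes and the helper names (`encrypt`, `decrypt`, `add_encrypted`) are illustrative assumptions and are not part of TenSEAL or of this notebook's pipeline.

```python
import math
import secrets

# Toy Paillier key pair. The primes are far too small for real use; they are
# chosen only so the arithmetic is easy to follow.
P, Q = 1009, 1013
N = P * Q                                          # public modulus
N_SQ = N * N
G = N + 1                                          # standard generator choice
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1), kept secret
MU = pow(LAM, -1, N)                               # secret; valid because G = N + 1


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N using only the public values N and G."""
    while True:
        r = secrets.randbelow(N - 1) + 1           # random r in [1, N - 1]
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Decrypt using the secret values LAM and MU."""
    return ((pow(c, LAM, N_SQ) - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Server-side step: multiplying ciphertexts adds the hidden plaintexts."""
    return (c1 * c2) % N_SQ


c_sum = add_encrypted(encrypt(17), encrypt(25))    # the server never sees 17 or 25
print(decrypt(c_sum))                              # 42
```

Paillier supports only addition on encrypted integers. CKKS, the scheme used in the rest of this notebook, additionally supports multiplication and works on approximate real numbers, which is what a distance computation needs.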

## System Architecture

1. **Client Side**:
   - Generates encryption keys
   - Performs facial embedding extraction
   - Encrypts the embeddings
   - Decrypts final results

2. **Server Side**:
   - Receives encrypted embeddings
   - Performs distance calculations on encrypted data
   - Returns encrypted results
   - Never sees actual facial features

# Requirements

We'll use the following libraries:
- `tenseal`: For homomorphic encryption operations
- `deepface`: For facial recognition and embedding extraction

pip install tenseal; pip install deepface

In [1]:
import base64
import tenseal as ts
import numpy as np
from deepface import DeepFace
from numpy.linalg import norm

ValueError: You have tensorflow 2.19.0 and this requires tf-keras package. Please run `pip install tf-keras` or downgrade your tensorflow.

# Finding embeddings (Client Side)
We first compute vector representations (embeddings) of the facial images. This is done on the client side so that the raw facial data never leaves the user's device.

With the `Facenet` model used here, each embedding is a 128-dimensional vector that represents the distinguishing features of a face. These embeddings are what we encrypt before sending anything to the server.


In [None]:
img1_path = "data/z1.jpg"
img2_path = "data/th1.jpg"
img1_embedding = DeepFace.represent(img1_path, model_name='Facenet')
img2_embedding = DeepFace.represent(img2_path, model_name='Facenet')

# Utility Functions
Utility functions for handling encrypted data:
- `write_data`: Saves encrypted data to files (simulating cloud storage)
- `read_data`: Reads encrypted data from files

In [21]:
def write_data(file_name, file_content):
    # Serialized TenSEAL objects are bytes; store them base64-encoded
    if isinstance(file_content, bytes):
        file_content = base64.b64encode(file_content)
    with open(file_name, 'wb') as f:
        f.write(file_content)

def read_data(file_name):
    with open(file_name, "rb") as f:
        file_content = f.read()
    # Decode the base64 text back to the original bytes
    return base64.b64decode(file_content)
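
As a quick sanity check of these helpers, the sketch below round-trips an arbitrary byte string through a temporary file. The helpers are repeated so the snippet runs on its own; the payload and the file name `payload.txt` are placeholders standing in for a serialized context or ciphertext.

```python
import base64
import os
import tempfile


def write_data(file_name, file_content):
    # Same behaviour as the helper above: bytes are stored base64-encoded.
    if isinstance(file_content, bytes):
        file_content = base64.b64encode(file_content)
    with open(file_name, "wb") as f:
        f.write(file_content)


def read_data(file_name):
    with open(file_name, "rb") as f:
        return base64.b64decode(f.read())


payload = bytes(range(256))  # placeholder for serialized TenSEAL bytes

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "payload.txt")
    write_data(path, payload)
    with open(path, "rb") as f:
        on_disk = f.read()   # what is actually stored in the file
    restored = read_data(path)

print(restored == payload)   # True: the round trip is lossless
print(on_disk.isascii())     # True: the file holds base64 text, not raw bytes
```

Base64 makes the files safe to move through text-only channels at the cost of roughly one third more storage.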

# Initialization (Client Side)
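Here we set up the homomorphic encryption context with the CKKS scheme, which supports approximate arithmetic on encrypted real numbers. The parameters trade off security, precision, and performance:

- `poly_modulus_degree = 8192`: The degree of the polynomial ring. It fixes how many values one ciphertext can pack (8192 / 2 = 4096 slots, more than enough for a 128-dimensional embedding) and, together with the coefficient modulus, the security level. Larger values are more secure but slower.
- `coeff_mod_bit_sizes = [60, 40, 40, 60]`: The bit sizes of the primes in the coefficient modulus chain. The 40-bit middle primes match the `2**40` global scale set below and determine how many rescaling steps (roughly, consecutive multiplications) are available; the larger 60-bit outer primes leave headroom for the decrypted result and for key switching.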
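
The arithmetic behind these two parameters can be checked in a few lines of plain Python. This is a sketch under stated assumptions: the bit limits in `MAX_TOTAL_BITS_128` are the 128-bit security bounds from Microsoft SEAL's default parameter tables (the library TenSEAL builds on), the scale of 40 bits mirrors the `global_scale = 2**40` set in the cell that follows, and the variable names are illustrative only.

```python
poly_modulus_degree = 8192
coeff_mod_bit_sizes = [60, 40, 40, 60]
global_scale_bits = 40  # mirrors global_scale = 2**40

# Assumed upper bounds on the total coefficient modulus size (in bits) for
# 128-bit security, taken from Microsoft SEAL's default tables.
MAX_TOTAL_BITS_128 = {4096: 109, 8192: 218, 16384: 438}

slots = poly_modulus_degree // 2           # values packed into one CKKS ciphertext
total_bits = sum(coeff_mod_bit_sizes)      # size of the full modulus chain
mult_depth = len(coeff_mod_bit_sizes) - 2  # middle primes = available rescalings

print(slots)                                                      # 4096
print(total_bits, total_bits <= MAX_TOTAL_BITS_128[poly_modulus_degree])  # 200 True
print(mult_depth)                                                 # 2
print(all(b == global_scale_bits for b in coeff_mod_bit_sizes[1:-1]))     # True
```

The squared-distance computation in this notebook needs only one level of multiplication, so a depth of 2 leaves some margin.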

In [22]:
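# Create a CKKS context; at this point it holds both the secret and public keys
context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192, coeff_mod_bit_sizes=[60, 40, 40, 60])
context.generate_galois_keys()  # rotation keys, needed for the encrypted dot product
context.global_scale = 2**40

# Serialize the full context, including the secret key (stays on the client)
secret_context = context.serialize(save_secret_key=True)
write_data("secret.txt", file_content=secret_context)

# Drop the secret key and serialize the public context (shared with the server)
context.make_context_public()
public_context = context.serialize()
write_data(file_name="public.txt", file_content=public_context)

del context, secret_context, public_context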

# Encryption
In this step, we encrypt the facial embeddings before they are sent to the server. This is crucial for privacy because:
1. The server never sees the actual facial features
2. Without the secret key, recovering an embedding from its ciphertext is computationally infeasible under the scheme's security assumptions
3. Even if the server is compromised, the attacker obtains only ciphertexts and the public context, not the biometric data

In [23]:
context = ts.context_from(read_data("secret.txt"))
enc_v1 = ts.ckks_vector(context, img1_embedding[0]['embedding'])
enc_v2 = ts.ckks_vector(context, img2_embedding[0]['embedding'])
enc_v1_proto = enc_v1.serialize()
enc_v2_proto = enc_v2.serialize()
write_data("enc_v1.txt", enc_v1_proto)
write_data("enc_v2.txt", enc_v2_proto)
del context, enc_v1, enc_v2, enc_v1_proto, enc_v2_proto

# Calculations (Server side)
This section demonstrates the power of homomorphic encryption. The server can:
1. Load the encrypted embeddings
2. Compute the squared Euclidean distance between them
3. Do all of this without ever seeing the actual facial features

The server only has access to the public key, so it cannot decrypt the data. This ensures the privacy of the biometric data even during computation.
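
The quantity the server computes is the squared Euclidean distance, written as the dot product of the difference vector with itself. This form uses only subtraction, multiplication, and summation, which CKKS supports directly; a square root is not a native CKKS operation, so it is left out of the encrypted computation. The plaintext sketch below shows the same arithmetic on two short made-up vectors; `squared_euclidean` is a hypothetical helper, not a TenSEAL function.

```python
def squared_euclidean(v1, v2):
    """Squared Euclidean distance as dot(diff, diff), with diff = v1 - v2."""
    diff = [a - b for a, b in zip(v1, v2)]
    return sum(d * d for d in diff)


# Tiny stand-ins for the 128-dimensional embeddings.
v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 0.0, 3.0]

print(squared_euclidean(v1, v2))  # 5.0, from (-1)**2 + 2**2 + 0**2
```

Because the square root is monotonic, comparing the squared distance against a squared threshold gives the same decision as comparing the distance against the threshold.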


In [24]:
context = ts.context_from(read_data("public.txt"))
enc_v1_proto = read_data("enc_v1.txt")
enc_v2_proto = read_data("enc_v2.txt")
enc_v1 = ts.lazy_ckks_vector_from(enc_v1_proto)
enc_v1.link_context(context)

enc_v2 = ts.lazy_ckks_vector_from(enc_v2_proto)
enc_v2.link_context(context)
# Squared Euclidean distance: dot product of the difference vector with itself
diff = enc_v1 - enc_v2
euclidean_squared = diff.dot(diff)
write_data("euclidean_squared.txt", euclidean_squared.serialize())

In [25]:
# The server must not be able to decrypt the encrypted squared distance at this stage,
# because the public context has no secret key. Verify this: decrypt() should raise an exception.

try:
    euclidean_squared.decrypt()
except Exception as err:
    print("Exception: ", str(err))

Exception:  the current context of the tensor doesn't hold a secret_key, please provide one as argument


In [26]:
del context, enc_v1_proto, enc_v2_proto, enc_v1, enc_v2, euclidean_squared

# Decryption (Client side)

Only the client can decrypt the results because only they possess the secret key. This means:
1. The server cannot see whether faces match or not
2. The final result is only revealed to the authorized client
3. The entire process maintains privacy while still providing accurate results



In [27]:
context = ts.context_from(read_data("secret.txt"))
euclidean_squared_proto = read_data("euclidean_squared.txt")
euclidean_squared = ts.lazy_ckks_vector_from(euclidean_squared_proto)
euclidean_squared.link_context(context)
euclidean_squared_plain = euclidean_squared.decrypt()[0]

# Squared-distance threshold: 100 corresponds to a Euclidean distance of 10
if euclidean_squared_plain < 100:
    print("they are the same person")
else:
    print("they are different persons")

they are different persons
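
The decision rule above can be isolated into a small function, which makes the link between the squared threshold and the underlying distance threshold explicit. This is a minimal sketch: `is_same_person` and `SQUARED_THRESHOLD` are illustrative names, and the value 100 is simply the one used in the cell above.

```python
import math

SQUARED_THRESHOLD = 100.0  # threshold applied to the squared distance above


def is_same_person(squared_distance, squared_threshold=SQUARED_THRESHOLD):
    """Return True when the squared distance falls below the squared threshold."""
    return squared_distance < squared_threshold


print(math.sqrt(SQUARED_THRESHOLD))        # 10.0, the equivalent distance threshold
print(is_same_person(276.97514287660164))  # False, the decrypted value from this run
print(is_same_person(64.0))                # True, a distance of 8 is below 10
```

Keeping the comparison on the client means the server learns neither the distance nor the match decision.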


# Validation
This section compares the traditional (unencrypted) and homomorphic (encrypted) computations to check that:
1. The homomorphic computation gives nearly the same result as the traditional one
2. The small remaining difference comes from the approximate arithmetic inherent to CKKS
3. The privacy benefits come without changing the match decision

The difference should be negligible relative to the distance itself. In the run recorded below, the two values differ by about 4e-5 on a squared distance of about 277 (a relative error of roughly 1.4e-7), so the strict absolute check `< 0.00001` in the last cell evaluates to `False` even though the results agree to seven significant digits and both fall on the same side of the threshold.

In [28]:
img1_emb = np.array(img1_embedding[0]["embedding"])
img2_emb = np.array(img2_embedding[0]["embedding"])
distance = norm(img1_emb - img2_emb )

In [29]:
print("euclidean squared - traditional: ", distance*distance)
print("euclidean squared - homomorphic: ", euclidean_squared_plain)

euclidean squared - traditional:  276.97510330867624
euclidean squared - homomorphic:  276.97514287660164


In [30]:
#check the difference is acceptable
abs(distance * distance - euclidean_squared_plain) < 0.00001

np.False_
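
The check above fails because it applies an absolute tolerance of 1e-5 to values near 277. A relative tolerance is usually a better fit for CKKS, whose error scales with the magnitude of the result. The sketch below reuses the two values printed above; the relative tolerance of 1e-6 is an assumption chosen for illustration, not a recommended setting.

```python
import math

# Values copied from the output printed above.
traditional = 276.97510330867624
homomorphic = 276.97514287660164

abs_err = abs(traditional - homomorphic)
rel_err = abs_err / abs(traditional)

print(abs_err < 1e-5)                                        # False: about 4e-5
print(rel_err < 1e-6)                                        # True: about 1.4e-7
print(math.isclose(traditional, homomorphic, rel_tol=1e-6))  # True
```

Both values are far above the squared threshold of 100, so the approximation error does not affect the match decision in this run.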