# **Getting Started**

We first install the required concrete-ml packages for this Colab instance.

The packages must be reinstalled each time the notebook is opened, because Colab deletes the runtime after roughly 90 minutes of inactivity. To avoid this, you can store the packages in your Google Drive and import them from there.
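One way to reuse packages across sessions is to install them once into a Drive folder and add that folder to `sys.path` on later runs. The sketch below is only an illustration under assumptions: the folder `/content/drive/MyDrive/colab_packages` is a hypothetical name, Drive is assumed to be mounted already, and the packages are assumed to have been installed there with `pip install --target`. Compiled wheels may stop working if the Python version of the Colab runtime changes, so reinstalling remains the safe fallback.

```python
import os
import sys

# Hypothetical Drive folder; it would be populated once with something like:
#   pip install --target /content/drive/MyDrive/colab_packages concrete-ml
DRIVE_PACKAGES = "/content/drive/MyDrive/colab_packages"


def use_cached_packages(path=DRIVE_PACKAGES):
    """Put `path` at the front of sys.path if it exists; return True on success."""
    if not os.path.isdir(path):
        return False
    if path not in sys.path:
        sys.path.insert(0, path)
    return True


if not use_cached_packages():
    print("No cached packages found; run the pip install cell instead.")
```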

In [None]:
#reinstall packages (required unless the packages are stored in your google drive)
!pip install -U pip wheel setuptools
!pip install concrete-ml

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting setuptools
  Downloading setuptools-68.0.0-py3-none-any.whl (804 kB)
Installing collected packages: setuptools
  Attempting uninstall: setuptools
    Found existing installation: setuptools 67.7.2
    Uninstalling setuptools-67.7.2:
      Successfully uninstalled setuptools-67.7.2
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ipython 7.34.0 requires jedi>=0.16, which is not installed.
Successfully installed setuptools-68.0.0

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting concrete-ml
  Downloading concrete_ml-1.0.3-py3-none-any.whl (178 kB)
Collecting boto3<2.0.0,>=1.23.5 (from concrete-ml)
  Downloading boto3-1.26.156-py3-none-any.whl (135 kB)
Collecting brevitas==0.8.0 (from concrete-ml)
  Downloading brevitas-0.8.0-py3-none-any.whl (357 kB)
Collecting concrete-python==1.0.0 (from concrete-ml)
  Downloading concrete_python-1.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (61.9 MB)

# **Step 1: Download required files and compiled model, and instantiate on-disk network**

In [None]:
import os
import stat
import subprocess
import time
from shutil import copyfile
from tempfile import TemporaryDirectory

import numpy
import requests
from pandas import read_csv
from concrete.ml.deployment import FHEModelClient, FHEModelDev, FHEModelServer

def getRequiredFiles(suppress = False):
    files = [
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/ClientDownloads/dashing_s512",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/Compiled%20Model/client.zip",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/ClientDownloads/selected_features.txt",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/ClientDownloads/dashingShell512.sh",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/AlternativeDashingDownloads/dashingShell128.sh",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/AlternativeDashingDownloads/dashingShell256.sh",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/ClientDownloads/readHLLandWrite512.sh",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/AlternativeDashingDownloads/readHLLandWrite128.sh",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/AlternativeDashingDownloads/readHLLandWrite256.sh",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/Compiled%20Model/client.zip",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/Compiled%20Model/server.zip",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/ClientDownloads/selected_features.txt",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/AlternativeDashingDownloads/dashing_s128",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/AlternativeDashingDownloads/dashing_s256",
        #r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/ClientDownloads/dashing_s512",
        #r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/fastas/b.1.1.529_EPI_ISL_17717676.fasta",
        #r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/fastas/b.1.617.2_EPI_ISL_17727797.fasta",
        #r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/fastas/b.1.621_gisaid_hcov-19_2023_06_14_08.fasta",
        r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/fastas/c.37_gisaid_hcov-19_2023_06_07_02.fasta",
        #r"https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main/client/AlternativeDashingDownloads/dashing_s128",
        ]
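    if not suppress:
        for file in files:
            # Last URL segment is the file name; decode '%20' in case it contains a space
            name = file.split("/")[-1].replace("%20", " ")
            print(name)
            # Only download files that are not already in /content
            if name not in os.listdir("/content"):
                download(file, "/content")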

def download(url, dest_folder):
    if not os.path.exists(dest_folder):
        os.makedirs(dest_folder)

    filename = url.split('/')[-1].replace(" ", "_")
    file_path = os.path.join(dest_folder, filename)

    r = requests.get(url, stream=True)

    if r.ok:
        print("saving to", os.path.abspath(file_path))
        with open(file_path, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024 * 8):
                if chunk:
                    f.write(chunk)
                    f.flush()
                    os.fsync(f.fileno())
    else:  # HTTP status code 4XX/5XX
        print("Download failed: status code {}\n{}".format(r.status_code, r.text))

def get_size(file_path, unit='bytes'):
    """Return the size of a file in the given unit, rounded to 3 decimals."""
    file_size = os.path.getsize(file_path)
    exponents_map = {'bytes': 0, 'kb': 1, 'mb': 2, 'gb': 3}
    if unit not in exponents_map:
        raise ValueError("Must select from ['bytes', 'kb', 'mb', 'gb']")
    size = file_size / 1024 ** exponents_map[unit]
    return round(size, 3)

getRequiredFiles()

dashing_s512
client.zip
selected_features.txt
dashingShell512.sh
saving to /content/dashingShell512.sh
dashingShell128.sh
saving to /content/dashingShell128.sh
dashingShell256.sh
saving to /content/dashingShell256.sh
readHLLandWrite512.sh
readHLLandWrite128.sh
readHLLandWrite256.sh
client.zip
server.zip
selected_features.txt
dashing_s128
dashing_s256
c.37_gisaid_hcov-19_2023_06_07_02.fasta
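The URL list contains a few repeated entries (`client.zip` and `selected_features.txt` each appear twice), which is harmless because files already present in `/content` are skipped. As a minimal sketch of an alternative, the standard library can deduplicate the list and decode the file names; the helper names `local_name` and `unique_urls` are made up for this illustration and are not part of the notebook.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def local_name(url):
    """Return the decoded last path segment of a URL ('%20' becomes a space)."""
    return unquote(posixpath.basename(urlsplit(url).path))


def unique_urls(urls):
    """Drop repeated URLs while keeping their original order."""
    return list(dict.fromkeys(urls))


base = "https://raw.githubusercontent.com/bjorgkav/concreteml-covid-classifier/main"
urls = [
    base + "/Compiled%20Model/client.zip",
    base + "/Compiled%20Model/server.zip",
    base + "/Compiled%20Model/client.zip",  # repeated entry, as in the notebook's list
]
for url in unique_urls(urls):
    print(local_name(url))
```

Unlike a plain `.replace("%20", " ")`, `unquote` decodes every percent-encoded character, not only spaces.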


For demonstration purposes, an on-disk network will be simulated to serve as our way to communicate with the server. The code for the on-disk network was taken from Concrete-ML's [sample ClientServer notebook](https://github.com/zama-ai/concrete-ml/blob/release/1.0.x/docs/advanced_examples/ClientServer.ipynb).

In [None]:
class OnDiskNetwork:
    """Simulate a network on disk."""

    def __init__(self):
        # Create 3 temporary folders for server, client and dev with tempfile
        self.server_dir = TemporaryDirectory()  # pylint: disable=consider-using-with
        self.client_dir = TemporaryDirectory()  # pylint: disable=consider-using-with
        self.dev_dir = TemporaryDirectory()  # pylint: disable=consider-using-with
        print("On-disk network initialized!\n")

    def client_send_evaluation_key_to_server(self, serialized_evaluation_keys):
        """Send the public key to the server."""
        with open(self.server_dir.name + "/serialized_evaluation_keys.ekl", "wb") as f:
            f.write(serialized_evaluation_keys)

        print("Evaluation keys sent to server.\n")

    def client_send_input_to_server_for_prediction(self, encrypted_input):
        """Send the input to the server and execute on the server in FHE."""
        with open(self.server_dir.name + "/serialized_evaluation_keys.ekl", "rb") as f:
            serialized_evaluation_keys = f.read()
        time_begin = time.time()
        encrypted_prediction = FHEModelServer(self.server_dir.name).run(
            encrypted_input, serialized_evaluation_keys
        )
        time_end = time.time()
        with open(self.server_dir.name + "/encrypted_prediction.enc", "wb") as f:
            f.write(encrypted_prediction)
        return time_end - time_begin

    def dev_send_model_to_server(self):
        """Send the model to the server."""
        copyfile(self.dev_dir.name + "/server.zip", self.server_dir.name + "/server.zip")

        print("Model sent to server.\n")

    def server_send_encrypted_prediction_to_client(self):
        """Send the encrypted prediction to the client."""
        with open(self.server_dir.name + "/encrypted_prediction.enc", "rb") as f:
            encrypted_prediction = f.read()
        return encrypted_prediction

    def dev_send_clientspecs_to_client(self):
        """Send the client specs to the client."""
        copyfile(self.dev_dir.name + "/client.zip", self.client_dir.name + "/client.zip")

        print("Successfully sent client specs to client.\n")

    def cleanup(self):
        """Clean up the temporary folders."""
        self.server_dir.cleanup()
        self.client_dir.cleanup()
        self.dev_dir.cleanup()

    def move_client_server_specs_to_network(self):
        copyfile("/content/server.zip" , self.dev_dir.name + "/server.zip")
        copyfile("/content/client.zip" , self.dev_dir.name + "/client.zip")
        print("Compiled model and client specs copied to on-disk network.\n")

# Let's instantiate the network and move our downloaded demo files into it
network = OnDiskNetwork()
network.move_client_server_specs_to_network()

# List files inside the temporary development directory
!ls -lh $network.dev_dir.name

On-disk network initialized!

Compiled model and client specs copied to on-disk network.

total 12K
-rw-r--r-- 1 root root 2.6K Jun 20 06:58 client.zip
-rw-r--r-- 1 root root 5.5K Jun 20 06:58 server.zip


Because this guide covers the overall production workflow of the system, we skip the training phase and use the compiled model downloaded from the GitHub repository.

We now move the compiled model onto the server, where it can be used for classification.

In [None]:
# Let's send the model to the server
network.dev_send_model_to_server()
!ls -lh $network.server_dir.name

Model sent to server.

total 8.0K
-rw-r--r-- 1 root root 5.5K Jun 20 06:58 server.zip


Now that we have our compiled model on our server, we only need to send the client specifications to our simulated client to proceed with the next step.

In [None]:
# Let's send the client specs to the client
network.dev_send_clientspecs_to_client()
!ls -lh $network.client_dir.name

Successfully sent client specs to client.

total 4.0K
-rw-r--r-- 1 root root 2.6K Jun 20 06:58 client.zip


# **Step 2: Generate the keys on the client machine**

In [None]:
# Let's create the client and load the model
fhemodel_client = FHEModelClient(network.client_dir.name, key_dir=network.client_dir.name)

# The client first needs to create the private and evaluation keys.
fhemodel_client.generate_private_and_evaluation_keys()

# Get the serialized evaluation keys
serialized_evaluation_keys = fhemodel_client.get_serialized_evaluation_keys()

print("Private and evaluation keys generated.\n")

Private and evaluation keys generated.



In [None]:
# Evaluation keys can be quite large files but only have to be shared once with the server.
# Let's send this evaluation key to the server (this has to be done only once)
network.client_send_evaluation_key_to_server(serialized_evaluation_keys)
!ls -lh $network.server_dir.name

Evaluation keys sent to server.

total 12K
-rw-r--r-- 1 root root   24 Jun 20 06:09 serialized_evaluation_keys.ekl
-rw-r--r-- 1 root root 5.5K Jun 20 06:09 server.zip
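The `ls -lh` listing is convenient for reading sizes by eye. To check a size in code, for example before uploading evaluation keys over a real network, a helper along the lines of the notebook's `get_size` can be used. The sketch below is a standalone variant, named `file_size` here only to avoid clashing with the notebook's function, and it is demonstrated on a throwaway file rather than on real keys.

```python
import os
import tempfile


def file_size(file_path, unit="bytes"):
    """Return the size of a file in `unit`, rounded to 3 decimals."""
    exponents = {"bytes": 0, "kb": 1, "mb": 2, "gb": 3}
    if unit not in exponents:
        raise ValueError(f"unit must be one of {list(exponents)}")
    return round(os.path.getsize(file_path) / 1024 ** exponents[unit], 3)


# Demonstration on a temporary 2048-byte file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.bin")
    with open(path, "wb") as f:
        f.write(b"\x00" * 2048)
    print(file_size(path), file_size(path, "kb"))
```

In this notebook, the path to pass would be `network.server_dir.name + "/serialized_evaluation_keys.ekl"`.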


# **Step 3: Prepare data for encryption**

The next step is for the client to prepare the data for encryption. Since we're working with DNA sequences in FASTA files, we still need to convert these sequences into features that we can use with our logistic regression model.

For this, we use the [Dashing](https://github.com/dnbaker/dashing) tool and the shell scripts we downloaded to convert the sequences and then create a CSV file that can be used in our workflow.

In practice, this whole step is abstracted away by our client-side application. The system's default Dashing binary is *dashing_s512*. If the Dashing shell script fails with it, one of the other binaries will most likely work. In this notebook, for instance, *dashing_s256* was used because the Colab machine does not support the AVX512BW instructions that the s512 build requires.
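Instead of discovering an unsupported binary by trial and error, the choice could be made up front from the CPU feature flags. The following is a rough sketch, not part of the client application: it assumes a Linux host that exposes a `flags` line in `/proc/cpuinfo`, and it assumes the flag-to-binary mapping implied by the error messages in this notebook (AVX512BW for s512, AVX2 for s256, SSE2 for s128). The function names are hypothetical.

```python
def pick_dashing_binary(cpu_flags):
    """Return the widest Dashing build the given CPU flags allow, or None."""
    requirements = [
        ("dashing_s512", "avx512bw"),
        ("dashing_s256", "avx2"),
        ("dashing_s128", "sse2"),
    ]
    for binary, flag in requirements:
        if flag in cpu_flags:
            return binary
    return None


def read_cpu_flags(cpuinfo_path="/proc/cpuinfo"):
    """Read the CPU feature flags on Linux; return an empty set if unavailable."""
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


print(pick_dashing_binary(read_cpu_flags()))
```

Running the shell script and catching the failure, as the notebook does, has the advantage of also working on systems where the flags cannot be read.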

In [None]:
def readTruncateSequence(fasta_fpath):
    """Read a single-record FASTA file and return (header line, sequence from position 20000 onward, accession ID)."""
    truncated_seq = ""

    with open(fasta_fpath, "r") as f:
        for line in f:
            if ">" not in line:
                # Sequence line: strip the newline and append it
                truncated_seq += line.strip()
            else:
                # Header line: keep it and take the accession ID from its second '|'-separated field
                first_line = line
                seq_id = line.split("|")[1].strip().replace('EPI_ISL_', '')

    # Keep only the part of the sequence from position 20000 onward
    decoded_truncated_seq = truncated_seq[20000:]

    return first_line, decoded_truncated_seq, seq_id

def writeFasta(id, first_line, sequence):
        """Writes a .fasta file in the 'fastas' folder named after the fasta's ID and containing the truncated sequence."""
        fasta_folder = os.path.join("/content", f"fastas")
        if not os.path.exists(fasta_folder):
            os.mkdir(fasta_folder)

        with open(os.path.join(fasta_folder, f"{id}.fasta"), "w") as output_file:
            output_file.write(first_line)
            output_file.write(sequence)

def useDashing():
    """Give the Dashing binaries and shell scripts execution permissions, then run the first variant that works."""
    # (variant suffix, instruction set the binary requires), widest first
    variants = [("512", "AVX512BW"), ("256", "AVX2"), ("128", "SSE2")]

    # Grant execution permissions to every binary and script
    for size, _ in variants:
        for f in (f"dashingShell{size}.sh", f"dashing_s{size}", f"readHLLandWrite{size}.sh"):
            st = os.stat(f)
            os.chmod(f, st.st_mode | stat.S_IEXEC)

    print("Permissions added.")

    # Try each variant in turn; check_output raises CalledProcessError on a non-zero exit code
    for i, (size, instruction_set) in enumerate(variants):
        try:
            subprocess.check_output(['sh', f'dashingShell{size}.sh'])
            print("Dashing Completed!")
            return
        except subprocess.CalledProcessError:
            label = "default dashing_s512" if size == "512" else f"dashing_s{size}"
            print(f"Error running {label}: OS must support {instruction_set} instructions.")
            if i + 1 < len(variants):
                print(f"Trying dashing_s{variants[i + 1][0]}...")

    print("Error running all dashing binaries.")

def dropColumns(dashing_output, file = os.path.join("/content", "selected_features.txt")):
        with open(file, "r") as feature_file:
            features = [feature.strip() for feature in feature_file.readlines()]
        #print("Selected features:", features)

        feature_list = ["Accession ID"] + features

        drop_df = read_csv(dashing_output)
        drop_df = drop_df[[column.strip() for column in feature_list]]
        drop_df.to_csv("./output.csv", index=False, header=True)

required_folder_names = ["fastas", "keys", "predictions"]

getRequiredFiles(suppress = True)

#create required folders
for name in required_folder_names:
    os.makedirs(os.path.join("/content", name), exist_ok=True)

#clear out fastas left over from a previous run
fastas_dir = os.path.join("/content", "fastas")
for f in os.listdir(fastas_dir):
    os.remove(os.path.join(fastas_dir, f))

#truncate every downloaded .fasta file and write the result into the fastas folder
for filename in os.listdir("/content/"):
    if filename.endswith(".fasta"):
        first_line, sequence, seq_id = readTruncateSequence("/content/" + filename)
        writeFasta(seq_id, first_line, sequence)

useDashing()

dashing_output = os.path.join("/content", f"output.csv")
dropColumns(dashing_output)

Permissions added.
Error running default dashing_s512: OS must support AVX512BW instructions.
Trying dashing_s256...
Dashing Completed!
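To make the truncation step concrete, here is a small in-memory sketch of what `readTruncateSequence` does to one record. The record is made up (the header only mimics the `|`-separated layout the function expects, and the bases are dummy data), `truncate_fasta` is a hypothetical name, and the cut-off is lowered from 20000 to 4 so the effect is visible on a short sequence.

```python
def truncate_fasta(text, start=20000):
    """Split a single-record FASTA string into (header, truncated sequence, id)."""
    header, sequence_parts = "", []
    for line in text.splitlines():
        if line.startswith(">"):
            header = line
        else:
            sequence_parts.append(line.strip())
    sequence = "".join(sequence_parts)
    # Second '|'-separated header field, without its 'EPI_ISL_' prefix
    seq_id = header.split("|")[1].strip().replace("EPI_ISL_", "")
    return header, sequence[start:], seq_id


# Made-up record for illustration only
record = ">hCoV-19/Example/0001/2023|EPI_ISL_0000001|2023-06-07\nACGTAC\nGTTTGA\n"
header, tail, seq_id = truncate_fasta(record, start=4)
print(seq_id, tail)
```

The sketch detects the header with `startswith(">")`, whereas the notebook's function checks whether `>` occurs anywhere in the line; for well-formed FASTA files the two behave the same.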


# **Step 4: Encrypt your prepared data**

Once the data is prepared, we can encrypt it using Concrete-ML's Client API.

The private and evaluation keys were already generated in Step 2 using the [FHEModelClient](https://docs.zama.ai/concrete-ml/v/0.5-1/developer-guide/api/concrete.ml.deployment.fhe_client_server#class-fhemodelclient) API, so here we only confirm that the evaluation keys are present on the server.

In [None]:
def get_size(file_path, unit='bytes'):
    """Return the size of a file in the given unit, rounded to 3 decimals."""
    file_size = os.path.getsize(file_path)
    exponents_map = {'bytes': 0, 'kb': 1, 'mb': 2, 'gb': 3}
    if unit not in exponents_map:
        raise ValueError("Must select from ['bytes', 'kb', 'mb', 'gb']")
    size = file_size / 1024 ** exponents_map[unit]
    return round(size, 3)

!ls -lh $network.server_dir.name

total 12K
-rw-r--r-- 1 root root   24 Jun 19 13:20 serialized_evaluation_keys.ekl
-rw-r--r-- 1 root root 5.5K Jun 19 13:19 server.zip


Next, we use **pandas** to read our Dashing output file and encrypt each row.

In [None]:
print(dashing_output)

df = read_csv(dashing_output)
arr_no_id = df.drop(columns=['Accession ID']).to_numpy(dtype="uint16")

#encrypted rows for input to server
encrypted_rows = []

for row in range(arr_no_id.shape[0]):
    #index with [row] to keep a 2D array of shape (1, n_features)
    clear_input = arr_no_id[[row], :]
    encrypted_input = fhemodel_client.quantize_encrypt_serialize(clear_input)
    encrypted_rows.append(encrypted_input)

print(encrypted_rows)

/content/output.csv
[b'\x01\x03\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x14\x00\x00\x00\x00\x00\x00\x00\x01\x05\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x00\x001zu\xf2\xd0\xd5\xd57\x89R\xa3\x05\xbe\xbfo\xb7\xae\xdf\xb8\x85\xad3_\xf3l0\xa8gh\xf5\x1e\xb3\\r\xaa,E\xeb\xd6\xf5]\xbb6Z\xdeA#\x92{\xdad\x0f4\x9f\xe03\xfc\xdd\xc2[\xc3\x1f\xc0aU\xf1YS\x13JS\x04QA\xeb\xdc\xa8\xbd\x89\xdeb\xe66TW\xdeo5<\x06\xb9%)Yh\x8a\x93-\xc1*Ad\x7f";\x8ak\xf3\xe9\xa8\x91\x91\xa4\xdeY\xe3\xb2\xf0*\xce\x13\x14\xd1\xa1c\xc0:W\xc4K\x1d\x0f\xbaa\xa2\xfb\xca\x84\xca\xc8<\x1a\x12\xc7\x19\xf7,\xd3\xef&o\xa2\x1d_\xd02\xfc\x88\xcc\xff@E\xe2\xdb\x93\xd9E\x1e\xddU\x02\xc5\x9a\xc6;I\x87V\xff\xf7\x89 .\xc2\xd5\xbe\x05\xda*\xc2T\xcbi?\xf0eB\x10+\xb0\x11:\x9e~\xfd\xf0z2\r\xffD\x985\xb6\x89\xa75\n.\xfa\xf6\xbf\xe2\xc8\x07\xb7,\xa0\xb5\xd3\x94\x81)\x94B\x8d!\x05rUO\xc05\x1a\xc1\xf6\x98\x8d\xbf]\xbe\xf2\x1e\xb6"\x89?SF\xc1\x15\xe8o\xc5[\xaf\x04\xb8Y\xf9|\xad\xfd4\xb9\xcc\x9f-\xd6*N\xd5\x1f\x95\x90\x0c\

We've successfully encrypted our sample sequence, and we are now ready to send it to the server for inference.

# **Step 5: Send the data to the server for FHE inference**

We first send the encrypted data to the server for classification.

In [None]:
print("Sending encrypted input to server...")

network.client_send_input_to_server_for_prediction(encrypted_rows[0])

Sending encrypted input to server...


0.013916969299316406

Once the server receives the encrypted data, it calls the `run()` method of Concrete-ML's [FHEModelServer](https://docs.zama.ai/concrete-ml/v/0.5-1/developer-guide/api/concrete.ml.deployment.fhe_client_server#class-fhemodelserver) API, which loads the compiled model stored on the server and runs it on the encrypted input. In this simulation, `client_send_input_to_server_for_prediction` performs that call and returns the inference time in seconds, which is the value printed above.

After classification, the server sends the encrypted prediction back to the client:

In [None]:
encrypted_prediction = network.server_send_encrypted_prediction_to_client()

# **Step 6: Decrypt your encrypted prediction results**

Now that the server has sent the prediction results back to the client, we decrypt them using the [FHEModelClient](https://docs.zama.ai/concrete-ml/v/0.5-1/developer-guide/api/concrete.ml.deployment.fhe_client_server#class-fhemodelclient) API.

In [None]:
classes_dict = {0: 'B.1.1.529 (Omicron)', 1: 'B.1.617.2 (Delta)', 2: 'B.1.621 (Mu)', 3: 'C.37 (Lambda)'}

decrypted_predictions = []

decrypted_prediction = fhemodel_client.deserialize_decrypt_dequantize(encrypted_prediction)[0]

decrypted_predictions.append(decrypted_prediction)

decrypted_prediction_class = numpy.array(decrypted_predictions).argmax(axis=1)

final_output = [classes_dict[i] for i in decrypted_prediction_class]
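The decoding step above boils down to picking the index of the highest score and looking it up in `classes_dict`. A dependency-free sketch of the same idea follows; the scores are made up for illustration, and `decode` is a hypothetical helper rather than part of the notebook.

```python
classes_dict = {
    0: "B.1.1.529 (Omicron)",
    1: "B.1.617.2 (Delta)",
    2: "B.1.621 (Mu)",
    3: "C.37 (Lambda)",
}


def decode(scores):
    """Return the label whose score is highest (the first one wins on ties)."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return classes_dict[best_index]


# Made-up scores for illustration only
print(decode([0.05, 0.10, 0.15, 0.70]))
```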

Finally, we simply print the output.

In [None]:
for output in final_output:
  print(output)

C.37 (Lambda)
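This walkthrough sends only the first encrypted row (`encrypted_rows[0]`). To classify every row, the same send, fetch and decrypt calls can be repeated per row. The sketch below shows that loop with the three operations passed in as functions; `classify_all` and the `fake_*` stand-ins are hypothetical and exist only so the example runs without a compiled model.

```python
def classify_all(encrypted_rows, send, fetch, decrypt, labels):
    """Run send -> fetch -> decrypt for each row and map the top score to a label."""
    results = []
    for encrypted_row in encrypted_rows:
        send(encrypted_row)        # server runs inference on the encrypted row
        scores = decrypt(fetch())  # client retrieves and decrypts the scores
        best = max(range(len(scores)), key=lambda i: scores[i])
        results.append(labels[best])
    return results


# Stand-ins for the real network and client: the "server" just echoes its input
mailbox = {}


def fake_send(row):
    mailbox["prediction"] = row


def fake_fetch():
    return mailbox["prediction"]


def fake_decrypt(blob):
    return [value / 255 for value in blob]


rows = [bytes([1, 9, 3, 2]), bytes([7, 0, 0, 1])]
print(classify_all(rows, fake_send, fake_fetch, fake_decrypt, ["a", "b", "c", "d"]))
```

In this notebook, the corresponding arguments would be `network.client_send_input_to_server_for_prediction`, `network.server_send_encrypted_prediction_to_client`, a function wrapping `fhemodel_client.deserialize_decrypt_dequantize(blob)[0]`, and `classes_dict`.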
