In [None]:
!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
!dpkg -i cuda-keyring_1.1-1_all.deb
!apt-get update
!apt-get -y install cuquantum

!pip3 install matplotlib
!pip3 install qiskit
!pip3 install qiskit-aer
!pip3 install qiskit-aer-gpu
!pip3 install qiskit_algorithms
!pip3 install pylatexenc
!pip3 install cupy-cuda12x
!pip3 install cuquantum-python
!pip3 install ipywidgets

In [None]:
from qiskit import QuantumCircuit, Aer, transpile, assemble, execute
from qiskit.quantum_info import Statevector
from qiskit_algorithms import Grover, AmplificationProblem
from qiskit.primitives import Sampler
from qiskit.circuit.library import GroverOperator, QuantumVolume
from qiskit.utils import QuantumInstance
from qiskit.providers.aer import AerSimulator

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

#**Encoding vectors as quantum states**

The **amplitude_encode()** function encodes a classical vector into a quantum state using amplitude encoding, a process that maps classical data to the amplitudes of a quantum state. The companion **quantum_state_vector()** function simulates the resulting circuit and returns its state vector.

###**Process of amplitude encoding:**
The input vector is first normalized, because the squared magnitudes of a quantum state's amplitudes must sum to one (they are the measurement probabilities). The number of qubits is then determined from the length of the input vector: a vector of length ***N*** needs **⌈log<sub>2</sub>(N)⌉** qubits, since each added qubit doubles the dimensionality of the state space. Finally, the **initialize()** method sets the state of these qubits to the normalized input vector by generating a sequence of gates that prepares a state whose amplitudes match the vector's elements. Note that **initialize()** expects exactly 2<sup>n</sup> amplitudes for *n* qubits.
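
The normalization and qubit-count steps can be checked with NumPy alone. The sketch below is illustrative only: the helper name `prepare_amplitudes` is hypothetical, and zero-padding to the next power of two is an assumption made here so that the amplitude vector has length 2<sup>n</sup>.

```python
import numpy as np

def prepare_amplitudes(vector):
    """Hypothetical helper: pad a vector to length 2**n and normalize it."""
    vector = np.asarray(vector, dtype=float)
    # Use at least one qubit, even for a length-1 vector.
    num_qubits = max(1, int(np.ceil(np.log2(vector.size))))
    # Assumption: unused amplitudes are filled with zeros.
    padded = np.zeros(2 ** num_qubits)
    padded[:vector.size] = vector
    # Divide by the Euclidean norm so the squared amplitudes sum to 1.
    return padded / np.linalg.norm(padded), num_qubits

amplitudes, n = prepare_amplitudes([3, 0, 4])
print(n)                        # number of qubits for a length-3 vector
print(amplitudes)               # approximately [0.6, 0, 0.8, 0]
print(np.sum(amplitudes ** 2))  # approximately 1.0
```

A length-3 vector needs two qubits, so one of the four amplitudes is padding.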

###**Execution:**
**Input:** A classical data vector (a NumPy array) that we want to encode as a quantum state. The vector represents the classical information that we want to process quantum-mechanically.

**Output:** **amplitude_encode()** returns a **QuantumCircuit** Qiskit object that encodes the input vector. **quantum_state_vector()** returns two items: that circuit and the resulting **state vector** of the quantum state after the circuit is executed.

In [None]:
def amplitude_encode(vector):

    # Conditional input validation
    if not isinstance(vector, np.ndarray):
        raise TypeError("Input must be a numpy array.")

    if vector.size == 0:
        raise ValueError("Input vector cannot be empty.")

    if np.all(vector == 0):
        raise ValueError("Input vector cannot be a zero vector.")

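    # Number of qubits: at least one, enough to hold len(vector) amplitudes
    num_qubits = max(1, int(np.ceil(np.log2(len(vector)))))

    # initialize() needs exactly 2**num_qubits amplitudes, so pad with zeros
    padded = np.zeros(2 ** num_qubits, dtype=complex)
    padded[:len(vector)] = vector

    # Normalization
    norm_vector = padded / np.linalg.norm(padded)

    # Quantum circuit
    qc = QuantumCircuit(num_qubits)
    qc.initialize(norm_vector, range(num_qubits))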

    return qc

def quantum_state_vector(encoded_vector, visualize=False):
    backend = Aer.get_backend('statevector_simulator')
    job = backend.run(transpile(encoded_vector, backend))
    state = job.result().get_statevector()

    if visualize:
        print("Vector Encoding Circuit drawn")
        encoded_vector.draw(output='mpl', filename='encode_vector_as_quantum_state.png')

    return encoded_vector, state

#**Swap Test**

The **swap_test()** function is central to the quantum attention mechanism: it estimates the similarity between two quantum states, playing the role that the dot product between query and key vectors plays in a classical attention mechanism. These similarity estimates serve as attention scores, which tell the mechanism how much attention to pay to each part of the input. The result of the swap test therefore determines how the value vectors are weighted and combined, completing the quantum analog of the attention calculation.

###**The swap test steps:**
1.   The function first checks that the two input quantum circuits have the same number of qubits, because the swap test requires states of the same dimension.
2.   An ancilla qubit is placed at the beginning of the new circuit (qubit 0). The ancilla qubit controls the subsequent swap operations.
3.   The circuits **qc1** and **qc2** are composed onto the new circuit. This combines the two circuits into one, with each original circuit's state occupying a separate register.
4.   The ancilla qubit is put into a superposition state using a **Hadamard gate (H)**. A series of controlled-swap gates, controlled by the ancilla qubit, is then applied. These gates swap the corresponding qubits of **qc1** and **qc2** if the ancilla qubit is in the state **|1>**. Finally, another Hadamard gate is applied to the ancilla qubit.
5.   The ancilla qubit is measured, and the measurement outcome is used to infer the similarity between the quantum states of **qc1** and **qc2**.
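
The steps above can be traced with plain matrices for the smallest case of two single-qubit states. This is a minimal sketch, not the Qiskit circuit itself: the function name `swap_test_p0` is hypothetical, the inputs are assumed to be normalized, and the ancilla is placed as the leftmost tensor factor, which differs from Qiskit's own qubit ordering.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=float)

def swap_test_p0(psi, phi):
    """Probability of reading 0 on the ancilla for two single-qubit states."""
    psi = np.asarray(psi, dtype=complex)
    phi = np.asarray(phi, dtype=complex)
    ancilla = np.array([1, 0], dtype=complex)
    # Sketch convention: ancilla (x) psi (x) phi, ancilla leftmost.
    state = np.kron(ancilla, np.kron(psi, phi))
    h_on_ancilla = np.kron(H, np.eye(4))
    # Controlled-SWAP: identity when the ancilla is |0>, SWAP when it is |1>.
    cswap = np.block([[np.eye(4), np.zeros((4, 4))],
                      [np.zeros((4, 4)), SWAP]])
    # Apply H, then controlled-SWAP, then H again (rightmost acts first).
    state = h_on_ancilla @ cswap @ h_on_ancilla @ state
    # The first four amplitudes belong to ancilla = |0>.
    return float(np.sum(np.abs(state[:4]) ** 2))

print(swap_test_p0([1, 0], [0, 1]))  # orthogonal states: close to 0.5
print(swap_test_p0([1, 0], [1, 0]))  # identical states: close to 1.0
```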


###**Execution:**
**Input:** Two quantum circuits that represent the states for which we want to estimate the dot product. These correspond to the quantum states of **query (Q)** and **key (K)** vectors in the attention mechanism.

**Output:** Returns a **QuantumCircuit** Qiskit object that represents the quantum circuit which performs the swap test between **qc1** and **qc2** (the two inputs).

###**Dynamics:**
The probability of measuring the ancilla qubit in state **|0>** depends on the squared magnitude of the inner product between the states prepared by **qc1** and **qc2**. If the states are identical, the probability of measuring **|0>** is 1. If they are orthogonal, the probability is 0.5. In all other cases, the probability lies between 0.5 and 1.
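
The standard closed form behind these three cases is P(0) = 1/2 + 1/2 |⟨ψ|φ⟩|², where ψ and φ are the two encoded states. The sketch below evaluates that relation and inverts it; the function names `expected_p0` and `overlap_from_p0` are hypothetical and not part of this notebook's code.

```python
import numpy as np

def expected_p0(a, b):
    """Ideal swap-test probability of ancilla outcome 0 for vectors a and b."""
    a = np.asarray(a, dtype=complex)
    b = np.asarray(b, dtype=complex)
    # Normalize so that the vectors represent valid quantum states.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    overlap_sq = np.abs(np.vdot(a, b)) ** 2  # |<a|b>|^2
    return 0.5 + 0.5 * overlap_sq

def overlap_from_p0(p0):
    """Invert the relation; clip at 0 because a sampled p0 can dip below 0.5."""
    return max(0.0, 2.0 * p0 - 1.0)

print(expected_p0([1, 0], [0, 1]))  # orthogonal: 0.5
print(expected_p0([1, 0], [1, 0]))  # identical: 1.0
print(overlap_from_p0(0.75))        # 0.5
```

Because the swap test yields the squared magnitude of the inner product, it cannot distinguish the sign or phase of the overlap, unlike a classical dot product.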

In [None]:
def swap_test(qc1, qc2, visualize=False):
    if qc1.num_qubits != qc2.num_qubits:
        raise ValueError("Both quantum circuits must have the same number of qubits.")

    num_qubits = qc1.num_qubits

    qc = QuantumCircuit(num_qubits * 2 + 1, 1)
    qc.compose(qc1, qubits=range(1, num_qubits + 1), inplace=True)
    qc.compose(qc2, qubits=range(num_qubits + 1, 2 * num_qubits + 1), inplace=True)

    qc.h(0)
    for i in range(num_qubits):
        qc.cswap(0, i + 1, num_qubits + i + 1)
    qc.h(0)

    qc.measure(0, 0)

    if visualize:
        print("Swap Test Circuit drawn")
        qc.draw(output='mpl', filename='swap_test_circuit.png')

    return qc

#**Measurement**

The **measure_state()** function runs the swap test circuit, measures the ancilla qubit, and converts the measurement results into attention scores.

###**Measurement steps:**
1.   The function first validates its inputs: the circuit must be of type **QuantumCircuit** and the number of shots must be a positive integer.
2.   The circuit is run on a backend simulator from **Aer** with a specified number of shots (repetitions of the circuit).
3.   The circuit includes measurement gates that measure the qubit states. Running it on the simulator returns a result object containing the counts of each observed outcome.
4.   The counts are converted to probabilities/attention scores by dividing the count of each outcome by the total number of shots.
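
The last step can be illustrated without a simulator. This is a small sketch under stated assumptions: `counts_to_scores` is a hypothetical helper, the counts are illustrative, and outcomes that never occurred are filled in with 0 (a counts dictionary only lists outcomes that were actually observed).

```python
def counts_to_scores(counts, outcomes=("0", "1")):
    """Convert raw counts to relative frequencies, filling unseen outcomes with 0."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must contain at least one shot.")
    return {outcome: counts.get(outcome, 0) / total for outcome in outcomes}

# Illustrative counts for 1024 shots.
print(counts_to_scores({"1": 499, "0": 525}))  # both outcomes observed
print(counts_to_scores({"0": 1024}))           # outcome "1" never observed
```

Filling in missing outcomes keeps every score dictionary the same shape, which matters if the scores are later indexed by outcome.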

In [None]:
def measure_state(qc, num_shots=1024):
    # Conditional input validation
    if not isinstance(qc, QuantumCircuit):
        raise TypeError("Input must be a QuantumCircuit.")

    if not isinstance(num_shots, int) or num_shots <= 0:
        raise ValueError("num_shots must be a positive integer.")

    # Run on the Aer simulator; backend.run() replaces the deprecated execute()
    backend = Aer.get_backend('qasm_simulator')
    job = backend.run(transpile(qc, backend), shots=num_shots)
    result = job.result()
    counts = result.get_counts()

    attention_scores = {state: count / num_shots for state, count in counts.items()}

    return attention_scores

#**Classical Components: Softmax & Weighted Sum**

Standard classical implementations of softmax and the weighted sum of value vectors.
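
As a worked example of the two operations on plain arrays, the sketch below applies softmax to a few similarity scores and uses the resulting weights to combine value vectors. The scores, the value vectors, and the name `softmax_array` are illustrative assumptions, separate from the dictionary-based functions in this notebook.

```python
import numpy as np

def softmax_array(scores):
    """Softmax over a 1-D array of scores."""
    scores = np.asarray(scores, dtype=float)
    # Subtracting the maximum avoids overflow and leaves the result unchanged.
    exp_scores = np.exp(scores - np.max(scores))
    return exp_scores / np.sum(exp_scores)

scores = np.array([1.0, 0.5, 0.5])  # hypothetical similarity scores for three keys
values = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])     # one illustrative value vector per key

weights = softmax_array(scores)     # non-negative weights that sum to 1
output = weights @ values           # weighted sum of the value vectors
print(weights)
print(output)
```

The key with the highest score receives the largest weight, so its value vector contributes most to the output.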


In [None]:
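def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability;
    # the result is mathematically unchanged.
    values = np.array(list(scores.values()), dtype=float)
    exp_scores = np.exp(values - np.max(values))
    sum_exp_scores = np.sum(exp_scores)
    softmax_scores = exp_scores / sum_exp_scores
    return dict(zip(scores.keys(), softmax_scores))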

def weighted_sum_of_values(attention_scores, value_vectors):
    # Conditional input validation
    if not isinstance(value_vectors, np.ndarray):
        raise TypeError("value_vectors must be a numpy array.")

    if not isinstance(attention_scores, dict):
        raise TypeError("attention_scores must be a dictionary.")

    # Initialize the weighted sum vector (float dtype so integer inputs also work)
    weighted_sum = np.zeros_like(value_vectors[0], dtype=float)

    # Iterate over attention scores and value vectors
    for state, score in attention_scores.items():
        # Map the quantum state to an index for the value vector
        index = int(state, 2)  # Assuming binary state representation
        weighted_sum += score * value_vectors[index]

    return weighted_sum

#**Inference with dummy values.**

In [None]:
# Orthogonal and identical vectors
Q1 = np.array([1, 0])
K1 = np.array([0, 1])  # Orthogonal to Q1
V1 = np.array([1, 1]) / np.sqrt(2)

Q2 = np.array([1, 0])
K2 = np.array([1, 0])  # Identical to Q2
V2 = np.array([1, 1]) / np.sqrt(2)

# Encoding both sets
qc_Q1 = amplitude_encode(Q1)
qc_K1 = amplitude_encode(K1)
qc_V1 = amplitude_encode(V1)

qc_Q2 = amplitude_encode(Q2)
qc_K2 = amplitude_encode(K2)
qc_V2 = amplitude_encode(V2)

# Quantum state vectors for both sets
qc_Q1_state_vector = quantum_state_vector(qc_Q1, visualize=True)
qc_K1_state_vector = quantum_state_vector(qc_K1, visualize=True)
qc_V1_state_vector = quantum_state_vector(qc_V1, visualize=True)

qc_Q2_state_vector = quantum_state_vector(qc_Q2, visualize=True)
qc_K2_state_vector = quantum_state_vector(qc_K2, visualize=True)
qc_V2_state_vector = quantum_state_vector(qc_V2, visualize=True)

# Swap test for both sets
swap_test_1 = swap_test(qc_Q1, qc_K1, visualize=True)
swap_test_2 = swap_test(qc_Q2, qc_K2, visualize=True)

# Measure and analyze the results
measured_scores_1 = measure_state(swap_test_1)
measured_scores_2 = measure_state(swap_test_2)

print("Measured Scores for Orthogonal Vectors:", measured_scores_1)
print("Measured Scores for Identical Vectors:", measured_scores_2)

# Apply softmax to the measured attention scores
normalized_scores_1 = softmax(measured_scores_1)
normalized_scores_2 = softmax(measured_scores_2)

print("Normalized Scores for Orthogonal Vectors:", normalized_scores_1)
print("Normalized Scores for Identical Vectors:", normalized_scores_2)

# Prepare value vectors for each set
value_vectors_1 = np.array([V1, V1])  # Simplified for demonstration
value_vectors_2 = np.array([V2, V2])  # Simplified for demonstration

# Calculate the weighted sum of value vectors
output_vector_1 = weighted_sum_of_values(normalized_scores_1, value_vectors_1)
output_vector_2 = weighted_sum_of_values(normalized_scores_2, value_vectors_2)

print("Output Vector for Orthogonal Vectors:", output_vector_1)
print("Output Vector for Identical Vectors:", output_vector_2)

Vector Encoding Circuit drawn


Vector Encoding Circuit drawn
Vector Encoding Circuit drawn
Vector Encoding Circuit drawn
Vector Encoding Circuit drawn
Vector Encoding Circuit drawn
Swap Test Circuit drawn
Swap Test Circuit drawn
Measured Scores for Orthogonal Vectors: {'1': 0.4873046875, '0': 0.5126953125}
Measured Scores for Identical Vectors: {'0': 1.0}
Normalized Scores for Orthogonal Vectors: {'1': 0.49365268474729923, '0': 0.5063473152527008}
Normalized Scores for Identical Vectors: {'0': 1.0}
Output Vector for Orthogonal Vectors: [0.70710678 0.70710678]
Output Vector for Identical Vectors: [0.70710678 0.70710678]
