# CUDA Quantum

["With a unified and open programming model, NVIDIA CUDA-Q is an open-source platform for integrating and programming quantum processing units (QPUs), GPUs, and CPUs in one system. CUDA-Q enables GPU-accelerated system scalability and performance across heterogeneous QPU, CPU, GPU, and emulated quantum system elements." - NVIDIA](https://developer.nvidia.com/cuda-q)


## System Requirements 
(For version 0.7.0 as of 2024-04-30)

-   x86-64 / ARM64
-   Linux
-   Python 3.8 - 3.11
-   glibc 2.28+
-   **GPU NOT REQUIRED** (although recommended)

In [None]:
!pip install cuda_quantum==0.7.0

## [Kernels](https://nvidia.github.io/cuda-quantum/latest/using/basics/kernel_intro.html)

A *CUDA Quantum kernel* is a function that can be run on a quantum device. It is a modular arrangement of quantum circuits and classical programs or control structures, which facilitates hybrid quantum-classical programs (e.g. quantum machine learning).

In [None]:
import cudaq
import numpy as np

num_qubits = 3

### Function Decorator

Define a kernel as a Python function. Because kernels are compiled, `h` and `x` (the Hadamard and NOT gates, respectively) do not need to be defined in the Python scope; `cuda_quantum` resolves them at compile time.

The decorator form supports dynamic generation of quantum circuits using runtime variables and classical control structures (`if` statements, `for` loops).
Kernels built with the decorator cannot be composed with other kernels.

In [None]:
# Define our kernel.
@cudaq.kernel
def kernel(num_qubits: int):
    # Allocate our qubits.
    qvector = cudaq.qvector(num_qubits)
    # Place the first qubit in the superposition state.
    h(qvector[0])
    # Loop through the allocated qubits and apply controlled-X,
    # or CNOT, operations between them.
    for qubit in range(num_qubits - 1):
        x.ctrl(qvector[qubit], qvector[qubit + 1])
    
print(cudaq.draw(kernel, num_qubits))
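As a cross-check on what this circuit prepares, the sketch below simulates the same gate sequence (Hadamard on qubit 0, then a chain of CNOTs) with plain NumPy. It does not use `cudaq`; the helper name `ghz_statevector` and the convention that qubit 0 is the most significant bit are assumptions made for this illustration only.

```python
import numpy as np


def ghz_statevector(num_qubits: int) -> np.ndarray:
    """Simulate H on qubit 0 followed by a CNOT chain (hypothetical helper)."""
    dim = 2 ** num_qubits
    state = np.zeros(dim, dtype=np.complex128)
    state[0] = 1.0  # start in |00...0>

    # Hadamard on qubit 0, identity on the rest (qubit 0 = most significant bit).
    op = np.array([[1, 1], [1, -1]], dtype=np.complex128) / np.sqrt(2)
    for _ in range(num_qubits - 1):
        op = np.kron(op, np.eye(2))
    state = op @ state

    # CNOT chain: flip qubit q + 1 whenever qubit q is 1.
    for q in range(num_qubits - 1):
        new_state = np.zeros_like(state)
        target_mask = 1 << (num_qubits - 2 - q)
        for index in range(dim):
            control_bit = (index >> (num_qubits - 1 - q)) & 1
            new_index = index ^ target_mask if control_bit else index
            new_state[new_index] = state[index]
        state = new_state
    return state


# Only the all-zeros and all-ones basis states carry probability.
probabilities = np.abs(ghz_statevector(3)) ** 2
print(probabilities)
```

The result is the GHZ state: half the probability on the all-zeros bitstring and half on the all-ones bitstring.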

### Object Definition

`cudaq.make_kernel` creates a `PyKernel` object that the user can then interact with.
Kernel objects are largely defined statically. If a kernel needs to act on dynamic arguments passed at call time, native Python control flow (`if/else` statements, `for` loops) raises errors; use the equivalent structures defined on `cudaq.PyKernel` instead.

In [None]:
kernel, qubit_count = cudaq.make_kernel(int)

qvector = kernel.qalloc(qubit_count)

kernel.h(qvector[0])

# for i in range(qubit_count - 1):
#     kernel.cx(qvector[i], qvector[i + 1])
kernel.for_loop(0, qubit_count-1, lambda i: kernel.cx(qvector[i], qvector[i + 1]))

print(cudaq.draw(kernel, num_qubits))

#### Creating composite kernels

It is very difficult to create composite kernels because variables from the child kernel cannot be returned to the parent. For example, the child cannot create a `qvector` on which the parent can also place gates.

Passing a `qvector` as a parameter means that the child cannot exist independently.

In [None]:
ghz, qvector, qubit_count = cudaq.make_kernel(cudaq.qvector, int)

ghz.h(qvector[0])
ghz.for_loop(0, qubit_count-1, lambda i: ghz.cx(qvector[i], qvector[i + 1]))

In [None]:
# print(cudaq.draw(ghz, cudaq.qvector(num_qubits), num_qubits))

In [None]:
measure, qubit_count = cudaq.make_kernel(int)

qvector = measure.qalloc(qubit_count)

# Insert ghz circuit
measure.apply_call(ghz, qvector, qubit_count)

# Apply measurement
measure.mz(qvector)

print(cudaq.draw(measure, num_qubits))

## Sampling and Expectation Values

Supports sampling the quantum circuit (estimating the measurement-outcome distribution of the state $\ket{\psi}$ from repeated shots) and finding the expectation value $\langle H \rangle$ of some Hamiltonian $H$ with respect to the state the circuit prepares:
$\langle H \rangle = \langle \psi \vert H \vert \psi \rangle$.

In [None]:
# Note: if no measurement (mz, mx, my) operation is provided, measures all qubits in z-basis (mz) by default
results: cudaq.SampleResult = cudaq.sample(measure, num_qubits, shots_count=10000)
results.dump()

In [None]:
hamiltonian = cudaq.spin.z(0)
result: cudaq.ObserveResult = cudaq.observe(kernel, hamiltonian, num_qubits, shots_count=10000)
result.expectation()
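To make the relationship between sampling and expectation values concrete, here is a minimal NumPy sketch that does not call `cudaq`. It assumes the ideal 3-qubit GHZ distribution (only `000` and `111`, each with probability 1/2), draws shots from it, and estimates $\langle Z_0 \rangle$ from the counts. The variable names and the fixed random seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Assumed ideal GHZ outcome distribution on 3 qubits.
bitstrings = ["000", "111"]
outcome_probabilities = [0.5, 0.5]

shots = 10_000
samples = rng.choice(bitstrings, size=shots, p=outcome_probabilities)

# Sampling: tally how often each bitstring was observed.
counts = {b: int(np.sum(samples == b)) for b in bitstrings}

# Z on qubit 0: a measured 0 contributes +1, a measured 1 contributes -1.
z0_estimate = sum((1 if b[0] == "0" else -1) * n for b, n in counts.items()) / shots

# Exact value <psi| Z (x) I (x) I |psi> for comparison.
psi = np.zeros(8)
psi[0] = psi[7] = 1 / np.sqrt(2)
z0_operator = np.kron(np.diag([1.0, -1.0]), np.eye(4))
z0_exact = float(psi @ z0_operator @ psi)

print(counts, z0_estimate, z0_exact)
```

The shot-based estimate fluctuates around the exact value of zero, with a statistical error that shrinks roughly as one over the square root of the number of shots.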

### Targets / Backends

Users can run a kernel on simulated or hardware backends, called "targets". A target is selected with the command below, where `identifier` is the string identifier of the target listed in the table that follows.

```cudaq.set_target("identifier")```

| Name                          | Identifier        | Type       | Description                                                                             |
| --------                      | -------           | -------    | -------                                                                                 |
| Single-GPU                    | `"nvidia"`        | Simulation | A statevector simulator using the `cuStateVec` library.                                 |
| Multi-Node, Multi-GPU         | `"nvidia-mgpu"`   | Simulation | `cuStateVec` statevector simulator with support for multi-Node, multi-GPU distribution. |
| OpenMP CPU-only               | `"qpp-cpu"`       | Simulation | CPU-only statevector simulator that is OpenMP-threaded using `Q++`. |
| Tensor Network                | `"tensornet"`     | Simulation | Represents quantum states as tensor networks. Supports multi-node, multi-GPU distribution of tensor operations and contractions. |
| Matrix Product State          | `"tensornet-mps"` | Simulation | Single-GPU simulation of quantum circuits as matrix-product states. Exploits sparse tensor networks and numerical approximation. |
| NVIDIA Multi-Processor        | `"nvidia-mqpu"`   | Simulation | Creates 1 simulated quantum processing unit for every GPU on the system. |
| Remote Multi-Processor        | `"remote-mqpu"`   | Simulation | Like `"nvidia-mqpu"`, but wraps simulated QPUs into independent HTTP REST server instances. |
| NVIDIA Quantum Cloud (NVQC)   | `"nvqc"`          | Simulation | Simulation on the NVIDIA Quantum Cloud platform. Supports the other simulation methods above. |
| Quantinuum                    | `"quantinuum"`    | Hardware   | Trapped-ion quantum computers. |
| IonQ                          | `"ionq"`          | Hardware   | Trapped-ion quantum computers. |
| Oxford Quantum Circuits (OQC) | `"oqc"`           | Hardware   | Supports the 8-qubit, ring-topology **Lucy** device and the 32-qubit, Kagome-lattice-topology **Toshiko** device. |
| IQM                           | `"iqm"`           | Hardware   | **Under development.** |
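Since not every target in the table is available on every machine (the GPU targets need a GPU, for instance), one possible pattern is to try identifiers in order of preference. The sketch below is plain Python: `pick_target` is a hypothetical helper, and the availability check is a stand-in so the example runs without `cudaq` installed.

```python
from typing import Callable, Sequence


def pick_target(preferred: Sequence[str], is_available: Callable[[str], bool]) -> str:
    """Return the first identifier in `preferred` that is available (hypothetical helper)."""
    for name in preferred:
        if is_available(name):
            return name
    raise RuntimeError(f"None of the requested targets is available: {list(preferred)}")


# Stand-in availability check: pretend only the CPU simulator is installed.
installed = {"qpp-cpu"}
chosen = pick_target(["nvidia", "qpp-cpu"], lambda name: name in installed)
print(chosen)
```

With CUDA Quantum installed, the stand-in could presumably be replaced by an availability query such as `cudaq.has_target`, and the result passed to `cudaq.set_target(chosen)`. Treat the availability call as an assumption to verify against the installed version.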



### [Noisy Simulation with Noise Models](https://nvidia.github.io/cuda-quantum/latest/examples/python/tutorials/noisy_simulations.html)

In [None]:
# First, we will define an out of the box noise channel. In this case,
# we choose depolarization noise. This depolarization will result in
# the qubit state decaying into a mix of the basis states, |0> and |1>,
# with our provided probability.
error_probability = 0.1
depolarization_channel = cudaq.DepolarizationChannel(error_probability)

# We can also define our own, custom noise channels through
# Kraus operators. Here we will define two operators representing
# bit flip errors.

# Define the Kraus Error Operator as a complex ndarray.
kraus_0 = np.sqrt(1 - error_probability) * np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.complex128)
kraus_1 = np.sqrt(error_probability) * np.array([[0.0, 1.0], [1.0, 0.0]], dtype=np.complex128)

# Add the Kraus Operator to create a quantum channel.
bitflip_channel = cudaq.KrausChannel([kraus_0, kraus_1])

# Add the two channels to our Noise Model.
noise_model = cudaq.NoiseModel()

# Apply the depolarization channel to any X-gate on the 0th qubit.
noise_model.add_channel("x", [0], depolarization_channel)
# Apply the bitflip channel to any X-gate on the 1st qubit.
noise_model.add_channel("x", [1], bitflip_channel)

In [None]:
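# In this version, noisy simulation requires the density-matrix backend.
cudaq.set_target("density-matrix-cpu")
# Without noise, the GHZ kernel yields only 000 and 111. Note that the channels
# above are attached to X gates on qubits 0 and 1, so they may not be triggered
# by a kernel built only from H and controlled-X gates.
noisy_counts = cudaq.sample(kernel, num_qubits, noise_model=noise_model, shots_count=1000)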
noisy_counts.dump()
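The custom bit-flip channel above can be sanity-checked without `cudaq`. A valid set of Kraus operators must satisfy the completeness relation $\sum_k K_k^\dagger K_k = I$, and applying the channel to a density matrix $\rho$ gives $\sum_k K_k \rho K_k^\dagger$. The NumPy sketch below verifies both for the same operators; `apply_channel` is a hypothetical helper written for this illustration.

```python
import numpy as np

error_probability = 0.1

# Same Kraus operators as in the bit-flip channel above.
kraus_0 = np.sqrt(1 - error_probability) * np.eye(2, dtype=np.complex128)
kraus_1 = np.sqrt(error_probability) * np.array([[0.0, 1.0], [1.0, 0.0]], dtype=np.complex128)
kraus_ops = [kraus_0, kraus_1]

# Completeness relation: the sum of K^dagger K over all operators is the identity.
completeness = sum(k.conj().T @ k for k in kraus_ops)


def apply_channel(rho: np.ndarray, ops: list) -> np.ndarray:
    """Apply a Kraus channel to a density matrix (hypothetical helper)."""
    return sum(k @ rho @ k.conj().T for k in ops)


# Start in |1><1| and send it through the bit-flip channel.
rho_one = np.array([[0.0, 0.0], [0.0, 1.0]], dtype=np.complex128)
rho_noisy = apply_channel(rho_one, kraus_ops)

# Populations of |0> and |1> after the channel.
populations = np.diag(rho_noisy).real
print(completeness.real, populations)
```

After the channel, the qubit is found in the flipped state with probability equal to `error_probability` and in its original state otherwise.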

## Quantum Machine Learning (QML)

CUDA Quantum places strong emphasis on GPU-accelerated quantum simulation and QML. However, the platform is still immature, and QML is too broad a topic to cover here. See the tutorials below for sample QML implementations using CUDA Quantum.

- [Variational Quantum Neural Network](https://nvidia.github.io/cuda-quantum/latest/examples/python/tutorials/hybrid_qnns.html)
- [Variational Quantum Eigensolver](https://nvidia.github.io/cuda-quantum/latest/examples/python/tutorials/vqe.html)