# Executing Quantum Circuits 

One can execute CUDA Quantum kernels via the `sample` and `observe` function calls.

Quantum states collapse upon measurement, so a circuit must be measured many times to gather statistics. The CUDA Quantum `sample` call does this.

## Sample

In [None]:
import cudaq

qubit_count = 2

# Define the simulation target.
cudaq.set_target("qpp-cpu")

# Define a quantum kernel function.
kernel = cudaq.make_kernel()

# Allocate our `qubit_count` to the kernel.
qubits = kernel.qalloc(qubit_count)

# 2-qubit GHZ state.
kernel.h(qubits[0])
for i in range(1, qubit_count):
    kernel.cx(qubits[0], qubits[i])

# If we don't specify measurements, all qubits are measured in
# the Z-basis by default.
kernel.mz(qubits)

result = cudaq.sample(kernel, shots_count=1000)

result.dump()

In simulation mode, the quantum state is built once and then sampled $s$ times, where $s$ equals `shots_count`. In hardware execution mode, the quantum state collapses upon measurement and hence must be rebuilt for every shot.
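To make the simulation-mode picture concrete, here is a minimal sketch in plain Python (it does not call CUDA Quantum). The outcome probabilities of the 2-qubit state prepared above are written down once, and `shots_count` bitstrings are then drawn from them. The helper `sample_distribution` and the hard-coded probabilities are illustrative assumptions, not part of the `cudaq` API.

```python
import random
from collections import Counter


def sample_distribution(probabilities, shots_count, seed=None):
    """Draw `shots_count` bitstrings from a fixed outcome distribution."""
    rng = random.Random(seed)
    outcomes = list(probabilities)
    weights = [probabilities[outcome] for outcome in outcomes]
    draws = rng.choices(outcomes, weights=weights, k=shots_count)
    return Counter(draws)


# Ideal measurement probabilities of (|00> + |11>) / sqrt(2).
# The "state" is built once here and reused for every shot.
bell_probabilities = {"00": 0.5, "01": 0.0, "10": 0.0, "11": 0.5}

counts = sample_distribution(bell_probabilities, shots_count=1000, seed=0)
print(dict(counts))
```

Only `00` and `11` can appear, and the counts always add up to `shots_count`. On hardware there is no stored distribution to draw from, which is why the circuit has to be run again for each shot.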





### Sample Async

Asynchronous programming is a technique that lets your program start a potentially long-running task and remain responsive to other events while that task runs, rather than waiting until it has finished. Once the task has finished, your program is presented with the result.
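As a generic illustration of this pattern, independent of CUDA Quantum, the sketch below submits two slow tasks to Python's standard `concurrent.futures` thread pool and collects the results later. The function `long_running_task` is a hypothetical stand-in for any expensive call. Note that Python futures expose `.result()`, whereas the CUDA Quantum example in this section calls `.get()` on the object returned by `sample_async`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def long_running_task(label, seconds):
    """Stand-in for an expensive call: sleep, then return a message."""
    time.sleep(seconds)
    return f"{label} finished"


with ThreadPoolExecutor(max_workers=2) as pool:
    # `submit` returns immediately with a future; the work runs in the background.
    future_a = pool.submit(long_running_task, "task A", 0.2)
    future_b = pool.submit(long_running_task, "task B", 0.2)

    # The main program is free to do other work while the tasks run.
    other_work = sum(range(1000))

    # `result()` blocks only until the corresponding task has finished.
    results = [future_a.result(), future_b.result()]

print(results)
```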

`sample` can be a time-intensive task. `sample_async` lets us parallelize its execution over the arguments it accepts; the example below distributes two kernels across two QPUs via `qpu_id`.

In [None]:
# Parallelize over the various kernels one would like to execute.

import cudaq

cudaq.set_target("nvidia-mqpu")
target = cudaq.get_target()
num_qpus = target.num_qpus()

qubit_count = 2

# Kernel 1

kernel1 = cudaq.make_kernel()
qubits = kernel1.qalloc(qubit_count)
# 2-qubit GHZ state.
kernel1.h(qubits[0])
for i in range(1, qubit_count):
    kernel1.cx(qubits[0], qubits[i])
kernel1.mz(qubits)

# Kernel 2

kernel2 = cudaq.make_kernel()
qubits = kernel2.qalloc(qubit_count)
# 2-qubit GHZ state.
kernel2.h(qubits[0])
for i in range(1, qubit_count):
    kernel2.cx(qubits[0], qubits[i])
kernel2.mz(qubits)

# Asynchronous execution on multiple QPUs via NVIDIA GPUs.
# Each `qpu_id` must be less than `num_qpus`, so this needs at least two GPUs.

result1 = cudaq.sample_async(kernel1, shots_count=1000, qpu_id=0)
result2 = cudaq.sample_async(kernel2, shots_count=1000, qpu_id=1)

result1.get().dump()
result2.get().dump()

Similar to the above, one can also parallelize over the `shots_count` or the variational parameters of a quantum circuit.
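One way to parallelize over `shots_count` is to split the shot budget into one chunk per QPU, sample each chunk separately, and add the resulting counts. The sketch below shows only that bookkeeping in plain Python. `split_shots` and `merge_counts` are hypothetical helpers, and the per-QPU results are made-up placeholders; in practice each chunk could be passed to a call like `cudaq.sample_async(kernel, shots_count=chunk, qpu_id=i)` as in the cell above.

```python
from collections import Counter


def split_shots(shots_count, num_workers):
    """Split a shot budget into near-equal chunks, one per worker."""
    base, remainder = divmod(shots_count, num_workers)
    return [base + (1 if i < remainder else 0) for i in range(num_workers)]


def merge_counts(partial_counts):
    """Add up bitstring counts returned by the individual workers."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


chunks = split_shots(1000, 3)

# Placeholder per-worker results (an assumption for illustration only);
# real counts would come from sampling each chunk on its own QPU.
partials = [{"00": chunk // 2, "11": chunk - chunk // 2} for chunk in chunks]

merged = merge_counts(partials)
print(chunks, dict(merged))
```

The merged counts cover the full shot budget, so they can be analyzed exactly like the output of a single `sample` call.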


## Observe

`observe` lets us gather qubit statistics and compute expectation values. We must supply a spin operator, in the form of a Hamiltonian $H$, whose expectation value $\langle\psi|H|\psi\rangle$ we would like to compute.

In [None]:
import cudaq
from cudaq import spin

qubit_count = 2

# Define the simulation target.
cudaq.set_target("qpp-cpu")

# Define a quantum kernel function.
kernel = cudaq.make_kernel()

# Allocate our `qubit_count` to the kernel.
qubits = kernel.qalloc(qubit_count)

# 2-qubit GHZ state.
kernel.h(qubits[0])

for i in range(1, qubit_count):
    kernel.cx(qubits[0], qubits[i])

# Define a Hamiltonian in terms of Pauli spin operators.
hamiltonian = spin.z(0) + spin.y(1) + spin.x(0) * spin.z(0)

# Compute the expectation value given the state prepared by the kernel.
result = cudaq.observe(kernel, hamiltonian).expectation()

print('<H> =', result)
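As a cross-check of what `observe` computes, the following sketch evaluates $\langle\psi|H|\psi\rangle$ directly with small dense matrices in plain Python, without CUDA Quantum. It assumes the basis ordering $|q_0 q_1\rangle$ with qubit 0 as the left tensor factor; the helpers `kron`, `matmul`, `add`, and `expectation` are illustrative names, not `cudaq` functions.

```python
import math

# Single-qubit operators as nested lists (row-major).
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(*mats):
    """Element-wise sum of equally sized square matrices."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]


def expectation(op, psi):
    """Return <psi| op |psi> for a state vector given as a list."""
    dim = len(psi)
    op_psi = [sum(op[r][c] * psi[c] for c in range(dim)) for r in range(dim)]
    return sum(complex(psi[r]).conjugate() * op_psi[r] for r in range(dim))


# State prepared by the kernel: (|00> + |11>) / sqrt(2).
amp = 1 / math.sqrt(2)
psi = [amp, 0.0, 0.0, amp]

# H = Z_0 + Y_1 + X_0 Z_0, with qubit 0 as the left tensor factor.
hamiltonian = add(kron(Z, I2), kron(I2, Y), kron(matmul(X, Z), I2))

expectation_value = expectation(hamiltonian, psi)
print("<H> =", expectation_value)
```

For this state each of the three terms has zero expectation value, so the total is zero up to floating-point error. The last term acts on a single qubit, where $X_0 Z_0 = -iY_0$.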

### Observe Async

Similar to `sample_async` above, `observe` also supports asynchronous execution, via `observe_async`, for the [arguments it accepts](https://nvidia.github.io/cuda-quantum/latest/api/languages/python_api.html#cudaq.sample_async:~:text=cudaq.observe_async(),%C2%B6). One can parallelize over various kernels, spin operators, variational parameters, or even noise models.