# Preamble

This notebook uses jupyter-c-kernel, which must be installed to run the code examples. Some configuration notes for running this tutorial locally:

* Each code cell starts with a `//%cflags:` comment. It contains the C flags passed to the C compiler that the kernel is configured to use. Right now they contain paths to the shared library and headers in my local development environment; you will need to update them for yours.
* My kernel is configured so that the shared libraries used here are on the library search path via `LD_LIBRARY_PATH`.
* This notebook is untested on Windows and macOS.

All of this is to say that it might be easier to copy the code examples into standalone .c files and use your compiler locally. The notebook is just a convenient presentation format.

# Building Qiskit's C API

Right now the only way to get the Qiskit C API is to build it from source. You will have to clone the Qiskit source tree and check out the latest stable release (2.3.0). This also requires a Rust compiler to be installed; the Qiskit 2.3.0 release requires Rust 1.85 or newer. The basic steps for building the C API are:

```bash
git clone https://github.com/Qiskit/qiskit
pushd qiskit
git checkout 2.3.0
make c
```

This puts the shared object file `libqiskit.so` in `dist/c/lib` and the headers in `dist/c/include` (both relative to the repo root). You can then install them as necessary in your local environment.

The C API was first introduced in Qiskit 2.0, which was released at the end of March 2025. It is still under active development and has been evolving with every release. The interface is considered unstable during this initial development phase, which means it might have backwards-incompatible changes between minor versions (e.g. 2.2.0 to 2.3.0) that would otherwise violate [Qiskit's stability policy](https://quantum.cloud.ibm.com/docs/en/guides/qiskit-sdk-version-strategy).

# Working with Qiskit's C API
Let's look at the Qiskit components we used in the previous section that are available through the C API.


In [1]:
//%cflags:-lm -lqiskit -L/home/computertreker/git/qiskit/qiskit-core/dist/c/lib -I/home/computertreker/git/qiskit/qiskit-core/dist/c/include
#include <qiskit.h>
#include <string.h>
#include <stdio.h>

int main() {
    printf("%s\n", QISKIT_VERSION);
    return 0;
}

2.3.0

## Quantum Circuits


In [2]:
//%cflags:-lm -lqiskit -L/home/computertreker/git/qiskit/qiskit-core/dist/c/lib -I/home/computertreker/git/qiskit/qiskit-core/dist/c/include
    
#include <qiskit.h>
#include <string.h>
#include <stdio.h>

void print_circuit(QkCircuit *qc);

int main() {
    // Create an empty circuit with 1000 qubits and 1000 clbits
    QkCircuit *qc = qk_circuit_new(1000, 1000);
    // Add a Hadamard Gate on Qubit 0
    uint32_t one_qubit[1] = {0,};
    qk_circuit_gate(qc, QkGate_H, one_qubit, NULL); // The NULL pointer is for the parameter array.
                                                    // Since hadamard doesn't have parameters it
                                                    // is never accessed.
    uint32_t two_qubits[2] = {0, 1};
    qk_circuit_gate(qc, QkGate_CX, two_qubits, NULL);
    print_circuit(qc);
    qk_circuit_free(qc);
    return 0;    
}

void print_circuit(QkCircuit *qc) {
    size_t num_instructions = qk_circuit_num_instructions(qc);
    QkCircuitInstruction *inst = malloc(sizeof(QkCircuitInstruction));
    for (size_t i = 0; i < num_instructions; i++) {
        qk_circuit_get_instruction(qc, i, inst);
        printf("%s: qubits: (", inst->name);
        for (uint32_t j = 0; j < inst->num_qubits; j++) {
            printf("%d,", inst->qubits[j]);
        }
        printf(")");
        uint32_t num_clbits = inst->num_clbits;
        if (num_clbits > 0) {
            printf(", clbits: (");
            for (uint32_t j = 0; j < num_clbits; j++) {
                printf("%d,", inst->clbits[j]);
            }
            printf(")");
        }
        printf("\n");
        qk_circuit_instruction_clear(inst);
    }
    // Release the scratch instruction allocated at the top of the function
    free(inst);
}


h: qubits: (0,)
cx: qubits: (0,1,)


There is also a circuit library in C, although it is still under active construction, so the circuits it contains are quite limited in the 2.3.0 release.

In [3]:
//%cflags:-lm -lqiskit -L/home/computertreker/git/qiskit/qiskit-core/dist/c/lib -I/home/computertreker/git/qiskit/qiskit-core/dist/c/include

#include <qiskit.h>

int main() {
    QkCircuit *qv = qk_circuit_library_quantum_volume(50, 50, 42);
    qk_circuit_free(qv);
}

## Observables

The sparse observable type exposed to Python is available as the `QkObs` struct. We can construct a cost observable equivalent to the one from the previous section.


In [4]:
//%cflags:-lm -lqiskit -L/home/computertreker/git/qiskit/qiskit-core/dist/c/lib -I/home/computertreker/git/qiskit/qiskit-core/dist/c/include


#include <qiskit.h>
#include <string.h>
#include <stdio.h>


int main() {
    uint32_t num_qubits = 13;
    QkObs *cost_observable = qk_obs_zero(num_qubits);
    QkComplex64 coeff = {0, 0};
    QkBitTerm bit_terms[2] = {QkBitTerm_Z, QkBitTerm_Z};
    uint32_t indices[2] = {1, 0};
    QkObsTerm term = {coeff, 2, bit_terms, indices, num_qubits};
    QkExitCode exit_code = qk_obs_add_term(cost_observable, &term);
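    char *string = qk_obs_str(cost_observable);
    // Print through a format string so the observable text is never interpreted as one
    printf("%s\n", string);
    qk_str_free(string);
    qk_obs_free(cost_observable);
    return 0;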
}

SparseObservable { num_qubits: 13, coeffs: [Complex { re: 0.0, im: 0.0 }], bit_terms: [Z, Z], indices: [1, 0], boundaries: [0, 2] }

## Transpilation

The C API enables you to define targets and transpile circuits to them. Qiskit is a vendor-agnostic library and doesn't include support for any vendor hardware on its own (this is where the libraries mentioned in the previous tutorial section come in), so in this example we build a simple compilation target and then compile our circuit to it. There is also support for building custom transpilation passes, but we will not cover that in this section.

In [5]:
//%cflags:-lqiskit -L /home/computertreker/git/qiskit/qiskit-terra/dist/c/lib -I /home/computertreker/git/qiskit/qiskit-terra/dist/c/include

#include <qiskit.h>
#include <string.h>
#include <stdio.h>

QkTarget * build_line_target(uint32_t num_qubits) {
    QkTarget *target = qk_target_new(num_qubits);
    QkTargetEntry *x_entry = qk_target_entry_new(QkGate_X);
    for (uint32_t i = 0; i < num_qubits; i++) {
        uint32_t qargs[1] = {
            i,
        };
        double error = 0.8e-6 * (i + 1);
        double duration = 1.8e-9 * (i + 1);
        qk_target_entry_add_property(x_entry, qargs, 1, duration, error);
    }
    qk_target_add_instruction(target, x_entry);

    QkTargetEntry *sx_entry = qk_target_entry_new(QkGate_SX);
    for (uint32_t i = 0; i < num_qubits; i++) {
        uint32_t qargs[1] = {
            i,
        };
        double error = 0.8e-6 * (i + 1);
        double duration = 1.8e-9 * (i + 1);
        qk_target_entry_add_property(sx_entry, qargs, 1, duration, error);
    }
    qk_target_add_instruction(target, sx_entry);

    QkTargetEntry *rz_entry = qk_target_entry_new(QkGate_RZ);
    for (uint32_t i = 0; i < num_qubits; i++) {
        uint32_t qargs[1] = {
            i,
        };
        double error = 0.;
        double duration = 0.;
        qk_target_entry_add_property(rz_entry, qargs, 1, duration, error);
    }
    qk_target_add_instruction(target, rz_entry);

    QkTargetEntry *ecr_entry = qk_target_entry_new(QkGate_ECR);
    for (uint32_t i = 0; i < num_qubits - 1; i++) {
        uint32_t qargs[2] = {i, i + 1};
        double inst_error = 0.0090393 * (num_qubits - i);
        double inst_duration = 0.020039;

        qk_target_entry_add_property(ecr_entry, qargs, 2, inst_duration, inst_error);
    }
    qk_target_add_instruction(target, ecr_entry);
    return target;
}

QkCircuit* build_bv(uint32_t num_qubits) {
    QkCircuit *qc = qk_circuit_new(num_qubits, 0);
    uint32_t x_qargs[1] = {
        9,
    };
    qk_circuit_gate(qc, QkGate_X, x_qargs, NULL);
    for (uint32_t i = 0; i < qk_circuit_num_qubits(qc); i++) {
        uint32_t qargs[1] = {
            i,
        };
        qk_circuit_gate(qc, QkGate_H, qargs, NULL);
    }
    for (uint32_t i = 0; i < qk_circuit_num_qubits(qc) - 1; i += 2) {
        uint32_t qargs[2] = {i, num_qubits - 1};
        qk_circuit_gate(qc, QkGate_CX, qargs, NULL);
    }
    return qc;
}

int main() {
    QkCircuit *qc = build_bv(128);
    QkTarget *target = build_line_target(256);
    printf("Number of circuit instructions: %zu\n", qk_circuit_num_instructions(qc));
    char *error = NULL;
    QkTranspileOptions options = qk_transpiler_default_options();
    options.seed = 20260203;
    QkTranspileResult transpile_result = {NULL, NULL};
    for (unsigned short i=0; i<4; i++) {
        options.optimization_level = i;
        int result = qk_transpile(qc, target, &options, &transpile_result, &error);
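        printf("Optimization level: %d ", i);
        if (result != 0) {
            printf("Transpilation failed with: %s\n", error);
            // No circuit was produced for this level, so skip the count below
            continue;
        }
        printf(
            "Number of transpiled circuit instructions: %zu\n",
            qk_circuit_num_instructions(transpile_result.circuit));
        // Free this level's outputs before the next iteration overwrites them
        qk_circuit_free(transpile_result.circuit);
        qk_transpile_layout_free(transpile_result.layout);
    }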
    qk_circuit_free(qc);
    qk_target_free(target);
    return 0;
}

Number of circuit instructions: 193
Optimization level: 0 Number of transpiled circuit instructions: 11259
Optimization level: 1 Number of transpiled circuit instructions: 2481
Optimization level: 2 Number of transpiled circuit instructions: 1936
Optimization level: 3 Number of transpiled circuit instructions: 2053


### Working with layouts

As explained in the previous tutorial section, the transpiler may introduce a permutation as part of layout and routing. When running QAOA we used the `SparseObservable.apply_layout()` method to adjust our observable to mirror the layout transformation. The same applies when working with the transpiler in C.

In [6]:
//%cflags:-lqiskit -L /home/computertreker/git/qiskit/qiskit-terra/dist/c/lib -I /home/computertreker/git/qiskit/qiskit-terra/dist/c/include

#include <qiskit.h>
#include <string.h>
#include <stdio.h>

QkTarget * build_line_target(uint32_t num_qubits) {
    QkTarget *target = qk_target_new(num_qubits);
    QkTargetEntry *x_entry = qk_target_entry_new(QkGate_X);
    for (uint32_t i = 0; i < num_qubits; i++) {
        uint32_t qargs[1] = {
            i,
        };
        double error = 0.8e-6 * (i + 1);
        double duration = 1.8e-9 * (i + 1);
        qk_target_entry_add_property(x_entry, qargs, 1, duration, error);
    }
    qk_target_add_instruction(target, x_entry);

    QkTargetEntry *sx_entry = qk_target_entry_new(QkGate_SX);
    for (uint32_t i = 0; i < num_qubits; i++) {
        uint32_t qargs[1] = {
            i,
        };
        double error = 0.8e-6 * (i + 1);
        double duration = 1.8e-9 * (i + 1);
        qk_target_entry_add_property(sx_entry, qargs, 1, duration, error);
    }
    qk_target_add_instruction(target, sx_entry);

    QkTargetEntry *rz_entry = qk_target_entry_new(QkGate_RZ);
    for (uint32_t i = 0; i < num_qubits; i++) {
        uint32_t qargs[1] = {
            i,
        };
        double error = 0.;
        double duration = 0.;
        qk_target_entry_add_property(rz_entry, qargs, 1, duration, error);
    }
    qk_target_add_instruction(target, rz_entry);

    QkTargetEntry *ecr_entry = qk_target_entry_new(QkGate_ECR);
    for (uint32_t i = 0; i < num_qubits - 1; i++) {
        uint32_t qargs[2] = {i, i + 1};
        double inst_error = 0.0090393 * (num_qubits - i);
        double inst_duration = 0.020039;

        qk_target_entry_add_property(ecr_entry, qargs, 2, inst_duration, inst_error);
    }
    qk_target_add_instruction(target, ecr_entry);
    return target;
}

QkCircuit* build_bv(uint32_t num_qubits) {
    QkCircuit *qc = qk_circuit_new(num_qubits, 0);
    uint32_t x_qargs[1] = {
        9,
    };
    qk_circuit_gate(qc, QkGate_X, x_qargs, NULL);
    for (uint32_t i = 0; i < qk_circuit_num_qubits(qc); i++) {
        uint32_t qargs[1] = {
            i,
        };
        qk_circuit_gate(qc, QkGate_H, qargs, NULL);
    }
    for (uint32_t i = 0; i < qk_circuit_num_qubits(qc) - 1; i += 2) {
        uint32_t qargs[2] = {i, num_qubits - 1};
        qk_circuit_gate(qc, QkGate_CX, qargs, NULL);
    }
    return qc;
}

int main() {
    QkCircuit *qc = build_bv(128);
    QkTarget *target = build_line_target(256);
    char *error = NULL;
    QkTranspileOptions options = qk_transpiler_default_options();
    options.seed = 42;
    QkTranspileResult transpile_result = {NULL, NULL};
    int result = qk_transpile(qc, target, &options, &transpile_result, &error);
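    if (result != 0) {
        printf("Transpilation failed with: %s\n", error);
        qk_circuit_free(qc);
        qk_target_free(target);
        return 1;
    }
    QkObs* obs = qk_obs_identity(5);
    uint32_t *layout = malloc(256 * sizeof(*layout));
    char *label = qk_obs_str(obs);
    printf("Observable before applying layout: %s\n", label);
    // Free the first string before the pointer is reused below
    qk_str_free(label);
    qk_transpile_layout_final_layout(transpile_result.layout, true, layout);
    qk_obs_apply_layout(obs, layout, 256);
    label = qk_obs_str(obs);
    printf("\nObservable after applying layout: %s\n", label);
    qk_str_free(label);
    free(layout);
    qk_obs_free(obs);
    qk_circuit_free(transpile_result.circuit);
    qk_transpile_layout_free(transpile_result.layout);
    qk_circuit_free(qc);
    qk_target_free(target);
    return 0;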
}

Observable before applying layout: SparseObservable { num_qubits: 5, coeffs: [Complex { re: 1.0, im: 0.0 }], bit_terms: [], indices: [], boundaries: [0, 0] }

Observable after applying layout: SparseObservable { num_qubits: 256, coeffs: [Complex { re: 1.0, im: 0.0 }], bit_terms: [], indices: [], boundaries: [0, 0] }


## Circuit Execution

While the C API is still under development, the only backends that support execution from C are IBM backends, so we'll use the [qiskit-ibm-runtime-c](https://github.com/Qiskit/qiskit-ibm-runtime-c/) package to execute our circuits. These are still early days for the C API: the client library has many limitations, and the ecosystem of C clients is not yet as robust. Similarly, there isn't yet a simulator built on Qiskit interfaces that exposes a C API.

To run this cell you will need an account with IBM Quantum and a credentials file set up. If you save your credentials with the Python [qiskit-ibm-runtime](https://github.com/Qiskit/qiskit-ibm-runtime) package, it will create the necessary JSON file in the expected location (improving the ergonomics of credentials management is tracked in https://github.com/Qiskit/qiskit-ibm-runtime-c/issues/12).

In [7]:
//%cflags:-lqiskit -lqiskit_ibm_runtime -L/home/computertreker/git/qiskit/qiskit-core/dist/c/lib -I/home/computertreker/git/qiskit/qiskit-core/dist/c/include -L/home/computertreker/git/qiskit/qiskit-ibm-runtime-rs/target/release -I/home/computertreker/git/qiskit/qiskit-ibm-runtime-rs/include

#include <qiskit.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#include <qiskit_ibm_runtime/qiskit_ibm_runtime.h>

int main() {
    // Build a 5 qubit GHZ state
    QkCircuit *qc = qk_circuit_new(5, 5);
    uint32_t h_qargs[1] = {0, };
    qk_circuit_gate(qc, QkGate_H, h_qargs, NULL);

    for (int i = 1; i < 5; i++) {
        uint32_t qubits[2] = {0, i};
        qk_circuit_gate(qc, QkGate_CX, qubits, NULL);
    }
    for (int i = 0; i < 5; i++) {
        qk_circuit_measure(qc, i, i);
    }
    // Access IQP and select backend
    int res = 0;
    Service *service;
    res = qkrt_service_new(&service);
    if (res != 0) {
        printf("service new failed with code: %d\n", res);
        goto cleanup;
    }
    BackendSearchResults *results;
    res = qkrt_backend_search(&results, service);
    if (res != 0) {
        printf("backend search failed with code: %d\n", res);
        goto cleanup_service;
    }
    Backend *least_busy = qkrt_backend_search_results_least_busy(results);
    printf("The least busy is: %s (%s)\n", qkrt_backend_name(least_busy), qkrt_backend_instance_name(least_busy));
    // Transpile circuit for backend
    QkTarget *target = qkrt_get_backend_target(service, least_busy);
    QkTranspileResult transpile_result = {NULL, NULL};
    char *error = NULL;
    QkTranspileOptions options = qk_transpiler_default_options();
    options.seed = 42;
    int result_code = qk_transpile(qc, target, &options, &transpile_result, &error);
    if (result_code != 0) {
        printf("transpilation failed with: %s", error);
        goto cleanup_transpile;
    }
    // Run Job on backend
    int32_t shots = 10000;
    Job *job;
    res = qkrt_sampler_job_run(&job, service, least_busy, transpile_result.circuit, shots, NULL);
    if (res != 0) {
        printf("job submit failed with code: %d\n", res);
        goto cleanup_search;
    }
    printf("job submit successful!\n");
    uint32_t status;
    do {
        sleep(5);
        res = qkrt_job_status(&status, service, job);
        if (res != 0) {
            printf("status poll failed with code: %d\n", res);
            // Release the job and everything acquired before it
            goto cleanup_job;
        }
        printf("current status: %u\n", status);
    } while (status == 0 || status == 1);
    printf("job terminated with status: %u\n", status);
    Samples *samples;
    res = qkrt_sampler_job_results(&samples, service, job);
    if (res != 0) {
        printf("fetching job results failed with code: %d\n", res);
        goto cleanup_job;
    }

    printf("Job has %d samples\nThe first sample is:\n", qkrt_samples_num_samples(samples));
    char *first_sample = qkrt_samples_get_sample(samples, 0);
    printf("%s\n", first_sample);
    qkrt_str_free(first_sample);
    qkrt_samples_free(samples);

    cleanup_job:
    qkrt_job_free(job);
    cleanup_search:
    qkrt_backend_search_results_free(results);
    cleanup_transpile:
    qk_circuit_free(transpile_result.circuit);
    qk_transpile_layout_free(transpile_result.layout);
    cleanup_target:
    qk_target_free(target);
    cleanup_service:
    qkrt_service_free(service);
    cleanup:
    qk_circuit_free(qc);
}

The least busy is: ibm_pittsburgh (performance-premium-us)
job submit successful!
current status: 1
current status: 2
job terminated with status: 2
Job has 10000 samples
The first sample is:
0x0


# Running QAOA from C

As an exercise to run locally, you can use the program below to experiment with running the full QAOA example from the previous section using only C. There are still some limitations in what the C API exposes, such as the C circuit library not containing `qaoa_ansatz()`.

This program requires the following dependencies to be installed. The versions it was tested with are listed:

* qiskit (in standalone C mode) - 2.3.0
* qiskit-ibm-runtime-c (with WIP estimator PR) - https://github.com/Qiskit/qiskit-ibm-runtime-c/pull/11
* nlopt - 2.10.0 - https://github.com/stevengj/nlopt
* uthash - 2.3.0 - https://github.com/troydhanson/uthash


In [8]:
//%cflags:-lqiskit -lqiskit_ibm_runtime -L/home/computertreker/git/qiskit/qiskit-core/dist/c/lib -I/home/computertreker/git/qiskit/qiskit-core/dist/c/include -L/home/computertreker/git/qiskit/qiskit-ibm-runtime-rs/target/release -I/home/computertreker/git/qiskit/qiskit-ibm-runtime-rs/include -lm -lnlopt

// This code is licensed under the Apache License, Version 2.0. You may
// obtain a copy of this license in the LICENSE.txt file in the root directory
// of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.
//
// Any modifications or derivative works of this code must retain this
// copyright notice, and modified files need to carry a notice indicating
// that they have been altered from the originals.

// This program runs the QAOA algorithm on an IBM quantum computer using
// Qiskit to find the max cut of a graph. Right now it only operates on an
// edge list that is fixed in the code. Eventually the goal is to take an
// edge list file as input (likely in rustworkx's CSV format) so that you
// can run max cut for any graph.

// The feature-test macros must be defined before the first #include to take effect
#ifdef __STDC_ALLOC_LIB__
#define __STDC_WANT_LIB_EXT2__ 1
#else
#define _POSIX_C_SOURCE 200809L
#endif

#include <math.h>
#include <nlopt.h>
#include <qiskit.h>
#include <qiskit_ibm_runtime/qiskit_ibm_runtime.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <uthash.h>


char * strdup (const char *s);
QkCircuit *build_qaoa_ansatz(QkObs *hamiltonian, uint32_t reps, const double *beta,
                             double *gamma);
QkObs *edge_list_to_cost_hamiltonian(uint32_t num_nodes, uint32_t num_edges, uint32_t *edge_list);
char *most_frequent_sampler_run(Service *service, Backend *backend, QkCircuit *circuit,
                                int32_t shots);
void optimize_parameter(Service *service, Backend *backend, const QkObs *hamiltonian, uint32_t reps,
                        double *params);

const double PI = 3.14159265358979323846264338327950288;

int main() {
    // Access IQP and select least busy backend
    int res = 0;
    Service *service;
    res = qkrt_service_new(&service);
    if (res != 0) {
        printf("service new failed with code: %d\n", res);
        exit(1);
    }
    BackendSearchResults *results;
    res = qkrt_backend_search(&results, service);
    if (res != 0) {
        printf("backend search failed with code: %d\n", res);
        exit(1);
    }
    Backend *backend = qkrt_backend_search_results_least_busy(results);
    // Construct hamiltonian from edge list
    // Edge list represents a barbell graph: two complete 5-node graphs (the weights)
    // joined by a path of 3 bar nodes. Generated in Python using rustworkx by calling:
    // ```python
    // import rustworkx as rx
    //
    // graph = rx.generators.barbell_graph(5, 3)
    // print(graph.edge_list())
    // ```
    uint32_t edge_list[][2] = {{0, 1},  {0, 2},  {0, 3},  {0, 4},   {1, 2},   {1, 3},
                               {1, 4},  {2, 3},  {2, 4},  {3, 4},   {4, 5},   {5, 6},
                               {6, 7},  {7, 8},  {8, 9},  {8, 10},  {8, 11},  {8, 12},
                               {9, 10}, {9, 11}, {9, 12}, {10, 11}, {10, 12}, {11, 12}};
    QkObs *cost_hamiltonian = edge_list_to_cost_hamiltonian(13, 24, (uint32_t *)edge_list);
    uint32_t reps = 3;
    double *params = malloc(sizeof(double) * 2 * reps);
    // Find the best parameters using the estimator and nlopt
    optimize_parameter(service, backend, cost_hamiltonian, reps, params);
    // Build a final ansatz using the found parameters and transpile it
    printf("Final Params: [");
    for (int i = 0; i < 2 * reps; i++) {
        printf("%f, ", params[i]);
    }
    printf("]\n");
    QkCircuit *ansatz = build_qaoa_ansatz(cost_hamiltonian, reps, &params[reps], params);
    // Setup transpilation
    QkTarget *target = qkrt_get_backend_target(service, backend);
    QkTranspileResult transpile_result = {NULL, NULL};
    QkTranspileOptions options = qk_transpiler_default_options();
    char *transpile_error = NULL;
    options.seed = 42;
    QkExitCode ret = qk_transpile(ansatz, target, &options, &transpile_result, &transpile_error);
    if (ret != 0) {
        printf("Transpilation failed\n");
        exit(2);
    }
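    // Find the most frequent bit string of the final sampler execution
    char *most_frequent =
        most_frequent_sampler_run(service, backend, transpile_result.circuit, 100000);
    printf("Most frequent sample: %s\n", most_frequent);
    // The returned string was allocated with strdup(), so release it with free()
    free(most_frequent);
    qk_circuit_free(transpile_result.circuit);
    qk_transpile_layout_free(transpile_result.layout);
    qkrt_service_free(service);
    qk_target_free(target);
    qkrt_backend_search_results_free(results);
    qk_obs_free(cost_hamiltonian);
    qk_circuit_free(ansatz);
    free(params);
    return 0;
}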

/* Build a QAOA ansatz circuit with a bound parameter value
 *
 * @param hamiltonian The pointer to the sparse observable with the cost hamiltonian for max cut
 * problem
 * @param reps The number of layer repetitions to generate the circuit with
 * @param beta The array of doubles representing the beta parameter values for each layer repetition.
 *     This must at least have `reps` entries
 * @param gamma The array of doubles representing the gamma parameters for each layer repetition. This
 *     must at least have `reps` entries. This array will be mutated in place to apply the scaling
 *     factor of 2. You should copy the array before passing it if you don't want the original array
 *     modified.
 *
 * @return A handle to a `QkCircuit` object containing the ansatz circuit bound with the specified
 * parameters.
 */
QkCircuit *build_qaoa_ansatz(QkObs *hamiltonian, uint32_t reps, const double *beta,
                             double *gamma) {
    uint32_t num_qubits = qk_obs_num_qubits(hamiltonian);
    QkCircuit *qc = qk_circuit_new(num_qubits, qk_obs_num_qubits(hamiltonian));
    for (uint32_t i = 0; i < num_qubits; i++) {
        uint32_t qargs[1] = {
            i,
        };
        qk_circuit_gate(qc, QkGate_H, qargs, NULL);
    }
    QkObsTerm term;
    for (int i = 0; i < reps; i++) {
        gamma[i] *= 2;
    }

    for (uint32_t rep = 0; rep < reps; rep++) {
        for (int32_t i = 0; i < qk_obs_num_terms(hamiltonian); i++) {
            qk_obs_term(hamiltonian, i, &term);
            // TODO: check that terms are actually zz and use the appropriate gate if not
            qk_circuit_gate(qc, QkGate_RZZ, term.indices, &beta[rep]);
        }
        for (uint32_t i = 0; i < num_qubits; i++) {
            qk_circuit_gate(qc, QkGate_RX, &i, &gamma[rep]);
        }
    }
    for (uint32_t i = 0; i < num_qubits; i++) {
        qk_circuit_measure(qc, i, i);
    }
    return qc;
}

/* Construct a cost hamiltonian from a given edge list
 *
 * @param num_nodes The number of nodes in the graph
 * @param num_edges The number of edges in the graph.
 * @param edge_list A pointer to an array holding the edge list for the graph. This must
 *     have at least `2 * num_edges` elements, as it is iterated over pairwise so that every 2
 *     elements are used to determine the edge endpoints in the graph.
 * @return A handle to the `QkObs` for the cost hamiltonian for solving the max cut of the graph
 *     specified by the given edge list
 */
QkObs *edge_list_to_cost_hamiltonian(uint32_t num_nodes, uint32_t num_edges, uint32_t *edge_list) {
    QkObs *obs = qk_obs_zero(num_nodes);
    for (int i = 0; i < 2 * num_edges; i += 2) {
        QkComplex64 coeff = {1, 0};
        QkBitTerm bit_terms[2] = {QkBitTerm_Z, QkBitTerm_Z};
        uint32_t indices[2] = {edge_list[i], edge_list[i + 1]};
        QkObsTerm term = {coeff, 2, bit_terms, indices, num_nodes};
        qk_obs_add_term(obs, &term);
    }
    return obs;
}

struct SampleEntry {
    const char *sample;
    int count;
    UT_hash_handle hh;
};

int by_count(const struct SampleEntry *a, const struct SampleEntry *b) {
    // Reverse order here to make largest element first
    return (b->count - a->count);
}

struct MinimizeContext {
    Service *service;
    Backend *backend;
    const QkObs *obs;
};

unsigned int iteration_count = 0;

// The function to minimize with nlopt_minimize
double minimize(int n, const double *params, double *grad, void *f_data) {
    struct MinimizeContext *context = (struct MinimizeContext *)f_data;
    uint32_t reps = n / 2;
    QkObs *obs_copy = qk_obs_copy(context->obs);
    double *gamma = malloc(sizeof(double) * reps);
    for (uint32_t i=0; i < reps; i++) {
       gamma[i] = params[i];
    }
    QkCircuit *qc = build_qaoa_ansatz(obs_copy, n / 2, &params[reps], gamma);
    QkTarget *target = qkrt_get_backend_target(context->service, context->backend);
    QkTranspileResult transpile_result = {NULL, NULL};
    QkTranspileOptions options = qk_transpiler_default_options();
    char *transpile_error = NULL;
    options.seed = 42;
    QkExitCode ret = qk_transpile(qc, target, &options, &transpile_result, &transpile_error);
    // Check for failure before touching the result, which is only valid on success
    if (ret != 0) {
        printf("transpilation failed\n");
        exit(1);
    }
    uint32_t num_output_qubits = qk_transpile_layout_num_output_qubits(transpile_result.layout);
    // get the layout including the ancillas (hence the ``false`` in the function call)
    uint32_t *layout = malloc(sizeof(uint32_t) * num_output_qubits);
    qk_transpile_layout_final_layout(transpile_result.layout, false, layout);
    qk_obs_apply_layout(obs_copy, layout, num_output_qubits);
    Job *job;
    printf("Submitting executor job for iteration %d\n", iteration_count);
    iteration_count++;
    int res = qkrt_estimator_job_run(&job, context->service, context->backend,
                                     transpile_result.circuit, obs_copy, NULL);
    if (res != 0) {
        printf("job submit failed with code: %d\n", res);
        exit(1);
    }
    printf("job submit successful!\n");
    uint32_t status;
    do {
        sleep(5);
        res = qkrt_job_status(&status, context->service, job);
        if (res != 0) {
            printf("status poll failed with code: %d\n", res);
            // status is not valid after a failed poll, so stop instead of looping on it
            exit(1);
        }
    } while (status == 0 || status == 1);
    if (status != 2) {
        printf("job terminated with error status: %d\n", status);
        exit(1);
    }
    ExpectationValues *evs;
    qkrt_estimator_job_results(&evs, context->service, job);
    double ev = qkrt_expectation_values_get_ev(evs, 0);
    free(gamma);
    free(layout);
    qk_target_free(target);
    qk_circuit_free(qc);
    qk_circuit_free(transpile_result.circuit);
    qk_transpile_layout_free(transpile_result.layout);
    qk_obs_free(obs_copy);
    qkrt_expectation_values_free(evs);
    qkrt_job_free(job);
    return ev;
}

/* Optimize the parameter set for running QAOA
 *
 * @param service A handle to the service to use when running the estimator
 * @param backend A handle to the backend to use when running the estimator
 * @param hamiltonian The cost hamiltonian representing the max cut problem
 * @param reps The number of layer repetitions to use in the ansatz circuit
 * @param params A pointer to the parameter array, which must have `2 * reps` entries.
 *      On return it holds the parameters found to minimize the cost for the ansatz.
 */
void optimize_parameter(Service *service, Backend *backend, const QkObs *hamiltonian, uint32_t reps,
                        double *params) {
    struct MinimizeContext context = {
        service,
        backend,
        hamiltonian,
    };
    for (uint32_t i = 0; i < reps; i++) {
        params[i + reps] = PI / 2;
        params[i] = PI;
    }
    double *lb = malloc(sizeof(double) * 2 * reps);
    double *ub = malloc(sizeof(double) * 2 * reps);
    for (int i = 0; i < 2 * reps; i++) {
        lb[i] = -2 * PI;
        ub[i] = 2 * PI;
    }
    double best_ev;
    nlopt_minimize(NLOPT_LN_COBYLA, reps * 2, minimize, &context, lb, ub, params, &best_ev,
                   -INFINITY, -1, 0.001, -1, NULL, 30, -1);
    free(lb);
    free(ub);
}

/* Return the sample with the highest occurrence rate when running a circuit through the sampler
 *
 * @param service A handle to the service to use when running the sampler
 * @param backend A handle to the backend to run the sampler on
 * @param circuit A pointer to the already transpiled circuit to run through the sampler
 * @param shots The number of shots (i.e. the number of samples to collect)
 *
 * @return A newly allocated string holding the sample with the highest frequency in the
 *     output. The caller must release it with `free()`.
 */
char *most_frequent_sampler_run(Service *service, Backend *backend, QkCircuit *circuit,
                                int32_t shots) {
    // Run circuit through sampler
    Job *job;
    int res = qkrt_sampler_job_run(&job, service, backend, circuit, shots, NULL);
    if (res != 0) {
        printf("Job submit failed with code %d\n", res);
        exit(1);
    }
    uint32_t status;
    do {
        sleep(5);
        res = qkrt_job_status(&status, service, job);
        if (res != 0) {
            printf("Status poll failed with code %d\n", res);
            exit(1);
        }

    } while (status == 0 || status == 1);
    if (status != 2) {
        printf("Job status failed with: %d\n", status);
        exit(1);
    }
    Samples *samples;
    res = qkrt_sampler_job_results(&samples, service, job);
    if (res != 0) {
        printf("Getting sampler job results failed with code %d\n", res);
        exit(1);
    }
    struct SampleEntry *s, *tmp, *sample_entries = NULL;
    for (int i = 0; i < qkrt_samples_num_samples(samples); i++) {
        char *sample = qkrt_samples_get_sample(samples, i);
        HASH_FIND_STR(sample_entries, sample, s);
        if (s) {
            s->count += 1;
            // The table already holds an identical string, so release this copy
            qkrt_str_free(sample);
            continue;
        }
        s = (struct SampleEntry *)malloc(sizeof *s);
        s->sample = sample;
        s->count = 1;
        HASH_ADD_KEYPTR(hh, sample_entries, s->sample, strlen(sample), s);
    }
    HASH_SORT(sample_entries, by_count);
    char *out_sample = strdup(sample_entries->sample);
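    HASH_ITER(hh, sample_entries, s, tmp) {
        HASH_DEL(sample_entries, s);
        // Each entry owns the sample string it was created with
        qkrt_str_free((char *)s->sample);
        free(s);
    }
    qkrt_samples_free(samples);
    qkrt_job_free(job);
    return out_sample;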
}

/tmp/tmp1ny6ya6h.c: In function ‘optimize_parameter’:
  286 |     nlopt_minimize(NLOPT_LN_COBYLA, reps * 2, minimize, &context, lb, ub, params, &best_ev,
      |     ^~~~~~~~~~~~~~
In file included from /tmp/tmp1ny6ya6h.c:24:
/usr/include/nlopt.h:320:28: note: declared here
  320 | NLOPT_EXTERN(nlopt_result) nlopt_minimize(nlopt_algorithm algorithm, int n, nlopt_func_old f, void *f_data,
      |                            ^~~~~~~~~~~~~~


Submitting executor job for iteration 0
job submit successful!
Submitting executor job for iteration 1
job submit successful!
Submitting executor job for iteration 2
job submit successful!
Submitting executor job for iteration 3
job submit successful!
Submitting executor job for iteration 4
job submit successful!
Submitting executor job for iteration 5
job submit successful!
Submitting executor job for iteration 6
job submit successful!
Submitting executor job for iteration 7
job submit successful!
Submitting executor job for iteration 8
job submit successful!
Submitting executor job for iteration 9
job submit successful!
Submitting executor job for iteration 10
job submit successful!
Submitting executor job for iteration 11
job submit successful!
Submitting executor job for iteration 12
job submit successful!
Submitting executor job for iteration 13
job submit successful!
Submitting executor job for iteration 14
job submit successful!
Submitting executor job for iteration 15
job submi