# Large-scale Constrained Binary Optimization with Choco-Q

**Authors:** Debin Xiang & Qifan Jiang

**Date:** 02/02/2025

Based on paper "[Choco-Q: Commute Hamiltonian-based QAOA for Constrained Binary Optimization][1]" (Accepted by HPCA 2025)

[1]: https://ieeexplore.ieee.org/document/TBD

This notebook reproduces Table I of the original paper, with the Choco-Q* part omitted.

For the large-scale program, refer to `../data/chocoq_examples/evaluate.py`. Running all scales from the paper takes about one day on our 128-thread CPU, so we recommend running it in the background.
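One way to start a long evaluation in the background from Python is sketched below. This is a minimal sketch, not part of the Choco-Q codebase: the helper name `launch_in_background` and the log file name are hypothetical, and it assumes the script runs without command-line arguments, which the source does not state.

```python
import subprocess
import sys


def launch_in_background(script_path, log_path):
    """Start `script_path` with the current interpreter, detached from this session.

    Hypothetical helper: stdout and stderr of the child go to `log_path`.
    """
    with open(log_path, "w") as log:
        proc = subprocess.Popen(
            [sys.executable, script_path],
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX: child gets its own session
        )
    # The child keeps its own copy of the log file descriptor,
    # so closing ours here does not interrupt its output.
    return proc


# Example call (path taken from the text above, log name is an assumption):
# proc = launch_in_background("../data/chocoq_examples/evaluate.py", "evaluate.log")
```

The returned `Popen` object can be polled later with `proc.poll()` to check whether the run has finished.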

In [1]:
import os
os.chdir("..")
import sys
sys.path.append('..')
import logging
logging.basicConfig(level=logging.WARN)

import pandas as pd
pd.set_option('display.max_rows', None)  # display all rows
pd.set_option('display.max_columns', None)  # display all columns.

In [2]:
file_path = "./data/chocoq_examples/large_scale"

df1 = pd.read_csv(f"{file_path}/evaluate_depth.csv", encoding='utf-8')

grouped_df1 = df1.groupby(['pkid', 'layers', 'method'], as_index=False).agg({
    "culled_depth": 'mean',
})

pivot_df1 = grouped_df1.pivot(index=['pkid'], columns='method', values=["culled_depth"])

method_order1 = ['PenaltySolver', 'CyclicSolver', 'HeaSolver', 'ChocoSolver']
pivot_df1 = pivot_df1.reindex(columns=pd.MultiIndex.from_product([["culled_depth"], method_order1]))

df2 = pd.read_csv(f"{file_path}/evaluate_other.csv")
df2 = df2.drop(columns=['pbid','ARG'])

## The detailed data of Table I

In [3]:

# Columns that must be numeric before aggregation; unparsable entries become NaN
numeric_cols = ['best_solution_probs', 'in_constraints_probs', 'iteration_count',
                'classcial', 'quantum', 'run_times']
df2[numeric_cols] = df2[numeric_cols].apply(pd.to_numeric, errors='coerce')
grouped_df2 = df2.groupby(['pkid', 'layers', 'variables', 'constraints', 'method'], as_index=False).agg({
    # "ARG": 'mean',
    'in_constraints_probs': 'mean',
    'best_solution_probs': 'mean',
    'iteration_count': 'mean',
    'classcial': 'mean',
    'run_times': 'mean',
})

pivot_df2 = grouped_df2.pivot(index=['pkid', 'variables', 'constraints'], columns='method', values=["best_solution_probs", 'in_constraints_probs'])

method_order2 = ['PenaltySolver', 'CyclicSolver', 'HeaSolver', 'ChocoSolver']
pivot_df2 = pivot_df2.reindex(columns=pd.MultiIndex.from_product([["best_solution_probs", 'in_constraints_probs'], method_order2]))

merged_df = pd.merge(pivot_df1, pivot_df2, on='pkid', how='inner')

merged_df = merged_df[['culled_depth', 'best_solution_probs', 'in_constraints_probs']]
merged_df = merged_df.rename(columns={
    'culled_depth': 'Circuit depth',
    'best_solution_probs': 'Success rate (%)',
    'in_constraints_probs': 'In-constraints rate (%)'
})

merged_df = merged_df.rename(columns={
    'PenaltySolver': 'Penalty',
    'CyclicSolver': 'Cyclic',
    'HeaSolver': 'HEA',
    'ChocoSolver': 'Choco-Q'
})

merged_df.index = ['F1', 'F2', 'F3', 'F4', 'G1', 'G2', 'G3', 'G4', 'K1', 'K2', 'K3', 'K4']

merged_df


| Benchmark | Circuit depth: Penalty | Circuit depth: Cyclic | Circuit depth: HEA | Circuit depth: Choco-Q | Success rate (%): Penalty | Success rate (%): Cyclic | Success rate (%): HEA | Success rate (%): Choco-Q | In-constraints rate (%): Penalty | In-constraints rate (%): Cyclic | In-constraints rate (%): HEA | In-constraints rate (%): Choco-Q |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 | 40.0 | 65.0 | 30.0 | 44.0 | 4.423828 | 6.5625 | 4.121094 | 28.183594 | 14.052734 | 37.851562 | 17.607422 | 100.0 |
| F2 | 64.0 | 94.0 | 75.0 | 172.0 | 0.087891 | 0.166016 | 0.0 | 50.566406 | 0.371094 | 3.603516 | 0.205078 | 100.0 |
| F3 | 68.0 | 98.0 | 105.0 | 233.0 | 0.0 | 0.019531 | 0.0 | 23.330078 | 0.019531 | 1.894531 | 0.0 | 100.0 |
| F4 | 80.0 | 118.0 | 140.0 | 337.0 | 0.0 | 0.0 | 0.0 | 35.439453 | 0.009766 | 0.400391 | 0.0 | 100.0 |
| G1 | 135.2 | 200.0 | 60.0 | 167.3 | 0.136719 | 2.275391 | 0.107422 | 35.009766 | 0.625 | 38.056641 | 0.585938 | 100.0 |
| G2 | 158.0 | 248.0 | 75.0 | 145.2 | 0.019531 | 0.0 | 0.039062 | 27.109375 | 0.068359 | 39.570312 | 0.185547 | 100.0 |
| G3 | 472.2 | 563.4 | 120.0 | 337.8 | 0.0 | 0.0 | 0.0 | 0.332031 | 0.0 | 59.21875 | 0.0 | 100.0 |
| G4 | 492.8 | 611.8 | 140.0 | 305.7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 69.003906 | 0.0 | 99.980469 |
| K1 | 82.8 | 140.2 | 40.0 | 114.6 | 1.015625 | 8.740234 | 1.494141 | 36.728516 | 3.90625 | 30.488281 | 3.710938 | 100.0 |
| K2 | 142.6 | 204.2 | 90.0 | 384.9 | 0.0 | 0.009766 | 0.0 | 6.181641 | 0.039062 | 50.361328 | 0.039062 | 100.0 |


The results in this table do not exactly match those in Table I because the benchmark configurations are randomly generated. To save time, we did not reproduce Choco-Q* from Table I. Nevertheless, Choco-Q shows a clear advantage over the other baselines in success rate and in-constraints rate. In the following, we calculate the improvement over the state-of-the-art baseline, Cyclic.

## Calculate the improvement over Cyclic

In [4]:
import pandas as pd

# Assuming 'merged_df' already contains the necessary data (after the previous steps)

# Calculate the improvement for each row

# Circuit depth improvement: cyclic / Choco-Q
merged_df['Circuit_depth_improvement'] = merged_df[('Circuit depth', 'Cyclic')] / merged_df[('Circuit depth', 'Choco-Q')]

# Success rate improvement: Choco-Q / cyclic
merged_df['Success_rate_improvement'] = merged_df[('Success rate (%)', 'Choco-Q')] / merged_df[('Success rate (%)', 'Cyclic')]

# In-constraints rate improvement: Choco-Q / cyclic
merged_df['In_constraints_rate_improvement'] = merged_df[('In-constraints rate (%)', 'Choco-Q')] / merged_df[('In-constraints rate (%)', 'Cyclic')]

# Keep only rows where every value entering a ratio is non-zero, so that inf/NaN ratios
# (zero denominator) and zero ratios (zero numerator) are excluded from the averages
valid_rows = merged_df[(merged_df[('Circuit depth', 'Cyclic')] != 0) & (merged_df[('Circuit depth', 'Choco-Q')] != 0) &
                       (merged_df[('Success rate (%)', 'Cyclic')] != 0) & (merged_df[('Success rate (%)', 'Choco-Q')] != 0) &
                       (merged_df[('In-constraints rate (%)', 'Cyclic')] != 0) & (merged_df[('In-constraints rate (%)', 'Choco-Q')] != 0)]

# Calculate the average improvement for each metric
avg_circuit_depth_improvement = valid_rows['Circuit_depth_improvement'].mean()
avg_success_rate_improvement = valid_rows['Success_rate_improvement'].mean()
avg_in_constraints_rate_improvement = valid_rows['In_constraints_rate_improvement'].mean()

improvement_table = pd.DataFrame({
    'Circuit Depth': [avg_circuit_depth_improvement],
    'Success Rate': [avg_success_rate_improvement],
    'In-constraints Rate': [avg_in_constraints_rate_improvement]
}, index=['Improvement relative to Cyclic'])

improvement_table


| | Circuit Depth | Success Rate | In-constraints Rate |
|---|---|---|---|
| Improvement relative to Cyclic | 0.898959 | 359.328563 | 15.178224 |
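As a sanity check, the averages can be recomputed by hand from the six benchmarks in the detailed table that pass the non-zero filter (F1, F2, F3, G1, K1, K2). The sketch below uses plain Python and values copied from the displayed table, so it matches the reported numbers only up to display rounding. The names `rows` and `average_improvement` are illustrative and not part of the Choco-Q codebase.

```python
from statistics import mean

# (depth_cyclic, depth_choco, success_cyclic, success_choco, incons_cyclic, incons_choco)
# Values copied from the detailed table; only rows where both Cyclic and Choco-Q
# have non-zero entries in every metric are kept.
rows = {
    "F1": (65.0, 44.0, 6.5625, 28.183594, 37.851562, 100.0),
    "F2": (94.0, 172.0, 0.166016, 50.566406, 3.603516, 100.0),
    "F3": (98.0, 233.0, 0.019531, 23.330078, 1.894531, 100.0),
    "G1": (200.0, 167.3, 2.275391, 35.009766, 38.056641, 100.0),
    "K1": (140.2, 114.6, 8.740234, 36.728516, 30.488281, 100.0),
    "K2": (204.2, 384.9, 0.009766, 6.181641, 50.361328, 100.0),
}


def average_improvement(table):
    """Average the per-benchmark ratios, using the same definitions as the cell above."""
    return {
        # Depth: Cyclic / Choco-Q, so values above 1 mean Choco-Q is shallower
        "Circuit Depth": mean(r[0] / r[1] for r in table.values()),
        # Rates: Choco-Q / Cyclic, so values above 1 mean Choco-Q is better
        "Success Rate": mean(r[3] / r[2] for r in table.values()),
        "In-constraints Rate": mean(r[5] / r[4] for r in table.values()),
    }


avg = average_improvement(rows)
for name, value in avg.items():
    print(f"{name}: {value:.3f}")
```

Read this way, a circuit-depth ratio below 1 means that Choco-Q circuits are on average deeper than Cyclic circuits on these benchmarks, while the success and in-constraints ratios are well above 1. Because the success-rate average is dominated by benchmarks where Cyclic is close to zero (F3, K2), it is sensitive to which rows survive the filter.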
