# Formatting JSON results directly as a LaTeX table

This makes it easier to compare data and to copy results directly into a paper.

## Load the corresponding JSON file from the experiment
Each experiment file that uses the 'save' interface generates a JSON file. These files are loaded here, parsed, and concatenated.

_**Warning**_: 
Mockturtle stores the results of different versions of the same experiment in one JSON file, so clear the file before running the experiment.

TODO: 
- [x] Change spaces (in column names) and '_' (in case names) to '\_' for later LaTeX rendering.
- [ ] Automatically pick the latest version of the result entries.
- [ ] Make the functions more general instead of hard-coding the names.
- [ ] Try a version with multilevel column names.

In [51]:
import pandas as pd
import json
import os
home_path = os.path.expanduser('~')

with open(home_path + '/Downloads/Github/mockturtle/experiments/reader_loss.json') as f:
    dataOri = json.load(f)
# Use the first record stored in the file
dfOri = pd.DataFrame(dataOri[0]['entries'])
selected_columns = ['benchmark', 'nodes num before', 'num back to aig']
# .copy() gives an independent DataFrame, which avoids SettingWithCopyWarning below
dfOri_filtered = dfOri[selected_columns].copy()
# Escape for LaTeX: raw strings keep the backslash, regex=False replaces literally
dfOri_filtered.columns = dfOri_filtered.columns.str.replace(' ', r'\_', regex=False)
dfOri_filtered['benchmark'] = dfOri_filtered['benchmark'].str.replace('_', r'\_', regex=False)
print(dfOri_filtered)

     benchmark  nodes\_num\_before  num\_back\_to\_aig
0        adder                1020                1631
1          bar                3336                5114
2          div               57247               50489
3         log2               32060               40526
4          max                2865                3691
5   multiplier               27062               51787
6          sin                5416                7625
7         sqrt               24618               21285
8       square               18484               25341
9      arbiter               11839               12158
10       cavlc                 693                 987
11        ctrl                 174                 158
12         dec                 304                 508
13         i2c                1342                1736
14   int2float                 260                 310
15   mem\_ctrl               46836               54773
16    priority                 978                 922
17      ro


In [52]:
with open(home_path + '/Downloads/Github/mockturtle/experiments/reader_partition_data.json') as f:
    dataCur = json.load(f)
# Use the first record stored in the file
dfCur = pd.DataFrame(dataCur[0]['entries'])
selected_columns = ['benchmark', 'nodes num before', 'num back to aig']
# .copy() gives an independent DataFrame, which avoids SettingWithCopyWarning below
dfCur_filtered = dfCur[selected_columns].copy()
# Escape for LaTeX: raw strings keep the backslash, regex=False replaces literally
dfCur_filtered.columns = dfCur_filtered.columns.str.replace(' ', r'\_', regex=False)
dfCur_filtered['benchmark'] = dfCur_filtered['benchmark'].str.replace('_', r'\_', regex=False)
print(dfCur_filtered)

     benchmark  nodes\_num\_before  num\_back\_to\_aig
0        adder                1020                1631
1          bar                3336                4796
2          div               57247               79779
3         log2               32060               62536
4          max                2865                4576
5   multiplier               27062               69042
6          sin                5416                9471
7         sqrt               24618               35288
8       square               18484               33333
9      arbiter               11839               12452
10       cavlc                 693                 943
11        ctrl                 174                 154
12         dec                 304                 515
13         i2c                1342                1741
14   int2float                 260                 314
15   mem\_ctrl               46836               54825
16    priority                 978                1272
17      ro


## Concatenate the tables
Columns with the same name are placed side by side for an overview.

In [53]:
# Step 1: Add suffixes to distinguish the columns of the two tables
df1_suffixed = dfOri_filtered.add_suffix(r'\_df1')
df2_suffixed = dfCur_filtered.add_suffix(r'\_df2')

# Step 2: Concatenate side by side (rows are aligned by index, not by benchmark name)
combined_df = pd.concat([df1_suffixed, df2_suffixed], axis=1)

# Step 3: Sort columns to group matching fields
# Strip the trailing '\_df1' / '\_df2' suffix to get the base column name
columns_sorted = sorted(
    combined_df.columns,
    key=lambda x: x.rsplit(r'\_', 1)[0]  # Sort by base name (e.g., 'nodes\_num\_before')
)

# Reorder DataFrame columns
combined_df = combined_df[columns_sorted]
print(combined_df)

   benchmark\_df1 benchmark\_df2  nodes\_num\_before\_df1  \
0           adder          adder                     1020   
1             bar            bar                     3336   
2             div            div                    57247   
3            log2           log2                    32060   
4             max            max                     2865   
5      multiplier     multiplier                    27062   
6             sin            sin                     5416   
7            sqrt           sqrt                    24618   
8          square         square                    18484   
9         arbiter        arbiter                    11839   
10          cavlc          cavlc                      693   
11           ctrl           ctrl                      174   
12            dec            dec                      304   
13            i2c            i2c                     1342   
14      int2float      int2float                      260   
15      mem\_ctrl      m
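
As an alternative to suffixes, the side-by-side layout can also be expressed with two-level column names, which is one possible take on the multilevel TODO item. The sketch below uses small stand-in frames with values copied from the tables above; the labels `df1` and `df2` mirror the suffixes used in this notebook, and `df_ori` / `df_cur` are hypothetical names.

```python
import pandas as pd

# Toy stand-ins for dfOri_filtered and dfCur_filtered (first two rows only).
df_ori = pd.DataFrame({'benchmark': ['adder', 'bar'], 'num back to aig': [1631, 5114]})
df_cur = pd.DataFrame({'benchmark': ['adder', 'bar'], 'num back to aig': [1631, 4796]})

# Index by benchmark so rows are aligned by name rather than by position,
# and let `keys` add the run label as an outer column level.
wide = pd.concat(
    [df_ori.set_index('benchmark'), df_cur.set_index('benchmark')],
    axis=1,
    keys=['df1', 'df2'],
)

# Move the metric name to the top level and group the runs underneath it.
wide = wide.swaplevel(axis=1).sort_index(axis=1)
print(wide)
```

Because the frames are indexed by benchmark, this layout does not depend on both files listing the cases in the same order.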

Checking before merging is necessary: sometimes the experiment is run on only part of the cases, or in a different order, and the check reveals this. If the two copies of a column such as **benchmark** or **nodes_num_before** are identical, they are merged into one column automatically.

In [54]:
# (column from table 1, column from table 2, name of the merged column)
column_pairs = [
    (r'benchmark\_df1', r'benchmark\_df2', 'benchmark'),
    (r'nodes\_num\_before\_df1', r'nodes\_num\_before\_df2', r'nodes\_num\_before')
]

for col1, col2, new_name in column_pairs:
    if col1 in combined_df.columns and col2 in combined_df.columns:  # Check if columns exist
        if combined_df[col1].equals(combined_df[col2]):
            print(f"Merging '{col1}' and '{col2}' into '{new_name}'.")
            merged_col = combined_df[col1]
            combined_df = combined_df.drop([col1, col2], axis=1)
            # insert(0, ...) prepends, so the pair merged last ends up leftmost
            combined_df.insert(0, new_name, merged_col)
        else:
            print(f"'{col1}' and '{col2}' are different. Keeping both.")
    else:
        print(f"Columns '{col1}' or '{col2}' not found. Skipping.")
print("\nFinal DataFrame:")
print(combined_df)

Merging 'benchmark\_df1' and 'benchmark\_df2' into 'benchmark'.
Merging 'nodes\_num\_before\_df1' and 'nodes\_num\_before\_df2' into 'nodes\_num\_before'.

Final DataFrame:
    nodes\_num\_before   benchmark  num\_back\_to\_aig\_df1  \
0                 1020       adder                     1631   
1                 3336         bar                     5114   
2                57247         div                    50489   
3                32060        log2                    40526   
4                 2865         max                     3691   
5                27062  multiplier                    51787   
6                 5416         sin                     7625   
7                24618        sqrt                    21285   
8                18484      square                    25341   
9                11839     arbiter                    12158   
10                 693       cavlc                      987   
11                 174        ctrl                      158   
12      
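
When the two runs cover different cases or list them in a different order, the positional comparison above keeps both columns but does not say which rows differ. A key-based merge is one way to get that hint. The following is a sketch on toy data, not the notebook's method: the second frame is a hypothetical partial run, and the column names are left unescaped for brevity.

```python
import pandas as pd

# First three rows of the original run (values from the tables above).
df1 = pd.DataFrame({'benchmark': ['adder', 'bar', 'div'],
                    'num_back_to_aig': [1631, 5114, 50489]})
# Hypothetical partial run: different order, and 'div' was not executed.
df2 = pd.DataFrame({'benchmark': ['bar', 'adder'],
                    'num_back_to_aig': [4796, 1631]})

merged = pd.merge(
    df1, df2,
    on='benchmark',              # align rows by case name, not by position
    how='outer',                 # keep cases that appear in only one run
    suffixes=('_df1', '_df2'),
    indicator=True,              # adds a '_merge' column: both / left_only / right_only
    validate='one_to_one',       # raise if a benchmark is listed twice in a run
)

# Cases that are missing from one of the two runs.
missing = merged.loc[merged['_merge'] != 'both', 'benchmark'].tolist()
print(missing)
```

Cases listed in `missing` have NaN in the columns of the run that skipped them, so they are easy to filter out or mark before the table is exported.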

In [55]:
def swap_columns(df, col1, col2):
    columns = df.columns.tolist()
    idx1, idx2 = columns.index(col1), columns.index(col2)
    columns[idx1], columns[idx2] = columns[idx2], columns[idx1]
    return df[columns]

# Swap the two merged columns so that 'benchmark' comes first
df_final = swap_columns(combined_df, 'benchmark', r'nodes\_num\_before')
print(df_final)

     benchmark  nodes\_num\_before  num\_back\_to\_aig\_df1  \
0        adder                1020                     1631   
1          bar                3336                     5114   
2          div               57247                    50489   
3         log2               32060                    40526   
4          max                2865                     3691   
5   multiplier               27062                    51787   
6          sin                5416                     7625   
7         sqrt               24618                    21285   
8       square               18484                    25341   
9      arbiter               11839                    12158   
10       cavlc                 693                      987   
11        ctrl                 174                      158   
12         dec                 304                      508   
13         i2c                1342                     1736   
14   int2float                 260                     

In [56]:
latex_table = df_final.to_latex(index=False)  # Remove index column
print(latex_table)

\begin{tabular}{lrrr}
\toprule
benchmark & nodes\_num\_before & num\_back\_to\_aig\_df1 & num\_back\_to\_aig\_df2 \\
\midrule
adder & 1020 & 1631 & 1631 \\
bar & 3336 & 5114 & 4796 \\
div & 57247 & 50489 & 79779 \\
log2 & 32060 & 40526 & 62536 \\
max & 2865 & 3691 & 4576 \\
multiplier & 27062 & 51787 & 69042 \\
sin & 5416 & 7625 & 9471 \\
sqrt & 24618 & 21285 & 35288 \\
square & 18484 & 25341 & 33333 \\
arbiter & 11839 & 12158 & 12452 \\
cavlc & 693 & 987 & 943 \\
ctrl & 174 & 158 & 154 \\
dec & 304 & 508 & 515 \\
i2c & 1342 & 1736 & 1741 \\
int2float & 260 & 310 & 314 \\
mem\_ctrl & 46836 & 54773 & 54825 \\
priority & 978 & 922 & 1272 \\
router & 257 & 273 & 278 \\
voter & 13758 & 21357 & 20846 \\
\bottomrule
\end{tabular}



## Color the table
Coloring the cells makes the results much easier to compare.

In [57]:
def highlight_larger(row):
    """
    Return the LaTeX color prefixes for the two num_back_to_aig cells of a row.
    The smaller value is highlighted: green if it is in df2, red if it is in df1.
    Equal values get no color.
    """
    a = row[r'num\_back\_to\_aig\_df1']
    b = row[r'num\_back\_to\_aig\_df2']

    if a > b:
        return ['', r'\cellcolor{green!25}']
    elif b > a:
        return [r'\cellcolor{red!25}', '']
    else:
        return ['', '']

compare_cols = [r'num\_back\_to\_aig\_df1', r'num\_back\_to\_aig\_df2']

# Apply the highlighting
formatted_cells = df_final.apply(highlight_larger, axis=1, result_type='expand')
formatted_cells.columns = compare_cols

# Combine with original values (this turns the two columns into strings)
for col in compare_cols:
    df_final[col] = formatted_cells[col] + df_final[col].astype(str)

# Generate LaTeX
latex_code = df_final.to_latex(
    index=False,
    escape=False,  # Important: allows LaTeX commands (\cellcolor needs \usepackage[table]{xcolor})
    column_format='cccc'  # Adjust as needed
)

print(latex_code)

\begin{tabular}{cccc}
\toprule
benchmark & nodes\_num\_before & num\_back\_to\_aig\_df1 & num\_back\_to\_aig\_df2 \\
\midrule
adder & 1020 & 1631 & 1631 \\
bar & 3336 & 5114 & \cellcolor{green!25}4796 \\
div & 57247 & \cellcolor{red!25}50489 & 79779 \\
log2 & 32060 & \cellcolor{red!25}40526 & 62536 \\
max & 2865 & \cellcolor{red!25}3691 & 4576 \\
multiplier & 27062 & \cellcolor{red!25}51787 & 69042 \\
sin & 5416 & \cellcolor{red!25}7625 & 9471 \\
sqrt & 24618 & \cellcolor{red!25}21285 & 35288 \\
square & 18484 & \cellcolor{red!25}25341 & 33333 \\
arbiter & 11839 & \cellcolor{red!25}12158 & 12452 \\
cavlc & 693 & 987 & \cellcolor{green!25}943 \\
ctrl & 174 & 158 & \cellcolor{green!25}154 \\
dec & 304 & \cellcolor{red!25}508 & 515 \\
i2c & 1342 & \cellcolor{red!25}1736 & 1741 \\
int2float & 260 & \cellcolor{red!25}310 & 314 \\
mem\_ctrl & 46836 & \cellcolor{red!25}54773 & 54825 \\
priority & 978 & \cellcolor{red!25}922 & 1272 \\
router & 257 & \cellcolor{red!25}273 & 278 \\
voter & 13758