*Copyright (c) Meta Platforms, Inc. and affiliates. This source code is licensed under the license found in the LICENSE file in the root directory of this source tree.*

This notebook contains the code for generating the "box-over-box" data presented in Figure 1 of our **Part 2.2** paper (https://arxiv.org/pdf/2408.16293v1). It produces a math problem set that is even simpler than iGSM: solving it still requires a topological sort, but the surrounding English has been stripped to a minimum. As our Part 2.2 paper shows, even GPT-4o can fail on such data and cannot correct its own mistakes.
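To make the role of the topological sort concrete, here is a minimal sketch of how one such problem could be solved once the statements are known. The toy instance and the names `own_weight`, `contains`, and `total_weights` are hypothetical and not taken from the paper; the sketch assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical toy instance (not from the paper):
# box A contains B and C; box B contains C.
own_weight = {"A": 2, "B": 3, "C": 5}
contains = {"A": ["B", "C"], "B": ["C"], "C": []}

def total_weights(own_weight, contains):
    """Return {box: total weight}, where total = own weight + totals of inner boxes."""
    # TopologicalSorter expects {node: predecessors}. Passing the contained boxes
    # as "predecessors" yields every inner box before the boxes that hold it.
    order = TopologicalSorter(contains).static_order()
    total = {}
    for box in order:
        total[box] = own_weight[box] + sum(total[inner] for inner in contains[box])
    return total

totals = total_weights(own_weight, contains)
print(totals["A"])  # prints 15: 2 + (3 + 5) + 5
```

Box C is counted once for each box that directly contains it, which matches how `node_total` is accumulated in the generator below.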

In [1]:
import random
import string
import networkx as nx

def generate_random_dag(n):
    # Create an empty directed graph
    dag = nx.DiGraph()
    
    # Add nodes to the graph
    dag.add_nodes_from(range(n))
    
    # Iterate over each node
    for node in range(n):
        # Generate a list of possible nodes it can connect to
        possible_targets = range(node + 1, n)
        
        # Randomly select between 1 and 4 targets (capped by how many remain);
        # the last node has no possible targets and gets no outgoing edges
        mmax = min(4, len(possible_targets))
        if mmax == 0:
            targets = []
        else:
            targets = random.sample(possible_targets, random.randint(1, mmax))
        
        # Add edges from the current node to the selected targets
        dag.add_edges_from([(node, target) for target in targets])
    
    return dag


In [2]:

# Parameters
N = 26  # Number of nodes

# Generate a random DAG
dag = generate_random_dag(N)

#node_names = random.sample(string.ascii_letters, N)
node_names = random.sample(string.ascii_uppercase, N)
node_values = [random.randint(0, 9) for _ in range(N)]

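# Total weight of a box = its own weight + the total weights of the boxes directly inside it.
# Edges only go from lower to higher indices, so visiting nodes in reverse index order
# guarantees that every successor's total is final before it is used.
node_total = [0]*N
for node in reversed(list(dag.nodes())):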
    successors = list(dag.successors(node))
    node_total[node] = node_values[node]
    for suc in successors:
        node_total[node] += node_total[suc]

all_s = []
for node in reversed(list(dag.nodes())):
    successors = list(dag.successors(node))
    all_s += [f"Each box {node_names[node]} weights {node_values[node]} pounds on its own. "]
    for suc in successors:
        all_s += [f"Each box {node_names[node]} has a box {node_names[suc]} inside it. "]
random.shuffle(all_s)
print()
print("".join(all_s))
print()
print(f"What is the total weight of box {node_names[0]}?")
print(f"Answer = {node_total[0]}")

# Debug dump: one line per node as index/name/total, followed by its successors
for node in dag.nodes():
    successors = list(dag.successors(node))
    print(f"Node {node}/{node_names[node]}/{node_total[node]}: ", end='')
    for suc in successors:
        print(f"{suc}/{node_names[suc]}, ", end='')
    print()

print(node_total)


Each box I has a box N inside it. Each box N has a box H inside it. Each box J has a box X inside it. Each box Y has a box N inside it. Each box T has a box B inside it. Each box K has a box P inside it. Each box Z weights 7 pounds on its own. Each box D weights 6 pounds on its own. Each box Q has a box J inside it. Each box X weights 8 pounds on its own. Each box E has a box V inside it. Each box G has a box M inside it. Each box K has a box F inside it. Each box R has a box K inside it. Each box H weights 7 pounds on its own. Each box Y weights 9 pounds on its own. Each box F has a box H inside it. Each box V has a box F inside it. Each box E weights 3 pounds on its own. Each box R has a box U inside it. Each box X has a box F inside it. Each box I has a box P inside it. Each box W has a box C inside it. Each box M has a box C inside it. Each box Z has a box X inside it. Each box I weights 2 pounds on its own. Each box A has a box N inside it. Each box O weights 5 pounds on its own.
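
The printed statements can also be parsed back into a graph to double-check an answer independently of the generator. The sketch below assumes the two sentence templates exactly as emitted by the cell above (including the word "weights") and that every box has a weight statement. The toy `text` and the helper names `parse` and `total_weight` are hypothetical.

```python
import re
from functools import lru_cache

# Hypothetical toy problem written with the same two sentence templates
text = ("Each box B weights 3 pounds on its own. "
        "Each box A has a box B inside it. "
        "Each box A weights 2 pounds on its own. "
        "Each box C weights 5 pounds on its own. "
        "Each box B has a box C inside it. "
        "Each box A has a box C inside it. ")

def parse(text):
    """Extract ({box: own weight}, {box: [boxes directly inside it]}) from the statements."""
    weights = {name: int(w) for name, w in
               re.findall(r"Each box (\w+) weights (\d+) pounds on its own\.", text)}
    inside = {name: [] for name in weights}
    for outer, inner in re.findall(r"Each box (\w+) has a box (\w+) inside it\.", text):
        inside.setdefault(outer, []).append(inner)
    return weights, inside

def total_weight(box, weights, inside):
    """Own weight plus the total weight of every box directly inside, computed recursively."""
    @lru_cache(maxsize=None)  # memoize so shared inner boxes are evaluated only once
    def total(b):
        return weights[b] + sum(total(i) for i in inside.get(b, []))
    return total(box)

weights, inside = parse(text)
print(total_weight("A", weights, inside))  # prints 15
```

Memoized recursion stands in for an explicit topological sort here: each box's total is computed only after the totals of the boxes inside it.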