#### Constraint Solving Problems

In [6]:
puzzle_generation_system_prompt = """
You are a question generator that creates challenging analytical reasoning questions similar to those used in logic-based assessments or civil service exams. Each question should involve reasoning, deduction, or analysis of conditions, and be followed by four answer options (A–D).

Use the following examples as a guide for format, style, and logic complexity:

Example 1:

Question:
Fair use refers to the non-commercial use of works published by others without the permission of the copyright owner, and without having to pay remuneration under the circumstances specified in the law. The "cases specified in the law" mainly include:
(1) Personal study, research or appreciation, using published works of others;
(2) Performing published works for free;
(3) Copying, painting, photography, video recording of artistic works installed or displayed in outdoor public places;
(4) Translating published works created in Chinese into minority languages and publishing them.

According to the above provisions, which of the following are examples of fair use?
A. A sang an unpublished song at the class party
B. B translates an English work into Mongolian work and publishes it
C. Company C took the sculptures in the public square and made them into pictures
D. Ding Wei wrote a paper and copied a paper published by Geng in a journal for reference


Example 2:

Question:
A, B, and C are from the school football team, table tennis team, and basketball team. There is only one correct statement:
(1) A is on the football team;
(2) B is not on the football team;
(3) C is not on the basketball team.

Which team are A, B, and C on?
A. A is on the football team; B is on the basketball team; C is on the table tennis team
B. A is on the basketball team; B is on the football team; C is on the table tennis team
C. A is on the table tennis team; B is on the football team; C is on the basketball team
D. A is on the table tennis team; B is on the basketball team; C is on the football team

Example 3:

Question:
There are five volcanic islands E, F, G, H, and I arranged in a straight line from north to south. It is known:
(1) F is adjacent to H and is to the north of H;
(2) I and E are adjacent;
(3) G is somewhere to the north of F.

If G is the northernmost island, how many possible arrangements are there for the islands?
A. 2
B. 3
C. 4
D. 5

Format the answer in JSON. Here is an example:
[
    {"question": "generated question 1"},
    {"question": "generated question 2"}
]
"""

puzzle_generation_prompt = """
Now, generate 20 new questions in the same format. Each question must:
- Involve logic or deductive reasoning
- Provide 4 answer options
- Include a clearly correct answer
- Be original and not copied from the examples

Do not solve the problems; just generate the new questions.
"""

In [13]:
import os
from utils.AzureAdapter import AzureAdapter
from dotenv import load_dotenv

load_dotenv()

api_key = os.getenv("AZURE_API_KEY")
api_endpoint = os.getenv("AZURE_API_ENDPOINT")
api_version = os.getenv("AZURE_API_VERSION")
deployment_name = "gpt-4o"

llm = AzureAdapter(api_key=api_key, api_endpoint=api_endpoint, api_version=api_version)
puzzles = llm.call_model(system_prompt=puzzle_generation_system_prompt, prompt=puzzle_generation_prompt, deployment_name=deployment_name)

In [11]:
import os
import json
import time

def save_output_as_json(output, file_name, output_dir):
    """Write `output` (a JSON string or a JSON-serializable object) to output_dir/file_name."""
    try:
        # Ensure the output directory exists
        os.makedirs(output_dir, exist_ok=True)

        # Parse the output if it's a string
        if isinstance(output, str):
            output = json.loads(output)

        # Construct the full file path
        file_path = os.path.join(output_dir, file_name)

        # Save the output to a JSON file
        with open(file_path, 'w') as json_file:
            json.dump(output, json_file, indent=4)
        print(f"Output successfully saved to {file_path}")
    except (json.JSONDecodeError, TypeError) as e:
        print(f"Failed to save output as JSON: {e}")

save_output_as_json(puzzles, f"puzzles_{time.time()}.json", "../data/synthetic_data")

Output successfully saved to ../data/synthetic_data/puzzles_1744466876.273887.json
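
The generation prompt asks for four answer options per question, so it can be worth checking the raw output before it is used downstream. The following is a minimal sketch, not part of the original pipeline: `sample_output`, `OPTION_PATTERN`, and `find_malformed_questions` are hypothetical names, and the sample deliberately includes one question with a missing option D.

```python
import json
import re

# Hypothetical sample in the JSON format the system prompt requests.
# The second entry is deliberately missing option D to show a failure.
sample_output = json.dumps([
    {"question": "Who is seated in the middle?\nA. M\nB. N\nC. O\nD. P"},
    {"question": "Which team is A on?\nA. Football\nB. Tennis\nC. Chess"},
])

# An option label is a capital letter A-D at the start of a line, followed by ". ".
OPTION_PATTERN = re.compile(r"^([A-D])\.\s", re.MULTILINE)

def find_malformed_questions(raw_output):
    """Return indices of questions that do not list options A, B, C, D once each, in order."""
    questions = json.loads(raw_output)
    malformed = []
    for index, item in enumerate(questions):
        labels = OPTION_PATTERN.findall(item.get("question", ""))
        if labels != ["A", "B", "C", "D"]:
            malformed.append(index)
    return malformed

print(find_malformed_questions(sample_output))
```

In the notebook, the same function could be called on `puzzles` directly, assuming the model returned a plain JSON array as requested.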


In [12]:
puzzles

'[\n    {"question": "Three friends, X, Y, and Z, each own one of the three pets: a dog, a cat, and a rabbit. There is only one correct statement:\\n(1) X owns the dog;\\n(2) Y does not own the cat;\\n(3) Z does not own the rabbit.\\nWhich pets do X, Y, and Z own?\\nA. X owns the dog; Y owns the rabbit; Z owns the cat\\nB. X owns the cat; Y owns the dog; Z owns the rabbit\\nC. X owns the rabbit; Y owns the dog; Z owns the cat\\nD. X owns the dog; Y owns the cat; Z owns the rabbit"},\n    {"question": "Five colleagues M, N, O, P, and Q are seated in a row facing a stage. It is known:\\n(1) M is immediately to the left of P;\\n(2) N is next to Q on the right;\\n(3) O is somewhere to the left of M.\\nIf N is seated at one end of the row, who is seated exactly in the middle?\\nA. M\\nB. N\\nC. O\\nD. P"},\n    {"question": "In a competition, there are five finalists: A, B, C, D, and E. It is known:\\n(1) A\'s score is higher than B\'s;\\n(2) C\'s score is lower than both D\'s and E\'s but 
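
The generation prompt requires each question to "include a clearly correct answer", but nothing in the pipeline checks this. As a hedged illustration, the first generated puzzle above (the pets puzzle) is small enough to brute-force with the standard library. The encoding of the three statements below is my reading of the puzzle text, and the names `PEOPLE`, `PETS`, and `true_statement_count` are introduced only for this sketch.

```python
from itertools import permutations

PEOPLE = ("X", "Y", "Z")
PETS = ("dog", "cat", "rabbit")

def true_statement_count(owner):
    """Count how many of the puzzle's three statements hold for an assignment."""
    statements = [
        owner["X"] == "dog",     # (1) X owns the dog
        owner["Y"] != "cat",     # (2) Y does not own the cat
        owner["Z"] != "rabbit",  # (3) Z does not own the rabbit
    ]
    return sum(statements)

# Keep every one-pet-per-person assignment in which exactly one statement is true.
valid_assignments = [
    dict(zip(PEOPLE, pets))
    for pets in permutations(PETS)
    if true_statement_count(dict(zip(PEOPLE, pets))) == 1
]

for assignment in valid_assignments:
    print(assignment)
```

Under this reading, three assignments satisfy the "only one correct statement" condition, and two of them correspond to options B and D. The generated puzzle therefore does not have a single correct answer, which suggests that generated questions should be verified before they are kept.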

#### Extract Statements

In [13]:
condition_extraction_system_prompt = """
Extract all the logical statements from the question text below. A "statement" is any individual sentence or clause presenting a condition, background fact, or relation. Ignore answer choices. Return the output as a JSON object with two keys:
- "question": the original full text.
- "question_parsing": an array of strings where each string is one extracted statement.

Example:
Input:
"There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection. Each person goes to one of these two countries. (1) If G goes to the UK, then H goes to the United States. (2) If L goes to the UK, both M and U go to the US. (3) The country W went to was different from the country Z went to. (4) The country where U goes is different from the country where G goes. (5) If Z goes to the UK, then H also goes to the UK. If G goes to the United States, which of the following must be true? A. H go to the UK; B. L go to America; C. M go to the UK; D. W go to America"
Output:
{
  "question": "There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection. Each person goes to one of these two countries. (1) If G goes to the UK, then H goes to the United States. (2) If L goes to the UK, both M and U go to the US. (3) The country W went to was different from the country Z went to. (4) The country where U goes is different from the country where G goes. (5) If Z goes to the UK, then H also goes to the UK. If G goes to the United States, which of the following must be true? A. H go to the UK; B. L go to America; C. M go to the UK; D. W go to America",
  "question_parsing": [
    "There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection.",
    "Each person goes to one of these two countries.",
    "If G goes to the UK, then H goes to the United States.",
    "If L goes to the UK, both M and U go to the US.",
    "The country W went to was different from the country Z went to.",
    "The country where U goes is different from the country where G goes.",
    "If Z goes to the UK, then H also goes to the UK.",
    "G goes to the United States."
  ]
}
"""

condition_extraction_prompt = "Now, extract the statements from the given question text: \n" + str(puzzles)

In [14]:
question_parsing = llm.call_model(system_prompt=condition_extraction_system_prompt, prompt=condition_extraction_prompt, deployment_name=deployment_name)

In [15]:
save_output_as_json(question_parsing, f"question_parsing_{time.time()}.json", "../data/synthetic_data")


Output successfully saved to ../data/synthetic_data/question_parsing_1744469886.10512.json


In [70]:
import json

# Path to the JSON file
# file_path = '../data/synthetic_data/question_parsing_1744469886.10512.json'
file_path = '../data/train.json'

# Load the JSON file
with open(file_path, 'r') as file:
    data = json.load(file)

# Use the parsed statements of the third record in the training set
question_parsing = data[2].get('question_parsing')

#### Solver

In [71]:
solver_system_prompt = """
You are an expert logic puzzle parser. Your task is to extract from a given natural language puzzle description all variables and constraints, and output a JSON object that can later be used to construct a constraint solver with the python‑constraint package.

Please output the JSON object following this exact schema:

{
    "variables": [
        {
            "name": "<variable_name_as_string>",
            "domain": [ <list_of_possible_values> ]
        },
        ...
    ],
    "constraints": [
        {
            "variables": [ "<var1>", "<var2>", ... ],
            "expression": "<Python expression string representing the constraint>"
        },
        ...
    ]
}

Guidelines:
1. **Variables:**
   - Identify all variables mentioned (typically represented by letters or names).
   - For each variable, if the domain (the list of possible values) is explicitly defined in the puzzle, use that; otherwise, infer a default domain based on context.
     For example, if the puzzle involves countries and mentions "United Kingdom" and "United States", use the domain `["UK", "US"]`.
2. **Constraints:**
   - Convert each condition statement into a constraint expression.
   - Use a Python logical expression format that can be passed to the python‑constraint package.
   - Reference variables exactly as they appear.
   - For example, given the statement:
     "If G goes to the UK, then H goes to the United States."
     a possible constraint expression is:
     `"G != 'UK' or H == 'US'"`
     and the corresponding constraint object should include the variable names `["G", "H"]`.
3. **Output:**
   - Do not include any additional text or explanation; output only the JSON object.

For example, if the puzzle description is:

"There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection. The school has only 7 students participating in this activity, and each person happens to go to one of these two countries. Considering the specialty of each student, the activity must meet the following conditions:
(1) If G goes to the UK, then H goes to the United States.
(2) If L goes to the UK, then both M and U go to the US.
(3) The country W went to is different from the country Z went to.
(4) The country where U goes is different from the country where G goes.
(5) If Z goes to the UK, then H also goes to the UK."

Your output should include:
- Variables: G, H, L, M, U, W, Z (each with an appropriate domain, e.g., `["UK", "US"]`)
- Constraints:
   - For condition (1): `{"variables": ["G", "H"], "expression": "G != 'UK' or H == 'US'"}`
   - For condition (2): `{"variables": ["L", "M", "U"], "expression": "L != 'UK' or (M == 'US' and U == 'US')"}`
   - For condition (3): `{"variables": ["W", "Z"], "expression": "W != Z"}`
   - For condition (4): `{"variables": ["U", "G"], "expression": "U != G"}`
   - For condition (5): `{"variables": ["Z", "H"], "expression": "Z != 'UK' or H == 'UK'"}`

Now, please process the puzzle statements provided in the user message and output the corresponding JSON.

Output only the JSON.
"""

solver_prompt = "Here are the statements: " + str(question_parsing)

In [72]:
solver_structure = llm.call_model(system_prompt=solver_system_prompt, prompt=solver_prompt, deployment_name=deployment_name)

In [73]:
solver_structure

'{\n    "variables": [\n        {\n            "name": "G_position",\n            "domain": ["front_1", "front_2", "middle_1", "middle_2", "back_1", "back_2", "not_playing"]\n        },\n        {\n            "name": "H_position",\n            "domain": ["front_1", "front_2", "middle_1", "middle_2", "back_1", "back_2", "not_playing"]\n        },\n        {\n            "name": "K_position",\n            "domain": ["front_1", "front_2", "middle_1", "middle_2", "back_1", "back_2", "not_playing"]\n        },\n        {\n            "name": "L_position",\n            "domain": ["front_1", "front_2", "middle_1", "middle_2", "back_1", "back_2", "not_playing"]\n        },\n        {\n            "name": "N_position",\n            "domain": ["front_1", "front_2", "middle_1", "middle_2", "back_1", "back_2", "not_playing"]\n        },\n        {\n            "name": "P_position",\n            "domain": ["front_1", "front_2", "middle_1", "middle_2", "back_1", "back_2", "not_playing"]\n        },

In [74]:
json.loads(solver_structure)

{'variables': [{'name': 'G_position',
   'domain': ['front_1',
    'front_2',
    'middle_1',
    'middle_2',
    'back_1',
    'back_2',
    'not_playing']},
  {'name': 'H_position',
   'domain': ['front_1',
    'front_2',
    'middle_1',
    'middle_2',
    'back_1',
    'back_2',
    'not_playing']},
  {'name': 'K_position',
   'domain': ['front_1',
    'front_2',
    'middle_1',
    'middle_2',
    'back_1',
    'back_2',
    'not_playing']},
  {'name': 'L_position',
   'domain': ['front_1',
    'front_2',
    'middle_1',
    'middle_2',
    'back_1',
    'back_2',
    'not_playing']},
  {'name': 'N_position',
   'domain': ['front_1',
    'front_2',
    'middle_1',
    'middle_2',
    'back_1',
    'back_2',
    'not_playing']},
  {'name': 'P_position',
   'domain': ['front_1',
    'front_2',
    'middle_1',
    'middle_2',
    'back_1',
    'back_2',
    'not_playing']},
  {'name': 'Q_position',
   'domain': ['front_1',
    'front_2',
    'middle_1',
    'middle_2',
    'back_1',


In [75]:
import json
from constraint import Problem

def build_solver_from_json(config):
    """Build a python-constraint Problem from a config with "variables" and "constraints" lists."""
    problem = Problem()

    # Get the set of defined variable names.
    defined_vars = {var["name"] for var in config["variables"]}

    # Verify that all variables referenced in constraints are defined.
    for constr in config["constraints"]:
        for var in constr["variables"]:
            if var not in defined_vars:
                raise ValueError(f"Constraint references variable '{var}' which is not defined in the variables list.")

    # Add all defined variables to the problem.
    for var in config["variables"]:
        name = var["name"]
        domain = var["domain"]
        problem.addVariable(name, domain)

    # Add constraints.
    for constr in config["constraints"]:
        var_list = constr["variables"]
        expression = constr["expression"]
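        # Create a lambda function string from the expression.
        # Note: eval() runs arbitrary code, so review model-generated expressions first.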
        args = ", ".join(var_list)
        lambda_str = f"lambda {args}: {expression}"
        try:
            constraint_func = eval(lambda_str)
        except Exception as e:
            raise ValueError(f"Error evaluating lambda expression: {lambda_str}") from e
        problem.addConstraint(constraint_func, var_list)

    return problem

In [76]:
solver = build_solver_from_json(config=json.loads(solver_structure))
solutions = solver.getSolutions()
print("Solutions found:")
print(solutions)

Solutions found:
[{'N_position': 'not_playing', 'P_position': 'not_playing', 'Q_position': 'not_playing', 'H_position': 'not_playing', 'K_position': 'not_playing', 'G_position': 'not_playing', 'L_position': 'not_playing'}, {'N_position': 'not_playing', 'P_position': 'not_playing', 'Q_position': 'not_playing', 'H_position': 'not_playing', 'K_position': 'not_playing', 'G_position': 'not_playing', 'L_position': 'back_1'}, {'N_position': 'not_playing', 'P_position': 'not_playing', 'Q_position': 'not_playing', 'H_position': 'not_playing', 'K_position': 'not_playing', 'G_position': 'not_playing', 'L_position': 'middle_1'}, {'N_position': 'not_playing', 'P_position': 'not_playing', 'Q_position': 'not_playing', 'H_position': 'not_playing', 'K_position': 'not_playing', 'G_position': 'not_playing', 'L_position': 'front_1'}, {'N_position': 'not_playing', 'P_position': 'not_playing', 'Q_position': 'not_playing', 'H_position': 'not_playing', 'K_position': 'not_playing', 'G_position': 'front_2', 'L_
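
To sanity-check a solver config independently of `python-constraint`, the same JSON schema can be evaluated by plain enumeration. The sketch below is an assumption-laden cross-check rather than the notebook's method: it uses the UK/US example from `solver_system_prompt`, adds the question's premise "G goes to the United States" as an extra constraint, and introduces the hypothetical helpers `brute_force_solutions` and `forced_values`. Enumeration only scales to small puzzles (here 2^7 = 128 candidates).

```python
from itertools import product

# Config in the schema from solver_system_prompt, taken from its UK/US example.
# The last constraint is an addition for this sketch: the question's premise.
example_config = {
    "variables": [{"name": name, "domain": ["UK", "US"]} for name in "GHLMUWZ"],
    "constraints": [
        {"variables": ["G", "H"], "expression": "G != 'UK' or H == 'US'"},
        {"variables": ["L", "M", "U"], "expression": "L != 'UK' or (M == 'US' and U == 'US')"},
        {"variables": ["W", "Z"], "expression": "W != Z"},
        {"variables": ["U", "G"], "expression": "U != G"},
        {"variables": ["Z", "H"], "expression": "Z != 'UK' or H == 'UK'"},
        {"variables": ["G"], "expression": "G == 'US'"},
    ],
}

def brute_force_solutions(config):
    """Enumerate every assignment and keep those satisfying all constraint expressions."""
    names = [var["name"] for var in config["variables"]]
    domains = [var["domain"] for var in config["variables"]]
    solutions = []
    for values in product(*domains):
        assignment = dict(zip(names, values))
        # Builtins are disabled, so expressions can only see the puzzle variables.
        if all(
            eval(constr["expression"], {"__builtins__": {}}, assignment)
            for constr in config["constraints"]
        ):
            solutions.append(assignment)
    return solutions

def forced_values(solutions):
    """Return the variables that take the same value in every solution."""
    if not solutions:
        return {}
    first = solutions[0]
    return {
        name: value
        for name, value in first.items()
        if all(solution[name] == value for solution in solutions)
    }

solutions = brute_force_solutions(example_config)
print(len(solutions), forced_values(solutions))
```

Variables returned by `forced_values` are the "must be true" conclusions for the given premise. A config whose solutions leave nearly every variable free, or allow every magician to be `not_playing` as in the output above, is a hint that the extracted constraints are incomplete.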

#### Generate CoT and CoT Parsing

In [77]:
# 1. Generate the CoT: pause first, then generate a reasoning trace step by step ...
# 2. Generate a statement, evidence, and verification for each of the statements

In [79]:
cot_system_prompt = """
You are given a complex logic puzzle with a detailed question and a parsed list of statements. Your task is to generate a complete chain-of-thought explanation that shows every reasoning step you take to arrive at the correct answer. **However, the final chain-of-thought explanation for each question must be output as a single string.** This way, when generating multiple chain-of-thought explanations in one batch, you will output a JSON array where each element is the full chain-of-thought for a different question.

Please follow these instructions:

1. **Pause and Reflect:** Take a moment to understand the overall scenario of the puzzle.
2. **Evaluate Each Condition:** For each parsed statement, explain:
   - Whether the condition applies given the current assumption.
   - What conclusion (if any) is drawn from that condition.
3. **Integrate Your Findings:** Combine the observations from each condition into one coherent explanation.
4. **Format Your Answer:** Your final output must be a JSON array. Each element of this array should be a single string containing the complete chain-of-thought explanation for one question. Do not output multiple strings for one question; instead, merge all the reasoning steps into one continuous explanation per question.

For example, given the following puzzle:

**Question:**
"There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection. The school has only 7 students participating in this activity, and each person happens to go to one of these two countries. Considering the specialty of each student, this activity must meet the following conditions:
(1) If G goes to the UK, then H goes to the United States.
(2) If L goes to the UK, then both M and U go to the US.
(3) The country W went to is different from the country Z went to.
(4) The country where U goes is different from the country where G goes.
(5) If Z goes to the UK, then H also goes to the UK.
If G goes to the United States, which of the following must be true?
A. H go to the UK
B. L go to America
C. M go to the UK
D. W go to America"

**Question Parsing:**
- "There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection."
- "each person happens to go to one of these two countries"
- "If G goes to the UK, then H goes to the United States"
- "If L goes to the UK, then both M and U go to the US"
- "The country W went to is different from the country Z went to"
- "The country where U goes is different from the country where G goes"
- "If Z goes to the UK, then H also goes to the UK"
- "G goes to the United States"

**Desired Chain-of-Thought (Single String Example):**
"Since G goes to the United States, condition (1) does not apply because it only activates if G is in the UK. Condition (2) is irrelevant as L's destination is unspecified. Condition (3) provides no direct information regarding the other students. From condition (4), because U must be in a different country than G, and G is in the US, it follows that U must be in the UK. Condition (5) is not activated since Z's destination is unknown. Overall, the only deducible outcome is that U goes to the United Kingdom."

**Your Output:**
Generate a JSON array where each element is a single string containing the full chain-of-thought for a given question. For instance, if you are generating three chain-of-thought explanations in one batch, your output should look like:

[
  "Your complete Chain-of-thought explanation as one string for question 1.",
  "Your complete Chain-of-thought explanation as one string for question 2.",
  "Your complete Chain-of-thought explanation as one string for question 3."
]

Now, please generate your chain-of-thought explanations using the above format and output ONLY the JSON array of strings.
"""

cot_prompt = str(data[2].get('question_parsing')) + data[2].get('question')

In [80]:
cot = llm.call_model(system_prompt=cot_system_prompt, prompt=cot_prompt, deployment_name=deployment_name)

In [81]:
cot

'{\n  "A: Before? Q In? L After? N": [\n    "Let\'s evaluate if the position Before? Q In? L After? N is possible for Team 1: Given that if L is scheduled to play, he must be on team 1, and he is correctly positioned in Team 1, this arrangement satisfies condition (3). Q is in the front, L is in the middle, and N is at the back.",\n    "Since conditions (1) and (2) refer to the positions of G, H, and K, none of which are involved in Option A, those conditions don\'t apply.",\n    "Conditions (4) and (5) concern team associations with N, P, and K, but since P and K are not part of the team in this configuration, these don\'t apply either.",\n    "Condition (6) only discusses H in Team 2 and Q\'s position in Team 1, but H is not mentioned in Option A.",\n    "Thus, there are no conflicts or unmet conditions in this arrangement.",\n    "Therefore, option A is a valid arrangement."\n  ],\n  "B: Before? L Middle? K After? Q": [\n    "Evaluating option B: Before? L Middle? K After? Q.",\n   

In [82]:
cot_parsing_system_prompt = """
You are an expert in analyzing and verifying logical reasoning. Your task is to take a single chain-of-thought explanation and break it down into individual reasoning steps. For each step, you must extract or generate the following:

- **statement**: A concise summary of a single claim or conclusion extracted from the chain-of-thought.
- **evidence**: The specific part of the original condition(s) or input that supports the statement.
- **Verification**: A value of "true" or "false" indicating whether the statement is correct (based on the known conditions and facts).

Your output must be a JSON array where each element is an object with the keys "statement", "evidence", and "Verification".

For example, given the following chain-of-thought:

"Since G goes to the United States, we need to analyze the conditions that follow. Condition (1) is not applicable since G is going to the US. Condition (2) is also not applicable since L's destination is not specified. Condition (3) does not provide any information about H, M, U, or W. Condition (4) states that U's destination is different from G's, which is the US, so U must go to the UK. Condition (5) is not applicable since Z's destination is not specified."

Your output should be exactly in the following format:

[
  {
    "statement": "Condition (1) is not applicable",
    "evidence": "Condition (1): If G goes to the UK, then H goes to the United States. | G is going to the US",
    "Verification": "false"
  },
  {
    "statement": "Condition (2) is also not applicable",
    "evidence": "Condition (2): If L goes to the UK, then both M and U go to the US. | L's destination is not specified",
    "Verification": "false"
  },
  {
    "statement": "Condition (3) does not provide any information about H, M, U, or W",
    "evidence": "Condition (3): The country W went to is different from the country Z went to.",
    "Verification": "false"
  },
  {
    "statement": "U must go to the UK",
    "evidence": "Condition (4): The country where U goes is different from the country where G goes. | Since G is in the US, U must be in the UK",
    "Verification": "true"
  },
  {
    "statement": "Condition (5) is not applicable",
    "evidence": "Condition (5): If Z goes to the UK, then H also goes to the UK. | Z's destination is not specified",
    "Verification": "true"
  }
]

Now, given an input chain-of-thought (a single string), produce the corresponding JSON array of parsed reasoning steps exactly in the format specified above. Do not include any extra text or commentary—output only the JSON array.
"""

cot_parsing_prompt = "cot prompt: \n " + cot_prompt + "\ncot trace: \n" + str(cot)

In [83]:
cot_parsing_prompt

'cot prompt: \n [\'In a magic show, from the seven magicians-G.H.K.L.N.P and Q, choose 6 people to play, and the performance is divided into two teams? 1 team and 2 teams\', \'Each team consists of three positions? front, middle, and back.\', \'The magicians on the field happen to occupy one position each.\', \'If G or H is arranged to play, they must be in the front.\', \'If K is scheduled to play, he must be in the middle.\', \'If L is scheduled to play, he must be on team 1.\', \'Neither P nor K can be in the same team as N.\', \'P cannot be in the same team as Q.\', \'If H is in team 2, Q is in the middle of team 1.\']In a magic show, from the seven magicians-G.H.K.L.N.P and Q, choose 6 people to play, and the performance is divided into two teams? 1 team and 2 teams.Each team consists of three positions? front, middle, and back.The magicians on the field happen to occupy one position each.The choice and location of the magician must meet the following conditions? (1) If G or H is 

In [84]:
cot_parsing = llm.call_model(system_prompt=cot_parsing_system_prompt, prompt=cot_parsing_prompt, deployment_name=deployment_name)

In [85]:
cot_parsing

'{\n  "statement": "Option A is a valid arrangement",\n  "evidence": [\n    "If L is scheduled to play, he must be on team 1, and he is correctly positioned in Team 1",\n    "Conditions (1), (2), (4), (5), and (6) don\'t apply since relevant magicians are not involved",\n    "No conflicts or unmet conditions in this arrangement"\n  ],\n  "Verification": "true"\n}\n  \n '

In [86]:
json.loads(cot_parsing)

{'statement': 'Option A is a valid arrangement',
 'evidence': ['If L is scheduled to play, he must be on team 1, and he is correctly positioned in Team 1',
  "Conditions (1), (2), (4), (5), and (6) don't apply since relevant magicians are not involved",
  'No conflicts or unmet conditions in this arrangement'],
 'Verification': 'true'}
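
The parsed result above is a single object whose `evidence` is a list, whereas `cot_parsing_system_prompt` asks for a JSON array of objects with string values. A small format check can catch this before the data is saved. This is a minimal sketch under the assumption that the prompt's format is the intended target. `REQUIRED_KEYS` and `validate_cot_parsing` are hypothetical names, and `observed` is a copy of the object shown above.

```python
REQUIRED_KEYS = {"statement", "evidence", "Verification"}

def validate_cot_parsing(parsed):
    """Return a list of format problems; an empty list means the format matches the prompt."""
    if not isinstance(parsed, list):
        return [f"expected a JSON array, got {type(parsed).__name__}"]
    problems = []
    for index, step in enumerate(parsed):
        if not isinstance(step, dict):
            problems.append(f"step {index}: expected an object")
            continue
        missing = REQUIRED_KEYS - step.keys()
        if missing:
            problems.append(f"step {index}: missing keys {sorted(missing)}")
        if "evidence" in step and not isinstance(step["evidence"], str):
            problems.append(f"step {index}: evidence should be a single string")
        if "Verification" in step and step["Verification"] not in ("true", "false"):
            problems.append(f"step {index}: Verification should be 'true' or 'false'")
    return problems

# Copy of the parsed model output shown above.
observed = {
    "statement": "Option A is a valid arrangement",
    "evidence": [
        "If L is scheduled to play, he must be on team 1, and he is correctly positioned in Team 1",
        "Conditions (1), (2), (4), (5), and (6) don't apply since relevant magicians are not involved",
        "No conflicts or unmet conditions in this arrangement",
    ],
    "Verification": "true",
}

print(validate_cot_parsing(observed))    # the top-level value is not an array
print(validate_cot_parsing([observed]))  # wrapped in a list, the evidence type is still flagged
```

When the check fails, one option is to re-prompt the model with the reported problems. Whether to retry or repair the output by hand is a design choice this sketch leaves open.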