#### Constraint Solving Problems

In [6]:
puzzle_generation_system_prompt = """
You are a question generator that creates challenging analytical reasoning questions similar to those used in logic-based assessments or civil service exams. Each question should involve reasoning, deduction, or analysis of conditions, and be followed by four answer options (A–D).

Use the following examples as a guide for format, style, and logic complexity:

Example 1:

Question:
Fair use refers to the non-commercial use of works published by others without the permission of the copyright owner, and without having to pay remuneration under the circumstances specified in the law. The "cases specified in the law" mainly include:
(1) Personal study, research or appreciation, using published works of others;
(2) Performing published works for free;
(3) Copying, painting, photography, video recording of artistic works installed or displayed in outdoor public places;
(4) Translating published works created in Chinese into minority languages and publishing them.

According to the above provisions, which of the following are examples of fair use?
A. A sang an unpublished song at the class party
B. B translates an English work into Mongolian work and publishes it
C. Company C took the sculptures in the public square and made them into pictures
D. Ding Wei wrote a paper and copied a paper published by Geng in a journal for reference


Example 2:

Question:
A, B, and C are from the school football team, table tennis team, and basketball team. There is only one correct statement:
(1) A is on the football team;
(2) B is not on the football team;
(3) C is not on the basketball team.

Which team are A, B, and C on?
A. A is on the football team; B is on the basketball team; C is on the table tennis team
B. A is on the basketball team; B is on the football team; C is on the table tennis team
C. A is on the table tennis team; B is on the football team; C is on the basketball team
D. A is on the table tennis team; B is on the basketball team; C is on the football team

Example 3:

Question:
There are five volcanic islands E, F, G, H, and I arranged in a straight line from north to south. It is known:
(1) F is adjacent to H and is to the north of H;
(2) I and E are adjacent;
(3) G is somewhere to the north of F.

If G is the northernmost island, how many possible arrangements are there for the islands?
A. 2
B. 3
C. 4
D. 5

Format the answer in JSON. Here is an example:
[
    {"question": "generated question 1"},
    {"question": "generated question 2"}
]
"""

puzzle_generation_prompt = """
Now, generate 20 new questions in the same format. Each question must:
- Involve logic or deductive reasoning
- Provide 4 answer options
- Include a clearly correct answer
- Be original and not copied from the examples

Do not solve the problems; just generate the new questions.
"""

In [13]:
import os
from AzureAdapter import AzureAdapter
from dotenv import load_dotenv

load_dotenv()

api_key = os.getenv("AZURE_API_KEY")
api_endpoint = os.getenv("AZURE_API_ENDPOINT")
api_version = os.getenv("AZURE_API_VERSION")
deployment_name = "gpt-4o"

llm = AzureAdapter(api_key=api_key, api_endpoint=api_endpoint, api_version=api_version)
puzzles = llm.call_model(system_prompt=puzzle_generation_system_prompt, prompt=puzzle_generation_prompt, deployment_name=deployment_name)

In [11]:
import os
import json
import time

def save_output_as_json(output, file_name, output_dir):
    try:
        # Ensure the output directory exists
        os.makedirs(output_dir, exist_ok=True)

        # Parse the output if it's a string
        if isinstance(output, str):
            output = json.loads(output)

        # Construct the full file path
        file_path = os.path.join(output_dir, file_name)

        # Save the output to a JSON file
        with open(file_path, 'w') as json_file:
            json.dump(output, json_file, indent=4)
        print(f"Output successfully saved to {file_path}")
    except (json.JSONDecodeError, TypeError) as e:
        print(f"Failed to save output as JSON: {e}")

save_output_as_json(puzzles, f"puzzles_{time.time()}.json", "../data/synthetic_data")

Output successfully saved to ../data/synthetic_data/puzzles_1744466876.273887.json


In [12]:
puzzles

'[\n    {"question": "Three friends, X, Y, and Z, each own one of the three pets: a dog, a cat, and a rabbit. There is only one correct statement:\\n(1) X owns the dog;\\n(2) Y does not own the cat;\\n(3) Z does not own the rabbit.\\nWhich pets do X, Y, and Z own?\\nA. X owns the dog; Y owns the rabbit; Z owns the cat\\nB. X owns the cat; Y owns the dog; Z owns the rabbit\\nC. X owns the rabbit; Y owns the dog; Z owns the cat\\nD. X owns the dog; Y owns the cat; Z owns the rabbit"},\n    {"question": "Five colleagues M, N, O, P, and Q are seated in a row facing a stage. It is known:\\n(1) M is immediately to the left of P;\\n(2) N is next to Q on the right;\\n(3) O is somewhere to the left of M.\\nIf N is seated at one end of the row, who is seated exactly in the middle?\\nA. M\\nB. N\\nC. O\\nD. P"},\n    {"question": "In a competition, there are five finalists: A, B, C, D, and E. It is known:\\n(1) A\'s score is higher than B\'s;\\n(2) C\'s score is lower than both D\'s and E\'s but 
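The reply shown above is a single JSON string, with each question's stem and its four options packed into one `question` field. A minimal sketch of how that field could be split into a stem and an A-D option map is given below. The helper name `split_question` is hypothetical, and the sketch assumes every option sits on its own line starting with `A.` to `D.`, as in the first generated question.

```python
import json
import re

# First generated question, copied from the output above.
sample_question = (
    "Three friends, X, Y, and Z, each own one of the three pets: a dog, a cat, and a rabbit. "
    "There is only one correct statement:\n"
    "(1) X owns the dog;\n"
    "(2) Y does not own the cat;\n"
    "(3) Z does not own the rabbit.\n"
    "Which pets do X, Y, and Z own?\n"
    "A. X owns the dog; Y owns the rabbit; Z owns the cat\n"
    "B. X owns the cat; Y owns the dog; Z owns the rabbit\n"
    "C. X owns the rabbit; Y owns the dog; Z owns the cat\n"
    "D. X owns the dog; Y owns the cat; Z owns the rabbit"
)
# Stand-in for the `puzzles` string returned by the model.
raw_output = json.dumps([{"question": sample_question}])


def split_question(text):
    """Split a question into its stem and a dict of options keyed by letter (hypothetical helper)."""
    stem_lines, options = [], {}
    for line in text.splitlines():
        match = re.match(r"^([A-D])\.\s+(.*)$", line.strip())
        if match:
            options[match.group(1)] = match.group(2)
        else:
            stem_lines.append(line)
    return "\n".join(stem_lines).strip(), options


parsed = [split_question(item["question"]) for item in json.loads(raw_output)]
stem, options = parsed[0]
print(sorted(options))  # ['A', 'B', 'C', 'D']
```

Keeping the options separate from the stem makes it easier to check later that every generated question really has four choices.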

#### Extract Statements

In [13]:
condition_extraction_system_prompt = """
Extract all the logical statements from the question text below. A "statement" is any individual sentence or clause presenting a condition, background fact, or relation. Ignore answer choices. Return the output as a JSON object with two keys:
- "question": the original full text.
- "question_parsing": an array of strings where each string is one extracted statement.

Example:
Input:
"There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection. Each person goes to one of these two countries. (1) If G goes to the UK, then H goes to the United States. (2) If L goes to the UK, both M and U go to the US. (3) The country W went to was different from the country Z went to. (4) The country where U goes is different from the country where G goes. (5) If Z goes to the UK, then H also goes to the UK. If G goes to the United States, which of the following must be true? A. H go to the UK; B. L go to America; C. M go to the UK; D. W go to America"
Output:
{
  "question": "There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection. Each person goes to one of these two countries. (1) If G goes to the UK, then H goes to the United States. (2) If L goes to the UK, both M and U go to the US. (3) The country W went to was different from the country Z went to. (4) The country where U goes is different from the country where G goes. (5) If Z goes to the UK, then H also goes to the UK. If G goes to the United States, which of the following must be true? A. H go to the UK; B. L go to America; C. M go to the UK; D. W go to America",
  "question_parsing": [
    "There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection.",
    "Each person goes to one of these two countries.",
    "If G goes to the UK, then H goes to the United States.",
    "If L goes to the UK, both M and U go to the US.",
    "The country W went to was different from the country Z went to.",
    "The country where U goes is different from the country where G goes.",
    "If Z goes to the UK, then H also goes to the UK.",
    "G goes to the United States."
  ]
}
"""

condition_extraction_prompt = "Now, extract the statements from the given question text: \n" + str(puzzles)

In [14]:
question_parsing = llm.call_model(system_prompt=condition_extraction_system_prompt, prompt=condition_extraction_prompt, deployment_name=deployment_name)

In [15]:
save_output_as_json(question_parsing, f"question_parsing_{time.time()}.json", "../data/synthetic_data")


Output successfully saved to ../data/synthetic_data/question_parsing_1744469886.10512.json
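The extraction prompt describes one question in and one JSON object out, while the call above passes the whole `puzzles` string at once. One possible alternative, sketched below, is to build a separate prompt per question. The helper `build_extraction_prompts` and the two placeholder questions are made up for illustration and are not part of the original notebook.

```python
import json


def build_extraction_prompts(raw_puzzles):
    """Return one extraction prompt per generated question (hypothetical helper).

    `raw_puzzles` may be the JSON string returned by the model or an
    already-parsed list of {"question": ...} dicts.
    """
    items = json.loads(raw_puzzles) if isinstance(raw_puzzles, str) else raw_puzzles
    prefix = "Now, extract the statements from the given question text: \n"
    return [prefix + item["question"] for item in items]


# Placeholder stand-in for the `puzzles` string; these questions are invented.
example_puzzles = json.dumps([
    {"question": "P is taller than Q. Who is shorter?\nA. P\nB. Q\nC. Both\nD. Neither"},
    {"question": "R sits left of S. Who sits on the right?\nA. R\nB. S\nC. Both\nD. Neither"},
])

prompts = build_extraction_prompts(example_puzzles)
print(len(prompts))  # 2
```

Each prompt could then be sent through `llm.call_model` in a loop, which would yield one `question_parsing` object per question.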


In [21]:
import json

# Path to the JSON file
# file_path = '../data/synthetic_data/question_parsing_1744469886.10512.json'
file_path = '../data/train.json'

# Load the JSON file
with open(file_path, 'r') as file:
    data = json.load(file)

# Take the parsed statements of the first example
# (this replaces the model output stored in question_parsing above)
question_parsing = data[0].get('question_parsing')

#### Solver

In [22]:
solver_system_prompt = """
You are an expert logic puzzle parser. Your task is to extract from a given natural language puzzle description all variables and constraints, and output a JSON object that can later be used to construct a constraint solver with the python‑constraint package.

Please output the JSON object following this exact schema:

{
    "variables": [
        {
            "name": "<variable_name_as_string>",
            "domain": [ <list_of_possible_values> ]
        },
        ...
    ],
    "constraints": [
        {
            "variables": [ "<var1>", "<var2>", ... ],
            "expression": "<Python expression string representing the constraint>"
        },
        ...
    ]
}

Guidelines:
1. **Variables:**
   - Identify all variables mentioned (typically represented by letters or names).
   - For each variable, if the domain (the list of possible values) is explicitly defined in the puzzle, use that; otherwise, infer a default domain based on context.
     For example, if the puzzle involves countries and mentions "United Kingdom" and "United States", use the domain `["UK", "US"]`.
2. **Constraints:**
   - Convert each condition statement into a constraint expression.
   - Use a Python logical expression format that can be passed to the python‑constraint package.
   - Reference variables exactly as they appear.
   - For example, given the statement:
     "If G goes to the UK, then H goes to the United States."
     a possible constraint expression is:
     `"G != 'UK' or H == 'US'"`
     and the corresponding constraint object should include the variable names `["G", "H"]`.
3. **Output:**
   - Do not include any additional text or explanation; output only the JSON object.

For example, if the puzzle description is:

"There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection. The school has only 7 students participating in this activity, and each person happens to go to one of these two countries. Considering the specialty of each student, the activity must meet the following conditions:
(1) If G goes to the UK, then H goes to the United States.
(2) If L goes to the UK, then both M and U go to the US.
(3) The country W went to is different from the country Z went to.
(4) The country where U goes is different from the country where G goes.
(5) If Z goes to the UK, then H also goes to the UK."

Your output should include:
- Variables: G, H, L, M, U, W, Z (each with an appropriate domain, e.g., `["UK", "US"]`)
- Constraints:
   - For condition (1): `{"variables": ["G", "H"], "expression": "G != 'UK' or H == 'US'"}`
   - For condition (2): `{"variables": ["L", "M", "U"], "expression": "L != 'UK' or (M == 'US' and U == 'US')"}`
   - For condition (3): `{"variables": ["W", "Z"], "expression": "W != Z"}`
   - For condition (4): `{"variables": ["U", "G"], "expression": "U != G"}`
   - For condition (5): `{"variables": ["Z", "H"], "expression": "Z != 'UK' or H == 'UK'"}`

Now, please process the following puzzle description and output the corresponding JSON:

"There's the puzzle description here:
'There are 7 outstanding students G, H, L, M, U, W and Z in a school. During the summer vacation, the school will send them to the United Kingdom and the United States for inspection. The school has only 7 students participating in this activity, and each person happens to go to one of these two countries. Considering the specialty of each student, the activity must meet the following conditions: (1) If G goes to the UK, then H goes to the United States. (2) If L goes to the UK, then both M and U go to the US. (3) The country W went to is different from the country Z went to. (4) The country where U goes is different from the country where G goes. (5) If Z goes to the UK, then H also goes to the UK.'"

Output only the JSON.
"""

solver_prompt = "Here are the statements: " + str(question_parsing)

In [23]:
solver_structure = llm.call_model(system_prompt=solver_system_prompt, prompt=solver_prompt, deployment_name=deployment_name)

In [24]:
solver_structure

'{\n    "variables": [\n        {\n            "name": "G",\n            "domain": ["UK", "US"]\n        },\n        {\n            "name": "H",\n            "domain": ["UK", "US"]\n        },\n        {\n            "name": "L",\n            "domain": ["UK", "US"]\n        },\n        {\n            "name": "M",\n            "domain": ["UK", "US"]\n        },\n        {\n            "name": "U",\n            "domain": ["UK", "US"]\n        },\n        {\n            "name": "W",\n            "domain": ["UK", "US"]\n        },\n        {\n            "name": "Z",\n            "domain": ["UK", "US"]\n        }\n    ],\n    "constraints": [\n        {\n            "variables": ["G", "H"],\n            "expression": "G != \'UK\' or H == \'US\'"\n        },\n        {\n            "variables": ["L", "M", "U"],\n            "expression": "L != \'UK\' or (M == \'US\' and U == \'US\')"\n        },\n        {\n            "variables": ["W", "Z"],\n            "expression": "W != Z"\n        },

In [28]:
json.loads(solver_structure)

{'variables': [{'name': 'G', 'domain': ['UK', 'US']},
  {'name': 'H', 'domain': ['UK', 'US']},
  {'name': 'L', 'domain': ['UK', 'US']},
  {'name': 'M', 'domain': ['UK', 'US']},
  {'name': 'U', 'domain': ['UK', 'US']},
  {'name': 'W', 'domain': ['UK', 'US']},
  {'name': 'Z', 'domain': ['UK', 'US']}],
 'constraints': [{'variables': ['G', 'H'],
   'expression': "G != 'UK' or H == 'US'"},
  {'variables': ['L', 'M', 'U'],
   'expression': "L != 'UK' or (M == 'US' and U == 'US')"},
  {'variables': ['W', 'Z'], 'expression': 'W != Z'},
  {'variables': ['U', 'G'], 'expression': 'U != G'},
  {'variables': ['Z', 'H'], 'expression': "Z != 'UK' or H == 'UK'"},
  {'variables': ['G'], 'expression': "G == 'US'"}]}
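With seven two-valued variables there are only 2^7 = 128 candidate assignments, so the parsed structure can be sanity-checked by plain enumeration before any solver library is involved. The sketch below does this with `itertools.product`. The function `brute_force_solutions` is a hypothetical helper, and the `config` literal is retyped from the parsed output above.

```python
from itertools import product

# Retyped from the parsed solver structure shown above.
config = {
    "variables": [{"name": n, "domain": ["UK", "US"]} for n in "GHLMUWZ"],
    "constraints": [
        {"variables": ["G", "H"], "expression": "G != 'UK' or H == 'US'"},
        {"variables": ["L", "M", "U"], "expression": "L != 'UK' or (M == 'US' and U == 'US')"},
        {"variables": ["W", "Z"], "expression": "W != Z"},
        {"variables": ["U", "G"], "expression": "U != G"},
        {"variables": ["Z", "H"], "expression": "Z != 'UK' or H == 'UK'"},
        {"variables": ["G"], "expression": "G == 'US'"},
    ],
}


def brute_force_solutions(config):
    """Enumerate every assignment and keep those satisfying all constraints."""
    names = [v["name"] for v in config["variables"]]
    domains = [v["domain"] for v in config["variables"]]
    solutions = []
    for values in product(*domains):
        assignment = dict(zip(names, values))
        # Each expression is evaluated with the assignment as its local names.
        if all(eval(c["expression"], {"__builtins__": {}}, assignment)
               for c in config["constraints"]):
            solutions.append(assignment)
    return solutions


brute = brute_force_solutions(config)
print(len(brute))  # 6
```

Enumeration stops being practical as the number of variables or the domain sizes grow, so it is meant only as a cross-check on small puzzles.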

In [31]:
import json
from constraint import Problem

def build_solver_from_json(config):
    problem = Problem()

    # Get the set of defined variable names.
    defined_vars = {var["name"] for var in config["variables"]}

    # Verify that all variables referenced in constraints are defined.
    for constr in config["constraints"]:
        for var in constr["variables"]:
            if var not in defined_vars:
                raise ValueError(f"Constraint references variable '{var}' which is not defined in the variables list.")

    # Add all defined variables to the problem.
    for var in config["variables"]:
        name = var["name"]
        domain = var["domain"]
        problem.addVariable(name, domain)

    # Add constraints.
    for constr in config["constraints"]:
        var_list = constr["variables"]
        expression = constr["expression"]
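        # Build the source of a lambda from the variable names and the expression.
        args = ", ".join(var_list)
        lambda_str = f"lambda {args}: {expression}"
        try:
            # Caution: eval executes model-generated text, so run this only on output you trust.
            constraint_func = eval(lambda_str)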
        except Exception as e:
            raise ValueError(f"Error evaluating lambda expression: {lambda_str}") from e
        problem.addConstraint(constraint_func, var_list)

    return problem
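Because the expressions come from a model, it may be worth narrowing what `eval` can reach. The sketch below is one possible hardening step, not part of the original notebook: the hypothetical helper `compile_constraint` rejects variable names that are not identifiers and evaluates the lambda with an empty `__builtins__` mapping. This reduces the attack surface but is not a real sandbox.

```python
def compile_constraint(var_names, expression):
    """Turn a constraint expression into a callable with restricted builtins (hypothetical helper)."""
    for name in var_names:
        if not name.isidentifier():
            raise ValueError(f"Not a valid Python identifier: {name!r}")
    lambda_str = f"lambda {', '.join(var_names)}: {expression}"
    # An empty __builtins__ mapping hides names such as __import__ and open.
    return eval(lambda_str, {"__builtins__": {}})


check = compile_constraint(["G", "H"], "G != 'UK' or H == 'US'")
print(check("US", "UK"))  # True
print(check("UK", "UK"))  # False
```

The returned callable takes its arguments in the order given by `var_names`, so it could be passed to `problem.addConstraint` together with the same variable list.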

In [32]:
solver = build_solver_from_json(config=json.loads(solver_structure))
solutions = solver.getSolutions()
print("Solutions found:")
print(solutions)

Solutions found:
[{'G': 'US', 'U': 'UK', 'H': 'US', 'Z': 'US', 'W': 'UK', 'L': 'US', 'M': 'US'}, {'G': 'US', 'U': 'UK', 'H': 'US', 'Z': 'US', 'W': 'UK', 'L': 'US', 'M': 'UK'}, {'G': 'US', 'U': 'UK', 'H': 'UK', 'Z': 'UK', 'W': 'US', 'L': 'US', 'M': 'US'}, {'G': 'US', 'U': 'UK', 'H': 'UK', 'Z': 'UK', 'W': 'US', 'L': 'US', 'M': 'UK'}, {'G': 'US', 'U': 'UK', 'H': 'UK', 'Z': 'US', 'W': 'UK', 'L': 'US', 'M': 'US'}, {'G': 'US', 'U': 'UK', 'H': 'UK', 'Z': 'US', 'W': 'UK', 'L': 'US', 'M': 'UK'}]
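A "which of the following must be true" question can be answered from this list by checking which option holds in every solution. The sketch below assumes the puzzle being solved is the one quoted in the extraction prompt's example, with options A to D as given there. The `options` mapping is hand-encoded from that text, and `holds_in_all` is a hypothetical helper.

```python
# Solutions copied from the solver output above.
solutions = [
    {'G': 'US', 'U': 'UK', 'H': 'US', 'Z': 'US', 'W': 'UK', 'L': 'US', 'M': 'US'},
    {'G': 'US', 'U': 'UK', 'H': 'US', 'Z': 'US', 'W': 'UK', 'L': 'US', 'M': 'UK'},
    {'G': 'US', 'U': 'UK', 'H': 'UK', 'Z': 'UK', 'W': 'US', 'L': 'US', 'M': 'US'},
    {'G': 'US', 'U': 'UK', 'H': 'UK', 'Z': 'UK', 'W': 'US', 'L': 'US', 'M': 'UK'},
    {'G': 'US', 'U': 'UK', 'H': 'UK', 'Z': 'US', 'W': 'UK', 'L': 'US', 'M': 'US'},
    {'G': 'US', 'U': 'UK', 'H': 'UK', 'Z': 'US', 'W': 'UK', 'L': 'US', 'M': 'UK'},
]


def holds_in_all(solutions, var, value):
    """Return True if `var` takes `value` in every solution."""
    return all(s[var] == value for s in solutions)


# Hand-encoded options (assumption): A. H to the UK, B. L to the US,
# C. M to the UK, D. W to the US.
options = {
    "A": ("H", "UK"),
    "B": ("L", "US"),
    "C": ("M", "UK"),
    "D": ("W", "US"),
}

forced = [label for label, (var, value) in options.items()
          if holds_in_all(solutions, var, value)]
print(forced)  # ['B']
```

Only option B survives because `L` is `'US'` in all six solutions, while `H`, `M`, and `W` each take both values.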


#### Generate CoT and CoT Parsing