<a href="https://colab.research.google.com/github/fovi-llc/refactor-python/blob/main/konwinski-dspy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This Colab notebook is a copy of the [NeurIPS HackerCup AI Competition (HAC) 2024](https://hackercupai.github.io/) [DSPy](https://github.com/stanfordnlp/dspy) coding example by [Krista Opsahl-Ong](https://www.linkedin.com/in/krista-opsahl-ong-86b096103/), PhD candidate at Stanford hacking on DSPy, from here: https://github.com/stanfordnlp/dspy/tree/main/examples/coding

Weights & Biases interview of Krista about this example:
https://youtu.be/gpe-rtJN8z8

NeurIPS HackerCup AI Competition (HAC) 2024:
https://hackercupai.github.io/

https://www.facebook.com/codingcompetitions/hacker-cup/2024

Hacker Cup AI dataset at Hugging Face:
https://huggingface.co/datasets/hackercupai/hackercup

DSPy:
https://github.com/stanfordnlp/dspy


In [None]:
%load_ext autoreload
%autoreload 2

import sys
import os

os.environ["DSP_NOTEBOOK_CACHEDIR"] = "./cache"

import pkg_resources # Install the package if it's not installed
if "dspy-ai" not in {pkg.key for pkg in pkg_resources.working_set}:
    !pip install -U pip
    !pip install dspy-ai==2.4.17
    !pip install "openai>1,<2"

import dspy

./cache/compiler


In [None]:
from google.colab import userdata
# os.environ["HF_TOKEN"] = userdata.get('HF_TOKEN')
os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

import openai

openai.api_key = os.environ.get("OPENAI_API_KEY")
openai.api_base = os.environ.get("OPENAI_API_BASE")

len(openai.api_key)

51
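
Outside Colab, `google.colab` cannot be imported, so the cell above stops at its first line. The following is a minimal sketch of a fallback, assuming the key is otherwise supplied through an environment variable. `load_secret` is a hypothetical helper name introduced here for illustration; it is not part of the original notebook.

```python
import os


def load_secret(name: str) -> str:
    """Return a secret from Colab's userdata when available, else from the environment.

    Hypothetical helper for illustration only.
    """
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            value = userdata.get(name)
            if value:
                return value
        except Exception:
            # Assumption: a missing secret or denied access surfaces as an
            # exception here; in that case fall through to the environment.
            pass

    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set in Colab secrets or the environment")
    return value
```

With this in place, `os.environ["OPENAI_API_KEY"] = load_secret("OPENAI_API_KEY")` would behave the same way in Colab and in a local Jupyter session.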

In [None]:
import datasets
import os
import openai
import dspy
from dspy import InputField, OutputField, Signature
from dspy import Example
from dspy.evaluate.evaluate import Evaluate
from dspy.teleprompt.random_search import BootstrapFewShotWithRandomSearch
from datetime import datetime
import random
from dspy.teleprompt import MIPROv2
# from hackercup_utils import extract_code, run, check_solution

In [None]:
import re
import asyncio
import multiprocessing
import concurrent.futures
from typing import List, Optional

"""
Note that this code is largely based on the code here:
https://github.com/HackerCupAI/starter-kits/blob/main/submit_first_solution/01_one_shot.py

by @tcapelle, with some adaptations for this workflow.
"""

def extract_code(code_str):
    # Regex pattern to extract the code between ```python and ```
    pattern = r"```python\s*([\s\S]*?)\s*```"

    # Use re.search to find the code inside the code block
    match = re.search(pattern, code_str)

    if match:
        # Extract the matched group (the code part inside ```python ... ```)
        code = match.group(1).strip()
    else:
        # Fallback: Assume we still need to strip ``` and ```python if match fails
        code = code_str

    # Remove any leading/trailing ``` and ```python manually, in case the input doesn't match exactly
    code = re.sub(r"^```python\s*", "", code)  # Remove starting ```python
    code = re.sub(r"^```", "", code)  # Remove starting ```
    code = re.sub(r"```$", "", code)  # Remove ending ```

    return code.strip()


import traceback


def run_with_timeout(code: str, input: Optional[str], timeout: int):
    def target_fn(input, return_dict):
        vars = {}
        try:
            exec(code, vars)
            fn = vars.get("solve", lambda x: x)
            return_dict["result"] = fn(input)
        except Exception as e:
            return_dict["error"] = str(e)
            return_dict["stack_trace"] = (
                traceback.format_exc()
            )  # Capture the stack trace

    manager = multiprocessing.Manager()
    return_dict = manager.dict()

    process = multiprocessing.Process(target=target_fn, args=(input, return_dict))
    process.start()
    process.join(timeout)  # Wait for the process to finish or timeout

    if process.is_alive():
        process.terminate()
        process.join()
        return {
            "error": f"Execution exceeded the timeout of {timeout} seconds",
            "result": None,
            "stack_trace": None,
        }

    if "error" in return_dict:
        # Return the error and the stack trace for feedback
        return {
            "error": return_dict["error"],
            "result": None,
            "stack_trace": return_dict.get("stack_trace"),
        }

    return {"result": return_dict.get("result"), "error": None, "stack_trace": None}


async def arun(
    code: Optional[str] = None, input: Optional[str] = None, timeout: int = 60
):
    loop = asyncio.get_running_loop()
    try:
        with concurrent.futures.ThreadPoolExecutor() as executor:
            future = loop.run_in_executor(
                executor, run_with_timeout, code, input, timeout
            )
            result_dict = await asyncio.wait_for(future, timeout=timeout)
        return result_dict
    except asyncio.TimeoutError:
        return {
            "error": f"Function call timed out after {timeout} seconds",
            "result": None,
            "stack_trace": "Function call timed out. Code needs to be more efficient.",
        }
    except Exception as e:
        # Include "stack_trace" so callers can always index all three keys
        return {"error": str(e), "result": None, "stack_trace": None}


# Function to run code synchronously
def run(code: Optional[str] = None, input: Optional[str] = None, timeout: int = 5):
    return asyncio.run(arun(code, input, timeout))

# Function to check the solution
def check_solution(expected: str, actual: str) -> dict:
    "Check the solution against the expected output"
    matches = 0
    expected_lines = expected.strip().split("\n")
    actual_lines = actual.strip().split("\n")
    offending_cases = []
    for expected_line, actual_line in zip(expected_lines, actual_lines):
        expected_line = expected_line.strip()
        actual_line = actual_line.strip()

        if expected_line == actual_line:
            matches += 1  # +1 for the whole line match
        else:
            offending_cases.append((expected_line, actual_line))
    return {
        "matches": matches == len(expected_lines),
        "total": len(expected_lines),
        "offending_cases": offending_cases,
    }
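
`check_solution` pairs lines with `zip`, which stops at the shorter sequence, so extra trailing lines in the actual output are silently ignored. The sketch below shows one possible stricter comparison using `itertools.zip_longest`. `check_solution_strict` is a hypothetical name and is not used anywhere else in this notebook.

```python
from itertools import zip_longest


def check_solution_strict(expected: str, actual: str) -> dict:
    """Line-by-line comparison that also reports missing or extra lines."""
    expected_lines = expected.strip().split("\n")
    actual_lines = actual.strip().split("\n")
    offending_cases = []
    # zip_longest pads the shorter side with None, so a length mismatch
    # shows up as a (line, None) or (None, line) pair instead of vanishing.
    for expected_line, actual_line in zip_longest(expected_lines, actual_lines):
        exp = expected_line.strip() if expected_line is not None else None
        act = actual_line.strip() if actual_line is not None else None
        if exp != act:
            offending_cases.append((exp, act))
    return {
        "matches": not offending_cases,
        "total": len(expected_lines),
        "offending_cases": offending_cases,
    }


result = check_solution_strict(
    "Case #1: 1\nCase #2: 2",
    "Case #1: 1\nCase #2: 2\nCase #3: 3",
)
print(result["matches"])          # False
print(result["offending_cases"])  # [(None, 'Case #3: 3')]
```

The return value keeps the same keys as `check_solution`, so it could be swapped in without changing `format_mistakes`.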

In [None]:
### DEFINING HELPER FUNCTIONS ###

def format_mistakes(solution_results):
    offending_cases = ""
    for offending_case in solution_results["offending_cases"]:
        offending_cases += f"Incorrect Output: {offending_case[1]} -> Expected Output: {offending_case[0]}\n"
    return offending_cases


def get_expected_behavior_str(sample_input, sample_output):
    return f"""
input = {str(sample_input)} # input is of type {type(sample_input)}
output = solve(input)
print(output) # Output is of type string
# Prints: {sample_output}
"""


### DEFINE SIMPLE PIPELINE ###

class GenerateCodeSignature(Signature):
    """You are an expert problem solver. Your task is creating the code to solve the problem at hand in python.

    The program should have a single `solve` method that has the following signature:
    input: [str]: The same Input provided above
    output [str]: The same Output provided above

    Here's an example of the format we'd expect for a simple python program that adds 1 to a number:
    ```python def solve(x: int):\n    return x + 1```

    Note:
    * Do NOT print the Output, instead return it.
    * Make sure that your proposed solution is both time and memory efficient.
    """

    problem_description: str = InputField(format=str)
    expected_behavior: str = InputField(format=str)
    solution: str = OutputField(
        format=str, desc="A plan for how we should go about solving this problem."
    )
    python_program: str = OutputField(format=str)


class SimpleGenerateCode(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_code = dspy.Predict(GenerateCodeSignature)

    def forward(self, problem_description, sample_input, sample_output):
        expected_behavior = get_expected_behavior_str(sample_input, sample_output)
        python_code = extract_code(
            self.generate_code(
                problem_description=problem_description,
                expected_behavior=expected_behavior,
            ).python_program
        )

        return dspy.Prediction(solution=python_code)


### DEFINE ADVANCED PIPELINE ###

class GenerateCode(dspy.Module):
    def __init__(self, max_tries=2, num_ensembles=3):
        super().__init__()
        # Initialize variables
        self.max_tries = max_tries
        self.num_ensembles = num_ensembles

        # Initialize layers
        self.generate_code = dspy.ChainOfThought("problem_description, expected_behavior -> python_program", n=self.num_ensembles)
        self.fix_code = dspy.ChainOfThought("problem_description, current_code, expected_behavior, current_incorrect_results -> fixed_code")

    def forward(self, problem_description, sample_input, sample_output):

        expected_behavior = get_expected_behavior_str(sample_input, sample_output)
        solutions = self.generate_code(
            problem_description=problem_description, expected_behavior=expected_behavior
        )
        python_solutions = [
             extract_code(solution.python_program) for solution in solutions.completions
        ]

        for i, python_code in enumerate(python_solutions):
            for try_iter in range(self.max_tries):
                # Test our generated code, get a result
                result_dict = run(code=python_code, input=sample_input, timeout=5)
                error, result, stack_trace = (
                    result_dict["error"],
                    result_dict["result"],
                    result_dict["stack_trace"],
                )
                if error:  # Running code led to an exception, fix code
                    python_code = extract_code(
                        self.fix_code(
                            problem_description=problem_description,
                            current_code=python_code,
                            expected_behavior=expected_behavior,
                            current_incorrect_results=stack_trace,
                        ).fixed_code
                    )
                elif result is None:  # Nothing was returned by program
                    python_code = extract_code(
                        self.fix_code(
                            problem_description=problem_description,
                            current_code=python_code,
                            expected_behavior=expected_behavior,
                            current_incorrect_results="Nothing was returned!",
                        ).fixed_code
                    )
                elif not isinstance(result, str):  # Wrong type returned by program
                    python_code = extract_code(
                        self.fix_code(
                            problem_description=problem_description,
                            current_code=python_code,
                            expected_behavior=expected_behavior,
                            current_incorrect_results=f"Returned type {type(result)}, but the result should be a string.",
                        ).fixed_code
                    )
                elif check_solution(sample_output, result)[  # Found a solution!
                    "matches"
                ]:
                    print(
                        f"CORRECT SOLN FOUND! CODE OPTION {i}/{len(python_solutions)} | DEBUGGING ITER: {try_iter}/{self.max_tries-1}."
                        )
                    return dspy.Prediction(solution=python_code)
                else:  # Otherwise, we should be able to check the solution
                    solution_results = check_solution(sample_output, result)
                    # with dspy.context(lm=gpt4):
                    python_code = extract_code(
                        self.fix_code(
                            problem_description=problem_description,
                            current_code=python_code,
                            expected_behavior=expected_behavior,
                            current_incorrect_results=format_mistakes(solution_results),
                        ).fixed_code
                    )
        return dspy.Prediction(solution=python_code)

### OPTIMIZATION ###

def optimize_with_mipro(program, prompt_model, task_model, metric, trainset):
    teleprompter = MIPROv2(
        prompt_model=prompt_model,
        task_model=task_model,
        metric=metric,
        num_candidates=5,
        init_temperature=0.5,
        verbose=False,
        log_dir="/lfs/0/kristaoo/dspy/examples/functional/logs",
    )

    optimized_program = teleprompter.compile(
        program.deepcopy(),
        trainset=trainset,
        eval_kwargs=dict(num_threads=16),
        max_bootstrapped_demos=0, # 0-shot optimization
        max_labeled_demos=0,
        num_batches=20,
        minibatch=False, # turning this off because we have a small trainset already
        seed=9
    )

    now = datetime.now()
    date_time = now.strftime("%Y%m%d_%H%M%S")

    optimized_program.save(f"mipro_optimized_{date_time}")

    return optimized_program

def optimize_with_bootstrap_fewshot(program, task_model, teacher_model, metric, trainset):
    rs_optimizer = BootstrapFewShotWithRandomSearch(
        metric=metric,  # use the metric passed by the caller
        num_threads=8,
        num_candidate_programs=5,
        max_labeled_demos=0,
        max_bootstrapped_demos=2,
        max_errors=10000,
        teacher_settings=dict(lm=teacher_model)
    )

    optimized_program = rs_optimizer.compile(
        program,
        trainset=trainset,
    )

    now = datetime.now()
    date_time = now.strftime("%Y%m%d_%H%M%S")

    optimized_program.save(f"fewshot_optimized_{date_time}")


    return optimized_program

### DEFINING FUNCTION FOR TESTING CODE TO USE AS METRIC ###
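### test_code is a factory: it binds `timeout` and returns a closure with the
### (example, pred, trace=None) signature that Evaluate and the optimizers call.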
def test_code(timeout=5):
    def metric(example, pred, trace=None):
        if pred.solution is None:
            return 0
        solution_code = pred.solution
        result_dict = run(
            code=solution_code, input=example.sample_input, timeout=timeout
        )
        if not result_dict["result"]:
            return 0
        return int(
            check_solution(example.sample_output, result_dict["result"])["matches"]
        )

    return metric
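
`test_code(timeout=5)` returns a function rather than a score. That pattern lets a setting such as `timeout` be fixed once while the returned callable keeps the `(example, pred, trace=None)` signature used for metrics in this notebook. Here is a minimal sketch of the same closure pattern with a toy metric. `make_exact_match_metric` and the `SimpleNamespace` stand-ins are illustrative assumptions, not DSPy objects.

```python
from types import SimpleNamespace


def make_exact_match_metric(ignore_case=False):
    """Factory: binds `ignore_case` and returns a metric function."""

    def metric(example, pred, trace=None):
        # `ignore_case` is captured from the enclosing call,
        # just as `timeout` is captured inside test_code.
        expected, actual = example.sample_output, pred.solution
        if actual is None:
            return 0
        if ignore_case:
            expected, actual = expected.lower(), actual.lower()
        return int(expected == actual)

    return metric


strict = make_exact_match_metric()
lenient = make_exact_match_metric(ignore_case=True)

example = SimpleNamespace(sample_output="Case #1: YES")
pred = SimpleNamespace(solution="case #1: yes")
print(strict(example, pred), lenient(example, pred))  # 0 1
```

Each call to the factory yields an independent metric, so two evaluators can run with different settings side by side.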


In [None]:
### LOAD AND PREPARE DATA ###
ds = datasets.load_dataset("hackercupai/hackercup")

# Shuffle data
ds_full_list = list(ds["full"])
rng = random.Random(0)
rng.shuffle(ds_full_list)

# Format dataset to use in DSPy
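# Each list comprehension below builds one Example per dataset row that has a
# non-empty sample_input; with_inputs(...) marks which fields are passed to the
# program as inputs.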
sample_ds = [
    Example(
        problem_description=example["statement"],
        sample_input=example["sample_input"].strip().split("\n"),
        sample_output=example["sample_output"],
    ).with_inputs("problem_description", "sample_input", "sample_output")
    for example in ds["sample"]
    if example["sample_input"]
]

full_ds = [
    Example(
        problem_description=example["statement"],
        sample_input=example["sample_input"].strip().split("\n"),
        sample_output=example["sample_output"],
    ).with_inputs("problem_description", "sample_input", "sample_output")
    for example in ds_full_list
    if example["sample_input"]
]

trainset = sample_ds + full_ds[0:40] # use sample in train because it's easier
testset = full_ds[40:60]

# Configure our dspy settings (particularly LM we're using)
lm = dspy.OpenAI(
    model="gpt-4o-mini-2024-07-18", # Note: didn't find much of a difference between mini & full gpt-4o
    max_tokens=4000,
    temperature=0.1,
)

dspy.settings.configure(lm=lm)
dspy.configure(experimental=True)

# Setup evaluation function
evaluate = Evaluate(
    devset=testset,
    num_threads=16, # Note: Set this to 1 for debugging purposes
    display_progress=True,
    display_table=5,
    metric=test_code(timeout=5)
)
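
The `sample_ds` and `full_ds` definitions above are list comprehensions with a trailing `if` filter. The sketch below shows the same shape next to an equivalent explicit loop. It uses plain dicts as stand-ins for `dspy.Example` and made-up rows, so it runs without the dataset.

```python
# Made-up rows standing in for entries of the Hugging Face dataset.
rows = [
    {"statement": "Add one", "sample_input": "2\n1\n5\n", "sample_output": "Case #1: 2\nCase #2: 6"},
    {"statement": "No samples", "sample_input": "", "sample_output": ""},
]

# Comprehension form, mirroring the cell above.
as_comprehension = [
    {
        "problem_description": row["statement"],
        "sample_input": row["sample_input"].strip().split("\n"),
        "sample_output": row["sample_output"],
    }
    for row in rows
    if row["sample_input"]  # skip rows whose sample_input is empty
]

# Equivalent explicit loop.
as_loop = []
for row in rows:
    if not row["sample_input"]:
        continue
    as_loop.append(
        {
            "problem_description": row["statement"],
            "sample_input": row["sample_input"].strip().split("\n"),
            "sample_output": row["sample_output"],
        }
    )

print(as_comprehension == as_loop)          # True
print(as_comprehension[0]["sample_input"])  # ['2', '1', '5']
```

Note that `sample_input` becomes a list of lines, which is why the generated `solve` functions in the results index into `input[0]`, `input[1]`, and so on.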

In [None]:
# Try out a simple program (author's note: 7.5% on 40 examples; the run below scores 15.0% on the 20-example test set)
simple_program = SimpleGenerateCode()
print("Evaluating Simple Program on test...")
evaluate(program=simple_program, devset=testset)

Evaluating Simple Program on test...


Average Metric: 3 / 20  (15.0): 100%|██████████| 20/20 [00:05<00:00,  3.85it/s]


Unnamed: 0,problem_description,sample_input,sample_output,solution,metric
0,"Alice and Bob are servers at *Nim Sum Dim Sum*, a bustling dumpling restaurant. For a staff meal, the manager has generously provided \(N\) plates...","['6', '3', '4 1 2', '2', '1 1', '2', '2 4', '3', '1 3 2', '6', '2 2 3 3 4 4', '8', '6 2...",Case #1: 6 Case #2: 0 Case #3: 2 Case #4: 0 Case #5: 0 Case #6: 19,"def solve(input: list) -> str: T = int(input[0]) results = [] index = 1 for case_number in range(1, T + 1): N = int(input[index]) A...",
1,"**Note: The only difference between this chapter and [chapter 2](https://www.facebook.com/codingcompetitions/hacker-cup/2022/round-1/problems/A2) is that here, all card values are guaranteed to be distinct and only up to...","['4', '5 1', '5 1 2 4 3', '2 4 3 5 1', '4 10', '3 1 4 2', '1 2 3 4', '4 0',...",Case #1: YES Case #2: NO Case #3: NO Case #4: YES,"def solve(input: list) -> str: T = int(input[0]) results = [] index = 1 for t in range(1, T + 1): N, K = map(int,...",
2,"In the rapidly growing towns of Silicon Valley, traffic congestion is becoming a major problem. The governor has contracted Travis, the rock star traffic engineer,...","['4', '2 2 999 999', '2 3 12 11', '4 3 6 6', '50 50 1 1']",Case #1: Possible 333 333 333 333 Case #2: Possible 5 3 1 3 4 3 Case #3: Possible 1 1 1 1 2 1...,"def solve(input: list) -> str: T = int(input[0]) results = [] for i in range(1, T + 1): N, M, A, B = map(int, input[i].split())...",
3,"Alfredo Spaghetti really likes soup, especially when it contains alphabet pasta. Every day he constructs a sentence from letters, places the letters into a bowl...","['5', 'WELCOME TO FACEBOOK HACKERCUP', 'CUP WITH LABEL HACKERCUP BELONGS TO HACKER', 'QUICK CUTE BROWN FOX JUMPS OVER THE LAZY DOG', 'MOVE FAST BE BOLD',...",Case #1: 1 Case #2: 2 Case #3: 1 Case #4: 0 Case #5: 1,"def solve(input: list) -> str: from collections import Counter target_word = ""HACKERCUP"" target_count = Counter(target_word) T = int(input[0]) results = [] for i in range(1,...",✔️ [1]
4,"Every week you take a socially-distanced trip to Overwaitea, your favourite grocery store, to stock up on food. Out front, there are \(N\) shopping carts,...","['5', '1', '2 3', '3', '4 5', '1 4', '5 1', '4', '3 2', '4 6', '5 5', '1 3', '6', '5 6', '3 5',...",Case #1: 11 3 2 Case #2: 190 20 0 16 30 Case #3: 560 0 5 0 24 36 Case #4: 1900 67 0...,"def solve(input: list) -> str: from collections import defaultdict T = int(input[0]) index = 1 results = [] for case_number in range(1, T + 1):...",


15.0
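
The `solution` column above holds source code as a string. `run_with_timeout` turns that string into a callable by `exec`-ing it into a fresh namespace and looking up `solve`. A minimal sketch of just that step follows, without the subprocess and timeout handling; the generated code here is a made-up example, not model output.

```python
# Made-up stand-in for a model-generated program.
generated_code = """
def solve(lines):
    total = sum(int(x) for x in lines[1:])
    return f"Case #1: {total}"
"""

namespace = {}
exec(generated_code, namespace)  # defines `solve` inside `namespace`

# Same lookup as run_with_timeout: fall back to the identity function
# when the program did not define `solve`.
solve = namespace.get("solve", lambda x: x)
result = solve(["2", "3", "4"])
print(result)  # Case #1: 7

# With no `solve` defined, the fallback returns the input unchanged.
empty_namespace = {}
exec("x = 1", empty_namespace)
fallback = empty_namespace.get("solve", lambda x: x)
print(fallback(["2", "3", "4"]))  # ['2', '3', '4']
```

Because `exec` runs arbitrary code, the notebook executes it in a separate process with a timeout rather than inline as shown here.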

In [None]:
# Try out more advanced pipeline | ~25%
multi_stage_program = GenerateCode(max_tries=3, num_ensembles=3)
print("Evaluating Multi-Stage Program on test...")
evaluate(program=multi_stage_program, devset=testset)

Evaluating Multi-Stage Program on test...


  0%|          | 0/20 [00:00<?, ?it/s]

Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: 0
Case #2: 12
Case #3: 2500
Case #4: 499999888
Case #5: 83924361
Case #1: 1
Case #2: 0
Case #3: 0
Case #4: 1
Case #5: 5Case #1: NO
Case #2: NO
Case #3: NO
Case #4: YES

Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 2
Case #2: 3
Case #3: -1
Case #1: 2
Case #2: 2
Case #3: 3
Case #4: 3
Case #5: 5
Case #1: 16
Case #2: 12
Case #3: 4
Case #4: 28
Case #1: 1
Case #2: 2
Case #3: 1
Case #4: 0
Case #5: 1
Case #1: 1 2
Case #2: 2 4 6 8
Case #3: 3 6 9 12 15
Case #4: 4 8 12 16 20 24 28
Case #5: 5 10 15 20 25 30 35 40 45 50 55Case #1: NYYY
Case #2: YNYYY
Case #3: YYYY
Case #4: NNYYYYNNY
Case #5: YYYYYYYYYN
Case #6: NNNNNNYNNNNNNNNNNNNNNNNNNNNYNN

Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: 5
Case #2: 14
Case #3: 20
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/2.
Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: 5
Case #2: 14
Case #3: 20
Case #1: 28

Average Metric: 1 / 1  (100.0):   5%|▌         | 1/20 [00:08<02:34,  8.11s/it]

Case #1: -1
Case #2: -1
Case #3: 4
Case #4: -1
Case #5: -1
Case #1: 28
Case #2: 32
Case #3: 32
Case #4: 809
Case #5: 4600
Case #1: 2
Case #2: 3
Case #3: 4
Case #4: 4
Case #5: -1
Case #1: 28
Case #2: 32
Case #3: 32
Case #4: 809
Case #5: 4600
Case #1: 2
Case #2: 3
Case #3: 4
Case #4: 4
Case #5: -1
Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: 2
Case #2: 5
Case #3: -1
Case #4: -1
Case #5: -1


Average Metric: 1 / 2  (50.0):  10%|█         | 2/20 [00:11<01:32,  5.16s/it]

Case #1: 6
Case #2: 10
Case #3: 15
Case #4: 3
Case #5: 9
Case #1: 2
Case #2: 3
Case #3: -1
Case #4: -1
Case #5: -1
Case #1: 7
Case #2: 16
Case #3: 9
Case #4: 15
Case #5: 8
Case #1: 0
Case #2: 9
Case #3: 1250
Case #4: 499999972
Case #5: 440560626
CORRECT SOLN FOUND! CODE OPTION 1/3 | DEBUGGING ITER: 0/2.


Average Metric: 2 / 3  (66.7):  15%|█▌        | 3/20 [00:13<01:06,  3.89s/it]

Case #1: NO
Case #2: NO
Case #3: NO
Case #4: NO
Case #1: 2 1
Case #2: 2 2
Case #3: 6 1
Case #4: 0 0
Case #5: 14 5
Case #6: 660 10
Case #1: 1
Case #2: 0
Case #3: 1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 0 1
Case #2: 0 27 27 27
Case #3: 0 64 64 64 64
Case #4: 0 216 216 216 216 216 216
Case #5: 0 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000
Case #1: 1
Case #2: 2
Case #3: 1
Case #4: 0
Case #5: 1
Case #1: 2
Case #2: 3
Case #3: -1
Case #1: 1
Case #2: 0
Case #3: 0
Case #4: 1
Case #5: 5
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 1/2.
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 1/2.
Case #1: 2
Case #2: 3
Case #3: -1Case #1: 1
Case #2: 2
Case #3: 1
Case #4: 0
Case #5: 1

Case #1: 1
Case #2: 1
Case #3: 2
Case #4: 2
Case #5: 1


Average Metric: 4 / 5  (80.0):  20%|██        | 4/20 [00:17<01:04,  4.06s/it]

Case #1: 3
Case #2: 1
Case #3: 1
Case #4: 3
Case #5: 5
Case #1: 2 1
Case #2: 2 2
Case #3: 6 1
Case #4: 0 0
Case #5: 14 5
Case #6: 660 10
Case #1: 0
Case #2: 6
Case #3: 500
Case #4: 900000000
Case #5: 244056064
Case #1: 16
Case #2: 12
Case #3: 4
Case #4: 29
Case #1: 1
Case #2: 1
Case #3: 1
Case #4: 1
Case #5: 2Case #1: 2 1
Case #2: 2 2
Case #3: 6 1
Case #4: 0 0
Case #5: 14 12
Case #6: 660 38

Case #1: NYYY
Case #2: YNYYY
Case #3: YYYY
Case #4: NNYYYYNNY
Case #5: YYYYYYYYYN
Case #6: NNNNNNYNNNNNNNNNNNNNNNNNNNNYNN
Case #1: 30
Case #2: 4
Case #3: 3
Case #4: 16
Case #5: 43
Case #1: 4
Case #2: 5
Case #3: 6
Case #4: 1
Case #5: 3
Case #1: 1
Case #2: 0
Case #3: 1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 3
Case #2: 0
Case #3: 0
Case #4: 3
Case #5: 5
Case #1: 30
Case #2: 4
Case #3: 3
Case #4: 16
Case #5: 43
Case #1: 0
Case #2: 0
Case #3: 1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 1
Case #2: 2
Case #3: 1
Case #4: 3
Case #5: 2
Case #6: 3
Case #7: 7
Case #1: 1
Case #2: 1
Case #3: 1
Case #4: 1
Case

Average Metric: 4 / 6  (66.7):  30%|███       | 6/20 [00:29<01:09,  4.99s/it]

Case #1: 5
Case #2: 2
Case #3: 5
Case #4: 1
Case #5: 13

Case #1: 1
Case #2: 0
Case #3: 1
Case #4: 0
Case #5: 0
Case #6: 0Case #1: YES
Case #2: YES
Case #3: NO
Case #4: YES
Case #1: 0
Case #2: 4
Case #3: 20
Case #4: 165
Case #5: 1330
Case #1: 5
Case #2: 2
Case #3: 5
Case #4: 1
Case #5: 13

Average Metric: 4 / 7  (57.1):  35%|███▌      | 7/20 [00:32<00:56,  4.33s/it]


Case #1: 0
Case #2: 6
Case #3: 40
Case #4: 405
Case #5: 3610
Case #1: 5
Case #2: 2
Case #3: 5
Case #4: 1
Case #5: 10
Case #1: YES
Case #2: YES
Case #3: NO
Case #4: YES
Case #1: NYYY
Case #2: YNYYY
Case #3: YYYY
Case #4: NNYYYYNNY
Case #5: YYYYYYYYYN
Case #6: NNNNNNYNNNNNNNNNNNNNNNNNNNNYNN
Case #1: 1
Case #2: 1
Case #3: 1
Case #4: 2
Case #5: 2
Case #6: 2
Case #7: 6
Case #1: NO
Case #2: NO
Case #3: NO
Case #4: NO


Average Metric: 4 / 8  (50.0):  40%|████      | 8/20 [00:37<00:55,  4.59s/it]

Case #1: 3
Case #2: 2
Case #3: 4
Case #4: 1
Case #5: 2
Case #1: 0
Case #2: 54
Case #3: 25000
Case #4: 999997172
Case #5: 413036711
Case #1: NYYY
Case #2: YNYNY
Case #3: NYYY
Case #4: NYNNYYNNY
Case #5: NYNNNYNYYN
Case #6: NNNNNNNNNNNYNNNNNYNNNNNNNNNYNN


Average Metric: 4 / 9  (44.4):  45%|████▌     | 9/20 [00:38<00:40,  3.67s/it]

Case #1: NO
Case #2: NO
Case #3: NO
Case #4: NO
Case #1: 50
Case #2: 5
Case #3: 3
Case #4: 19
Case #5: 52

Case #1: 0
Case #2: 18
Case #3: 2500
Case #4: 999999944
Case #5: 881121252
Case #1: 3
Case #2: 2
Case #3: 7
Case #4: 1
Case #5: 3Case #1: NYYY
Case #2: YNYNY
Case #3: NYYY
Case #4: NYNNYYNNY
Case #5: NYNNNYNYYN
Case #6: NNNNNNNNNNNYNNNNNYNNNNNNNNNYNN


Average Metric: 4 / 10  (40.0):  50%|█████     | 10/20 [00:40<00:30,  3.04s/it]

Case #1: 70
Case #2: 6
Case #3: 4
Case #4: 21
Case #5: 58
Case #1: 1
Case #2: 1
Case #3: 0
Case #4: 2
Case #5: 1
Case #6: 2
Case #7: 7
Case #1: NO
Case #2: NO
Case #3: NO
Case #4: NO


Average Metric: 4 / 11  (36.4):  55%|█████▌    | 11/20 [00:41<00:22,  2.54s/it]

Case #1: 70
Case #2: 7
Case #3: 6
Case #4: 35
Case #5: 120
Case #1: NO
Case #2: NO
Case #3: NO
Case #4: NO
Case #1: 3
Case #2: 2
Case #3: 7
Case #4: 1
Case #5: 3Case #1: 1
Case #2: 1
Case #3: 0
Case #4: 2
Case #5: 1
Case #6: 2
Case #7: 7

Case #1: Possible
998 998
1 1
Case #1: Possible
10 1 9
1 1 1
Case #3: Possible
2 1 2
1 1 1
1 1 1
1 1 1
Case #4: Impossible
Case #1: 30
Case #2: 4
Case #3: 4
Case #4: 19
Case #5: 58


Average Metric: 4 / 12  (33.3):  60%|██████    | 12/20 [00:43<00:20,  2.50s/it]

Case #1: 3
Case #2: 2
Case #3: 3
Case #4: 1
Case #5: 2


Average Metric: 4 / 13  (30.8):  65%|██████▌   | 13/20 [00:45<00:15,  2.24s/it]

Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: 30
Case #2: 4
Case #3: 4
Case #4: 19
Case #5: 58

Average Metric: 4 / 14  (28.6):  70%|███████   | 14/20 [00:46<00:10,  1.77s/it]


Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0


Average Metric: 4 / 16  (25.0):  80%|████████  | 16/20 [00:50<00:07,  1.85s/it]

Case #1: 10
Case #2: inf
Case #3: 1
Case #4: inf
Case #1: Possible
998 998
1 1
Case #1: Possible
10 1 9
1 1 1
Case #3: Possible
2 1 2
1 1 1
1 1 1
1 1 1
Case #4: Impossible
Case #1: 8
Case #2: 6
Case #3: 1
Case #4: 16
CORRECT SOLN FOUND! CODE OPTION 2/3 | DEBUGGING ITER: 1/2.
Case #1: 8
Case #2: 6
Case #3: 1
Case #4: 16


Average Metric: 5 / 18  (27.8):  90%|█████████ | 18/20 [00:56<00:04,  2.09s/it]

Case #1: 0 3 6
Case #2: 0 10 20 30 40
Case #3: 0 15 30 45 60 75
Case #4: 0 28 56 84 112 140 168 196
Case #5: 0 66 132 198 264 330 396 462 528 594 660 726
Case #1: 0 3 6
Case #2: 0 10 20 30 40
Case #3: 0 15 30 45 60 75
Case #4: 0 28 56 84 112 140 168 196
Case #5: 0 66 132 198 264 330 396 462 528 594 660 726
Case #1: 0 3 6
Case #2: 0 10 10 10 20
Case #3: 0 15 15 15 15 30
Case #4: 0 28 28 28 28 28 28 56
Case #5: 0 66 66 66 66 66 66 66 66 66 66 132
Case #1: 3 2 11
Case #2: 3 2 0 0 11
Case #3: 3 2 0 0 0 11
Case #4: 3 2 0 0 0 0 0 11
Case #5: 3 2 0 0 0 0 0 0 0 0 0 11
Case #1: 4 4 8
Case #2: 16 48 48 48 96
Case #3: 25 100 100 100 100 200
Case #4: 49 294 294 294 294 294 294 588
Case #5: 121 1210 1210 1210 1210 1210 1210 1210 1210 1210 1210 2420
Case #1: Possible
998 1
998 1
Case #2: Possible
10 1 1
9 1 1
Case #3: Possible
2 1 1
1 1 1
1 1 1
2 1 1
Case #4: Impossible
Case #1: 0 4 12
Case #2: 0 48 48 48 112
Case #3: 0 80 80 80 80 305
Case #4: 0 294 294 294 294 294 294 637
Case #5: 0 1210 1210 1210

Average Metric: 5 / 19  (26.3):  95%|█████████▌| 19/20 [01:02<00:03,  3.40s/it]

Case #1: Possible
998 1
998 1
Case #2: Possible
10 1 1
9 1 1
Case #3: Possible
2 1 1
1 1 1
1 1 1
2 1 1
Case #4: Impossible


Average Metric: 5 / 20  (25.0): 100%|██████████| 20/20 [01:03<00:00,  3.15s/it]


Unnamed: 0,problem_description,sample_input,sample_output,solution,metric
0,"Alice and Bob are servers at *Nim Sum Dim Sum*, a bustling dumpling restaurant. For a staff meal, the manager has generously provided \(N\) plates...","['6', '3', '4 1 2', '2', '1 1', '2', '2 4', '3', '1 3 2', '6', '2 2 3 3 4 4', '8', '6 2...",Case #1: 6 Case #2: 0 Case #3: 2 Case #4: 0 Case #5: 0 Case #6: 19,"def solve(input): T = int(input[0]) results = [] for i in range(1, T + 1): N = int(input[2 * i - 1]) A = list(map(int,...",
1,"**Note: The only difference between this chapter and [chapter 2](https://www.facebook.com/codingcompetitions/hacker-cup/2022/round-1/problems/A2) is that here, all card values are guaranteed to be distinct and only up to...","['4', '5 1', '5 1 2 4 3', '2 4 3 5 1', '4 10', '3 1 4 2', '1 2 3 4', '4 0',...",Case #1: YES Case #2: NO Case #3: NO Case #4: YES,"def solve(input): T = int(input[0]) results = [] index = 1 for t in range(1, T + 1): N, K = map(int, input[index].split()) A =...",
2,"In the rapidly growing towns of Silicon Valley, traffic congestion is becoming a major problem. The governor has contracted Travis, the rock star traffic engineer,...","['4', '2 2 999 999', '2 3 12 11', '4 3 6 6', '50 50 1 1']",Case #1: Possible 333 333 333 333 Case #2: Possible 5 3 1 3 4 3 Case #3: Possible 1 1 1 1 2 1...,"def solve(input): from sys import stdout input = input[1:] # Skip the first line which is T results = [] for case_number, line in enumerate(input,...",
3,"Alfredo Spaghetti really likes soup, especially when it contains alphabet pasta. Every day he constructs a sentence from letters, places the letters into a bowl...","['5', 'WELCOME TO FACEBOOK HACKERCUP', 'CUP WITH LABEL HACKERCUP BELONGS TO HACKER', 'QUICK CUTE BROWN FOX JUMPS OVER THE LAZY DOG', 'MOVE FAST BE BOLD',...",Case #1: 1 Case #2: 2 Case #3: 1 Case #4: 0 Case #5: 1,"def solve(input): from collections import Counter # The target word we want to form target_word = ""HACKERCUP"" # Count the frequency of letters in the...",✔️ [1]
4,"Every week you take a socially-distanced trip to Overwaitea, your favourite grocery store, to stock up on food. Out front, there are \(N\) shopping carts,...","['5', '1', '2 3', '3', '4 5', '1 4', '5 1', '4', '3 2', '4 6', '5 5', '1 3', '6', '5 6', '3 5',...",Case #1: 11 3 2 Case #2: 190 20 0 16 30 Case #3: 560 0 5 0 24 36 Case #4: 1900 67 0...,"def solve(input): from collections import defaultdict T = int(input[0]) index = 1 results = [] for case_number in range(1, T + 1): N = int(input[index])...",


25.0
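
The `CORRECT SOLN FOUND! CODE OPTION i/n | DEBUGGING ITER: k/m` lines in the output above come from the nested loop in `GenerateCode.forward`: `i` is the zero-based candidate index out of `num_ensembles`, and `k` is the zero-based repair attempt out of `max_tries - 1`. The sketch below reproduces only that control flow, with integers standing in for programs. `debug_loop`, `is_correct`, and `fix` are hypothetical names; in the notebook, those roles are played by `run` plus `check_solution` and by the `fix_code` module.

```python
def debug_loop(candidates, is_correct, fix, max_tries=2):
    """Try each candidate; repair a failing one up to `max_tries` times, then move on."""
    code = None
    for i, code in enumerate(candidates):
        for try_iter in range(max_tries):
            if is_correct(code):
                print(f"found: option {i}/{len(candidates)}, iter {try_iter}/{max_tries - 1}")
                return code
            code = fix(code)
    # As in the notebook, the final repair is returned without being re-tested.
    return code


# Toy stand-ins: a "program" is an int, the correct one is 5, a fix adds 1.
winner = debug_loop([3, 10], lambda c: c == 5, lambda c: c + 1, max_tries=3)
print(winner)  # 5

# With only two tries, the repair that yields 5 is never tested,
# so the loop moves on and returns the last repair of the second candidate.
print(debug_loop([3, 10], lambda c: c == 5, lambda c: c + 1, max_tries=2))  # 12
```

The second call illustrates a property of this loop shape: the repair produced on the last try of a candidate is never checked before the next candidate starts.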

In [None]:
# Try out more advanced pipeline | ~30%
multi_stage_program = GenerateCode(max_tries=5, num_ensembles=5)
print("Evaluating Multi-Stage Program on test...")
evaluate(program=multi_stage_program, devset=testset)

Evaluating Multi-Stage Program on test...


  0%|          | 0/20 [00:00<?, ?it/s]

Case #1: 1
Case #2: 0
Case #3: 1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: NO
Case #2: NO
Case #3: NO
Case #4: NO
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 2
Case #5: 2
Case #6: 4
Case #7: 8
Case #1: 2
Case #2: 3
Case #3: -1
Case #1: 0
Case #2: 3
Case #3: 30
Case #4: 360
Case #5: 3420Case #1: 2
Case #2: 3
Case #3: 3
Case #4: 7
Case #5: 31

Case #1: 2
Case #2: 1
Case #3: 1
Case #4: 2
Case #5: 5
Case #1: 10
Case #2: 10
Case #3: 32
Case #4: 1
Case #5: 4600Case #1: Possible
333 333
333 333
Case #2: Possible
3 3 3
3 3 3
Case #3: Possible
1 1 1
1 1 1
1 1 1
1 1 1
Case #4: Impossible

Case #1: 1
Case #2: 2
Case #3: 1
Case #4: 0
Case #5: 1Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104

Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104
Case #1: 0
Case #2: 0
Case #3: 5
Case #4: 0
Case #1: 4
Case #2: 5
Case #3: 4
Case #4: 1
Case #5: 6Case #1: NYYY
Case #2: YNYNY
Case #3: NYYY
Case #4: NYNNYYNNY
Case #5: NYNNNYNYYN
Case #6: NNNNNNNNNNNYNN

Average Metric: 1 / 1  (100.0):   5%|▌         | 1/20 [00:07<02:18,  7.30s/it]

Case #1: -1
Case #2: -1
Case #3: 2
Case #4: -1
Case #5: 8
Case #1: 5
Case #2: 5
Case #3: 7
Case #4: 1
Case #5: 10
Case #1: 2
Case #2: 5
Case #3: 6
Case #4: 7
Case #5: -1
Case #1: 2
Case #2: 5
Case #3: 6
Case #4: 7
Case #5: -1
Case #1: 1
Case #2: 2
Case #3: 1
Case #4: 0
Case #5: 1
CORRECT SOLN FOUND! CODE OPTION 0/5 | DEBUGGING ITER: 1/4.
CORRECT SOLN FOUND! CODE OPTION 0/5 | DEBUGGING ITER: 1/4.
Case #1: 2
Case #2: 5
Case #3: 6
Case #4: 7
Case #5: -1
Case #1: Possible
333 333
333 333
Case #2: Possible
3 3 3
3 3 3
Case #3: Possible
1 1 1
1 1 1
1 1 1
1 1 1
Case #4: Impossible
Case #1: 1
Case #2: 2
Case #3: 1
Case #4: 0
Case #5: 1
Case #1: 4
Case #2: 4
Case #3: 4
Case #4: 3
Case #5: 9
Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104


Average Metric: 2 / 2  (100.0):  10%|█         | 2/20 [00:12<01:53,  6.28s/it]

Case #1: 2
Case #2: 5
Case #3: 6
Case #4: 7
Case #5: -1
Case #1: 7
Case #2: 2
Case #3: 5
Case #4: 3
Case #5: 17
Case #1: 0
Case #2: 1
Case #3: 6
Case #4: 36
Case #5: 171
Case #1: 3
Case #2: 5
Case #3: 5
Case #4: 5
Case #5: 10
Case #1: 1
Case #2: 1
Case #3: 0
Case #4: 2
Case #5: 1
Case #6: 2
Case #7: 9
CORRECT SOLN FOUND! CODE OPTION 0/5 | DEBUGGING ITER: 0/4.
Case #1: 2
Case #2: 3
Case #3: -1
Case #1: 5
Case #2: 2
Case #3: 5
Case #4: 1
Case #5: 13
CORRECT SOLN FOUND! CODE OPTION 0/5 | DEBUGGING ITER: 1/4.


Average Metric: 3 / 3  (100.0):  10%|█         | 2/20 [00:16<01:53,  6.28s/it]

CORRECT SOLN FOUND! CODE OPTION 0/5 | DEBUGGING ITER: 1/4.
Case #1: 3
Case #2: 5
Case #3: 5
Case #4: 5
Case #5: 10

Average Metric: 3 / 3  (100.0):  15%|█▌        | 3/20 [00:16<01:24,  4.99s/it]


Case #1: 10.0
Case #2: 10.0
Case #3: 20.0
Case #4: 1.0
Case #5: 3000.0
Case #1: 2
Case #2: 1
Case #3: 1
Case #4: 2
Case #5: 5

Case #1: 2
Case #2: 3
Case #3: -1
Case #1: 2
Case #2: 3
Case #3: 3
Case #4: 7
Case #5: 31
Case #1: 5
Case #2: 2
Case #3: 5
Case #4: 1
Case #5: 13


Average Metric: 4 / 4  (100.0):  20%|██        | 4/20 [00:18<01:04,  4.02s/it]

Case #1: 0
Case #2: 3
Case #3: 10
Case #4: 45
Case #5: 190
Case #1: -1
Case #2: -1
Case #3: -1
Case #4: -1
Case #5: -1

Case #1: 30
Case #2: 4
Case #3: 4
Case #4: 19
Case #5: 58
Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104
Case #1: 10
Case #2: 0
Case #3: 1
Case #4: 0
Case #1: 5
Case #2: 2
Case #3: 5
Case #4: 1
Case #5: 13
Case #1: -1
Case #2: -1
Case #3: -1
Case #4: -1
Case #5: -1
Case #1: 30
Case #2: 4
Case #3: 4
Case #4: 19
Case #5: 58
Case #1: 6
Case #2: 3
Case #3: 6
Case #4: 2
Case #5: 14
Case #1: NYYY
Case #2: YNYNY
Case #3: NYYY
Case #4: NYNNYYNNY
Case #5: NYNNNYNYYN
Case #6: NNNNNNNNNNNYNNNNNYNNNNNNNNNYNN
Case #1: -1
Case #2: -1
Case #3: -1
Case #4: -1
Case #5: -1
Case #1: 0
Case #2: 2
Case #3: 24
Case #4: 362880
Case #5: 557316307
Case #1: 30
Case #2: 4
Case #3: 4
Case #4: 19
Case #5: 58
Case #1: 5
Case #2: 2
Case #3: 5
Case #4: 1
Case #5: 13
Case #1: 1
Case #2: 2
Case #3: 5
Case #4: 3
Case #5: 13
Case #1: 6
Case #2: 3
Case #3: 6
Case #4: 2
Case

Average Metric: 5 / 5  (100.0):  25%|██▌       | 5/20 [00:51<03:33, 14.23s/it]

Case #1: 0
Case #2: 12
Case #3: 7680
Case #4: 940071351
Case #5: 815596301
Case #1: 30
Case #2: 6
Case #3: 7
Case #4: 31
Case #5: 125
Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: Impossible
Case #2: Impossible
Case #3: Impossible
Case #4: Impossible
Case #1: 0
Case #2: 2
Case #3: 72
Case #4: 2903040
Case #5: 31693456
Case #1: 2
Case #2: 3
Case #3: 3
Case #4: 7
Case #5: 31

Case #1: 30
Case #2: 6
Case #3: 7
Case #4: 31
Case #5: 125
Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: 1
Case #2: 1
Case #3: 2
Case #4: 1
Case #5: 1
Case #1: 0
Case #2: 4
Case #3: 96
Case #4: 3265920
Case #5: 589009763
Case #1: 30
Case #2: 6
Case #3: 7
Case #4: 31
Case #5: 125
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 2
Case #5: 3
Case #1: 2
Case #2: 3
Case #3: 3
Case #4: 2
Case #5: 2
Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case #5: 1000
Case #1: 30
Case #2: 6
Case #3: 7
Case #4: 31
Case #5: 125
Case #1: 0
Case #2: 4
Case #3: 96
Case #4: 3265920
Case #5: 589009

Average Metric: 5 / 6  (83.3):  30%|███       | 6/20 [01:33<05:31, 23.65s/it]

Case #1: 5 5
Case #2: 2 2
Case #3: 12 2
Case #4: 0 0
Case #5: 16 9
Case #6: 728 325
Case #1: 4
Case #2: 2
Case #3: 2
Case #4: 4
Case #5: 7
Case #1: -1
Case #2: 1
Case #3: -1
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104
Case #1: YES
Case #2: NO
Case #3: YES
Case #4: YES
Case #1: 5 5
Case #2: 2 2
Case #3: 12 2
Case #4: 0 0
Case #5: 16 9
Case #6: 728 325
Case #1: 4
Case #2: 2
Case #3: 2
Case #4: 4
Case #5: 7
Case #1: 0
Case #2: 4
Case #3: 141
Case #4: 4420891
Case #5: 235397979
Case #1: -1
Case #2: 1
Case #3: -1
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104
Case #1: 5 5
Case #2: 2 2
Case #3: 12 2
Case #4: 0 0
Case #5: 16 9
Case #6: 728 325
Case #1: 0
Case #2: 4
Case #3: 141
Case #4: 4420891
Case #5: 235397979
Case #1: 3
Case #2: 2
Case #3: 2
Case #4: 4
Case #5: 6
Case #1: -1
Case #2: 1
Case #3: -1
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104
Case #1: 5 2
Case #2: 2 2
Case #3: 12 2
Case #4: 0 0
Case #5: 16 6
Case #6: 728 11
Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 8
Case #5: 6
Case #6: 

ERROR:dspy.evaluate.evaluate:2024-09-25T06:41:38.667219Z [error    ] Error for example in dev set: 'list' object has no attribute 'strip'. Set `provide_traceback=True` to see the stack trace. [dspy.evaluate.evaluate] filename=evaluate.py lineno=203
Average Metric: 5.0 / 7  (71.4):  35%|███▌      | 7/20 [01:42<04:06, 18.95s/it]

Case #1: 0
Case #2: 1
Case #3: 30
Case #4: 493200
Case #5: 823872173
Case #1: 5 2
Case #2: 2 2
Case #3: 12 2
Case #4: 0 0
Case #5: 16 6
Case #6: 728 11
Case #1: YES
Case #2: YES
Case #3: YES
Case #4: YES
Case #1: 4
Case #2: 2
Case #3: 2
Case #4: 4
Case #5: 6
Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104
Case #1: 6
Case #2: 2
Case #3: 5
Case #4: 6
Case #5: 18
Case #6: 36
Case #1: 0
Case #2: 1
Case #3: 30
Case #4: 493200
Case #5: 823872173
Case #1: 5 2
Case #2: 2 2
Case #3: 12 2
Case #4: 0 0
Case #5: 16 6
Case #6: 728 11
Case #1: Possible
998 998
1 1
Case #2: Possible
10 1 9
1 1 1
Case #3: Possible
2 1 2
1 1 1
1 1 1
1 1 1
Case #4: Impossible
Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104
Case #1: 4
Case #2: 2
Case #3: 2
Case #4: 4
Case #5: 7
Case #1: 1
Case #2: 1
Case #3: 1
Case #4: 1
Case #5: 1
Case #6: 2
Case #7: 7
Case #1: 2
Case #2: 3
Case #3: 3
Case #4: 7
Case #5: 31
Case #1: 10
Case #2: 10
Case #3: 10
Case #4: 1
Case 

Average Metric: 5.0 / 8  (62.5):  40%|████      | 8/20 [01:53<03:15, 16.33s/it]

Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 8
Case #5: 6
Case #6: 3
Case #7: 104
Case #1: 0
Case #2: 1
Case #3: 30
Case #4: 493200
Case #5: 823872173


Average Metric: 5.0 / 11  (45.5):  55%|█████▌    | 11/20 [01:55<00:58,  6.46s/it]

Case #1: Possible
998 998
1 1
Case #2: Possible
10 1 9
1 1 1
Case #3: Possible
2 1 2
1 1 1
1 1 1
1 1 1
Case #4: Impossible
Case #1: 6
Case #2: 2
Case #3: 5
Case #4: 6
Case #5: 18
Case #6: 36
Case #1: NO
Case #2: NO
Case #3: NO
Case #4: NO
Case #1: 2
Case #2: 2
Case #3: 2
Case #4: 2
Case #5: 2


Average Metric: 5.0 / 12  (41.7):  60%|██████    | 12/20 [02:01<00:49,  6.23s/it]

Case #1: 2
Case #2: 5
Case #3: 8
Case #4: 7
Case #5: 19
Case #1: 2
Case #2: 5
Case #3: 8
Case #4: 7
Case #5: 19


Average Metric: 5.0 / 13  (38.5):  65%|██████▌   | 13/20 [02:04<00:38,  5.54s/it]

Case #1: 1
Case #2: 0
Case #3: 1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 2
Case #2: 2
Case #3: 2
Case #4: 3
Case #5: 7
Case #6: 6
Case #1: 1
Case #2: 0
Case #3: 1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 2
Case #2: 0
Case #3: 2
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 0 0
Case #2: 0 0 0 0
Case #3: 0 0 0 0 0
Case #4: 0 0 0 0 0 0 0
Case #5: 0 0 0 0 0 0 0 0 0 0 0
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #5: 0
Case #6: 1
Case #7: 1


Average Metric: 5.0 / 14  (35.7):  70%|███████   | 14/20 [02:09<00:31,  5.24s/it]

Case #1: 1
Case #2: 0
Case #3: 1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #5: 0
Case #6: 1
Case #7: 1
Case #1: 2
Case #2: 0
Case #3: 2
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #5: 0
Case #6: 1
Case #7: 1


Average Metric: 5.0 / 15  (33.3):  75%|███████▌  | 15/20 [02:11<00:21,  4.30s/it]

Case #1: Impossible
Case #2: Impossible
Case #3: Impossible
Case #4: Impossible
Case #1: 2
Case #2: 3
Case #3: 3
Case #4: 5
Case #5: 9
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #5: 0
Case #6: 1
Case #7: 1
Case #1: Impossible
Case #2: Impossible
Case #3: Possible
2 1 1
1 1 1
1 1 1
2 1 1
Case #4: Impossible
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #5: 0
Case #6: 1
Case #7: 1
Case #1: Impossible
Case #2: Impossible
Case #3: Possible
2 1 1
1 1 1
1 1 1
2 1 1
Case #4: Impossible


Average Metric: 5.0 / 17  (29.4):  85%|████████▌ | 17/20 [02:15<00:09,  3.21s/it]

Case #1: 2
Case #2: 3
Case #3: 3
Case #4: 5
Case #5: 9
Case #1: 1 1
Case #2: 2 2
Case #3: 1 1
Case #4: 0 0
Case #5: 12 12
Case #6: 38 38
Case #1: 2
Case #2: 3
Case #3: 3
Case #4: 5
Case #5: 9
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 6
Case #5: 30
Case #1: 1
Case #2: 2
Case #3: 3
Case #4: 9
Case #5: 36
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 8
Case #5: 30
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 8
Case #5: 30
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 8
Case #5: 30
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 8
Case #5: 30


Average Metric: 5.0 / 18  (27.8):  90%|█████████ | 18/20 [02:37<00:16,  8.46s/it]

Case #1: 1 1
Case #2: 2 2
Case #3: 1 1
Case #4: 0 0
Case #5: 12 12
Case #6: 38 38
Case #1: 1 1
Case #2: 2 2
Case #3: 1 1
Case #4: 0 0
Case #5: 12 12
Case #6: 38 38
Case #1: 1 1
Case #2: 2 2
Case #3: 1 1
Case #4: 0 0
Case #5: 12 12
Case #6: 38 38
Case #1: 3 2
Case #2: 3 2 -1 11
Case #3: 3 2 -1 -1 11
Case #4: 3 2 -1 -1 -1 -1 11
Case #5: 3 2 -1 -1 -1 -1 -1 -1 -1 -1 11
Case #1: 3 2
Case #2: 3 2 -1 11
Case #3: 3 2 -1 -1 11
Case #4: 3 2 -1 -1 -1 -1 11
Case #5: 3 2 -1 -1 -1 -1 -1 -1 -1 -1 11
Case #1: 0 0
Case #2: 0 0
Case #3: 0 0
Case #4: 0 0
Case #5: 0 0
Case #6: 0 0
Case #1: 3 2
Case #2: 3 2 -1 11
Case #3: 3 2 -1 -1 11
Case #4: 3 2 -1 -1 -1 -1 11
Case #5: 3 2 -1 -1 -1 -1 -1 -1 -1 -1 11
Case #1: 0 0
Case #2: 0 0
Case #3: 0 0
Case #4: 0 0
Case #5: 0 0
Case #6: 0 0
Case #1: 0 0
Case #2: 0 0
Case #3: 0 0
Case #4: 0 0
Case #5: 0 0
Case #6: 0 0
Case #1: 0 0
Case #2: 0 0
Case #3: 0 0
Case #4: 0 0
Case #5: 0 0
Case #6: 0 0


Average Metric: 5.0 / 19  (26.3):  95%|█████████▌| 19/20 [02:52<00:10, 10.44s/it]

Case #1: 3 2
Case #2: 3 2 1 0
Case #3: 3 2 3 1 0
Case #4: 3 2 10 6 3 1 0
Case #5: 3 2 36 28 21 15 10 6 3 1 0
Case #1: 3 2
Case #2: 3 2 1 0
Case #3: 3 2 3 1 0
Case #4: 3 2 10 6 3 1 0
Case #5: 3 2 36 28 21 15 10 6 3 1 0
Case #1: 3 2
Case #2: 3 2 1 0
Case #3: 3 2 3 1 0
Case #4: 3 2 10 6 3 1 0
Case #5: 3 2 36 28 21 15 10 6 3 1 0
Case #1: 3 2
Case #2: 3 2 1 0
Case #3: 3 2 3 1 0
Case #4: 3 2 10 6 3 1 0
Case #5: 3 2 36 28 21 15 10 6 3 1 0
Case #1: 3 2
Case #2: 3 2 1 0
Case #3: 3 2 3 1 0
Case #4: 3 2 10 6 3 1 0
Case #5: 3 2 36 28 21 15 10 6 3 1 0
Case #1: 3 2
Case #2: 3 2 1 0
Case #3: 3 2 3 1 0
Case #4: 3 2 10 6 3 1 0
Case #5: 3 2 36 28 21 15 10 6 3 1 0


Average Metric: 5.0 / 20  (25.0): 100%|██████████| 20/20 [02:54<00:00,  8.72s/it]


Unnamed: 0,problem_description,sample_input,sample_output,solution,metric
0,"Alice and Bob are servers at *Nim Sum Dim Sum*, a bustling dumpling restaurant. For a staff meal, the manager has generously provided \(N\) plates...","['6', '3', '4 1 2', '2', '1 1', '2', '2 4', '3', '1 3 2', '6', '2 2 3 3 4 4', '8', '6 2...",Case #1: 6 Case #2: 0 Case #3: 2 Case #4: 0 Case #5: 0 Case #6: 19,"def nim_sum(a): result = 0 for num in a: result ^= num return result def calculate_winning_moves(N, A): total_winning_moves = 0 for i in range(N): for...",
1,"**Note: The only difference between this chapter and [chapter 2](https://www.facebook.com/codingcompetitions/hacker-cup/2022/round-1/problems/A2) is that here, all card values are guaranteed to be distinct and only up to...","['4', '5 1', '5 1 2 4 3', '2 4 3 5 1', '4 10', '3 1 4 2', '1 2 3 4', '4 0',...",Case #1: YES Case #2: NO Case #3: NO Case #4: YES,"def solve(input): T = int(input[0]) results = [] index = 1 for t in range(1, T + 1): N, K = map(int, input[index].split()) A =...",
2,"In the rapidly growing towns of Silicon Valley, traffic congestion is becoming a major problem. The governor has contracted Travis, the rock star traffic engineer,...","['4', '2 2 999 999', '2 3 12 11', '4 3 6 6', '50 50 1 1']",Case #1: Possible 333 333 333 333 Case #2: Possible 5 3 1 3 4 3 Case #3: Possible 1 1 1 1 2 1...,"def solve(input): T = int(input[0]) results = [] for case_number in range(1, T + 1): N, M, A, B = map(int, input[case_number].split()) grid = [[1]...",
3,"Alfredo Spaghetti really likes soup, especially when it contains alphabet pasta. Every day he constructs a sentence from letters, places the letters into a bowl...","['5', 'WELCOME TO FACEBOOK HACKERCUP', 'CUP WITH LABEL HACKERCUP BELONGS TO HACKER', 'QUICK CUTE BROWN FOX JUMPS OVER THE LAZY DOG', 'MOVE FAST BE BOLD',...",Case #1: 1 Case #2: 2 Case #3: 1 Case #4: 0 Case #5: 1,"def solve(input): from collections import Counter # The target word target_word = ""HACKERCUP"" target_count = Counter(target_word) # Prepare the output list output = [] T...",✔️ [1.0]
4,"Every week you take a socially-distanced trip to Overwaitea, your favourite grocery store, to stock up on food. Out front, there are \(N\) shopping carts,...","['5', '1', '2 3', '3', '4 5', '1 4', '5 1', '4', '3 2', '4 6', '5 5', '1 3', '6', '5 6', '3 5',...",Case #1: 11 3 2 Case #2: 190 20 0 16 30 Case #3: 560 0 5 0 24 36 Case #4: 1900 67 0...,"def solve(input): from collections import defaultdict T = int(input[0]) results = [] index = 1 for case_number in range(1, T + 1): N = int(input[index])...",


25.0
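
The evaluation log above includes the error `'list' object has no attribute 'strip'`, which suggests that at some point an output arrived as a list of lines where a string was expected. The sketch below shows one defensive way to normalize such values before comparing them. The names `normalize_output` and `outputs_match` are hypothetical helpers written for illustration; they are not part of DSPy or of this notebook's `hackercup_utils`, and the actual cause of the error may lie elsewhere.

```python
def normalize_output(output):
    """Return program output as a single stripped string.

    Accepts a string, a list/tuple of lines, or None (treated as empty).
    """
    if output is None:
        return ""
    if isinstance(output, (list, tuple)):
        # Join list-style output into one newline-separated string.
        output = "\n".join(str(line) for line in output)
    return str(output).strip()


def outputs_match(expected, actual):
    """Compare two outputs line by line, ignoring surrounding whitespace."""
    exp_lines = [line.strip() for line in normalize_output(expected).splitlines()]
    act_lines = [line.strip() for line in normalize_output(actual).splitlines()]
    return exp_lines == act_lines
```

With a helper like this, a metric could accept either representation without raising, at the cost of hiding cases where the program returns an unexpected type.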

In [None]:
# OPTIONAL: Optimize program w/ MIPROv2 (0-shot)
multi_stage_program = GenerateCode()
mipro_optimized_multi_stage_program = optimize_with_mipro(multi_stage_program, lm, lm, test_code(timeout=5), trainset)
print("Evaluating MIPRO optimized Multi-Stage Program on test...")
evaluate(program=mipro_optimized_multi_stage_program, devset=testset)

TypeError: MIPROv2.compile() got an unexpected keyword argument 'eval_kwargs'
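
The `TypeError` above indicates that a keyword argument named `eval_kwargs` reached `MIPROv2.compile()`, which in the installed DSPy version does not accept it. One generic way to guard against this kind of version mismatch is to drop keyword arguments that a callable does not declare. The sketch below is a minimal illustration under that assumption: `call_with_supported_kwargs` and `fake_compile` are hypothetical names, and `fake_compile` is only a stand-in, not the real DSPy signature. Whether silently dropping `eval_kwargs` is acceptable depends on what the optimizer needs in your DSPy version.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping keyword arguments that func does not declare."""
    params = inspect.signature(func).parameters
    # If func takes **kwargs itself, every keyword is accepted as-is.
    accepts_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if accepts_var_kw:
        return func(*args, **kwargs)
    supported = {k: v for k, v in kwargs.items() if k in params}
    dropped = sorted(set(kwargs) - set(supported))
    if dropped:
        print(f"Dropping unsupported keyword arguments: {dropped}")
    return func(*args, **supported)


# Hypothetical stand-in for a compile() method that lacks `eval_kwargs`.
def fake_compile(program, trainset, num_trials=10):
    return (program, len(trainset), num_trials)


result = call_with_supported_kwargs(
    fake_compile,
    "program",
    trainset=[1, 2, 3],
    num_trials=5,
    eval_kwargs={"num_threads": 8},  # not declared by fake_compile, so dropped
)
```

Printing the dropped names keeps the mismatch visible instead of hiding it, which makes it easier to decide whether to pin a different library version or update the calling helper.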

In [None]:
# OPTIONAL: Optimize program w/ Bootstrap Few-Shot
multi_stage_program = GenerateCode()
bootstrap_optimized_multi_stage_program = optimize_with_bootstrap_fewshot(multi_stage_program, lm, lm, test_code(timeout=5), trainset)
print("Evaluating Bootstrap Few-Shot optimized Multi-Stage Program on test...")
evaluate(program=bootstrap_optimized_multi_stage_program, devset=testset)

Going to sample between 1 and 2 traces per predictor.
Will attempt to bootstrap 5 candidate sets.
Case #1: 4
Case #2: 2
Case #3: 1

  0%|          | 0/50 [00:00<?, ?it/s]


Case #1: -1
Case #2: -1
Case #3: 1
Case #4: -1
Case #5: -1
Case #6: -1
Case #7: -1
Case #1: 0
Case #2: 2
Case #1: YES
Case #2: NO
Case #3: YES
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: YES
Case #2: NO
Case #3: YES
Case #4: YES
Case #5: YES
Case #6: YES
Case #7: NO
Case #1: -1
Case #2: -1
Case #3: 1
Case #4: -1
Case #5: -1
Case #6: -1
Case #7: -1
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/1.
Case #1: NO
Case #2: YES
Case #3: NO
Case #1: 0
Case #2: 0
Case #3: 0
Case #1: 6
Case #2: 2
Case #1: YES
Case #2: NO
Case #3: YES
Case #4: YES
Case #5: YES
Case #6: YES
Case #7: NO
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: YES
Case #2: YES
Case #3: NO
Case #4: NO
Case #5: YES


Average Metric: 1 / 1  (100.0):   2%|▏         | 1/50 [00:04<03:29,  4.28s/it]

CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 1/1.
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 1/1.

Case #1: YES
Case #2: YES
Case #3: NO
Case #4: NO
Case #5: YES
Case #1: NO
Case #2: YES
Case #3: NO
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: 3
Case #2: 0
Case #1: 0
Case #2: 0
Case #3: 0

Average Metric: 2 / 2  (100.0):   4%|▍         | 2/50 [00:07<02:44,  3.42s/it]


Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: 0
Case #2: 0
Case #3: 0


Average Metric: 2 / 3  (66.7):   6%|▌         | 3/50 [00:11<02:57,  3.78s/it]

Case #1: 0
Case #2: 0
Case #3: 0
Case #1:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #3:
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4:
2 4 4 5 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 6 6 6 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 8 8 8 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Average Metric: 2 / 4  (50.0):   8%|▊         | 4/50 [00:15<02:54,  3.80s/it]

Case #1: 3
Case #2: 0
Case #3: 9
Case #4: 4


Average Metric: 3 / 5  (60.0):  10%|█         | 5/50 [00:16<02:03,  2.75s/it]

Case #1: 1
Case #2: 2
Case #3: 0
Case #4: 1
Case #5: 1
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/1.
Case #1: 1
Case #2: 2
Case #3: 0
Case #4: 1
Case #5: 1


Average Metric: 4 / 6  (66.7):  12%|█▏        | 6/50 [00:20<02:23,  3.25s/it]

Case #1: 3
Case #2: 0
Case #1: 7
Case #2: 15
Case #3: 31
Case #4: 206
Case #5: 18


Average Metric: 4 / 7  (57.1):  14%|█▍        | 7/50 [00:23<02:13,  3.11s/it]

Case #1: 7
Case #2: 15
Case #3: 32
Case #4: 197
Case #5: 16
Case #1: 0
Case #2: 0
Case #3: 0
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 1/1.
Case #1: 7
Case #2: 15
Case #3: 32
Case #4: 197
Case #5: 16


Average Metric: 4 / 8  (50.0):  16%|█▌        | 8/50 [00:25<02:06,  3.01s/it]

Case #1: 2
Case #2: 4
Case #3: 6
Case #4: 0
Case #5: 2


Average Metric: 5 / 9  (55.6):  18%|█▊        | 9/50 [00:26<01:35,  2.33s/it]

Case #1: Y
Case #2: Y
Case #3: N
Case #4: Y
Case #5: Y
Case #6: Y
Case #7: Y
Case #8: Y
Case #1: 1
Case #2: 1
Case #3: 1
Case #4: 49
Case #5: 5
Case #6: 67867216
Case #1: 3
Case #2: 0
Case #1: 1/1
Case #2: 1/1
Case #3: 0/1
Case #4: 0/1
Case #5: 0/1
Case #1: Y
Case #2: Y
Case #3: N
Case #4: Y
Case #5: Y
Case #6: Y
Case #7: Y
Case #8: Y
Case #1: 1
Case #2: 1
Case #3: 1
Case #4: 49
Case #5: 5
Case #6: 67867216
Case #1: 3
Case #2: 0
Case #1: N
Case #2: N
Case #3: N
Case #4: Y
Case #5: Y
Case #6: N
Case #7: N
Case #8: Y
Case #1: 1
Case #2: 1
Case #3: 958333341
Case #4: 1
Case #5: 1
Case #6: 1
Case #1: 1/1
Case #2: 1/1
Case #3: 0/1
Case #4: 0/1
Case #5: 0/1
Case #1: 0
Case #2: 0


Average Metric: 5 / 10  (50.0):  20%|██        | 10/50 [00:32<02:10,  3.27s/it]

Case #1: 500000004
Case #2: 250000002
Case #3: 0
Case #4: 850000006
Case #5: 879464292
Case #6: 118156154
Case #1:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #3:
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4:
2 4 4 5 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 6 6 6 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 8 8 8 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 

Average Metric: 5 / 11  (45.5):  22%|██▏       | 11/50 [00:35<02:08,  3.29s/it]


Case #1: 500000004
Case #2: 437500004
Case #3: 583333338
Case #4: 438019804
Case #5: 339009800
Case #6: 770815060
Case #1: 20/7
Case #2: 20/7
Case #3: 0/1
Case #4: 0/1
Case #5: 0/1


Average Metric: 5 / 12  (41.7):  24%|██▍       | 12/50 [00:36<01:42,  2.69s/it]

Case #1: 4
Case #2: 0
Case #3: 2
Case #4: 8
Case #5: 48
Case #6: 358
Case #1: 0.0
Case #2: 0.004975985528504535

Case #1: 20/7
Case #2: 20/7
Case #3: 0/1
Case #4: 0/1
Case #5: 0/1
Case #1: 500000004
Case #2: 437500004
Case #3: 583333338
Case #4: 438019804
Case #5: 339009800
Case #6: 770815060
CORRECT SOLN FOUND! CODE OPTION 1/3 | DEBUGGING ITER: 0/1.
Case #1: 4
Case #2: 0
Case #3: 2
Case #4: 8
Case #5: 48
Case #6: 358


Average Metric: 5 / 14  (35.7):  28%|██▊       | 14/50 [00:40<01:16,  2.12s/it]

Case #1: 100.060954430
Case #2: 150.060954430
Case #3: 0.005195678
Case #4: nan
Case #5: 0.759385113


Average Metric: 6 / 15  (40.0):  30%|███       | 15/50 [00:40<00:55,  1.58s/it]

Case #1: 1
Case #2: 3
Case #3: 4
Case #4: 2
Case #5: 9
Case #1: 0.0000000000
Case #2: 0.0049759855
Case #1:
..
---
Case #2:
..
---
....
Case #3:
..
---
Case #1: 3
Case #2: 1
Case #3: 0
Case #4: 191
Case #5: 67
Case #6: 1999999999999
Case #1: 100.060954430
Case #2: 150.060954430
Case #3: 0.005195678
Case #4: nan
Case #5: 0.759385113


Average Metric: 6 / 16  (37.5):  32%|███▏      | 16/50 [00:42<00:56,  1.67s/it]

Case #1: 4
Case #2: 4
Case #3: 5
Case #4: 8
Case #5: 15
Case #1:
..
---
Case #2:
..
---
....
Case #3:
...
---
Case #1: 0.000000000
Case #2: 0.000000000
Case #3: 0.584415584
Case #4: 909.994348697
Case #5: 1.873420486
Case #1: 1
Case #2: 3
Case #3: 4
Case #4: 2
Case #5: 9
Case #1: 0.000000000
Case #2: 0.000000000
Case #3: 0.019480519
Case #4: -909.181818182
Case #5: -1.519379845
Case #1: 7
Case #2: 6
Case #3: 7
Case #4: 12
Case #5: 19
Case #1: 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
Case #3: 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4: 0 2 0 3 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #5: 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 1 0 0
Case #1: 1
Case #2: 3
Case #3: 4
Case #4: 2
Case #5: 9
Case #1: 0.000000000
Case #2: 100.060954430
Case #3: 0.000000000
Case #4: -951.146012869
Case #5: -3.588658615
Case #1: 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2: 0 0 0 0 0 0 0 0 0 0 0

Average Metric: 6 / 17  (35.3):  34%|███▍      | 17/50 [00:47<01:27,  2.65s/it]

Case #1: 1
Case #2: 3
Case #3: 5
Case #4: 2
Case #5: 13
Case #1: 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 -1 0
Case #3: 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4: 3 6 4 5 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #5: 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 1 0 0
Case #1: 1
Case #2: 3
Case #3: 5
Case #4: 2
Case #5: 13
Case #1: 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 -1 0
Case #3: 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4: 3 6 4 5 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #5: 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 1 0 0


Average Metric: 6 / 18  (33.3):  36%|███▌      | 18/50 [00:49<01:16,  2.40s/it]

Case #1: 9
Case #2: 9
Case #3: 21
Case #4: 19
Case #5: 44


Average Metric: 6 / 19  (31.6):  38%|███▊      | 19/50 [00:50<00:58,  1.88s/it]

Case #1: N
Case #2: N
Case #3: Y
Case #4: N
Case #5: N
Case #6: N
Case #7: N
Case #8: N
Case #1: 9
Case #2: 9
Case #3: 21
Case #4: 19
Case #5: 44
Case #1: N
Case #2: N
Case #3: Y
Case #4: N
Case #5: N
Case #6: N
Case #7: N
Case #8: N
Case #1:
..
---
Case #2:
..
---
....
Case #3:
..
---
Case #1: 9
Case #2: 9
Case #3: 21
Case #4: 19
Case #5: 44
Case #1: 0
Case #2: 0
Case #3: 8.0
Case #4: 500000011.0
Case #5: 11.0
Case #1: N
Case #2: N
Case #3: Y
Case #4: N
Case #5: N
Case #6: N
Case #7: N
Case #8: N
Case #1: 0
Case #2: 0
Case #3: -1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1:
..
---
Case #2:
..
---
....
Case #3:
---


Average Metric: 6 / 20  (30.0):  40%|████      | 20/50 [00:56<01:40,  3.36s/it]

Case #1: 0
Case #2: 0
Case #3: 8
Case #4: 500000011
Case #5: 11
Case #1: 0
Case #2: 4
Case #3: 0
Case #4: 9
Case #5: 9
Case #1:
.
..
Case #2:
.
..
...
Case #3:
.
..
Case #1: 0
Case #2: 0
Case #3: -1
Case #4: 0
Case #5: 0
Case #6: 0

Case #1: 9
Case #2: 6
Case #3: 15
Case #4: 14
Case #5: 31
Case #1: 3
Case #2: 4
Case #3: 1
Case #4: 9
Case #5: 10
Case #1:
--
...
Case #2:
--
...
----
Case #3:
--
...
Case #1: 11
Case #2: 12
Case #3: 24
Case #4: 21
Case #5: 50
Case #1:
--
...
Case #2:
--
...
----
Case #3:
--
...


Average Metric: 6 / 22  (27.3):  44%|████▍     | 22/50 [01:04<01:34,  3.39s/it]

Case #1: 1
Case #2: 5
Case #3: 12
Case #4: 4
Case #5: 0
Case #1: 11
Case #2: 12
Case #3: 24
Case #4: 21
Case #5: 50


Average Metric: 6 / 23  (26.1):  46%|████▌     | 23/50 [01:05<01:08,  2.54s/it]

Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 4
Case #5: 3
Case #1: 0.000000000
Case #2: 0.000000000


Average Metric: 6 / 24  (25.0):  48%|████▊     | 24/50 [01:06<00:58,  2.26s/it]

Case #1: 4
Case #2: 2
Case #3: 1
Case #1: 11
Case #2: 12
Case #3: 24
Case #4: 21
Case #5: 50
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/1.
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/1.
Case #1: 1
Case #2: 5
Case #3: 12
Case #4: 4
Case #5: 0


Average Metric: 6 / 25  (24.0):  50%|█████     | 25/50 [01:09<01:00,  2.44s/it]

Case #1: 9
Case #2: 9
Case #3: 14
Case #4: 22
Case #5: 56
Case #6: 160
Case #7: 136
Case #8: 1000000000


Average Metric: 7 / 26  (26.9):  52%|█████▏    | 26/50 [01:10<00:49,  2.05s/it]

Case #1: 0
Case #2: 0
Case #3: 0
Case #1: 3
Case #2: 14
Case #3: 68
Case #4: 4
Case #5: 2
Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia
Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 4
Case #5: 3
Case #1: 0
Case #2: 0
Case #3: 0


Average Metric: 7 / 27  (25.9):  54%|█████▍    | 27/50 [01:13<00:53,  2.34s/it]

Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 4
Case #5: 3
Case #1: 1
Case #2: 4
Case #3: 8
Case #4: 1
Case #5: 1
Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia
Case #1: 0
Case #2: 0
Case #3: 0


Average Metric: 7 / 28  (25.0):  56%|█████▌    | 28/50 [01:15<00:45,  2.09s/it]

Case #1: 1
Case #2: 8
Case #3: 36
Case #4: 4
Case #5: 0
Case #1: 0
Case #2: 0
Case #3: 0
CORRECT SOLN FOUND! CODE OPTION 2/3 | DEBUGGING ITER: 0/1.
Case #1: 850000006
Case #2: 616666671
Case #3: 467255066
Case #4: 23809524
Case #5: 169595187
Case #6: 83333334
Case #1: 1
Case #2: 8
Case #3: 36
Case #4: 4
Case #5: 0
Case #1: 0
Case #2: 0
Case #3: 0


Average Metric: 8 / 29  (27.6):  58%|█████▊    | 29/50 [01:19<00:57,  2.73s/it]

Case #1: 5.099020
Case #2: 0.000000
Case #3: 4.242641
Case #4: 8.544004
Case #5: 8.485281
Case #1: 9
Case #2: 9
Case #3: 14
Case #4: 17
Case #5: 47
Case #6: 80
Case #7: 69
Case #8: 1000000000
Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia
Case #1: 700000005
Case #2: 700000005
Case #3: 40859919
Case #4: 17857143
Case #5: 423665072
Case #6: 166666668
Case #1: 0
Case #2: 0
Case #3: 0
Case #1: 2.236068
Case #2: 0.000000
Case #3: 3.605551
Case #4: 2.828427
Case #5: 8.485281
Case #1: 700000005
Case #2: 700000005
Case #3: 40859919
Case #4: 17857143
Case #5: 423665072
Case #6: 166666668


Average Metric: 8 / 31  (25.8):  60%|██████    | 30/50 [01:26<01:20,  4.02s/it]

Case #1: 5.830952
Case #2: 5.000000
Case #3: 5.000000
Case #4: 8.544004
Case #5: 8.485281

Average Metric: 8 / 32  (25.0):  64%|██████▍   | 32/50 [01:26<00:40,  2.26s/it]


Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5
Case #1: 0.000000
Case #2: 0.000000
Case #3: 0.000000
Case #4: 0.000000
Case #5: 0.000000


Average Metric: 8 / 33  (24.2):  66%|██████▌   | 33/50 [01:30<00:45,  2.70s/it]

Case #1: 5.830952
Case #2: 5.000000
Case #3: 5.000000
Case #4: 8.544004
Case #5: 8.485281
Case #1: Impossible
Case #2: Impossible
Case #3: Impossible


Average Metric: 8 / 34  (23.5):  68%|██████▊   | 34/50 [01:32<00:37,  2.36s/it]

Case #1: 2.000000000
Case #2: 6.000000000
Case #3: 6.000000000
Case #4: 2951.000000000
Case #5: 2951.000000000
Case #1: Impossible
Case #2: Impossible
Case #3: Impossible
Case #1: 3.605551
Case #2: 5.000000
Case #3: 3.605551
Case #4: 4.242641
Case #5: 8.485281
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5
None
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5
Case #1: 5.830952
Case #2: 5.000000
Case #3: 5.000000
Case #4: 4.242641
Case #5: 8.485281
Case #1: Impossible
Case #2: Impossible
Case #3: Impossible
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5


Average Metric: 9 / 35  (25.7):  70%|███████   | 35/50 [01:37<00:48,  3.22s/it]

Case #1: Impossible
Case #2: Impossible
Case #3: Impossible
Case #1: 1
Case #2: 1
Case #3: 6
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5
Case #1: 1
Case #2: 1
Case #3: 6
Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia


Average Metric: 9 / 36  (25.0):  72%|███████▏  | 36/50 [01:45<01:03,  4.53s/it]

Case #1: 0
Case #2: 0
Case #3: -1
Case #1: 0
Case #2: 0
Case #3: -1


Average Metric: 9 / 38  (23.7):  74%|███████▍  | 37/50 [01:48<00:53,  4.14s/it]

Case #1: 0 0 1
Case #2: 0 1 2 3 4 5 6
Case #3: 2 0 3 1
Case #4: 0 1 2 3 0 1
Case #5: 5 0 1 3 3 7 4 0 1 6 2 0
Case #1: 0
Case #2: 1
Case #3: 6
None
Case #1: 0
Case #2: 1
Case #3: 6
Case #1: 0 0 1
Case #2: 0 1 2 3 4 5 6
Case #3: 2 0 3 1
Case #4: 0 1 2 3 0 1
Case #5: 5 0 1 3 3 7 4 0 1 6 2 0
Case #1: 0
Case #2: 1
Case #3: 6
Case #1: 1.000000 1.000000 1.333333 0.333333
Case #2: 0.500000 0.500000 0.500000 0.500000
Case #3: 0.500000 1.000000 0.500000 0.500000
Case #4: 1.000000 1.000000 1.000000 1.000000
Case #5: 1.000000 1.000000 1.000000 1.000000
Case #1: 1
Case #2: -1
Case #3: 6
Case #1: 1.000000 1.000000 1.333333 0.333333
Case #2: 0.500000 0.500000 0.500000 0.500000
Case #3: 0.500000 1.000000 0.500000 0.500000
Case #4: 1.000000 1.000000 1.000000 1.000000
Case #5: 1.000000 1.000000 1.000000 1.000000


Average Metric: 9 / 39  (23.1):  78%|███████▊  | 39/50 [01:53<00:36,  3.33s/it]

Case #1: 0 0 1
Case #2: 0 1 2 3 4 5 6
Case #3: 1 0 3 2
Case #4: 0 1 2 3 0 1
Case #5: 5 0 2 6 1 2 4 0 1 7 3 0
Case #1: 0 0 1
Case #2: 0 1 2 3 0 1 2
Case #3: 1 0 3 2
Case #4: 0 1 2 3 0 1
Case #5: 5 0 2 6 1 2 4 0 1 7 3 0


Average Metric: 9 / 40  (22.5):  80%|████████  | 40/50 [01:59<00:39,  3.96s/it]

Case #1: 3
Case #2: 3
Case #3: 0
Case #4: 2
Case #5: 5
Case #6: 17

Average Metric: 9 / 41  (22.0):  82%|████████▏ | 41/50 [02:01<00:30,  3.37s/it]


Case #1: 0 0 1
Case #2: 0 1 2 3 0 1 2
Case #3: 1 0 3 2
Case #4: 0 1 2 3 0 1
Case #5: 5 0 2 6 1 2 4 0 1 7 3 0
Case #1: 4
Case #2: 3
Case #3: 22
Case #4: 33
Case #5: 27
Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia


Average Metric: 9 / 42  (21.4):  84%|████████▍ | 42/50 [02:05<00:28,  3.51s/it]

Case #1: 3
Case #2: 3
Case #3: 0
Case #4: 1
Case #5: 3
Case #6: 12


Average Metric: 9 / 43  (20.9):  86%|████████▌ | 43/50 [02:06<00:20,  2.87s/it]

Case #1: 4
Case #2: 3
Case #3: 22
Case #4: 33
Case #5: 27
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/1.


Average Metric: 10 / 46  (21.7):  92%|█████████▏| 46/50 [02:12<00:08,  2.20s/it]

Case #1: 4
Case #2: 3
Case #3: 22
Case #4: 33
Case #5: 27
Case #1: 5
Case #2: 9
Case #3: 67
Case #4: 78
Case #5: 72
Case #1: 0
Case #2: 5
Case #3: 28
Case #4: 17
Case #5: 23


Average Metric: 10 / 48  (20.8):  96%|█████████▌| 48/50 [02:18<00:05,  2.56s/it]

Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0


Average Metric: 10 / 50  (20.0): 100%|██████████| 50/50 [02:37<00:00,  3.16s/it]

New best score: 20.0 for seed -3
Scores so far: [20.0]
Best score so far: 20.0



  0%|          | 0/50 [00:00<?, ?it/s]

Case #1: 4
Case #2: 2
Case #3: 1
Case #1: YES
Case #2: NO
Case #3: YES
Case #4: YES
Case #5: YES
Case #6: YES
Case #7: NO
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: YES
Case #2: NO
Case #3: YES
Case #1: 0
Case #2: 2
Case #1: -1
Case #2: -1
Case #3: 1
Case #4: -1
Case #5: -1
Case #6: -1
Case #7: -1
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/1.
Case #1: YES
Case #2: NO
Case #3: YES
Case #4: YES
Case #5: YES
Case #6: YES
Case #7: NO
Case #1: 0
Case #2: 0
Case #3: 0
Case #1: -1
Case #2: -1
Case #3: 1
Case #4: -1
Case #5: -1
Case #6: -1
Case #7: -1
Case #1: NO
Case #2: YES
Case #3: NO
Case #1: YES
Case #2: YES
Case #3: NO
Case #4: NO
Case #5: YES
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: 6
Case #2: 2
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 1/1.
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 1/1.
Case #1: YES
Case #2: YES
Case #3: NO
Case #4: NO
Case #5: YES


Average Metric: 1 / 1  (100.0):   2%|▏         | 1/50 [00:04<03:29,  4.28s/it]

Case #1: NO
Case #2: YES
Case #3: NO
Case #1: 3
Case #2: 0
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: 0
Case #2: 0
Case #3: 0


Average Metric: 3 / 3  (100.0):   6%|▌         | 3/50 [00:07<01:37,  2.07s/it]

Case #1: 3
Case #2: 0
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: 0
Case #2: 0
Case #3: 0
Case #1:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #3:
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4:
2 4 4 5 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 6 6 6 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 8 8 8 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Average Metric: 3 / 5  (60.0):   8%|▊         | 4/50 [00:18<04:18,  5.63s/it]

Case #1: 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
Case #3: 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4: 0 2 0 3 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #5: 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 1 0 0
Case #1: 3
Case #2: 0
Case #3: 9
Case #4: 4
Case #1: 0
Case #2: 0
Case #3: 0
Case #1: 1
Case #2: 2
Case #3: 0
Case #4: 1
Case #5: 1


Average Metric: 3 / 6  (50.0):  12%|█▏        | 6/50 [00:20<02:17,  3.13s/it]

CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/1.


Average Metric: 4 / 7  (57.1):  14%|█▍        | 7/50 [00:21<01:48,  2.52s/it]

Case #1: 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 -1 0
Case #3: 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4: 3 6 4 5 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #5: 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 1 0 0
Case #1: 7
Case #2: 15
Case #3: 31
Case #4: 206
Case #5: 18
Case #1: 1
Case #2: 2
Case #3: 0
Case #4: 1
Case #5: 1


Average Metric: 5 / 8  (62.5):  16%|█▌        | 8/50 [00:23<01:44,  2.49s/it]

Case #1: 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 -1 0
Case #3: 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4: 3 6 4 5 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #5: 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 1 0 0
Case #1: 7
Case #2: 15
Case #3: 32
Case #4: 197
Case #5: 16
Case #1: 2
Case #2: 4
Case #3: 6
Case #4: 0
Case #5: 2
Case #1: Y
Case #2: Y
Case #3: N
Case #4: Y
Case #5: Y
Case #6: Y
Case #7: Y
Case #8: Y
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 1/1.
Case #1: 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 -1 0
Case #3: 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #4: 3 6 4 5 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Case #5: 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 1 0 0
Case #1: 7
Case #2: 15
Case #3: 32
Case #4: 197
Case #5: 16
Case #1: Y
Case #2: Y
Case #3: N
Case #4: Y
Case #5: Y
Case #6: Y


Average Metric: 5 / 9  (55.6):  18%|█▊        | 9/50 [00:30<02:34,  3.78s/it]

Case #1: N
Case #2: N
Case #3: N
Case #4: Y
Case #5: Y
Case #6: N
Case #7: N
Case #8: Y


Average Metric: 5 / 10  (50.0):  20%|██        | 10/50 [00:31<01:57,  2.93s/it]

Case #1: 1
Case #2: 1
Case #3: 1
Case #4: 49
Case #5: 5
Case #6: 67867216
Case #1: 2
Case #2: 0
Case #3: 2
Case #4: 4
Case #5: 9
Case #6: 24


Average Metric: 5 / 11  (45.5):  22%|██▏       | 11/50 [00:32<01:34,  2.44s/it]

Case #1: N
Case #2: N
Case #3: N
Case #4: N
Case #5: Y
Case #6: N
Case #7: N
Case #8: Y


Average Metric: 5 / 12  (41.7):  24%|██▍       | 12/50 [00:33<01:16,  2.00s/it]

Case #1: 1
Case #2: 1
Case #3: 1
Case #4: 49
Case #5: 5
Case #6: 67867216
Case #1: 1/1
Case #2: 1/1
Case #3: 0/1
Case #4: 0/1
Case #5: 0/1
Case #1: 0.0
Case #2: 0.004975985528504535
Case #1: 3
Case #2: 1
Case #3: 0
Case #4: 39
Case #5: 199
Case #6: 1999999999999
Case #1: N
Case #2: N
Case #3: Y
Case #4: N
Case #5: N
Case #6: N
Case #7: N
Case #8: N
Case #1: 4
Case #2: 0
Case #3: 2
Case #4: 8
Case #5: 48
Case #6: 358
Case #1: 1
Case #2: 1
Case #3: 958333341
Case #4: 1
Case #5: 1
Case #6: 1
Case #1: 2
Case #2: 4
Case #3: 6
Case #4: 0
Case #5: 2
Case #1: 1/1
Case #2: 1/1
Case #3: 0/1
Case #4: 0/1
Case #5: 0/1
CORRECT SOLN FOUND! CODE OPTION 1/3 | DEBUGGING ITER: 0/1.
CORRECT SOLN FOUND! CODE OPTION 1/3 | DEBUGGING ITER: 0/1.
Case #1: N
Case #2: N
Case #3: Y
Case #4: N
Case #5: N
Case #6: N
Case #7: N
Case #8: N


Average Metric: 5 / 13  (38.5):  26%|██▌       | 13/50 [00:37<01:36,  2.61s/it]

Case #1: 100.060954430
Case #2: 150.060954430
Case #3: 0.005195678
Case #4: nan
Case #5: 0.759385113
Case #1: 4
Case #2: 0
Case #3: 2
Case #4: 8
Case #5: 48
Case #6: 358
Case #1: 500000004
Case #2: 250000002
Case #3: 0
Case #4: 850000006
Case #5: 879464292
Case #6: 118156154


Average Metric: 6 / 14  (42.9):  28%|██▊       | 14/50 [00:38<01:21,  2.26s/it]

Case #1: 0.0000000000
Case #2: 0.0049759855
Case #1: 20/7
Case #2: 20/7
Case #3: 0/1
Case #4: 0/1
Case #5: 0/1
Case #1: N
Case #2: N
Case #3: Y
Case #4: N
Case #5: N
Case #6: N
Case #7: N
Case #8: N
Case #1: 100.060954430
Case #2: 150.060954430
Case #3: 0.005195678
Case #4: nan
Case #5: 0.759385113
Case #1: 1
Case #2: 3
Case #3: 4
Case #4: 2
Case #5: 9


Average Metric: 6 / 15  (40.0):  30%|███       | 15/50 [00:40<01:11,  2.04s/it]

Case #1: 20/7
Case #2: 20/7
Case #3: 0/1
Case #4: 0/1
Case #5: 0/1
Case #1:
..
---
Case #2:
..
---
....
Case #3:
..
---
Case #1: 0.000000000
Case #2: 0.000000000
Case #3: 0.584415584
Case #4: 909.994348697
Case #5: 1.873420486
Case #1: 4
Case #2: 4
Case #3: 5
Case #4: 8
Case #5: 15
Case #1: 0.000000000
Case #2: 0.000000000
Case #1:
..
---
Case #2:
..
---
....
Case #3:
...
---
Case #1: 20/7
Case #2: 20/7
Case #3: 0/1
Case #4: 0/1
Case #5: 0/1
Case #1: 0.000000000
Case #2: 0.000000000
Case #3: 0.019480519
Case #4: -909.181818182
Case #5: -1.519379845
Case #1: 3
Case #2: 1
Case #3: 0
Case #4: 191
Case #5: 67
Case #6: 1999999999999
Case #1: 1
Case #2: 3
Case #3: 4
Case #4: 2
Case #5: 9


Average Metric: 6 / 16  (37.5):  32%|███▏      | 16/50 [00:46<01:46,  3.13s/it]

Case #1:
..
---
Case #2:
..
---
....
Case #3:
..
---


Average Metric: 6 / 17  (35.3):  34%|███▍      | 17/50 [00:47<01:24,  2.57s/it]

Case #1: 7
Case #2: 6
Case #3: 7
Case #4: 12
Case #5: 19
Case #1: 0.000000000
Case #2: 100.060954430
Case #3: 0.000000000
Case #4: -951.146012869
Case #5: -3.588658615
Case #1:
..
---
Case #2:
..
---
....
Case #3:
---


Average Metric: 6 / 18  (33.3):  36%|███▌      | 18/50 [00:50<01:22,  2.59s/it]

Case #1: 1
Case #2: 3
Case #3: 4
Case #4: 2
Case #5: 9
Case #1: 9
Case #2: 9
Case #3: 21
Case #4: 19
Case #5: 44
Case #1:
.
..
Case #2:
.
..
...
Case #3:
.
..
Case #1: 1
Case #2: 3
Case #3: 5
Case #4: 2
Case #5: 13
Case #1: 9
Case #2: 9
Case #3: 21
Case #4: 19
Case #5: 44
Case #1:
--
...
Case #2:
--
...
----
Case #3:
--
...
Case #1: 1
Case #2: 3
Case #3: 5
Case #4: 2
Case #5: 13
Case #1: 9
Case #2: 9
Case #3: 21
Case #4: 19
Case #5: 44
Case #1: 0
Case #2: 999999992
Case #3: 0
Case #4: 360852209
Case #5: 533073315
Case #6: 59713260


Average Metric: 6 / 19  (31.6):  38%|███▊      | 19/50 [00:56<01:52,  3.62s/it]

Case #1:
--
...
Case #2:
--
...
----
Case #3:
--
...


Average Metric: 6 / 21  (28.6):  40%|████      | 20/50 [00:57<01:24,  2.81s/it]

Case #1: 9
Case #2: 6
Case #3: 15
Case #4: 14
Case #5: 31


Average Metric: 6 / 21  (28.6):  42%|████▏     | 21/50 [00:57<00:59,  2.04s/it]

Case #1: 500000004
Case #2: 437500004
Case #3: 583333338
Case #4: 438019804
Case #5: 339009800
Case #6: 770815060
Case #1: 0
Case #2: 0
Case #3: -1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 0
Case #2: 4
Case #3: 0
Case #4: 9
Case #5: 9
Case #1: 1
Case #2: 5
Case #3: 12
Case #4: 4
Case #5: 0
Case #1: 11
Case #2: 12
Case #3: 24
Case #4: 21
Case #5: 50
Case #1: 500000004
Case #2: 437500004
Case #3: 583333338
Case #4: 438019804
Case #5: 339009800
Case #6: 770815060
Case #1: 0
Case #2: 0
Case #3: -1
Case #4: 0
Case #5: 0
Case #6: 0
Case #1: 1
Case #2: 5
Case #3: 12
Case #4: 4
Case #5: 0
Case #1: 3
Case #2: 4
Case #3: 1
Case #4: 9
Case #5: 10
Case #1: 11
Case #2: 12
Case #3: 24
Case #4: 21
Case #5: 50
Case #1: 0
Case #2: 0
Case #3: 8.0
Case #4: 500000011.0
Case #5: 11.0
Case #1: 11
Case #2: 12
Case #3: 24
Case #4: 21
Case #5: 50
Case #1: 0
Case #2: 0
Case #3: 8
Case #4: 500000011
Case #5: 11
Case #1: 3
Case #2: 14
Case #3: 68
Case #4: 4
Case #5: 2


Average Metric: 6 / 24  (25.0):  48%|████▊     | 24/50 [01:07<01:05,  2.50s/it]

Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 4
Case #5: 3


Average Metric: 6 / 25  (24.0):  50%|█████     | 25/50 [01:08<00:47,  1.91s/it]

Case #1: 4
Case #2: 2
Case #3: 1
Case #1: 9
Case #2: 9
Case #3: 14
Case #4: 22
Case #5: 56
Case #6: 160
Case #7: 136
Case #8: 1000000000
Case #1: 1
Case #2: 4
Case #3: 8
Case #4: 1
Case #5: 1
CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/1.
Case #1: 0
Case #2: 0
Case #3: 0
Case #1: 1
Case #2: 8
Case #3: 36
Case #4: 4
Case #5: 0
Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 4
Case #5: 3


Average Metric: 7 / 26  (26.9):  52%|█████▏    | 26/50 [01:12<01:00,  2.53s/it]

Case #1: 0
Case #2: 0
Case #3: 0
Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia
CORRECT SOLN FOUND! CODE OPTION 2/3 | DEBUGGING ITER: 0/1.
Case #1: 1
Case #2: 8
Case #3: 36
Case #4: 4
Case #5: 0


Average Metric: 7 / 27  (25.9):  54%|█████▍    | 27/50 [01:14<00:54,  2.36s/it]

Case #1: 200000003
Case #2: 600000008
Case #3: 452439
Case #4: 71428579
Case #5: 700191279
Case #6: 500000005
Case #1: 0
Case #2: 0
Case #3: 0
Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia


Average Metric: 8 / 28  (28.6):  56%|█████▌    | 28/50 [01:15<00:47,  2.16s/it]

Case #1: 0
Case #2: 1
Case #3: 0
Case #4: 4
Case #5: 3


Average Metric: 8 / 29  (27.6):  58%|█████▊    | 29/50 [01:18<00:45,  2.17s/it]

Case #1: 0
Case #2: 0
Case #3: 0
Case #1: 5.099020
Case #2: 0.000000
Case #3: 4.242641
Case #4: 8.544004
Case #5: 8.485281

Average Metric: 8 / 30  (26.7):  60%|██████    | 30/50 [01:19<00:39,  2.00s/it]


Case #1: 850000006
Case #2: 616666671
Case #3: 467255066
Case #4: 23809524
Case #5: 169595187
Case #6: 83333334
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5
Case #1: 0
Case #2: 0
Case #3: 0
Case #1: 2.236068
Case #2: 0.000000
Case #3: 3.605551
Case #4: 2.828427
Case #5: 8.485281
Case #1: 0
Case #2: 0
Case #3: 0
Case #1: 700000005
Case #2: 700000005
Case #3: 40859919
Case #4: 17857143
Case #5: 423665072
Case #6: 166666668


Average Metric: 8 / 31  (25.8):  62%|██████▏   | 31/50 [01:24<00:53,  2.80s/it]

Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia
Case #1: 5.830952
Case #2: 5.000000
Case #3: 5.000000
Case #4: 8.544004
Case #5: 8.485281
Case #1: 700000005
Case #2: 700000005
Case #3: 40859919
Case #4: 17857143
Case #5: 423665072
Case #6: 166666668
Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia


Average Metric: 8 / 32  (25.0):  64%|██████▍   | 32/50 [01:26<00:49,  2.73s/it]

Case #1: 0.000000
Case #2: 0.000000
Case #3: 0.000000
Case #4: 0.000000
Case #5: 0.000000
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5
None
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5
Case #1: 5.830952
Case #2: 5.000000
Case #3: 5.000000
Case #4: 8.544004
Case #5: 8.485281
Case #1: Purav Slawek Wai Weiyan
Case #2: Kittipat Liang Paul Yifei
Case #3: Bhuwan Duc Fabien Meihong Vlad Wai
Case #4: Anil Clifton David Lin Ranjeeth Torbjorn
Case #5: Erling Zejia
Case #1: 3
Case #2: 3
Case #3: 0
Case #4: 2
Case #5: 5
Case #6: 17
Case #1: 3.605551
Case #2: 5.000000
Case #3: 3.605551
Case #4: 4.242641
Case #5: 8.485281


Average Metric: 8 / 33  (24.2):  66%|██████▌   | 33/50 [01:31<00:55,  3.26s/it]

Case #1: 9
Case #2: 9
Case #3: 14
Case #4: 17
Case #5: 47
Case #6: 80
Case #7: 69
Case #8: 1000000000
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5
Case #1: Impossible
Case #2: Impossible
Case #3: Impossible
Case #1: 3
Case #2: 3
Case #3: 0
Case #4: 1
Case #5: 3
Case #6: 12


Average Metric: 8 / 34  (23.5):  68%|██████▊   | 34/50 [01:34<00:50,  3.15s/it]

Case #1: 5.830952
Case #2: 5.000000
Case #3: 5.000000
Case #4: 4.242641
Case #5: 8.485281
Case #1: 1
Case #2: 2
Case #3: 2
Case #4: 0
Case #5: 5


Average Metric: 9 / 35  (25.7):  70%|███████   | 35/50 [01:36<00:43,  2.89s/it]

Case #1: 2.000000000
Case #2: 6.000000000
Case #3: 6.000000000
Case #4: 2951.000000000
Case #5: 2951.000000000
Case #1: Impossible
Case #2: Impossible
Case #3: Impossible
Case #1: 1
Case #2: 1
Case #3: 6
Case #1: Impossible
Case #2: Impossible
Case #3: Impossible


Average Metric: 9 / 36  (25.0):  72%|███████▏  | 36/50 [01:40<00:45,  3.23s/it]

Case #1: 1
Case #2: 1
Case #3: 6


Average Metric: 9 / 37  (24.3):  74%|███████▍  | 37/50 [01:41<00:30,  2.37s/it]

Case #1: 0 0 1
Case #2: 0 1 2 3 4 5 6
Case #3: 2 0 3 1
Case #4: 0 1 2 3 0 1
Case #5: 5 0 1 3 3 7 4 0 1 6 2 0
Case #1: Impossible
Case #2: Impossible
Case #3: Impossible
Case #1: 0
Case #2: 0
Case #3: -1


Average Metric: 9 / 38  (23.7):  76%|███████▌  | 38/50 [01:43<00:29,  2.48s/it]

Case #1: 0 0 1
Case #2: 0 1 2 3 4 5 6
Case #3: 2 0 3 1
Case #4: 0 1 2 3 0 1
Case #5: 5 0 1 3 3 7 4 0 1 6 2 0
Case #1: 0
Case #2: 0
Case #3: -1
Case #1: 0
Case #2: 1
Case #3: 6
None
Case #1: 0
Case #2: 1
Case #3: 6
Case #1: 1.000000 1.000000 1.333333 0.333333
Case #2: 0.500000 0.500000 0.500000 0.500000
Case #3: 0.500000 1.000000 0.500000 0.500000
Case #4: 1.000000 1.000000 1.000000 1.000000
Case #5: 1.000000 1.000000 1.000000 1.000000


Average Metric: 9 / 39  (23.1):  78%|███████▊  | 39/50 [01:49<00:39,  3.56s/it]

Case #1: 0
Case #2: 1
Case #3: 6
Case #1: 0 0 1
Case #2: 0 1 2 3 4 5 6
Case #3: 1 0 3 2
Case #4: 0 1 2 3 0 1
Case #5: 5 0 2 6 1 2 4 0 1 7 3 0
Case #1: 1.000000 1.000000 1.333333 0.333333
Case #2: 0.500000 0.500000 0.500000 0.500000
Case #3: 0.500000 1.000000 0.500000 0.500000
Case #4: 1.000000 1.000000 1.000000 1.000000
Case #5: 1.000000 1.000000 1.000000 1.000000


Average Metric: 9 / 40  (22.5):  80%|████████  | 40/50 [01:54<00:39,  3.91s/it]

Case #1: 1
Case #2: -1
Case #3: 6
Case #1: 0 0 1
Case #2: 0 1 2 3 0 1 2
Case #3: 1 0 3 2
Case #4: 0 1 2 3 0 1
Case #5: 5 0 2 6 1 2 4 0 1 7 3 0


Average Metric: 9 / 41  (22.0):  82%|████████▏ | 41/50 [01:57<00:31,  3.49s/it]

Case #1: 4
Case #2: 3
Case #3: 22
Case #4: 33
Case #5: 27
Case #1: 0 0 1
Case #2: 0 1 2 3 0 1 2
Case #3: 1 0 3 2
Case #4: 0 1 2 3 0 1
Case #5: 5 0 2 6 1 2 4 0 1 7 3 0
Case #1: 4
Case #2: 3
Case #3: 22
Case #4: 33
Case #5: 27


Average Metric: 9 / 42  (21.4):  84%|████████▍ | 42/50 [02:00<00:28,  3.56s/it]

CORRECT SOLN FOUND! CODE OPTION 0/3 | DEBUGGING ITER: 0/1.


Average Metric: 10 / 45  (22.2):  90%|█████████ | 45/50 [02:08<00:14,  2.94s/it]

Case #1: 4
Case #2: 3
Case #3: 22
Case #4: 33
Case #5: 27
Case #1: 5
Case #2: 9
Case #3: 67
Case #4: 78
Case #5: 72
Case #1: 0
Case #2: 5
Case #3: 28
Case #4: 17
Case #5: 23


Average Metric: 10 / 48  (20.8):  96%|█████████▌| 48/50 [02:19<00:05,  2.95s/it]

Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0
Case #1: 0
Case #2: 0
Case #3: 0
Case #4: 0


Average Metric: 10 / 50  (20.0): 100%|██████████| 50/50 [02:35<00:00,  3.12s/it]


Scores so far: [20.0, 20.0]
Best score so far: 20.0


  0%|          | 0/50 [00:00<?, ?it/s]
ERROR:dspy.teleprompt.bootstrap:2024-09-25T06:50:03.345730Z [error    ] Failed to run or to evaluate example Example({'problem_description': 'Hacker Cup contest strategy often involves a metagame, where choosing which problems to work on might just be an important decision. On a Quest to become more Pro, you encounter an oracle promising to teach you the contest meta if you play her own Meta-game.\n\nThe oracle presents a peg board with \\(2N\\) moving dots. The initial \\(y\\)-positions of the dots are given as two arrays \\(A_{1..N}\\) and \\(B_{1..N}\\). Each second, simultaneously, \\(A_1\\) will move to the end of \\(B\\), while \\(B_1\\) will move to the end of \\(A\\) (with all elements shifting left accordingly).\n\nYou can connect the dots to form a *Meta-like logo* if all of the following are true:\n* For the first half of both arrays, each dot in \\(A\\) is below the corresponding dot in \\(B\\).\n* For the last

RuntimeError: asyncio.run() cannot be called from a running event loop