## Benchpress Hackathon

The challenge comes with a Jupyter notebook for your implementation and various utilities.
We provide a development set and a validation set you can use to develop your solution.
The development set is for testing your code and consists of 300 problems with a varying number of test cases.
You are free to use all data provided with a problem. A sample has the following structure:

```python
{
    # Unique identifier for the problem in the APPS dataset.
    "problem_id": 4424,
    # The problem statement
    "question": "Given three integers ...",
    # The expected function name and the input/output examples
    # representing test cases.
    "input_output": {
        "fn_name": "expression_matter",
        "inputs": [ ... ],
        "outputs": [ ... ]
    },
    "url": "https://www.codewars.com/kata/5ae62fcf252e66d44d00008e",
    "difficulty": "introductory",
    # The starter code for the problem.
    "starter_code": "def expression_matter(a, b, c):\n\t"
}
```
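
As a rough illustration of how these fields fit together, the sketch below pairs each entry in `inputs` with the matching entry in `outputs`. The helper name `iter_test_cases` and the input and output values are made up for this example; only the key names come from the structure above.

```python
def iter_test_cases(problem: dict):
    """Yield (args, expected) pairs from a problem's `input_output` field."""
    io = problem["input_output"]
    for args, expected in zip(io["inputs"], io["outputs"]):
        yield args, expected


# Illustrative sample: the input and output values are invented for this sketch.
sample = {
    "problem_id": 4424,
    "input_output": {
        "fn_name": "expression_matter",
        "inputs": [[2, 1, 2], [1, 2, 3]],
        "outputs": [[6], [9]],
    },
}

pairs = list(iter_test_cases(sample))
for args, expected in pairs:
    print(f"{sample['input_output']['fn_name']}{tuple(args)} should return {expected[0]}")
```

Each `inputs` entry is treated here as a list of positional arguments, and each `outputs` entry as a one-element list wrapping the expected return value.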

The validation set consists of 200 problems and includes an additional key, `test_cases`, which the provided scoring function uses to score your solution.

```python
{
    ...
    "test_cases": {
        "fn_name": "expression_matter",
        "inputs": [ ... ],
        "outputs": [ ... ]
    },
    ...
}
```

### Loading Problems

Use the `load_sample` function to load a problem from the development or validation set.

```python
from utilities import load_sample

problem = load_sample(index=0, dataset_path="./data/dev")
```

### Generating Code

Use the `aleph_alpha_client` to generate code.
Make sure your `AA_TOKEN` is set.

```python
from aleph_alpha_client import Client, CompletionRequest, Prompt

client = Client(AA_TOKEN)

request = CompletionRequest(
    prompt=Prompt.from_text("Your prompt."),
    maximum_tokens=256,
)

# API reference for the client:
# https://aleph-alpha-client.readthedocs.io/en/latest/
response = client.complete(request, model=MODEL)
```

### Running Tests

Use the `run_test_cases` function to run the generated code against the test cases.
The function returns a dictionary with the test results, including the expected output, the generated output, a boolean indicating whether the test passed and a traceback in case of an error.

```python
from utilities import run_test_cases

test_results = run_test_cases(
    problem=problem, 
    generation=response.completions[0].completion, 
    timeout=10,
)
```
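
The results can be condensed into a pass rate. The sketch below assumes they are available as a list of per-test dictionaries, each with a boolean `passed` key, like the entries printed during scoring later in this notebook. `summarize_results` is a hypothetical helper, not part of `utilities`.

```python
def summarize_results(test_results: list) -> dict:
    """Count passed tests and compute the pass rate for a list of result dicts."""
    total = len(test_results)
    passed = sum(1 for result in test_results if result.get("passed"))
    pass_rate = passed / total if total else 0.0
    return {"total": total, "passed": passed, "pass_rate": pass_rate}


# Hand-written results in the same shape as the scoring logs.
example_results = [
    {"passed": True, "input": [5], "output": [41], "expected_output": [41], "traceback": None},
    {"passed": False, "input": [60], "output": [[0, 1, 0]], "expected_output": [[0, 0, 3]], "traceback": None},
]

summary = summarize_results(example_results)
print(summary)
```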

### Scoring

Use the `score` function to score your solution on the validation set.
It expects a function that takes a problem and a client and returns a generation.

```python
from utilities import score

passed_problems, passed_test_cases = score(
    generation_func=generate_code, 
    client=client,
    dataset_path="./data/val", 
    length=50,
)
```
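
A generation function may sample several candidates and return the one that does best on the example cases in `input_output`. The sketch below shows only that selection step. The helper names `count_passed` and `pick_best_candidate` are hypothetical, expected outputs are assumed to be wrapped in one-element lists, and `exec` runs untrusted model output, so treat this as an illustration rather than a safe harness.

```python
def count_passed(code: str, fn_name: str, inputs: list, outputs: list) -> int:
    """Execute `code` and count the example cases its `fn_name` answers correctly."""
    scope = {}
    try:
        exec(code, scope)  # caution: executes untrusted generated code
        func = scope[fn_name]
    except Exception:
        # Syntax errors or a missing function mean no case can pass.
        return 0

    passed = 0
    for args, expected in zip(inputs, outputs):
        try:
            if func(*args) == expected[0]:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed case
    return passed


def pick_best_candidate(candidates: list, fn_name: str, inputs: list, outputs: list) -> str:
    """Return the candidate that passes the most example cases ('' if there are none)."""
    return max(
        candidates,
        key=lambda code: count_passed(code, fn_name, inputs, outputs),
        default="",
    )


candidates = [
    "def add(a, b):\n    return a - b",
    "def add(a, b):\n    return a + b",
    "def add(a, b) return",
]
best = pick_best_candidate(candidates, "add", [[1, 2], [0, 0]], [[3], [0]])
```

Ranking by the number of passed cases, rather than stopping at the first case any candidate passes, avoids returning code that only handles a single input.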

In [3]:
%pip install --upgrade pip
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


In [1]:
import os

# Read the token from the environment instead of hard-coding a secret in the notebook.
AA_TOKEN = os.getenv("AA_TOKEN")
# MODEL = "llama-3.1-8b-instruct-long-context"
MODEL = "llama-3.1-70b-instruct-long-context"

if AA_TOKEN is None:
    raise ValueError("Aleph Alpha Playground token is not set.")


In [83]:
from aleph_alpha_client import Client, CompletionRequest, Prompt
from utilities import load_sample
from multiprocessing import Pool, Manager
import re

client = Client(AA_TOKEN)

cache = {}

LARGE_INPUT_THRESHOLD = 100

# Keywords that indicate potentially complex problems
COMPLEXITY_KEYWORDS = [
    "optimize", "efficient", "mathematical", "combinatorial", "dynamic programming",
    "permutation", "combination", "graph", "tree", "dynamic", "recursion", "memoization",
    "greedy", "dp", "mathematics", "number theory", "advanced"
]

def is_complex_problem(problem: dict) -> bool:
    """
    Determine whether a problem is potentially complex based on its description.
    """
    question = problem.get('question', '').lower()
    # Check whether any keyword appears in the problem description
    for keyword in COMPLEXITY_KEYWORDS:
        if re.search(r'\b' + re.escape(keyword) + r'\b', question):
            return True
    return False

def generate_prompt(problem: dict, optimized: bool = False) -> str:
    if optimized:
        prompt = (
            "You are a CODE GENERATOR. Your task is to implement a **highly optimized** Python function based on the following details:\n\n"
            f"PROBLEM DESCRIPTION:\n{problem['question']}\n\n"
            "STARTER CODE (if any):\n"
            f"{problem.get('starter_code', '')}\n\n"
            "INSTRUCTIONS:\n"
            "1. Write **only the Python function implementation** that solves the problem.\n"
            "2. Ensure the implementation is:\n"
            "   - Free of syntax errors and ready to execute.\n"
            "   - Fully adherent to Python's PEP8 standards for code style.\n"
            "   - Optimized for performance, utilizing efficient algorithms and mathematical formulas where applicable.\n"
            "   - Example, you can use math.comb\n"
            "3. Do not include any:\n"
            "   - Comments or explanations.\n"
            "   - Test cases or print statements.\n"
            "   - Extra text outside the function body.\n"
            "4. Handle edge cases to avoid runtime errors such as key errors, division by zero, or type mismatches.\n\n"
            "5. Do not use greedy algorithms.\n"
            "EXAMPLE FUNCTION FORMAT:\n\n"
            "def function_name(parameters):\n"
            "    # Replace this line with the implementation\n\n"
            "ADDITIONAL CONSIDERATIONS:\n"
            "1. If there are input constraints, enforce them in the implementation.\n"
            "2. If the function relies on a specific data structure or library, ensure its usage is correct and imported.\n\n"
            "IMPLEMENT THE FUNCTION BELOW:\n"
        )
    else:
        prompt = (
            "You are a CODE GENERATOR. Your task is to implement a Python function based on the following details:\n\n"
            f"PROBLEM DESCRIPTION:\n{problem['question']}\n\n"
            "STARTER CODE (if any):\n"
            f"{problem.get('starter_code', '')}\n\n"
            "INSTRUCTIONS:\n"
            "1. Write **only the Python function implementation** that solves the problem.\n"
            "2. Ensure the implementation is:\n"
            "   - Free of syntax errors and ready to execute.\n"
            "   - Fully adherent to Python's PEP8 standards for code style.\n"
            "   - Optimal, concise, and free from redundancy.\n"
            "3. Do not include any:\n"
            "   - Comments or explanations.\n"
            "   - Test cases or print statements.\n"
            "   - Extra text outside the function body.\n"
            "4. Handle edge cases to avoid runtime errors such as key errors, division by zero, or type mismatches.\n"
            "5. MUST use hints if given.\n"
            "EXAMPLE FUNCTION FORMAT:\n\n"
            "def function_name(parameters):\n"
            "    # Replace this line with the implementation\n\n"
            "ADDITIONAL CONSIDERATIONS:\n"
            "1. If there are input constraints, enforce them in the implementation.\n"
            "2. If the function relies on a specific data structure or library, ensure its usage is correct and imported.\n\n"
            "IMPLEMENT THE FUNCTION BELOW:\n"
        )
    return prompt

def clean_code(generated_code: str) -> str:
    cleaned_code = generated_code.replace("```python", "").replace("```", "").strip()
    
    if cleaned_code.endswith(","):
        cleaned_code = cleaned_code.rsplit(",", 1)[0].strip()
    
    return cleaned_code

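def fetch_or_generate(prompt: str, client: Client) -> str:
    # Note: the cache is keyed by the prompt text alone, so a prompt that is
    # already cached returns the stored completion rather than a fresh sample.
    if prompt in cache:
        return cache[prompt]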

    request = CompletionRequest(
        prompt=Prompt.from_text(prompt),
        maximum_tokens=256,
        temperature=0.3,
    )
    response = client.complete(request, model=MODEL)
    result = clean_code(response.completions[0].completion)
    cache[prompt] = result
    return result

def generate_single_code(args) -> str:
    prompt, client = args
    return fetch_or_generate(prompt, client)

def generate_multiple_codes(problem: dict, client: Client, num_variations: int = 100) -> list[str]:
    # Determine whether the problem is potentially complex
    is_complex = is_complex_problem(problem)
    
    if is_complex:
        # Generate a capped number of optimized code variations
        num_variations = min(num_variations, 100)  # Cap at 100 variations
        prompts = [generate_prompt(problem, optimized=True) for _ in range(num_variations)]
    else:
        # Generate standard code variations
        prompts = [generate_prompt(problem) for _ in range(num_variations)]
    
    with Pool() as pool:
        results = pool.map(generate_single_code, [(prompt, client) for prompt in prompts])
    
    return results

def get_smallest_input(inputs: list[list]) -> list:
    """
    Return the input set with the fewest elements.
    """
    if not inputs:
        return []
    return min(inputs, key=lambda x: len(x))

def evaluate_code_variations_on_inputs(
    generated_codes: list[str], inputs: list[list], fn_name: str, is_complex: bool
) -> dict:
    gen_code_outputs = {}
    
    if is_complex and len(inputs) > 1:
        # Use only the smallest input set
        input_set = get_smallest_input(inputs)
        input_key = str(input_set)
        gen_code_outputs[input_key] = []

        for code in generated_codes:
            local_scope = {}
            try:
                exec(code, {}, local_scope)
                if fn_name not in local_scope:
                    gen_code_outputs[input_key].append(f"Error: Function '{fn_name}' not defined")
                    continue

                func = local_scope[fn_name]
                result = func(*input_set)
                gen_code_outputs[input_key].append(result)

            except Exception as e:
                gen_code_outputs[input_key].append(f"Error: {str(e)}")
    else:
        # Process all inputs as usual
        for input_set in inputs:
            input_key = str(input_set)
            gen_code_outputs[input_key] = []

            for code in generated_codes:
                local_scope = {}
                try:
                    exec(code, {}, local_scope)
                    if fn_name not in local_scope:
                        gen_code_outputs[input_key].append(f"Error: Function '{fn_name}' not defined")
                        continue

                    func = local_scope[fn_name]
                    result = func(*input_set)
                    gen_code_outputs[input_key].append(result)

                except Exception as e:
                    gen_code_outputs[input_key].append(f"Error: {str(e)}")
    
    return gen_code_outputs

def extract_ground_truth_outputs(problem: dict) -> dict:
    inputs = problem["input_output"]["inputs"]
    outputs = problem["input_output"]["outputs"]

    return {str(inputs[i]): outputs[i][0] for i in range(len(inputs))}

def find_correct_code(generated_codes: list[str], results: dict, ground_truth: dict) -> dict:
    correct_code_mapping = {}

    for input_key, outputs in results.items():
        if input_key in ground_truth:
            correct_indices = [
                idx for idx, output in enumerate(outputs) if output == ground_truth[input_key]
            ]
            correct_code_mapping[input_key] = [
                generated_codes[idx] for idx in correct_indices
            ]

    return correct_code_mapping

def generate_code(problem: dict, client: Client) -> str:
    generated_code_variations = generate_multiple_codes(problem, client, num_variations=100)

    ground_truth_outputs = extract_ground_truth_outputs(problem)

    inputs = problem["input_output"]["inputs"]
    fn_name = problem["input_output"]["fn_name"]
    is_complex = is_complex_problem(problem)

    gen_code_outputs = evaluate_code_variations_on_inputs(
        generated_codes=generated_code_variations,
        inputs=inputs,
        fn_name=fn_name,
        is_complex=is_complex
    )

    correct_code_mapping = find_correct_code(
        generated_codes=generated_code_variations,
        results=gen_code_outputs,
        ground_truth=ground_truth_outputs,
    )

    for input_key, codes in correct_code_mapping.items():
        if codes:
            return codes[0]

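    # Caution: this sentinel is not valid Python, so a scorer that executes the
    # returned string reports a SyntaxError for the problem.
    return "Error: No correct implementation found."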

problem = load_sample(index=72, dataset_path="./data/dev")
print(problem)
correct_code = generate_code(problem, client)

print("Correct Code:\n", correct_code)


{'problem_id': 3501, 'question': 'You have a grid with `$m$` rows and `$n$` columns. Return the number of unique ways that start from the top-left corner and go to the bottom-right corner. You are only allowed to move right and down.\n\nFor example, in the below grid of `$2$` rows and `$3$` columns, there are `$10$` unique paths:\n\n```\no----o----o----o\n|    |    |    |\no----o----o----o\n|    |    |    |\no----o----o----o\n```\n\n**Note:** there are random tests for grids up to 1000 x 1000 in most languages, so a naive solution will not work.\n\n---\n*Hint: use mathematical permutation and combination*', 'input_output': {'fn_name': 'number_of_routes', 'inputs': [[1, 1], [5, 1], [3, 4], [5, 6], [10, 10], [100, 3], [123, 456]], 'outputs': [[2], [6], [35], [462], [184756], [176851], [448843261729071620474858205566477025894357385375655014634306680560209909590802545266425906052279365647506075241055256064119806400]]}, 'url': 'https://www.codewars.com/kata/56a127b14d9687bba200004d', 'diffi
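
For the problem printed above, the hint points to a closed form: a route across `m` rows and `n` columns consists of `m + n` moves, `m` of which go down, so there are C(m + n, m) routes. The sketch below is one way to write that with `math.comb` (available from Python 3.8). It is an illustration checked against example cases from the printed problem, not the code the model generated.

```python
import math


def number_of_routes(m: int, n: int) -> int:
    """Count right/down routes across a grid of m rows and n columns."""
    # Choose which m of the m + n moves go down; the remaining n go right.
    return math.comb(m + n, m)


# Example cases copied from the `input_output` field of the printed problem.
examples = {
    (1, 1): 2,
    (5, 1): 6,
    (3, 4): 35,
    (5, 6): 462,
    (10, 10): 184756,
    (100, 3): 176851,
}

all_match = all(number_of_routes(m, n) == expected for (m, n), expected in examples.items())
print(all_match)
```

Because `math.comb` works on arbitrary-precision integers, the same function also handles the large grids mentioned in the problem note without a recursive or table-based search.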

In [43]:
from utilities import score

manager = Manager()
cache = manager.dict()

passed_problems, passed_test_cases = score(
    generation_func=generate_code, 
    client=client,
    dataset_path="./data/val", 
    length=300,
)

print(f"Passed {passed_problems*100}% of problems")
print(f"Passed {passed_test_cases*100}% of test cases")

  0%|          | 0/300 [00:00<?, ?it/s]

  0%|          | 1/300 [00:07<36:57,  7.42s/it]

[{'passed': True, 'input': [[1, 2, 7, 0, 5], 0], 'output': [5.0], 'expected_output': [5.0], 'traceback': None}]


  1%|          | 2/300 [00:11<27:27,  5.53s/it]

[{'passed': True, 'input': [[0.5, 0.5, 0.5], 30], 'output': [[0.5, 0.5, 0.5, 1.5, 2.5, 4.5, 8.5, 15.5, 28.5, 52.5, 96.5, 177.5, 326.5, 600.5, 1104.5, 2031.5, 3736.5, 6872.5, 12640.5, 23249.5, 42762.5, 78652.5, 144664.5, 266079.5, 489396.5, 900140.5, 1655616.5, 3045153.5, 5600910.5, 10301680.5]], 'expected_output': [[0.5, 0.5, 0.5, 1.5, 2.5, 4.5, 8.5, 15.5, 28.5, 52.5, 96.5, 177.5, 326.5, 600.5, 1104.5, 2031.5, 3736.5, 6872.5, 12640.5, 23249.5, 42762.5, 78652.5, 144664.5, 266079.5, 489396.5, 900140.5, 1655616.5, 3045153.5, 5600910.5, 10301680.5]], 'traceback': None}]


  1%|          | 3/300 [00:22<38:35,  7.80s/it]

[{'passed': True, 'input': ['1z 2t 3q 5x 6u 8a 7b'], 'output': [8], 'expected_output': [8], 'traceback': None}]


  1%|▏         | 4/300 [00:26<31:14,  6.33s/it]

[{'passed': True, 'input': [['B', 'C', '', '']], 'output': [''], 'expected_output': [''], 'traceback': None}]


  2%|▏         | 5/300 [00:34<33:46,  6.87s/it]

[{'passed': True, 'input': ['PLPPLPLLEELELRPFFMAAGGTPLAMMGG'], 'output': [50.0], 'expected_output': [50], 'traceback': None}]


  2%|▏         | 6/300 [00:36<26:33,  5.42s/it]

[{'passed': True, 'input': [[-6, 20, -1, 10, -12]], 'output': [3], 'expected_output': [3], 'traceback': None}]


  2%|▏         | 7/300 [00:39<22:05,  4.52s/it]

[{'passed': False, 'input': [17, 76], 'output': [753218634355645708621994113678658], 'expected_output': [60022109925215517405815155929907200], 'traceback': None}]


  3%|▎         | 8/300 [00:45<24:05,  4.95s/it]

[{'passed': True, 'input': [999.5, 61.87, 1000.0, 3, 0], 'output': [False], 'expected_output': [False], 'traceback': None}]


  3%|▎         | 9/300 [00:54<30:15,  6.24s/it]

[{'passed': True, 'input': [5000, 9], 'output': [[426, 2250, 967696]], 'expected_output': [[426, 2250, 967696]], 'traceback': None}]


  3%|▎         | 10/300 [00:58<27:55,  5.78s/it]

[{'passed': True, 'input': ["Wół go pyta: 'Panie chrząszczu,Po co pan tak brzęczy w gąszczu?'"], 'output': ["Wol go pyta: 'Panie chrzaszczu,Po co pan tak brzeczy w gaszczu?'"], 'expected_output': ["Wol go pyta: 'Panie chrzaszczu,Po co pan tak brzeczy w gaszczu?'"], 'traceback': None}]


  4%|▎         | 11/300 [01:01<23:16,  4.83s/it]

[{'passed': True, 'input': [90, 2], 'output': ['30x^3'], 'expected_output': ['30x^3'], 'traceback': None}]
type 0 compilation error = invalid syntax (<string>, line 16)


  4%|▍         | 12/300 [01:09<27:50,  5.80s/it]

[{'passed': False, 'input': None, 'output': None, 'expected_output': None, 'traceback': 'Traceback (most recent call last):\n  File "/home/gulden/makeathon/benchpress-hackathon/utilities/testing_util.py", line 185, in run_test\n    tmp_sol = RuntimeModule.from_string("tmp_sol", "", sol)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 169, in _newf\n    return self._items[f.__name__][len(args)](*args, **kwargs)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 279, in from_string\n    _exec(s, g)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 97, in _exec\n    def _exec(m,g): exec(m,g)\n  File "<string>", line 16\n    Error: No correct implementation found.\n              ^\nSyntaxError: invalid syntax\n'}]


  4%|▍         | 13/300 [01:17<30:01,  6.28s/it]

[{'passed': True, 'input': [5], 'output': [41], 'expected_output': [41], 'traceback': None}]


  5%|▍         | 14/300 [01:19<23:57,  5.03s/it]

[{'passed': True, 'input': ['GAAGCTTATCCGTTCCTGAAGGCTGTGGCATCCTCTAAATCAGACTTGGCTACGCCGTTAGCCGAGGGCTTAGCGTTGAGTGTCATTATATACGCGGCCTGCGACCTGGCCACACAATGCCCTCGAAAATTTTTCTTTCGGTTATACGAGTTGCGAAACCTTTCGCGCGTAGACGAAGAATTTGAAGTGGCCTACACCGTTTGGAAAGCCGTTCTCATTAGAATGGTACCGACTACTCGGCTCGGAGTCATTGTATAGGGAGAGTGTCGTATCAACATCACACACTTTTAGCATTTAAGGTCCATGGCCGTTGACAGGTACCGA'], 'output': ['GAAGCUUAUCCGUUCCUGAAGGCUGUGGCAUCCUCUAAAUCAGACUUGGCUACGCCGUUAGCCGAGGGCUUAGCGUUGAGUGUCAUUAUAUACGCGGCCUGCGACCUGGCCACACAAUGCCCUCGAAAAUUUUUCUUUCGGUUAUACGAGUUGCGAAACCUUUCGCGCGUAGACGAAGAAUUUGAAGUGGCCUACACCGUUUGGAAAGCCGUUCUCAUUAGAAUGGUACCGACUACUCGGCUCGGAGUCAUUGUAUAGGGAGAGUGUCGUAUCAACAUCACACACUUUUAGCAUUUAAGGUCCAUGGCCGUUGACAGGUACCGA'], 'expected_output': ['GAAGCUUAUCCGUUCCUGAAGGCUGUGGCAUCCUCUAAAUCAGACUUGGCUACGCCGUUAGCCGAGGGCUUAGCGUUGAGUGUCAUUAUAUACGCGGCCUGCGACCUGGCCACACAAUGCCCUCGAAAAUUUUUCUUUCGGUUAUACGAGUUGCGAAACCUUUCGCGCGUAGACGAAGAAUUUGAAGUGGCCUACACCGUUUGGAAAGCCGUUCUCAUUAGAAUGGUACCGACUACUCGGCUCGGAGUCAUUGUAUAGGGAGAGUGUCGUAUCAACAUCACA

  5%|▌         | 15/300 [01:22<21:40,  4.56s/it]

[{'passed': True, 'input': [3, 118], 'output': [121], 'expected_output': [121], 'traceback': None}]


  5%|▌         | 16/300 [01:24<17:35,  3.72s/it]

[{'passed': True, 'input': ['knowledge'], 'output': [96], 'expected_output': [96], 'traceback': None}]


  6%|▌         | 17/300 [01:33<24:43,  5.24s/it]

[{'passed': True, 'input': [501, 5000], 'output': [[998, 1996, 2994, 3992, 4990]], 'expected_output': [[998, 1996, 2994, 3992, 4990]], 'traceback': None}]


  6%|▌         | 18/300 [01:43<31:15,  6.65s/it]

[{'passed': False, 'input': ['oaisjdfowjefpoibugsjsofijeo oi bugs o bug f bug poaj sfd s'], 'output': ['oaisjdfowjefpoisjsofijeo oi s o  f  poaj sfd s'], 'expected_output': ['oaisjdfowjefpoibugsjsofijeo oi bugs o  f  poaj sfd s'], 'traceback': None}]


  6%|▋         | 19/300 [01:45<25:17,  5.40s/it]

[{'passed': False, 'input': [14, [13, 15, 14, 14, 15, 13]], 'output': ['13 13 14 14'], 'expected_output': ['13,13,14,14'], 'traceback': None}]


  7%|▋         | 20/300 [01:47<19:39,  4.21s/it]

[{'passed': True, 'input': [['anyone', 'want', 'to', 'hire', 'me?'], 'me?'], 'output': [True], 'expected_output': [True], 'traceback': None}]


  7%|▋         | 21/300 [01:58<29:22,  6.32s/it]

[{'passed': False, 'input': ['10♣', 'joker', '♠'], 'output': ['Someone cheats.'], 'expected_output': ['The second card won.'], 'traceback': None}]


  7%|▋         | 22/300 [02:03<27:20,  5.90s/it]

[{'passed': True, 'input': [702], 'output': ['ZZ'], 'expected_output': ['ZZ'], 'traceback': None}]
type 0 compilation error = invalid syntax (<string>, line 16)


  8%|▊         | 23/300 [02:09<28:15,  6.12s/it]

[{'passed': False, 'input': None, 'output': None, 'expected_output': None, 'traceback': 'Traceback (most recent call last):\n  File "/home/gulden/makeathon/benchpress-hackathon/utilities/testing_util.py", line 185, in run_test\n    tmp_sol = RuntimeModule.from_string("tmp_sol", "", sol)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 169, in _newf\n    return self._items[f.__name__][len(args)](*args, **kwargs)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 279, in from_string\n    _exec(s, g)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 97, in _exec\n    def _exec(m,g): exec(m,g)\n  File "<string>", line 16\n    Error: No correct implementation found.\n              ^\nSyntaxError: invalid syntax\n'}]
type 0 compilation error = invalid syntax (<string>, line 16)


  8%|▊         | 24/300 [02:12<23:02,  5.01s/it]

[{'passed': False, 'input': None, 'output': None, 'expected_output': None, 'traceback': 'Traceback (most recent call last):\n  File "/home/gulden/makeathon/benchpress-hackathon/utilities/testing_util.py", line 185, in run_test\n    tmp_sol = RuntimeModule.from_string("tmp_sol", "", sol)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 169, in _newf\n    return self._items[f.__name__][len(args)](*args, **kwargs)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 279, in from_string\n    _exec(s, g)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 97, in _exec\n    def _exec(m,g): exec(m,g)\n  File "<string>", line 16\n    Error: No correct implementation found.\n              ^\nSyntaxError: invalid syntax\n'}]


  8%|▊         | 25/300 [02:16<21:30,  4.69s/it]

[{'passed': True, 'input': [-25], 'output': [''], 'expected_output': [''], 'traceback': None}]


  9%|▊         | 26/300 [02:24<25:46,  5.64s/it]

[{'passed': True, 'input': [[[7, 66], [71, 7], [0, 94], [16, 93], [33, 49], [49, 81], [17, 2], [95, 71], [32, 14], [31, 41], [92, 72], [12, 79]], [{'y': 38, 'x': 32, 'id': 1}, {'y': 49, 'x': 73, 'id': 2}, {'y': 85, 'x': 50, 'id': 3}, {'y': 2, 'x': 79, 'id': 4}, {'y': 20, 'x': 44, 'id': 5}, {'y': 56, 'x': 17, 'id': 6}, {'y': 43, 'x': 26, 'id': 7}, {'y': 61, 'x': 89, 'id': 8}, {'y': 18, 'x': 15, 'id': 9}, {'y': 34, 'x': 41, 'id': 10}, {'y': 27, 'x': 99, 'id': 11}]], 'output': ['The best location is number 6 with the coordinates x = 17 and y = 56'], 'expected_output': ['The best location is number 6 with the coordinates x = 17 and y = 56'], 'traceback': None}]
Standard input runtime error or time limit exceeded error = invalid literal for int() with base 10: '12S'


  9%|▉         | 27/300 [02:29<25:36,  5.63s/it]

[{'passed': False, 'input': [['12S', 'TGTTTCTCCAAG']], 'output': None, 'expected_output': [False], 'traceback': 'Traceback (most recent call last):\n  File "/home/gulden/makeathon/benchpress-hackathon/utilities/testing_util.py", line 296, in run_test\n    output = method(*inputs)\n  File "<string>", line 21, in is_matched\nValueError: invalid literal for int() with base 10: \'12S\'\n'}]


  9%|▉         | 28/300 [02:33<22:20,  4.93s/it]

[{'passed': True, 'input': [['b', 'd']], 'output': ['c'], 'expected_output': ['c'], 'traceback': None}]


 10%|▉         | 29/300 [02:36<20:39,  4.57s/it]

[{'passed': False, 'input': [[20, 26, 13, -47, -35, 39, 24, 46, -16, 5, 46, -30, -33, -38, -47, 23, 10, -39, -36, 41, 5, -24, 28, -30, 40, -24, -28, -17, -36, 41]], 'output': [None], 'expected_output': [5], 'traceback': None}]


 10%|█         | 30/300 [02:42<22:07,  4.92s/it]

[{'passed': True, 'input': ['', ''], 'output': [[]], 'expected_output': [[]], 'traceback': None}]


 10%|█         | 31/300 [02:53<30:55,  6.90s/it]

[{'passed': True, 'input': ['2d6++4'], 'output': [False], 'expected_output': [False], 'traceback': None}]


 11%|█         | 32/300 [02:56<25:21,  5.68s/it]

[{'passed': True, 'input': [10, -10, 10], 'output': [False], 'expected_output': [False], 'traceback': None}]


 11%|█         | 33/300 [03:00<23:03,  5.18s/it]

[{'passed': False, 'input': ['UFFDDFDUDFUFUUFFDDFDUDFUFUUFFDDFDUDFUFUUFFDDFDUDFUFUUFFDDFDUDFUFUUFFDDFDUDFUFU'], 'output': [0], 'expected_output': [6], 'traceback': None}]


 11%|█▏        | 34/300 [03:06<23:17,  5.25s/it]

[{'passed': True, 'input': [9], 'output': ['Nine'], 'expected_output': ['Nine'], 'traceback': None}]


 12%|█▏        | 35/300 [03:23<39:36,  8.97s/it]

[{'passed': False, 'input': [20000, 5], 'output': ['224743224759224771224797224813'], 'expected_output': ['09334'], 'traceback': None}]
Standard input runtime error or time limit exceeded error = list index out of range


 12%|█▏        | 36/300 [03:35<43:19,  9.84s/it]

[{'passed': False, 'input': ['2M ohms'], 'output': None, 'expected_output': ['red black green gold'], 'traceback': 'Traceback (most recent call last):\n  File "/home/gulden/makeathon/benchpress-hackathon/utilities/testing_util.py", line 296, in run_test\n    output = method(*inputs)\n  File "<string>", line 31, in encode_resistor_colors\nIndexError: list index out of range\n'}]


 12%|█▏        | 37/300 [03:39<35:11,  8.03s/it]

[{'passed': True, 'input': ['pippi'], 'output': ["'pippi'"], 'expected_output': ["'pippi'"], 'traceback': None}]


 13%|█▎        | 38/300 [03:44<30:24,  6.96s/it]

[{'passed': True, 'input': [63761, 3], 'output': [1.0], 'expected_output': [1], 'traceback': None}, {'passed': True, 'input': [132921, 3], 'output': [4.0], 'expected_output': [4], 'traceback': None}, {'passed': True, 'input': [10383, 6], 'output': [12933.0], 'expected_output': [12933], 'traceback': None}]


 13%|█▎        | 39/300 [03:50<29:27,  6.77s/it]

[{'passed': True, 'input': ['Vegan Black Metal Chef hits the big time on Amazon TV'], 'output': ['Vëgän Bläck Mëtäl Chëf hïts thë bïg tïmë ön Ämäzön TV'], 'expected_output': ['Vëgän Bläck Mëtäl Chëf hïts thë bïg tïmë ön Ämäzön TV'], 'traceback': None}]


 13%|█▎        | 40/300 [03:59<32:52,  7.59s/it]

[{'passed': True, 'input': [[2, 169, 13, -5, 0, -1], 4], 'output': [2], 'expected_output': [2], 'traceback': None}]


 14%|█▎        | 41/300 [04:01<25:14,  5.85s/it]

[{'passed': True, 'input': [9453], 'output': [False], 'expected_output': [False], 'traceback': None}]


 14%|█▍        | 42/300 [04:10<29:25,  6.84s/it]

[{'passed': True, 'input': [[1079, 490, 339, 180], [180, 250, 1200, 1980]], 'output': [[4, 4, 1, 1]], 'expected_output': [[4, 4, 1, 1]], 'traceback': None}]


 14%|█▍        | 43/300 [04:18<30:06,  7.03s/it]

[{'passed': True, 'input': [''], 'output': [''], 'expected_output': [''], 'traceback': None}]


 15%|█▍        | 44/300 [04:22<26:15,  6.16s/it]

[{'passed': True, 'input': [['hello', 2, ['text', [4, 5]]], [[]], '[list]'], 'output': [['hello', 2, 'text', 4, 5, '[list]']], 'expected_output': [['hello', 2, 'text', 4, 5, '[list]']], 'traceback': None}]


 15%|█▌        | 45/300 [04:24<20:38,  4.86s/it]

[{'passed': True, 'input': [''], 'output': [True], 'expected_output': [True], 'traceback': None}]


 15%|█▌        | 46/300 [04:26<16:48,  3.97s/it]

[{'passed': True, 'input': [20, 5, 5], 'output': [0], 'expected_output': [0], 'traceback': None}]


 16%|█▌        | 47/300 [04:31<18:05,  4.29s/it]

[{'passed': True, 'input': [[9, 1], [9, 1], 9], 'output': [1], 'expected_output': [1], 'traceback': None}]


 16%|█▌        | 48/300 [04:34<17:13,  4.10s/it]

[{'passed': True, 'input': ['ppipip', 'Pippi'], 'output': [False], 'expected_output': [False], 'traceback': None}]
Disarium !!
Not !!


 16%|█▋        | 49/300 [04:41<19:53,  4.76s/it]

[{'passed': True, 'input': [2646798], 'output': ['Disarium !!'], 'expected_output': ['Disarium !!'], 'traceback': None}]


 17%|█▋        | 50/300 [04:45<18:45,  4.50s/it]

[{'passed': True, 'input': ['krish21an'], 'output': ['nahsirk'], 'expected_output': ['nahsirk'], 'traceback': None}]
type 0 compilation error = invalid syntax (<string>, line 16)


 17%|█▋        | 51/300 [04:53<23:35,  5.69s/it]

[{'passed': False, 'input': None, 'output': None, 'expected_output': None, 'traceback': 'Traceback (most recent call last):\n  File "/home/gulden/makeathon/benchpress-hackathon/utilities/testing_util.py", line 185, in run_test\n    tmp_sol = RuntimeModule.from_string("tmp_sol", "", sol)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 169, in _newf\n    return self._items[f.__name__][len(args)](*args, **kwargs)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 279, in from_string\n    _exec(s, g)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 97, in _exec\n    def _exec(m,g): exec(m,g)\n  File "<string>", line 16\n    Error: No correct implementation found.\n              ^\nSyntaxError: invalid syntax\n'}]


 17%|█▋        | 52/300 [04:57<21:45,  5.26s/it]

[{'passed': True, 'input': ['{{{{{{{{{{{((((((([])))))))}}}}}}}}}}'], 'output': [False], 'expected_output': [False], 'traceback': None}]


 18%|█▊        | 53/300 [05:00<18:16,  4.44s/it]

[{'passed': True, 'input': [[2, 4, 7], 1, 101], 'output': [66963], 'expected_output': [66963], 'traceback': None}]


 18%|█▊        | 54/300 [05:10<25:03,  6.11s/it]

[{'passed': True, 'input': ['abcde', '2db2a2ec'], 'output': ['2'], 'expected_output': ['2'], 'traceback': None}]


 18%|█▊        | 55/300 [05:18<27:25,  6.72s/it]

[{'passed': True, 'input': ['(((1+2)-(3)))', ['(', ')']], 'output': [['1+2', '-', '3']], 'expected_output': [['1+2', '-', '3']], 'traceback': None}]


 19%|█▊        | 56/300 [05:28<31:17,  7.70s/it]

[{'passed': True, 'input': [[[[3], [4], [5]], [9], [9, 9], [8], [[1, 2, 3], [77777]], [['a']]]], 'output': [[[3], [4], [5], 9, 9, 9, 8, [1, 2, 3], [77777], ['a']]], 'expected_output': [[[3], [4], [5], 9, 9, 9, 8, [1, 2, 3], [77777], ['a']]], 'traceback': None}]


 19%|█▉        | 57/300 [05:30<24:15,  5.99s/it]

[{'passed': True, 'input': [['jASon', 'JAsoN', 'JaSON', 'jasON'], ['JasoN', 'jASOn', 'JAsoN', 'jASon', 'JASON']], 'output': [['JAsoN', 'jASon']], 'expected_output': [['JAsoN', 'jASon']], 'traceback': None}]


 19%|█▉        | 58/300 [05:36<24:36,  6.10s/it]

[{'passed': False, 'input': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], 'output': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], 'expected_output': [[0, 0, 0, 0, 0, 0, 0]], 'traceback': None}]


 20%|█▉        | 59/300 [05:45<27:50,  6.93s/it]

[{'passed': True, 'input': ['peter', 'peter'], 'output': [0], 'expected_output': [0], 'traceback': None}]


 20%|██        | 60/300 [05:55<31:10,  7.79s/it]

[{'passed': True, 'input': [4], 'output': [328], 'expected_output': [328], 'traceback': None}]


 20%|██        | 61/300 [06:01<29:24,  7.38s/it]

[{'passed': True, 'input': [999, 2500], 'output': [[2222, 2223, 2225, 2227, 2232, 2233, 2235, 2252, 2253, 2255, 2257, 2272, 2275, 2277, 2322, 2323, 2325, 2327, 2332, 2335, 2337, 2352, 2353, 2355, 2372, 2373, 2375]], 'expected_output': [[2222, 2223, 2225, 2227, 2232, 2233, 2235, 2252, 2253, 2255, 2257, 2272, 2275, 2277, 2322, 2323, 2325, 2327, 2332, 2335, 2337, 2352, 2353, 2355, 2372, 2373, 2375]], 'traceback': None}]


 21%|██        | 62/300 [06:10<30:46,  7.76s/it]

[{'passed': True, 'input': [2, 10000000, 11000000], 'output': [[10000139, 10000141]], 'expected_output': [[10000139, 10000141]], 'traceback': None}]


 21%|██        | 63/300 [06:13<24:30,  6.20s/it]

[{'passed': True, 'input': [4], 'output': [[1, 2, 4, 8, 16]], 'expected_output': [[1, 2, 4, 8, 16]], 'traceback': None}]


 21%|██▏       | 64/300 [06:15<19:51,  5.05s/it]

[{'passed': True, 'input': [16], 'output': [3], 'expected_output': [3], 'traceback': None}]


 22%|██▏       | 65/300 [06:20<20:11,  5.16s/it]

[{'passed': False, 'input': [60], 'output': [[0, 1, 0]], 'expected_output': [[0, 0, 3]], 'traceback': None}]


 22%|██▏       | 66/300 [06:23<17:06,  4.39s/it]

[{'passed': True, 'input': [1, 0, 0], 'output': [3600000], 'expected_output': [3600000], 'traceback': None}]
type 0 compilation error = invalid syntax (<string>, line 16)


 22%|██▏       | 67/300 [06:34<25:11,  6.49s/it]

[{'passed': False, 'input': None, 'output': None, 'expected_output': None, 'traceback': 'Traceback (most recent call last):\n  File "/home/gulden/makeathon/benchpress-hackathon/utilities/testing_util.py", line 185, in run_test\n    tmp_sol = RuntimeModule.from_string("tmp_sol", "", sol)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 169, in _newf\n    return self._items[f.__name__][len(args)](*args, **kwargs)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 279, in from_string\n    _exec(s, g)\n  File "/home/gulden/makeathon/benchpress-hackathon/.venv/lib64/python3.9/site-packages/pyext.py", line 97, in _exec\n    def _exec(m,g): exec(m,g)\n  File "<string>", line 16\n    Error: No correct implementation found.\n              ^\nSyntaxError: invalid syntax\n'}]


 23%|██▎       | 68/300 [06:37<20:37,  5.33s/it]

[{'passed': True, 'input': ['hello, how, are, you, doing, today', 3], 'output': ['today, are, doing, hello, you, how'], 'expected_output': ['today, are, doing, hello, you, how'], 'traceback': None}]


 23%|██▎       | 69/300 [06:43<21:40,  5.63s/it]

[{'passed': True, 'input': [6, 6], 'output': [[['O', 'X', 'O', 'X', 'O', 'X'], ['X', 'O', 'X', 'O', 'X', 'O'], ['O', 'X', 'O', 'X', 'O', 'X'], ['X', 'O', 'X', 'O', 'X', 'O'], ['O', 'X', 'O', 'X', 'O', 'X'], ['X', 'O', 'X', 'O', 'X', 'O']]], 'expected_output': [[['O', 'X', 'O', 'X', 'O', 'X'], ['X', 'O', 'X', 'O', 'X', 'O'], ['O', 'X', 'O', 'X', 'O', 'X'], ['X', 'O', 'X', 'O', 'X', 'O'], ['O', 'X', 'O', 'X', 'O', 'X'], ['X', 'O', 'X', 'O', 'X', 'O']]], 'traceback': None}]


 23%|██▎       | 70/300 [06:48<20:36,  5.38s/it]

[{'passed': True, 'input': [['sheep', 'sheep', 'wolf']], 'output': ['Pls go away and stop eating my sheep'], 'expected_output': ['Pls go away and stop eating my sheep'], 'traceback': None}]


 24%|██▎       | 71/300 [06:52<18:46,  4.92s/it]

[{'passed': True, 'input': [['inECnBMAA/u', 'ABAaIUOUx/M']], 'output': [True], 'expected_output': [True], 'traceback': None}]


 24%|██▎       | 71/300 [24:12<1:18:04, 20.46s/it]


KeyboardInterrupt: 