# 🚀 Optimizing Task Efficiency in ARC-AGI with Smart Sorting

This notebook introduces a function designed to **optimize the order in which challenges are processed** by sorting them by an estimate of their computational complexity. Here, complexity is approximated by the total number of cells in the input and output grids of a challenge's training examples.

The function supports both ascending and descending order and defaults to **ascending order**. By tackling the smaller tasks first, we can **maximize the number of challenges completed** during time-limited evaluations: if time runs out, it runs out on the large puzzles rather than on the small ones, which makes this an **efficient approach** to the ARC-AGI challenge. 🧠💡
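As a rough illustration of why ascending order helps under a time limit, here is a minimal sketch of a time-budgeted loop. The names `run_with_time_budget`, `solve`, and `toy_sizes` are hypothetical and are not part of this notebook or of any ARC-AGI tooling; `solve` stands in for whatever solver you use.

```python
import time

def run_with_time_budget(ordered_ids, solve, budget_seconds):
    """Call solve(challenge_id) in the given order until the time budget is spent.

    `solve` is a hypothetical callable supplied by the caller.
    """
    deadline = time.monotonic() + budget_seconds
    results = {}
    for challenge_id in ordered_ids:
        if time.monotonic() >= deadline:
            break  # out of time: the remaining (larger) challenges are skipped
        results[challenge_id] = solve(challenge_id)
    return results

# Toy data: challenge ID -> made-up cell count (assumption, for illustration only)
toy_sizes = {"big": 900, "small": 4, "medium": 100}
ordered = sorted(toy_sizes, key=toy_sizes.get)  # smallest first

done = run_with_time_budget(ordered, solve=lambda cid: toy_sizes[cid], budget_seconds=5.0)
print(list(done))
```

With a generous budget every challenge is attempted. With a tight one, the loop stops early and only the largest challenges are left unattempted.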

## Initialization

🚨🚨🚨 Set `base_path = prod_path` (instead of `dev_path`) for Kaggle testing 🚨🚨🚨

In [1]:
from abstract_and_reason.assets import load_json

dev_path = '../data/challenges/' # your own challenge directory
prod_path = '/kaggle/input/arc-prize-2024/' # path may change in 2025 arc-prize

base_path = dev_path # /!\ change dev_path to prod_path for Kaggle testing

# Reading files
training_challenges = load_json(base_path + 'arc-agi_training_challenges.json')
training_solutions = load_json(base_path + 'arc-agi_training_solutions.json')
evaluation_challenges = load_json(base_path + 'arc-agi_evaluation_challenges.json')
evaluation_solutions = load_json(base_path + 'arc-agi_evaluation_solutions.json')
test_challenges = load_json(base_path + 'arc-agi_test_challenges.json')
sample_submission = load_json(base_path + 'sample_submission.json')
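
If you would rather not edit the path by hand, one option is to pick it automatically. This is only a sketch: `pick_base_path` is a hypothetical helper, and it assumes the Kaggle input directory exists only when the notebook runs on Kaggle.

```python
from pathlib import Path

def pick_base_path(dev_path, prod_path):
    """Return prod_path when that directory exists (e.g. on Kaggle), else dev_path."""
    return prod_path if Path(prod_path).is_dir() else dev_path

# Same two paths as in the cell above
base_path = pick_base_path('../data/challenges/', '/kaggle/input/arc-prize-2024/')
print(base_path)
```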

## The sorting function

Feel free to adapt the complexity rule to your needs.
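For instance, a different rule can be plugged in by passing the complexity measure as an argument. The sketch below is one possible adaptation, not the function defined in this notebook: `largest_grid_cells` and `sort_challenges_by` are hypothetical names, and the rule (size of the largest grid, including test inputs) is just an example.

```python
def grid_cells(grid):
    """Number of cells in one grid (a list of rows)."""
    return sum(len(row) for row in grid)

def largest_grid_cells(challenge):
    """Hypothetical rule: size of the largest grid among train pairs and test inputs."""
    grids = []
    for example in challenge['train']:
        grids.append(example['input'])
        grids.append(example['output'])
    for example in challenge.get('test', []):
        grids.append(example['input'])
    return max(grid_cells(grid) for grid in grids)

def sort_challenges_by(challenges, complexity, ascending=True):
    """Return challenge IDs sorted by complexity(challenge); ties are broken by ID."""
    return sorted(
        challenges,
        key=lambda cid: (complexity(challenges[cid]), cid),
        reverse=not ascending,
    )

# Toy challenges (made up for illustration)
toy = {
    "a": {"train": [{"input": [[0, 0], [0, 0]], "output": [[1]]}],
          "test": [{"input": [[0] * 3 for _ in range(3)]}]},  # largest grid: 9 cells
    "b": {"train": [{"input": [[0]], "output": [[0, 1]]}],
          "test": [{"input": [[0]]}]},                        # largest grid: 2 cells
}
print(sort_challenges_by(toy, largest_grid_cells))
```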

In [2]:
def sort_challenges_by_size(challenges, ascending=True):
    """
    Sorts the challenges by the number of cells in their training examples (input+output).

    This function takes a dictionary of challenges and returns their IDs sorted by
    the total number of cells (elements) in the 'input' and 'output' grids of the
    'train' examples.

    Parameters:
    -----------
    challenges : dict
        A dictionary where keys are challenge IDs and values are challenge details.
        Each challenge contains a 'train' key, which is a list of examples, and each 
        example has 'input' and 'output' lists of lists.

    ascending : bool, optional (default=True)
        If True, the challenges are sorted in ascending order by the number of cells.
        If False, they are sorted in descending order.

    Returns:
    --------
    list
        A list of challenge IDs sorted by the number of cells in the 'train' examples.


    Example:
    --------
    res = sort_challenges_by_size(training_challenges)
    """
    def extract_numbers(list_of_lists):
        # Number of cells in one grid (sum of the row lengths)
        return sum(len(sublist) for sublist in list_of_lists)

    def count_challenge_cells(challenge):
        # Total cells over all 'train' examples, input and output grids combined
        return sum(
            extract_numbers(example['input']) + extract_numbers(example['output'])
            for example in challenge['train']
        )

    def sort_ids_by_numbers(ids, numbers, ascending=True):
        # Sort (number, id) pairs, so equal cell counts are ordered by challenge ID
        return [challenge_id for _, challenge_id in sorted(zip(numbers, ids), reverse=not ascending)]
        
    challenge_ids = list(challenges)
    numbers = [count_challenge_cells(challenges[_id]) for _id in challenge_ids]

    return sort_ids_by_numbers(challenge_ids, numbers, ascending=ascending)
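
To make the rule concrete, here is a small worked example on hand-made data. It restates the same cell-counting rule in compact form so that it runs on its own; `total_train_cells` and `toy_challenges` are illustrative names, not part of the notebook.

```python
def total_train_cells(challenge):
    """Total cells in the input and output grids of all 'train' examples."""
    return sum(
        sum(len(row) for row in example[key])
        for example in challenge['train']
        for key in ('input', 'output')
    )

toy_challenges = {
    # one example: 5x5 input + 5x5 output = 25 + 25 = 50 cells
    "large": {"train": [{"input": [[0] * 5 for _ in range(5)],
                         "output": [[0] * 5 for _ in range(5)]}]},
    # one example: 1 + 1 = 2 cells
    "small": {"train": [{"input": [[0]], "output": [[1]]}]},
    # two examples: (4 + 4) + (2 + 2) = 12 cells
    "medium": {"train": [{"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
                         {"input": [[0, 0]], "output": [[1, 1]]}]},
}

sizes = {cid: total_train_cells(ch) for cid, ch in toy_challenges.items()}
# Sorting on (size, id) mirrors sorting zip(numbers, ids) in the function above
ascending_ids = sorted(toy_challenges, key=lambda cid: (sizes[cid], cid))
descending_ids = sorted(toy_challenges, key=lambda cid: (sizes[cid], cid), reverse=True)
print(ascending_ids, descending_ids)
```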

## Testing

In [3]:
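# Module-level copies of the helpers nested inside sort_challenges_by_size,
# so the checks below can recompute each challenge's cell count independently.
def count_challenge_cells(challenge):
    return sum(
        extract_numbers(example['input']) + extract_numbers(example['output'])
        for example in challenge['train']
    )

def extract_numbers(list_of_lists):
    return sum(len(sublist) for sublist in list_of_lists)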

### Ascending challenge sorting

In [4]:
res = sort_challenges_by_size(training_challenges)

last = float('-inf')

for _id in res:
    challenge = training_challenges[_id]
    nb_cells = count_challenge_cells(challenge)
    assert nb_cells >= last, f"{nb_cells} is smaller than the previous challenge's cell count {last}"
    last = nb_cells
    
print("Ordered properly!")

Ordered properly!


### Descending challenge sorting

In [5]:
res = sort_challenges_by_size(training_challenges, ascending=False) # Descending

last = float('inf')

for _id in res:
    challenge = training_challenges[_id]
    nb_cells = count_challenge_cells(challenge)
    assert nb_cells <= last, f"{nb_cells} is larger than the previous challenge's cell count {last}"
    last = nb_cells
    
print("Ordered properly!")

Ordered properly!


### 💡 Save Yourself the Headache! 💡

I've made the mistakes so you don't have to! Now, you'll save tons of time working on ARC-AGI. 

If this notebook helped you avoid common pitfalls or sped up your progress, I'd love your support!

- **Follow me on Kaggle:** [Malo Le Mestre](https://www.kaggle.com/malolem)
- **Leave a ⭐ on the GitHub repo** [here](https://github.com/MaloLM/arc-agi-genesis) to show your appreciation and keep the project growing!
- **Upvote this notebook** on Kaggle if it saved you from banging your head against the wall!

Your feedback keeps me motivated and helps others avoid the same challenges. 

# Thank you! 🚀✨