# **Day 2: Cube Conundrum**
This one doesn't seem too bad! Part 1 looks straightforward, but I want to build a fairly robust parser in case Part 2 turns out to be harder.

# Setup
The cells below will set up the rest of the notebook. 

I'll start by configuring my kernel:

In [1]:
# Changing the current working directory
%cd ..

# Enabling the autoreload extension
%load_ext autoreload
%autoreload 2

d:\data\programming\advent-of-code-2023


Now, I'm going to import some libraries:

In [2]:
# Import statements
import pandas as pd
import re

Finally, I'll load in the data for this puzzle. 

In [3]:
# Load in the data for the puzzle
with open("data/input-files/day-02-input.txt", "r") as txt_file:
    input_data = txt_file.readlines()

# Parsing the Input Data
I need a method that will parse the input data. Below, I'll define one that extracts the game ID, the cube counts for each set, and the per-color minimum and maximum from each line of the input.

In [58]:
def parse_game_record(input_str):
    """
    This method will parse different pieces of info from the `input_str`, which
    represents a record of one of the elf's cube games (lol)
    """

    # Indicate the possible cube colors
    cube_colors = ["red", "green", "blue"]

    # Extract the game ID from the input_str
    game_id = int(input_str.split(":")[0].split(" ")[-1])

    # Extract each of the set strings
    set_strings = [x.strip() for x in input_str.split(":")[-1].strip().split(";")]

    # Parse the set strings into set record dicts
    def parse_set_record_from_set_string(set_str):
        """
        This helper method will parse a `set_str`, which looks like this:

        `'7 green, 4 blue, 3 red'`

        It'll return a dictionary that looks like this:

        `{"green": 7, "blue": 4, "red": 3}`

        It'll *always* have numerical values for each of the colors. If a color is missing from
        a particular set_str, it'll be added in.
        """

        # Parse a dictionary from the set_str
        set_dict = {
            color_record.split(" ")[-1]: int(color_record.split(" ")[0])
            for color_record in [x.strip() for x in set_str.split(",")]
        }

        # Add in the missing keys
        for key in cube_colors:
            if key not in set_dict:
                set_dict[key] = 0

        # Return the set_dict
        return set_dict

    # Parse each of the set_strings into set_dicts
    set_dicts = [parse_set_record_from_set_string(set_str) for set_str in set_strings]

    # Identify the maximum and minimum number of reds, greens, and blues in total
    min_val_by_color = {}
    max_val_by_color = {}
    for color in cube_colors:
        # Identify each of the counts associated with this color
        color_cts = [set_dict.get(color) for set_dict in set_dicts]

        # Extract the minimum and maximum
        min_val_by_color[color] = min(color_cts)
        max_val_by_color[color] = max(color_cts)

    # Sum the per-color minima and the per-color maxima across all colors
    min_num_cubes = sum(min_val_by_color.values())
    max_num_cubes = sum(max_val_by_color.values())

    # Now, we're going to create a dictionary containing all of the pertinent information
    # about the game record
    game_record_info_dict = (
        {
            "game_id": game_id,
            "original_str": input_str,
            "set_strings": set_strings,
            "set_dicts": set_dicts,
            "minimum_amt_of_cubes": min_num_cubes,
            "maximum_amt_of_cubes": max_num_cubes,
        }
        | {f"min_{color}": min_val_by_color[color] for color in cube_colors}
        | {f"max_{color}": max_val_by_color[color] for color in cube_colors}
    )

    # Return the game_record_info_dict
    return game_record_info_dict

That method might've been unnecessarily complex, but I'm going to get a lot of mileage out of it. Let's make a `DataFrame` of all of the information I can extract from each game.

In [59]:
# Parse each of the game_record_str from the `input_data`
game_record_info_df = pd.DataFrame.from_records(
    [parse_game_record(game_record_str) for game_record_str in input_data]
)

# Show the first couple of rows of the DataFrame
game_record_info_df.head(3)

| | game_id | original_str | set_strings | set_dicts | minimum_amt_of_cubes | maximum_amt_of_cubes | min_red | min_green | min_blue | max_red | max_green | max_blue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Game 1: 7 green, 4 blue, 3 red; 4 blue, 10 red... | [7 green, 4 blue, 3 red, 4 blue, 10 red, 1 gre... | [{'green': 7, 'blue': 4, 'red': 3}, {'blue': 4... | 4 | 21 | 3 | 0 | 1 | 10 | 7 | 4 |
| 1 | 2 | Game 2: 2 red, 4 blue, 3 green; 5 green, 3 red... | [2 red, 4 blue, 3 green, 5 green, 3 red, 1 blu... | [{'red': 2, 'blue': 4, 'green': 3}, {'green': ... | 6 | 13 | 2 | 3 | 1 | 3 | 5 | 5 |
| 2 | 3 | Game 3: 12 red, 1 blue; 6 red, 2 green, 3 blue... | [12 red, 1 blue, 6 red, 2 green, 3 blue, 2 blu... | [{'red': 12, 'blue': 1, 'green': 0}, {'red': 6... | 6 | 18 | 5 | 0 | 1 | 12 | 3 | 3 |
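
As a cross-check on the `max_*` columns, here's a minimal sketch of a regex-based alternative. It is not the parser used in this notebook: `max_cubes_by_color` and the `sample_record` string are hypothetical, made up for illustration, and the sketch assumes every count is written as `<number> <color>`.

```python
import re


def max_cubes_by_color(game_record_str):
    """Hypothetical helper: largest count seen for each color in one game record."""
    max_by_color = {"red": 0, "green": 0, "blue": 0}

    # Each match is a (count, color) pair such as ("7", "green")
    for count, color in re.findall(r"(\d+) (red|green|blue)", game_record_str):
        max_by_color[color] = max(max_by_color[color], int(count))

    return max_by_color


# Made-up record in the puzzle's format (not taken from the real input)
sample_record = "Game 99: 7 green, 4 blue, 3 red; 4 blue, 10 red, 1 green"
print(max_cubes_by_color(sample_record))
# {'red': 10, 'green': 7, 'blue': 4}
```

The game ID is skipped automatically because `99:` is not followed by a space and a color name.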


# Answering the Elf's Question
Now that I've got parsed input data, I can actually answer the Elf's question: 

```
The Elf would first like to know which games would have been possible if the bag contained only 12 red cubes, 13 green cubes, and 14 blue cubes?
```

I'm just going to make a method that checks if a certain configuration of colors is possible given the information I know about a game (in `dict` form).

In [67]:
def determine_if_game_is_possible(cube_configuration, game_record_info_dict):
    """
    This method will determine whether a particular `cube_configuration` (given 
    in the form `{"red": 12, "green": 13, "blue": 14}`) is possible given what we know 
    about a game. "What we know about a game" is represented within the 
    `game_record_info_dict`. 
    """
        
    # Check whether the game ever showed more cubes of a color than the
    # `cube_configuration` allows for that color
    for cur_color, cur_ct in cube_configuration.items():
        
        # Determine the maximum number seen in the game 
        max_seen_for_color = game_record_info_dict.get(f"max_{cur_color}")
        
        # If the max_seen_for_color is larger than the cur_ct, this game is impossible
        if max_seen_for_color > cur_ct:
            return False

    # If we've made it this far, return True
    return True

With this method in hand, I'll try to determine which games were possible.

In [73]:
# Make a copy of the DataFrame
game_records_with_possibilities_df = game_record_info_df.copy()

# Indicate the target cube configuration we're aiming to check against
target_cube_configuration = {"red": 12, "green": 13, "blue": 14}

# Add a column indicating which games are possible
game_records_with_possibilities_df["possible_game"] = [
    determine_if_game_is_possible(target_cube_configuration, game_record_info_dict)
    for game_record_info_dict in game_records_with_possibilities_df.to_dict(
        orient="records"
    )
]

# Show off some games that are impossible with the target configuration
print(f"TARGET CONFIGURATION: {target_cube_configuration}\n")
print(
    f'EXAMPLE INVALID GAME: {game_records_with_possibilities_df.query("possible_game==False").head(1).iloc[0].original_str}'
)

TARGET CONFIGURATION: {'red': 12, 'green': 13, 'blue': 14}

EXAMPLE INVALID GAME: Game 5: 3 blue, 3 red, 8 green; 5 blue, 1 red; 1 green, 19 blue, 3 red; 1 red, 5 green, 3 blue; 4 green, 20 blue, 4 red; 20 blue, 4 green

In [68]:
game_records_with_possibilities_df.head(1)

| | game_id | original_str | set_strings | set_dicts | minimum_amt_of_cubes | maximum_amt_of_cubes | min_red | min_green | min_blue | max_red | max_green | max_blue | possible_game |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Game 1: 7 green, 4 blue, 3 red; 4 blue, 10 red... | [7 green, 4 blue, 3 red, 4 blue, 10 red, 1 gre... | [{'green': 7, 'blue': 4, 'red': 3}, {'blue': 4... | 4 | 21 | 3 | 0 | 1 | 10 | 7 | 4 | True |
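
To see why Game 5 gets flagged, it helps to line its per-color maxima up against the target. The sketch below hard-codes the maxima read off the Game 5 record printed above; `game_5_maxima` and `offending_colors` are illustrative names, not part of the notebook's code.

```python
target_cube_configuration = {"red": 12, "green": 13, "blue": 14}

# Largest count of each color across the six sets of the Game 5 record above
game_5_maxima = {"red": 4, "green": 8, "blue": 20}

# Keep only the colors whose maximum exceeds what the bag could contain
offending_colors = {
    color: seen
    for color, seen in game_5_maxima.items()
    if seen > target_cube_configuration[color]
}
print(offending_colors)
# {'blue': 20}
```

Only blue breaks the limit (20 shown against 14 available), and a single offending color is enough to make the game impossible.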



Now, the puzzle asks for the sum of the IDs of the games that would've been possible with the given configuration. That's easily done:

In [77]:
# Get the sum of the possible games' IDs
game_id_sum = game_records_with_possibilities_df.loc[
    game_records_with_possibilities_df["possible_game"], "game_id"
].sum()

# Print it
print(f"The sum of the possible games' IDs is '{game_id_sum}'")

The sum of the possible games' IDs is '2447'
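
The same filter could also be expressed directly on the `max_*` columns, without going through `determine_if_game_is_possible`. This is only a sketch on a made-up three-row frame (`toy_df` and its values are hypothetical, not the puzzle data), assuming the column names produced by `parse_game_record`.

```python
import pandas as pd

# Hypothetical stand-in for the max_* columns of the parsed DataFrame
toy_df = pd.DataFrame(
    {
        "game_id": [1, 2, 3],
        "max_red": [10, 3, 13],
        "max_green": [7, 5, 3],
        "max_blue": [4, 5, 3],
    }
)

target_cube_configuration = {"red": 12, "green": 13, "blue": 14}

# Start with every game marked possible, then rule out any game whose
# maximum for a color exceeds the number of cubes available in that color
possible_mask = pd.Series(True, index=toy_df.index)
for color, limit in target_cube_configuration.items():
    possible_mask &= toy_df[f"max_{color}"] <= limit

toy_id_sum = toy_df.loc[possible_mask, "game_id"].sum()
print(toy_id_sum)
# 3
```

In the toy data, game 3 is excluded because it shows 13 red cubes against a limit of 12, leaving games 1 and 2.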
