# Overview

This notebook contains my solutions for **<a href="https://adventofcode.com/2023" target="_blank">Advent of Code 2023</a>**.

A few notes...
- The source for this notebook lives in my GitHub repo, <a href="https://github.com/derailed-dash/Advent-of-Code/blob/master/src/AoC_2023/Dazbo's_Advent_of_Code_2023.ipynb" target="_blank">here</a>.
- You can run this Notebook wherever you like. For example, you could...
  - Run it locally, in your own Jupyter environment.
  - Run it in a cloud-based Jupyter environment, with no setup required on your part!  For example, <a href="https://colab.research.google.com/github/derailed-dash/Advent-of-Code/blob/master/src/AoC_2023/Dazbo's_Advent_of_Code_2023.ipynb" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Google Colab"/></a>
- **To run the notebook, execute the cells in the [Setup](#Setup) section, as described below. Then you can run the code for any given day.**
- Be mindful that the first time you run this notebook, you will need to **obtain your AoC session key** and store it, if you have not done so already. This allows the notebook to automatically retrieve your input data. (See the guidance in the **[Get Access to Your AoC Data](#Get-Access-to-Your-AoC-Data)** section for details.)
- Use the navigation menu on the left to jump to any particular day.
- All of my AoC solutions are documented in my <a href="https://aoc.just2good.co.uk/" target="_blank">AoC Python Walkthrough site</a>.

# Setup

You need to run all the cells in this section before running any particular day's solution.

## Packages and Imports

Here we use `pip` to install the packages used by my solutions in this event.

In [None]:
%pip install jupyterlab-lsp colorama python-dotenv ipykernel 

In [None]:
from __future__ import annotations
from dataclasses import asdict, dataclass, field
from enum import Enum, auto
from functools import cache, reduce
from itertools import permutations, combinations, count
from collections import Counter, deque, defaultdict
import heapq
import copy
import operator
import logging
import time
import os
import re
import ast
import unittest
import requests
import matplotlib.pyplot as plt
import numpy as np
import networkx as nx
import pandas as pd
from tqdm.notebook import tqdm
from dotenv import load_dotenv
from pathlib import Path
from getpass import getpass
from colorama import Fore
from IPython.display import display
from IPython.core.display import Markdown

## Logging and Output

Set up a new logger that uses `ColouredFormatter`, such that we have coloured logging.  The log colour depends on the logging level.

In [None]:
##########################################################################
# SETUP LOGGING
#
# Create a new instance of "logger" in the client application
# Set to your preferred logging level
# And add the stream_handler from this module, if you want coloured output
##########################################################################

# logger for aoc_commons only
logger = logging.getLogger(__name__) # aoc_common.aoc_commons
logger.setLevel(logging.INFO)
stream_handler = None

class ColouredFormatter(logging.Formatter):
    """ Custom Formatter which adds colour to output, based on logging level """

    level_mapping = {"DEBUG": (Fore.BLUE, "DBG"),
                     "INFO": (Fore.GREEN, "INF"),
                     "WARNING": (Fore.YELLOW, "WRN"),
                     "ERROR": (Fore.RED, "ERR"),
                     "CRITICAL": (Fore.MAGENTA, "CRT")
    }

    def __init__(self, *args, apply_colour=True, shorten_lvl=True, **kwargs) -> None:
        """ Args:
            apply_colour (bool, optional): Apply colouring to messages. Defaults to True.
            shorten_lvl (bool, optional): Shorten level names to 3 chars. Defaults to True.
        """
        super().__init__(*args, **kwargs)
        self._apply_colour = apply_colour
        self._shorten_lvl = shorten_lvl

    def format(self, record):
        if record.levelname in ColouredFormatter.level_mapping:
            new_rec = copy.copy(record)
            colour, new_level = ColouredFormatter.level_mapping[record.levelname]

            if self._shorten_lvl:
                new_rec.levelname = new_level

            if self._apply_colour:
                msg = colour + super().format(new_rec) + Fore.RESET
            else:
                msg = super().format(new_rec)

            return msg

        # If our logging message is not using one of these levels...
        return super().format(record)

if not stream_handler:
    stream_handler = logging.StreamHandler()
    stream_fmt = ColouredFormatter(fmt='%(asctime)s.%(msecs)03d:%(name)s - %(levelname)s: %(message)s',
                                   datefmt='%H:%M:%S')
    stream_handler.setFormatter(stream_fmt)
    
if not logger.handlers:
    # Add our ColouredFormatter as the default console logging
    logger.addHandler(stream_handler)

def retrieve_console_logger(script_name):
    """ Create and return a new logger, named after the script
    So, in your calling code, add a line like this:
    logger = ac.retrieve_console_logger(locations.script_name)
    """
    a_logger = logging.getLogger(script_name)
    a_logger.addHandler(stream_handler)
    a_logger.propagate = False
    return a_logger

def setup_file_logging(a_logger: logging.Logger, folder: str|Path=""):
    """ Add a FileHandler to the specified logger. File name is based on the logger name.
    In calling code, we can add a line like this:
    td.setup_file_logging(logger, locations.output_dir)

    Args:
        a_logger (Logger): The existing logger
        folder (str): Where the log file will be created. Will be created if it doesn't exist
    """
    Path(folder).mkdir(parents=True, exist_ok=True)     # Create directory if it does not exist
    file_handler = logging.FileHandler(Path(folder, a_logger.name + ".log"), mode='w')
    file_fmt = logging.Formatter(fmt="%(asctime)s.%(msecs)03d:%(name)s:%(levelname)8s: %(message)s",
                                datefmt='%H:%M:%S')
    file_handler.setFormatter(file_fmt)
    a_logger.addHandler(file_handler)

In [None]:
def top_and_tail(data, block_size=5, include_line_numbers=True, zero_indexed=False):
    """ Return a summary of a large amount of data, as a str.
    If data is not a list, it is returned unchanged.

    Args:
        data (list): The data to present in summary form.
        block_size (int, optional): How many rows to include in the top, and in the tail. Defaults to 5.
        include_line_numbers (bool, optional): Prefix with line number. Defaults to True.
        zero_indexed (bool, optional): Lines start at 0? Defaults to False.
    """
    if isinstance(data, list):
        # Get the number of digits of the last item for proper alignment
        num_digits_last_item = len(str(len(data)))

        # Format the string with line number
        def format_with_line_number(idx, line):
            start = 0 if zero_indexed else 1
            if include_line_numbers:
                return f"{idx + start:>{num_digits_last_item}}: {line}"
            else:
                return line

        if len(data) <= 2 * block_size: # short enough to show in full
            return "\n".join(format_with_line_number(i, line) for i, line in enumerate(data))
        else:
            top = [format_with_line_number(i, line) for i, line in enumerate(data[:block_size])]
            tail = [format_with_line_number(i, line) for i, line in enumerate(data[-block_size:], start=len(data)-block_size)]
            return "\n".join(top + ["..."] + tail)
    else:
        return data

## Get Access to Your AoC Data

Now provide your unique AoC session key, in order to download your input data. You can get this by:
1. Logging into [Advent of Code](https://adventofcode.com/).
1. From your browser, open Developer Tools. (In Chrome, you can do this by pressing F12.)
1. Open the `Application` tab.
1. Storage -> Cookies -> https://adventofcode.com
1. Copy the value associated with the cookie called `session`.
1. Once you've determined your session key, I recommend you store it in a file called `.env`, in your `Advent-of-Code` folder, like this: \
`AOC_SESSION_COOKIE=536...your-own-session-key...658` \
This notebook will try to retrieve the key from that location.  If it is unable to retrieve the key, it will prompt you to enter your key in the cell below.

![Finding the session cookie](https://aoc.just2good.co.uk/assets/images/aoc-cookie.png)



In [None]:
def get_envs_from_file() -> bool:
    """ Look for .env files, read variables from it, and store as environment variables """
    potential_path = ".env"
    for _ in range(3):
        logger.debug("Trying .env at %s", os.path.realpath(potential_path))
        if os.path.exists(potential_path):
            logger.info("Using .env at %s", os.path.realpath(potential_path))
            load_dotenv(potential_path, verbose=True)
            return True
        
        potential_path = os.path.join('..', potential_path)
   
    logger.warning("No .env file found.")
    return False

get_envs_from_file() # read env variables from a .env file, if we can find one

In [None]:
if os.getenv('AOC_SESSION_COOKIE'):
    logger.info('Session cookie retrieved: %s...%s', os.environ['AOC_SESSION_COOKIE'][0:6], os.environ['AOC_SESSION_COOKIE'][-6:])
else: # it's not in our environment variables, so we'll need to input the value
    os.environ['AOC_SESSION_COOKIE'] = getpass('Enter AoC session key: ')

## Load Helpers and Useful Classes

Now we load a bunch of helper functions and classes.

### Locations

Where any input and output files get stored.

<img src="https://aoc.just2good.co.uk/assets/images/notebook-content-screenshot.png" width="320" />


In [None]:
#################################################################
# Paths and Locations
#################################################################

@dataclass
class Locations:
    """ Dataclass for storing various location properties """
    script_name: str
    script_dir: Path
    input_dir: Path
    output_dir: Path
    input_file: Path

def get_locations(script_name, folder="") -> Locations:
    """ Set various paths, based on the current working directory. """
    script_dir = Path(Path().resolve(), folder, script_name)
    input_dir = Path(script_dir, "input")
    output_dir = Path(script_dir, "output")
    input_file = Path(input_dir, "input.txt")

    return Locations(script_name, script_dir,
                     input_dir,
                     output_dir,
                     input_file)

### Retrieve the Input Data

This works by using your unique session cookie to retrieve your input data. E.g. from a URL like:

`https://adventofcode.com/2015/day/1/input`

In [None]:
##################################################################
# Retrieving input data
##################################################################

def write_puzzle_input_file(year: int, day, locations: Locations):
    """ Use session key to obtain user's unique data for this year and day.
    Only retrieve if the input file does not already exist.
    Return the input file name if it already exists; otherwise return the retrieved data.
    Requires env: AOC_SESSION_COOKIE, which can be set from the .env.
    """
    if os.path.exists(locations.input_file):
        logger.debug("%s already exists", os.path.basename(locations.input_file))
        return os.path.basename(locations.input_file)

    session_cookie = os.getenv('AOC_SESSION_COOKIE')
    if not session_cookie:
        raise ValueError("Could not retrieve session cookie.")

    logger.info('Session cookie retrieved: %s...%s', session_cookie[0:6], session_cookie[-6:])

    # Create input folder, if it doesn't exist
    if not locations.input_dir.exists():
        locations.input_dir.mkdir(parents=True, exist_ok=True)

    url = f"https://adventofcode.com/{year}/day/{day}/input"
    
    # Don't think we need to set a user-agent
    # headers = {
    #     "User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36'
    # }
    cookies = { 
        "session": session_cookie
    }
    response = requests.get(url, cookies=cookies, timeout=5)

    data = ""
    if response.status_code == 200:
        data = response.text

        with open(locations.input_file, 'w') as file:
            logger.debug("Writing input file %s", os.path.basename(locations.input_file))
            file.write(data)
            return data
    else:
        raise ValueError(f"Unable to retrieve input data.\n" +
                         f"HTTP response: {response.status_code}\n" +
                         f"{response.reason}: {response.content.decode('utf-8').strip()}")


### Testing

A really simple function for testing that our solution produces the expected test output.

In [None]:
def validate(test, answer):
    """
    Args:
        test: the answer given by our solution
        answer: the expected answer, e.g. from instructions
    """
    if test != answer:
        raise AssertionError(f"{test} != {answer}")

### Useful Helper Classes

In [None]:
#################################################################
# POINTS, VECTORS AND GRIDS
#################################################################

@dataclass(frozen=True)
class Point:
    """ Class for storing a point x,y coordinate """
    x: int
    y: int

    def __add__(self, other: Point):
        return Point(self.x + other.x, self.y + other.y)

    def __mul__(self, other: Point):
        """ (x, y) * (a, b) = (xa, yb) """
        return Point(self.x * other.x, self.y * other.y)

    def __sub__(self, other: Point):
        return self + Point(-other.x, -other.y)

    def __lt__(self, other):
        # Arbitrary comparison logic
        return (self.x, self.y) < (other.x, other.y)
    
    def yield_neighbours(self, include_diagonals=True, include_self=False):
        """ Generator to yield neighbouring Points """

        deltas: list
        if not include_diagonals:
            deltas = [vector.value for vector in Vectors if abs(vector.value[0]) != abs(vector.value[1])]
        else:
            deltas = [vector.value for vector in Vectors]

        if include_self:
            deltas.append((0, 0))

        for delta in deltas:
            yield Point(self.x + delta[0], self.y + delta[1])

    def neighbours(self, include_diagonals=True, include_self=False) -> list[Point]:
        """ Return all the neighbours, with specified constraints.
        It wraps the generator with a list. """
        return list(self.yield_neighbours(include_diagonals, include_self))

    def get_specific_neighbours(self, directions: list[Vectors]) -> list[Point]:
        """ Get neighbours, given a specific list of allowed locations """
        return [(self + Point(*vector.value)) for vector in list(directions)]

    @staticmethod
    def manhattan_distance(a_point: Point) -> int:
        """ Return the Manhattan distance value of this vector """
        return sum(abs(coord) for coord in asdict(a_point).values())

    def manhattan_distance_from(self, other: Point) -> int:
        """ Manhattan distance between this Vector and another Vector """
        diff = self-other
        return Point.manhattan_distance(diff)

    def __repr__(self):
        return f"P({self.x},{self.y})"

class Vectors(Enum):
    """ Enumeration of 8 directions.
    Note: y axis increments in the North direction, i.e. N = (0, 1) """
    N = (0, 1)
    NE = (1, 1)
    E = (1, 0)
    SE = (1, -1)
    S = (0, -1)
    SW = (-1, -1)
    W = (-1, 0)
    NW = (-1, 1)

    @property
    def y_inverted(self):
        """ Return vector, but with y-axis inverted. I.e. N = (0, -1) """
        x, y = self.value
        return (x, -y)

class VectorDicts():
    """ Contains constants for Vectors """
    ARROWS = {
        '^': Vectors.N.value,
        '>': Vectors.E.value,
        'v': Vectors.S.value,
        '<': Vectors.W.value
    }

    DIRS = {
        'U': Vectors.N.value,
        'R': Vectors.E.value,
        'D': Vectors.S.value,
        'L': Vectors.W.value
    }

    NINE_BOX: dict[str, tuple[int, int]] = {
        # x, y vector for adjacent locations
        'tr': (1, 1),
        'mr': (1, 0),
        'br': (1, -1),
        'bm': (0, -1),
        'bl': (-1, -1),
        'ml': (-1, 0),
        'tl': (-1, 1),
        'tm': (0, 1)
    }

class Grid():
    """ 2D grid of point values. """
    def __init__(self, grid_array: list) -> None:
        self._array = grid_array
        self._width = len(self._array[0])
        self._height = len(self._array)

    def value_at_point(self, point: Point):
        """ The value at this point """
        return self._array[point.y][point.x]

    def set_value_at_point(self, point: Point, value):
        self._array[point.y][point.x] = value

    def valid_location(self, point: Point) -> bool:
        """ Check if a location is within the grid """
        if (0 <= point.x < self._width and  0 <= point.y < self._height):
            return True

        return False

    @property
    def width(self):
        """ Array width (cols) """
        return self._width

    @property
    def height(self):
        """ Array height (rows) """
        return self._height

    def all_points(self) -> list[Point]:
        points = [Point(x, y) for x in range(self.width) for y in range(self.height)]
        return points

    def rows_as_str(self):
        """ Return the grid """
        return ["".join(str(char) for char in row) for row in self._array]

    def cols_as_str(self):
        """ Render columns as str. Returns: list of str """
        cols_list = list(zip(*self._array))
        return ["".join(str(char) for char in col) for col in cols_list]

    def __repr__(self) -> str:
        return f"Grid(size={self.width}*{self.height})"

    def __str__(self) -> str:
        return "\n".join("".join(map(str, row)) for row in self._array)

### Useful Helper Functions

In [None]:
#################################################################
# CONSOLE STUFF
#################################################################

def cls():
    """ Clear console """
    os.system('cls' if os.name=='nt' else 'clear')

#################################################################
# USEFUL FUNCTIONS
#################################################################

def binary_search(target, low:int, high:int, func, *func_args, reverse_search=False):
    """ Generic binary search function that takes a target to find,
    low and high values to start with, and a function to run, plus its args.
    Implicitly returns None if the search is exceeded. """

    res = None  # just set it to something that isn't the target
    candidate = 0  # initialise; we'll set it to the mid point in a second

    while low < high:  # keep going until the search range is exhausted
        candidate = (low + high) // 2  # pick mid-point of our low and high
        res = func(candidate, *func_args) # run our function, whatever it is
        logger.debug("%d -> %d", candidate, res)
        if res == target:
            return candidate  # solution found

        comp = operator.lt if not reverse_search else operator.gt
        if comp(res, target):
            low = candidate + 1  # candidate already checked; avoids looping forever when high == low + 1
        else:
            high = candidate

def merge_intervals(intervals: list[list]) -> list[list]:
    """ Takes intervals in the form [[a, b], [c, d], [e, f], ...]
    Intervals can overlap.  Compresses to minimum number of non-overlapping intervals.
    Note that the supplied list is sorted in place. """
    intervals.sort()
    stack = []
    stack.append(intervals[0])

    for interval in intervals[1:]:
        # Check for overlapping interval
        if stack[-1][0] <= interval[0] <= stack[-1][-1]:
            stack[-1][-1] = max(stack[-1][-1], interval[-1])
        else:
            stack.append(interval)

    return stack

@cache
def get_factors(num: int) -> set[int]:
    """ Gets the factors for a given number. Returns a set[int] of factors.
        # E.g. when num=8, factors will be 1, 2, 4, 8 """
    factors = set()

    # Iterate from 1 to sqrt of num,
    # since a larger factor of num must be a multiple of a smaller factor already checked
    for i in range(1, int(num**0.5) + 1):  # e.g. with num=8, this is range(1, 3)
        if num % i == 0: # if it is a factor, then dividing num by it will yield no remainder
            factors.add(i)  # e.g. 1, 2
            factors.add(num//i)  # i.e. 8//1 = 8, 8//2 = 4

    return factors

def to_base_n(number: int, base: int):
    """ Convert any integer number into a base-n string representation of that number.
    E.g. to_base_n(38, 5) = 123

    Args:
        number (int): The number to convert
        base (int): The base to apply

    Returns:
        [str]: The string representation of the number
    """
    ret_str = ""
    curr_num = number
    while curr_num:
        ret_str = str(curr_num % base) + ret_str
        curr_num //= base

    return ret_str if number > 0 else "0"


### Generic Initialisation


In [None]:
FOLDER = "aoc"
YEAR = 2023
logger_identifier = "aoc" + str(YEAR)
logger = retrieve_console_logger(logger_identifier)
logger.setLevel(logging.DEBUG)

# Days

Here you'll find a template to build a solution for a given day, and then the solutions for all days in this event.

To copy the template day, select all the cells in the `Day n` template, add a new cell at the end, and then paste the cells there.

## Day 1: Trebuchet?!

In [None]:
DAY = 1
day_link = f"#### See [Day {DAY}](https://adventofcode.com/{YEAR}/day/{DAY})."
display(Markdown(day_link))

In [None]:
d_name = "d" + str(DAY).zfill(2) # e.g. d01
script_name = "aoc" + str(YEAR) + d_name # e.g. aoc2017d01
locations = get_locations(d_name)

# SETUP LOGGING
logger.setLevel(logging.INFO)
# setup_file_logging(logger, locations.output_dir)

# Retrieve input and store in local file
try:
    write_puzzle_input_file(YEAR, DAY, locations)
    with open(locations.input_file, mode="rt") as f:
        input_data = f.read().splitlines()

    logger.info("Input data:\n%s", top_and_tail(input_data))
except ValueError as e:
    logger.error(e)

### Day 1 Part 1

And we're off!!  Welcome to the first day of Advent of Code 2023!!

Today was a troublesome start for me.  My Internet was out.  (Thanks, Virgin Media.) So, after unsuccessful restarts of the router and home network, I switched over to mobile hotspot.

Part 1 is pretty trivial, as we've come to expect. You need to identify the first and last digits in each line of the input. Concatenating these two digits gives you a two-digit number, which the puzzle calls a _calibration value_. Then we just add them all together.

**My Solution**

- For each line, I simply loop through each char in the line, and use the `isdigit()` method to determine if it is a digit.
- Then repeat, but this time, looping from the end using the Python construct `[::-1]` which just means: start from the end, and then step with increments of `-1`. I.e. move backwards.
- Finally, concatenate the two digits (still as strings), to create a two-digit number. Then convert it to an int.
- Store all these ints in a list.  And at the end, return the sum of the list.
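
To illustrate the reversed-slice trick described above, here is a minimal sketch. The helper names `first_digit` and `calibration_value` are hypothetical, used only for this illustration, and it assumes every line contains at least one digit.

```python
def first_digit(text: str) -> str:
    """ Return the first digit character in text, or "" if there is none. """
    for char in text:
        if char.isdigit():
            return char
    return ""

def calibration_value(line: str) -> int:
    """ Concatenate the first and last digits of a line (assumes at least one digit). """
    # line[::-1] is the reversed line, so its first digit is the original line's last digit
    return int(first_digit(line) + first_digit(line[::-1]))

print(calibration_value("pqr3stu8vwx"))       # 38
print(calibration_value("treb7uchetabcdef"))  # 77
```

Note that a line with a single digit, like `treb7uchetabcdef`, uses that digit as both the first and the last.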

In [None]:
num_words = {"one": 1,
             "two": 2,
             "three": 3,
             "four": 4,
             "five": 5,
             "six": 6,
             "seven": 7,
             "eight": 8,
             "nine": 9
             }

In [None]:
def solve(data, with_spelled_nums=False):
    calibration_vals = []
    for line in data:
        logger.debug(line)
        
        first_posn = 1e6        
        last_posn = -1
        first = last = ""

        for posn, char in enumerate(line): # read from start
            if char.isdigit():
                first_posn = posn
                first = char
                break
            
        for posn, char in enumerate(line[::-1]): # read from the end
            if char.isdigit():
                last_posn = len(line) - posn - 1 # remember, we're now counting from the end!!
                last = char
                break

        if with_spelled_nums:
            for num_word in num_words:
                posn = line.find(num_word)
                if 0 <= posn < first_posn:
                    first_posn = posn
                    first = str(num_words[num_word]) # map the word to its digit, as a str
            
                posn = line.rfind(num_word)
                if posn > last_posn:
                    last_posn = posn
                    last = str(num_words[num_word]) # map the word to its digit, as a str
        
        calibration_vals.append(int(first + last))
    
    return sum(calibration_vals)  
        

In [None]:
%%time
sample_inputs = [["1abc2", "pqr3stu8vwx", "a1b2c3d4e5f", "treb7uchetabcdef"]]
sample_answers = [142]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    validate(solve(curr_input), curr_ans) # test with sample data

soln = solve(input_data)
logger.info(f"Part 1 soln={soln}")

### Day 1 Part 2

For Day 1, this wasn't quite as trivial as I was expecting! Now we have to also find the positions of any "spelled" versions of the digits 1-9.

**My solution:**

- Create a `dict` to store the spelled versions of 1-9, and map them to their respective int values.
- Now, with each line, run the same code as we did for Part 1 to find the first and last positions of the digit representation.
  - But this time, store the positions found, as well as the values. I use [`enumerate()`](https://aoc.just2good.co.uk/python/enumerate) to give me the current position of each char in my line.
  - Be really careful when storing the position when counting from the end.  This tripped me up for a couple of minutes!!  When we're looping through chars from the end, backwards, we want to store the position in the string, not the current enumeration value. 
- Then, run another loop that looks for each spelled number in our dict of spelled numbers.
  - Search for our current spelled number in our line from the start, using the `find()` method.
  - Search for our current spelled number in our line from the end, using the `rfind()` method.
  - Whenever we find a spelled number, check whether we found it at a position that is earlier / later (as required) than the digit we found before.
  - Whenever I find such a spelled number, I convert the int value in the dict to a string, so that I can concatenate the string values, just as we did before.
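
Here is a minimal sketch of how `find()` and `rfind()` behave, using one of the sample lines. The variable names are just for illustration.

```python
line = "eightwothree"

# find() scans from the start and returns the index of the first occurrence (-1 if absent)
first_two = line.find("two")      # 4
missing = line.find("nine")       # -1

# rfind() returns the index where the last occurrence starts
last_three = line.rfind("three")  # 7
repeated = "oneone".rfind("one")  # 3

print(first_two, missing, last_three, repeated)
```

Because each word is searched for independently in the original line, overlapping words such as `eight` and `two` in `eightwo` are both found. The `-1` result for a missing word is why the position check needs a lower bound of 0 when looking for the earliest match.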

In [None]:
%%time
sample_inputs = [["two1nine", 
                  "eightwothree", 
                  "abcone2threexyz", 
                  "xtwone3four", 
                  "4nineeightseven2", 
                  "zoneight234", 
                  "7pqrstsixteen"]]
sample_answers = [281]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    validate(solve(curr_input, with_spelled_nums=True), curr_ans) # test with sample data

soln = solve(input_data, with_spelled_nums=True)
logger.info(f"Part 2 soln={soln}")

## Day 2: Cube Conundrum

In [None]:
DAY = "2" # replace with actual number (without leading zero)
logger.setLevel(logging.DEBUG)
day_link = f"#### See [Day {DAY}](https://adventofcode.com/{YEAR}/day/{DAY})."
display(Markdown(day_link))

In [None]:
d_name = "d" + str(DAY).zfill(2) # e.g. d01
script_name = "aoc" + str(YEAR) + d_name # e.g. aoc2017d01
locations = get_locations(d_name)

# SETUP LOGGING
logger.setLevel(logging.DEBUG)
# setup_file_logging(logger, locations.output_dir)

# Retrieve input and store in local file
try:
    write_puzzle_input_file(YEAR, DAY, locations)
    with open(locations.input_file, mode="rt") as f:
        input_data = f.read().splitlines()

    logger.info("Input data:\n%s", top_and_tail(input_data))
except ValueError as e:
    logger.error(e)

### Day 2 Part 1

In each game, we have a bag containing some number of red, green and blue cubes.  The bag is sampled several times per game. Our input data shows these random samples for each game. E.g.

```
Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green
Game 2: 1 blue, 2 green; 3 green, 4 blue, 1 red; 1 green, 1 blue
Game 3: 8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red
Game 4: 1 green, 3 red, 6 blue; 3 green, 6 red; 3 green, 15 blue, 14 red
Game 5: 6 red, 1 blue, 3 green; 2 blue, 1 red, 2 green
```

**Determine which games would have been possible if the bag had been loaded with only 12 red cubes, 13 green cubes, and 14 blue cubes. What is the sum of the IDs of those games?**

**My solution:**

- I create a CubeSample class to store each sample, i.e. the number of r, g, b cubes.
- I create a Game class to store the game ID and all the samples for that game.
- I parse the input with regex. My approach was:
  - Split the game line into the game part, and the samples part. Retrieving the game ID is trivial.
  - For the samples, use a regex that looks for "n colour", and use a regex finditer() to find all matches for this.
  - Create a dict that sets the initial values for r, g, b to 0.
  - Then iterate over the matches from finditer(), and update the r, g, b as required.
- Now I simply loop through each game. 
  - For each game, I loop through the samples. If any sample has more r, g, b than we're allowed, then this game is impossible.
  - Build up a list of the games that are possible. Then sum up the IDs with a comprehension.
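
As a rough sketch of the parsing step described above, this is how `finditer()` picks out each "n colour" pair from a single sample. The `sample` string is taken from Game 1 of the example, and the variable names are illustrative only.

```python
import re

cubes_pattern = re.compile(r"(\d+)\s*(\w+)")  # a count, optional whitespace, then a colour

sample = " 1 red, 2 green, 6 blue"
counts = {"red": 0, "green": 0, "blue": 0}  # colours missing from a sample stay at 0

for match in cubes_pattern.finditer(sample):
    count, colour = match.groups()  # e.g. ("1", "red")
    counts[colour] = int(count)

print(counts)  # {'red': 1, 'green': 2, 'blue': 6}
```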

In [None]:
@dataclass
class CubeSample:
    """ A sample contains a number of red, blue, and green cubes """
    red: int=0
    blue: int=0
    green: int=0

@dataclass
class Game:
    """ A game has an ID, and a random number of samples """
    id: int
    samples: list[CubeSample]

def parse_input(data) -> list[Game]:
    game_pattern = re.compile(r"Game\s+(\d+)")
    cubes_pattern = re.compile(r"(\d+)\s*(\w+)") # E.g. "3 blue" 
    
    games = []
    for line in data:
        game_part, samples_part = line.split(":")
        game_id = int(game_pattern.findall(game_part)[0])
        samples = samples_part.split(";")
        
        cube_samples = []
        for sample in samples:
            matches = cubes_pattern.finditer(sample)
            cube_counts = {"red": 0, "green": 0, "blue": 0} # reset cube counts for each sample
            for match in matches:
                cube_count, cube_colour = match.groups()
                cube_counts[cube_colour] = int(cube_count)
            
            cube_samples.append(CubeSample(cube_counts["red"], cube_counts["blue"], cube_counts["green"]))
        
        games.append(Game(game_id, cube_samples))
        
    return games
      
def solve_part1(games: list[Game]):
    """ Return the sum of the IDs for games that are possible. """
    
    allowed_red = 12
    allowed_green = 13
    allowed_blue = 14
    
    possible_games = []
    for game in games:
        possible = True
        for game_sample in game.samples:
            if (game_sample.red > allowed_red
                    or game_sample.green > allowed_green
                    or game_sample.blue > allowed_blue):
                possible = False
            
        if possible:
            possible_games.append(game)
            
    return sum(game.id for game in possible_games) 
            

In [None]:
%%time
sample_inputs = [["Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green",
                  "Game 2: 1 blue, 2 green; 3 green, 4 blue, 1 red; 1 green, 1 blue",
                  "Game 3: 8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red",
                  "Game 4: 1 green, 3 red, 6 blue; 3 green, 6 red; 3 green, 15 blue, 14 red",
                  "Game 5: 6 red, 1 blue, 3 green; 2 blue, 1 red, 2 green"]
                ]
sample_answers = [8]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    sample_games = parse_input(curr_input)
    validate(solve_part1(sample_games), curr_ans) # test with sample data

games = parse_input(input_data)
soln = solve_part1(games)
logger.info(f"Part 1 soln={soln}")

### Day 2 Part 2

**For each game, find the minimum set of cubes that must have been present. What is the sum of the power of these sets?**

Here, we need to look at all the samples for a given game, and determine the largest number of cubes shown of each colour, across the samples.

**My solution:**

Fortunately, since we already have our list of Games, this is now trivial to do. Simply iterate through the games, and for each game, iterate over all the samples. For each sample, check whether the count of red, green or blue is greater than the largest count we've seen so far for that colour.

Then, multiply the largest r, g and b counts together to get the `power` of the game. Finally, sum up all the powers.
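
As a quick worked example, here is the same idea applied to Game 1 from the sample data. This is only a sketch: it uses plain dictionaries rather than the `Game` and `CubeSample` classes, and the names `game_1_samples` and `min_cubes` are made up for illustration.

```python
from math import prod

# Game 1 from the sample data: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green
game_1_samples = [
    {"red": 4, "green": 0, "blue": 3},
    {"red": 1, "green": 2, "blue": 6},
    {"red": 0, "green": 2, "blue": 0},
]

# The minimum set is the largest count seen for each colour, across all samples
min_cubes = {colour: max(sample[colour] for sample in game_1_samples)
             for colour in ("red", "green", "blue")}

# The power is the product of those three counts
power = prod(min_cubes.values())
print(min_cubes, power)  # {'red': 4, 'green': 2, 'blue': 6} 48
```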

In [None]:
def solve_part2(games: list[Game]):
    """ Return the sum of the powers of all the games """
    game_powers = []
    for game in games:
        max_blue = max_green = max_red = 0
        for game_sample in game.samples:
            max_blue = max(max_blue, game_sample.blue)
            max_green = max(max_green, game_sample.green)
            max_red = max(max_red, game_sample.red)
     
        # We're told that power = product of r, g, b   
        game_powers.append(max_blue*max_green*max_red)
    
    return sum(game_powers)    
        

In [None]:
%%time
sample_inputs = [["Game 1: 3 blue, 4 red; 1 red, 2 green, 6 blue; 2 green",
                  "Game 2: 1 blue, 2 green; 3 green, 4 blue, 1 red; 1 green, 1 blue",
                  "Game 3: 8 green, 6 blue, 20 red; 5 blue, 4 red, 13 green; 5 green, 1 red",
                  "Game 4: 1 green, 3 red, 6 blue; 3 green, 6 red; 3 green, 15 blue, 14 red",
                  "Game 5: 6 red, 1 blue, 3 green; 2 blue, 1 red, 2 green"]
                ]
sample_answers = [2286]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    sample_games = parse_input(curr_input)
    validate(solve_part2(sample_games), curr_ans) # test with sample data

soln = solve_part2(games)
logger.info(f"Part 2 soln={soln}")

---
## Day 3: Gear Ratios

In [None]:
DAY = "3" # replace with actual number (without leading zero)
logger.setLevel(logging.DEBUG)
day_link = f"#### See [Day {DAY}](https://adventofcode.com/{YEAR}/day/{DAY})."
display(Markdown(day_link))

In [None]:
d_name = "d" + str(DAY).zfill(2) # e.g. d01
script_name = "aoc" + str(YEAR) + d_name # e.g. aoc2017d01
locations = get_locations(d_name)

# SETUP LOGGING
logger.setLevel(logging.DEBUG)
# td.setup_file_logging(logger, locations.output_dir)

# Retrieve input and store in local file
try:
    write_puzzle_input_file(YEAR, DAY, locations)
    with open(locations.input_file, mode="rt") as f:
        input_data = f.read().splitlines()

    logger.info("Input data:\n%s", top_and_tail(input_data))
except ValueError as e:
    logger.error(e)

### Day 3 Part 1

I'm finding AoC fairly tough this year. I wasn't expecting the early challenges to be this tricky.

Anyhoo...

We're given a 2D grid, called the _engine schematic_. That grid contains numbers, periods (which should be ignored), and symbols (anything else). We need to determine the _part numbers_, which we are told are any numbers adjacent to a symbol.

**What is the sum of all of the part numbers in the engine schematic?**

**My solution:**

- I get to reuse a couple of my helper classes.  Yay!  
  - I'm going to reuse my `Point` class, which stores x, y coordinates, but also has the ability to return all of its adjacent neighbours.
  - I'm going to reuse my `Grid` class, which already knows how to create a 2D grid, iterate through the points in the grid, get the values at any location, and determine if a specified point is in the grid.
- I create a new class called `EngineGrid` by extending `Grid`.
  - This class knows how to return all the points that are symbols.
- To solve:
  - First, get all the symbol locations. This is trivial.
  - Then, get all the neighbour locations for each symbol location.
  - If a neighbour is a valid location, check if it is a digit. If it is, then this location is in a part number.
  - Now, for each of these locations, use the method `get_part_number_contiguous_range()` to determine the full set of points that make up this part number. It works by taking this location on this line of the grid, and walking backwards and forwards, until the value found is no longer a digit. We return the full set of contiguous digits.
  - Store the contiguous locations in a `set`. I'm doing this, because more than one neighbour might be in the same range of points, and we don't want to ever double count the same range.
  - Finally, iterate over our set of location ranges, and obtain the actual numeric value stored in each range. These are our part numbers.
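
For comparison, the same answer can be reached without the `Point` and `Grid` helpers. The sketch below is an alternative technique, not the approach used in this notebook: it finds each number with a regex, and then checks the box of cells surrounding that number for a symbol. The function name `sum_part_numbers` is hypothetical.

```python
import re

def sum_part_numbers(schematic: list[str]) -> int:
    """ Sum every number that has at least one symbol in its surrounding box. """
    total = 0
    for y, line in enumerate(schematic):
        for match in re.finditer(r"\d+", line):
            # Rows above, on and below the number, clipped to the grid
            for row in schematic[max(0, y - 1): y + 2]:
                # Columns from one left of the number to one right of it
                box = row[max(0, match.start() - 1): match.end() + 1]
                if any(not ch.isdigit() and ch != "." for ch in box):
                    total += int(match.group())
                    break  # count each number only once
    return total

sample = ["467..114..",
          "...*......",
          "..35..633.",
          "......#...",
          "617*......",
          ".....+.58.",
          "..592.....",
          "......755.",
          "...$.*....",
          ".664.598.."]
print(sum_part_numbers(sample))  # 4361
```

This starts from the numbers rather than from the symbols, so there is no need for a `set` to avoid double counting. The trade-off is that it doesn't give us the part number ranges back for later reuse.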


In [None]:
class EngineGrid(Grid):
    def get_symbol_locations(self) -> list[Point]:
        """ Return all locations that contain a symbol """
        symbol_locations = [point for point in self.all_points() if self._is_symbol(point)]
        return symbol_locations
    
    def get_gear_locations(self) -> list[Point]:
        """ Return all locations that contain a gear, where a gear is represented by * """
        gear_locations = [point for point in self.all_points() if self._is_gear(point)]
        return gear_locations
    
    def _is_gear(self, point: Point) -> bool:
        """ A gear is represented by * """
        return str(self.value_at_point(point)) == "*"
    
    def _is_symbol(self, point: Point) -> bool:
        """ A symbol is anything that is neither a digit nor a period. """
        val = str(self.value_at_point(point))
        if val.isdigit():
            return False
        
        if val == ".":
            return False
        
        return True
    
    def get_part_number_contiguous_range(self, point: Point) -> tuple[Point, ...]:
        """ Given a point within a part number, we want to return the entire range of points that make up that part number. """
        line = self._array[point.y] # get the row this point is on
    
        # Find the start of the contiguous digits
        start = point.x
        while start > 0 and line[start - 1].isdigit():
            start -= 1

        # Find the end of the contiguous digits
        end = point.x
        while end < len(line) - 1 and line[end + 1].isdigit():
            end += 1

        # Return the contiguous locations that make up a part number
        contiguous_locations = [Point(x, point.y) for x in range(start, end+1)]
        return tuple(contiguous_locations)

    def get_part_number_for_range(self, part_range: tuple[Point, ...]) -> int:
        """ Given a set of points that make up a part number, return the part number they contain. """
        part_num = ""
        for point in part_range:
            part_num += self.value_at_point(point)
            
        return int(part_num)

In [None]:
def solve_part1(engine: EngineGrid) -> tuple[int, set[tuple]]:
    """ Return the sum of all part numbers, where a part number is the full set of contiguous digits
    adjacent to a symbol. Also return the part number ranges, so we can reuse later. """
    part_numbers_locations = []
    part_numbers = []
    
    # get the locations of symbols, e.g. * ?, but not .
    symbol_locations = engine.get_symbol_locations()
    for point in symbol_locations:
        # get adjacent locations
        for neighbour in point.neighbours():
            if engine.valid_location(neighbour): # check it is in the grid
                val = str(engine.value_at_point(neighbour))
                if val.isdigit():
                    part_numbers_locations.append(neighbour)
    
    # get the part number ranges that contain the locations we have found
    part_number_ranges = set() # so we don't double count ranges
    for point in part_numbers_locations:
        part_number_ranges.add(engine.get_part_number_contiguous_range(point))

    # Now get the numbers stored at these ranges
    for part_number_range in part_number_ranges:
        part_numbers.append(engine.get_part_number_for_range(part_number_range))
        
    return sum(part_numbers), part_number_ranges

In [None]:
%%time
sample_inputs = [["467..114..",
                  "...*......",
                  "..35..633.",
                  "......#...",
                  "617*......",
                  ".....+.58.",
                  "..592.....",
                  "......755.",
                  "...$.*....",
                  ".664.598.."]]
sample_answers = [4361]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    sample_engine = EngineGrid(curr_input)
    logger.debug(f"\n{sample_engine}")
    sample_part_num_sum, sample_part_num_ranges = solve_part1(sample_engine) 
    validate(sample_part_num_sum, curr_ans) # test with sample data

engine = EngineGrid(input_data)
part_num_sum, part_num_ranges = solve_part1(engine)
logger.info(f"Part 1 soln={part_num_sum}")

### Day 3 Part 2

Now we're told we need to find symbols that are gears, i.e. the symbols that are simply `*`. And we need to find all the gears that have exactly two adjacent part numbers. Where this is true, the product of the two part numbers is the _gear ratio_. Then we need to add up all the gear ratios.

**What is the sum of all of the gear ratios in your engine schematic?**

**My solution:**

- We already have our engine, and all of our part numbers.
- Let's find all the locations that are gears.
- For each gear, we want to find out if it is adjacent to two part numbers. We can do this by iterating over all of this gear location's neighbouring points.
- For each neighbour, determine if its location is within a part number location range. If it is, store the range in a set. Again, I'm doing this so that we don't double count the same range, e.g. if a gear has two neighbours that are in the same part number.
- Where we find exactly two neighbouring ranges, determine the part number values of these ranges, using our `get_part_number_for_range()` method, just like in Part 1.
- The gear ratio can then be obtained by multiplying these two part numbers together.
- Finally, add up all the gear ratios we have found.

It works, but it's a little slow. I might come back and improve this later.
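
One possible speed-up, sketched below under some assumptions: rather than scanning every part number range for every neighbour, build a dictionary once that maps each point to the range containing it. Each neighbour check then becomes a single dictionary lookup. To keep the sketch self-contained, points are plain `(x, y)` tuples rather than `Point` objects, and `index_ranges` and `ranges_adjacent_to` are hypothetical helper names. I haven't timed this against the real input.

```python
def index_ranges(part_num_ranges: set[tuple]) -> dict[tuple, tuple]:
    """ Map every point to the part number range that contains it. """
    return {point: part_range
            for part_range in part_num_ranges
            for point in part_range}

def ranges_adjacent_to(gear: tuple[int, int], range_for_point: dict[tuple, tuple]) -> set[tuple]:
    """ Return the distinct part number ranges touching the gear at (x, y). """
    x, y = gear
    neighbours = [(x + dx, y + dy)
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  if (dx, dy) != (0, 0)]
    # Points outside the grid are never in the dictionary, so no bounds check is needed
    return {range_for_point[n] for n in neighbours if n in range_for_point}

# Two part numbers from the sample grid: 467 on row 0 and 35 on row 2
ranges = {((0, 0), (1, 0), (2, 0)), ((2, 2), (3, 2))}
lookup = index_ranges(ranges)
found = ranges_adjacent_to((3, 1), lookup)  # the gear at x=3, y=1
print(len(found))  # 2
```

Because the result is a `set` of ranges, two neighbours that sit in the same part number still only count once.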

In [None]:
def solve_part2(engine: EngineGrid, part_num_ranges: set[tuple]) -> int:
    """ Return the sum of all gear ratios, where a gear is a * adjacent to exactly two part numbers. """
    gear_ratios = []
    
    gear_locations = engine.get_gear_locations()
    for point in gear_locations: # go through every gear
        # get adjacent locations and see if they fall in more than one part number range
        
        adjacent_part_nums = []
        found_ranges = set() # we don't want to double count the same range for the same gear
        
        for neighbour in point.neighbours(): # check if this neighbour is in a part num
            if not engine.valid_location(neighbour):
                continue
            
            for part_num_range in part_num_ranges:
                if part_num_range in found_ranges:
                    continue # move on to the next range
                
                if neighbour in part_num_range:
                    found_ranges.add(part_num_range)
                    break # a neighbour can only be in one range, so now we can move on to the next neighbour
        
            if len(found_ranges) > 2:
                break # we only want gears that are adjacent to EXACTLY TWO part numbers
            
        if len(found_ranges) == 2: # this gear is valid, so determine its ratio
            adjacent_part_nums = [engine.get_part_number_for_range(part_num_range) for part_num_range in found_ranges]
            gear_ratios.append(adjacent_part_nums[0]*adjacent_part_nums[1])
                    
    return sum(gear_ratios)    

In [None]:
%%time
sample_inputs = [["467..114..",
                  "...*......",
                  "..35..633.",
                  "......#...",
                  "617*......",
                  ".....+.58.",
                  "..592.....",
                  "......755.",
                  "...$.*....",
                  ".664.598.."]]
sample_answers = [467835]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    sample_engine = EngineGrid(curr_input)
    logger.debug(f"\n{sample_engine}")
    sample_part_num_sum, sample_part_num_ranges = solve_part1(sample_engine)
    sample_gear_ratio_sum = solve_part2(sample_engine, sample_part_num_ranges)
    validate(sample_gear_ratio_sum, curr_ans) # test with sample data

soln = solve_part2(engine, part_num_ranges)
logger.info(f"Part 2 soln={soln}")

---
## Day 4: title

In [None]:
DAY = "4" # replace with actual number (without leading zero)
logger.setLevel(logging.DEBUG)
day_link = f"#### See [Day {DAY}](https://adventofcode.com/{YEAR}/day/{DAY})."
display(Markdown(day_link))

In [None]:
d_name = "d" + str(DAY).zfill(2) # e.g. d01
script_name = "aoc" + str(YEAR) + d_name # e.g. aoc2017d01
locations = get_locations(d_name)

# SETUP LOGGING
logger.setLevel(logging.DEBUG)
# td.setup_file_logging(logger, locations.output_dir)

# Retrieve input and store in local file
try:
    write_puzzle_input_file(YEAR, DAY, locations)
    with open(locations.input_file, mode="rt") as f:
        input_data = f.read().splitlines()

    logger.info("Input data:\n%s", top_and_tail(input_data))
except ValueError as e:
    logger.error(e)

### Day 4 Part 1

Overview...

In [None]:
def solve_part1(data):
    pass

In [None]:
%%time
sample_inputs = ["abcdef"]
sample_answers = ["uvwxyz"]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    validate(solve_part1(curr_input), curr_ans) # test with sample data

soln = solve_part1(input_data)
logger.info(f"Part 1 soln={soln}")

### Day 4 Part 2

Overview...

In [None]:
def solve_part2(data):
    pass

In [None]:
%%time
sample_inputs = ["abcdef"]
sample_answers = ["uvwxyz"]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    validate(solve_part2(curr_input), curr_ans) # test with sample data

soln = solve_part2(input_data)
logger.info(f"Part 2 soln={soln}")

---
## Day n: title

In [None]:
DAY = "n" # replace with actual number (without leading zero)
logger.setLevel(logging.DEBUG)
day_link = f"#### See [Day {DAY}](https://adventofcode.com/{YEAR}/day/{DAY})."
display(Markdown(day_link))

In [None]:
d_name = "d" + str(DAY).zfill(2) # e.g. d01
script_name = "aoc" + str(YEAR) + d_name # e.g. aoc2017d01
locations = get_locations(d_name)

# SETUP LOGGING
logger.setLevel(logging.DEBUG)
# td.setup_file_logging(logger, locations.output_dir)

# Retrieve input and store in local file
try:
    write_puzzle_input_file(YEAR, DAY, locations)
    with open(locations.input_file, mode="rt") as f:
        input_data = f.read().splitlines()

    logger.info("Input data:\n%s", top_and_tail(input_data))
except ValueError as e:
    logger.error(e)

### Day n Part 1

Overview...

In [None]:
def solve_part1(data):
    pass

In [None]:
%%time
sample_inputs = ["abcdef"]
sample_answers = ["uvwxyz"]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    validate(solve_part1(curr_input), curr_ans) # test with sample data

soln = solve_part1(input_data)
logger.info(f"Part 1 soln={soln}")

### Day n Part 2

Overview...

In [None]:
def solve_part2(data):
    pass

In [None]:
%%time
sample_inputs = ["abcdef"]
sample_answers = ["uvwxyz"]

for curr_input, curr_ans in zip(sample_inputs, sample_answers):
    validate(solve_part2(curr_input), curr_ans) # test with sample data

soln = solve_part2(input_data)
logger.info(f"Part 2 soln={soln}")