# Advent of Code 2023

## Day 1 -- Trebuchet?! (Part 2)

## Author: Chris Kimber

The instructions for this problem can be found at https://adventofcode.com/2023/day/1.

The cells below are boilerplate for reading in the puzzle input and splitting it by line. The re module is imported so that a regex can be used to identify the first and last numbers in each line.

In [64]:
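# Use a context manager so the file handle is closed after reading.
with open("input", "r") as file:
    input_file = file.read()
# Drop the trailing newline so the split below does not produce an empty last line.
input_file = input_file.rstrip()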

In [65]:
input_list = input_file.split("\n")

In [32]:
import re

The goal is to identify numbers represented in one of two ways, as a digit or as a word. A list containing all options is initialized, as is a dictionary containing the correspondence between digit and word.

In [9]:
digits = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', '1', '2', '3', '4', '5', '6', '7', '8', '9']

In [3]:
digit_dict = {'one': '1', 'two': '2', 'three': '3', 'four': '4', 'five': '5', 'six': '6', 'seven': '7', 'eight': '8', 'nine': '9'}

This function checks whether an item in the list is in word or digit form by checking if it is a key in the dictionary. If it is, then it is in word form and the corresponding digit is returned. If it is not a key then it is already in digit form.

In [19]:
def digit_converter(digit):
    if digit in digit_dict:
        return digit_dict[digit]
    else:
        return digit
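
As a quick check of the behaviour described above, the sketch below calls the converter on one word and one digit. It restates the function with dict.get, which is assumed here to be an equivalent shorthand for the if/else in the notebook, not the author's original form.

```python
# Mapping copied from the notebook cell above.
digit_dict = {'one': '1', 'two': '2', 'three': '3', 'four': '4', 'five': '5',
              'six': '6', 'seven': '7', 'eight': '8', 'nine': '9'}

def digit_converter(digit):
    # dict.get returns the second argument when the key is absent,
    # so items already in digit form pass through unchanged.
    return digit_dict.get(digit, digit)

print(digit_converter('seven'))  # word form is converted to '7'
print(digit_converter('4'))      # digit form is returned as '4'
```

Both return values are strings, which matters for the next step, where the two digits are joined before being converted to an integer.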

This function applies the above function to the first and last items of a list of numbers in either format. It concatenates the two resulting digits and returns the two-digit number as an integer.

In [18]:
def line_formatter(line):
    match1 = digit_converter(line[0])
    match2 = digit_converter(line[-1])
    return int(match1 + match2)
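
A small worked example may help here, because the two digits are joined as strings and not added. The sketch below repeats the notebook's helpers so it runs on its own; the input lists are made up for illustration.

```python
digit_dict = {'one': '1', 'two': '2', 'three': '3', 'four': '4', 'five': '5',
              'six': '6', 'seven': '7', 'eight': '8', 'nine': '9'}

def digit_converter(digit):
    return digit_dict.get(digit, digit)

def line_formatter(line):
    match1 = digit_converter(line[0])
    match2 = digit_converter(line[-1])
    # '2' + '9' is the string '29', which int() turns into 29.
    return int(match1 + match2)

# Illustrative inputs: a list of matches as the regex step would produce them.
example = line_formatter(['two', '1', 'nine'])
# With a single match, the same item is both first and last.
single = line_formatter(['7'])

print(example)  # 29
print(single)   # 77
```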

This function uses a regex to find all matches to any of the items in the list of numbers (whether in digit or word form) within a line. The pattern is built by joining all the items in the list with the regex "or" operator ("|"). The pattern is wrapped in a lookahead (the "?=") to deal with overlapping matches, as the main "gotcha" in this problem is that two numbers in word format can overlap, e.g. "oneight". A regex without a lookahead consumes the characters it matches, so it will only find "one".

It then applies line_formatter to the list of matches to calculate the calibration value of the line: the two-digit number formed from the first and last values found by the regex.

In [79]:
def line_processor(line):
    matches = re.findall(r"(?=("+'|'.join(digits)+r"))", line)
    return line_formatter(matches)
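
To make the effect of the lookahead concrete, the sketch below runs the same alternation over "oneight" with and without it. The variable names (alternation, plain, overlapping) are chosen for this illustration and do not appear in the notebook.

```python
import re

digits = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine',
          '1', '2', '3', '4', '5', '6', '7', '8', '9']
alternation = '|'.join(digits)

# Plain alternation: matching 'one' consumes the shared 'e',
# so 'eight' can no longer be found.
plain = re.findall(alternation, "oneight")

# Lookahead with a capture group: each match is zero-width, so the
# search restarts at the very next character and overlaps are kept.
overlapping = re.findall(r"(?=(" + alternation + r"))", "oneight")

print(plain)        # ['one']
print(overlapping)  # ['one', 'eight']
```

Because the lookahead itself matches no characters, the capture group inside it is what re.findall returns for each position.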

The function above is applied across each line to calculate the calibration value of each line. The sum of all the calibration values is the answer to the problem!

In [80]:
calibration_values = [line_processor(x) for x in input_list]

In [82]:
sum_calibration_values = sum(calibration_values)

In [83]:
print(sum_calibration_values)

53268
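
As a sanity check that does not depend on the personal input file, the whole pipeline can be run on the Part 2 example from the puzzle statement, which gives a total of 281. The sketch below restates the notebook's functions so it is self-contained; sample_lines is a name introduced here for illustration.

```python
import re

digit_dict = {'one': '1', 'two': '2', 'three': '3', 'four': '4', 'five': '5',
              'six': '6', 'seven': '7', 'eight': '8', 'nine': '9'}
# Word forms first, then the digit characters they map to.
digits = list(digit_dict) + list(digit_dict.values())

def digit_converter(digit):
    return digit_dict.get(digit, digit)

def line_formatter(line):
    return int(digit_converter(line[0]) + digit_converter(line[-1]))

def line_processor(line):
    matches = re.findall(r"(?=(" + '|'.join(digits) + r"))", line)
    return line_formatter(matches)

# Example lines from the Part 2 puzzle description.
sample_lines = ["two1nine", "eightwothree", "abcone2threexyz", "xtwone3four",
                "4nineeightseven2", "zoneight234", "7pqrstsixteen"]

sample_values = [line_processor(x) for x in sample_lines]
print(sample_values)       # [29, 83, 13, 24, 42, 14, 76]
print(sum(sample_values))  # 281
```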


The section below was used as a test case to make sure the regex handles overlapping numbers in word format correctly.

In [70]:
test_string = "8eight9oneight"

In [73]:
pattern = '|'.join(digits)

In [78]:
re.findall(r"(?=(" + pattern + r"))", test_string)

['8', 'eight', '9', 'one', 'eight']