# --- Day 13 Distress Signal ---

https://adventofcode.com/2022/day/13

## Parsing options

This problem starts with input lines that are string representations of lists, each containing either integers or other lists.  
(So, potentially nested lists.)

Converting that string to an actual list object can be done in multiple ways:

1. Use Python's built-in `eval()` function,
2. Use Python's built-in `ast.literal_eval()` function,
3. Use Python's `json` package, because the string is actually in valid JSON format, or 
4. Write a custom parsing routine
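
As a quick sanity check, the three built-in options should produce the same nested list for a well-formed packet string. The sketch below uses one line from the example input; the variable names are only illustrative.

```python
import ast
import json

# One packet line from the example input.
line = "[1,[2,[3,[4,[5,6,7]]]],8,9]"

via_eval = eval(line)                      # runs the string as Python code
via_literal_eval = ast.literal_eval(line)  # accepts Python literals only
via_json = json.loads(line)                # accepts JSON only

# All three should agree on the parsed structure.
assert via_eval == via_literal_eval == via_json
print(via_json)
```

The approaches differ in what else they are willing to accept, which is what the sections below look at.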

## 1. Use Python's built-in `eval()` function

**NOTE**: This is typically frowned upon, because it's potentially risky if you don't know the source of the input data that is being evaluated.

See:  
https://www.reddit.com/r/adventofcode/comments/zkoc0o/2022_day_13_got_some_weird_input_today_hope_none/  
https://www.reddit.com/r/adventofcode/comments/zlb1qr/2022_day_13_python_users_giving_in_and_using_eval/  
https://www.reddit.com/r/adventofcode/comments/zkxxyt/2022_day_13_python_ftw/  

But it works quite swell:

In [1]:
with open("../inputs/test_packet_pairs.txt") as file:
    for line in file:
        if line != '\n': 
            print(eval(line))

[1, 1, 3, 1, 1]
[1, 1, 5, 1, 1]
[[1], [2, 3, 4]]
[[1], 4]
[9]
[[8, 7, 6]]
[[4, 4], 4, 4]
[[4, 4], 4, 4, 4]
[7, 7, 7, 7]
[7, 7, 7]
[]
[3]
[[[]]]
[[]]
[1, [2, [3, [4, [5, 6, 7]]]], 8, 9]
[1, [2, [3, [4, [5, 6, 0]]]], 8, 9]


A safer way to use the `eval()` function is to combine it with a regex check that makes sure the line  
contains only acceptable characters (brackets, digits, and commas), like so:

In [2]:
import re
acceptable_line = re.compile(r"^\[[\d\[\],]*\]$")  # `*` (not `+`) so that the empty list "[]" also matches

with open("../inputs/test_packet_pairs.txt") as file:
    for line in file:
        if re.match(acceptable_line, line.strip()): 
            print(eval(line))

[1, 1, 3, 1, 1]
[1, 1, 5, 1, 1]
[[1], [2, 3, 4]]
[[1], 4]
[9]
[[8, 7, 6]]
[[4, 4], 4, 4]
[[4, 4], 4, 4, 4]
[7, 7, 7, 7]
[7, 7, 7]
[]
[3]
[[[]]]
[[]]
[1, [2, [3, [4, [5, 6, 7]]]], 8, 9]
[1, [2, [3, [4, [5, 6, 0]]]], 8, 9]
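
The character check can also be wrapped in a small helper that refuses a line outright instead of silently skipping it. This is a minimal sketch, not part of the original notebook: `checked_eval` and `ACCEPTABLE_LINE` are hypothetical names, and the pattern assumes packets contain only brackets, ASCII digits, and commas.

```python
import re

# Only brackets, ASCII digits, and commas are allowed between the outer brackets.
# `*` allows the empty packet "[]".
ACCEPTABLE_LINE = re.compile(r"\[[0-9\[\],]*\]")

def checked_eval(line):
    """Evaluate a packet line only if it passes the character check (hypothetical helper)."""
    text = line.strip()
    # fullmatch() requires the whole string to match, so no anchors are needed.
    if ACCEPTABLE_LINE.fullmatch(text) is None:
        raise ValueError(f"unexpected characters in packet: {text!r}")
    return eval(text)

print(checked_eval("[[4,4],4,4]\n"))
print(checked_eval("[]"))

try:
    checked_eval("__import__('os').getcwd()")
except ValueError as err:
    print("rejected:", err)
```

Note that the check restricts characters, not structure, so a malformed line such as `[1,,2]` would pass the regex and then raise a `SyntaxError` inside `eval()`.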


## 2. Use Python's built-in `ast.literal_eval()` function

Python's `ast` module ("ast" means "abstract syntax tree") helps programs work with trees of Python's abstract syntax grammar.  
https://docs.python.org/3/library/ast.html

An abstract syntax tree is the tree data structure that Python builds when it parses source code,  
before that tree gets compiled to bytecode for the interpreter to run.

The **key** difference between `eval()` and `ast.literal_eval()` is that the latter only accepts Python literals  
(strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and `None`), so function calls and other arbitrary code are rejected rather than run.
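
A small illustration of that difference, using a harmless function call as a stand-in for untrusted input (the `suspicious` string is made up for this sketch):

```python
import ast

# A string that is valid Python but is a function call, not a literal.
suspicious = "len([1, 2, 3])"

# eval() runs the call and returns its result.
print(eval(suspicious))

try:
    ast.literal_eval(suspicious)
except ValueError as err:
    # literal_eval() refuses anything that is not a plain literal.
    print("literal_eval refused:", type(err).__name__)

# A plain nested list is still parsed as expected.
print(ast.literal_eval("[1,[2,[3]]]"))
```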

In [3]:
import ast

In [4]:
with open("../inputs/test_packet_pairs.txt") as file:
    for line in file:
        if line != '\n': 
            print(ast.literal_eval(line))

[1, 1, 3, 1, 1]
[1, 1, 5, 1, 1]
[[1], [2, 3, 4]]
[[1], 4]
[9]
[[8, 7, 6]]
[[4, 4], 4, 4]
[[4, 4], 4, 4, 4]
[7, 7, 7, 7]
[7, 7, 7]
[]
[3]
[[[]]]
[[]]
[1, [2, [3, [4, [5, 6, 7]]]], 8, 9]
[1, [2, [3, [4, [5, 6, 0]]]], 8, 9]


## 3. Use Python's `json` package

Hat tip to Nabeel for pointing out that the string is valid JSON!  
This works like a charm.

In [5]:
import json

In [6]:
with open("../inputs/test_packet_pairs.txt") as file:
    for line in file:
        if line != '\n': 
            print(json.loads(line))

[1, 1, 3, 1, 1]
[1, 1, 5, 1, 1]
[[1], [2, 3, 4]]
[[1], 4]
[9]
[[8, 7, 6]]
[[4, 4], 4, 4]
[[4, 4], 4, 4, 4]
[7, 7, 7, 7]
[7, 7, 7]
[]
[3]
[[[]]]
[[]]
[1, [2, [3, [4, [5, 6, 7]]]], 8, 9]
[1, [2, [3, [4, [5, 6, 0]]]], 8, 9]
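
Since the puzzle input groups packets into pairs separated by blank lines, the same `json.loads()` call can feed a pair reader. The following is a minimal sketch, assuming the whole input has already been read into one string; `parse_pairs` and the inline `sample` are illustrative and not part of the original notebook.

```python
import json

def parse_pairs(text):
    """Return a list of (left, right) packet pairs (hypothetical helper)."""
    pairs = []
    # Pairs are separated by a blank line; each pair is two lines.
    for block in text.strip().split("\n\n"):
        left, right = block.splitlines()
        pairs.append((json.loads(left), json.loads(right)))
    return pairs

# First two pairs of the example input.
sample = """[1,1,3,1,1]
[1,1,5,1,1]

[[1],[2,3,4]]
[[1],4]
"""

for left, right in parse_pairs(sample):
    print(left, "vs", right)
```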


## 4. Write a custom parsing function

By **FAR** the hardest route to take... ("No pain, no gain.")

Below is a slightly modified (and highly annotated) version of the code that Dan Ready wrote:  
https://github.com/dready10/AOC22/blob/master/day13_pyparser.py, which is based on his C implementation found here:  
https://github.com/dready10/AOC22/blob/master/day13.c

In [7]:
def dan_parse(_string, parent_list):
    """Parse a string containing a list whose items are integers or (potentially nested) lists.
    
    Recursive function that returns the number of characters "eaten" as the string is parsed.

    Parameters
    ----------
    _string : str
        The string to be parsed.
    parent_list : list
        The (initially empty) list object that gets populated in place as the function runs.

    Returns
    -------
    i : int
        The number of characters "eaten".
    """

    # i is the first counter moving over the _string
    # The first character in _string will always be '['.
    # Don't need it, so skip it.
    i = 1
    
    # digit_start is a second counter that typically lags i and marks the potential beginning of
    # an integer, which can be more than 1 character in length.
    digit_start = 1

    while i < len(_string):

        # Happens in all calls to dan_parse()
        if _string[i] == ',':
            # If previous character is a digit...
            if _string[i-1].isdigit():
                # ...then every character from digit_start up to (but not including) i, a comma here,
                # belongs to one integer (integers can be longer than 1 character), so we convert
                # that string slice to an int and append it to the parent_list
                parent_list.append(int(_string[digit_start:i]))
                
                # Also, advance the digit_start counter to be ahead of the current i counter.
                # (The i counter will be advanced at the bottom of the while loop,
                # so they'll start off in the same position again for the next iteration.)
                digit_start = i + 1

        # Recursive case.
        # We are at the beginning of a child_list, so create an empty child_list then parse it.
        # NB: The dan_parse() function alters the object passed as a reference, so need to create
        # it first, and then it will be populated after returning the characters eaten along the way.
        elif _string[i] == '[':
            child_list = []
            chars_eaten = dan_parse(_string[i:], child_list)
            parent_list.append(child_list)

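            # Advance counters appropriately.
            # After this, i points at the closing ']' of the child list.
            i += chars_eaten
            # The character after that ']' (at i + 1) is either ',' or another ']', so the next
            # integer can start no earlier than i + 2. The ',' branch above won't reset digit_start
            # in this case, because the character before that comma is ']', not a digit.
            digit_start = i + 2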

        # Base case.
        # We reached the end of either a child_list or the parent_list.
        elif _string[i] == ']':
            # If there's a digit before the last bracket...
            if _string[i-1].isdigit():
                # ...then every character from digit_start up to (but not including) i, a closing
                # bracket here, belongs to one integer (integers can be longer than 1 character),
                # so we convert that string slice to an int and append it to the parent_list
                parent_list.append(int(_string[digit_start:i]))

            # Return the number of characters eaten by this call to dan_parse()
            return i

        # Increment to next position in the string
        i += 1


In [8]:
with open("../inputs/test_packet_pairs.txt") as file:
    for line in file:
        if line != '\n':
            _list = []
            dan_parse(line.strip(), _list)
            print(_list)

[1, 1, 3, 1, 1]
[1, 1, 5, 1, 1]
[[1], [2, 3, 4]]
[[1], 4]
[9]
[[8, 7, 6]]
[[4, 4], 4, 4]
[[4, 4], 4, 4, 4]
[7, 7, 7, 7]
[7, 7, 7]
[]
[3]
[[[]]]
[[]]
[1, [2, [3, [4, [5, 6, 7]]]], 8, 9]
[1, [2, [3, [4, [5, 6, 0]]]], 8, 9]
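
For comparison, the same job can be done without recursion by keeping a stack of the lists that are currently open. This is not Dan's method, just an alternative sketch: `stack_parse` is a made-up name, and the code assumes well-formed input containing only brackets, commas, and non-negative integers.

```python
import json

def stack_parse(text):
    """Parse a bracketed list of non-negative integers without recursion (illustrative sketch)."""
    root = None
    stack = []    # lists that are currently open, innermost last
    digits = ""   # digit characters collected for the integer in progress

    for ch in text:
        if ch.isdigit():
            digits += ch
            continue
        # Any non-digit character ends the integer in progress.
        if digits:
            stack[-1].append(int(digits))
            digits = ""
        if ch == "[":
            new_list = []
            if stack:
                stack[-1].append(new_list)  # nested list: attach to its parent
            else:
                root = new_list             # outermost list
            stack.append(new_list)
        elif ch == "]":
            stack.pop()
        # Commas need no action of their own.
    return root

# Cross-check against json.loads() on a few packets.
for line in ["[]", "[[[]]]", "[10,[2,[3]],4]"]:
    assert stack_parse(line) == json.loads(line)
    print(stack_parse(line))
```

Because the integer in progress is flushed whenever a non-digit character arrives, this version needs no separate `digit_start` counter.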
