Oh damn it, I haven't had to write a parser for years. I thought all these things were done with deep learning now?

Anyhow, let's define a function `parse_group` which will build a parse tree starting at a group opener (`{`). Some observations for the moment:
* I'll also need a `remove_garbage` function.
* It's not clear what the commas are for: whether they're needed at all, or whether they could be arbitrary (non-special) characters.

In [1]:
def parse_group(input_str, currentDepth_i=1):
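    '''
    Build the parse tree from the start of the next group. Check
    that the first character of input_str is a '{'.
    Return a tuple (tree, rest): tree is a nested list whose first
    element is this group's depth, followed by one sub-list per
    contained group; rest is the unparsed remainder of input_str.
    Assumes each item in a group is followed by a single separator
    character (',') or by the closing '}'.
    '''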
    assert input_str[0]=='{'
    out_ls=[]
    
    while input_str[0] != '}':
        # Remove the first character of the string:
        input_str=input_str[1:]
        
        # if it's a '{', then parse the next group:
        if input_str[0]=='{':
            (parse_ls, input_str)=parse_group(input_str, currentDepth_i+1)
            out_ls.append(parse_ls)
        
        # if it's a '<', parse as garbage:
        if input_str[0]=='<':
            input_str=remove_garbage(input_str)
        
    # Finally, add the current depth to the beginning 
    # of the output list, and return it, and the current
    # (shortened) input string.
    out_ls[0:0]=[currentDepth_i]
    return (out_ls, input_str[1:])

So just for the `parse_group` function (that is, without the garbage recogniser, which is never called as long as the input contains no `<`), the following test cases should give the correct answers:

In [2]:
# Should have 1 group
parse_group('{}')

([1], '')

In [3]:
# Should have 3 groups
parse_group('{{{}}}')

([1, [2, [3]]], '')

In [4]:
# Should have 6 groups
parse_group('{{{},{},{{}}}}')

([1, [2, [3], [3], [3, [4]]]], '')
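The comments above say how many groups to expect, but the output is a nested list, so checking means counting nodes by eye. A small sketch of a counter might help here. `count_groups` is a name made up for this illustration (it isn't used anywhere else in the notebook), and the trees are copied by hand from the three outputs above:

```python
def count_groups(tree):
    """Count the nodes of a parse tree shaped like [depth, subtree, subtree, ...]."""
    # The node itself is one group; every element after the depth is a subtree.
    return 1 + sum(count_groups(subtree) for subtree in tree[1:])

# Trees copied from the outputs of the three test cells above.
tree_a = [1]
tree_b = [1, [2, [3]]]
tree_c = [1, [2, [3], [3], [3, [4]]]]

counts = [count_groups(t) for t in (tree_a, tree_b, tree_c)]
print(counts)
```

If the parser is right, the three counts should match the "Should have N groups" comments.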

That seems OK so far. Let's try adding the garbage recogniser now:

In [5]:
def remove_garbage(input_str):
    '''
    Remove garbage from the front of input_str, and return a string
    with the garbage removed.
    '''
    # Check that we have the garbage opening character:
    assert input_str[0]=='<'
    
    # Now ignore anything until the closing character:
    while input_str[0] != '>':
        # unless it's a '!', in which case ignore the following character
        if input_str[0]=='!':
            input_str=input_str[2:]
        # otherwise, just remove the next character:
        else:
            input_str=input_str[1:]
    
    # Finally, return the remaining input string (without the
    # closing bracket):
    return input_str[1:]

In [6]:
# Run the test cases:

assert remove_garbage('<>End of garbage')=='End of garbage'
assert remove_garbage('<sjafhkasjhfkdjksj982487qwl{@#$>End of garbage')=='End of garbage'
assert remove_garbage('<{!>}>End of garbage')=='End of garbage'
assert remove_garbage('<!!>End of garbage')=='End of garbage'
assert remove_garbage('<!!!>>End of garbage')=='End of garbage'
assert remove_garbage('<{o"i!a,<{i<a>End of garbage')=='End of garbage'
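As an aside, the same skipping rule can probably be written as a regular expression. This is only a sketch of an alternative, not what the rest of the notebook uses: `GARBAGE_RE` and `remove_garbage_re` are names invented for the illustration. The pattern assumes that `!` cancels exactly one following character and that the first uncancelled `>` closes the garbage.

```python
import re

# '<', then any number of either a cancelled pair ('!' plus one character)
# or a single character that is neither '>' nor '!', then the closing '>'.
# DOTALL lets '!' cancel a newline as well.
GARBAGE_RE = re.compile(r'<(?:!.|[^>!])*>', re.DOTALL)

def remove_garbage_re(input_str):
    """Strip one run of garbage from the front of input_str (regex sketch)."""
    match = GARBAGE_RE.match(input_str)
    if match is None:
        raise ValueError('input does not start with well-formed garbage')
    return input_str[match.end():]

print(remove_garbage_re('<{!>}>End of garbage'))
```

One difference in behaviour: on unterminated garbage this version raises a `ValueError`, whereas the loop version runs off the end of the string with an `IndexError`.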

Good. Now check the remaining test cases.

In [7]:
# Should have 1 group
parse_group('{<a>,<a>,<a>,<a>}')

([1], '')

In [8]:
# Should have 5 groups
parse_group('{{<a>},{<a>},{<a>},{<a>}}')

([1, [2], [2], [2], [2]], '')

In [9]:
# Should have 2 groups
parse_group('{{<!>},{<!>},{<!>},{<a>}}')

([1, [2]], '')

That all seems to be behaving properly. So let's do a small function to extract all the depths in the parse tree. Each node in the tree is a list whose first member is the group's depth, followed by zero or more subtrees (one per contained group):

In [10]:
def extract_groups(parseTree_ls):
    '''
    Extract the group values from the provided tree. Return as a list 
    ('cos it's easy to just sum after)
    '''
    # Use a local flatten function:
    flatten_list = lambda lists:[item for sublist in lists for item in sublist]
    
    if len(parseTree_ls)==1:
        return parseTree_ls
    else:
        return parseTree_ls[0:1]+flatten_list([extract_groups(pt) for pt in parseTree_ls[1:]])
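For comparison, the same walk can be sketched as a recursive generator, which avoids building and flattening intermediate lists. `iter_depths` is a hypothetical name used only here, and the example tree is copied by hand from the `{{{},{},{{}}}}` output earlier:

```python
def iter_depths(tree):
    """Yield the depth stored at this node, then the depths in each subtree."""
    depth, *subtrees = tree  # first element is the depth, the rest are subtrees
    yield depth
    for subtree in subtrees:
        yield from iter_depths(subtree)

# Parse tree copied from the output for '{{{},{},{{}}}}'.
tree = [1, [2, [3], [3], [3, [4]]]]
depths = list(iter_depths(tree))
print(depths, sum(depths))
```

The depths come out in the order the groups open, so summing them gives the same total as summing the flattened list.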

Do the test cases (the score of a stream is the sum of the depths of all its groups):

In [11]:
def test(input_str):
    return sum(extract_groups(parse_group(input_str)[0]))

assert test('{}')==1
assert test('{{{}}}')==6
assert test('{{},{}}')==5
assert test('{{{},{},{{}}}}')==16
assert test('{<a>,<a>,<a>,<a>}')==1
assert test('{{<ab>},{<ab>},{<ab>},{<ab>}}')==9
assert test('{{<!!>},{<!!>},{<!!>},{<!!>}}')==9
assert test('{{<a!>},{<a!>},{<a!>},{<ab>}}')==3
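Since only the sum of the depths is needed, the parse tree isn't strictly necessary. A single-pass sketch with a depth counter should give the same totals. `score_stream` is a name invented for this illustration. It assumes, like the parser above, that `!` only matters inside garbage, and it simply ignores commas and any other character outside garbage.

```python
def score_stream(stream):
    """Sum the depths of all groups in one pass, without building a tree."""
    total = 0
    depth = 0
    in_garbage = False
    skip_next = False  # True when the previous garbage character was '!'
    for ch in stream:
        if in_garbage:
            if skip_next:
                skip_next = False   # this character is cancelled
            elif ch == '!':
                skip_next = True
            elif ch == '>':
                in_garbage = False
        elif ch == '<':
            in_garbage = True
        elif ch == '{':
            depth += 1
            total += depth          # a group scores its own depth
        elif ch == '}':
            depth -= 1
    return total

print(score_stream('{{{},{},{{}}}}'))
```

This version uses no recursion or string slicing, so it shouldn't hit Python's recursion limit on deeply nested input.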

Finally, do my input:

In [12]:
with open('data/day9.txt') as fIn:
    myInput_str=fIn.read().strip()

test(myInput_str)

16827