## Advent of Code 2021, days 1-23
See https://adventofcode.com/

In [None]:
# note that this notebook requires the .venv environment with python >= 3.10
# to activate it from a git bash shell: source .venv/Scripts/activate
# to generate its requirements: pip freeze > .venv-requirements.txt
# to re-install from requirements: python -m venv .venv; source .venv/Scripts/activate; pip install -r .venv-requirements.txt
# (may need the full path to python, e.g. ~/AppData/Local/Programs/Python/Python310/python.exe)

import collections
import itertools
import re
import copy
import math
import sys
import time
#import json
import numpy as np
import cProfile

In [None]:
# utility functions

def get_line_groups(lines):
    '''Return a list of groups (lists of stripped lines); groups are separated by one or more empty lines, and empty lines at the start and end are ignored.'''
    lines=list(lines)
    lines.append('') # add terminator
    res=[]
    group=[]
    for line in lines:
        line=line.strip()
        if len(line)>0:
            group.append(line)
        elif len(group)>0: # close group
            res.append(group)
            group=[]
    return res
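
A minimal usage sketch for `get_line_groups`, assuming puzzle input in the usual blank-line-separated layout. The function is repeated in condensed form only so that the example runs on its own; `example_text` and `example_groups` are made-up names and data.

```python
def get_line_groups(lines):
    '''Same behaviour as the utility above, repeated so this sketch is self-contained.'''
    res = []
    group = []
    for line in list(lines) + ['']:  # the trailing '' closes the last group
        line = line.strip()
        if line:
            group.append(line)
        elif group:  # empty line after a non-empty group: close the group
            res.append(group)
            group = []
    return res

example_text = '''

1000
2000

3000

'''
example_groups = get_line_groups(example_text.splitlines())
print(example_groups)
```

Leading, trailing and repeated empty lines produce no empty groups, which is what makes the helper convenient for inputs pasted into triple-quoted strings.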

In [None]:
# 2021 day 23 BFS
# mv ~/Downloads/input data_src/2021-day-23-input.txt
# big input file looks like: small puzzle
# idea: parse to board, then BFS; the data structure is a map of state vector to lowest known cost.
# These state vectors can encode all 5**7 * 9**4 different positions; not all will be used, so it should (barely?) fit in memory.
# Estimated memory use, assuming 50% of boards reachable and 40 bytes per vector+cost:
# "%e" % (5**7 * 9**4 * 40 / 2) == 1.025156e+10, i.e. 10 GB of RAM, which sounds somewhat ambitious.
# If needed, a state could be encoded into a 7*3+4*4=37 bit number. NB in fact the solutions for the presented puzzles
# only require about 200k states?!

sample1='''
#############
#...........#
###B#C#B#D###
###A#D#C#A#
###########
''' # the extra # on the left border are a trick to prevent VS Code tab detection from making a fool of itself

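hall_geo={ # maps (ci, ri) to (checkmin, checkmax, hordist) where ci 0..6 is the hallway spot and ri 0..3 the room,
    # checkmin..checkmax is the inclusive range of hallway spots that must be empty (checkmax -1 means nothing to check),
    # hordist 1..8 is the horizontal distance between the hallway spot and the room entrance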
    (0, 0): (1,  1, 2),
    (1, 0): (0, -1, 1),
    (2, 0): (0, -1, 1),
    (3, 0): (2,  2, 3),
    (4, 0): (2,  3, 5),
    (5, 0): (2,  4, 7),
    (6, 0): (2,  5, 8),
    (0, 1): (1,  2, 4),
    (1, 1): (2,  2, 3),
    (2, 1): (0, -1, 1),
    (3, 1): (0, -1, 1),
    (4, 1): (3,  3, 3),
    (5, 1): (3,  4, 5),
    (6, 1): (3,  5, 6),
    (0, 2): (1,  3, 6),
    (1, 2): (2,  3, 5),
    (2, 2): (3,  3, 3),
    (3, 2): (0, -1, 1),
    (4, 2): (0, -1, 1),
    (5, 2): (4,  4, 3),
    (6, 2): (4,  5, 4),
    (0, 3): (1,  4, 8),
    (1, 3): (2,  4, 7),
    (2, 3): (3,  4, 5),
    (3, 3): (4,  4, 3),
    (4, 3): (0, -1, 1),
    (5, 3): (0, -1, 1),
    (6, 3): (5,  5, 2),
}

amph_energy={'A': 1, 'B': 10, 'C': 100, 'D': 1000}

# State vector: a compressed board into a single string: CCCCCCCNNNN, each C is either . or A-D for an amphipod (aka 'guy')
# in the hallway parking spots left to right, each N is the state of one of the side rooms:
# 0 - full as per starting board
# 1 - one top square emptied out
# 2 - two top squares emptied out, empty in part 1 smaller game
# 3 - 3 top squares emptied out
# 4 - 4 top squares emptied out, empty in the part 2 bigger game
# 5 - full as per ending board, final
# 6 - one top spot open, rest filled with final guys
# 7 - two top spots open, rest filled with final guys (in part 1 game this would be empty but code 2 is used, this one isn't)
# 8 - three top spots open, bottom one filled with final guy

def generate_bfs_moves_from_rooms(srooms, state, res): # generates list of (newstate, movecost) moves
    # from top of stack in side rooms to empty reachable spot in hallway
    # NB guys *can* leave their own room, to free up guys below or hang out in the hall in general
    for ri in range(4): # for each room check top spot, there is always a move unless empty
        # check room state
        if len(srooms)==2:
            emptystates={'0': ('1', 1), '1': ('2', 2), '5': ('6', 1), '6': ('2', 2)}
        elif len(srooms)==4:
            emptystates={'0': ('1', 1), '1': ('2', 2), '2': ('3', 3), '3': ('4', 4), '5': ('6', 1), '6': ('7', 2), 
                         '7': ('8', 3), '8': ('4', 4)}
        else:
            assert False
        roomstate=state[7+ri]
        nexttup=emptystates.get(roomstate)
        if nexttup is None:
            continue # no occupied spot
        nextempty, vertdist=nexttup
        if roomstate>='5':
            c=chr(ord('A')+ri)
        else:
            c=srooms[vertdist-1][ri]
            # check whether we can flip from partially emptied directly to partially full
            numguys=len(srooms)-vertdist
            if numguys>0:
                allroomies=True # do all numguys belong in this room?
                for y in range(vertdist, len(srooms)):
                    if srooms[y][ri]!=chr(ord('A')+ri):
                        allroomies=False
                        break
                if allroomies: # we can flip
                    assert nextempty>='1' and nextempty<='3'
                    nextempty=chr(ord(nextempty)+5) # 1 to 6, 2 to 7, 3 to 8
        assert c>='A' and c<='D'
        # generate reachable ci indices (i.e. spots in the hallway)
        roomgeo={0: (1, 2), 1: (2, 3), 2: (3, 4), 3: (4, 5)}
        ci_down, ci_up=roomgeo[ri]
        reachable_ci=set()
        for ci in range(ci_up, 7):
            if state[ci]=='.':
                reachable_ci.add(ci)
            else:
                break
        for ci in range(ci_down, -1, -1):
            if state[ci]=='.':
                reachable_ci.add(ci)
            else:
                break
        for ci in reachable_ci:
            newstate=state[:ci]+c+state[ci+1:7+ri]+nextempty+state[8+ri:]
            checkmin, checkmax, hordist=hall_geo[(ci, ri)]
            cost=amph_energy[c]*(hordist+vertdist)
            res.append( (newstate, cost) )

def generate_bfs_moves_from_hallway(srooms, state, res): # generates list of (newstate, movecost) moves
    # from hallway to the lowest free spot of the destination room, at most one possible move per occupied hallway spot
    # NB no room to room direct moves, not needed
    for ci in range(7): # for each hallway spot move to destination room bottom spot - if reachable
        c=state[ci]
        if c=='.':
            continue
        assert c>='A' and c<='D'
        ri=ord(c)-ord('A') # room index
        checkmin, checkmax, hordist=hall_geo[(ci, ri)]
        if checkmax>=0:
            notreachable=False
            for x in range(checkmin, checkmax+1):
                if state[x]!='.': # room not reachable
                    notreachable=True
                    break
            if notreachable:
                continue
        roomstate=state[7+ri]
        if len(srooms)==2:
            fillstates={'2': ('6', 2), '6': ('5', 1)}
        elif len(srooms)==4:
            fillstates={'4': ('8', 4), '8': ('7', 3), '7': ('6', 2), '6': ('5', 1)}
        else:
            assert False
        nexttup=fillstates.get(roomstate)
        if nexttup is None:
            continue # no open spot
        nextfill, vertdist=nexttup
        newstate=state[:ci]+'.'+state[ci+1:7+ri]+nextfill+state[8+ri:]
        cost=amph_energy[c]*(hordist+vertdist)
        res.append( (newstate, cost) )

def generate_bfs_moves(srooms, state):
    assert len(state)==11
    res=[]
    generate_bfs_moves_from_hallway(srooms, state, res)
    generate_bfs_moves_from_rooms(srooms, state, res)
    return res

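def mincost_bfs_jump(srooms, states): # BFS with info spreading: a state is re-queued whenever its cost improves,
    # so states are visited in arbitrary order (set.pop) and the loop ends when no cost can be lowered any more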
    todo=set(states.keys())
    while len(todo)>0:
        state=todo.pop()
        basecost=states[state]
        moves=generate_bfs_moves(srooms, state)
        for newstate,movecost in moves:
            oldcost=states.get(newstate)
            if oldcost is None or basecost+movecost<oldcost:
                states[newstate]=basecost+movecost
                todo.add(newstate)

sample1=open('data_src/2021-day-23-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
lines.insert(3, '  #D#C#B#A#') # part 2, comment out for part 1
lines.insert(4, '  #D#B#A#C#') # part 2, comment out for part 1
srooms=[ s[3]+s[5]+s[7]+s[9] for s in lines[2:-1] ] # starting side rooms as list of strings from y=2 onwards
print(f'{srooms=}')
states={ '.......0000': 0 }
mincost_bfs_jump(srooms, states)
mincost=states.get('.......5555', 'unknown')
print(f'{mincost=} {len(states)=}')

# part 1: 18282
# part 2: 50132
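
`mincost_bfs_jump` above re-queues a state every time its cost improves, so a state can be expanded more than once. A common alternative, which is not what this notebook uses, is Dijkstra's algorithm with a priority queue: with non-negative move costs each state is expanded only at its final cost. The sketch below shows the idea on a tiny made-up graph; `dijkstra_mincost` and `toy_graph` are hypothetical names, and `neighbours(state)` stands in for `generate_bfs_moves(srooms, state)`.

```python
import heapq

def dijkstra_mincost(start, neighbours):
    '''Return a dict state -> lowest cost; neighbours(state) yields (newstate, movecost), movecost >= 0.'''
    best = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, state = heapq.heappop(heap)
        if cost > best[state]:
            continue  # stale queue entry, a cheaper route was found in the meantime
        for newstate, movecost in neighbours(state):
            newcost = cost + movecost
            if newstate not in best or newcost < best[newstate]:
                best[newstate] = newcost
                heapq.heappush(heap, (newcost, newstate))
    return best

# made-up toy graph: the direct edge start->a is more expensive than the detour via b
toy_graph = {
    'start': [('a', 10), ('b', 1)],
    'a': [('end', 1)],
    'b': [('a', 2), ('end', 20)],
    'end': [],
}
toy_costs = dijkstra_mincost('start', lambda s: toy_graph[s])
print(toy_costs)
```

With the functions of this cell the call would presumably be `dijkstra_mincost('.......0000', lambda s: generate_bfs_moves(srooms, s))`; that combination is an untested assumption, and for roughly 200k states the simpler set-based loop is fast enough anyway.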

In [None]:
# 2021 day 23 BFS unit tests

testsrooms=['BCBD', 'ADCA']
testmoves=[]
teststate='.A...B.0206'
generate_bfs_moves_from_hallway(testsrooms, teststate, testmoves)
assert len(testmoves)==1
assert testmoves[0] == ('.A.....0606', 70)

testmoves=[]
teststate='.......0000'
generate_bfs_moves_from_rooms(testsrooms, teststate, testmoves)
assert len(testmoves)==28
testmoves.sort(key=lambda m: f'{m[1]:06d}{m[0]}')
assert testmoves[0] == ('....B..0060', 20) # includes 'roomies flip' check result
assert testmoves[-1] == ('D......0001', 9000)
#for m in testmoves:
#    print(m)
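
One more possible check, sketched under the assumption that the hallway parking spots `ci` 0..6 sit at board columns 1, 2, 4, 6, 8, 10, 11 and the room entrances `ri` 0..3 at columns 3, 5, 7, 9 (the coordinates used by the DFS code further down): the hand-written `hall_geo` table can be re-derived from those coordinates. `HALL_X`, `ROOM_X` and `derive_hall_geo` are hypothetical names, not part of the solution.

```python
HALL_X = [1, 2, 4, 6, 8, 10, 11]  # assumed x coordinate of hallway parking spot ci = 0..6
ROOM_X = [3, 5, 7, 9]             # assumed x coordinate of side room ri = 0..3

def derive_hall_geo():
    '''Rebuild the (ci, ri) -> (checkmin, checkmax, hordist) table from board coordinates.'''
    geo = {}
    for ri, rx in enumerate(ROOM_X):
        for ci, hx in enumerate(HALL_X):
            # parking spots strictly between this spot and the room entrance
            between = [cj for cj, x in enumerate(HALL_X)
                       if min(hx, rx) < x < max(hx, rx)]
            if between:
                geo[(ci, ri)] = (min(between), max(between), abs(hx - rx))
            else:
                geo[(ci, ri)] = (0, -1, abs(hx - rx))  # empty range: nothing to check
    return geo

derived_geo = derive_hall_geo()
print(len(derived_geo), derived_geo[(0, 3)], derived_geo[(4, 0)])
```

Inside the notebook, `assert derive_hall_geo() == hall_geo` would then compare all 28 entries at once.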


In [None]:
# 2021 day 23 DFS solution which mostly works, but is quite slow (and also doesn't work 100% - perhaps
# because it doesn't allow guys to leave their destination room unless absolutely required?)
# mv ~/Downloads/input data_src/2021-day-23-input.txt
# big input file looks like: small puzzle
# idea: part 1 parse to board, then DFS, part 2 also (BFS would take too much space considering number of board states?)

sample1='''
#############
#...........#
###B#C#B#D###
  #A#D#C#A#
  #########
'''

def is_endpos(data):
  for x in range(1, 12):
    if data[1][x]!='.':
      return False
  for y in range(2, len(data)-1):
    if data[y][3]!='A' or data[y][5]!='B' or data[y][7]!='C' or data[y][9]!='D':
      return False
  return True

def sign(a):
    return (a > 0) - (a < 0)

def show_board(data, stats, locked, cost, depth, force=False):
  if depth>150 and stats['show_triggered']==0:
    stats['show_triggered']=1
    stats['show_board']=0
  if not force:
    if stats['show_triggered']==0:
      return
    if stats['show_board']>=20:
      return
  for row in data:
    print(''.join(row))
  mincost=stats['mincost']
  turn=stats['turns']
  print(f'{locked=} {depth=} {turn=} {cost=} {mincost=}')
  print()
  stats['show_board']+=1

def blockers_solvable(data, newx, newc): # return True if new hallway situation is still solvable,
  # does not consider room situations
  blockers={} # maps x coord in hallway to char, later roomx, initially assume all are blocked
  for x in range(1, 12):
    c=data[1][x]
    if c>='A' and c<='D':
      blockers[x]=c
  blockers[newx]=newc
  if len(blockers)<2: # one cannot block itself anyway
    return True
  blockers={ x: {'A': 3, 'B': 5, 'C': 7, 'D': 9}[c] for x,c in blockers.items() } # map x to room x
  while len(blockers)>=2:
    unblocked=set()
    for x, rx in blockers.items(): # is this one actually blocked?
      inbetween=False
      for x1 in range(min(x, rx), max(x, rx)+1):
        if x1!=x and x1 in blockers:
          inbetween=True
          break
      if not inbetween:
        unblocked.add(x)
    if len(unblocked)<1:
      break
    for x in unblocked:
      del blockers[x]
  return len(blockers)<2

def valid_jump(data, oldy, oldx, newy, newx): # check jump and return distance,finalmove if valid, or None
  c=data[oldy][oldx]
  assert c>='A' and c<='D'
  if oldy==newy and oldy==1: # must make a move from room to hallway or vice versa, or room to own room
    return None
  if oldx==newx: # cannot shuffle around in a room
    return None
  if newy>1: # checks if moving to own room
    x={'A': 3, 'B': 5, 'C': 7, 'D': 9}[c]
    if newx!=x: # only to our own room
      return None
    for y in range(newy+1, len(data)-1):
      bottomc=data[y][newx]
      if bottomc=='.': # could have moved further down 
        return None
      if bottomc>='A' and bottomc<='D' and bottomc!=c: # will block someone
        return None
  if newy==1 and newx in {3, 5, 7, 9}: # don't park in front of door
    return None
  if oldy>1: # moving from your own room only allowed to let someone else out
    x={'A': 3, 'B': 5, 'C': 7, 'D': 9}[c]
    if oldx==x:
      foundreason=False
      for y in range(oldy+1, len(data)-1):
        bottomc=data[y][oldx]
        if bottomc>='A' and bottomc<='D' and bottomc!=c:
          foundreason=True
          break
      if not foundreason:
        return None
  # way must be clear
  dx=sign(newx-oldx)
  dy=sign(1-oldy)
  dist=0
  x=oldx
  y=oldy
  if dy<0: # moving up, vertical part first
    while y!=1:
      if data[y+dy][x]!='.':
        return None
      y+=dy
      dist+=1
  # for all moves the horizontal part
  while x!=newx:
    if data[y][x+dx]!='.':
      return None
    x+=dx
    dist+=1
  dy=sign(newy-y)
  if dy>0: # moving down, final vertical part
    while y!=newy:
      if data[y+dy][x]!='.':
        return None
      y+=dy
      dist+=1
  # prevent creating cross-blocks in the hallway (eg D to left A, both between their rooms)
  if newy==1:
    if not blockers_solvable(data, newx, data[oldy][oldx]):
      return None
  # determine finalmove
  finalmove=(newy>1)
  # some final moves ('outsiders') can be expensive, and should not be considered final final,
  # as a hack we just don't mark them final
  if oldx<3 or oldx>9:
    finalmove=False
  return dist,finalmove

def mincost_dfs_jump(data, cost, stats, depth): # DFS jumps
  stats['turns']+=1
  show_board(data, stats, None, cost, depth)
  mincost=stats['mincost']
  if is_endpos(data):
    if mincost==0 or cost<mincost:
      stats['mincost']=cost
      turn=stats['turns']
      print(f'mincost improved; {depth=} {turn=} {cost=}')
    return
  if mincost>0 and cost>=mincost:
    return
  # iterate over moves
  optmoves=[]
  finalmoves=[]
  finalsrccounts=collections.Counter()
  finaldestcounts=collections.Counter()
  for y in range(1, len(data)-1):
    for x in range(1, len(data[y])-1):
      c=data[y][x]
      if c>='A' and c<='D':
        energy={'A': 1, 'B': 10, 'C': 100, 'D': 1000}[c]
        for y2 in range(1, len(data)-1):
          for x2 in range(1, len(data[y2])-1):
            if data[y2][x2]=='.':
              jumpres=valid_jump(data, y, x, y2, x2)
              if jumpres is None:
                continue
              dist,finalmove=jumpres
              if finalmove: 
                finalmoves.append( (y,x,y2,x2,c,energy*dist) )
              else:
                optmoves.append( (y,x,y2,x2,c,energy*dist) )
              finalsrccounts[ (y,x) ]+=1 # any conflict is counted, also between final and optional
              finaldestcounts[ (y2,x2) ]+=1                
  # now apply all non-conflicting finalmoves
  finalcopy=list(finalmoves)
  finalmoves=[]
  for finalmove in finalcopy:
    y,x,y2,x2,c,ecost=finalmove
    if finalsrccounts[ (y,x) ] >1 or finaldestcounts[ (y2,x2) ] >1:
      optmoves.append(finalmove) # pity but it conflicts with other moves
    else:
      assert data[y][x]==c and data[y2][x2]=='.'
      data[y2][x2]=c
      data[y][x]='.'
      cost+=ecost
      finalmoves.append(finalmove) # only keep the ones we have executed
  # could be at endpos now
  if is_endpos(data):
    if mincost==0 or cost<mincost:
      stats['mincost']=cost
      turn=stats['turns']
      print(f'mincost improved; {depth=} {turn=} {cost=}')
    for finalmove in finalmoves:
      y,x,y2,x2,c,ecost=finalmove
      data[y][x]=c
      data[y2][x2]='.'      
    return                
  if mincost>0 and cost>=mincost:
    for finalmove in finalmoves:
      y,x,y2,x2,c,ecost=finalmove
      data[y][x]=c
      data[y2][x2]='.'
    return
  optmoves.sort(key=lambda mv: mv[5]) # lowest cost first
  for optmove in optmoves:
    y,x,y2,x2,c,ecost=optmove
    cost+=ecost
    if mincost>0 and cost>=mincost:
      break # later options are only more costly
    assert data[y][x]==c and data[y2][x2]=='.'
    data[y2][x2]=c
    data[y][x]='.'
    mincost_dfs_jump(data, cost, stats, depth+1)
    data[y][x]=c
    data[y2][x2]='.'
    cost-=ecost
  # finally undo the finalmoves
  for finalmove in finalmoves:
    y,x,y2,x2,c,ecost=finalmove
    data[y][x]=c
    data[y2][x2]='.'

#sample1=open('data_src/2021-day-23-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
lines.insert(3, '  #D#C#B#A#') # part 2, comment out for part 1
lines.insert(4, '  #D#B#A#C#') # part 2, comment out for part 1
data=[ list(s) for s in lines ]
stats=collections.Counter()
mincost_dfs_jump(data, 0, stats, 0)
print(f'{stats=}')
mincost=stats['mincost']
print(f'{mincost=}')

# part 1: not 18300 or 18286 but 18282
# part 2: current implementation finds mincost=0, and for non-file sample1 finds mincost=50171 after 104 minutes, 
# i.e. doesn't find all solutions

In [None]:
# 2021 day 23 unit tests for blockers_solvable

testsample6='''
#############
#A......B..A#
###.#.#.#.###
'''
testlines=[s for s in testsample6.splitlines() if len(s)>0 ]
testdata6=[ list(s) for s in testlines ]
assert blockers_solvable(testdata6, 1, 'A')
assert blockers_solvable(testdata6, 2, 'A')
assert blockers_solvable(testdata6, 10, 'D')
assert blockers_solvable(testdata6, 2, 'D')
assert blockers_solvable(testdata6, 4, 'D')
assert not blockers_solvable(testdata6, 6, 'D')

testsample7='''
#############
#DC.....A...#
###.#.#.#.###
'''
testlines=[s for s in testsample7.splitlines() if len(s)>0 ]
testdata7=[ list(s) for s in testlines ]
assert blockers_solvable(testdata7, 1, 'D')
assert not blockers_solvable(testdata7, 6, 'D')
assert not blockers_solvable(testdata7, 4, 'D')

In [None]:
# 2021 day 23 unit tests for valid_jump and is_endpos part 2

testsample2='''
#############
#...B.......#
###B#C#.#D###
  #.#.#.#.#
  #.#.#.#.#
  #A#D#C#A#
  #########
'''
testlines=[s for s in testsample2.splitlines() if len(s)>0 ]
testdata2=[ list(s) for s in testlines ]
assert valid_jump(testdata2, 2, 9, 1, 1) is None # cannot move D through guy
assert valid_jump(testdata2, 5, 7, 1, 10) is None # cannot move C out of own room
assert valid_jump(testdata2, 2, 9, 1, 7) is None # cannot move D in front of room entrance
assert valid_jump(testdata2, 2, 9, 1, 11) == (3,False) # can move D up aside, from own room, because freeing up guy
assert valid_jump(testdata2, 1, 4, 4, 7) is None # cannot move into wrong room
assert valid_jump(testdata2, 1, 4, 1, 11) is None # cannot move within hallway
assert valid_jump(testdata2, 2, 5, 4, 7) == (6,True) # ok room to room move

testsample3='''
#############
#A........BA#
###.#.#.#.###
  #.#.#.#.#
  #.#.#.#.#
  #.#D#C#A#
  #########
'''
testlines=[s for s in testsample3.splitlines() if len(s)>0 ]
testdata3=[ list(s) for s in testlines ]
assert valid_jump(testdata3, 1, 1, 4, 9) is None # cannot move into wrong room
assert valid_jump(testdata3, 1, 1, 2, 3) is None # cannot move into room leaving space below
assert valid_jump(testdata3, 1, 1, 4, 3) is None # cannot move into room leaving space below
assert valid_jump(testdata3, 1, 11, 5, 3) is None # cannot move through guy
assert valid_jump(testdata3, 1, 1, 5, 3) == (6,False) # can move into bottom of own room, but not final
assert valid_jump(testdata3, 5, 9, 2, 3) is None # cannot move into room leaving space below
assert valid_jump(testdata3, 5, 9, 5, 3) == (14,True) # can move room to room
assert valid_jump(testdata3, 5, 9, 1, 6) == (7,False) # can move up aside
assert valid_jump(testdata3, 5, 5, 2, 9) is None # cannot move into room blocking guy
assert valid_jump(testdata3, 5, 5, 4, 9) is None # cannot move into room blocking guy

testsample4='''
#############
#...........#
###B#C#B#D###
  #D#C#B#A#
  #D#B#A#C#
  #A#D#C#A#
  #########
'''
testlines=[s for s in testsample4.splitlines() if len(s)>0 ]
testdata4=[ list(s) for s in testlines ]
assert not is_endpos(testdata4)

testsample5='''
#############
#...........#
###A#B#C#D###
  #A#B#C#D#
  #A#B#C#D#
  #A#B#C#D#
  #########
'''
testlines=[s for s in testsample5.splitlines() if len(s)>0 ]
testdata5=[ list(s) for s in testlines ]
assert is_endpos(testdata5)


In [None]:
# 2021 day 23 unit tests for valid_jump part 1

testsample2='''
#############
#...B.......#
###B#C#.#D###
  #A#D#C#A#
  #########
'''
testlines=[s for s in testsample2.splitlines() if len(s)>0 ]
testdata2=[ list(s) for s in testlines ]
assert valid_jump(testdata2, 2, 9, 1, 1) is None # cannot move D through guy
assert valid_jump(testdata2, 3, 7, 1, 10) is None # cannot move C out of own room
assert valid_jump(testdata2, 2, 9, 1, 7) is None # cannot move D in front of room entrance
assert valid_jump(testdata2, 2, 9, 1, 11) == (3,False) # can move D up aside, from own room, because freeing up guy
assert valid_jump(testdata2, 1, 4, 2, 7) is None # cannot move into wrong room
assert valid_jump(testdata2, 1, 4, 1, 11) is None # cannot move within hallway
assert valid_jump(testdata2, 2, 5, 2, 7) == (4,True) # ok room to room move

testsample3='''
#############
#A........BA#
###.#.#.#.###
  #.#D#C#A#
  #########
'''
testlines=[s for s in testsample3.splitlines() if len(s)>0 ]
testdata3=[ list(s) for s in testlines ]
assert valid_jump(testdata3, 1, 1, 2, 9) is None # cannot move into wrong room
assert valid_jump(testdata3, 1, 1, 2, 3) is None # cannot move into room leaving space below
assert valid_jump(testdata3, 1, 11, 3, 3) is None # cannot move through guy
assert valid_jump(testdata3, 1, 1, 3, 3) == (4,False) # can move into bottom of own room, but not final
assert valid_jump(testdata3, 3, 9, 2, 3) is None # cannot move into room leaving space below
assert valid_jump(testdata3, 3, 9, 3, 3) == (10,True) # can move room to room
assert valid_jump(testdata3, 3, 9, 1, 6) == (5,False) # can move up aside
assert valid_jump(testdata3, 3, 5, 2, 9) is None # cannot move into room blocking guy

In [None]:
# 2021 day 23 parked code for a step by step approach

def mincost_dfs(data, cost, stats, locked, lastmove, depth=0): # DFS
  show_board(data, stats, locked, cost, depth)
  mincost=stats['mincost']
  if is_endpos(data):
    if mincost==0 or cost<mincost:
      stats['mincost']=cost
      show_board(data, stats, locked, cost, depth, force=True)
    return
  if mincost>0 and cost>=mincost:
    return
  # iterate over moves
  savelocked=set(locked)
  for y in range(len(data)):
    for x in range(len(data[y])):
      c=data[y][x]
      if c>='A' and c<='D':
        energy={'A': 1, 'B': 10, 'C': 100, 'D': 1000}[c]
        for dy in (-1, 1):
          dx=0
          if data[y+dy][x+dx]=='.':
            restore_locked(locked, savelocked)
            if not valid_move(data, locked, y, x, y+dy, x+dx, lastmove):
              continue
            data[y+dy][x+dx]=c
            data[y][x]='.'
            newlastmove=( (y,x), (y+dy, x+dx) )
            mincost_dfs(data, cost+energy, stats, locked, newlastmove, depth+1)
            data[y][x]=c
            data[y+dy][x+dx]='.'
        for dx in (-1, 1):
          dy=0
          if data[y+dy][x+dx]=='.':
            restore_locked(locked, savelocked)
            if not valid_move(data, locked, y, x, y+dy, x+dx, lastmove):
              continue
            data[y+dy][x+dx]=c
            data[y][x]='.'
            newlastmove=( (y,x), (y+dy, x+dx) )
            mincost_dfs(data, cost+energy, stats, locked, newlastmove, depth+1)
            data[y][x]=c
            data[y+dy][x+dx]='.'

def valid_move(data, locked, oldy, oldx, newy, newx, lastmove): # check move and update locked
  c=data[oldy][oldx]
  # do not reverse lastmove
  lastold,lastnew=lastmove
  lastoldy,lastoldx=lastold
  lastnewy,lastnewx=lastnew
  if oldy==lastnewy and oldx==lastnewx and newy==lastoldy and newx==lastoldx:
    return False
  # if switched guy lock others in hallway
  if oldy!=lastnewy or oldx!=lastnewx: # switched guy
    if 'U' in locked: # switching not allowed
      return False
    for x in range(1, 12):
      if data[1][x]!='.' and x!=oldx:
        locked.add(x)         
  # only move into own room from hall
  if oldy==1 and newy==2:
    if newx!={'A': 3, 'B': 5, 'C': 7, 'D': 9}[c]:
      return False
    assert oldx==newx
    locked.discard('U')
    return True
  # do not stop on space in front of room
  for x in (3, 5, 7, 9):
    if data[1][x]!='.' and (oldx!=x or oldy!=1):
      return False
  # in own room may not move up or away if 'empty'
  if oldx=={'A': 3, 'B': 5, 'C': 7, 'D': 9}[c]:
    if (data[2][oldx]=='.' or data[2][oldx]==c) and (data[3][oldx]=='.' or data[3][oldx]==c):
      if newx!=oldx:
        return False
      if newy<oldy:
        return False
      assert newy>oldy
      return True
  # if your horizontal move and way to room clear and no U then have to go that way
  if ('U' not in locked) and oldy==1 and newy==1:
    x={'A': 3, 'B': 5, 'C': 7, 'D': 9}[c]
    if (data[2][x]=='.' or data[2][x]==c) and (data[3][x]=='.' or data[3][x]==c): # room empty
      intheway=False
      for x0 in range(min(oldx, x), max(oldx, x)+1): # hallway clear?
        if x0!=oldx and data[1][x0]!='.':
          intheway=True
          break
      if not intheway:
        if x>oldx and newx<oldx:
          return False
        if x<oldx and newx>oldx:
          return False 
  # if moving in the hall and switched from prev. guy and all locked can unlock
  if oldy==1 and newy==1:
    if oldx in locked and (oldy!=lastnewy or oldx!=lastnewx): # switched guy, can we go to our room?
      # only if nobody else already unlocked
      if 'U' in locked:
        return False
      # only if your room is 'empty'
      x={'A': 3, 'B': 5, 'C': 7, 'D': 9}[c]
      if data[2][x]!='.' and data[2][x]!=c:
        return False
      if data[3][x]!='.' and data[3][x]!=c:
        return False
      # only if hallway to your room is clear, between oldx and x must be only this guy
      for x0 in range(min(oldx, x), max(oldx, x)+1):
        if x0!=oldx and data[1][x0]!='.':
          return False
      # yes, can move, unlock
      locked.remove(oldx)
      locked.add('U')
    # still locked? can't move
    if oldx in locked:
      return False
    if 'U' in locked: # must be us
      # have to move to own room
      x={'A': 3, 'B': 5, 'C': 7, 'D': 9}[c]
      if x>oldx and newx<oldx:
        return False
      if x<oldx and newx>oldx:
        return False
  # done all checks, must be good
  return True


def restore_locked(locked, savelocked):
  locked.clear()
  locked.update(savelocked)

locked=set() # set of x positions of locked pods, and one unlocked token 'U'
lastmove=( (-1, -1), (-1, -1) )

Conclusion for day 23: persistence pays off. The BFS solution looks really nice; it would have been awesome to come up with it right away, but I'm still happy about it.
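
A footnote on the memory estimate in the BFS cell: its header comment notes that a state could be packed into a 7*3+4*4=37 bit number (3 bits for each of the 7 hallway spots, 4 bits for each of the 4 room codes). The sketch below shows one possible packing; `pack_state` and `unpack_state` are hypothetical helpers that the solution does not use, since about 200k string states fit in memory comfortably.

```python
HALL_CHARS = '.ABCD'  # 5 possible values, so 3 bits per hallway spot

def pack_state(state):
    '''Pack an 11-char state vector (7 hallway chars + 4 room digits 0..8) into an int of at most 37 bits.'''
    assert len(state) == 11
    packed = 0
    for c in state[:7]:
        packed = (packed << 3) | HALL_CHARS.index(c)
    for n in state[7:]:
        packed = (packed << 4) | int(n)  # room codes 0..8 need 4 bits
    return packed

def unpack_state(packed):
    '''Inverse of pack_state: rebuild the 11-char state vector.'''
    rooms = []
    for _ in range(4):  # rooms were packed last, so they come out first, in reverse order
        rooms.append(str(packed & 0xF))
        packed >>= 4
    hall = []
    for _ in range(7):
        hall.append(HALL_CHARS[packed & 0x7])
        packed >>= 3
    return ''.join(reversed(hall)) + ''.join(reversed(rooms))

packed_example = pack_state('.A...B.0206')
print(packed_example, packed_example.bit_length())
print(unpack_state(packed_example))
```

An int key would replace the 11-character string as dictionary key; whether that actually saves memory in CPython has not been measured here.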

In [None]:
# 2021 day 22 data & part1
# mv ~/Downloads/input data_src/2021-day-22-input.txt
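# big input file looks like: big list of overlapping cuboids
# idea: part 1 parse each line to (onoff, x0, x1, y0, y1, z0, z1), restrict it to the -50..50 init area,
# then maintain a set of 'on' (x,y,z) pixels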

sample1='''
on x=10..12,y=10..12,z=10..12
on x=11..13,y=11..13,z=11..13
off x=9..11,y=9..11,z=9..11
on x=10..10,y=10..10,z=10..10
'''

sample2='''
on x=-20..26,y=-36..17,z=-47..7
on x=-20..33,y=-21..23,z=-26..28
on x=-22..28,y=-29..23,z=-38..16
on x=-46..7,y=-6..46,z=-50..-1
on x=-49..1,y=-3..46,z=-24..28
on x=2..47,y=-22..22,z=-23..27
on x=-27..23,y=-28..26,z=-21..29
on x=-39..5,y=-6..47,z=-3..44
on x=-30..21,y=-8..43,z=-13..34
on x=-22..26,y=-27..20,z=-29..19
off x=-48..-32,y=26..41,z=-47..-37
on x=-12..35,y=6..50,z=-50..-2
off x=-48..-32,y=-32..-16,z=-15..-5
on x=-18..26,y=-33..15,z=-7..46
off x=-40..-22,y=-38..-28,z=23..41
on x=-16..35,y=-41..10,z=-47..6
off x=-32..-23,y=11..30,z=-14..3
on x=-49..-5,y=-3..45,z=-29..18
off x=18..30,y=-20..-8,z=-3..13
on x=-41..9,y=-7..43,z=-33..15
on x=-54112..-39298,y=-85059..-49293,z=-27449..7877
on x=967..23432,y=45373..81175,z=27513..53682
'''

sample3='''
on x=-5..47,y=-31..22,z=-19..33
on x=-44..5,y=-27..21,z=-14..35
on x=-49..-1,y=-11..42,z=-10..38
on x=-20..34,y=-40..6,z=-44..1
off x=26..39,y=40..50,z=-2..11
on x=-41..5,y=-41..6,z=-36..8
off x=-43..-33,y=-45..-28,z=7..25
on x=-33..15,y=-32..19,z=-34..11
off x=35..47,y=-46..-34,z=-11..5
on x=-14..36,y=-6..44,z=-16..29
on x=-57795..-6158,y=29564..72030,z=20435..90618
on x=36731..105352,y=-21140..28532,z=16094..90401
on x=30999..107136,y=-53464..15513,z=8553..71215
on x=13528..83982,y=-99403..-27377,z=-24141..23996
on x=-72682..-12347,y=18159..111354,z=7391..80950
on x=-1060..80757,y=-65301..-20884,z=-103788..-16709
on x=-83015..-9461,y=-72160..-8347,z=-81239..-26856
on x=-52752..22273,y=-49450..9096,z=54442..119054
on x=-29982..40483,y=-108474..-28371,z=-24328..38471
on x=-4958..62750,y=40422..118853,z=-7672..65583
on x=55694..108686,y=-43367..46958,z=-26781..48729
on x=-98497..-18186,y=-63569..3412,z=1232..88485
on x=-726..56291,y=-62629..13224,z=18033..85226
on x=-110886..-34664,y=-81338..-8658,z=8914..63723
on x=-55829..24974,y=-16897..54165,z=-121762..-28058
on x=-65152..-11147,y=22489..91432,z=-58782..1780
on x=-120100..-32970,y=-46592..27473,z=-11695..61039
on x=-18631..37533,y=-124565..-50804,z=-35667..28308
on x=-57817..18248,y=49321..117703,z=5745..55881
on x=14781..98692,y=-1341..70827,z=15753..70151
on x=-34419..55919,y=-19626..40991,z=39015..114138
on x=-60785..11593,y=-56135..2999,z=-95368..-26915
on x=-32178..58085,y=17647..101866,z=-91405..-8878
on x=-53655..12091,y=50097..105568,z=-75335..-4862
on x=-111166..-40997,y=-71714..2688,z=5609..50954
on x=-16602..70118,y=-98693..-44401,z=5197..76897
on x=16383..101554,y=4615..83635,z=-44907..18747
off x=-95822..-15171,y=-19987..48940,z=10804..104439
on x=-89813..-14614,y=16069..88491,z=-3297..45228
on x=41075..99376,y=-20427..49978,z=-52012..13762
on x=-21330..50085,y=-17944..62733,z=-112280..-30197
on x=-16478..35915,y=36008..118594,z=-7885..47086
off x=-98156..-27851,y=-49952..43171,z=-99005..-8456
off x=2032..69770,y=-71013..4824,z=7471..94418
on x=43670..120875,y=-42068..12382,z=-24787..38892
off x=37514..111226,y=-45862..25743,z=-16714..54663
off x=25699..97951,y=-30668..59918,z=-15349..69697
off x=-44271..17935,y=-9516..60759,z=49131..112598
on x=-61695..-5813,y=40978..94975,z=8655..80240
off x=-101086..-9439,y=-7088..67543,z=33935..83858
off x=18020..114017,y=-48931..32606,z=21474..89843
off x=-77139..10506,y=-89994..-18797,z=-80..59318
off x=8476..79288,y=-75520..11602,z=-96624..-24783
on x=-47488..-1262,y=24338..100707,z=16292..72967
off x=-84341..13987,y=2429..92914,z=-90671..-1318
off x=-37810..49457,y=-71013..-7894,z=-105357..-13188
off x=-27365..46395,y=31009..98017,z=15428..76570
off x=-70369..-16548,y=22648..78696,z=-1892..86821
on x=-53470..21291,y=-120233..-33476,z=-44150..38147
off x=-93533..-4276,y=-16170..68771,z=-104985..-24507
'''

sample0a='''
on x=0..1,y=0..1,z=0..1
on x=2..3,y=2..3,z=2..3
'''

sample0b='''
on x=0..1,y=0..1,z=0..1
on x=1..2,y=1..2,z=1..2
'''

sample0c='''
on x=1..32,y=1..32,z=1..32
'''

sample0d='''
on x=-5..47,y=-31..22,z=-19..33
on x=-44..5,y=-27..21,z=-14..35
on x=-49..-1,y=-11..42,z=-10..38
on x=-20..34,y=-40..6,z=-44..1
off x=26..39,y=40..50,z=-2..11
on x=-41..5,y=-41..6,z=-36..8
off x=-43..-33,y=-45..-28,z=7..25
on x=-33..15,y=-32..19,z=-34..11
off x=35..47,y=-46..-34,z=-11..5
on x=-14..36,y=-6..44,z=-16..29
on x=-57795..-6158,y=29564..72030,z=20435..90618
on x=36731..105352,y=-21140..28532,z=16094..90401
'''

#sample1=open('data_src/2021-day-22-input.txt').read()
lines=[s for s in sample3.splitlines() if len(s)>0 ]

def restrict(x0, x1, lim0, lim1): # check and return the overlap between two line segments in a one-dimensional space,
    # or None,None if no overlap
    if x0<lim0:
        x0=lim0
    if x0>lim1:
        return None,None
    if x1<lim0:
        return None,None
    if x1>lim1:
        x1=lim1
    if x0>x1:
        return None,None
    return x0,x1

def maintain_on(ons, row): # with ons being a set of 1x1x1 (x,y,z) on-pixels, first restrict row to the -50,50 init area,
    # then apply to ons
    onoff, x0, x1, y0, y1, z0, z1=row
    assert onoff in {'on', 'off'}
    assert x0<=x1
    assert y0<=y1
    assert z0<=z1
    x0,x1=restrict(x0, x1, -50, 50)
    if x0 is None or x1 is None:
        return
    y0,y1=restrict(y0, y1, -50, 50)
    if y0 is None or y1 is None:
        return
    z0,z1=restrict(z0, z1, -50, 50)
    if z0 is None or z1 is None:
        return
    for x in range(x0, x1+1):
        for y in range(y0, y1+1):
            for z in range(z0, z1+1):
                if onoff=='on':
                    ons.add( (x,y,z) )
                else:
                    ons.discard( (x,y,z) )

def count_on(ons): # with ons being a set of 1x1x1 (x,y,z) on-pixels count the on-pixels
    return len(ons)

data=[ result.group(1, 2, 3, 4, 5, 6, 7) for s in lines if (result:= re.match(r'(\w+)\s*x=([\d\-]+)\.\.([\d\-]+),y=([\d\-]+)\.\.([\d\-]+),z=([\d\-]+)\.\.([\d\-]+)', s)) ]
data=[ (row[0], int(row[1]), int(row[2]), int(row[3]), int(row[4]), int(row[5]), int(row[6]) ) for row in data ]
ons=set() # all areas off, for now set of (x,y,z) pixels
for row in data:
    maintain_on(ons, row)
    count=count_on(ons)
    print(f'after {row=} {count=}')
    #if row[-1]==29: # sample3-specific
    #if row[-2]== -44: # sample1-file-specific # DEBUG
    #    break
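
The `restrict` helper in the cell above is an intersection of two closed integer intervals. A minimal sketch of the same check written with `max`/`min` follows. `overlap_1d` is a hypothetical name that does not exist in this notebook, and the `(None, None)` convention is simply copied from `restrict`.

```python
def overlap_1d(x0, x1, lim0, lim1):
    """Intersect the closed intervals [x0, x1] and [lim0, lim1].

    Returns (lo, hi), or (None, None) when they do not overlap.
    Hypothetical stand-in for restrict(); assumes x0 <= x1 and lim0 <= lim1.
    """
    lo = max(x0, lim0)
    hi = min(x1, lim1)
    if lo > hi:  # the intervals are disjoint
        return None, None
    return lo, hi

clamped = overlap_1d(-54112, -30, -50, 50)    # partly inside the init area
outside = overlap_1d(36731, 105352, -50, 50)  # entirely outside the init area
touching = overlap_1d(-5, 47, 47, 60)         # shares exactly one coordinate
```

Because the ranges are inclusive, two intervals that share a single coordinate still overlap, which is why the comparison is `lo > hi` and not `lo >= hi`.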

In [None]:
# 2021 day 22 part 2 code fourth implementation, named 'cubarraytree', see cubtree below, here tree nodes are stored
# in a numpy array instead of dictionaries (also tried a record array layered on top of a structured array and also a 
# structured array directly but these are unfortunately both quite a bit slower).
# Nodes are still cuboids, with up to 27 cuboids as children.
# Nodes can be deleted when there is an 'exact fit' for a node that previously had children, in this implementation
# these deleted nodes are not re-used or garbage collected, but simply abandoned. Each node has:
# (corner_x, corner_y, corner_z): minimum of all coords in the cube
# (corner2_x, corner2_y, corner2_z): maximum of all coords in the cube
# val, int, 0 is off, 1 is on, 2 and higher is numchildren+1
# first_child: index of the first child (or 0 if no children)
# all fields are int32

NUMNODES=200000
nodes=np.zeros( (NUMNODES, 8), dtype=np.dtype(np.int32, align=True))
class Nd(object): # constants
    __slots__=()
    CORNER_X=0
    CORNER_Y=1
    CORNER_Z=2
    CORNER2_X=3
    CORNER2_Y=4
    CORNER2_Z=5
    VAL=6
    FIRST_CHILD=7
ND=Nd()
first_free_node=0

def max_coord(data): # from data determine largest absolute coord, round up, a power of two not needed here
    size=0
    for row in data:
        maxc=max([ abs(num) for num in row[1:] ])
        size=max(size, maxc)
    return size+2

def addnonempty_child(x0, x1, y0, y1, z0, z1, val): # add specified cuboid child if non-empty, return index
    global nodes, first_free_node
    if x0<=x1 and y0<=y1 and z0<=z1:
        i=first_free_node
        assert i<NUMNODES
        child=nodes[i]
        child[ND.CORNER_X]=x0
        child[ND.CORNER_Y]=y0
        child[ND.CORNER_Z]=z0
        child[ND.CORNER2_X]=x1
        child[ND.CORNER2_Y]=y1
        child[ND.CORNER2_Z]=z1
        child[ND.VAL]=val
        child[ND.FIRST_CHILD]=0
        first_free_node+=1
        return i
    return None

def make_children(node, regx0, regx1, regy0, regy1, regz0, regz1): # create children along specified lines
    val=node[ND.VAL]
    corner_x=node[ND.CORNER_X]
    corner_y=node[ND.CORNER_Y]
    corner_z=node[ND.CORNER_Z]
    corner2_x=node[ND.CORNER2_X]
    corner2_y=node[ND.CORNER2_Y]
    corner2_z=node[ND.CORNER2_Z]
    assert node[ND.FIRST_CHILD]==0 and val<2
    # generate max 27 non-empty child cuboids out of the node, inside and outside the region (nothing is returned)
    first_child=None
    nchildren=0
    for xpair in [ (corner_x, regx0-1), (regx0, regx1), (regx1+1, corner2_x) ]:
        for ypair in [ (corner_y, regy0-1), (regy0, regy1), (regy1+1, corner2_y) ]:
            for zpair in [ (corner_z, regz0-1), (regz0, regz1), (regz1+1, corner2_z) ]:
                cindex=addnonempty_child(xpair[0], xpair[1], ypair[0], ypair[1], zpair[0], zpair[1], val)
                if cindex is not None:
                    if first_child is None:
                        first_child=cindex
                    nchildren+=1
    assert first_child is not None
    node[ND.VAL]=1+nchildren
    node[ND.FIRST_CHILD]=first_child

def overlap_cub(onx0, ony0, onz0, onx1, ony1, onz1, row): # return overlap between cuboid of corners and row shape, or None
    onoff, rowx0, rowx1, rowy0, rowy1, rowz0, rowz1=row
    x0,x1=restrict(onx0, onx1, rowx0, rowx1)
    if x0 is None or x1 is None:
        return None
    y0,y1=restrict(ony0, ony1, rowy0, rowy1)
    if y0 is None or y1 is None:
        return None
    z0,z1=restrict(onz0, onz1, rowz0, rowz1)
    if z0 is None or z1 is None:
        return None
    exactfit=(x0==onx0 and x1==onx1 and y0==ony0 and y1==ony1 and z0==onz0 and z1==onz1)
    return (x0, x1, y0, y1, z0, z1, exactfit)

def maintain_tree(node, row): # recursively 'paint' the tree
    # determine part of row that fits in node
    rowfit=overlap_cub(node[ND.CORNER_X], node[ND.CORNER_Y], node[ND.CORNER_Z], node[ND.CORNER2_X], node[ND.CORNER2_Y],
      node[ND.CORNER2_Z], row)
    if rowfit is None: # no overlap
        return
    onoff=row[0]
    if node[ND.VAL]==onoff: # repainting what was already painted
        return
    x0, x1, y0, y1, z0, z1, exactfit=rowfit
    if exactfit: # if there were any children they will be abandoned, efficient in runtime but not in space
        node[ND.FIRST_CHILD]=0
        node[ND.VAL]=onoff
        return
    # create/update children recursively
    if node[ND.VAL]<2:
        make_children(node, x0, x1, y0, y1, z0, z1)
    nchildren=node[ND.VAL]-1
    first_child=node[ND.FIRST_CHILD]
    assert nchildren>0
    assert first_child>0
    for cindex in range(first_child, first_child+nchildren):
        maintain_tree(nodes[cindex], row)

def count_tree(node): # recursively count the cubtree, return count (of lit pixels), nodes
    val=node[ND.VAL]
    nchildren=val-1
    if nchildren>0:
        count=0
        nodec=1
        first_child=node[ND.FIRST_CHILD]
        for cindex in range(first_child, first_child+nchildren):
            cnt,lf=count_tree(nodes[cindex])
            count+=cnt
            nodec+=lf
        return count,nodec
    elif val==0:
        return 0,1
    else:
        assert val==1
        count=1
        count *= int(node[ND.CORNER2_X]-node[ND.CORNER_X]+1)
        count *= int(node[ND.CORNER2_Y]-node[ND.CORNER_Y]+1)
        count *= int(node[ND.CORNER2_Z]-node[ND.CORNER_Z]+1)
        return count,1

sample1=open('data_src/2021-day-22-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ result.group(1, 2, 3, 4, 5, 6, 7) for s in lines if (result:= re.match(r'(\w+)\s*x=([\d\-]+)\.\.([\d\-]+),y=([\d\-]+)\.\.([\d\-]+),z=([\d\-]+)\.\.([\d\-]+)', s)) ]
data=[ ({'on': 1, 'off': 0}[row[0]], int(row[1]), int(row[2]), int(row[3]), int(row[4]), int(row[5]), int(row[6]) ) for row in data ]

def main():
    global nodes, first_free_node
    root=nodes[0] # cubtree root node
    first_free_node=1
    size=max_coord(data)
    print(f'{size=}')
    root[ND.CORNER_X]= -size
    root[ND.CORNER_Y]= -size
    root[ND.CORNER_Z]= -size
    root[ND.CORNER2_X]= size
    root[ND.CORNER2_Y]= size
    root[ND.CORNER2_Z]= size
    root[ND.VAL]=0 # starts unlit
    root[ND.FIRST_CHILD]=0
    for row in data:
        maintain_tree(root, row)
    count,nodec=count_tree(root)
    print(f'node stats: allocated={first_free_node}, {nodec=}, abandoned={first_free_node-nodec}')
    print(f'after {row=}, lit {count=}')
    print()

main()
#cProfile.run('main()', sort=1)

# sample3 output last line (after 0.2 secs.):
#node stats: allocated=9518, nodec=8542, abandoned=976
#after row=(0, -93533, -4276, -16170, 68771, -104985, -24507), lit count=2758514936282235

# sample1 from file last line (after 0.9 secs.):
#node stats: allocated=57286, nodec=53076, abandoned=4210
#after row=(1, 69353, 76679, -32520, -24321, -32891, 129), lit count=1285677377848549
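
The flat node layout used in the cell above (eight integers per node, with `val >= 2` meaning `numchildren+1` and the children stored consecutively from `first_child`) can also be walked without recursion. The sketch below uses a plain list of lists as a stand-in for the numpy array so that it runs on its own. `count_lit` and the `tiny` tree are hypothetical and built by hand for illustration only.

```python
# Column indices, mirroring the Nd constants of the notebook
CX, CY, CZ, C2X, C2Y, C2Z, VAL, FIRST_CHILD = range(8)

def count_lit(nodes, root_index=0):
    """Return (lit pixel count, visited node count) using an explicit stack."""
    lit = 0
    visited = 0
    stack = [root_index]
    while stack:
        node = nodes[stack.pop()]
        visited += 1
        val = node[VAL]
        if val >= 2:  # inner node: val-1 children stored consecutively
            first = node[FIRST_CHILD]
            stack.extend(range(first, first + val - 1))
        elif val == 1:  # lit leaf: add its volume (bounds are inclusive)
            # with int32 numpy rows, convert each factor with int() first
            # to avoid overflow, as the notebook's count_tree does
            lit += ((node[C2X] - node[CX] + 1) *
                    (node[C2Y] - node[CY] + 1) *
                    (node[C2Z] - node[CZ] + 1))
    return lit, visited

# Hand-built tree: a 4x4x4 root split into a lit half and an unlit half
tiny = [
    [0, 0, 0, 3, 3, 3, 3, 1],  # root: val=3 means 2 children, starting at index 1
    [0, 0, 0, 1, 3, 3, 1, 0],  # x in 0..1, lit
    [2, 0, 0, 3, 3, 3, 0, 0],  # x in 2..3, unlit
]
lit, visited = count_lit(tiny)
```

Abandoned nodes are never reached from the root, so they are skipped here just as in the recursive version.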


In [None]:
# 2021 day 22 part 2 code third implementation, named 'cubtree', based on an octree / bsptree, 
# but instead of dividing each cube into 8 or 64 equally sized cube children here the cuboids(!) are split along
# the lines of incoming 'work orders' to minimize the number of nodes, into max. 27 children cuboids.
# childset is removed here for simplicity, also less useful because you cannot guarantee a leaf level
# of 4x4x4 cubes. each node has:
# corner, (x,y,z), minimum of all coords in the cube
# corner2, (x,y,z), maximum of all coords in the cube
# val, int, 0 is off, 1 is on
# children, list of n recursive nodes
# (only one of val and children is set)

def max_coord(data): # from data determine largest absolute coord, round up, a power of two not needed here
    size=0
    for row in data:
        row2=list(row)
        row2.pop(0) # remove onoff
        maxc=max([ abs(num) for num in row2 ])
        size=max(size, maxc)
    return size+2

def addnonempty_child(res, x0, x1, y0, y1, z0, z1, val): # add specified cuboid child if non-empty
    if x0<=x1 and y0<=y1 and z0<=z1:
        child={}
        child['corner']=(x0, y0, z0)
        child['corner2']=(x1, y1, z1)
        child['val']=val
        res.append(child)        

def make_children(node, regx0, regx1, regy0, regy1, regz0, regz1): # create children along specified lines
    val=node['val']
    corner=node['corner']
    corner2=node['corner2']
    assert regx1>=regx0
    assert regy1>=regy0
    assert regz1>=regz0
    assert 'children' not in node
    node['children']=[]
    del node['val']
    # generate max 27 non-empty child cuboids out of the node, inside and outside the region (nothing is returned)
    for xpair in [ (corner[0], regx0-1), (regx0, regx1), (regx1+1, corner2[0]) ]:
        for ypair in [ (corner[1], regy0-1), (regy0, regy1), (regy1+1, corner2[1]) ]:
            for zpair in [ (corner[2], regz0-1), (regz0, regz1), (regz1+1, corner2[2]) ]:
                addnonempty_child(node['children'], xpair[0], xpair[1], ypair[0], ypair[1], zpair[0], zpair[1], val)

def overlap_cub(corner, corner2, row): # return overlap between cuboid of corners and row shape, or None
    onoff, rowx0, rowx1, rowy0, rowy1, rowz0, rowz1=row
    assert rowx0<=rowx1
    assert rowy0<=rowy1
    assert rowz0<=rowz1
    onx0,onx1,ony0,ony1,onz0,onz1=(corner[0], corner2[0], corner[1], corner2[1], corner[2], corner2[2])
    x0,x1=restrict(onx0, onx1, rowx0, rowx1)
    if x0 is None or x1 is None:
        return None
    y0,y1=restrict(ony0, ony1, rowy0, rowy1)
    if y0 is None or y1 is None:
        return None
    z0,z1=restrict(onz0, onz1, rowz0, rowz1)
    if z0 is None or z1 is None:
        return None
    exactfit=(x0==onx0 and x1==onx1 and y0==ony0 and y1==ony1 and z0==onz0 and z1==onz1)
    return (x0, x1, y0, y1, z0, z1, exactfit)

def maintain_tree(node, row): # recursively 'paint' the tree
    # determine part of row that fits in node
    rowfit=overlap_cub(node['corner'], node['corner2'], row)
    if rowfit is None: # no overlap
        return
    onoff={'on': 1, 'off': 0}[row[0]]
    if 'val' in node and node['val']==onoff: # repainting what was already painted
        return
    x0, x1, y0, y1, z0, z1, exactfit=rowfit
    if exactfit:
        if 'children' in node:
            del node['children']
        node['val']=onoff
        return
    # create/update children recursively
    if 'children' not in node:
        make_children(node, x0, x1, y0, y1, z0, z1)
    assert 'val' not in node
    assert len(node['children'])>0
    for child in node['children']:
        maintain_tree(child, row)

def count_tree(node): # recursively count the cubtree, return count (of lit pixels), nodes
    if 'children' in node:
        assert 'val' not in node
        count=0
        nodec=1
        for child in node['children']:
            cnt,lf=count_tree(child)
            count+=cnt
            nodec+=lf
        return count,nodec
    elif node['val']==0:
        return 0,1
    else:
        assert node['val']==1
        count=1
        for i in range(3):
            count*= node['corner2'][i]-node['corner'][i]+1
        return count,1

sample1=open('data_src/2021-day-22-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ result.group(1, 2, 3, 4, 5, 6, 7) for s in lines if (result:= re.match(r'(\w+)\s*x=([\d\-]+)\.\.([\d\-]+),y=([\d\-]+)\.\.([\d\-]+),z=([\d\-]+)\.\.([\d\-]+)', s)) ]
data=[ (row[0], int(row[1]), int(row[2]), int(row[3]), int(row[4]), int(row[5]), int(row[6]) ) for row in data ]

root={} # cubtree root node
size=max_coord(data)
print(f'{size=}')
root['corner']=(-size,-size,-size)
root['corner2']=(size,size,size)
root['val']=0 # starts unlit
for row in data:
    maintain_tree(root, row)
    count,nodec=count_tree(root)
print(f'after {row=} nodes={nodec}, lit {count=}')

# sample3 output last line (after 0.3 secs.):
#after row=('off', -93533, -4276, -16170, 68771, -104985, -24507) nodes=8542, lit count=2758514936282235

# sample1 from file last line (after 5.7 secs.):
#after row=('on', 69353, 76679, -32520, -24321, -32891, 129) nodes=53076, lit count=1285677377848549

# could be made to run quite a bit faster still with some of the tricks outlined for the octree below, e.g. 
# instead of using these unsightly (but flexible!) dictionary-based nodes
# encoding each node as (int32 corner_x, int32 corner_y, int32 corner_z, int32 corner2_x, int32 corner2_y, int32 corner2_z,
# int32 val, int32 first_child) where val would be 0=off, 1=on, 2 and higher is numchildren+1,
# would allow storing everything in a single pre-allocated numpy array of e.g. 8x200000 that could always be doubled in size
# when running out of space
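
The idea of doubling the pre-allocated array when it runs out of space could look roughly like this. It is a sketch under assumptions: `ensure_capacity` is a hypothetical helper, not something any implementation in this notebook uses (the array-based version asserts on a fixed `NUMNODES` instead).

```python
import numpy as np

def ensure_capacity(nodes, needed):
    """Return an array with at least `needed` rows, doubling the row count as required.

    Existing rows are copied, new rows are zero-filled. Hypothetical helper.
    """
    capacity = max(nodes.shape[0], 1)
    if needed <= nodes.shape[0]:
        return nodes  # still enough room, nothing to do
    while capacity < needed:
        capacity *= 2
    grown = np.zeros((capacity, nodes.shape[1]), dtype=nodes.dtype)
    grown[:nodes.shape[0]] = nodes
    return grown

nodes = np.zeros((4, 8), dtype=np.int32)
nodes[3, 6] = 1                    # mark the last row as 'on'
nodes = ensure_capacity(nodes, 5)  # needs a fifth row, so the array doubles to 8
```

One caveat: a row view such as `nodes[i]` taken before the call still points into the old array. Code that keeps such views across an allocation would have to pass indices around and re-fetch the row after growing.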


In [None]:
# 2021 day 22 part 2 code second implementation, based on an octree, each node has:
# corner, (x,y,z), minimum of all coords in the octant
# size, int, length of each edge of the cube
# val, string, 'on' or 'off'
# children, list of 8 or 64 recursive nodes
# childset, int bitset of on/off values for 64 children
# (only one of val, children and childset is set)

def max_coord(data): # from data determine largest absolute coord, round up to power of 2
    size=0
    for row in data:
        row2=list(row)
        row2.pop(0) # remove onoff
        maxc=max([ abs(num) for num in row2 ])
        size=max(size, maxc)
    size=2**math.ceil(math.log2(size)) # round up size to power of 2
    return size*2 # round up once more

def make_children(node): # create children, either 8 or 64
    size=node['size']
    val=node['val']
    assert size>=2
    assert 'children' not in node
    node['children']=[]
    del node['val']
    # set nr of children so that there will be no size 2 nodes at the bottom if it can be avoided
    enable64=True
    if size>=16:
        log2size=math.ceil(math.log2(size))
        if log2size%2==1:
            enable64=False
    if size>=4 and enable64: # can even split into 64 children, i.e. skip a level
        size//=4
        for x in (0, size, 2*size, 3*size):
            for y in (0, size, 2*size, 3*size):
                for z in (0, size, 2*size, 3*size):
                    child={}
                    child['corner']=(node['corner'][0]+x, node['corner'][1]+y, node['corner'][2]+z)
                    child['size']=size
                    child['val']=val
                    node['children'].append(child)
    else:
        size//=2
        for x in (0, size):
            for y in (0, size):
                for z in (0, size):
                    child={}
                    child['corner']=(node['corner'][0]+x, node['corner'][1]+y, node['corner'][2]+z)
                    child['size']=size
                    child['val']=val
                    node['children'].append(child)

def overlap_cub(corner, size, row): # validate row, return overlap between cuboid of corner and size and row shape, or None
    onoff, rowx0, rowx1, rowy0, rowy1, rowz0, rowz1=row
    assert onoff in {'on', 'off'}
    assert rowx0<=rowx1
    assert rowy0<=rowy1
    assert rowz0<=rowz1
    onx0,onx1,ony0,ony1,onz0,onz1=(corner[0], corner[0]+size-1, corner[1], corner[1]+size-1, corner[2], corner[2]+size-1)
    x0,x1=restrict(onx0, onx1, rowx0, rowx1)
    if x0 is None or x1 is None:
        return None
    y0,y1=restrict(ony0, ony1, rowy0, rowy1)
    if y0 is None or y1 is None:
        return None
    z0,z1=restrict(onz0, onz1, rowz0, rowz1)
    if z0 is None or z1 is None:
        return None
    exactfit=(x0==onx0 and x1==onx1 and y0==ony0 and y1==ony1 and z0==onz0 and z1==onz1)
    return (x0, x1, y0, y1, z0, z1, exactfit)

def maintain_tree(node, row): # recursively 'paint' the octree
    # determine part of row that fits in node
    rowfit=overlap_cub(node['corner'], node['size'], row)
    if rowfit is None: # no overlap
        #print(f'maintain_tree {node["corner"]=}, {node["size"]=}, {row=} -> None')
        return
    onoff=row[0]
    if 'val' in node and node['val']==onoff: # repainting what was already painted
        return
    x0, x1, y0, y1, z0, z1, exactfit=rowfit
    #print(f'maintain_tree {node["corner"]=}, {node["size"]=}, {row=} -> {rowfit=}')
    if exactfit and ('children' not in node) and ('childset' not in node):
        node['val']=onoff
    elif node['size']==4:
        maintain_childset(node, onoff, x0, x1, y0, y1, z0, z1)
    else: # create/update children recursively
        if 'children' not in node:
            make_children(node)
        assert 'val' not in node
        assert len(node['children'])==8 or len(node['children'])==64
        for child in node['children']:
            maintain_tree(child, row )

max_childset=2**64-1

def maintain_childset(node, onoff, x0, x1, y0, y1, z0, z1): # maintain childset, and convert from/to val,
    # this is simply an int where each of the lowest 64 bits is set if the corresponding pixel is on
    assert node['size']==4
    # create childset if needed
    if 'val' in node:
        assert 'childset' not in node
        assert node['val'] in {'on', 'off'}
        node['childset']= max_childset if node['val']=='on' else 0
        del node['val']
    # update childset, iterate over all 64 coords
    i=0
    corner=node['corner']
    childset=node['childset']
    for x in range(4):
        for y in range(4):
            for z in range(4):
                if corner[0]+x>=x0 and corner[0]+x<=x1 and corner[1]+y>=y0 and corner[1]+y<=y1 and \
                 corner[2]+z>=z0 and corner[2]+z<=z1:
                    if onoff=='on':
                        childset |= (1 << i)
                    else:
                        childset &= ~(1 << i)
                i+=1
    node['childset']=childset

def count_tree(node): # recursively count the octree, return count of lit pixels, count of nodes
    if 'children' in node:
        assert 'val' not in node
        count=0
        nodec=1
        for child in node['children']:
            cnt,lf=count_tree(child)
            count+=cnt
            nodec+=lf
        return count,nodec
    elif 'childset' in node:
        assert 'val' not in node
        count=0
        childset=node['childset']
        for i in range(64):
            if childset & (1 << i):
                count+=1
        return count,1
    elif node['val']=='off':
        return 0,1
    else:
        assert node['val']=='on'
        size=node['size']
        return size*size*size,1

def hist_block_tree1(node, ncount, bcount):
    if 'children' in node:
        assert 'val' not in node
        assert 'childset' not in node
        for child in node['children']:
            hist_block_tree1(child, ncount, bcount)
    elif 'childset' in node:
        assert 'val' not in node
        ncount['childset4']+=1
        bcount['childset4']+=sys.getsizeof(node)+sys.getsizeof(node['size'])+sys.getsizeof(node['corner'])+\
            sys.getsizeof(node['childset'])
    elif node['val']=='off':
        ncount[-node['size']]+=1
        bcount[-node['size']]+=sys.getsizeof(node)+sys.getsizeof(node['size'])+sys.getsizeof(node['corner'])+\
            sys.getsizeof(node['val'])
    else:
        assert node['val']=='on'
        ncount[node['size']]+=1
        bcount[node['size']]+=sys.getsizeof(node)+sys.getsizeof(node['size'])+sys.getsizeof(node['corner'])+\
            sys.getsizeof(node['val'])

def hist_block_tree(root): # histogram of size: count, off sizes are recorded as negative, also hbytes is avg. bytes per node
    ncount=collections.Counter()
    bcount=collections.Counter()
    hist_block_tree1(root, ncount, bcount)
    # bcount is total, adjust to average
    bcount={ k: (v // ncount[k]) for k,v in bcount.items() }
    return ncount, bcount

#sample1=open('data_src/2021-day-22-input.txt').read()
lines=[s for s in sample0d.splitlines() if len(s)>0 ]
data=[ result.group(1, 2, 3, 4, 5, 6, 7) for s in lines if (result:= re.match(r'(\w+)\s*x=([\d\-]+)\.\.([\d\-]+),y=([\d\-]+)\.\.([\d\-]+),z=([\d\-]+)\.\.([\d\-]+)', s)) ]
data=[ (row[0], int(row[1]), int(row[2]), int(row[3]), int(row[4]), int(row[5]), int(row[6]) ) for row in data ]

root={} # tree of octants
size=max_coord(data)
print(f'{size=}')
root['corner']=(-size,-size,-size)
root['size']=size*2
root['val']='off' # starts unlit
for row in data:
    maintain_tree(root, row)
    count,nodec=count_tree(root)
    print(f'after {row=} nodes={nodec} {count=}')
    #hist,hbytes=hist_block_tree(root)
    #print(f'nodes histogram: {dict(hist)}')
    #print(f'node byte sizes: {hbytes}')

# unfortunately this implementation doesn't run fast enough / takes too much memory for data with largish coords :(
# could improve speed by using explicit memory management by using numpy 2-dim arrays for all data,
# allocating e.g. 2**16 tuples at a time,
# allows assigning quite compact data types,
# non-leaf nodes would look like (int32 corner_x, int32 corner_y, int32 corner_z, uint32 size, uint32 val,\
#  uint32 children_block+children_first_index) where val would be 0=off, 1=on, 2=8 children, 3=64 children,
#  children_block would point to either a non-leaf or leaf data block and the 8 or 64 children would be numbered 
#  consecutively in there
# leaf nodes would look like (int32 corner_x, int32 corner_y, int32 corner_z, uint64 childset)
# probably both data blocks would be best to model as two parallel arrays with different data types
# child and node references can be uint32, with first 16 bits identifying the data block / numpy array, which is then
# retrieved via a dict, second 16 bits are the index in the block
# this data layer could then be encapsulated in functions like node_get_size(noderef) and node_set_size(noderef, newsize)
# this implementation would then try to avoid creating lists or dicts on the fly, would not remove nodes once created
# (cannot reclaim anyway in this basic memory allocation approach)
#
# however note that the last working line of sample0d output gives nodes=15177 count=474140,
# so stores on average 31 lit pixels per node. let's say in a more sparse data set this could be improved by a factor of 10,
# so 310 lit pixels per node,
# and that the sample given in part two has already 2758514936282235 lit cubes,
# so would need ca. 8.9E12 nodes, while the optimized octree described above with 32-bit references tops out at 4E9 nodes,
# obviously could surpass that with 64-bit references but on single computers it's not very reasonable to assume storing
# orders of magnitude more nodes than 4E9 in RAM anyway (leaf nodes described above would take at least 20 bytes each,
# non-leaf nodes 24 bytes, so 32 GB of RAM would fit ca. 1.6E9 nodes not counting overhead),
# so would be an interesting experiment to try this, but seems unlikely this octree would fit in memory
# (luckily the bsptree / 'cubtree' implemented above is much more efficient and works like a charm)

# sample0c output:
#size=64
#after row=('on', 1, 32, 1, 32, 1, 32) leafs=26027 count=32768
#and hist=Counter({-1: 13888, 1: 10816, -4: 999, 4: 279, -16: 37, -64: 7, 16: 1})

# sample0d output:
#size=262144
#after row=('on', -5, 47, -31, 22, -19, 33) nodes=7561 count=151686
#after row=('on', -44, 5, -27, 21, -14, 35) nodes=9353 count=248314
#after row=('on', -49, -1, -11, 42, -10, 38) nodes=11145 count=310956
#after row=('on', -20, 34, -40, 6, -44, 1) nodes=13577 count=389786
#after row=('off', 26, 39, 40, 50, -2, 11) nodes=13577 count=389786
#after row=('on', -41, 5, -41, 6, -36, 8) nodes=14217 count=421952
#after row=('off', -43, -33, -45, -28, 7, 25) nodes=14217 count=421700
#after row=('on', -33, 15, -32, 19, -34, 11) nodes=14601 count=433638
#after row=('off', 35, 47, -46, -34, -11, 5) nodes=14601 count=433638
#after row=('on', -14, 36, -6, 44, -16, 29) nodes=15177 count=474140
#after row=('on', -57795, -6158, 29564, 72030, 20435, 90618) len(onz)=601 count=153907262308204
#after row=('on', 36731, 105352, -21140, 28532, 16094, 90401) len(onz)=602 count=407198014618852
# this cuboid initial implementation crashes the IDE after ca. 5 minutes of working on the first line with sparse coords
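
The `childset` of a size-4 leaf is a 64-bit bitset with one bit per pixel. The sketch below isolates that idea: the bit index follows the same x-outer, z-inner order as the loops in `maintain_childset`, and the lit count is a population count. The names `bit_index`, `paint` and `popcount` are hypothetical, and the coordinates here are local to the leaf (0..3) rather than absolute.

```python
FULL = 2**64 - 1  # all 64 pixels lit

def bit_index(x, y, z):
    """Bit position of local pixel (x, y, z), each coordinate in 0..3."""
    return x * 16 + y * 4 + z

def paint(childset, on, x0, x1, y0, y1, z0, z1):
    """Set or clear all bits in the inclusive local box and return the new bitset."""
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            for z in range(z0, z1 + 1):
                bit = 1 << bit_index(x, y, z)
                childset = (childset | bit) if on else (childset & ~bit)
    return childset

def popcount(childset):
    """Number of lit pixels; on Python >= 3.10 childset.bit_count() does the same."""
    return bin(childset).count('1')

corner = paint(0, True, 0, 1, 0, 1, 0, 1)         # light a 2x2x2 corner
dented = paint(corner, False, 1, 1, 1, 1, 1, 1)   # switch one pixel off again
sliced = paint(FULL, False, 0, 3, 0, 3, 0, 0)     # clear the whole z=0 plane
```

A population count avoids the 64-iteration loop that `count_tree` uses for every childset leaf.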


In [None]:
# 2021 day 22 part 2 code initial (and cleaned up) implementation, based on splitting cuboids and work orders

def addnonempty(res, x0, x1, y0, y1, z0, z1): # add specified cuboid if non-empty
    if x0<=x1 and y0<=y1 and z0<=z1:
        res.add( (x0,x1,y0,y1,z0,z1) )

def split27B(regx0,regx1,regy0,regy1,regz0,regz1, blx0, blx1, bly0, bly1, blz0, blz1):
    # generate and return max 27 non-empty parts out of block, in and outside region
    res=set()
    for xpair in [ (blx0, regx0-1), (regx0, regx1), (regx1+1, blx1) ]:
        for ypair in [ (bly0, regy0-1), (regy0, regy1), (regy1+1, bly1) ]:
            for zpair in [ (blz0, regz0-1), (regz0, regz1), (regz1+1, blz1) ]:
                addnonempty(res, xpair[0], xpair[1], ypair[0], ypair[1], zpair[0], zpair[1])
    return res

def split27(onzitem, newonz, rowitem, newrows): # return False if no overlap or shapes exactly the same,
    # else split both shapes in max 27 parts (putting in newonz/newrows) & return True
    if split27_same(onzitem, rowitem):
        return False
    onx0,onx1,ony0,ony1,onz0,onz1=onzitem
    onoff, rowx0, rowx1, rowy0, rowy1, rowz0, rowz1=rowitem
    assert onx0<=onx1
    assert ony0<=ony1
    assert onz0<=onz1
    x0,x1=restrict(onx0, onx1, rowx0, rowx1)
    if x0 is None or x1 is None:
        return False
    y0,y1=restrict(ony0, ony1, rowy0, rowy1)
    if y0 is None or y1 is None:
        return False
    z0,z1=restrict(onz0, onz1, rowz0, rowz1)
    if z0 is None or z1 is None:
        return False
    if newonz is not None:
        shaperows=split27B(x0,x1,y0,y1,z0,z1,onx0,onx1,ony0,ony1,onz0,onz1)
        for x in shaperows:
            newonz.add(x)
    if newrows is not None:
        shaperows=split27B(x0,x1,y0,y1,z0,z1,rowx0, rowx1, rowy0, rowy1, rowz0, rowz1)
        for row in shaperows:
            row2=[onoff,]
            row2.extend(row)
            row2=tuple(row2)
            newrows.add(row2)
    return True

def split27_same(onzitem, rowitem): # return True if shapes are the same
    onx0,onx1,ony0,ony1,onz0,onz1=onzitem
    onoff, rowx0, rowx1, rowy0, rowy1, rowz0, rowz1=rowitem
    return (onx0==rowx0 and onx1==rowx1 and ony0==rowy0 and ony1==rowy1 and onz0==rowz0 and onz1==rowz1)

def split27_overlap(onzitem, rowitem): # return True if shapes overlap
    onx0,onx1,ony0,ony1,onz0,onz1=onzitem
    onoff, rowx0, rowx1, rowy0, rowy1, rowz0, rowz1=rowitem
    assert onx0<=onx1
    assert ony0<=ony1
    assert onz0<=onz1
    x0,x1=restrict(onx0, onx1, rowx0, rowx1)
    if x0 is None or x1 is None:
        return False
    y0,y1=restrict(ony0, ony1, rowy0, rowy1)
    if y0 is None or y1 is None:
        return False
    z0,z1=restrict(onz0, onz1, rowz0, rowz1)
    if z0 is None or z1 is None:
        return False
    return True

def maintain_onz_basic(onz, rowitem, do_check): # maintain onz, there is either 100% or no overlap
    onoff, rowx0, rowx1, rowy0, rowy1, rowz0, rowz1=rowitem
    assert onoff in {'on', 'off'}
    assert rowx0<=rowx1
    assert rowy0<=rowy1
    assert rowz0<=rowz1
    # first the check - there is either 100% or no overlap
    if do_check:
        for onzitem in onz:
            if not split27_overlap(onzitem, rowitem):
                continue
            if not split27_same(onzitem, rowitem):
                print(f'basic overlap error: {rowitem=} {onzitem=}')
                assert False
    # onoff is off: just delete all overlapping ons
    # onoff is on: search for overlapping one, if not found add
    if onoff=='on':
        onz.add( (rowx0, rowx1, rowy0, rowy1, rowz0, rowz1) )
    else:
        onz.discard( (rowx0, rowx1, rowy0, rowy1, rowz0, rowz1) )

def print_onz_rows(onz, rows): # print current list of onz and (reboot step) rows
    print('print_onz_rows onz:')
    for onzrow in onz:
        print(onzrow)
    print('rows:')
    for rw in rows:
        print(rw)

def maintain_onz(onz, rowitem0):
    # match rowitem0 with all shapes in onz, splitting both until there is either 100% or no overlap,
    # then apply split row to split onz in basic fashion (all or nothing)
    onoff, x0, x1, y0, y1, z0, z1=rowitem0
    assert onoff in {'on', 'off'}
    assert x0<=x1
    assert y0<=y1
    assert z0<=z1
    # outsiders are temporarily removed from onz
    outsideronz=set()
    for onzitem in onz:
        if not split27_overlap(onzitem, rowitem0):
            outsideronz.add(onzitem)
    onz.difference_update(outsideronz)
    # now split onz and rowitem0
    rows=set()
    rows.add(rowitem0)
    pairtodo=set() # (onzitem,rowitem), we only have to do each pair once
    for onzitem in onz:
        pairtodo.add( (onzitem, rowitem0) )
    while len(pairtodo)>0:
        pt=pairtodo.pop()
        onzitem, rowitem=pt
        if (onzitem not in onz) or (rowitem not in rows):
            continue
        newrows=set()
        newonz=set()
        if split27(onzitem, newonz, rowitem, newrows):
            assert len(newonz)>0
            assert len(newrows)>0
            rows.discard(rowitem)
            rows.update(newrows)
            onz.discard(onzitem)
            onz.update(newonz)
            for rowitem2 in newrows:
                for onzitem2 in onz:
                    pt=(onzitem2, rowitem2)
                    pairtodo.add( pt )
            for rowitem2 in rows:
                for onzitem2 in newonz:
                    pt=(onzitem2, rowitem2)
                    pairtodo.add( pt )
    # add back outsiders
    onz.update(outsideronz)
    #print(f'doing {row=}:') # DEBUG
    #print_onz_rows(onz, rows)
    for rw in rows:
        maintain_onz_basic(onz, rw, do_check=False) # DEBUG set do_check to True!
    #print('after maintain_onz_basic:') # DEBUG
    #print_onz_rows(onz, rows)

def count_onz(onz): # assuming onz is a set of non-overlapping lit cuboids, count and return the lit pixels
    count=0
    for onzrow in onz:
        onx0,onx1,ony0,ony1,onz0,onz1=onzrow
        assert onx0<=onx1
        assert ony0<=ony1
        assert onz0<=onz1
        count+=(onx1+1-onx0)*(ony1+1-ony0)*(onz1+1-onz0)
    return count

onz=set() # set of non-overlapping (x0, x1, y0, y1, z0, z1) regions/cuboids that are switched on
for row in data:
    maintain_onz(onz, row)
    count=count_onz(onz)
    print(f'after {row=} {len(onz)=} {count=}')

# 1285677377848549 after 5 min 13 sec.

Conclusion for day 22: efficiently calculating cuboid intersections in a sparse space is non-trivial, and even this 'quick' cuboid-splitting solution took a very long time. It's not pretty (the initial implementation in particular), but it works. A few minor optimizations are possible (for 'off' you don't actually have to split the row/reboot step), but making it really efficient would require maintaining some kind of index / space bisection, which leads into the area of octrees.

Later I wrote an octree-based implementation that divides each cube of space into 8 or 64 child cubes. It is quite elegant, but unfortunately it does not handle data sets with large / sparse coords very gracefully and needs a surprising number of nodes. I then wrote a third tree-based implementation, named 'cubtree', which splits cuboids into at most 27 child cuboids along the lines of the incoming reboot steps. This one needs a very reasonable number of nodes and run time (even while still using inefficient dictionary-based nodes) and is the simplest in lines of code as well. Nice. The fourth implementation uses the same cuboid tree, but stores it in a numpy array instead of dictionaries. It runs within a second, so it is again 5 to 6 times faster.
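
The split into at most 27 cuboids that both the initial implementation and the cubtree rely on can be shown in isolation. This is a minimal sketch, assuming inclusive `(x0, x1, y0, y1, z0, z1)` tuples and a region that already lies inside the block (i.e. it has been clipped first). `split_around` and `volume` are hypothetical names.

```python
def split_around(block, region):
    """Cut `block` along the faces of `region`, returning the non-empty parts.

    Both arguments are inclusive (x0, x1, y0, y1, z0, z1) tuples, and region
    is assumed to lie inside block. Per axis there are three candidate ranges:
    before the region, the region itself, and after it, so at most 3*3*3 parts.
    """
    bx0, bx1, by0, by1, bz0, bz1 = block
    rx0, rx1, ry0, ry1, rz0, rz1 = region
    parts = []
    for xs in ((bx0, rx0 - 1), (rx0, rx1), (rx1 + 1, bx1)):
        for ys in ((by0, ry0 - 1), (ry0, ry1), (ry1 + 1, by1)):
            for zs in ((bz0, rz0 - 1), (rz0, rz1), (rz1 + 1, bz1)):
                if xs[0] <= xs[1] and ys[0] <= ys[1] and zs[0] <= zs[1]:
                    parts.append((*xs, *ys, *zs))
    return parts

def volume(cuboid):
    """Number of unit pixels in an inclusive cuboid."""
    x0, x1, y0, y1, z0, z1 = cuboid
    return (x1 - x0 + 1) * (y1 - y0 + 1) * (z1 - z0 + 1)

block = (0, 9, 0, 9, 0, 9)
centred = split_around(block, (3, 5, 3, 5, 3, 5))  # region strictly inside
corner = split_around(block, (0, 4, 0, 4, 0, 4))   # region touching three faces
```

The parts never overlap and together cover the block exactly, so their volumes add up to the block volume. A region that touches a face of the block produces fewer than 27 parts, because the empty ranges are dropped.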

In [None]:
# 2021 day 21 part 1
# mv ~/Downloads/input data_src/2021-day-21-input.txt
# big input file looks like: 
# idea: part 1 parse ..., then ...

sample1='''
Player 1 starting position: 4
Player 2 starting position: 8
'''

def roll(pos, score, dice): # one turn simulated (three rolls of the deterministic die), returns (pos, score, dice)
    rolled=0
    for i in range(3):
        rolled+=dice
        dice+=1
        if dice>100:
            dice=1
    pos+=rolled
    pos=((pos-1) % 10)+1
    score+=pos
    return (pos, score, dice)

sample1=open('data_src/2021-day-21-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ int(s.split(':')[1]) for s in lines ]
#data
dice=1
p1pos, p1score=(data[0], 0)
p2pos, p2score=(data[1], 0)
trolled=0
while True:
    p1pos, p1score, dice=roll(p1pos, p1score, dice)
    trolled+=3
    #print(f'{p1pos=} {p1score=} {dice=}')
    if p1score>=1000:
        print(f'end {trolled*p2score}')
        break
    p2pos, p2score, dice=roll(p2pos, p2score, dice)
    trolled+=3
    #print(f'{p2pos=} {p2score=} {dice=}')
    if p2score>=1000:
        print(f'end {trolled*p1score}')
        break

In [None]:
# 2021 day 21 part 2

sample1='''
Player 1 starting position: 4
Player 2 starting position: 8
'''

def three_dice():
    res=collections.Counter()
    for a in range(1, 4):
        for b in range(1, 4):
            for c in range(1, 4):
                res[a+b+c]+=1
    return res

def roll_p1(univ, thrdc):
    res=collections.Counter()
    for key,ucount in univ.items():
        p1pos,p1score,p2pos,p2score=key
        if p1score>=21 or p2score>=21: # as is
            res[ (p1pos,p1score,p2pos,p2score) ] +=ucount
            continue
        for rolled,uc2 in thrdc.items():
            pos=p1pos+rolled
            pos=((pos-1) % 10)+1
            score=p1score+pos
            res[ (pos,score,p2pos,p2score) ] +=ucount*uc2
    return res

def roll_p2(univ, thrdc):
    res=collections.Counter()
    for key,ucount in univ.items():
        p1pos,p1score,p2pos,p2score=key
        if p1score>=21 or p2score>=21: # as is
            res[ (p1pos,p1score,p2pos,p2score) ] +=ucount
            continue
        for rolled,uc2 in thrdc.items():
            pos=p2pos+rolled
            pos=((pos-1) % 10)+1
            score=p2score+pos
            res[ (p1pos,p1score,pos,score) ] +=ucount*uc2
    return res

def finished(univ): # are any games done, and how many universes does the winner have?
    p1count=0 # how many has p1 won?
    p2count=0 # how many has p2 won?
    for key,ucount in univ.items():
        p1pos,p1score,p2pos,p2score=key
        if p1score<21 and p2score<21:
            return None
        if p1score>=21:
            p1count+=ucount
        if p2score>=21:
            p2count+=ucount
    return max(p1count, p2count)

sample1=open('data_src/2021-day-21-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ int(s.split(':')[1]) for s in lines ]
thrdc=three_dice()
univ=collections.Counter() # counts universes of (p1pos, p1score, p2pos, p2score)
univ[ (data[0], 0, data[1], 0) ]+=1
trolled=0
print(f'turn {trolled=} {univ=}')
while True:
    univ=roll_p1(univ, thrdc)
    trolled+=1
    #print(f'turn {trolled=} with {thrdc=} got {univ=}')

    univ=roll_p2(univ, thrdc)
    trolled+=1
    #print(f'turn {trolled=} with {thrdc=} got {univ=}')
    score=finished(univ)
    if score is not None:
        print(f'{score=}')
        break
    #if trolled>5:
    #    break

# 146854918035875

Conclusion for day 21: the part 2 implementation was almost perfect pretty quickly, except for one overenthusiastic multiplication step. Finding that step took way too long. How to improve next time: when debugging, critically examine every step, not only at the code level but also at the design level.
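
As a design-level check of where multiplication belongs, here is a small sketch of a single turn. The names `ROLL_WEIGHTS` and `take_turn` are made up for this illustration, and the handling of finished games is left out on purpose. Each universe count is multiplied exactly once per turn, by the number of dice outcomes that produce the rolled total, so the total number of universes must grow by exactly a factor of 27 per turn.

```python
import collections
import itertools

# How many of the 27 outcomes of three 3-sided dice give each total.
ROLL_WEIGHTS = collections.Counter(
    sum(t) for t in itertools.product((1, 2, 3), repeat=3))

def take_turn(univ, player):
    """Advance every universe by one turn of `player` (0 or 1).

    univ maps (p1pos, p1score, p2pos, p2score) to a universe count.
    Finished games are NOT filtered out here, this only shows the weighting.
    """
    res = collections.Counter()
    for state, ucount in univ.items():
        pos, score = state[2 * player], state[2 * player + 1]
        for rolled, weight in ROLL_WEIGHTS.items():
            new_pos = (pos + rolled - 1) % 10 + 1
            moved = list(state)
            moved[2 * player] = new_pos
            moved[2 * player + 1] = score + new_pos
            res[tuple(moved)] += ucount * weight  # multiply exactly once per turn
    return res
```

A cheap invariant to assert while debugging: as long as no game has finished, `sum(univ.values())` equals 27 to the power of the number of turns taken.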

In [None]:
# 2021 day 20
# mv ~/Downloads/input data_src/2021-day-20-input.txt
# big input file looks like: big input 80x80
# idea: part 1 parse into two blocks, then as binary, then repeat copy into new, tricky part is proper extending and clipping 
# using edge chars (and the first and last char of algo determine whether there will be an infinite number of lit pixels 
# after an even number of enhance steps)

sample1='''
..#.#..#####.#.#.#.###.##.....###.##.#..###.####..#####..#....#..#..##..###..######.###...####..#..#####..##..#.#####...##.#.#..#.##..#.#......#.###.######.###.####...#.##.##..#..#..#####.....#.#....###..#.##......#.....#..#..#..##..#...##.######.####.####.#.#...#.......#..#.#.#...####.##.#......#..#...##.#.##..#...##.#.##..###.#......#.#.......#.#.#.####.###.##...#.....####.#..#..#.##.#....##..#.####....##...##..#...#......#.#.......#.......##..####..#...#.#.#...##..#.#..###..#####........#..####......#..#

#..#.
#....
##..#
..#..
..###
'''

def extend_input(input, first=False): # extend the image with a border character to simulate an infinite board,
    # pads 6 rows and columns on every side (enhance clips only 1 per side), so after enhance the corner
    # character input[0][0] is always pure background and can be reused as the next border character
    c='.' if first else input[0][0]
    res=[]
    cols=len(input[0])
    emptyrow=c * (cols+12)
    emptyside=c * 6
    for i in range(6):
        res.append(emptyrow)
    for row in input:
        res.append(emptyside+row+emptyside)
    for i in range(6):
        res.append(emptyrow)
    return res

def print_input(lbl, input): # display input with a label
    print(f'{lbl}:')
    for row in input:
        print(row)

def enhance(input, algo): # enhance the image using the algo lookup,
    # output will already be clipped - one row/col smaller on each side
    output=[]
    for j in range(len(input)-2):
        row=[]
        for i in range(len(input[0])-2):
            code=input[j][i:i+3]+input[j+1][i:i+3]+input[j+2][i:i+3]
            code=code.replace('.', '0').replace('#', '1')
            index=int(code, 2)
            c=algo[index]
            row.append(c)
        output.append(''.join(row))
    return output

def count_lit(input): # count lit spots
    count=0
    for row in input:
        for c in row:
            if c=='#':
                count+=1
    return count

sample1=open('data_src/2021-day-20-input.txt').read()
lines=[s for s in sample1.splitlines() ]
groups=get_line_groups(lines)
algo=groups[0][0]
print(f'{algo=}')
assert len(algo)==512
input=groups[1]

first=True
for i in range(50):
    input=extend_input(input, first)
    first=False
    input=enhance(input, algo)

#print_input('B', input)
count=count_lit(input)
print(f'{count=}')

# 17325


Conclusion for day 20: the initial solution was a bit messy, although it worked ok, so cleaned up a bit to avoid an explicit clip step and to make extend look a bit more efficient (although the whole didn't get much faster). The initial solution could have gone a bit faster with a little bit more design and neater working from the start (to avoid some wrong answers for part 1), but it was ok enough.
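
The edge handling can also be reasoned about separately from the image. A 3x3 window of pure background is either all '.' (index 0) or all '#' (index 511), so the infinite background after a step only depends on those two entries of the algo string. A minimal sketch, with `next_background` being a made-up helper name that is not used in the code above:

```python
def next_background(background, algo):
    """Return what an all-background 3x3 window turns into after one enhance step."""
    # nine '.' pixels read as binary 000000000 = 0, nine '#' pixels as 111111111 = 511
    return algo[0] if background == '.' else algo[511]

# usage sketch: an algo that starts with '#' and ends with '.' makes the background blink
blinking_algo = '#' + '.' * 511
bg = '.'
history = []
for step in range(4):
    bg = next_background(bg, blinking_algo)
    history.append(bg)
```

With such a blinking algo the background is lit after every odd step, which is why the count of lit pixels is only finite after an even number of steps.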

In [None]:
# 2021 day 19
# mv ~/Downloads/input data_src/2021-day-19-input.txt
# big input file looks like: long list of 36 scanners
# idea: part 1 parse w/regions, should be contiguous, then generate 24 alternative orientations,
# then look for 12 overlapping beacons with each other scanner by taking diff between first beacon and all of other
# scanner, or second, etc., and seeing how many then overlap, have to determine scanner orientations


sample1='''
--- scanner 0 ---
404,-588,-901
528,-643,409
-838,591,734
390,-675,-793
-537,-823,-458
-485,-357,347
-345,-311,381
-661,-816,-575
-876,649,763
-618,-824,-621
553,345,-567
474,580,667
-447,-329,318
-584,868,-557
544,-627,-890
564,392,-477
455,729,728
-892,524,684
-689,845,-530
423,-701,434
7,-33,-71
630,319,-379
443,580,662
-789,900,-551
459,-707,401

--- scanner 1 ---
686,422,578
605,423,415
515,917,-361
-336,658,858
95,138,22
-476,619,847
-340,-569,-846
567,-361,727
-460,603,-452
669,-402,600
729,430,532
-500,-761,534
-322,571,750
-466,-666,-811
-429,-592,574
-355,545,-477
703,-491,-529
-328,-685,520
413,935,-424
-391,539,-444
586,-435,557
-364,-763,-893
807,-499,-711
755,-354,-619
553,889,-390

--- scanner 2 ---
649,640,665
682,-795,504
-784,533,-524
-644,584,-595
-588,-843,648
-30,6,44
-674,560,763
500,723,-460
609,671,-379
-555,-800,653
-675,-892,-343
697,-426,-610
578,704,681
493,664,-388
-671,-858,530
-667,343,800
571,-461,-707
-138,-166,112
-889,563,-600
646,-828,498
640,759,510
-630,509,768
-681,-892,-333
673,-379,-804
-742,-814,-386
577,-820,562

--- scanner 3 ---
-589,542,597
605,-692,669
-500,565,-823
-660,373,557
-458,-679,-417
-488,449,543
-626,468,-788
338,-750,-386
528,-832,-391
562,-778,733
-938,-730,414
543,643,-506
-524,371,-870
407,773,750
-104,29,83
378,-903,-323
-778,-728,485
426,699,580
-438,-605,-362
-469,-447,-387
509,732,623
647,635,-688
-868,-804,481
614,-800,639
595,780,-596

--- scanner 4 ---
727,592,562
-293,-554,779
441,611,-461
-714,465,-776
-743,427,-804
-660,-479,-426
832,-632,460
927,-485,-438
408,393,-506
466,436,-512
110,16,151
-258,-428,682
-393,719,612
-211,-452,876
808,-476,-593
-575,615,604
-485,667,467
-680,325,-822
-627,-443,-432
872,-547,-609
833,512,582
807,604,487
839,-516,451
891,-625,532
-652,-548,-490
30,-46,-14
'''

sample2='''
--- scanner 0 ---
-1,-1,1
-2,-2,2
-3,-3,3
-2,-3,1
5,6,-4
8,0,7
'''

sample1=open('data_src/2021-day-19-input.txt').read()

In [None]:
# 2021 day 19 code

def gen_orient_append1(res, count, d, e, f):
    gen_orient_append2(res, count, d, e, -f)
    gen_orient_append2(res, count, d, e, f)
    gen_orient_append2(res, count, d, -e, -f)
    gen_orient_append2(res, count, d, -e, f)
    gen_orient_append2(res, count, -d, e, -f)
    gen_orient_append2(res, count, -d, e, f)
    gen_orient_append2(res, count, -d, -e, -f)
    gen_orient_append2(res, count, -d, -e, f)

def gen_orient_append2(res, count, d, e, f):
    i=count['index']
    count['index']+=1
    while len(res)<i+1:
        res.append([])
    res[i].append( (d, e, f) )

def generate_orients(group): # generate all 24 orientations as lists of beacon coords, actually for now generate all 48...
    # 6 combinations of axis orders, each mirrored pos. and neg.
    # (because the overlaps we're looking for are so sparse, it turned out it didn't matter that we're generating a
    # bunch of invalid mirrored orientations here)
    res=[]
    count=collections.Counter()
    for tup in group:
        a, b, c = tup
        count['index']=0
        gen_orient_append1(res, count, a, b, c)
        gen_orient_append1(res, count, a, c, b)
        gen_orient_append1(res, count, b, a, c)
        gen_orient_append1(res, count, b, c, a)
        gen_orient_append1(res, count, c, a, b)
        gen_orient_append1(res, count, c, b, a)
    return res

def tup_sub(tup1, tup2): # return tup1-tup2 as tuple
    res=[]
    for i in range(len(tup1)):
        res.append(tup1[i]-tup2[i])
    return tuple(res)

def tup_add(tup1, tup2): # return tup1+tup2 as tuple
    res=[]
    for i in range(len(tup1)):
        res.append(tup1[i]+tup2[i])
    return tuple(res)

def manhattan(tup1, tup2): # return sum(abs(tup1-tup2))
    res=[]
    for i in range(len(tup1)):
        res.append(abs(tup1[i]-tup2[i]))
    return sum(res)

def max_overlap(group1, group2): # for these two groups in assumed same orientation, find a translation that 
    # produces max overlap, diff between first beacon and all of others, see how many overlap, etc.
    # return (numoverlap, (translation_x, translation_y, translation_z))
    count=collections.Counter()
    for gr1 in group1:
        for gr2 in group2:
            tr=tup_sub(gr1, gr2)
            count[tr]+=1
    max_count=max(count.values())
    trs=[ tr for tr,cnt in count.items() if cnt==max_count ]
    return max_count, trs[0]

def discover_beacons(group2, tr): # after applying tr to group2, return set
    res=set()
    for tup in group2:
        trtup=tup_add(tup, tr)
        res.add(trtup)
    return res

# MAIN
lines=[s for s in sample1.splitlines() ]
groups=get_line_groups(lines)
for i in range(len(groups)):
    group=groups[i]
    assert group[0].startswith('--- scanner')
    group.pop(0)
    groups[i]= [ tuple( [ int(x) for x in s.split(',') ] ) for s in group ]
# for each combination of groups A,B - for B generate all orients, 
# then find max overlap with A, if at least 12 then there is enough overlap, take that orientation for B for remainder,
# continue like this and now all should be in same orientation, and you can pool all beacons

print(f'{len(groups)=}')
all_beacons=set()
for beac in groups[0]:
    all_beacons.add(beac)
oriented= {0: (0, 0, 0) } # map of oriented scanners with their translation to scanner 0
print(f'oriented 0, now {len(all_beacons)=}, {oriented=}')
while len(oriented)<len(groups):
    anyfound=False
    for group_ai in list(oriented.keys()):
        for group_bi in range(len(groups)):
            if group_ai==group_bi or group_bi in oriented:
                continue
            group_a=groups[group_ai]
            group_b=groups[group_bi]
            orients_b=generate_orients(group_b)
            overlaps=[]
            for orb in orients_b:
                overlaps.append(max_overlap(group_a, orb))
            overlap_max=max([ tup[0] for tup in overlaps])
            if overlap_max>=12: # found orientation of B
                found=False
                for i in range(len(overlaps)):
                    if overlaps[i][0]==overlap_max:
                        if found:
                            assert False # multiple max overlaps
                        tr=tup_add(overlaps[i][1], oriented[group_ai]) # translation to group 0
                        groups[group_bi]=orients_b[i] # orient group B to orientation of A so indirectly group 0
                        oriented[group_bi]=tr
                        for beac in discover_beacons(groups[group_bi], tr):
                            all_beacons.add(beac)
                        found=True
                        anyfound=True
                assert found
                print(f'oriented {group_bi} to {group_ai} using {overlap_max} beacons, now {len(all_beacons)=}')
    assert anyfound # otherwise infinite loop
print(f'{len(all_beacons)=}')
maxdist=0
for scanner1 in oriented.values():
    for scanner2 in oriented.values():
        dist=manhattan(scanner1, scanner2)
        if dist>maxdist:
            maxdist=dist
print(f'{maxdist=}')

# not 10618 but 10918
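
Side note on the 48 versus 24 orientations mentioned in `generate_orients`: a signed axis permutation is a proper rotation (no mirroring) exactly when its determinant is +1, which is the parity of the permutation times the product of the three signs. The sketch below filters on that; `perm_parity`, `rotations` and `apply_rotation` are made-up names, and this is an alternative to the code above, not a description of it.

```python
import itertools

def perm_parity(perm):
    """Return +1 for an even permutation of (0, 1, 2) and -1 for an odd one."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def rotations():
    """Return the 24 proper rotations as (axis permutation, signs) pairs."""
    res = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1, -1), repeat=3):
            # determinant of the signed permutation matrix; +1 means no mirroring
            if perm_parity(perm) * signs[0] * signs[1] * signs[2] == 1:
                res.append((perm, signs))
    return res

def apply_rotation(rot, point):
    """Rotate one (x, y, z) tuple."""
    perm, signs = rot
    return tuple(signs[i] * point[perm[i]] for i in range(3))
```

Usage would be something like `[apply_rotation(rot, beacon) for beacon in group]` for each `rot` in `rotations()`, which halves the number of candidate orientations to compare.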

In [None]:
# 2021 day 18
# mv ~/Downloads/input data_src/2021-day-18-input.txt
# big input file looks like: short lines
# idea: part 1 parse w/eval, enumerate to dicts of {index: num, val: num}, then find recursive

sample1='''
[[[0,[5,8]],[[1,7],[9,6]]],[[4,[1,2]],[[1,4],2]]]
[[[5,[2,8]],4],[5,[[9,9],0]]]
[6,[[[6,2],[5,6]],[[7,6],[4,7]]]]
[[[6,[0,7]],[0,9]],[4,[9,[9,0]]]]
[[[7,[6,4]],[3,[1,3]]],[[[5,5],1],9]]
[[6,[[7,3],[3,2]]],[[[3,8],[5,7]],4]]
[[[[5,4],[7,7]],8],[[8,3],8]]
[[9,3],[[9,9],[6,[4,9]]]]
[[2,[[7,7],7]],[[5,8],[[9,3],[0,2]]]]
[[[[5,2],5],[8,[3,7]]],[[5,[7,5]],[4,4]]]
'''

def enumerate_tree(tree, count): # assign numbers to the tree leaves, also replace ints by dicts
    if isinstance(tree, int):
        assert False
    if isinstance(tree, dict):
        tree['index']=count['index']
        count['index']+=1
        return
    if isinstance(tree[0], int):
        tree[0]={'val': tree[0]}
    enumerate_tree(tree[0], count)
    if isinstance(tree[1], int):
        tree[1]={'val': tree[1]}
    enumerate_tree(tree[1], count)

def add_tree(tree, index, val): # add a number to the specified leaf
    if isinstance(tree, int):
        assert False
    if isinstance(tree, dict):
        if tree['index']==index:
            tree['val']+=val
        return
    add_tree(tree[0], index, val)
    add_tree(tree[1], index, val)

def explode(tree, top, depth): # return (did-we-explode?, must-replace-me-with-zero?)
    # tree is a pair [left, right]; each side is either a leaf dict or a nested pair
    if isinstance(tree, int):
        assert False
    if isinstance(tree, dict):
        assert False
    if isinstance(tree[0], dict) and isinstance(tree[1], dict): # pair
        if depth<4:
            return False, False
        add_tree(top, tree[0]['index']-1, tree[0]['val'])
        add_tree(top, tree[1]['index']+1, tree[1]['val'])
        return True, True
    if isinstance(tree[0], list):
        expl,repl=explode(tree[0], top, depth+1)
        if repl:
            tree[0]={'val': 0}
        if expl:
            return True, False
    if isinstance(tree[1], list):
        expl,repl=explode(tree[1], top, depth+1)
        if repl:
            tree[1]={'val': 0}
        if expl:
            return True, False
    return False,False

def split_node(num): # return pair for leaf split
    return [{'val': num//2}, {'val': (num+1)//2}]

def split(tree): # split leaves >=10, return did-we-split?
    if isinstance(tree, int):
        assert False
    if isinstance(tree, dict):
        assert False
    if isinstance(tree[0], dict):
        if tree[0]['val']>=10:
            tree[0]=split_node(tree[0]['val'])
            return True
    if isinstance(tree[0], list):
        if split(tree[0]):
            return True
    if isinstance(tree[1], dict):
        if tree[1]['val']>=10:
            tree[1]=split_node(tree[1]['val'])
            return True
    if isinstance(tree[1], list):
        if split(tree[1]):
            return True
    return False

def addition(tree1, tree2):
    return [tree1, tree2]

def printable(tree):
    if isinstance(tree, int):
        return tree
    if isinstance(tree, dict):
        return tree['val']
    return [printable(tree[0]), printable(tree[1])]

def magnitude(tree):
    if isinstance(tree, int):
        return tree
    if isinstance(tree, dict):
        return tree['val']
    return 3*magnitude(tree[0]) + 2*magnitude(tree[1])

sample1=open('data_src/2021-day-18-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
#lines=[ eval(s) for s in lines ]
maxmagn=None
for tree1 in lines:
    for tree2 in lines:
        if tree1==tree2:
            continue
        tree1e=eval(tree1)
        tree2e=eval(tree2)
        tree=addition(tree1e, tree2e)
        #print(f'after addi: {printable(tree)}')
        while True:
            enumerate_tree(tree, collections.Counter())
            expl,repl=explode(tree, tree, 0)
            dosplit=False
            if expl:
                #print(f'after expl: {printable(tree)}')
                pass
            else:
                dosplit=split(tree)
                if dosplit:
                    #print(f'after splt: {printable(tree)}')
                    pass
            if not (expl or dosplit):
                break
        magn=magnitude(tree)
        if maxmagn is None or magn>maxmagn:
            maxmagn=magn
#print(f'final: {printable(tree)}')
print('magnitude:', maxmagn)

# 4721

In [None]:
# test single explode
tree=eval('[[3,[2,[8,0]]],[9,[5,[4,[3,2]]]]]')
print(f'A: {tree=}')
enumerate_tree(tree, collections.Counter())
print(f'B: {tree=}')
explode(tree, tree, 0)
print(f'C: {tree=}')

In [None]:
# test magnitude
tree=eval('[[[[0,7],4],[[7,8],[6,0]]],[8,1]]')
print(f'A: {tree=}')
enumerate_tree(tree, collections.Counter())
print(f'B: {tree=}')
print('C:', magnitude(tree))

Conclusion for day 18: tough job, took almost 3 hours. It could have been done faster by settling on the right annotated-tree approach sooner: I started coding too soon and should have designed every operation first, with a few quick drawings. The eventual solution with dicts as leaves is imho ok-ish (because of the magnitude function it couldn't have been done directly on the strings!). It doesn't seem to be a trick assignment, just a check of how good your tree handling is. Doing the explode-add step directly on the tree without annotation / enumeration seems possible but quite tricky. Also, using isinstance instead of type is probably a lot safer. Last improvement: could have counted leaves on the fly (in add_tree and explode) instead of storing the indices in the tree. This would have made it faster and simpler...
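
For comparison, a different representation that avoids the leaf enumeration altogether: keep only the leaves, left to right, each with its nesting depth. Explode then becomes plain list surgery. This is a sketch of an alternative, not the approach used above; `flatten` and `explode_flat` are made-up names, and it assumes the nesting never gets deeper than 5, which holds when adding two already reduced numbers.

```python
def flatten(tree, depth=0):
    """Return the leaves of a nested pair as [value, depth] items, left to right."""
    if isinstance(tree, int):
        return [[tree, depth]]
    return flatten(tree[0], depth + 1) + flatten(tree[1], depth + 1)

def explode_flat(leaves):
    """Explode the leftmost pair nested inside four pairs, in place.

    Returns True if a pair exploded. Assumes max depth 5, so the first leaf
    found at depth 5 is the left half of a pair of two regular numbers.
    """
    for i in range(len(leaves) - 1):
        value, depth = leaves[i]
        if depth >= 5:
            right_value = leaves[i + 1][0]
            if i > 0:
                leaves[i - 1][0] += value            # first regular number to the left
            if i + 2 < len(leaves):
                leaves[i + 2][0] += right_value      # first regular number to the right
            leaves[i:i + 2] = [[0, depth - 1]]       # the pair itself becomes a 0
            return True
    return False
```

Usage sketch: `leaves = flatten([[[[[9,8],1],2],3],4])`, then `explode_flat(leaves)` leaves the flattened form of `[[[[0,9],2],3],4]`. Split and magnitude would need their own versions on this list, which is not shown here.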

In [None]:
# 2021 day 17
# mv ~/Downloads/input data_src/2021-day-17-input.txt
# big input file looks like: small puzzle
# idea: part 1 parse ..., then just try all options

sample1='''
target area: x=20..30, y=-10..-5
'''

def shoot(tx0, tx1, ty0, ty1, vx, vy): # return hit,maxy
    x=0
    y=0
    maxy=0
    while True:
        #print(f'shoot {x=}, {y=}, {vx=}, {vy=}')
        x+=vx
        y+=vy
        if y>maxy:
            maxy=y
        if x>=tx0 and x<=tx1 and y>=ty0 and y<=ty1:
            return True,maxy
        if vx>0:
            vx-=1
        elif vx<0:
            vx+=1
        vy-=1
        if vx==0 and (x<tx0 or x>tx1):
            return False, maxy
        if x>tx1 and vx>=0:
            return False, maxy
        if y<ty0 and vy<=0: # below the target and falling: can never hit anymore
            return False, maxy

sample1=open('data_src/2021-day-17-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
areas=[ result.group(1, 2, 3, 4) for s in lines if (result:= re.match(r'target area:\s*x=(\d+)\.\.(\d+)\s*,\s*y=([\d\-]+)\.\.([\d\-]+)', s)) ]
x0,x1,y0,y1=[ int(s) for s in areas[0] ]
assert x0<=x1
assert y0<=y1
bestmaxy=None
count=0
#hit,maxy=shoot(x0, x1, y0, y1, 17, -4)
#print(f'{hit=}, {maxy=}')
for vx in range(1,x1+5):
    for vy in range(y0,-y0):
        hit,maxy=shoot(x0, x1, y0, y1, vx, vy)
        if hit:
            count+=1
            if bestmaxy is None or maxy>bestmaxy:
                bestmaxy=maxy
print(f'{bestmaxy=}, {count=}')

# part 1 not 3828
# part 2 bestmaxy=35511, count=3282

Conclusion for day 17: good score. It's hard to be faster on these simple puzzles, but one way here is to read the input file yourself and hardcode it: x0,x1,y0,y1=(20, 30, -10, -5) for the sample and the same for the real input, commenting one or the other in/out. This saves about 5 minutes of messing with re, parsing options etc.
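
If parsing is still wanted, a middle road is to skip the full pattern and just pull all signed integers out of the line. A minimal sketch (`parse_target` is a made-up name; it assumes the line contains exactly four integers in the order x0, x1, y0, y1):

```python
import re

def parse_target(line):
    """Return (x0, x1, y0, y1) from a 'target area: x=..., y=...' line."""
    # -?\d+ matches every integer, including the negative y bounds
    x0, x1, y0, y1 = (int(s) for s in re.findall(r'-?\d+', line))
    return x0, x1, y0, y1

target = parse_target('target area: x=20..30, y=-10..-5')
```

The tuple unpacking raises a ValueError if the line does not contain exactly four numbers, which acts as a cheap sanity check on the input.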

In [None]:
# 2021 day 16
# mv ~/Downloads/input data_src/2021-day-16-input.txt
# big input file looks like: one big hex string
# idea: part 1 parse ..., then ...

sample1='''
C200B40A82
04005AC33890
880086C3E88112
CE00C43D881120
D8005AC2A8F0
F600BC2D8F
9C005AC2F8F0
9C0141080250320F1802104A08
'''

def parse(binstr): # return calc value,bits read
    version=int(binstr[:3], 2)
    type_id=int(binstr[3:6], 2)
    #print(f'{version=} {type_id=}')
    if type_id==4: # literal value
        litnum=''
        pos=6
        while True:
            pack=binstr[pos:pos+5]
            pos+=5
            litnum+=pack[1:]
            if pack[0]=='0':
                break
        litval=int(litnum, 2)
        #print(f'{litval=} {pos=}')
        return litval,pos
    else: # operator
        operands=[]
        resultbits=None
        lentypeid=int(binstr[6:7], 2)
        if lentypeid==0:
            totallenbits=int(binstr[7:22], 2)
            pos=22
            while True:
                subval,subpos=parse(binstr[pos:])
                operands.append(subval)
                pos+=subpos
                if pos>=22+totallenbits:
                    break
            resultbits=22+totallenbits
        else:
            numsubpackets=int(binstr[7:18], 2)
            pos=18
            for i in range(numsubpackets):
                subval,subpos=parse(binstr[pos:])
                operands.append(subval)
                pos+=subpos
            resultbits=pos
        match type_id:
            case 0:
                return sum(operands),resultbits
            case 1:
                return math.prod(operands),resultbits
            case 2:
                return min(operands),resultbits
            case 3:
                return max(operands),resultbits
            case 5:
                boolres=1 if operands[0] > operands[1] else 0
                return boolres,resultbits
            case 6:
                boolres=1 if operands[0] < operands[1] else 0
                return boolres,resultbits
            case 7:
                boolres=1 if operands[0] == operands[1] else 0
                return boolres,resultbits
            case _:
                assert False

sample1=open('data_src/2021-day-16-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
for data in lines:
    data=data.strip()
    data_bin=bin(int(data, 16))[2:].zfill(len(data)*4)
    totres,pos=parse(data_bin)
    print(f'{data[:20]} {totres=}')

# totres=96257984154

In [None]:
# 2021 day 15
# mv ~/Downloads/input data_src/2021-day-15-input.txt
# big input file looks like: big
# idea: part 1 parse as board, then dfs with cutoff would take way too long,
# so instead use breadth-first search with score propagation

sample1='''
1163751742
1381373672
2136511328
3694931569
7463417111
1319128137
1359912421
3125421639
1293138521
2311944581
'''

def mod9(num):
    return ((num-1) % 9)+1

def data_expand(data):
    # first expand hor.
    cols=len(data[0])
    for y in range(len(data)):
        for i in range(4):
            for x in range(cols):
                data[y].append( mod9(data[y][x]+i+1)  )
    # now expand vertically
    rows=len(data)
    for i in range(4):
        for y in range(rows):
            newrow=[]
            for num in data[y]:
                newrow.append( mod9(num+i+1) )
            data.append(newrow)
    return data

def bfs(data, y0, x0, score0, scores): # starting from y0, x0 propagate cost and minimize on the whole board
    scores[y0][x0]=score0 # can reach starting point at starting score
    todo={ (y0, x0) }
    while len(todo)>0:
        y, x=todo.pop()
        # from this point generate (new) info
        score=scores[y][x]
        for dy, dx in [ (1, 0), (0, 1), (-1, 0), (0, -1) ]: 
            y2=y+dy; x2=x+dx
            if y2<0 or y2>=len(data) or x2<0 or x2 >= len(data[y2]):
                continue
            newscore=score+data[y2][x2]
            if scores[y2][x2] is None or newscore<scores[y2][x2]:
                scores[y2][x2]=newscore
                todo.add( (y2,x2) ) # we will propagate this new info item later

sample1=open('data_src/2021-day-15-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ [ int(s) for s in lin ] for lin in lines ]
data=data_expand(data)
scores=[] # min cost to reach, same shape as data
for y in range(len(data)):
    row=[]
    for x in range(len(data[0])):
        row.append(None)
    scores.append(row)
bfs(data, 0, 0, 0, scores)
score=scores[-1][-1]
print(f'{score=}')

# 2842 after ca. 32s.

Conclusion for day 15:
The solution took a really long time, should have thought of breadth-first info propagation right away.
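
A possible next step, not what the cell above does: the propagation revisits cells many times because the todo set is popped in arbitrary order. Dijkstra's algorithm with a priority queue always expands the cheapest known cell first, so each cell is settled once. A minimal sketch with `heapq`; the function name `lowest_total_risk` is made up, and `grid` is assumed to be a list of rows of ints like `data`:

```python
import heapq

def lowest_total_risk(grid):
    """Return the cheapest cost from top-left to bottom-right (start cell not counted)."""
    rows, cols = len(grid), len(grid[0])
    best = {(0, 0): 0}       # cheapest known cost per cell
    heap = [(0, 0, 0)]       # (cost so far, y, x)
    while heap:
        cost, y, x = heapq.heappop(heap)
        if (y, x) == (rows - 1, cols - 1):
            return cost
        if cost > best[(y, x)]:
            continue         # stale queue entry, a cheaper route was found already
        for dy, dx in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                new_cost = cost + grid[y2][x2]
                if new_cost < best.get((y2, x2), float('inf')):
                    best[(y2, x2)] = new_cost
                    heapq.heappush(heap, (new_cost, y2, x2))
    return None
```

It would be called as `lowest_total_risk(data)` after `data_expand`. How much faster this is than the 32 s above was not measured here.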

In [None]:
# 2021 day 14
# mv ~/Downloads/input data_src/2021-day-14-input.txt
# big input file looks like:  big list and input, pairs only, all unique
# idea: part 1 parse groups and rules, consider string as overlapping pairs, then work with pair counts

sample1='''
NNCB

CH -> B
HH -> N
CB -> H
NH -> C
HB -> C
HC -> B
HN -> C
NN -> C
BH -> H
NC -> B
NB -> B
BN -> B
BB -> N
BC -> B
CC -> N
CN -> C
'''

def do_step(pairs, rules): # apply one step of rules and return new pair counts
    res=collections.Counter()
    for pair, cnt in pairs.items():
        if cnt<1:
            continue
        c=rules.get(pair)
        if c is None:
            res[pair]+=cnt
        else:
            res[pair[0]+c]+=cnt
            res[c+pair[1]]+=cnt
    return res

def count_chars(pairs, templ): # count chars in overlapping pairs, we count only the second char, and the first char of templ
    # (to understand this, consider the final string as a list of individual overlapping pairs of letters,
    # if you count the second of each pair then you only miss the very first char)
    res=collections.Counter()
    res[templ[0]]+=1
    for pair, cnt in pairs.items():
        if cnt<1:
            continue
        res[pair[1]]+=cnt
    return sorted(list(res.items()), key=lambda tup: tup[1])

sample1=open('data_src/2021-day-14-input.txt').read()
lines=[s for s in sample1.splitlines() ]
groups=get_line_groups(lines)
templ=groups[0][0].strip()
print(f'{templ=}')
rules={ result.group(1) : result.group(2) for s in groups[1] if (result:= re.match(r'(\w\w)\s*->\s*(\w)', s)) }
print(f'{rules=}')
pairs=collections.Counter() # pair of chars to count
for i in range(len(templ)-1):
    pairs[templ[i:i+2]]+=1
print(f'pairs 1 {pairs}')
for i in range(40):
    pairs=do_step(pairs, rules)
#print(f'pairs 2 {pairs}')
counts=count_chars(pairs, templ)
print(f'{counts=}')
diff=counts[-1][1]-counts[0][1]
print(f'{diff=}')

# 3542388214529
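
A tiny worked check of the counting trick in `count_chars`, on a string that is still short enough to count directly. The helper `chars_from_pairs` is made up for this check and mirrors the idea, not the exact code above: count the second char of every overlapping pair, plus the very first char once.

```python
import collections

def chars_from_pairs(text):
    """Count the chars of text using only its overlapping pairs and its first char."""
    pairs = collections.Counter(text[i:i + 2] for i in range(len(text) - 1))
    counts = collections.Counter({text[0]: 1})  # the only char that is never a second char
    for pair, cnt in pairs.items():
        counts[pair[1]] += cnt
    return counts

# 'NCNBCHB' is the sample template NNCB after one step
direct = collections.Counter('NCNBCHB')
via_pairs = chars_from_pairs('NCNBCHB')
```

Both counters agree, and the same holds for any non-empty string, because every char except the first is the second char of exactly one pair.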

In [None]:
# 2021 day 13
# mv ~/Downloads/input data_src/2021-day-13-input.txt
# big input file looks like: big list - and - fold
# idea: part 1 parse groups, then do the folding

sample1='''
6,10
0,14
9,10
0,3
10,4
4,11
6,0
6,12
4,1
0,13
10,12
3,4
3,0
8,4
1,10
2,14
8,10
9,0

fold along y=7
fold along x=5
'''

def count_board(board):
    count=0
    for row in board:
        for cell in row:
            if cell!=0:
                count+=1
    return count

def do_fold_y(board, axnum): # hor axis along y=axnum
    for y in range(axnum+1, len(board)):
        newy=2*axnum-y
        if newy<0:
            continue
        for x in range(0, len(board[0])):
            board[newy][x]+=board[y][x]
            board[y][x]=0 # erase

def do_fold_x(board, axnum): # vert axis along x=axnum
    for x in range(axnum+1, len(board[0])):
        newx=2*axnum-x
        if newx<0:
            continue
        for y in range(0, len(board)):
            board[y][newx]+=board[y][x]
            board[y][x]=0 # erase

def visib_board(board, lasty, lastx): # print the top-left lasty x lastx part of the board
    for y in range(lasty):
        row=''
        for x in range(lastx):
            row+='#' if board[y][x]>0 else ' '
        print(row)

sample1=open('data_src/2021-day-13-input.txt').read()
lines=[s for s in sample1.splitlines() ]
groups=get_line_groups(lines)
points=[ s.split(',') for s in groups[0] ]
points=[ (int(y), int(x)) for x,y in points ]
folds=[ result.group(1, 2) for s in groups[1] if (result:= re.match(r'fold along\s*(\w+)=(\d+)', s)) ]
folds=[ (axis, int(num)) for axis,num in folds ]
print(f'{folds[0]=}')
max_coord=max([ max(tup) for tup in points])
print(f'{max_coord=}')
board=[] # top left is 0,0 and it's board[y][x]
for y in range(max_coord+1):
    board.append([])
    for x in range(max_coord+1):
        board[y].append(0)
for y,x in points:
    board[y][x]=1
count=count_board(board)
print(f'count1 {count}')
lastx=max_coord
lasty=max_coord
for axis,axnum in folds:
    if axis=='x':
        do_fold_x(board, axnum)
        lastx=axnum
    else:
        do_fold_y(board, axnum)
        lasty=axnum
visib_board(board, lasty, lastx)

# #....###...##..###..###..####..##..###..
# #....#..#.#..#.#..#.#..#.#....#..#.#..#.
# #....#..#.#....#..#.#..#.###..#....###..
# #....###..#.##.###..###..#....#....#..#.
# #....#.#..#..#.#....#.#..#....#..#.#..#.
# ####.#..#..###.#....#..#.####..##..###..
#
# LRGPRECB
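
An alternative to the dense board, as a sketch only: keep the dots as a set of coordinates and map each one through the fold. Overlapping dots then merge automatically and no board size is needed. The name `fold` is made up, and note that it uses (x, y) order as in the input file, not the (y, x) order of the code above.

```python
def fold(points, axis, num):
    """Fold a set of (x, y) points along the line x=num or y=num."""
    res = set()
    for x, y in points:
        if axis == 'x' and x > num:
            x = 2 * num - x   # mirror to the left of the fold line
        elif axis == 'y' and y > num:
            y = 2 * num - y   # mirror to above the fold line
        res.add((x, y))
    return res

# usage sketch: two dots that land on the same spot after folding along x=2
folded = fold({(0, 0), (4, 0)}, 'x', 2)
```

Part 1 would then be `len(fold(points, *folds[0]))`, given `points` as a set of (x, y) tuples.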

In [None]:
# 2021 day 12 B
# mv ~/Downloads/input data_src/2021-day-12-input.txt
# big input file looks like: small
# idea: part 1 parse ..., then BFS with info propagation

sample1='''
start-A
start-b
A-c
A-b
b-d
A-end
b-end
'''

sample2='''
dc-end
HN-start
start-kj
dc-start
dc-HN
LN-dc
HN-end
kj-sa
kj-HN
kj-dc
'''

sample3='''
fs-end
he-DX
fs-he
start-DX
pj-DX
end-zg
zg-sl
zg-pj
pj-he
RW-he
fs-DX
pj-RW
zg-RW
start-pj
he-WI
zg-he
pj-fs
start-RW
'''

def path_valid(path):
    count=collections.Counter()
    twice=0
    for node in path.split(','):
        if node[0]>='A' and node[0]<='Z': # no restrictions
            continue
        count[node]+=1
        if count[node]>2:
            return False
        elif count[node]==2:
            twice+=1
    if twice>1:
        return False
    return True    

def bfs(rules, maptoset, node0, path0): # starting from node0 propagate paths through the whole graph
    toset=maptoset.setdefault(node0, set())
    toset.add(path0) # can reach starting point with starting path
    todo={ node0 }
    while len(todo)>0:
        node=todo.pop()
        toset=maptoset[node]
        assert len(toset)>0 # must be at least one path to here to be here
        # from this point generate (new) info
        for path in toset:
            for nextnode in rules.get(node, set()):
                nextpath=path+','+nextnode
                nexttoset=maptoset.setdefault(nextnode, set())
                if nextpath not in nexttoset and path_valid(nextpath):
                    nexttoset.add(nextpath)
                    todo.add(nextnode) # we will propagate this new info item later

sample1=open('data_src/2021-day-12-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ s.split('-') for s in lines ]
data2={} # maps node to set of all reachable
for p1, p2 in data:
    if p2!='start' and p1!='end':
        reach=data2.setdefault(p1, set())
        reach.add(p2)
    # way back, except from end or towards start
    if p2!='end' and p1!='start':
        reach=data2.setdefault(p2, set())
        reach.add(p1)
maptoset={} # maps node to set of paths (strings with nodes separated by commas) to that node
bfs(data2, maptoset, 'start', 'start' )
num=len(maptoset['end'])
print(f'{num}')

In [None]:
# 2021 day 12
# mv ~/Downloads/input data_src/2021-day-12-input.txt
# big input file looks like: small
# idea: part 1 parse ..., then depth-first search because of counting paths and because input is small

sample1='''
start-A
start-b
A-c
A-b
b-d
A-end
b-end
'''

sample2='''
dc-end
HN-start
start-kj
dc-start
dc-HN
LN-dc
HN-end
kj-sa
kj-HN
kj-dc
'''

def count_end(data, node, visited): # how many different paths to end
    #print(f'{node=} {visited=}')
    added=False
    if node[0]>='a' and node[0]<='z':
        visited[node]+=1
        added=True
        # are we still in allowed state?
        numexcept=0 # number of exceptions, max 1
        for nd,cnt in visited.items():
            if nd=='start' and cnt>1: # never allowed
                numexcept+=100
            elif cnt>2: # never allowed
                numexcept+=100
            elif cnt>1:
                numexcept+=1 # one allowed
        if numexcept>1: # invalid state
            visited[node]-=1
            return 0
    reach=data.get(node, [])
    count=0
    for r in reach:
        if r=='end':
            count+=1
        else:
            count+=count_end(data, r, visited)
    if added:
        visited[node]-=1
    return count

sample1=open('data_src/2021-day-12-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ s.split('-') for s in lines ]
data2={} # maps node to set of all reachable
for p1, p2 in data:
    reach=data2.setdefault(p1, set())
    reach.add(p2)
    reach=data2.setdefault(p2, set())
    reach.add(p1)
visited=collections.Counter() # maps node to count of visits (lowercase nodes only)
num=count_end(data2, 'start', visited)
print(f'{num}')

# 122880
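
A more compact variant of the depth-first count above, as a sketch only: instead of re-checking the whole `visited` counter at every step, it carries a single `may_repeat` flag that is spent the first time a small cave is entered twice. The names `build_graph`, `count_paths` and `SAMPLE` are illustrative and not part of the notebook; the input is the small `sample1` graph from the cell above.

```python
import collections

SAMPLE = '''
start-A
start-b
A-c
A-b
b-d
A-end
b-end
'''

def build_graph(text):
    '''map each node to the set of its neighbours (edges work in both directions)'''
    graph = collections.defaultdict(set)
    for line in text.split():
        a, b = line.split('-')
        graph[a].add(b)
        graph[b].add(a)
    return graph

def count_paths(graph, node, seen, may_repeat):
    '''count paths from node to 'end'; seen holds the small caves on the current path'''
    if node == 'end':
        return 1
    total = 0
    for nxt in graph[node]:
        if nxt == 'start': # never return to start
            continue
        if nxt.islower() and nxt in seen:
            if may_repeat: # spend the single allowed repeat
                total += count_paths(graph, nxt, seen, False)
        elif nxt.islower():
            total += count_paths(graph, nxt, seen | {nxt}, may_repeat)
        else: # big caves have no restrictions
            total += count_paths(graph, nxt, seen, may_repeat)
    return total

graph = build_graph(SAMPLE)
part1 = count_paths(graph, 'start', frozenset(), False)
part2 = count_paths(graph, 'start', frozenset(), True)
print(part1, part2)
```

Passing `may_repeat=False` gives the part 1 rule (no small cave twice), so one function covers both parts.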


In [None]:
# 2021 day 11
# mv ~/Downloads/input data_src/2021-day-11-input.txt
# big input file looks like: small!?
# idea: part 1 parse ..., then efficient simulation per step using bfs flashing/flashed

sample1='''
5483143223
2745854711
5264556173
6141336146
6357385478
4167524645
2176841721
6882881134
4846848554
5283751526
'''

sample2='''
11111
19991
19191
19991
11111
'''

def do_step(data):
    flashed=set() # set of y,x tuples that already flashed
    flashing=[] # list of y,x tuples that will flash
    # increase by 1
    for y in range(len(data)):
        for x in range(len(data[0])):
            data[y][x]+=1
            if data[y][x]>9:
                flashing.append( (y,x) )
    # now flash
    while len(flashing)>0:
        y,x=flashing.pop(0)
        if (y,x) in flashed or data[y][x]<=9:
            continue
        flashed.add( (y,x) )
        for dy in [-1, 0, 1]:
            for dx in [-1, 0, 1]:
                if dx==0 and dy==0:
                    continue
                y2=y+dy; x2=x+dx
                if y2>=0 and y2<len(data) and x2>=0 and x2<len(data[0]):
                    data[y2][x2]+=1
                    if data[y2][x2]>9 and (y2,x2) not in flashed:
                        flashing.append( (y2,x2) )
    # now reset
    for y,x in flashed:
        data[y][x]=0
    return len(flashed)    

sample1=open('data_src/2021-day-11-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ list(s) for s in lines ]
data=[ [ int(num) for num in row ] for row in data ]
step=1
while True:
    flashes=do_step(data)
    if flashes>=len(data)*len(data[0]):
        print(f'all flashing {step=}')
        break
    step+=1


In [None]:
# 2021 day 10
# mv ~/Downloads/input data_src/2021-day-10-input.txt
# big input file looks like: 
# idea: part 1 parse ..., then use a stack

sample1='''
[({(<(())[]>[[{[]{<()<>>
[(()[<>])]({[<{<<[]>>(
{([(<{}[<>[]}>{[]{[(<()>
(((({<>}<{<{<>}{[]{[]{}
[[<[([]))<([[{}[[()]]]
[{[{({}]{}}([{[{{{}}([]
{<[[]]>}<{[{[{[]{()[[[]
[<(<(<(<{}))><([]([]()
<{([([[(<>()){}]>(<<{{
<{([{{}}[<[[[<>{}]]]>[]]
'''

matches={ ')': '(', ']': '[', '}': '{', '>': '<' }
mpoints={ '(': 1, '[': 2, '{': 3, '<': 4 }

sample1=open('data_src/2021-day-10-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
plist=[]
for s in lines:
    stack=[]
    skip=False
    for c in s:
        if c in '([{<':
            stack.append(c)
        elif c in ')]}>':
            if len(stack)<1:
                break
            d=stack.pop()
            if d!=matches[c]:
                skip=True
                break
        else:
            assert False
    if skip:
        continue
    #print(f'incomplete {s} with {stack}')
    points=0
    while len(stack)>0:
        c=stack.pop()
        points=5*points+mpoints[c]
    plist.append(points)
    #print(f'{points=}')
plist=sorted(plist)  
score=plist[len(plist) // 2]
print(f'{score=}')

# not 1063586833 but 2421222841

In [None]:
# 2021 day 9
# mv ~/Downloads/input data_src/2021-day-9-input.txt
# big input file looks like: one block
# idea: part 1 parse ..., then ...
# part 2: breadth-first fill from each low point, keeping a list of todo points with a target level;
#  a point joins the basin if all neighbours at or below the target level are already marked with this low point;
#  stop at 9, and number each basin according to its low point

sample1='''
2199943210
3987894921
9856789892
8767896789
9899965678
'''

def is_low(data, y, x): # check whether point is low
    num=data[y][x]
    for dy, dx in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
        y2=y+dy
        x2=x+dx
        if y2<0 or y2>=len(data) or x2<0 or x2>=len(data[0]):
            continue
        num2=data[y2][x2]
        if num2<=num:
            return False
    return True

def mark_from_low(lowp, low_index, markers, data): # breadth-first fill from low
    y,x=lowp
    markers[y][x]=low_index
    todos=[]
    num=data[y][x]
    for dy, dx in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
        y2=y+dy
        x2=x+dx
        if y2<0 or y2>=len(data) or x2<0 or x2>=len(data[0]):
            continue
        todos.append( (y2, x2, num) )
    while len(todos)>0:
        # get a candidate, if not 9, not marked, and all surrounding of targetlvl or less are marked with this low_index
        # mark this one as well, add surround to todos
        y, x, targetlvl=todos.pop(0)
        if data[y][x]==9 or markers[y][x]!= -1:
            continue
        todos2=[]
        is_part=True # assume it's part of the basin
        for dy, dx in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
            y2=y+dy
            x2=x+dx
            if y2<0 or y2>=len(data) or x2<0 or x2>=len(data[0]):
                continue
            if (data[y2][x2] <= targetlvl) and (markers[y2][x2]!=low_index): # problem spot found
                is_part=False
                break
            if markers[y2][x2]== -1:
                todos2.append( (y2, x2, data[y][x]) )
        if is_part:
            markers[y][x]=low_index
            todos.extend(todos2)

def count_sizes(markers):
    count=collections.Counter()
    for y in range(len(markers)):
        for x in range(len(markers[0])):
            idx=markers[y][x]
            if idx>=0:
                count[idx]+=1
    return count

sample1=open('data_src/2021-day-9-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ list(s) for s in lines ]
data=[ [ int(num) for num in row ] for row in data ]
low_points=[]
for y in range(len(data)):
    for x in range(len(data[0])):
        if is_low(data, y, x):
            low_points.append( (y, x) )
#print(f'{low_points=}')
markers=[] # filled with index of low point
for y in range(len(data)):
    row=[]
    for x in range(len(data[0])):
        row.append(-1)
    markers.append(row)
for low_index in range(len(low_points)):
    lowp=low_points[low_index]
    mark_from_low(lowp, low_index, markers, data)
sizes=count_sizes(markers)
#print(f'{sizes=}')
sz2=sizes.items()
sz2=sorted(sz2, key=lambda tup: -tup[1])
#print(f'{sz2=}')
score=math.prod([ tup[1] for tup in sz2[:3]])
print(f'{score=}')


In [None]:
# 2021 day 8
# mv ~/Downloads/input data_src/2021-day-8-input.txt
# big input file looks like: long list of lines, about 250
# idea: part 1 parse to list of lists, then to sets, then count sets
# part 2: sort inputs by size, then deduce segments

sample1='''
be cfbegad cbdgef fgaecd cgeb fdcge agebfd fecdb fabcd edb |fdgacbe cefdb cefbgd gcbe
edbfga begcd cbg gc gcadebf fbgde acbgfd abcde gfcbed gfec |fcgedb cgb dgebacf gc
fgaebd cg bdaec gdafb agbcfd gdcbef bgcad gfac gcb cdgabef |cg cg fdcagb cbg
fbegcd cbd adcefb dageb afcb bc aefdc ecdab fgdeca fcdbega |efabcd cedba gadfec cb
aecbfdg fbg gf bafeg dbefa fcge gcbea fcaegb dgceab fcbdga |gecf egdcabf bgf bfgea
fgeab ca afcebg bdacfeg cfaedg gcfdb baec bfadeg bafgc acf |gebdcfa ecba ca fadegcb
dbcfg fgd bdegcaf fgec aegbdf ecdfab fbedc dacgb gdcebf gf |cefg dcbef fcge gbcadfe
bdfegc cbegaf gecbf dfcage bdacg ed bedf ced adcbefg gebcd |ed bcgafe cdgba cbgef
egadfb cdbfeg cegd fecab cgb gbdefca cg fgcdab egfdb bfceg |gbdfcae bgc cg cgb
gcafb gcf dcaebfg ecagb gf abcdeg gaef cafbge fdbac fegbdc |fgae cfgab fg bagce
'''

sample2='''
acedgfb cdfbe gcdfa fbcad dab cefabd cdfgeb eafb cagedb ab | cdfeb fcadb cdfeb cdbaf
'''

def get_freq(inps): # input is list of 10 sets, sorted by length so that the last one holds all 7 segments
    # count and return in how many of the sets each individual segment letter occurs
    last=inps[-1]
    res={}
    for c in last:
        cnt=0
        for i in inps:
            if c in i:
                cnt+=1
        res[c]=cnt
    assert len(res)==7 # 7 segments
    return res

def get_mapping(inps): # input is list of 10 sets,
    # sort by length, then deduce segments, then digits, return list of tuples of set, digit
    # segments: A
    #          B C
    #           D
    #          E F
    #           G
    assert len(inps)==10
    inps.sort(key=lambda x: len(x))
    #print(f'{inps=}')
    freqs=get_freq(inps)
    #print(f'{freqs}')
    res=[]
    # digit 1
    assert len(inps[0])==2
    res.append((inps[0], 1))
    # digit 7
    assert len(inps[1])==3
    res.append((inps[1], 7))
    # digit 4
    assert len(inps[2])==4
    res.append((inps[2], 4))
    # digit 8
    assert len(inps[9])==7
    res.append((inps[9], 8))
    # digit 6, which uses 6 segments but not segm_C
    segm_A=inps[1]-inps[0] # set w/ single letter of top segment
    segm_E={ k for k,v in freqs.items() if v==4 } # set w/ single letter of bottom left segment
    assert len(segm_E)==1
    segm_F={ k for k,v in freqs.items() if v==9 } 
    assert len(segm_F)==1
    segm_B={ k for k,v in freqs.items() if v==6 } 
    assert len(segm_B)==1
    segm_C={ k for k,v in freqs.items() if v==8 } - segm_A
    assert len(segm_C)==1
    #print(f'{segm_A=} {segm_B=} {segm_C=} {segm_E=} {segm_F=}') 
    digit_6=None
    for i in inps:
        if len(i)==6 and len(segm_C & i)==0:
            digit_6=i
            res.append((i, 6))
            break
    # digit 9, which uses 6 segments but not segm_E
    digit_9=None
    for i in inps:
        if len(i)==6 and len(segm_E & i)==0:
            digit_9=i
            res.append((i, 9))
            break
    # digit 0, which is the remaining 6-segment
    for i in inps:
        if len(i)==6 and i!=digit_6 and i!=digit_9:
            res.append((i, 0))
            break
    # digit 2, which uses 5 segments, incl. segm_E
    digit_2=None
    for i in inps:
        if len(i)==5 and len(segm_E & i)>0:
            digit_2=i
            res.append((i, 2))
            break
    # digit 5, which uses 5 segments, incl. segm_B
    digit_5=None
    for i in inps:
        if len(i)==5 and len(segm_B & i)>0:
            digit_5=i
            res.append((i, 5))
            break
    # digit 3, which is the remaining 5-segment
    for i in inps:
        if len(i)==5 and i!=digit_2 and i!=digit_5:
            res.append((i, 3))
            break
    return res

def use_mapping(mapping, outputs): # find digit for each output part, append and return as number
    # outputs is list of 4 sets
    res=0
    for out in outputs:
        found=False
        for i,digit in mapping:
            if i==out:
                res=10*res+digit
                found=True
                break
        assert found
    return res

sample1=open('data_src/2021-day-8-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data0=[ s.split('|') for s in lines ]
data0=[ [ part.strip().split() for part in row if len(part)>0 ] for row in data0 ]
count=0
for parts in data0:
    inp,outp=parts # named so that the builtin input() is not shadowed
    inputs=[ set(p) for p in inp ]
    outputs=[ set(p) for p in outp ]
    #print(f'{inputs=} {outputs=}')
    mapping=get_mapping(inputs)
    assert len(mapping)==10
    #print(f'{mapping=}')
    digits=use_mapping(mapping, outputs)
    #print(f'{digits=}')
    count+=digits
print(f'{count=}')

# 1027422

Conclusion for day 8: this took quite a bit of time, although the solution direction was OK and no big mistakes were made. Two ways to be faster:
* Focus on segments instead of digits. Each segment has a frequency of occurring in the 10 digits and also a set of digit pattern lengths in which it occurs, e.g. segment A occurs in 8 of the digit patterns, with lengths 3, 5, 6 and 7. Combining those two gives a unique signature for each segment, which makes the segments easy to deduce; after that only the mapping to digits remains.
* In general, probably just typing 10% faster
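
The segment-signature idea from this conclusion can be sketched as follows. This is a minimal illustration, not the code used for the answer above: `CANONICAL`, `signatures` and `decode` are names made up for this sketch, and it assumes the standard seven-segment wiring with segments named `a` to `g` (matching A to G in the diagram in `get_mapping`).

```python
# standard seven-segment wiring: digit -> segments that are lit
CANONICAL = {0: 'abcefg', 1: 'cf', 2: 'acdeg', 3: 'acdfg', 4: 'bcdf',
             5: 'abdfg', 6: 'abdefg', 7: 'acf', 8: 'abcdefg', 9: 'abcdfg'}

def signatures(patterns):
    '''map each letter to (number of patterns it occurs in, sorted distinct lengths of those patterns)'''
    sig = {}
    for letter in 'abcdefg':
        lengths = [len(p) for p in patterns if letter in p]
        sig[letter] = (len(lengths), tuple(sorted(set(lengths))))
    return sig

canon_sig = signatures(CANONICAL.values())
sig_to_segment = {s: seg for seg, s in canon_sig.items()}
assert len(sig_to_segment) == 7 # every segment has a unique signature

digit_of = {frozenset(p): d for d, p in CANONICAL.items()}

def decode(line):
    '''decode one "ten patterns | four outputs" line into a number'''
    left, right = line.split('|')
    wire_to_segment = {w: sig_to_segment[s] for w, s in signatures(left.split()).items()}
    value = 0
    for out in right.split():
        segs = frozenset(wire_to_segment[c] for c in out)
        value = 10*value + digit_of[segs]
    return value

line = 'acedgfb cdfbe gcdfa fbcad dab cefabd cdfgeb eafb cagedb ab | cdfeb fcadb cdfeb cdbaf'
print(decode(line))
```

Because the signature only depends on counts and lengths, the scrambled wires can be translated back to real segments without any per-digit case analysis.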

In [None]:
# 2021 day 7
# mv ~/Downloads/input data_src/2021-day-7-input.txt
# idea: part 1 parse as list, then start at avg, nums around, move in direction of lowest until min found
# part 2: try with new fuel cost algo, then retry with each crab position
# mistakes: assuming range of input data would be huge and that the bottleneck was there,
#  trying multiple vague search strategies without guarantee for a good solution,
#  missing opportunity to efficiently fill a small cache, or to know that sum(1..n) is (n+1)*n/2,
#  also - integer division is //

sample1='''
16,1,2,0,4,2,7,1,2,14
'''

def fuel1(diff): # sum of series 1..n
    return (diff+1)*diff // 2

def fuel_cost(data, target): # calculate fuel cost for aligning
    cost=0
    for num in data:
        cost+=fuel1(abs(num-target))
    return cost

sample1=open('data_src/2021-day-7-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ int(s) for s in lines[0].split(',') ]
costs=[]
print(f'{min(data)=}, {max(data)=}')
for num in range(min(data), max(data)+1):
    costs.append(fuel_cost(data, num))
min(costs)
# 98257206

In [None]:
# 2021 day 6
# mv ~/Downloads/input data_src/2021-day-6-input.txt
# idea: part 1 parse as list of numbers, then put in map as timerval:count, then run simulation to copy

sample1='''
3,4,3,1,2
'''

sample1=open('data_src/2021-day-6-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data0=[ s.split(',') for s in lines ]
data0=data0[0]
#data0
tvals=collections.Counter()
for s in data0:
    tvals[int(s)]+=1
for day in range(256):
    newtvals=collections.Counter()
    for key,val in tvals.items():
        #print(f'{key=} {val=}')
        if key==0:
            newtvals[6]+=val
            newtvals[8]+=val
        else:
            newtvals[key-1]+=val
    #print(f'{newtvals=}')
    tvals=newtvals
print(f'total {sum(tvals.values())}')
# part 2 1650309278600

In [None]:
# 2021 day 5
# mv ~/Downloads/input data_src/2021-day-5-input.txt

sample1='''
0,9 -> 5,9
8,0 -> 0,8
9,4 -> 3,4
2,2 -> 2,1
7,0 -> 7,4
6,4 -> 2,0
0,9 -> 2,9
3,4 -> 1,4
0,0 -> 8,8
5,5 -> 8,2
'''

sample1=open('data_src/2021-day-5-input.txt').read()
tups=[ result.group(1, 2, 3, 4) for s in sample1.splitlines() if (result:= re.match(r'(\d+),(\d+)\s*\-\>\s*(\d+),(\d+)', s)) ]
tups=[ (int(a), int(b), int(c), int(d)) for a, b, c, d in tups ]
max_coord=max([ max(tup) for tup in tups])
board=[] # top left is 0,0 and it's board[y][x]
for y in range(max_coord+1):
    board.append([])
    for x in range(max_coord+1):
        board[y].append(0)

def draw_hor(y, x1, x2):
    if x1>x2:
        swap=x1; x1=x2; x2=swap
    #print(f'hor {y=} {x1=} {x2=}')
    for x in range(x1, x2+1):
        board[y][x]+=1

def draw_vert(x, y1, y2):
    if y1>y2:
        swap=y1; y1=y2; y2=swap
    #print(f'vert {x=} {y1=} {y2=}')
    for y in range(y1, y2+1):
        board[y][x]+=1

def draw_diag(x1, y1, x2, y2):
    if x1>x2:
        swap=x1; x1=x2; x2=swap
        swap=y1; y1=y2; y2=swap
    assert abs(y2-y1) == abs(x2-x1)
    deltay=(y2-y1) // (x2-x1) # exact integer division, always +1 or -1 because of the assert above
    #print(f'diag {x1=} {y1=} {x2=} {y2=} {deltay=}')
    y=y1
    for x in range(x1, x2+1):
        board[y][x]+=1
        y+=deltay

def count_2():
    count=0
    for y in range(max_coord+1):
        for x in range(max_coord+1):
            if board[y][x]>=2:
                count+=1
    return count

for x1,y1,x2,y2 in tups:
    if x1==x2:
        draw_vert(x1, y1, y2)
    elif y1==y2:
        draw_hor(y1, x1, x2)
    else:
        draw_diag(x1,y1,x2,y2)
count=count_2()
print(f'{count=}')
#board
#count=19172


In [None]:
# 2021 day 4
# mv ~/Downloads/input data_src/2021-day-4-input.txt

sample1='''
7,4,9,5,11,17,23,2,0,14,21,24,10,16,13,6,15,25,12,22,18,20,8,19,3,26,1

22 13 17 11  0
 8  2 23  4 24
21  9 14 16  7
 6 10  3 18  5
 1 12 20 15 19

 3 15  0  2 22
 9 18 13 17  5
19  8  7 25 23
20 11 10 24  4
14 21 16 12  6

14 21 17 24  4
10 16 15  9 19
18  8 23 26 20
22 11 13  6  5
 2  0 12  3  7
'''

sample1=open('data_src/2021-day-4-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
numbers=[ int(n) for n in lines[0].split(',') ] # first row
board0=[ s.split() for s in lines[1:] ]
for i in range(len(board0)):
    board0[i]=[ int(n) for n in board0[i] ]
# now split into separate boards
boards=[] # list of lists of board rows
for firstrow in range(0, len(board0), 5):
    boards.append(board0[firstrow:firstrow+5])

def mark_off(board, lastnum): # mark off a number on a board
    for row in board:
        for col in range(len(row)):
            if row[col] == lastnum:
                row[col]= -1

def win_score(board): # returns sum of unmarked numbers on a board
    total=0
    for row in board:
        for n in row:
            if n!= -1:
                total+=n
    return total

def is_winner(board): # returns True if winner
    # a row?
    for row in board:
        if sum(row) == -5:
            return True
    # a col?
    for col in range(len(board[0])):
        total=0
        for row in board:
            total+=row[col]
        if total== -5:
            return True
    return False

def get_winners(boards): # return set of winning board indices
    winners=set()
    for i, board in enumerate(boards):
        if is_winner(board):
            winners.add(i)
    return winners

lastwin=set()
for num in numbers:
    for board in boards:
        mark_off(board, num)
    winners=get_winners(boards)
    #print(f'{winners=} {num=}')
    if len(winners) == len(boards): # complete
        newi=list(winners-lastwin)[0]
        score=win_score(boards[newi])
        print(f'{num=} {newi=} {score=} {score*num=}')
        break
    lastwin=winners

# num=14 newi=13 score=227 score*num=3178

In [None]:
# 2021 day 3
# mv ~/Downloads/input data_src/2021-day-3-input.txt

sample1='''
00100
11110
10110
10111
10101
01111
00111
11100
10000
11001
00010
01010
'''

sample1=open('data_src/2021-day-3-input.txt').read()
lines0=[s for s in sample1.splitlines() if len(s)>0 ]

def select_lines(lines, keep_ones=True):
    '''from a list of lines keep, per column, the lines with the most common bit (ties keep '1') or with the least common bit (ties keep '0')'''
    for col in range(len(lines[0])):
        ones=[ s for s in lines if s[col]=='1' ]
        zeros=[ s for s in lines if s[col]=='0' ]
        if len(ones) >= len(zeros):
            lines=ones if keep_ones else zeros
        else:
            lines=zeros if keep_ones else ones
        if len(lines) <= 1:
            break
    rating=lines[0]
    rating10=int(rating, 2)
    print(f'{rating=} {rating10=}')
    return rating10

oxygen=select_lines(lines0, keep_ones=True)
co2=select_lines(lines0, keep_ones=False)
print(f'{oxygen} {co2} {oxygen*co2}')

# oxygen='100010111011' oxygen10=2235
# co2='000111000011' co210=451
# 2235 451 1007985

In [None]:
# 2021 day 2
# mv ~/Downloads/input data_src/2021-day-2-input.txt

sample1='''
forward 5
down 5
forward 8
up 3
down 8
forward 2
'''

sample1=open('data_src/2021-day-2-input.txt').read()
nums=[s.split() for s in sample1.splitlines() if len(s)>0 ]
nums=[ (tup[0], int(tup[1])) for tup in nums ]
depth=0
fwd=0
aim=0
for cmd, val in nums:
    match cmd:
        case 'forward':
            fwd+=val
            depth+=aim*val
        case 'down':
            aim+=val
        case 'up':
            aim-=val
        case _:
            assert False
print(f'{depth=}, {fwd=}, {aim=}, {depth*fwd=}')

# 900 / 1864715580

In [None]:
# 2021 day 1
# mv ~/Downloads/input data_src/2021-day-1-input.txt

sample1='''
199
200
208
210
200
207
240
269
260
263
'''

sample1=open('data_src/2021-day-1-input.txt').read()
nums=[int(s) for s in sample1.splitlines() if len(s)>0 ]
nums
last=None
count=0
for i in range(2, len(nums)):
    sm=nums[i]+nums[i-1]+nums[i-2] # part 2, part 1 is only nums[i] and starting at position 0
    if last is not None and sm>last:
        count+=1
    last=sm
print(f'{count}')
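
A possible shortcut for both parts, as a sketch: two consecutive window sums share all values except the first of the old window and the last of the new one, so comparing the sums reduces to comparing `nums[i]` with `nums[i-window]`. The function name `count_increases` is made up for this illustration; the data is the sample from the cell above.

```python
def count_increases(nums, window=1):
    '''count how often the sum of `window` consecutive values is larger than the previous window sum'''
    # sum(nums[i-window+1 : i+1]) > sum(nums[i-window : i])  is equivalent to  nums[i] > nums[i-window]
    return sum(1 for i in range(window, len(nums)) if nums[i] > nums[i - window])

sample = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]
part1 = count_increases(sample)
part2 = count_increases(sample, 3)
print(part1, part2)
```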

In [None]:
# TEMPLATE
# 2021 day 6
# mv ~/Downloads/input data_src/2021-day-6-input.txt
# big input file looks like: 
# idea: part 1 parse ..., then ...

sample1='''

'''

#sample1=open('data_src/2021-day-6-input.txt').read()
lines=[s for s in sample1.splitlines() if len(s)>0 ]
data=[ int(s) for s in lines[0].split(',') ]
groups=get_line_groups(lines)
data0=[ s.split() for s in lines ]
data0=[ [cmd, int(num), 0] for cmd, num in data0 ]
# template, remove what's not needed

In [None]:
# experiments with various ways of creating constants in Python

import time
import enum
from typing import Final

start = time.time()
class Const(enum.Enum):
    NUM1=5
total=0
for x in range(0,999999):
    total+=Const.NUM1.value
print(time.time() - start, "for enum")
assert total==4999995
# invalid solution, does not allow reassignment but very slow
# both Const.NUM1.value=6 and Const.NUM1=6 not allowed

start = time.time()
CONST_NUM2: Final[int]=5
total=0
for x in range(0,999999):
    total+=CONST_NUM2
print(time.time() - start, "for final")
assert total==4999995
# valid solution, fast, but does allow reassignment at runtime; the reassignment below is only flagged as an error by VS Code
CONST_NUM2=6

start = time.time()
Constants = collections.namedtuple('Constants', ['NUM1', 'NUM2'])
constants = Constants(5, 6)
total=0
for x in range(0,999999):
    total+=constants.NUM1
print(time.time() - start, "for named tuple")
assert total==4999995
# valid solution, fast (although a bit slower than final) and does not allow reassignment
# constants.NUM1=6 not allowed

start = time.time()
class Const3(object):
    __slots__=()
    NUM1=5
CONST3=Const3()
total=0
for x in range(0,999999):
    total+=CONST3.NUM1
print(time.time() - start, "for class member with empty slot")
assert total==4999995
# favorite solution, fast (although a bit slower than final) and does not allow reassignment
# CONST3.NUM1=6 # not allowed
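
The "not allowed" remarks in the cell above can be checked with a small sketch like the one below. The helper `rejects_assignment` is hypothetical and only exists for this illustration; the three constant definitions repeat the ones from the experiment.

```python
import collections
import enum

class Const(enum.Enum):
    NUM1 = 5

Constants = collections.namedtuple('Constants', ['NUM1', 'NUM2'])
constants = Constants(5, 6)

class Const3:
    __slots__ = ()
    NUM1 = 5
CONST3 = Const3()

def rejects_assignment(obj, name, value):
    '''return True if setattr(obj, name, value) raises AttributeError'''
    try:
        setattr(obj, name, value)
    except AttributeError:
        return True
    return False

results = {
    'enum member': rejects_assignment(Const, 'NUM1', 6),
    'enum value': rejects_assignment(Const.NUM1, 'value', 6),
    'namedtuple field': rejects_assignment(constants, 'NUM1', 6),
    'empty-slots instance': rejects_assignment(CONST3, 'NUM1', 6),
}
print(results)

# caveat: only the instance is protected, the class attribute itself can still be rebound
class_is_protected = rejects_assignment(Const3, 'NUM1', 6)
print(class_is_protected)
```

Note the caveat in the last lines: the empty `__slots__` trick protects `CONST3.NUM1`, but `Const3.NUM1 = 6` on the class still succeeds, so it is a guard against accidents rather than a true constant.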

In [None]:
# test of record arrays and structured arrays vs. plain numpy arrays

NUMNODES=200000
nodes=np.zeros(NUMNODES, dtype=np.dtype([
    ('corner_x', np.int32), ('corner_y', np.int32), ('corner_z', np.int32),
    ('corner2_x', np.int32), ('corner2_y', np.int32), ('corner2_z', np.int32),
    ('val', np.int32), ('first_child', np.int32)], align=True))
nodes=np.rec.array(nodes)
nodes[17].corner_x=1
nodes[17].corner_y=2
nodes[17].corner_z=3
nodes[17].corner2_x=4
time0=time.time()
total=0
for i in range(999999):
    node=nodes[17]
    total+=node.corner_x+node.corner_y+node.corner_z+node.corner2_x
print('record array:', time.time()-time0)
assert total==9999990

nodes=np.zeros(NUMNODES, dtype=np.dtype([
    ('corner_x', np.int32), ('corner_y', np.int32), ('corner_z', np.int32),
    ('corner2_x', np.int32), ('corner2_y', np.int32), ('corner2_z', np.int32),
    ('val', np.int32), ('first_child', np.int32)], align=True))
nodes[17]['corner_x']=1
nodes[17]['corner_y']=2
nodes[17]['corner_z']=3
nodes[17]['corner2_x']=4
time0=time.time()
total=0
for i in range(999999):
    node=nodes[17]
    total+=node['corner_x']+node['corner_y']+node['corner_z']+node['corner2_x']
print('structured array:', time.time()-time0)
assert total==9999990

nodes=np.zeros( (NUMNODES, 8), dtype=np.dtype(np.int32, align=True))
class Nd(object):
    __slots__=()
    CORNER_X=0
    CORNER_Y=1
    CORNER_Z=2
    CORNER2_X=3
ND=Nd()
nodes[17][ND.CORNER_X]=1
nodes[17][ND.CORNER_Y]=2
nodes[17][ND.CORNER_Z]=3
nodes[17][ND.CORNER2_X]=4
time0=time.time()
total=0
for i in range(999999):
    node=nodes[17]
    total+=node[ND.CORNER_X]+node[ND.CORNER_Y]+node[ND.CORNER_Z]+node[ND.CORNER2_X]
print('plain array:', time.time()-time0)
assert total==9999990

### Lessons learned for competing in the Advent of Code
* It's a grind to win, so get up 10-15 minutes early and be ready at 6:00 AM every day, with 2 glasses of water, music, etc. This is most crucial on the first days when many people can potentially finish quickly.
* The scoring system punishes breaks: near the top you will likely be competing with a small group of fanatics, so if you 'take a break' one day and let 20 other people finish first, you are set back 20 points. You can regain those only very slowly, because at the top you gain just 1 or 2 points per day on the people above you (unless they take a break).
* VS Code with Python and a Jupyter notebook feels very much like the right tool for the job
* First focus on getting your input in the best shape, don't rush through this as it will only take a little time anyway
* First few days the problems are so simple you likely don't need functions
* Afterwards, split the problem into 3-5 smaller functions that are easy to write, each with a short comment describing what it does; this often speeds up part 2 a lot
* However, it's worthwhile to smash out part 1 / the silver star ASAP, because that helps with ranking (and with a Python notebook you have an edge over some other languages on simple problems)
* There are only a limited number of input structure options (broadly speaking) which can be supported with a starter template as included above.
* Verify assumptions about the input data file, e.g. do not blindly assume that it contains huge numbers
* Use opportunities to fill a small cache efficiently
* When doing a (depth-first) search look out for opportunities to split the search space, in general check whether you have a tricky search problem or not
* Do not jump into a weird search strategy without looking at some other options
* On problems with a lot of administrative details consider defining some data declaratively to simplify
* Do not jump into writing a long function with a lot of variables before considering some other options
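
As one illustration of the "fill a small cache efficiently" lesson, here is a sketch based on day 7 part 2. It assumes the range of positions is small enough to tabulate, which was the case for the day 7 input; the name `min_fuel` is made up for this example. The cache of triangular numbers is built incrementally with `itertools.accumulate` instead of recomputing a sum per crab.

```python
import itertools

def min_fuel(positions):
    '''lowest total fuel when moving d steps costs 1+2+...+d'''
    lo, hi = min(positions), max(positions)
    # cost[d] == d*(d+1)//2, filled in one pass: 0, 1, 3, 6, 10, ...
    cost = list(itertools.accumulate(range(hi - lo + 1)))
    return min(sum(cost[abs(p - target)] for p in positions)
               for target in range(lo, hi + 1))

sample = [16, 1, 2, 0, 4, 2, 7, 1, 2, 14]
best = min_fuel(sample)
print(best)
```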