We should be able to just remove the rows as collisions are found. I don't know whether doing that naively will make the task intractable, so I'll start that way and see how it goes.

I think this will be easier if I use the SQL extension (pandasql) to query the dataframes, because SQL makes grouping straightforward.

In [1]:
import pandas as pd
import re

import pandasql
pysqldf = lambda q: pandasql.sqldf(q, globals())

In [2]:
testInput_str='''
p=<-6,0,0>, v=< 3,0,0>, a=< 0,0,0>
p=<-4,0,0>, v=< 2,0,0>, a=< 0,0,0>
p=<-2,0,0>, v=< 1,0,0>, a=< 0,0,0>
p=< 3,0,0>, v=<-1,0,0>, a=< 0,0,0>
'''

To parse the input, let's assume that each line of the input contains exactly 9 integers (three each for position, velocity and acceleration), and put those integers into columns:

In [3]:

state_ls=[[] for i in range(9)]

for nl in testInput_str.strip().split('\n'):
    # Raw string, so that \d is not treated as a (deprecated) string escape:
    values_ls=re.findall(r'-?\d+', nl)
    assert len(values_ls)==9
    for (i,v) in enumerate(values_ls):
        state_ls[i].append(int(v))

state_df=pd.DataFrame(state_ls)
state_df=state_df.T
state_df.columns=['px', 'py', 'pz', 'vx', 'vy', 'vz', 'ax', 'ay', 'az']

# make a copy for future reference:
testState_df=state_df.copy()

state_df

Unnamed: 0,px,py,pz,vx,vy,vz,ax,ay,az
0,-6,0,0,3,0,0,0,0,0
1,-4,0,0,2,0,0,0,0,0
2,-2,0,0,1,0,0,0,0,0
3,3,0,0,-1,0,0,0,0,0
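
As an aside, the same parsing can be sketched row by row rather than column by column, which avoids the transpose. This is only an illustration: `parse_particles` and `COLUMNS` are hypothetical names that do not appear elsewhere in this notebook, and the rows it returns could be handed to `pd.DataFrame(rows, columns=COLUMNS)`.

```python
import re

# Column names in the order the integers appear on each input line.
COLUMNS = ['px', 'py', 'pz', 'vx', 'vy', 'vz', 'ax', 'ay', 'az']

def parse_particles(text):
    """Return one 9-tuple of ints per input line (hypothetical helper)."""
    rows = []
    for line in text.strip().split('\n'):
        values = [int(v) for v in re.findall(r'-?\d+', line)]
        if len(values) != len(COLUMNS):
            raise ValueError('expected 9 integers in line: {!r}'.format(line))
        rows.append(tuple(values))
    return rows

sample = '''
p=<-6,0,0>, v=< 3,0,0>, a=< 0,0,0>
p=< 3,0,0>, v=<-1,0,0>, a=< 0,0,0>
'''
rows = parse_particles(sample)
print(rows[0])  # (-6, 0, 0, 3, 0, 0, 0, 0, 0)
```

Raising a `ValueError` instead of using `assert` means the check still runs if Python is started with `-O`.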


Now, let's create a query that tells us whether there are any colliding particles.

Actually, it'll be easier if I can have another column representing the index, because the pysqldf function loses indexing information:

In [4]:
state_df['particle_index']=state_df.index

# Also add to testState_df because I use it again later:
testState_df['particle_index']=testState_df.index

state_df

Unnamed: 0,px,py,pz,vx,vy,vz,ax,ay,az,particle_index
0,-6,0,0,3,0,0,0,0,0,0
1,-4,0,0,2,0,0,0,0,0,1
2,-2,0,0,1,0,0,0,0,0,2
3,3,0,0,-1,0,0,0,0,0,3


In [5]:
pysqldf('''
        SELECT px, py, pz, COUNT(particle_index) AS number_of_particles
        FROM state_df
        GROUP BY px, py, pz
        ''')

Unnamed: 0,px,py,pz,number_of_particles
0,-6,0,0,1
1,-4,0,0,1
2,-2,0,0,1
3,3,0,0,1


And use this as a nested query:

In [6]:
pysqldf('''
        SELECT t.px, t.py, t.pz, number_of_particles
        FROM (SELECT px, py, pz, COUNT(particle_index) AS number_of_particles
              FROM state_df
              GROUP BY px, py, pz) AS t
        WHERE number_of_particles > 1
        ''')

Unnamed: 0,t.px,t.py,t.pz,number_of_particles
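
The nested query could probably be flattened with a `HAVING` clause, which filters groups after aggregation. The sketch below checks that form directly against an in-memory SQLite database using the standard-library `sqlite3` module. Since pandasql runs its queries on SQLite, the same clause should also work through `pysqldf`, although that is not checked here. The table name `state` and the sample rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE state '
             '(particle_index INTEGER, px INTEGER, py INTEGER, pz INTEGER)')
# Three particles share the origin, one sits elsewhere.
conn.executemany('INSERT INTO state VALUES (?, ?, ?, ?)',
                 [(0, 0, 0, 0), (1, 0, 0, 0), (2, 0, 0, 0), (3, 1, 0, 0)])

# HAVING keeps only the groups with more than one particle.
collision_sites = conn.execute('''
    SELECT px, py, pz, COUNT(particle_index) AS number_of_particles
    FROM state
    GROUP BY px, py, pz
    HAVING COUNT(particle_index) > 1
''').fetchall()
conn.close()

print(collision_sites)  # [(0, 0, 0, 3)]
```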


We should get a result after the second tick:

In [7]:
def tick(state_df):
    '''
    Horribly use the mutability of a dataframe in a function
    to update state_df to the state after one tick.
    '''
    # First update the velocities:
    state_df['vx']=state_df['vx']+state_df['ax']
    state_df['vy']=state_df['vy']+state_df['ay']
    state_df['vz']=state_df['vz']+state_df['az']
    
    # And then update the positions:
    state_df['px']=state_df['px']+state_df['vx']
    state_df['py']=state_df['py']+state_df['vy']
    state_df['pz']=state_df['pz']+state_df['vz']
    

In [8]:
state_df=testState_df.copy()

print(state_df)

pysqldf('''
        SELECT t.px, t.py, t.pz, number_of_particles
        FROM (SELECT px, py, pz, COUNT(particle_index) AS number_of_particles
              FROM state_df
              GROUP BY px, py, pz) AS t
        WHERE number_of_particles > 1
        ''')

   px  py  pz  vx  vy  vz  ax  ay  az  particle_index
0  -6   0   0   3   0   0   0   0   0               0
1  -4   0   0   2   0   0   0   0   0               1
2  -2   0   0   1   0   0   0   0   0               2
3   3   0   0  -1   0   0   0   0   0               3


Unnamed: 0,t.px,t.py,t.pz,number_of_particles


In [9]:
tick(state_df)

print(state_df)

pysqldf('''
        SELECT t.px, t.py, t.pz, number_of_particles
        FROM (SELECT px, py, pz, COUNT(particle_index) AS number_of_particles
              FROM state_df
              GROUP BY px, py, pz) AS t
        WHERE number_of_particles > 1
        ''')

   px  py  pz  vx  vy  vz  ax  ay  az  particle_index
0  -3   0   0   3   0   0   0   0   0               0
1  -2   0   0   2   0   0   0   0   0               1
2  -1   0   0   1   0   0   0   0   0               2
3   2   0   0  -1   0   0   0   0   0               3


Unnamed: 0,t.px,t.py,t.pz,number_of_particles


In [10]:
tick(state_df)

print(state_df)

pysqldf('''
        SELECT t.px, t.py, t.pz, number_of_particles
        FROM (SELECT px, py, pz, COUNT(particle_index) AS number_of_particles
              FROM state_df
              GROUP BY px, py, pz) AS t
        WHERE number_of_particles > 1
        ''')

   px  py  pz  vx  vy  vz  ax  ay  az  particle_index
0   0   0   0   3   0   0   0   0   0               0
1   0   0   0   2   0   0   0   0   0               1
2   0   0   0   1   0   0   0   0   0               2
3   1   0   0  -1   0   0   0   0   0               3


Unnamed: 0,t.px,t.py,t.pz,number_of_particles
0,0,0,0,3
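
The positions can also be cross-checked without stepping. Because velocity is updated before position, the position on one axis after t ticks is p + v*t + a*t*(t+1)/2. The sketch below compares that closed form with a plain loop on the x coordinates of the test particles. `position_after` and `simulate` are hypothetical helper names, not part of the notebook.

```python
def position_after(p, v, a, t):
    """Closed-form position on one axis after t ticks.

    Assumes the update order used in tick(): velocity first, then position.
    t*(t+1) is always even, so the integer division is exact.
    """
    return p + v * t + a * t * (t + 1) // 2

def simulate(p, v, a, t):
    """Step-by-step reference implementation for comparison."""
    for _ in range(t):
        v += a
        p += v
    return p

# (px, vx, ax) for the four test particles.
test_particles = [(-6, 3, 0), (-4, 2, 0), (-2, 1, 0), (3, -1, 0)]
positions_at_2 = [position_after(p, v, a, 2) for (p, v, a) in test_particles]
print(positions_at_2)  # [0, 0, 0, 1]
```

The first three particles share x = 0 after two ticks, matching the query result above.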


We can extend the query so that it gives just the particles which are in a collision:

In [11]:
pysqldf('''
        SELECT *
        FROM state_df AS s JOIN (SELECT px, py, pz, COUNT(particle_index) AS number_of_particles
                                 FROM state_df
                                 GROUP BY px, py, pz) AS t
        WHERE s.px=t.px
                AND s.py=t.py
                AND s.pz=t.pz
                AND number_of_particles>1
        ''')

Unnamed: 0,px,py,pz,vx,vy,vz,ax,ay,az,particle_index,px.1,py.1,pz.1,number_of_particles
0,0,0,0,3,0,0,0,0,0,0,0,0,0,3
1,0,0,0,2,0,0,0,0,0,1,0,0,0,3
2,0,0,0,1,0,0,0,0,0,2,0,0,0,3


and from that we can get the necessary indices:

In [12]:
pysqldf('''
        SELECT particle_index
        FROM state_df AS s JOIN (SELECT px, py, pz, COUNT(particle_index) AS number_of_particles
                                 FROM state_df
                                 GROUP BY px, py, pz) AS t
        WHERE s.px=t.px
                AND s.py=t.py
                AND s.pz=t.pz
                AND number_of_particles>1
        ''')

Unnamed: 0,particle_index
0,0
1,1
2,2
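
For intuition, what the `GROUP BY` plus join is doing can be sketched in plain Python: bucket the particle indices by position, then keep every index that sits in a bucket of more than one. The dictionary `positions` below is made-up sample data mirroring the state after the second tick.

```python
from collections import defaultdict

# particle_index -> (px, py, pz); sample data only.
positions = {0: (0, 0, 0), 1: (0, 0, 0), 2: (0, 0, 0), 3: (1, 0, 0)}

# Group particle indices by the position they occupy.
particles_at = defaultdict(list)
for particle_index, position in positions.items():
    particles_at[position].append(particle_index)

# Every particle sharing a position with at least one other has collided.
colliding_indices = sorted(
    idx
    for group in particles_at.values() if len(group) > 1
    for idx in group
)
print(colliding_indices)  # [0, 1, 2]
```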


Great, that seems to work. So now we can extend the `tick` function so that it also removes all the particles that are involved in collisions:

(Note: `pandasql.sqldf` needs `locals()` as its second argument here so that it can see the function parameter `stateIn_df`.)

In [13]:
def tick_with_collision_detection(stateIn_df):
    '''
    Horribly use the mutability of a dataframe in a function
    to update stateIn_df to the state after one tick, and then
    drop every particle that shares its position with another.
    '''
    # First update the velocities:
    stateIn_df['vx']=stateIn_df['vx']+stateIn_df['ax']
    stateIn_df['vy']=stateIn_df['vy']+stateIn_df['ay']
    stateIn_df['vz']=stateIn_df['vz']+stateIn_df['az']
    
    # And then update the positions:
    stateIn_df['px']=stateIn_df['px']+stateIn_df['vx']
    stateIn_df['py']=stateIn_df['py']+stateIn_df['vy']
    stateIn_df['pz']=stateIn_df['pz']+stateIn_df['vz']
    
    # Now find the indices of all the particles which are
    # involved in collisions:
    collisions_df=pandasql.sqldf('''
        SELECT *
        FROM stateIn_df AS s JOIN (SELECT px, py, pz, COUNT(particle_index) AS number_of_particles
                                   FROM stateIn_df
                                   GROUP BY px, py, pz) AS t
        WHERE s.px=t.px
                AND s.py=t.py
                AND s.pz=t.pz
                AND number_of_particles>1
        ''', locals())
    collidingParticles_ls=list(collisions_df['particle_index'])

    # And remove the rows corresponding to those particles
    # (drop accepts a list of index labels, so no loop is needed):
    stateIn_df.drop(collidingParticles_ls, axis=0, inplace=True)
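
If the SQL round trip on every tick turns out to be slow, one possible alternative is to stay in pandas: `DataFrame.duplicated` with `keep=False` marks every row whose position occurs more than once. The sketch below is an assumption about how that might look, not the approach used in this notebook. `tick_and_remove_collisions` is a hypothetical name, and it returns a new dataframe instead of mutating its argument.

```python
import pandas as pd

def tick_and_remove_collisions(state):
    """Return a new dataframe holding the survivors of one tick."""
    out = state.copy()
    for axis in 'xyz':
        # Velocity first, then position, as in tick().
        out['v' + axis] = out['v' + axis] + out['a' + axis]
        out['p' + axis] = out['p' + axis] + out['v' + axis]
    # keep=False flags every member of a duplicated group, not just the repeats.
    collided = out.duplicated(subset=['px', 'py', 'pz'], keep=False)
    return out[~collided]

columns = ['px', 'py', 'pz', 'vx', 'vy', 'vz', 'ax', 'ay', 'az']
test_state = pd.DataFrame(
    [[-6, 0, 0, 3, 0, 0, 0, 0, 0],
     [-4, 0, 0, 2, 0, 0, 0, 0, 0],
     [-2, 0, 0, 1, 0, 0, 0, 0, 0],
     [3, 0, 0, -1, 0, 0, 0, 0, 0]],
    columns=columns)

after_one = tick_and_remove_collisions(test_state)
after_two = tick_and_remove_collisions(after_one)
print(after_two.index.tolist())  # [3]
```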

Now we should find that at the appropriate point, the colliding particles are removed from the dataframe:

In [14]:
state_df=testState_df.copy()

state_df

Unnamed: 0,px,py,pz,vx,vy,vz,ax,ay,az,particle_index
0,-6,0,0,3,0,0,0,0,0,0
1,-4,0,0,2,0,0,0,0,0,1
2,-2,0,0,1,0,0,0,0,0,2
3,3,0,0,-1,0,0,0,0,0,3


In [15]:
tick_with_collision_detection(state_df)

state_df

Unnamed: 0,px,py,pz,vx,vy,vz,ax,ay,az,particle_index
0,-3,0,0,3,0,0,0,0,0,0
1,-2,0,0,2,0,0,0,0,0,1
2,-1,0,0,1,0,0,0,0,0,2
3,2,0,0,-1,0,0,0,0,0,3


In [16]:
tick_with_collision_detection(state_df)

state_df

Unnamed: 0,px,py,pz,vx,vy,vz,ax,ay,az,particle_index
3,1,0,0,-1,0,0,0,0,0,3


In [17]:
tick_with_collision_detection(state_df)

state_df

Unnamed: 0,px,py,pz,vx,vy,vz,ax,ay,az,particle_index
3,0,0,0,-1,0,0,0,0,0,3


Cool, that seems to be working. Now try it on the puzzle input. As before, just keep trying until it seems to have settled down...

In [18]:
with open('data/day20.txt') as fIn:
    puzzleInput_str=fIn.read()

state_ls=[[] for i in range(9)]

for nl in puzzleInput_str.strip().split('\n'):
    # Raw string, so that \d is not treated as a (deprecated) string escape:
    values_ls=re.findall(r'-?\d+', nl)
    assert len(values_ls)==9
    for (i,v) in enumerate(values_ls):
        state_ls[i].append(int(v))

puzzleState_df=pd.DataFrame(state_ls)
puzzleState_df=puzzleState_df.T
puzzleState_df.columns=['px', 'py', 'pz', 'vx', 'vy', 'vz', 'ax', 'ay', 'az']
puzzleState_df['particle_index']=puzzleState_df.index
puzzleState_df.head()

Unnamed: 0,px,py,pz,vx,vy,vz,ax,ay,az,particle_index
0,-717,-4557,2578,153,21,30,-8,8,-7,0
1,1639,651,-987,29,-19,129,-5,0,-6,1
2,-10482,-248,-491,4,10,81,21,0,-4,2
3,-6607,-2542,1338,-9,52,-106,14,2,4,3
4,-4468,1178,-6474,146,44,66,0,-5,9,4


In [19]:
state_df=puzzleState_df.copy()

totalIterations_i=10000
countTally_idx=len(puzzleState_df)

for i in range(totalIterations_i):

    tick_with_collision_detection(state_df)
    
    if len(state_df)<countTally_idx:
        countTally_idx=len(state_df)
        print('{}\t{}'.format(i, countTally_idx))


9	974
10	947
11	942
13	927
14	915
15	904
16	882
17	848
18	821
19	805
20	776
21	761
22	744
23	715
24	696
25	674
26	657
27	624
28	604
29	597
30	569
31	553
32	545
33	530
34	470
35	445
36	438
37	434
38	404


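The count stops dropping at 404 (iteration 38) and nothing further is printed for the rest of the 10,000 iterations, so I think 404 is probably the right answer...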
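
Instead of fixing the number of iterations up front, "settled down" could be made explicit by stopping once the count has been unchanged for some number of consecutive ticks. The sketch below is a heuristic only: a long quiet stretch does not prove that no later collision can happen. `run_until_settled`, `patience` and `max_ticks` are hypothetical names with arbitrarily chosen defaults.

```python
def run_until_settled(step, count, patience=1000, max_ticks=100000):
    """Call step() until count() is unchanged for `patience` ticks in a row.

    Returns (ticks_run, final_count). max_ticks is a safety cap.
    """
    last = count()
    quiet = 0
    ticks = 0
    while quiet < patience and ticks < max_ticks:
        step()
        ticks += 1
        current = count()
        if current == last:
            quiet += 1
        else:
            last = current
            quiet = 0
    return ticks, last

# Toy stand-in for the simulation: the count falls from 5 to 2, then stays put.
remaining = [5]

def toy_step():
    if remaining[0] > 2:
        remaining[0] -= 1

ticks, final = run_until_settled(toy_step, lambda: remaining[0], patience=10)
print(ticks, final)  # 13 2
```

With the notebook's own function, the call might look like `run_until_settled(lambda: tick_with_collision_detection(state_df), lambda: len(state_df))`.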