https://adventofcode.com/2021/day/14

<article class="day-desc"><h2>--- Day 14: Extended Polymerization ---</h2><p>The incredible pressures at this depth are starting to put a strain on your submarine. The submarine has <a href="https://en.wikipedia.org/wiki/Polymerization" target="_blank">polymerization</a> equipment that would produce suitable materials to reinforce the submarine, and the nearby volcanically-active caves should even have the necessary input elements in sufficient quantities.</p>
<p>The submarine manual contains <span title="HO
HO -> OH">instructions</span> for finding the optimal polymer formula; specifically, it offers a <em>polymer template</em> and a list of <em>pair insertion</em> rules (your puzzle input). You just need to work out what polymer would result after repeating the pair insertion process a few times.</p>
<p>For example:</p>
<pre><code>NNCB

CH -&gt; B
HH -&gt; N
CB -&gt; H
NH -&gt; C
HB -&gt; C
HC -&gt; B
HN -&gt; C
NN -&gt; C
BH -&gt; H
NC -&gt; B
NB -&gt; B
BN -&gt; B
BB -&gt; N
BC -&gt; B
CC -&gt; N
CN -&gt; C
</code></pre>
<p>The first line is the <em>polymer template</em> - this is the starting point of the process.</p>
<p>The following section defines the <em>pair insertion</em> rules. A rule like <code>AB -&gt; C</code> means that when elements <code>A</code> and <code>B</code> are immediately adjacent, element <code>C</code> should be inserted between them. These insertions all happen simultaneously.</p>
<p>So, starting with the polymer template <code>NNCB</code>, the first step simultaneously considers all three pairs:</p>
<ul>
<li>The first pair (<code>NN</code>) matches the rule <code>NN -&gt; C</code>, so element <code><em>C</em></code> is inserted between the first <code>N</code> and the second <code>N</code>.</li>
<li>The second pair (<code>NC</code>) matches the rule <code>NC -&gt; B</code>, so element <code><em>B</em></code> is inserted between the <code>N</code> and the <code>C</code>.</li>
<li>The third pair (<code>CB</code>) matches the rule <code>CB -&gt; H</code>, so element <code><em>H</em></code> is inserted between the <code>C</code> and the <code>B</code>.</li>
</ul>
<p>Note that these pairs overlap: the second element of one pair is the first element of the next pair. Also, because all pairs are considered simultaneously, inserted elements are not considered to be part of a pair until the next step.</p>
<p>After the first step of this process, the polymer becomes <code>N<em>C</em>N<em>B</em>C<em>H</em>B</code>.</p>
<p>Here are the results of a few steps using the above rules:</p>
<pre><code>Template:     NNCB
After step 1: NCNBCHB
After step 2: NBCCNBBBCBHCB
After step 3: NBBBCNCCNBBNBNBBCHBHHBCHB
After step 4: NBBNBNBBCCNBCNCCNBBNBBNBBBNBBNBBCBHCBHHNHCBBCBHCB
</code></pre>
<p>This polymer grows quickly. After step 5, it has length 97; after step 10, it has length 3073. After step 10, <code>B</code> occurs 1749 times, <code>C</code> occurs 298 times, <code>H</code> occurs 161 times, and <code>N</code> occurs 865 times; taking the quantity of the most common element (<code>B</code>, 1749) and subtracting the quantity of the least common element (<code>H</code>, 161) produces <code>1749 - 161 = <em>1588</em></code>.</p>
<p>Apply 10 steps of pair insertion to the polymer template and find the most and least common elements in the result. <em>What do you get if you take the quantity of the most common element and subtract the quantity of the least common element?</em></p>
</article>
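
The insertion step described above can be sketched directly on strings. This is a minimal illustration rather than the approach used in the cells below; the names `parse`, `step`, and `EXAMPLE` are chosen here for illustration, and the input is the example from the puzzle text.

```python
from collections import Counter

EXAMPLE = """NNCB

CH -> B
HH -> N
CB -> H
NH -> C
HB -> C
HC -> B
HN -> C
NN -> C
BH -> H
NC -> B
NB -> B
BN -> B
BB -> N
BC -> B
CC -> N
CN -> C
"""

def parse(text):
    """Split puzzle text into the template and a {pair: inserted element} dict."""
    template, _, rule_block = text.strip().partition("\n\n")
    rules = dict(line.split(" -> ") for line in rule_block.splitlines())
    return template, rules

def step(polymer, rules):
    """Insert the rule's element between every adjacent pair, all at once."""
    out = [polymer[0]]
    for a, b in zip(polymer, polymer[1:]):
        out.append(rules.get(a + b, ""))  # a pair without a rule inserts nothing
        out.append(b)
    return "".join(out)

template, rules = parse(EXAMPLE)
polymer = template
for _ in range(10):
    polymer = step(polymer, rules)

counts = Counter(polymer)
print(len(polymer))                                  # 3073
print(max(counts.values()) - min(counts.values()))   # 1588
```

When every pair has a rule, each step turns a polymer of length n into one of length 2n - 1, so building the string is practical for 10 steps but not for many more.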

In [4]:
import pandas as pd
import numpy as np

In [5]:
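# "->" is a multi-character separator, so the Python parsing engine is requested explicitly
df = pd.read_csv("polimero.txt", sep="->", header=None, names=["p1", "p2"], engine="python")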
df.head()


,p1,p2
0,NCOPHKVONVPNSKSHBNPF,
1,ON,C
2,CK,H
3,HC,B
4,NP,S


In [6]:
df["p1"] = df["p1"].str.strip()
df["p2"] = df["p2"].str.strip()
df.head()

,p1,p2
0,NCOPHKVONVPNSKSHBNPF,
1,ON,C
2,CK,H
3,HC,B
4,NP,S


In [7]:
# Row 0 holds the polymer template; every later row is a rule "AB -> C"
reglas = dict(zip(df["p1"].iloc[1:], df["p2"].iloc[1:]))

In [8]:
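# Distinct inserted elements; the first entry is the NaN from the template row,
# so lista_[1:] is used everywhere as the list of elements
lista_ = df["p2"].unique()
data = {e: np.zeros(len(lista_[1:]), dtype=int) for e in lista_[1:]}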
data

{'C': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'H': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'B': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'S': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'V': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'P': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'O': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'F': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'K': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
 'N': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])}

In [9]:
formula = df.iloc[0]["p1"]
formula

'NCOPHKVONVPNSKSHBNPF'

In [10]:
# Square pair-count matrix, one row and one column per element, all zeros to start
matriz = pd.DataFrame(data, columns = lista_[1:], index = lista_[1:])
for i in lista_[1:]:
    matriz[i] = matriz[i].astype('int64')

In [11]:
# Count each adjacent pair of the template (row = first element, column = second)
for l1, l2 in zip(formula, formula[1:]):
    matriz.loc[l1, l2] += 1
matriz

,C,H,B,S,V,P,O,F,K,N
C,0,0,0,0,0,0,1,0,0,0
H,0,0,1,0,0,0,0,0,1,0
B,0,0,0,0,0,0,0,0,0,1
S,0,1,0,0,0,0,0,0,1,0
V,0,0,0,0,0,1,1,0,0,0
P,0,1,0,0,0,0,0,1,0,1
O,0,0,0,0,0,1,0,0,0,1
F,0,0,0,0,0,0,0,0,0,0
K,0,0,0,1,1,0,0,0,0,0
N,1,0,0,1,1,1,0,0,0,0


In [14]:
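def armandoPolimero(matriz, lista_, num_iteraciones):
    """Apply num_iteraciones insertion steps to the pair-count matrix.

    matriz.loc[a, b] is how many times the pair "ab" occurs.
    Relies on the global rule dictionary `reglas`.
    """
    elementos = lista_[1:]  # skip the leading NaN
    for _ in range(num_iteraciones):
        # Fresh zero matrix that will hold the pair counts after this step
        m1 = pd.DataFrame(0, index=elementos, columns=elementos, dtype='int64')
        for fila in elementos:
            for columna in elementos:
                cantidad = matriz.loc[fila, columna]
                if cantidad > 0:
                    # Rule "fila+columna -> valor" replaces every such pair with
                    # the two pairs (fila, valor) and (valor, columna)
                    valor = reglas[fila+columna]
                    m1.loc[fila, valor] += cantidad
                    m1.loc[valor, columna] += cantidad
        matriz = m1
    return matriz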

In [15]:
matriz = armandoPolimero(matriz, lista_, 10)
matriz

,C,H,B,S,V,P,O,F,K,N
C,557,349,445,547,429,0,416,40,413,83
H,356,296,440,0,0,0,0,0,203,532
B,240,228,236,119,482,0,97,46,247,134
S,1089,388,224,1025,204,441,0,0,119,279
V,19,542,46,589,202,307,393,83,0,96
P,116,0,0,889,147,144,115,73,260,0
O,204,0,0,180,49,320,225,83,173,163
F,0,0,0,0,359,0,0,179,0,0
K,674,0,1,147,309,147,102,35,93,0
N,24,24,437,273,96,385,49,0,0,0


In [16]:
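# Column sums count each element as the second member of a pair,
# so the first template element (formula[0]) comes out one short
frecuencia = matriz.sum()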
frecuencia

C    3279
H    1827
B    1829
S    3769
V    2277
P    1744
O    1397
F     539
K    1508
N    1287
dtype: int64
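
Summing the columns of the pair matrix counts each element once for every pair it ends, which covers every position of the polymer except the first. A small sketch of that bookkeeping in plain Python, checked against a short polymer from the puzzle example (the helper `counts_from_pairs` is a name chosen here, not part of the notebook):

```python
from collections import Counter

def counts_from_pairs(pairs, first):
    """Element counts of a polymer, given its pair counts and its first element."""
    counts = Counter()
    for (a, b), n in pairs.items():
        counts[b] += n   # every element except the first ends exactly one pair
    counts[first] += 1   # the first element ends no pair, so add it by hand
    return counts

polymer = "NCNBCHB"                          # the example polymer after step 1
pairs = Counter(zip(polymer, polymer[1:]))   # keys are (first, second) tuples
assert counts_from_pairs(pairs, polymer[0]) == Counter(polymer)
```

For this input the template starts with `N`, which is neither the most nor the least common element in the table above, so the difference computed next is unchanged by the missing one.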

In [17]:
l_ordenada = list(frecuencia)
l_ordenada.sort()
l_ordenada

[539, 1287, 1397, 1508, 1744, 1827, 1829, 2277, 3279, 3769]

In [18]:
l_ordenada[-1] - l_ordenada[0]

3230

<article class="day-desc"><h2 id="part2">--- Part Two ---</h2><p>The resulting polymer isn't nearly strong enough to reinforce the submarine. You'll need to run more steps of the pair insertion process; a total of <em>40 steps</em> should do it.</p>
<p>In the above example, the most common element is <code>B</code> (occurring <code>2192039569602</code> times) and the least common element is <code>H</code> (occurring <code>3849876073</code> times); subtracting these produces <code><em>2188189693529</em></code>.</p>
<p>Apply <em>40</em> steps of pair insertion to the polymer template and find the most and least common elements in the result. <em>What do you get if you take the quantity of the most common element and subtract the quantity of the least common element?</em></p>
</article>
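
Forty steps make the polymer far too long to build as a string, but each rule depends only on a pair, so keeping a count per pair is enough. A minimal sketch of that idea with `collections.Counter`, using the puzzle's example input; `solve`, `RULES`, and the compact rule encoding are choices made here for illustration, and the notebook's own cells below do the equivalent bookkeeping with a pandas matrix.

```python
from collections import Counter

# Example rules from the puzzle text, written compactly as "PAIR>ELEMENT"
RULES = dict(
    r.split(">")
    for r in ("CH>B HH>N CB>H NH>C HB>C HC>B HN>C NN>C "
              "BH>H NC>B NB>B BN>B BB>N BC>B CC>N CN>C").split()
)

def solve(template, rules, steps):
    """Most common minus least common element count after `steps` insertions."""
    pairs = Counter(a + b for a, b in zip(template, template[1:]))
    elements = Counter(template)
    for _ in range(steps):
        new_pairs = Counter()
        for pair, n in pairs.items():
            inserted = rules.get(pair)
            if inserted is None:
                new_pairs[pair] += n   # no rule: the pair survives unchanged
                continue
            # "AB -> C" turns each AB into AC and CB and adds one C per AB
            new_pairs[pair[0] + inserted] += n
            new_pairs[inserted + pair[1]] += n
            elements[inserted] += n
        pairs = new_pairs
    return max(elements.values()) - min(elements.values())

print(solve("NNCB", RULES, 10))   # 1588
print(solve("NNCB", RULES, 40))   # 2188189693529
```

Tracking element counts alongside the pair counts avoids having to reconstruct them from overlapping pairs at the end.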

In [21]:
# Rebuild the initial pair-count matrix from the template; the Part One run overwrote matriz
matriz = pd.DataFrame(data, columns = lista_[1:], index = lista_[1:])
for i in lista_[1:]:
    matriz[i] = matriz[i].astype('int64')
for l1, l2 in zip(formula, formula[1:]):
    matriz.loc[l1, l2] += 1
matriz

,C,H,B,S,V,P,O,F,K,N
C,0,0,0,0,0,0,1,0,0,0
H,0,0,1,0,0,0,0,0,1,0
B,0,0,0,0,0,0,0,0,0,1
S,0,1,0,0,0,0,0,0,1,0
V,0,0,0,0,0,1,1,0,0,0
P,0,1,0,0,0,0,0,1,0,1
O,0,0,0,0,0,1,0,0,0,1
F,0,0,0,0,0,0,0,0,0,0
K,0,0,0,1,1,0,0,0,0,0
N,1,0,0,1,1,1,0,0,0,0


In [22]:
matriz = armandoPolimero(matriz, lista_, 40)
matriz

,C,H,B,S,V,P,O,F,K,N
C,761017306095,440784860311,602195320947,656117427662,548413494001,0,271379225745,47335977706,600552307312,61344406854
H,432469861990,281270381406,425668075559,0,0,0,0,0,300170445990,535212165954
B,256312754219,234489021732,212751307016,140266777248,561250904667,0,80250490269,53962871195,280582997298,106474707368
S,1418051568696,398992670277,206203341771,1250755532520,199477879660,364002400483,0,0,140266777248,212800595456
V,26969773077,591848627719,53962871195,707546466430,219372353848,353740798602,438779970676,107967522246,0,109676317290
P,64745791266,1,0,921822017241,161548729971,70526550954,75702992069,80762372565,140830019623,0
O,129338941860,0,0,138534148298,54832646109,276925446430,135827332702,94757425572,189069670247,122589415880
F,0,0,0,0,432152841131,0,0,216009710450,0,0
K,872821093560,0,0,161493818405,323139534406,161548729971,85102369528,47366671848,94486493116,0
N,27413235870,27405369453,425560914524,214014578307,109676317290,289194547250,54832646109,0,0,0


In [23]:
frecuencia = matriz.sum()
frecuencia

C    3989140326633
H    1974790930899
B    1926341831012
S    4190550766111
V    2609864701083
P    1515938473690
O    1141875027098
F     648162551582
K    1745958710834
N    1148097608802
dtype: int64

In [24]:
l_ordenada = list(frecuencia)
l_ordenada.sort()
l_ordenada

[648162551582,
 1141875027098,
 1148097608802,
 1515938473690,
 1745958710834,
 1926341831012,
 1974790930899,
 2609864701083,
 3989140326633,
 4190550766111]

In [25]:
l_ordenada[-1] - l_ordenada[0]

3542388214529