## Day 7

https://adventofcode.com/2021/day/7

In [1]:
hpos0 = [16,1,2,0,4,2,7,1,2,14]

filename = "data/input07.txt"
with open(filename) as f:
    hpos = [ int(h) for h in f.readlines()[0].split(",") ]

### Brute force search

In [2]:
def part1(hpos):
    return min([ sum([ abs(h0-h) for h0 in hpos ]) for h in range(min(hpos),max(hpos)+1) ] )

def part2(hpos):
    return min([ sum([ ((abs(h0-h)+1)*abs(h0-h))//2 for h0 in hpos ]) for h in range(min(hpos),max(hpos)+1) ] )

print("Test 1:",part1(hpos0))
print("Part 1:",part1(hpos))
print("Test 2:",part2(hpos0))
print("Part 2:",part2(hpos))

Test 1: 37
Part 1: 344138
Test 2: 168
Part 2: 94862124


### Slightly better approach using sample statistical properties 

Reduce the brute-force search to the interval within one sample standard deviation (+/- 1 sigma) of the sample mean.

In [3]:
import statistics as stat

def part1_1sigma(hpos):
    mean, stdev = stat.mean(hpos), stat.stdev(hpos)
    return min([ sum([ abs(h0-h) for h0 in hpos ]) for h in range(int(mean-stdev),int(mean+stdev)) ] )

def fuel2(h1,h2):
    dh = abs(h1-h2)
    return ((dh+1)*dh)//2

def part2_1sigma(hpos):
    mean, stdev = stat.mean(hpos), stat.stdev(hpos)
    return min([ sum([ fuel2(h0,h) for h0 in hpos ]) for h in range(int(mean-stdev),int(mean+stdev)) ] )
    
print("Test 1:",part1_1sigma(hpos0))
print("Part 1:",part1_1sigma(hpos))
print("Test 2:",part2_1sigma(hpos0))
print("Part 2:",part2_1sigma(hpos))

Test 1: 37
Part 1: 344138
Test 2: 168
Part 2: 94862124
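
The restricted search only works if the optimum actually falls inside the one-sigma window. The sketch below checks this on the puzzle's example positions by comparing the window with the position found by a full scan. The helper `brute_force_argmin` is a hypothetical name introduced for this check and is not part of the notebook; the check is only evidence for this one sample, not a proof for arbitrary input.

```python
import statistics as stat

def brute_force_argmin(hpos, cost):
    """Return the integer position with the lowest total cost (full scan)."""
    candidates = range(min(hpos), max(hpos) + 1)
    return min(candidates, key=lambda h: sum(cost(abs(h0 - h)) for h0 in hpos))

example = [16, 1, 2, 0, 4, 2, 7, 1, 2, 14]  # the puzzle's example positions
mean, stdev = stat.mean(example), stat.stdev(example)
window = range(int(mean - stdev), int(mean + stdev))  # same window as the cell above

best1 = brute_force_argmin(example, lambda d: d)                 # Part 1 cost
best2 = brute_force_argmin(example, lambda d: d * (d + 1) // 2)  # Part 2 cost
print("window:", window, "Part 1 optimum:", best1, "Part 2 optimum:", best2)
print("both inside window:", best1 in window and best2 in window)
```

Note that `range` excludes its upper bound and `int()` truncates toward zero, so the window is slightly narrower than the nominal interval at its upper end.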


### Full use of statistical properties

Statistics can in principle be used to avoid brute force altogether. For Part 1, where the fuel cost is linear in the distance, the minimiser is the median; for Part 2, where the cost is roughly quadratic, the minimiser lies close to the mean. The annoying aspect of this approach is that neither the median nor the mean is necessarily an integer, while the respective costs need to be computed with respect to an integer reference position. The correct solutions can be found by rounding the median and the mean "by hand" to a nearby integer...

In [4]:
import statistics as stat
from math import floor, ceil

def part1stat(hpos):
    median = int(stat.median(hpos))
    return int(sum([ abs(h-median) for h in hpos ]))

def part2stat(hpos):
    mean = int(stat.mean(hpos)+0.4) # ad hoc rounding: the integer optimum is floor(mean) or ceil(mean), but which one depends on the input
    return int(sum([ fuel2(h,mean) for h in hpos ]))
    
print("Test 1:",part1stat(hpos0))
print("Part 1:",part1stat(hpos))
print("Test 2:",part2stat(hpos0))
print("Part 2:",part2stat(hpos))

Test 1: 37
Part 1: 344138
Test 2: 168
Part 2: 94862124
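
One way to avoid the hand-tuned `+0.4` offset is to evaluate the cost at both integers adjacent to the mean and keep the lower one. The sketch below assumes that the integer optimum of the triangular cost is always `floor(mean)` or `ceil(mean)`; `triangular` and `part2_floor_ceil` are hypothetical names introduced here, not part of the notebook.

```python
import statistics as stat
from math import floor, ceil

def triangular(d):
    """Fuel for moving d steps in Part 2: 1 + 2 + ... + d."""
    return d * (d + 1) // 2

def part2_floor_ceil(hpos):
    """Evaluate the Part 2 cost at floor(mean) and ceil(mean), keep the lower."""
    mean = stat.mean(hpos)
    candidates = {floor(mean), ceil(mean)}  # a single candidate if mean is an integer
    return min(sum(triangular(abs(h - c)) for h in hpos) for c in candidates)

example = [16, 1, 2, 0, 4, 2, 7, 1, 2, 14]  # the puzzle's example positions
print("Test 2:", part2_floor_ceil(example))
```

On the example positions the mean is 4.9, so the candidates are 4 and 5, and the lower cost (168, at position 5) matches the brute-force result. The price is two cost evaluations instead of one.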


#### Why does the best minimiser not exactly work?

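Indeed, if the triangular-number formula is treated as a _real_ function of the position (which is already a stretch), the best minimiser for Part 2 is not the mean itself but a point within 0.5 of it, with the offset depending on how many crabs lie on each side. This exact result is generally not an integer, so the cost evaluated there is not the _integer_ fuel cost requested by the problem! The cells below evaluate the cost at the non-integer reference mean + 0.5 and compare it with the puzzle answer.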
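
The size of the offset from the mean can be made explicit with a short derivation. This is a sketch under the assumption that the cost is extended to real positions $x$, with $a_1, \dots, a_n$ the crab positions, $\bar{a}$ their mean, and $x$ not equal to any $a_i$ (where the cost is not differentiable):

$$
f(x) = \sum_{i=1}^{n} \frac{|x - a_i|\,\bigl(|x - a_i| + 1\bigr)}{2},
\qquad
f'(x) = \sum_{i=1}^{n} (x - a_i) + \frac{1}{2} \sum_{i=1}^{n} \operatorname{sgn}(x - a_i).
$$

Setting $f'(x^*) = 0$, and writing $L$ and $R$ for the number of positions to the left and to the right of $x^*$:

$$
x^* = \bar{a} - \frac{L - R}{2n},
\qquad
|x^* - \bar{a}| \le \frac{1}{2} \quad \text{since } |L - R| \le n.
$$

The shift reaches 0.5 only in the extreme case where all positions lie on one side, so a fixed offset of exactly 0.5 in either direction is an upper bound rather than the typical value.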

In [26]:
def fuel2R(h1,h2):
    dh = abs(h1-h2)
    return ((dh+1.)*dh)/2.

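def part2statR(hpos):
    mean = stat.mean(hpos)+0.5
    # NOTE: this calls the integer helper fuel2, not fuel2R; with a float
    # reference position the // operator floors each term, which is how the
    # output below was produced
    return sum([ fuel2(h,mean) for h in hpos ])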

In [27]:
p2I = part2stat(hpos)
p2R = part2statR(hpos)
print("Puzzle result = {}, true minimum = {}, difference = {}".format(p2I,p2R,p2I-p2R))

Puzzle result = 94862124, true minimum = 94861682.0, difference = 442.0
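
To see where the real-valued optimum actually sits, the cost can be minimised numerically over real positions. The sketch below is one possible approach, not the notebook's method: it uses a ternary search, which assumes the real-valued cost is convex, and runs on the puzzle's example positions rather than the full input. `real_cost` and `real_minimiser` are hypothetical names introduced here.

```python
import statistics as stat

def real_cost(x, hpos):
    """Triangular-number cost extended to a real reference position x."""
    return sum(abs(h - x) * (abs(h - x) + 1) / 2 for h in hpos)

def real_minimiser(hpos, tol=1e-6):
    """Ternary search for the minimum of real_cost (assumed convex)."""
    lo, hi = float(min(hpos)), float(max(hpos))
    while hi - lo > tol:
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if real_cost(m1, hpos) < real_cost(m2, hpos):
            hi = m2  # the minimum cannot lie to the right of m2
        else:
            lo = m1  # the minimum cannot lie to the left of m1
    return (lo + hi) / 2

example = [16, 1, 2, 0, 4, 2, 7, 1, 2, 14]  # the puzzle's example positions
x_star = real_minimiser(example)
print("mean =", stat.mean(example))
print("real minimiser ~", round(x_star, 3), "cost ~", round(real_cost(x_star, example), 3))
```

For the example the search settles near 4.7, which is 0.2 below the mean of 4.9, with a cost of about 167.55. That is slightly lower than the integer optimum of 168, which illustrates why the real-valued minimum cannot be used directly as the puzzle answer.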
