## Advent of Code - Day 6

In [1]:
import tempfile
from contextlib import contextmanager

In [2]:
@contextmanager
def test_file(test_input):
    with tempfile.NamedTemporaryFile('r+') as f:
        f.write(test_input)
        f.seek(0)
        yield f

### Part 1

> Using only the Manhattan distance, determine the area around each coordinate by counting the number of integer X,Y locations that are closest to that coordinate (and aren't tied in distance to any other coordinate).

> Your goal is to find the size of the largest area that isn't infinite.

The first thing to notice is that we only need to consider locations within the space bounded by the extreme points. 

Let $n_x$ and $n_y$ be the length of the x and y dimensions respectively in this bounded space. The naive solution is to iterate over each of the $O(n_x n_y)$ possible locations in the space and find the closest point. Finding the closest point is $O(k)$ where $k$ is the total number of points (Manhattan distance between two points is $O(1)$). Hence the total complexity is $O(n_x n_y k)$.
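The per-location step of this naive approach can be sketched as a small helper. This is only an illustration, not the code used later in the notebook: the name `closest_index` is made up here, and `example_points` holds the puzzle's six example coordinates.

```python
def closest_index(location, points):
    """Return the index of the unique closest point, or None on a tie."""
    best_index, best_distance, tied = None, None, False
    for i, (px, py) in enumerate(points):
        # Manhattan distance is O(1) per pair, so this scan is O(k).
        distance = abs(location[0] - px) + abs(location[1] - py)
        if best_distance is None or distance < best_distance:
            best_index, best_distance, tied = i, distance, False
        elif distance == best_distance:
            tied = True
    return None if tied else best_index


example_points = [(1, 1), (1, 6), (8, 3), (3, 4), (5, 5), (8, 9)]
print(closest_index((0, 0), example_points))  # 0, i.e. the point (1, 1)
print(closest_index((5, 1), example_points))  # None: (1, 1) and (5, 5) tie
```

Running this scan once for every location in the bounded space gives the $O(n_x n_y k)$ total.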

Can we do better than $O(k)$ to find the nearest neighbor under Manhattan distance? Probably (using a k-d tree?), but for now let's write the naive solution, since one evening after work doesn't leave time for more.

We start by parsing the points from the `input` file:

In [30]:
!ls

day_6.ipynb  input


In [31]:
!head input

242, 164
275, 358
244, 318
301, 335
310, 234
159, 270
82, 142
229, 286
339, 256
305, 358


In [32]:
import re
from operator import itemgetter

In [33]:
def get_points(path):
    """Yields each point from the file at path."""
    # Raw string so that \d and \s are not treated as string escapes.
    pattern = re.compile(r'^(\d+), (\d+)\s?$')
    with open(path, 'r') as f:
        for line in f:
            match = pattern.match(line)
            if not match:
                raise ValueError('invalid input file: %r' % path)
            yield (tuple(int(c) for c in match.groups()))

Check that this works for the example input:

In [34]:
test_input = """1, 1
1, 6
8, 3
3, 4
5, 5
8, 9
"""

In [35]:
with test_file(test_input) as f:
    for p in get_points(f.name):
        print(p)

(1, 1)
(1, 6)
(8, 3)
(3, 4)
(5, 5)
(8, 9)


Now compute the space bounds:

In [37]:
def get_bounds(points):
    """Returns the min and max bounds for the space."""
    (x_min, y_min), (x_max, y_max) = (None, None), (None, None)
    for x, y in points:
        if x_min is None:
            (x_min, y_min), (x_max, y_max) = (x, y), (x, y)
        else:
            x_min = min(x, x_min)
            y_min = min(y, y_min)
            x_max = max(x, x_max)
            y_max = max(y, y_max)
    return (x_min, y_min), (x_max, y_max)

In [38]:
with test_file(test_input) as f:
    print(get_bounds(get_points(f.name)))

((1, 1), (8, 9))


Note that a point has infinite area when it is closest to some location on the edge of the bounded space, since its area then keeps extending outward from that location. So we'll need a function to determine if a given location is on the edge of the space:

In [39]:
def on_edge(location, bounds):
    """Returns True if the location is on the edge of the space."""
    x, y = location
    (x_min, y_min), (x_max, y_max) = bounds
    if (x == x_min or x == x_max) and y_min <= y <= y_max:
        return True
    if (y == y_min or y == y_max) and x_min <= x <= x_max:
        return True
    return False
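To see why edge locations are the ones to watch, here is a small check on the puzzle's example coordinates. It is a sketch for illustration only; the helper `manhattan_distances` is a name introduced here, not part of the solution.

```python
def manhattan_distances(location, points):
    """Return the Manhattan distance from location to each point."""
    x, y = location
    return [abs(x - px) + abs(y - py) for px, py in points]


example_points = [(1, 1), (1, 6), (8, 3), (3, 4), (5, 5), (8, 9)]

# (8, 5) lies on the right edge (x_max = 8) of the example's bounding box.
on_the_edge = manhattan_distances((8, 5), example_points)
# Step outward, away from the box, by one and then by ten units.
one_out = manhattan_distances((9, 5), example_points)
ten_out = manhattan_distances((18, 5), example_points)

# Every point has x <= 8, so each distance grows by exactly the step size.
print([b - a for a, b in zip(on_the_edge, one_out)])  # all 1
print([b - a for a, b in zip(on_the_edge, ten_out)])  # all 10
# The nearest point therefore never changes along this outward ray.
print(on_the_edge.index(min(on_the_edge)), ten_out.index(min(ten_out)))
```

Because all distances grow together, whichever point is closest at the edge stays closest forever along that ray, which is what makes its area infinite.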

Now to iterate over the entire space:

In [40]:
def locations(lower, upper):
    """Yields each possible location within the bounded space (inclusive)."""
    x_min, y_min = lower
    x_max, y_max = upper
    for x in range(x_min, x_max + 1):
        for y in range(y_min, y_max + 1):
            yield x, y

In [41]:
def manhattan(a, b):
    """Returns the Manhattan distance between a and b."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

OK! We have now bounded the space and can compute the locations we need to iterate over, whether a location is on the edge of the space, and the Manhattan distance between two points. Putting this all together for the naive $O(n_x n_y k)$ method:

In [42]:
def largest_area(path):
    """Returns the largest finite area for the points in the file at path."""
    points = list(get_points(path))
    bounds = get_bounds(points)
    areas = [0] * len(points)
    infinite = [False] * len(points)
    
    for location in locations(*bounds):
        closest = None
        equidistant = False
        for i, point in enumerate(points):
            distance = manhattan(location, point)
            if closest is None or distance < closest[1]:
                equidistant = False
                closest = (i, distance)
            elif distance == closest[1]:
                equidistant = True
                
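        if closest is not None and not equidistant:
            areas[closest[0]] += 1
            # Only a uniquely closest point owns an edge location; a tied
            # location belongs to nobody and must not mark a point infinite.
            if on_edge(location, bounds):
                infinite[closest[0]] = True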
            
    largest = None
    for point, area, inf in zip(points, areas, infinite):
        if inf:
            continue
        if largest is None or area > largest:
            largest = area
    
    return largest

In [43]:
with test_file(test_input) as f:
    print(largest_area(f.name))

17


Looks good! Now for the real input:

In [44]:
largest_area('input')

5532

### Part 2

We may need to enlarge our search space, since the total distance limit (10000) is large. Expanding the space by ~10000 in each direction would mean evaluating hundreds of millions of locations! That would take a while...

Instead we'll first compute the result for the original space. We'll then consider a band of locations one coordinate wide around it, and keep growing the space one band at a time until none of the new locations are within the required total distance. It is OK to stop at that point: outside the bounding box, every step outward increases the distance to every point, and hence the total distance.

There may be a more efficient solution, but it's late in the evening...
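One candidate for a faster approach, offered only as a sketch and not as the method used in this notebook: the total Manhattan distance splits into an x part and a y part, so each axis can be summed independently and the two combined afterwards. The function name `region_size_separable` and the margin calculation are assumptions introduced for this illustration.

```python
from bisect import bisect_left


def region_size_separable(points, total_distance):
    """Count locations whose summed Manhattan distance is < total_distance."""
    if not points:
        return 0
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Being d steps outside the bounding box costs at least len(points) * d,
    # so nothing further out than this margin can qualify.
    margin = total_distance // len(points)

    def axis_sums(values):
        """Sum of |c - v| over all values, for each candidate coordinate c."""
        lo, hi = min(values) - margin, max(values) + margin
        return [sum(abs(c - v) for v in values) for c in range(lo, hi + 1)]

    x_sums = axis_sums(xs)
    y_sums = sorted(axis_sums(ys))
    # For each column, count the rows whose y part keeps the total under the limit.
    return sum(bisect_left(y_sums, total_distance - sx) for sx in x_sums)


example_points = [(1, 1), (1, 6), (8, 3), (3, 4), (5, 5), (8, 9)]
print(region_size_separable(example_points, 32))  # 16, matching the puzzle's example
```

Under these assumptions the per-axis sums cost about $O((n_x + n_y) k)$ (with the margin added to each dimension), rather than visiting every location in the grid.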

In [45]:
def band_iter(lower, upper):
    """Yields values for the bounding box edge locations."""
    (x_min, y_min), (x_max, y_max) = lower, upper
    
    for x in range(x_min, x_max + 1):
        yield x, y_min
        yield x, y_max
    for y in range(y_min + 1, y_max):
        yield x_min, y
        yield x_max, y

def grow(lower, upper):
    """Returns the new bounds and a band iterator."""
    (x_min, y_min), (x_max, y_max) = lower, upper
    
    new_bounds = (x_min - 1, y_min - 1), (x_max + 1, y_max + 1)
            
    return new_bounds, band_iter(*new_bounds)

In [46]:
def region_size(points, locations, total_distance):
    """Returns number of locations with total distance to points < `total_distance`."""
    size = 0
    for location in locations:
        distance = 0
        for point in points:
            distance += manhattan(location, point)
            if distance >= total_distance:
                break
        else:
            size += 1
    return size

In [47]:
def region(path, total_distance):
    """Returns the number of locations whose total distance to all points is < `total_distance`."""
    points = list(get_points(path))
    bounds = get_bounds(points)
    band_iter = locations(*bounds)
    
    size = 0
    last_size = None
    while last_size is None or last_size > 0:
        last_size = region_size(points, 
                                band_iter, 
                                total_distance)
        size += last_size
        bounds, band_iter = grow(*bounds)
        
    return size

Make sure this works on the example:

In [48]:
with test_file(test_input) as f:
    print(region(f.name, total_distance=32))

16


Apply to real input:

In [50]:
region('input', total_distance=10000)

36216