# Coordinated expansion is the new politics

- https://adventofcode.com/2023/day/11

This problem shouts numpy and scipy to me, specifically because of the existence of the [`scipy.spatial.distance.pdist()` function](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html#scipy.spatial.distance.pdist), which handles the calculation of _Pairwise distances between observations in n-dimensional space_.

All we really need to worry about, then, is how to handle the expansion of the observed galaxies. Here is the plan:

- Read the map into a boolean matrix.
- Use [`numpy.argwhere()`](https://numpy.org/doc/stable/reference/generated/numpy.argwhere.html) to get a matrix of galaxy coordinates.
- For both the (sorted) x and y columns, use [`numpy.diff()`](https://numpy.org/doc/stable/reference/generated/numpy.diff.html) to calculate the difference between each coordinate and the next. If this value is greater than 1, there are empty columns or rows in between.
- In reverse order (so from highest to lowest coordinate), add the number of empty columns / rows to that coordinate _and all coordinates that are larger_. By going in reverse order, the lower coordinates remain unchanged and so are easier to select when it is their turn to be shifted.
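
The gap-detection step in the plan above can be sketched on a single axis. This is only an illustration: `cols` holds the column indices of the galaxies in the puzzle's example image, and the names `ordered`, `steps` and `empty_between` are made up for this sketch. It sorts in ascending order for readability, whereas the solution further down works in descending order.

```python
import numpy as np

# Column indices of the nine galaxies in the example image (one axis only).
cols = np.array([3, 7, 0, 6, 1, 9, 7, 0, 4])

ordered = np.sort(cols)                   # [0 0 1 3 4 6 7 7 9]
steps = np.diff(ordered)                  # difference to the next coordinate
empty_between = np.maximum(steps - 1, 0)  # empty columns inside each gap
total_empty = int(empty_between.sum())

print(steps)          # [0 1 2 1 2 1 0 2]
print(empty_between)  # [0 0 1 0 1 0 0 1]
print(total_empty)    # 3
```

A step of 0 means two galaxies share a column, a step of 1 means adjacent columns, and anything larger reveals `step - 1` empty columns.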

Once the coordinates have been adjusted to account for the expansion, the `pdist()` function takes care of pairing up the coordinates and calculating their distances. The `cityblock` metric matches how the puzzle specifies that distances should be calculated.
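
As a quick illustration of what `pdist()` returns with the `cityblock` metric, here is a minimal sketch on three made-up points (the `points` array is hypothetical, not puzzle data). The result is a flat array of floats with one entry per unordered pair, in the order (0, 1), (0, 2), (1, 2).

```python
import numpy as np
from scipy.spatial.distance import pdist

# Three hypothetical (row, column) coordinates.
points = np.array([[0, 0], [1, 3], [4, 1]])

# Manhattan distance |dy| + |dx| for every unordered pair of points.
distances = pdist(points, "cityblock")

print(distances)             # [4. 5. 5.]
print(int(distances.sum()))  # 14
```

Because each pair is counted only once, summing the array gives the total the puzzle asks for.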


In [1]:
import numpy as np
from scipy.spatial.distance import pdist


class GalaxyMap:
    def __init__(self, image: str) -> None:
        self._galaxies = np.array(
            [[c == "#" for c in line] for line in image.splitlines()], dtype=bool
        )

    def pairwise_distances_sum(self, expansion: int = 2) -> int:
        coords = np.argwhere(self._galaxies)
        # expand coords
        for col in (0, 1):
            # Get the values that are at least 2 greater than their preceding value.
            # The values need sorting, as the y coords are not in order, and reversing
            # for easier processing. (Named `ordered` to avoid shadowing the
            # built-in `sorted()`.)
            ordered = np.flip(np.sort(coords[:, col]))
            # append the last value again so we get differences to the next lower
            # coordinate plus a 0 difference for the lowest coordinate.
            # negate the differences as we are working with coordinates in
            # descending order.
            dist = -np.diff(ordered, append=ordered[-1])
            gaps = dist > 1
            for c, d in zip(ordered[gaps], dist[gaps]):
                # each empty col / row grows to `expansion` cols / rows; d is the
                # number of empty cols / rows in between plus one, so add
                # (d - 1) * (expansion - 1) more space
                coords[coords[:, col] >= c, col] += (d - 1) * (expansion - 1)
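        # use an explicit 64-bit integer type so large sums can't overflow on
        # platforms where the default int is 32 bits, and return a plain int.
        return int(pdist(coords, "cityblock").astype(np.int64).sum())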


test_image = """\
...#......
.......#..
#.........
..........
......#...
.#........
.........#
..........
.......#..
#...#.....
"""
test_map = GalaxyMap(test_image)
assert test_map.pairwise_distances_sum() == 374

In [2]:
import aocd

galaxy_map = GalaxyMap(aocd.get_data(day=11, year=2023))
print("Part 1:", galaxy_map.pairwise_distances_sum())

Part 1: 9693756


# Expanding on the expansion

For part 2, I added an `expansion` argument to control how much space is added: instead of $d - 1$ (the number of empty columns or rows), add $(d - 1) \times (\text{expansion} - 1)$ to the coordinates, so that each empty column or row becomes `expansion` columns or rows, including the one that's already there. With the default `expansion=2` this reduces to the part 1 value of $d - 1$. The rest of the calculation doesn't change!
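
One way to sanity-check the formula is to expand the image literally and measure the distances the slow way. The sketch below is not the method used in this notebook; it is a hypothetical brute-force cross-check (the function name `brute_force_sum` is made up here) that repeats every empty row and column `expansion` times with `numpy.repeat()`. It is only practical for small images and small expansion factors.

```python
from itertools import combinations

import numpy as np

IMAGE = """\
...#......
.......#..
#.........
..........
......#...
.#........
.........#
..........
.......#..
#...#.....
"""


def brute_force_sum(image: str, expansion: int = 2) -> int:
    grid = np.array([[c == "#" for c in line] for line in image.splitlines()])
    # Rows / columns holding a galaxy appear once, empty ones `expansion` times.
    row_repeats = np.where(grid.any(axis=1), 1, expansion)
    col_repeats = np.where(grid.any(axis=0), 1, expansion)
    grown = np.repeat(np.repeat(grid, row_repeats, axis=0), col_repeats, axis=1)
    galaxies = np.argwhere(grown)
    # Manhattan distance for every unordered pair of galaxies.
    return int(sum(np.abs(a - b).sum() for a, b in combinations(galaxies, 2)))


print(brute_force_sum(IMAGE))       # 374
print(brute_force_sum(IMAGE, 10))   # 1030
print(brute_force_sum(IMAGE, 100))  # 8410
```

These agree with the expected values for the example image, which is what the coordinate-shifting formula should reproduce without ever building the enlarged grid.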


In [3]:
assert test_map.pairwise_distances_sum(10) == 1030
assert test_map.pairwise_distances_sum(100) == 8410

In [4]:
print("Part 2:", galaxy_map.pairwise_distances_sum(1_000_000))

Part 2: 717878258016
