# On the range

https://adventofcode.com/2023/day/5

Today's puzzle is all about ranges. Don't store each possible individual value, because the puzzle input uses the full 32-bit unsigned integer range!

Instead, store just a tuple of the source start, the length and the destination start. If you keep the list of ranges sorted by their starting points, you can then use [bisection](https://docs.python.org/3/library/bisect.html) to quickly find a candidate source range and verify that the value falls inside that range. If it does, map the value to the destination; if it doesn't, return the original value.
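
As a minimal sketch of that lookup, separate from the full implementation further down, here is the idea applied to the seed-to-soil map from the puzzle's example. The names `ranges`, `starts` and `lookup` are only illustrative, and the sketch assumes the source ranges don't overlap:

```python
from bisect import bisect_right

# (source start, length, destination start), sorted by source start.
# These are the two seed-to-soil entries from the puzzle's example.
ranges = [(50, 48, 52), (98, 2, 50)]
starts = [src for src, _, _ in ranges]


def lookup(value: int) -> int:
    """Map a single value, or return it unchanged if no range covers it."""
    # idx is the number of ranges whose start is <= value, so the only
    # range that could contain value is the one at idx - 1.
    idx = bisect_right(starts, value)
    if idx > 0:
        src, length, dst = ranges[idx - 1]
        offset = value - src
        if offset < length:
            return dst + offset
    return value


print(lookup(79))  # inside the range starting at 50 -> 81
print(lookup(14))  # before the first range, so unchanged -> 14
```

Keeping a separate list of start values is just one way to bisect over tuples; passing a `key` function to `bisect` (Python 3.10 and newer) avoids the extra list.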

The implementation for part one not only returns the mapped value, but also how many values remain in the range that was used to map the source value to the destination. This is used in part two.


In [1]:
import typing as t
from bisect import bisect
from dataclasses import dataclass
from operator import itemgetter


@dataclass
class AlmanacMap:
    from_: str
    to_: str
    ranges: list[tuple[int, int, int]]

    @classmethod
    def from_entry(cls, entry: str) -> t.Self:
        first, *lines = entry.splitlines()
        from_, _, to_ = first.partition(" ")[0].partition("-to-")
        ranges = [
            (int(src), int(length), int(dst))
            for dst, src, length in map(str.split, lines)
        ]
        return cls(from_, to_, sorted(ranges, key=itemgetter(0)))

    def __getitem__(self, value: int) -> tuple[int, int | None]:
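        """Map a value through the almanac table

        Returns the new value and the number of values remaining, counted
        from value, in the section it falls in: either the source range it
        was mapped through, or the unmapped gap before the next source range.
        The length is None if the value lies beyond the last source range of
        the table.

        """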
        if (idx := bisect(self.ranges, value, key=itemgetter(0))) > 0:
            src, length, dst = self.ranges[idx - 1]
            if (offset := value - src) < length:
                return dst + offset, length - offset
        if idx < len(self.ranges):
            return value, self.ranges[idx][0] - value
        return value, None


@dataclass
class Almanac:
    seeds: list[int]
    maps: dict[str, AlmanacMap]

    @classmethod
    def from_entries(cls, *entries: str) -> t.Self:
        seeds_line, *entries = entries
        seeds = [int(seed) for seed in seeds_line.partition(": ")[-1].split()]
        maps = {map_.from_: map_ for map_ in map(AlmanacMap.from_entry, entries)}
        return cls(seeds, maps)

    def __getitem__(self, seed: int) -> int:
        current = "seed"
        value = seed
        while current != "location":
            map_ = self.maps[current]
            current = map_.to_
            value, _ = map_[value]
        return value


test_almanac_text = """\
seeds: 79 14 55 13

seed-to-soil map:
50 98 2
52 50 48

soil-to-fertilizer map:
0 15 37
37 52 2
39 0 15

fertilizer-to-water map:
49 53 8
0 11 42
42 0 7
57 7 4

water-to-light map:
88 18 7
18 25 70

light-to-temperature map:
45 77 23
81 45 19
68 64 13

temperature-to-humidity map:
0 69 1
1 0 69

humidity-to-location map:
60 56 37
56 93 4
"""
test_almanac = Almanac.from_entries(*test_almanac_text.split("\n\n"))
assert min(test_almanac[seed] for seed in test_almanac.seeds) == 35

In [2]:
import aocd

almanac = Almanac.from_entries(*aocd.get_data(day=5, year=2023).split("\n\n"))
print("Part 1:", min(almanac[seed] for seed in almanac.seeds))

Part 1: 382895070


# The green revolution is here!

Part 2 just scales up part one. Luckily we are already using bisection to handle the lookups, which keeps each individual seed lookup fast!

However, there are still an _awful lot of seeds_ to process here. The total length of my puzzle input seed ranges covers more than 2 billion values. Even if you can map a given seed value to its location in 1 microsecond, it would still take about 40 minutes to map this many seeds to locations.
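
The arithmetic behind that estimate can be sketched as follows. The 1 microsecond per lookup is the optimistic figure assumed above, `estimate_minutes` is a hypothetical helper, and the seed counts are round illustrative numbers rather than actual puzzle input totals:

```python
def estimate_minutes(count: int, seconds_per_lookup: float = 1e-6) -> float:
    """Time in minutes to map `count` seeds one at a time."""
    return count * seconds_per_lookup / 60


# Exactly 2 billion seeds would take a little over half an hour ...
print(round(estimate_minutes(2_000_000_000), 1))
# ... and every extra 60 million seeds adds another minute.
print(round(estimate_minutes(2_400_000_000), 1))
```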

Instead of mapping individual values, we could map ranges; any given range might need to be split up by each map as they won't all be using the same mapping entries, but the splitting can be done entirely based on the lengths of the source ranges. This cuts down the amount of work significantly.
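
To illustrate the splitting on its own, here is a small standalone sketch that pushes one `range` through the seed-to-soil map from the puzzle's example. It is not the implementation used below; `split_and_map` and its variable names are made up for this illustration, and it assumes the source ranges don't overlap:

```python
from bisect import bisect_right

# (source start, length, destination start), sorted by source start.
ranges = [(50, 48, 52), (98, 2, 50)]
starts = [src for src, _, _ in ranges]


def split_and_map(values: range) -> list[range]:
    """Split a range wherever the applicable mapping changes, then map each piece."""
    results = []
    start, stop = values.start, values.stop
    while start < stop:
        idx = bisect_right(starts, start)
        # Default: start is unmapped, and stays unmapped until the next
        # source range begins (or until the input range ends).
        shift = 0
        limit = starts[idx] if idx < len(starts) else stop
        if idx > 0:
            src, length, dst = ranges[idx - 1]
            if start < src + length:
                # start is inside this source range; the same shift applies
                # up to the end of that source range.
                shift = dst - src
                limit = src + length
        end = min(stop, limit)
        results.append(range(start + shift, end + shift))
        start = end
    return results


# 95-97 fall in the first range, 98-99 in the second, 100-101 in neither.
print(split_and_map(range(95, 102)))
```

The number of pieces depends on how many map entries a range touches, not on how many values it contains, which is where the savings come from.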

I first refactored the code for part one to not only return the mapped value, but also the remaining length in the source range. The extra return value is not used anywhere else in part one, but in part two we can use this to then split up source ranges. The almanac then only has to return the start value of the smallest range after mapping.


In [3]:
from collections import deque


class RangeAlmanacMap(AlmanacMap):
    def __getitem__(self, values: tuple[range, ...]) -> tuple[range, ...]:
        results = []
        queue = deque(values)
        while queue:
            value = queue.popleft()
            size = len(value)
            dst, remainder = super().__getitem__(value.start)
            # remainder is None when value.start lies beyond the last source range
            if remainder is not None and size > remainder:
                # process the section that doesn't fit
                queue.append(value[remainder:])
                size = remainder
            # map the part of the range that fits
            results.append(range(dst, dst + size))
        return tuple(results)


@dataclass
class RangeAlmanac(Almanac):
    maps: dict[str, RangeAlmanacMap]

    @classmethod
    def from_entries(cls, *entries: str) -> t.Self:
        inst = super().from_entries(*entries)
        inst.maps = {
            from_: RangeAlmanacMap(**vars(map_)) for from_, map_ in inst.maps.items()
        }
        return inst

    def __getitem__(self, values: tuple[range, ...]) -> int:
        current = "seed"
        while current != "location":
            map_ = self.maps[current]
            current = map_.to_
            values = map_[values]
        return min(v.start for v in values)

    @property
    def seed_ranges(self) -> tuple[range, ...]:
        it = iter(self.seeds)
        return tuple(range(start, start + length) for start, length in zip(it, it))


test_almanac = RangeAlmanac.from_entries(*test_almanac_text.split("\n\n"))
assert test_almanac[test_almanac.seed_ranges] == 46

In [4]:
almanac = RangeAlmanac.from_entries(*aocd.get_data(day=5, year=2023).split("\n\n"))
print("Part 2:", almanac[almanac.seed_ranges])

Part 2: 17729182
