## 2025-12-03: Intro

I saw my kid playing <a href="https://youtube.com/watch?v=VAdmJ525lXY">Stitch.</a>, so naturally I wanted to write a solver for it. If I understand correctly, the game is based on <a href="https://en.wikipedia.org/wiki/Shikaku">Shikaku</a>:

> Shikaku is played on a rectangular grid. Some of the squares in the grid are numbered. The objective is to divide the grid into rectangular $\dots$ pieces such that each piece contains exactly one number, and that number represents the area of the rectangle.

I will try to consistently use terms as follows:
- **Grid** is used as described above and refers to the puzzle as presented to the player. It is composed of squares and contains numbers but not rectangles.
- **Square** refers to an individual cell in the grid.
- **Rectangle** refers to a solid rectangle formed by one or more squares.
- **Number** refers to one of the numbers in the grid. It resides in a particular square and in a solution is associated with a particular rectangle.

### Thoughts about a solver

Which of the following will be more efficient?

- choosing as the next step in the search a square that contains a number, or any square regardless of whether it contains a number (in either case, then finding all possible rectangles containing that square)
- if choosing only squares with numbers — starting from larger numbers, or smaller numbers, or prime numbers, or numbers with the most factors
- choosing a next step adjacent to or distant from the previous step
- starting from the inside or the outside
- breadth-first search or depth-first search

For example, would doing all four corners first be more efficient? Walking the perimeter? Starting from the largest numbers?

Also, what should I compute in advance?

- Which numbers' rectangles could overlap with a given square?
- Should I skip right to which rectangles could overlap with a given square?
- Is it practical to cache results for a given partial solution?
- Are there arrangements of numbers that can be efficiently reduced into their possible combinations of rectangles? (My gut says no, not efficiently.)
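
The first two precomputation questions could be approached roughly as follows. This is only a minimal sketch, assuming a grid is stored as a dict from `(row, col)` to number; the names `candidate_rects` and `coverers` and the `(top, left, height, width)` encoding are placeholders I made up for illustration.

```python
from typing import Dict, Iterator, List, Tuple

Square = Tuple[int, int]          # (row, col), zero-based (assumed encoding)
Rect = Tuple[int, int, int, int]  # (top, left, height, width) (assumed encoding)


def candidate_rects(rows: int, cols: int, numbers: Dict[Square, int],
                    at: Square) -> Iterator[Rect]:
    """Yield every rectangle that could belong to the number in square `at`.

    A candidate has area equal to the number, lies inside the grid,
    contains `at`, and contains no other numbered square.
    """
    n = numbers[at]
    r, c = at
    for h in range(1, n + 1):
        if n % h:
            continue  # height must divide the area
        w = n // h
        if h > rows or w > cols:
            continue  # rectangle would not fit in the grid
        for top in range(max(0, r - h + 1), min(r, rows - h) + 1):
            for left in range(max(0, c - w + 1), min(c, cols - w) + 1):
                holds_other = any(
                    top <= rr < top + h and left <= cc < left + w
                    for (rr, cc) in numbers if (rr, cc) != at
                )
                if not holds_other:
                    yield (top, left, h, w)


def coverers(rows: int, cols: int, numbers: Dict[Square, int]
             ) -> Dict[Square, List[Tuple[Square, Rect]]]:
    """Map each square to the (number square, rectangle) pairs that could cover it."""
    table: Dict[Square, List[Tuple[Square, Rect]]] = {
        (r, c): [] for r in range(rows) for c in range(cols)
    }
    for at in numbers:
        for rect in candidate_rects(rows, cols, numbers, at):
            top, left, h, w = rect
            for r in range(top, top + h):
                for c in range(left, left + w):
                    table[(r, c)].append((at, rect))
    return table
```

A square whose list in `coverers(...)` is empty means the grid has no solution, and a square with exactly one entry forces that rectangle, which might make such squares a natural place to start a search.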

How large a grid am I willing to consider? How large will I allow the numbers to be?

And I quickly realized that generating sample grids is not at all trivial.

### Thoughts about grid generation

I have the sense that "generate a random grid" is <a href="https://www.youtube.com/watch?v=mZBwsm6B280">almost a meaningless statement</a>.

Let's look at how many possible $2 \times 2$ grids there are. For any given collection of rectangles, each rectangle's number $n$ could have been placed in any of the $n$ squares within the rectangle. This means the number of grids produced by any given collection of rectangles is equal to the product of the rectangles' areas (that is, their numbers). The total number of distinct grids is the sum of these products, less any duplication from different collections of rectangles producing the same grid. These are the possible combinations of rectangle areas and how many distinct ways they can occur:

- (4,) — 1 distinct partition
- (2, 2) — 2 distinct partitions, vertically or horizontally
- (2, 1, 1) — 4 distinct partitions, as the 2 can be vertical or horizontal and in the first or second row/column
- (1, 1, 1, 1) — 1 distinct partition

These suggest $4 + 2 \times 4 + 4 \times 2 + 1 = 21$ total grids from these 8 distinct partitions, but only 19 of the grids are distinct. In the (2, 2) case, the two numbers can sit on a diagonal whether the rectangles are vertical or horizontal, so each of the two diagonal placements is counted twice. Instead of $(2 \times 2) \times 2 = 8$ there are only 6 distinct grids from this case.
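
The counts above are small enough to check by brute force. Here is a sketch that enumerates every partition of a grid into rectangles and then every placement of each rectangle's number; the function names and the tuple encodings are my own assumptions, and the approach is only practical for tiny grids.

```python
from itertools import product
from typing import FrozenSet, Iterator, List, Tuple

Rect = Tuple[int, int, int, int]               # (top, left, height, width)
Grid = FrozenSet[Tuple[Tuple[int, int], int]]  # {((row, col), number), ...}


def partitions(rows: int, cols: int) -> Iterator[List[Rect]]:
    """Yield every way to divide a rows x cols grid into rectangles."""
    covered = [[False] * cols for _ in range(rows)]
    placed: List[Rect] = []

    def fill() -> Iterator[List[Rect]]:
        # The first uncovered square in row-major order must be the
        # top-left corner of its rectangle, so each partition appears once.
        spot = next(((r, c) for r in range(rows) for c in range(cols)
                     if not covered[r][c]), None)
        if spot is None:
            yield list(placed)
            return
        r, c = spot
        for h in range(1, rows - r + 1):
            for w in range(1, cols - c + 1):
                cells = [(rr, cc) for rr in range(r, r + h)
                         for cc in range(c, c + w)]
                if any(covered[rr][cc] for rr, cc in cells):
                    break  # any wider rectangle of this height also collides
                for rr, cc in cells:
                    covered[rr][cc] = True
                placed.append((r, c, h, w))
                yield from fill()
                placed.pop()
                for rr, cc in cells:
                    covered[rr][cc] = False

    yield from fill()


def grids(rows: int, cols: int) -> Iterator[Grid]:
    """Yield one grid per (partition, number placement) pair, duplicates included."""
    for rects in partitions(rows, cols):
        choices = [[(r, c) for r in range(t, t + h) for c in range(l, l + w)]
                   for (t, l, h, w) in rects]
        for squares in product(*choices):
            yield frozenset((sq, h * w)
                            for sq, (_, _, h, w) in zip(squares, rects))


if __name__ == "__main__":
    all_grids = list(grids(2, 2))
    print(len(list(partitions(2, 2))), len(all_grids), len(set(all_grids)))
```

Run as a script, this prints `8 21 19` for the $2 \times 2$ case, matching the hand count: 8 partitions, 21 grids with duplicates, 19 distinct grids.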

Still, that's 19 distinct grids from 8 distinct partitions, and most of these partitions have a significant number of rectangles of size 1. (My kid says size 1 is allowed, but I haven't seen this in the examples I've found online.) I imagine that the grids-to-partitions ratio may grow as the size of the grid grows. And even if it doesn't, <a href="https://oeis.org/A333476">OEIS A333476</a> indicates that there are over 84 million distinct partitions for a $5 \times 5$ square.

Getting back to grid generation — what would a "random $5 \times 5$ grid" mean? Would it mean generating all 84 million distinct partitions, and all their however-many distinct grids, and then randomly selecting one? That's not a practical approach. It's also probably misguided. A better approach may be to check for 

To the extent that there is some real "population" of grids that I would try to make my solver efficient at solving, those grids would have been created by some algorithm, and I'd effectively be trying to figure out what that algorithm might be.

Perhaps as I go about testing my own algorithms for grid generation, I can evaluate them with statistics. It's not clear to me whether on average I should expect the locations of numbers to be uniformly distributed throughout the grid, but I would expect numbers to be equally likely to appear in symmetrical positions. For example, there shouldn't be a bias toward the upper-left.
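
One way to look for that kind of bias might be to group squares that map onto each other under rotation and reflection and compare how often a number lands on each. The sketch below assumes a square $n \times n$ grid (a non-square grid has fewer symmetries) and grids stored as dicts from `(row, col)` to number; `orbit` and `orbit_counts` are hypothetical helper names.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

Square = Tuple[int, int]  # (row, col), zero-based (assumed encoding)


def orbit(sq: Square, n: int) -> Set[Square]:
    """All images of `sq` under the 8 symmetries of an n x n square."""
    r, c = sq
    images: Set[Square] = set()
    for _ in range(4):
        r, c = c, n - 1 - r          # rotate by 90 degrees
        images.add((r, c))
        images.add((r, n - 1 - c))   # the same rotation, mirrored left-right
    return images


def orbit_counts(grids: Iterable[Dict[Square, int]],
                 n: int) -> List[Dict[Square, int]]:
    """For each group of symmetric squares, count how often each held a number."""
    hits = Counter(sq for grid in grids for sq in grid)
    seen: Set[Square] = set()
    result: List[Dict[Square, int]] = []
    for r in range(n):
        for c in range(n):
            if (r, c) in seen:
                continue
            group = orbit((r, c), n)
            seen |= group
            result.append({sq: hits[sq] for sq in sorted(group)})
    return result
```

Within each returned group, an unbiased generator should produce roughly equal counts over a large sample; a proper significance test on those counts is left open here.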

Also, I may be able to find enough examples of puzzles intended for humans to solve that I can establish a preference for the relative frequency of different numbers. On average there should probably be more 3s than 9s, but how many more? Should there be more 7s or 8s?

I'm also interested in how many solutions a grid has. One of my favorite puzzle games, <a href="https://en.wikipedia.org/wiki/Flow_Free">Flow Free</a>, seems to limit its puzzles to those with a unique solution. I wonder whether, for example, the numbers in a grid, or their positions, may be correlated with how many solutions the grid has. If so, then by tweaking the generation algorithm I could make it more likely to create grids with very few or very many solutions.
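
Counting solutions needs at least a naive solver. Below is a minimal depth-first sketch, not a tuned one: it always fills the first uncovered square in row-major order, which is just one of the search orders I listed earlier. The dict-based grid encoding and the name `count_solutions` are assumptions for illustration.

```python
from typing import Dict, Tuple

Square = Tuple[int, int]  # (row, col), zero-based (assumed encoding)


def count_solutions(rows: int, cols: int, numbers: Dict[Square, int]) -> int:
    """Count the ways to divide the grid so each rectangle holds exactly
    one number and that number equals the rectangle's area."""
    covered = [[False] * cols for _ in range(rows)]

    def search() -> int:
        # The first uncovered square must be the top-left corner of
        # whichever rectangle ends up covering it.
        spot = next(((r, c) for r in range(rows) for c in range(cols)
                     if not covered[r][c]), None)
        if spot is None:
            return 1  # every square is covered: one complete solution
        r, c = spot
        total = 0
        for h in range(1, rows - r + 1):
            for w in range(1, cols - c + 1):
                cells = [(rr, cc) for rr in range(r, r + h)
                         for cc in range(c, c + w)]
                if any(covered[rr][cc] for rr, cc in cells):
                    break  # wider rectangles of this height also collide
                inside = [numbers[sq] for sq in cells if sq in numbers]
                if len(inside) > 1:
                    break  # wider rectangles hold at least as many numbers
                if inside != [h * w]:
                    continue  # no number yet, or the area does not match
                for rr, cc in cells:
                    covered[rr][cc] = True
                total += search()
                for rr, cc in cells:
                    covered[rr][cc] = False
        return total

    return search()
```

For the diagonal $2 \times 2$ grid discussed above, `count_solutions(2, 2, {(0, 0): 2, (1, 1): 2})` returns 2 (one horizontal and one vertical partition), while moving the second 2 to `(1, 0)` leaves a single solution.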

### Objectives

- Write a solver. Make it quantifiably better.
- Write a grid generator. Make it statistically defensible.
- Analyze the number of solutions for various grids. Tweak the generator as desired without losing its statistical defensibility.