The second challenge here is cleaning up all of the junk rows in existing *_fts_docsize tables. Doing that for just the demo database from https://github-to-sqlite.dogsheep.net/github.db dropped its size from 22MB to 16MB! Here's the SQL:
DELETE FROM [licenses_fts_docsize] WHERE id NOT IN (
    SELECT rowid FROM [licenses_fts]
);
I can do that as part of the existing table.optimize() method, which optimizes FTS tables.
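Here's a rough sketch of what that cleanup step could look like from Python, using only the standard-library sqlite3 module rather than sqlite-utils itself. The helper name delete_orphan_docsize_rows is hypothetical and is not how table.optimize() is actually implemented. It assumes an FTS5 table, whose _docsize shadow table has an id column, and a connection that allows writes to shadow tables.

```python
import sqlite3


def delete_orphan_docsize_rows(conn, fts_table):
    """Delete rows from <fts_table>_docsize with no matching rowid in <fts_table>.

    Hypothetical helper for illustration. Returns the number of rows deleted.
    """
    docsize_table = fts_table + "_docsize"
    # Identifiers cannot be bound as SQL parameters, so they are quoted with
    # square brackets like the SQL above. Assumes names do not contain "]".
    sql = "DELETE FROM [{}] WHERE id NOT IN (SELECT rowid FROM [{}])".format(
        docsize_table, fts_table
    )
    # "with conn" commits on success and rolls back if the DELETE fails.
    with conn:
        cursor = conn.execute(sql)
    return cursor.rowcount
```

Calling delete_orphan_docsize_rows(conn, "licenses_fts") runs the same DELETE as above. The database file itself only gets smaller once the freed pages are reclaimed, for example by running VACUUM afterwards.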
Originally posted by @simonw in #149 (comment)