FTS space efficiency #4
iansinnott
started this conversation in
Ideas
Full text documents could be stored more efficiently. Current thoughts:
No. 1 seems like a no-brainer; it just hasn't been implemented yet. Since we distill the HTML content, more pages that differ only trivially in their HTML (different CSS states, for instance) would still hash to the same page.
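To make the hashing idea concrete, here is a rough Python sketch. It assumes the hash is taken over the distilled text rather than the raw HTML. The `distill` function is only a hypothetical stand-in for whatever distillation the project really does, and `content_hash`, `save_page`, and the in-memory `store` are names made up for illustration.

```python
import hashlib
import re

def distill(html: str) -> str:
    """Hypothetical stand-in for the real distillation step:
    drop script/style blocks, strip tags, collapse whitespace."""
    text = re.sub(r"<(script|style)\b.*?</\1>", " ", html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def content_hash(html: str) -> str:
    # Hash the distilled text, so markup-only differences do not change the key.
    return hashlib.sha256(distill(html).encode("utf-8")).hexdigest()

# Illustrative in-memory store keyed by content hash (assumption, not the real schema).
store: dict[str, str] = {}

def save_page(html: str) -> str:
    key = content_hash(html)
    # Only the first page with a given distilled text is stored.
    store.setdefault(key, distill(html))
    return key

page_a = '<div class="menu open"><p>Hello  world</p></div>'
page_b = '<div class="menu"><p>Hello world</p></div>'
key_a = save_page(page_a)
key_b = save_page(page_b)
```

Both pages distill to the same text, so they share one key and only one copy is kept.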
No. 2 requires a bit more thought, but would probably lead to large space savings. These are all text documents, so they should compress well. FTS needs plain text to operate over, so we couldn't store everything in compressed form, but we could tweak the FTS params to create a separate search index, so that documents can be located through the index and the full content decompressed only as needed.
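One way this could look, assuming the FTS in question is SQLite FTS5 (the post does not say which engine is used): a contentless FTS5 table (`content=''`) holds only the search index, while the full text lives zlib-compressed in an ordinary table. The table and function names below (`doc`, `doc_fts`, `add_doc`, `search`) are made up for the sketch, and zlib is just one possible compressor.

```python
import sqlite3
import zlib

db = sqlite3.connect(":memory:")

# Full text is stored only in compressed form.
db.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, body_z BLOB NOT NULL)")

# Contentless FTS5 table: keeps the index but not a copy of the text.
db.execute("CREATE VIRTUAL TABLE doc_fts USING fts5(body, content='')")

def add_doc(text: str) -> int:
    cur = db.execute(
        "INSERT INTO doc (body_z) VALUES (?)",
        (zlib.compress(text.encode("utf-8")),),
    )
    doc_id = cur.lastrowid
    # Index the plain text under the same rowid; FTS5 does not retain the text itself.
    db.execute("INSERT INTO doc_fts (rowid, body) VALUES (?, ?)", (doc_id, text))
    return doc_id

def search(query: str) -> list[str]:
    # Locate matching rowids via the index, then decompress only those documents.
    rows = db.execute(
        "SELECT body_z FROM doc "
        "WHERE id IN (SELECT rowid FROM doc_fts WHERE doc_fts MATCH ?) "
        "ORDER BY id",
        (query,),
    ).fetchall()
    return [zlib.decompress(z).decode("utf-8") for (z,) in rows]

add_doc("Full text documents compress well")
add_doc("Another page about search indexes")
hits = search("compress")
```

The trade-off is that a contentless table cannot return the original column values, so anything that needs the text itself has to go through the decompression step.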