refactor(core): simplify and future-proof database schema #34
Overview
This is a major refactor that removes much of the complexity we had added to optimise storage. Because DuckDB uses FSST compression on strings, the manual enum logic we wrote to encourage better bitpacking was unnecessary and only complicated the code. It also made filtering on these values much harder, as we could not run our string search algorithms on enums. The same change applies to countries: we now store full country names instead of country codes.
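To illustrate the filtering problem, here is a minimal Python sketch, not code from this repository. The `Country` enum, the sample rows, and both search helpers are hypothetical. It only shows that an enum-encoded column has to be decoded back to text before a substring search can run, while a plain string column can be searched directly.

```python
from enum import IntEnum


class Country(IntEnum):
    """Hypothetical enum standing in for the old manual encoding."""
    GERMANY = 0
    FRANCE = 1
    UNITED_KINGDOM = 2


# Hypothetical sample data: the same column in both representations.
rows_enum = [Country.GERMANY, Country.UNITED_KINGDOM, Country.FRANCE]
rows_str = ["Germany", "United Kingdom", "France"]


def search_enum(rows, needle):
    """Substring search over enum values: each value is decoded to text first."""
    needle = needle.lower()
    return [r for r in rows if needle in r.name.replace("_", " ").lower()]


def search_str(rows, needle):
    """Substring search over plain strings: no decoding step required."""
    needle = needle.lower()
    return [r for r in rows if needle in r.lower()]
```

With string columns, the decoding step disappears and the same search routine can be reused across all text fields.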
Languages
Additionally, to partially address #10 and #11, I've renamed several columns in advance so that we don't need another migration when we add these features post-release. This also changes how we process languages: we now store only the base language and not the dialect (until we address the issues mentioned above).
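As a rough sketch of what "base language only" could look like, assuming the input is a BCP 47-style tag such as `en-US` (this description does not state the actual input format, and `base_language` is a hypothetical helper, not a function from this codebase):

```python
def base_language(tag: str) -> str:
    """Return the primary language subtag of a BCP 47-style tag.

    Assumption: the dialect/region is separated by "-" or "_",
    e.g. "en-US" or "pt_BR". Anything after the first separator is dropped.
    """
    normalized = tag.strip().replace("_", "-")
    return normalized.split("-", 1)[0].lower()
```

For example, `base_language("en-US")` and `base_language("en-GB")` both yield `"en"`, so dialects collapse onto a single stored value.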