refactor(core): simplify and future-proof database schema #34

Merged
merged 6 commits into main from fix/lang on Jun 22, 2024

@ayuhito (Member) commented Jun 22, 2024

Overview

This is a major refactor that removes much of the complexity we had added to optimise storage. Since DuckDB already applies FSST compression to strings, the manual enum logic we used to encourage better bitpacking was unnecessary and only made the code more complicated. It also made filtering on these columns much harder, as we could not run our string search algorithms on enum values. The same change applies to countries: we now store full country names instead of country codes.
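
To illustrate the filtering point, here is a minimal sketch in plain Python rather than the project's own code. The `Browser` enum, the row layout, and the column names are all hypothetical; the only thing it aims to show is that a substring search works directly on string columns, while enum-coded columns need a decode step first.

```python
from enum import IntEnum


class Browser(IntEnum):
    """Hypothetical integer-coded enum, standing in for the old approach."""
    CHROME = 1
    FIREFOX = 2
    SAFARI = 3


# Illustrative rows only; the column names are assumptions.
enum_rows = [
    {"browser": Browser.CHROME, "country": "GB"},
    {"browser": Browser.FIREFOX, "country": "US"},
    {"browser": Browser.SAFARI, "country": "GB"},
]
string_rows = [
    {"browser": "Chrome", "country": "United Kingdom"},
    {"browser": "Firefox", "country": "United States"},
    {"browser": "Safari", "country": "United Kingdom"},
]


def contains(rows, column, needle):
    """Case-insensitive substring filter over a plain string column."""
    needle = needle.lower()
    return [row for row in rows if needle in row[column].lower()]


def contains_enum(rows, column, needle):
    """Same filter for enum-coded columns: each value must be decoded first."""
    needle = needle.lower()
    return [row for row in rows if needle in row[column].name.lower()]


# Plain strings can be searched directly, including by country name.
kingdom = contains(string_rows, "country", "kingdom")
fire = contains(string_rows, "browser", "fire")

# Enum codes need a dedicated decode step before any text search.
fire_enum = contains_enum(enum_rows, "browser", "fire")
```

With country codes such as `GB`, a search for "kingdom" would find nothing, which is one reason full names are friendlier to text filters.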

Languages

To partially address #10 and #11, I've also renamed several columns in advance so we don't need another migration when we add these features post-release. This also changes how we process languages: we now store only the base language and not the dialect (until we address the mentioned issues).
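
As a rough sketch of what "base language only" could look like, the hypothetical helper below keeps the primary subtag of a locale tag and drops the rest. The function name and the accepted input formats are assumptions, and whether the project stores a code or a display name is not stated here.

```python
def base_language(tag: str) -> str:
    """Return the primary language subtag of a locale tag, lowercased.

    Hypothetical helper: accepts both "en-GB" and "en_GB" style tags and
    discards everything after the first separator (region, script, etc.).
    """
    primary = tag.strip().replace("_", "-").split("-", 1)[0]
    return primary.lower()


# Dialect and region information is dropped; only the base language remains.
examples = {tag: base_language(tag) for tag in ["en-GB", "pt_BR", "fr", "zh-Hant-TW"]}
```

Dropping the dialect this way is lossy, which fits the note above that it is a stopgap until the linked issues are addressed.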

@ayuhito ayuhito marked this pull request as ready for review June 22, 2024 09:25
@ayuhito ayuhito merged commit 81a8678 into main Jun 22, 2024
5 checks passed
@ayuhito ayuhito deleted the fix/lang branch June 22, 2024 09:31