(Sharded) Dictionaries reload by parts #45390

Open
UnamedRus opened this issue Jan 18, 2023 · 0 comments
Use case

Very large in-memory dictionaries (in the 200-500 GB range and above).

During a dictionary reload, ClickHouse holds two hash tables: the old dictionary, which stays active (it is the one used by dictGet calls), and the new one, which is currently being loaded.
This means that at some point it needs memory for 2x the dictionary size, which is a lot.

So the idea is the following: after the introduction of sharded dictionaries (#40003), we can reload a dictionary shard by shard, so the total memory footprint should be much smaller (one dictionary + shard_size * concurrent_reload_threads).
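
To make the footprint estimate above concrete, here is a minimal arithmetic sketch. The function names and the example numbers (400 GB, 16 shards, 2 reload threads) are hypothetical, and it assumes all shards are the same size, which a real dictionary would only approximate.

```python
def full_reload_peak(dict_size_gb: float) -> float:
    """Peak memory when the old and the new copy coexist during a reload."""
    return 2 * dict_size_gb


def sharded_reload_peak(dict_size_gb: float, shards: int,
                        concurrent_reload_threads: int) -> float:
    """Peak memory when only a few shards are being rebuilt at any moment.

    Assumption: shards are equally sized, so shard_size = dict_size / shards.
    """
    if shards < 1 or concurrent_reload_threads < 1:
        raise ValueError("shards and concurrent_reload_threads must be >= 1")
    shard_size_gb = dict_size_gb / shards
    # No more shards can be in flight than exist.
    in_flight = min(concurrent_reload_threads, shards)
    return dict_size_gb + shard_size_gb * in_flight


# Hypothetical numbers: a 400 GB dictionary, 16 shards, 2 reload threads.
print(full_reload_peak(400))            # 800
print(sharded_reload_peak(400, 16, 2))  # 450.0
```

With a single shard the estimate falls back to the current 2x behaviour, so the saving depends entirely on the shard count relative to the number of concurrent reload threads.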

The only open question is consistency. For some use cases it does not matter; for others, maybe we can come up with a solution that makes the reload consistent: keep two versions of the attributes in the new hash tables, switch to the new version once all hash tables have been reloaded, and remove the old versions after the switch (basically recreating the hash tables). That will certainly increase memory usage, but I guess not by 2x.
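
The two-version idea can be illustrated with a toy, single-threaded Python model. This is only a sketch of the concept, not ClickHouse code: `VersionedShardedDict`, `load_shard` and the `hash(key) % num_shards` placement are all hypothetical, and the model says nothing about real memory usage.

```python
from typing import Any, Callable, Dict, Hashable, List


class VersionedShardedDict:
    """Toy model: each shard maps key -> {version: value}."""

    def __init__(self, num_shards: int) -> None:
        self._shards: List[Dict[Hashable, Dict[int, Any]]] = [
            {} for _ in range(num_shards)
        ]
        self._active_version = 0

    def _shard_of(self, key: Hashable) -> int:
        # Hypothetical placement rule, for illustration only.
        return hash(key) % len(self._shards)

    def get(self, key: Hashable, default: Any = None) -> Any:
        """Read the value belonging to the currently active version."""
        versions = self._shards[self._shard_of(key)].get(key)
        if versions is None:
            return default
        return versions.get(self._active_version, default)

    def reload(self, load_shard: Callable[[int], Dict[Hashable, Any]]) -> None:
        new_version = self._active_version + 1
        # Phase 1: load shard by shard, storing new values next to old ones.
        # Readers still see the old version everywhere.
        for shard_index in range(len(self._shards)):
            for key, value in load_shard(shard_index).items():
                shard = self._shards[self._shard_of(key)]
                shard.setdefault(key, {})[new_version] = value
        # Phase 2: one switch makes the new version visible for all shards.
        self._active_version = new_version
        # Phase 3: drop old versions and keys that no longer exist.
        for shard in self._shards:
            for key in list(shard):
                versions = shard[key]
                for old in [v for v in versions if v != new_version]:
                    del versions[old]
                if not versions:
                    del shard[key]


source = {k: f"v1-{k}" for k in range(8)}


def load_shard(shard_index: int) -> Dict[Hashable, Any]:
    # Hypothetical loader: returns the rows that belong to one shard.
    return {k: v for k, v in source.items() if hash(k) % 4 == shard_index}


d = VersionedShardedDict(4)
d.reload(load_shard)
print(d.get(3))  # v1-3
```

In this model, both versions of every attribute coexist until the switch, so any saving over a full 2x reload would come only from sharing the keys and the hash table structure between versions.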

Additional context

#40003 (comment)
