using druid maha lookups as a replacement for lookups-cached-global #420
Comments
Of your 50-100 lookups, how many have the same key? How long does it currently take to load the lookups? You could convert your lookups to RocksDB-based lookups, where you create new snapshots once a day and publish updates via Kafka. This would require you to build a new RocksDB instance once a day, zip it up, and publish it to HDFS. It also means you would need some daemon process to do change data capture and publish the updated or new rows to Kafka. If many of your 50-100 lookups share the same key, you could replace them with our JDBC lookup, since it allows multiple values to be loaded in one lookup, saving duplication of key space. E.g. with lookups-cached-global you have one key to one value: Map(a -> aa, b -> bb) and Map(a -> 123, b -> 456); our JDBC lookups allow for just one lookup: Map(a -> (aa, 123), b -> (bb, 456)). At query time, you just specify which column you want in the extraction function.
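To make the key-sharing idea concrete, here is a minimal Python sketch of the data layout only. It is not the maha or Druid API: the column names `name` and `code` and the helpers `merge_lookups` and `extract` are hypothetical, chosen purely for illustration. It shows two one-key-to-one-value maps collapsing into a single multi-value map, with the column picked at read time.

```python
# Two single-value lookups that share the same key space
# (values taken from the Map(...) example above).
name_lookup = {"a": "aa", "b": "bb"}
code_lookup = {"a": 123, "b": 456}


def merge_lookups(columns):
    """Combine {column_name: {key: value}} into {key: {column_name: value}}.

    Each key is stored once, however many value columns it carries.
    """
    merged = {}
    for column, lookup in columns.items():
        for key, value in lookup.items():
            merged.setdefault(key, {})[column] = value
    return merged


def extract(merged, key, column, default=None):
    """Return one value column for a key, or `default` if either is missing.

    Hypothetical stand-in for naming the wanted column at query time.
    """
    return merged.get(key, {}).get(column, default)


# "name" and "code" are illustrative column names, not from the original thread.
merged = merge_lookups({"name": name_lookup, "code": code_lookup})
```

With this layout, `extract(merged, "a", "code")` reads the second value for key `a` without a separate map keyed on `a`, which is where the saving in key space comes from.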
We haven't properly monitored the loading time. For one large lookup (around 10 million entries), it takes around 45 minutes.
@vsharathchandra it might be easier to talk about this on Gitter or Hangouts.
Okay, sure, I will contact you on Gitter.
Hi,
Currently we are using the lookups-cached-global extension for loading lookups in Druid (version 0.12.3). We load lookups from different Mssql and Msql servers. We load around 50-100 lookups, of which the top 10 have around 10-15 million entries each. Because of the huge size of these lookups, we are having a lot of issues (high GC pauses, not being able to query) while loading lookups on historicals and brokers. So I would like to use your extension as a replacement for lookups-cached-global.
Are there any queries that could be affected?
Do you support extracting lookups from Msql servers?