While the current "queue-like system" on top of ClickHouse worked quite well for testing, it's nowhere near as good as required for any serious high-volume use.
Recently I did some testing on beefy AWS hardware and fixed some internal bottlenecks (not yet merged). In test scenarios where I could temporarily alleviate the last remaining bottleneck - job distribution (writing new jobs, updating completed ones, selecting) - Crusty was capable of over 900 MiB/s, a whopping 7+ Gbit/s, on a 48-core (96 logical) c5.metal with a 25 Gbit/s port.

The new job queue should be solely Redis-based, using Redis modules: https://redis.io/topics/modules-intro
Rust has a good enough library for writing Redis module logic: https://github.com/RedisLabsModules/redismodule-rs
We will use a pre-sharded queue (based on `addr_key`).

Atomic operations:
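To make the intent concrete - shard choice driven by `addr_key`, and a check-and-push that must happen atomically on the owning shard - here is a self-contained Rust sketch. The shard count and the `Shard` struct are hypothetical illustrations; in the real design this state lives inside Redis nodes and the check-and-push would be a single module command, not application code:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashSet, VecDeque};
use std::hash::{Hash, Hasher};

/// One queue shard. In the proposed design this state would live inside
/// a Redis node; the set gives O(1) dedup, the deque is the job queue.
#[derive(Default)]
struct Shard {
    seen: HashSet<String>,   // domains ever enqueued on this shard
    queue: VecDeque<String>, // jobs waiting to be crawled
}

impl Shard {
    /// Enqueue a domain unless it was seen before. On a Redis node this
    /// check-and-push must run as one atomic command, otherwise two
    /// crawler instances could enqueue the same domain concurrently.
    fn try_enqueue(&mut self, addr_key: &str) -> bool {
        if self.seen.insert(addr_key.to_string()) {
            self.queue.push_back(addr_key.to_string());
            true
        } else {
            false
        }
    }
}

/// Pick the shard that owns a given addr_key. Because the mapping is fixed
/// up front (pre-sharding), no cross-node coordination is needed per job.
fn shard_for(addr_key: &str, num_shards: usize) -> usize {
    let mut h = DefaultHasher::new();
    addr_key.hash(&mut h);
    (h.finish() as usize) % num_shards
}

fn main() {
    let num_shards = 4;
    let mut shards: Vec<Shard> = (0..num_shards).map(|_| Shard::default()).collect();

    for d in ["example.com", "example.org", "example.com"] {
        let s = shard_for(d, num_shards);
        let enqueued = shards[s].try_enqueue(d);
        println!("{d} -> shard {s}, enqueued: {enqueued}");
    }

    let total: usize = shards.iter().map(|s| s.queue.len()).sum();
    println!("total queued: {total}"); // the duplicate discovery was rejected
}
```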
Using the correct underlying data types (mostly sets, plus a Bloom filter for history) together with batching and pipelining, we can get solid throughput, low CPU usage per Redis node, and decent reliability and scalability.
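To show why a Bloom filter fits the history use case (tiny memory per domain, no false negatives), here is a minimal stdlib-only Rust sketch using double hashing. The sizes and hash construction are illustrative only; on an actual Redis node this would be a module-side or RedisBloom-style structure, not application code:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal Bloom filter: k bit positions per item derived from two base
/// hashes as h1 + i*h2. False positives are possible, false negatives are not.
struct Bloom {
    bits: Vec<u64>,
    m: usize, // number of bits
    k: usize, // probes per item
}

impl Bloom {
    fn new(m: usize, k: usize) -> Self {
        Bloom { bits: vec![0; (m + 63) / 64], m, k }
    }

    fn hashes(&self, item: &str) -> (u64, u64) {
        let mut h1 = DefaultHasher::new();
        item.hash(&mut h1);
        let mut h2 = DefaultHasher::new();
        // Derive a second, independent-ish hash by seeding with extra data.
        0xdead_beef_u64.hash(&mut h2);
        item.hash(&mut h2);
        (h1.finish(), h2.finish())
    }

    fn insert(&mut self, item: &str) {
        let (h1, h2) = self.hashes(item);
        for i in 0..self.k as u64 {
            let bit = (h1.wrapping_add(i.wrapping_mul(h2)) % self.m as u64) as usize;
            self.bits[bit / 64] |= 1u64 << (bit % 64);
        }
    }

    /// `false` means definitely never inserted; `true` means probably inserted.
    fn contains(&self, item: &str) -> bool {
        let (h1, h2) = self.hashes(item);
        (0..self.k as u64).all(|i| {
            let bit = (h1.wrapping_add(i.wrapping_mul(h2)) % self.m as u64) as usize;
            (self.bits[bit / 64] & (1u64 << (bit % 64))) != 0
        })
    }
}

fn main() {
    let mut history = Bloom::new(1 << 16, 4);
    history.insert("example.com");
    println!("example.com seen: {}", history.contains("example.com")); // true
    println!("unknown.org seen:  {}", history.contains("unknown.org"));
}
```

The trade-off for the history set is that a false positive only means a domain is skipped that could have been recrawled, which is acceptable; a plain set would grow without bound as discovery outpaces processing.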
Careful expiration could help avoid memory overflow on a Redis node - we always discover domains faster than we can process them.
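A minimal sketch of the expiration idea, assuming the simplest policy of capping the pending backlog and dropping the oldest discoveries first. In Redis this would be done with key TTLs or an eviction policy rather than application code, and the capacity number here is made up:

```rust
use std::collections::VecDeque;

/// Bounded backlog of discovered-but-unprocessed domains. Since discovery
/// always outpaces processing, an unbounded backlog would eventually exhaust
/// a Redis node's memory; here the oldest entries expire once the cap is hit.
struct Backlog {
    cap: usize,
    pending: VecDeque<String>,
    dropped: u64, // discoveries that expired unprocessed
}

impl Backlog {
    fn new(cap: usize) -> Self {
        Backlog { cap, pending: VecDeque::new(), dropped: 0 }
    }

    fn discover(&mut self, domain: &str) {
        if self.pending.len() == self.cap {
            self.pending.pop_front(); // expire the oldest discovery
            self.dropped += 1;
        }
        self.pending.push_back(domain.to_string());
    }
}

fn main() {
    let mut backlog = Backlog::new(2); // tiny cap just to show the effect
    for d in ["a.com", "b.com", "c.com"] {
        backlog.discover(d);
    }
    println!("pending: {:?}, dropped: {}", backlog.pending, backlog.dropped);
    // pending: ["b.com", "c.com"], dropped: 1
}
```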