
Separate multi commands to support clustering with redislabs #548

Closed
wants to merge 1 commit

Conversation


@sioked sioked commented Mar 24, 2015


@@ -197,12 +197,15 @@ exports.removeBadJob = function (id) {
client.multi()
.del (client.getKey('job:' + id + ':log'))
.del (client.getKey('job:' + id))
.exec();
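To illustrate the split this PR makes, here is a minimal sketch using a stub client that records commands instead of talking to redis. The `q:` prefix and the `removeBadJob` shape mirror the diff above, but the zset names and the stub itself are assumptions for illustration, not kue's actual internals:

```javascript
// Stub "multi" that records commands instead of sending them to redis,
// purely to show the grouping idea: one MULTI per co-located key family.
function makeStubMulti(batches) {
  return {
    commands: [],
    del(key) { this.commands.push(['del', key]); return this; },
    zrem(key, member) { this.commands.push(['zrem', key, member]); return this; },
    exec() { batches.push(this.commands); }
  };
}

const batches = [];
const id = 42;
const getKey = (k) => 'q:' + k; // kue prefixes its keys with "q:"

// Batch 1: the job hash and its log, both under q:job:<id>* (co-located keys)
makeStubMulti(batches)
  .del(getKey('job:' + id + ':log'))
  .del(getKey('job:' + id))
  .exec();

// Batch 2: zset bookkeeping under q:jobs* -- a different key family that a
// clustered redis may place on another node, hence the separate MULTI
makeStubMulti(batches)
  .zrem(getKey('jobs'), id)
  .zrem(getKey('jobs:inactive'), id)
  .exec();
```

With a single combined MULTI, a cluster that places `q:job:42` and `q:jobs` on different nodes cannot execute the transaction; split in two, each MULTI only touches keys on one node.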
Collaborator

can you elaborate on this? @sioked

@behrad
Collaborator

behrad commented Apr 20, 2015

BUZZ!

@sioked
Author

sioked commented Jul 8, 2015

Sorry- this is a really late response. I was using Kue in an environment where we were attempting to process millions of jobs a day across approximately 50 servers, each with at least 8 different workers running at the same time. We experienced performance issues and took the opportunity to test clustering using redislabs' clustered redis instances.

To make this work I had to separate some of the multi commands, because a MULTI has to execute against a single node and I couldn't guarantee that the jobs: zsets would be on the same node as a specific job hash. By separating these multi commands, I could ensure that all hash data for a job lives on one specific node, that the zsets all live on their own node, and that the multi functionality still works.

The same would be true with the new cluster spec in Redis 3, except we would need to update all of the keys in kue to follow the key hash-tag convention.

@sioked
Author

sioked commented Jul 8, 2015

BTW, we ended up moving away from kue - even with clustering we kept running into performance issues on redis. The ZSETs were just too slow for our needs, and when we got a lot of failures or any of our ZSETS got large, it just slowed down even more until eventually redis would essentially not function. We switched over to a simpler (non-priority queue) queueing library - node-resque. Kue is still my preference, but at that scale I had to switch.

@sioked sioked changed the title Reverting some multi operations to allow separation of commands for redi... Separate multi commands to support clustering with redislabs Jul 8, 2015
@behrad
Collaborator

behrad commented Jul 8, 2015

happy to hear from you @sioked

The same would be true with the new cluster spec in Redis 3, except we would need to update all of the keys in kue to follow the keys hash tags.

I'd love to add support for Redis 3 cluster. Can you confirm this PR works Kue-wide and has also been tested with a Redis 3 cluster? (performance aside)

BTW, we ended up moving away from kue - even with clustering we kept running into performance issues on redis. The ZSETs were just too slow for our needs, and when we got a lot of failures or any of our ZSETS got large, it just slowed down even more until eventually redis would essentially not function. We switched over to a simpler (non-priority queue) queueing library - node-resque. Kue is still my preference, but at that scale I had to switch

What a pity! You mean you chose SETs over ZSETs!? I hadn't experienced such a situation myself, processing over 1.5-2 million jobs a day with ZSETs on a single server. Your scale must be large and interesting.
I'd love to hear more on this. Can you give us some more numbers?

How many millions of jobs per day?

With how many redis instances?

What was concurrency of each of those 8 workers (job-types) on each machine?

@sioked
Author

sioked commented Jul 8, 2015

What was concurrency of each of those 8 workers (job-types) on each machine?

We had about 20 different job types that we were running - a message would come in and be translated to a new job type so that the correct server (or group of servers) would pick it up. We only had single concurrency per worker, but we would run one worker per core on the machine - hence 8 workers. That increased the number of connections to redis, but improved the load on our machines by spreading it out across each CPU core. This also allowed us to scale the servers independently (but had more connections to redis which was likely part of our performance issues).
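For a rough sense of the connection pressure this setup creates, here is the back-of-envelope arithmetic from the numbers in this thread. The per-worker connection count is an assumption (a kue process typically holds at least a command connection plus a blocking/events connection, assumed here as 2):

```javascript
// Back-of-envelope connection count: ~50 servers, one worker process per
// core (8 cores), and an assumed 2 redis connections per worker process.
const servers = 50;
const workersPerServer = 8;
const connectionsPerWorker = 2; // assumption -- not measured in the thread

const totalConnections = servers * workersPerServer * connectionsPerWorker;
```

That is on the order of 800 long-lived connections against a single-threaded redis, which fits the performance issues described.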

How many millions jobs per day?
With how many redis instances?

We were processing ~25mm jobs per day with a single redis instance. We moved to a clustered environment for a short period and ran the same volume through it, but still hit performance issues. Our redis was through a hosted provider, so we also had a significant amount of network overhead (performance would be better if the servers and the redis instance were on the same network).

You mean you chose SETs over ZSETs

Actually, no - we used lists and serialized the json for each job directly into the list instead of referencing a separate hash. I wasn't sure how the serialized json would work out, but because we significantly reduced the number of calls to redis and used faster operations (LPUSH, LPOP), performance is no longer a problem. However, we don't get prioritized jobs, a nice admin panel, easy retry functionality with delays, etc.
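A minimal in-memory model of the list-based approach described above (illustration only; node-resque's actual implementation differs):

```javascript
// LPUSH/RPOP modeled on a plain array: push serialized job JSON at the head,
// pop from the tail, so the oldest job comes out first (FIFO). Every
// operation is O(1); there is no priority ordering to maintain as with a ZSET.
const list = [];

function lpush(job) {
  list.unshift(JSON.stringify(job)); // serialize the job directly into the list
}

function rpop() {
  const raw = list.pop();
  return raw === undefined ? null : JSON.parse(raw);
}

lpush({ type: 'email', to: 'a@example.com' });
lpush({ type: 'email', to: 'b@example.com' });

const first = rpop(); // the oldest job: a@example.com
```

The trade-off is exactly as stated above: constant-time pushes and pops, at the cost of losing priorities and per-job metadata lookups.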

I'd love to add support for Redis 3 cluster. Can you confirm this PR works Kue-wide and has also been tested with a Redis 3 cluster? (performance aside)

Are you comfortable with changing the keys to use key hashing? I'd be happy to make the update to the PR and can test with Redis 3 cluster. We can start with a hash around: q:{jobs} for any of the major kue zsets and q:job:{id} for any job-specific hashes or logs. This would force q:{jobs}, q:{jobs}:*, q:{jobs}:*:*, all onto a single redis instance while the individual jobs would end up on different instances.
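The proposed scheme works because of Redis Cluster hash tags: when a key contains `{...}`, only the text between the first pair of braces is hashed (CRC16-XMODEM mod 16384, per the cluster spec), so every key sharing a tag maps to the same slot. A self-contained sketch of the slot calculation:

```javascript
// CRC16-XMODEM (poly 0x1021, init 0), the checksum Redis Cluster uses for
// key-to-slot mapping. ASCII keys only in this sketch.
function crc16(str) {
  let crc = 0;
  for (let i = 0; i < str.length; i++) {
    crc ^= str.charCodeAt(i) << 8;
    for (let j = 0; j < 8; j++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// Hash-tag rule: if the key has a non-empty {...}, hash only its content.
function slot(key) {
  const open = key.indexOf('{');
  if (open !== -1) {
    const close = key.indexOf('}', open + 1);
    if (close > open + 1) key = key.slice(open + 1, close);
  }
  return crc16(key) % 16384;
}

// q:{jobs} and q:{jobs}:inactive share the "jobs" tag -> same slot, while
// q:job:{42} and q:job:{42}:log co-locate per job id.
```

So all the zset bookkeeping lands on one node, and each job's hash and log land together on whichever node owns that job's tag.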

Not sure if that would give us a huge performance improvement, but it at least would move some of the commands off the main redis instance? Let me know what you think.

@behrad
Collaborator

behrad commented Jul 8, 2015

As I expected from your first comment:

  1. You are creating too many client connections this way, and redis is a single-threaded process.
  2. I don't know your workers' characteristics, but you may be hurting performance by setting worker concurrency to 1 (meaning 8*1 jobs on each machine at the same time).
  3. Using redis instances outside your local network would also hurt. I'd prefer a near/dedicated redis for each of my kue clusters.

I haven't researched LIST vs ZSET performance; the redis folks could give us more detailed hints on this. However, kue could also support non-priority queues for better performance. That looks feasible to me at first glance :)

Are you comfortable with changing the keys to use key hashing? I'd be happy to make the update to the PR and can test with Redis 3 cluster

That sounds great to me. Can you also read #642 and #652 so that we can get all of these moving in one direction?

@behrad
Collaborator

behrad commented Oct 14, 2016

Closing as being old enough

@behrad behrad closed this Oct 14, 2016