Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sidekiq 5 UI compatibility #458

Merged
merged 5 commits into from Nov 28, 2021
Merged

Sidekiq 5 UI compatibility #458

merged 5 commits into from Nov 28, 2021

Conversation

ananthakumaran
Copy link
Collaborator

@ananthakumaran ananthakumaran commented Oct 10, 2021

Adds support for sidekiq 5 UI. Threads (aka concurrency) is set to 0 since we allow custom dequeuers which might allow jobs based on different constrain. Busy count should be accurate. The Quiet button would cause the node to unsubscribe from all queues.

Code

  1. This change introduces the concept of worker nodes. The list of active worker nodes is tracked in a set exq:processes. For each worker node, some metadata like subscribed queues, number of busy jobs, start time etc are tracked. A new GenServer called Exq.Node.Server is introduced to send heartbeat and the metadata info to the redis every 5 seconds. It also cleans up the dead worker nodes' info from redis.
  2. The concept of per worker signal is introduced. Currently we support TSTP (quiet), which will ask the worker node to unsubscribe all active queue subscription.
  3. Per worker metadata is stored under exq:{node_id} hash and kept updated by the GenServer. This info will expire in 60 seconds (which should happen when the worker dies)
  4. Per worker active jobs are stored in exq:{node_id}:workers hash. Each individual job will get removed once it's done. The whole hash is deleted on node startup and on dead node cleanup

Upgrade Instructions

There are 2 types of breaking changes introduced by this PR.

  1. We used to store the set of actives jobs under exq:processes, but the list of active worker nodes node_id is being stored now. The new version gracefully handles unknown entries (it will be cleaned up eventually by prune_dead_nodes). So upgrade should not be an issue. But the old code won't handle node_id (non json), so it can't be rolled back without first deleting the exq:processes set manually from redis console.
  2. Some of the data structures used also got changed which will break ExqUI. Specially Exq.Api.processes/1 will return struct with different attributes. Those who use ExqUI should plan the upgrade of both Exq and ExqUI. The next release of ExqUI will be compatible with the new struct.

screenshot from single node
image

screenshot from multiple nodes (using custom identifier functionality)
image

fixes #439 #289 #288

The old version used to store job details under processes key. The
periodic prune will remove those jobs as well any dead nodes from the
processes set
@ananthakumaran ananthakumaran marked this pull request as ready for review October 15, 2021 08:48
@akira
Copy link
Owner

akira commented Oct 25, 2021

Are these changes compatible with existing ExqUI or is the idea to release it together with new UI rewrite?

@ananthakumaran
Copy link
Collaborator Author

Are these changes compatible with existing ExqUI or is the idea to release it together with new UI rewrite?

I haven't tested it, but I guess it would be broken because the Process struct got updated. Yes, my plan is to make a exq release followed by the new exq_ui release.

JobStat.node_ping(redis, namespace, node)
|> process_signal(state)

if Integer.mod(state.ping_count, 10) == 0 do
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is 10 for? Should be configured elsewhere or constant?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to remove the dead nodes from the processes set, the mod 10 is just a optimization to avoid doing the cleanup check on each ping instead do it on every 10th ping. The number 10 itself comes from sidekiq implementation. If you feel this should be configurable, I can move it to application config

@akira
Copy link
Owner

akira commented Nov 3, 2021

Possible to add more description of changes in PR / commit such as adding new GenServer for Node, changes to JobStat, etc.


defp process_signal(nil, _), do: :ok

defp process_signal("TSTP", state) do
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this one for? I didn't see it used elsewhere..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Quiet button in the UI pauses the worker (the worker stops picking up any new job). This is done via a signal list. UI will push TSTP string into the list and the worker will pop that and perform the action. I plan to add same feature in the Exq UI later as well.

@akira
Copy link
Owner

akira commented Nov 3, 2021

Anything special instructions needed when upgrading to this branch once it lands? For handling existing inflight jobs / working nodes, job stats, etc..

@ananthakumaran
Copy link
Collaborator Author

ananthakumaran commented Nov 4, 2021

Anything special instructions needed when upgrading to this branch once it lands? For handling existing inflight jobs / working nodes, job stats, etc..

If ExqUI is not used, then nothing needs to be done. There are some breaking changes related to where which info is stored, but the prune_dead_nodes parts will take care of the migration (it will remove the old job entries from exq:processes set). Though, the upgrade can't be rolled back easily. As the old code will expect jobs in exq:processes set. To rollback the exq:processes needs to be deleted (we can include this instruction as part of release info)

If ExqUI is used, then the user has to wait for the new ExqUI release and upgrade ExqUI after Exq upgrade.

@ananthakumaran
Copy link
Collaborator Author

Possible to add more description of changes in PR / commit such as adding new GenServer for Node, changes to JobStat, etc.

I have updated the PR description with more info. Let me know if anything else needs to be covered

@akira
Copy link
Owner

akira commented Nov 12, 2021

Thanks for the updates! Looks good I think - is the other PR also ready in ExqUI or still pending work on it?

@ananthakumaran
Copy link
Collaborator Author

Thanks for the updates! Looks good I think - is the other PR also ready in ExqUI or still pending work on it?

akira/exq_ui#102 is ready for review. It includes the busy tab and related changes introduced in this branch.

@akira
Copy link
Owner

akira commented Nov 15, 2021

@ananthakumaran Great thanks, will take a look at the other one as well, sorry for delay

@akira
Copy link
Owner

akira commented Nov 28, 2021

Thanks for the updates! Looks good I think - is the other PR also ready in ExqUI or still pending work on it?

akira/exq_ui#102 is ready for review. It includes the busy tab and related changes introduced in this branch.

Sounds good, I tried it out and works well! 🎉 Let me know your preference for release timing on these. For the exq_ui maybe some additional unit tests (can be separate PR also).

@ananthakumaran
Copy link
Collaborator Author

Both PRs can be merged to master. As for release, we could take a bit more time. I will open separate PRs for CI setup and tests for exq_ui. We will also try to get this tested out in our prod service (which is getting delayed due to phoenix version dependency)

@akira
Copy link
Owner

akira commented Nov 28, 2021

Both PRs can be merged to master. As for release, we could take a bit more time. I will open separate PRs for CI setup and tests for exq_ui. We will also try to get this tested out in our prod service (which is getting delayed due to phoenix version dependency)

Sounds good! I added as maintainer to the other project as well, please feel free to merge.

@akira
Copy link
Owner

akira commented Dec 12, 2021

This has been released, Exq 0.16.0 and ExqUI 0.1.0 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants