JobSchedulerService "Couldn't execute next scheduler loop iteration" on non-master #6451

Closed
coffee-squirrel opened this issue Sep 16, 2019 · 4 comments · Fixed by #6816

@coffee-squirrel coffee-squirrel commented Sep 16, 2019

After upgrading from 3.0.2 to 3.1.2, we see the following logged every second on the one non-master node in this cluster:

2019-09-16T13:45:14.378-05:00 INFO  [JobSchedulerService] Couldn't execute next scheduler loop iteration. Waiting and trying again.
  • This message is not logged on the master node
  • The master node is the only one with is_master = true
  • A restart (of either node) makes no difference
  • Each node has a unique ID in /etc/graylog/server/node-id
  • Message processing doesn't seem to be impacted

Expected Behavior

Clean upgrade from 3.0.2 to 3.1.2.

Current Behavior

The job scheduler seems to have started on a non-master node?

Steps to Reproduce (for bugs)

  1. Have a 3.0.2 cluster (packages installed via repo+yum) with 1 master node and 1 non-master node
    • Plugins: AWS plugins, Collector, Enterprise Integrations, Graylog Enterprise, Integrations, Threat Intelligence Plugin
  2. Remove the 3.0 repo on the master node
  3. Add the 3.1 repo to the master node
  4. Install version 3.1.2-1 of graylog-server (and aforementioned plugins) on the master node
  5. Restart graylog-server on the master node
  6. Repeat 2-5 for the non-master node

Context

Attempting to roll out 3.1.x.

Your Environment

  • Graylog Version: 3.1.2
  • Java Version: AdoptOpenJDK 1.8.0_222
  • Elasticsearch Version: 6.6.2-1
  • MongoDB Version: 4.0.6-1
  • Operating System: RHEL 6.9

@mpfz0r mpfz0r commented Sep 17, 2019

This can happen if you installed the Enterprise plugin but either don't have a valid license
or have exceeded your license.

@coffee-squirrel coffee-squirrel commented Sep 17, 2019

@mpfz0r Thanks, that matches up. This particular lower environment has the Enterprise plugins (as the default behavior of the Chef cookbook) but does not have a license applied.

Are there plans to change this behavior, or should we just exclude the plugins whenever there won't be a license (or a valid license) applied?

@mpfz0r mpfz0r commented Sep 17, 2019

I guess the least we should do is reduce the number of log messages this creates right now.

@coffee-squirrel coffee-squirrel commented Sep 18, 2019

We rolled 3.1.2 out to an environment with a valid Enterprise license and (as expected based upon the earlier comment) don't see the message logged on non-master nodes.

@bernd bernd added the #S label Oct 28, 2019
@thll thll self-assigned this Nov 14, 2019
thll added a commit that referenced this issue Nov 15, 2019
When the enterprise plugin is installed but there is no valid license
for a non-master node, job execution will be disabled. The job scheduler
would log every second that the next loop iteration cannot proceed.

It will now log only when the state of the config changes.

Fixes #6451
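
The "log only when the state changes" idea from the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual Graylog change: the class name `StateChangeLogger`, the `check` method, and both message strings are hypothetical, and the logger is modeled as a plain `Consumer<String>` so the example has no dependencies.

```java
import java.util.function.Consumer;

/**
 * Hypothetical helper (not Graylog code): remembers the last observed
 * "can execute" state and emits a log message only when it changes.
 */
final class StateChangeLogger {
    private final Consumer<String> log;
    // Assumption: the scheduler starts out allowed to execute, so the
    // first "disabled" observation counts as a state change.
    private boolean lastCanExecute = true;

    StateChangeLogger(Consumer<String> log) {
        this.log = log;
    }

    /** Records the current state, logs on change, and returns the state. */
    boolean check(boolean canExecute) {
        if (canExecute != lastCanExecute) {
            // Hypothetical messages, chosen for illustration only.
            log.accept(canExecute
                    ? "Job execution enabled again. Resuming scheduler loop."
                    : "Job execution disabled. Pausing scheduler loop.");
            lastCanExecute = canExecute;
        }
        return canExecute;
    }
}
```

A scheduler loop that calls `check(...)` once per iteration would then produce one message when execution becomes disabled and one when it is re-enabled, instead of one message per second.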
@mpfz0r mpfz0r closed this in #6816 Nov 15, 2019
mpfz0r added a commit that referenced this issue Nov 15, 2019
#6816)
thll added a commit that referenced this issue Nov 20, 2019
#6816)
(cherry picked from commit 9a47813)
mpfz0r added a commit that referenced this issue Nov 21, 2019
#6816) (#6837)
(cherry picked from commit 9a47813)