Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ScheduledTaskStatistics.getLastRunDuration() returns negative values for quick executions #11929

Closed
bedrin opened this issue Dec 7, 2017 · 5 comments

Comments

@bedrin
Copy link

@bedrin bedrin commented Dec 7, 2017

I can see that for some of my scheduled jobs getLastRunDuration() returns negative values.
Issue is intermittent. It seems that it is triggered by fast jobs

@tkountis
Copy link
Contributor

@tkountis tkountis commented Dec 7, 2017

Interesting finding, thanks for the report.
I will have a look and try to reproduce, but if you have any logs that could help please share.
Maybe members going offline in the logs?

@bedrin
Copy link
Author

@bedrin bedrin commented Dec 7, 2017

I have reproduced it on my local workstation with a single member in cluster.

Looking at the code I can see that race-condition between updating lastRunStart/lastRunEnd fields and taking snapshot is possible:

  1. Thread 1 updates lastRunStart
  2. Thread 2 takes a snapshot (effectively copy values from lastRunStart and lastRunEnd)
  3. Thread 1 updates lastRunEnd

In this case result of getLastRunDuration() operation would be negative

@tkountis
Copy link
Contributor

@tkountis tkountis commented Dec 7, 2017

Can you share the reproducer here?

@bedrin
Copy link
Author

@bedrin bedrin commented Dec 7, 2017

Sorry, cannot share any code from office.

Basically what I do is call schedule job at fixed rate R (say 1 minute) and immediately after that start a timer (vanila java Timer from JRE) with the same rate which calls IScheduledFuture.getStats() method

If my job takes a few seconds to execute, IScheduledFuture.getStats() called from Timer will get into race-condition described above and print the difference between lastRunEnd from previous execution and lastRunStart from current execution which is obviously negative

@tkountis
Copy link
Contributor

@tkountis tkountis commented Dec 7, 2017

I see. This is a good finding,
and i think there are a couple of ways to hit this.

Will fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

3 participants
You can’t perform that action at this time.