Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High CPU usage after the system time is changed #36

Closed
swebb2066 opened this issue Aug 13, 2015 · 15 comments
Closed

High CPU usage after the system time is changed #36

swebb2066 opened this issue Aug 13, 2015 · 15 comments
Assignees

Comments

@swebb2066
Copy link

The following code from networkworkhandler.c consumes 100% of the CPU for some time after the system clock is advanced a lot (e.g. > 1 day).

  actualtime = getmilliseconds();
  TheNetworkStatus.elapsedtime += actualtime - lasttime;
  lasttime = actualtime;
 /* check if we had been not able to update the connection manager for several OPENER_TIMER_TICK.
   * This should compensate the jitter of the windows timer
   */
  while (TheNetworkStatus.elapsedtime >= OPENER_TIMER_TICK)
    {
      /* call manage_connections() in connection manager every OPENER_TIMER_TICK ms */
      manageConnections();
      TheNetworkStatus.elapsedtime -= OPENER_TIMER_TICK;
    }
@azoitl
Copy link
Contributor

azoitl commented Aug 13, 2015

The problem is that this code handles the fact when opener is blocked by the operating system longer then the OPENER_TIMER_TICK. This helps to keep the overall timing behavior stable. The problem now with step changes is that this code wants to compensate for the step change. Because for OpENer it looks like the operating system has blocked OpENer for a long time.

In order to handle this correctly the process of making the step change would need to update lasttime. or that the getmilliseconds uses a timer which is not effected by the step change.

@swebb2066
Copy link
Author

I do not see why multiple calls to manageConnections() (in a busy loop) is the right thing to do.

How about passing "elapsedtime" to manageConnections() so the "InnacitvityWatchdogTimer" and "TransmissionTriggerTimer" are updated all at once and changing "while" to an "if"? Then TheNetworkStatus.elapsedtime can be zeroed.

@swebb2066 swebb2066 reopened this Aug 15, 2015
@azoitl
Copy link
Contributor

azoitl commented Aug 25, 2015

Never thought about given the full time elapsed to manageConnections. Maybe even passing the current time could help simplifying the implementations of the timers because now instead of performing calculations every time only simple comparisons could be done.

This would at least help if the step change is positive. For negative step changes a different solution needs to be found.

Also applications depending on certain cycle times may get in trouble.

But it is definitely worth a try.

@liftoff-sr
Copy link

No. The solution is to use a bug free getmilliseconds(). Fundamentally, this is one implemented such that it not even see the change in the calendar time. You need to use a monotonically increasing function, devoid of influence from changes in calendar time. The rest of the logic is probably fine, I've kept it in my implementation.

@MartinMelikMerkumians
Copy link
Member

I like your solution. This would also solve the problem if the step change is negative due to time synchronization.

@swebb2066
Copy link
Author

I still do not see why multiple calls to manageConnections() (in a busy loop) is the right thing to do.

@liftoff-sr
Copy link

unsigned GetMilliSeconds( void )
{
struct timespec now;

clock_gettime( CLOCK_MONOTONIC, &now );

unsigned msecs = ((unsigned)now.tv_nsec)/(1000*1000) + now.tv_sec * 1000;
return msecs;

}

Dick

@azoitl
Copy link
Contributor

azoitl commented Mar 2, 2016

@swebb2066: the problem is that select may be blocked longer then the given time. As the manageConnections does internal some sums with expected cycle times the some can drift more and more away from the real time. This can lead to cases that timers are then much longer then they should be. The mulitple mangeConnection invocation does compensate this behavior.

@swebb2066
Copy link
Author

Why would multiple calls to m_pfTimeOutFunc and m_pfSendDataFunc from manageConnections() be the correct thing to do when cycle times drift?

Why would using the (bug free) elapsedtime to update timers in manageConnections() be worse than using the fixed constant OPENER_TIMER_TICK in manageConnections()?

@azoitl
Copy link
Contributor

azoitl commented Mar 2, 2016

You are right when changing the interface to give manageConnections the elapsed time you don't need to multiple call manageConnections. Still you need the correct time function as presented above by @liftoff-sr

@MartinMelikMerkumians
Copy link
Member

@swebb2066 I am with you that your suggested fix should also be applied, but as I had troubles with negative time steps already in the original implementation, so a fix for that is also needed.

@liftoff-sr
Copy link

To top most CMakeLists.txt, please add these two lines:

enable clock_gettime:

add_definitions( -D_POSIX_C_SOURCE=199309L )

in order for clock_gettime( CLOCK_MONOTONIC, &now ); to compile with the C language. (C++ does not need this.)

Then you may have to link to library "rt" additionally to provide that clock_gettime() function on linux.

@MartinMelikMerkumians
Copy link
Member

Putting the definition in the top most CMakeLists.txt would add the define also to the Windows build.

I added it to the POSIX specific CMake file, and applied your suggested fix to the GetMicroSeconds function. At least on my machine I didn't need the rt library.

@MartinMelikMerkumians
Copy link
Member

@swebb2066 I'm not sure, but this could also fix your issue as well, as CLOCK_MONOTONIC should not experience discontinuities due to NTP. (At least thats what I have interpreted from the description I found of CLOCK_MONOTONIC behavior)

But a fix for the Windows version is still missing.

@MartinMelikMerkumians
Copy link
Member

After digging into the topic if the Windows implementation also needs changes, I found out that the use performance counter is behaving monotonic.
So the Windows implementation does not need to change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants