
Timing issue (despite PREEMPT RT patch) #330

Closed

miscz opened this issue Aug 7, 2019 · 8 comments

Comments

miscz commented Aug 7, 2019

Hello!

I am facing a problem similar to robotixdeveloper's in issue #296. Running in cyclic synchronous position mode I am losing packets and the servo is making cracking noises.
My setup is also very similar: a Lenovo T460s with a Core i5 and an Intel Ethernet Connection I219-V. Following the instructions from #296 and others I did the following:

  • patched my Ubuntu 16.04 with the PREEMPT RT 5.2.0 kernel
  • added a function set_realtime_priority() (see below) in order to set the RT thread ecatthread from the red_test example to priority 99
#include <pthread.h>
#include <sched.h>
#include <iostream>

void set_realtime_priority() {

     int ret;

     // We'll operate on the currently running thread.
     pthread_t this_thread = pthread_self();

     // struct sched_param is used to store the scheduling priority
     struct sched_param params;

     // We'll set the priority to the maximum.
     params.sched_priority = sched_get_priority_max(SCHED_FIFO);

     std::cout << "Trying to set thread realtime prio = " << params.sched_priority << std::endl;

     // Attempt to set thread real-time priority to the SCHED_FIFO policy
     ret = pthread_setschedparam(this_thread, SCHED_FIFO, &params);
     if (ret != 0) {
         // Print the error
         std::cout << "Unsuccessful in setting thread realtime prio" << std::endl;
         return;
     }

     // Now verify the change in thread priority
     int policy = 0;
     ret = pthread_getschedparam(this_thread, &policy, &params);
     if (ret != 0) {
         std::cout << "Couldn't retrieve real-time scheduling paramers" << std::endl;
         return;
     }

     // Check the correct policy was applied
     if(policy != SCHED_FIFO) {
         std::cout << "Scheduling is NOT SCHED_FIFO!" << std::endl;
     } else {
         std::cout << "SCHED_FIFO OK" << std::endl;
     }

     // Print thread scheduling priority
     std::cout << "Thread priority is " << params.sched_priority << std::endl;
}
  • changed the TG3 settings of my Ethernet adapter as mentioned in drvcomment.txt (ethtool -C eth0 rx-usecs 0 rx-frames 1 tx-usecs 0 tx-frames 1)
  • added time measurement to the RT thread to capture the cycle time (this does not affect the timing behavior of the thread):
// take the current time and compute the elapsed time since the previous send
clock_gettime(CLOCK_MONOTONIC, &tspec_now);
cycletime[i] = (tspec_now.tv_sec*NSEC_PER_SEC + tspec_now.tv_nsec) - (tspec_b4.tv_sec*NSEC_PER_SEC + tspec_b4.tv_nsec);
ec_send_processdata();
// timestamp taken right after sending this cycle's process data
clock_gettime(CLOCK_MONOTONIC, &tspec_b4);

The test program runs for 20 seconds with a cycle time of 2 milliseconds. I made plots of the cycle time, of toff (each with a histogram), and of ec_DCtime:

[plots: cycletime, cycletime_hist, toff, toff_hist, DCtime]

You can also find the Wireshark capture of this test here:
test3.pcapng.txt

Any suggestions as to what might be wrong with my code or my setup? Is it correct that ec_DCtime has modulo behavior, i.e. a limited range?

Thank you for your help!

miscz commented Aug 7, 2019

By the way, here is a cyclictest run on the RT kernel:

[plot: cyclictest_for_rt]

miscz commented Aug 30, 2019

Does anyone have a suggestion?

I think one reason could be that my kernel is not configured correctly. So I will try an older RT kernel that is closer to my non-RT kernel version, to avoid major changes in the kernel config.

But one issue I still have is the modulo behavior of ec_DCtime. Is it correct as shown in the diagram? I expected to see a straight line like the identity function. It seems that a lot of the cracking noises occur when ec_DCtime wraps from the very high value 4.xE+09 back to 0. Might this cause a problem in the ec_sync() function with its PI parameters, which would result in skipped packets because toff gets too high?
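
For reference, the PI correction in the SOEM red_test example looks roughly like this (reproduced from memory, with the SOEM int64 typedef replaced by int64_t to make the snippet self-contained, so treat it only as a sketch):

#include <stdint.h>

// PI calculation to keep the Linux cycle in sync with DC time,
// roughly as in the SOEM red_test/ebox examples (sketch from memory)
void ec_sync(int64_t reftime, int64_t cycletime, int64_t *offsettime)
{
    static int64_t integral = 0;
    int64_t delta;
    // set Linux sync point 50 us later than DC sync, just as an example
    delta = (reftime - 50000) % cycletime;
    if (delta > (cycletime / 2)) { delta = delta - cycletime; }
    if (delta > 0) { integral++; }
    if (delta < 0) { integral--; }
    *offsettime = -(delta / 100) - (integral / 20);
}

With a 2 ms cycle, 2^32 ns is not a multiple of the cycle time, so a 32-bit wrap of reftime makes delta jump, which would show up directly as a spike in toff.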

ArthurKetels (Contributor) commented Aug 31, 2019

Sorry for not answering earlier. OS timing issues are not exactly on topic for SOEM.

"But one issue I still have is the modulo behavior of ec_DCtime."

This is indeed an issue. The sample code assumes a 64-bit DCtime, but some slaves only provide 32 bits. It is up to you to extend this to 64 bits; only then can you take the direct difference as in the example.

Extending or not, some code change is necessary.
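
A minimal sketch of such an extension, assuming your application only sees the low 32 bits of the DC time (the function name is illustrative):

#include <stdint.h>

// Extend a wrapping 32-bit DC time to a monotonically increasing 64-bit value.
// Call once per cycle with the raw 32-bit reading.
int64_t dctime_to_64bit(uint32_t dctime32)
{
    static uint32_t last = 0;
    static int64_t high = 0;

    if (dctime32 < last)           // the 32-bit counter wrapped around
        high += 0x100000000LL;
    last = dctime32;

    return high + (int64_t)dctime32;
}

With the extended value the direct difference works again and the wrap no longer disturbs the sync.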

Next to that, you do have some configuration issues. A Linux kernel with the preempt-rt patch should have a maximum latency jitter of around 50 us on modern hardware. My advice is to stick to a vanilla kernel and patch it with the latest preempt-rt patch set. Configure a minimal system with debugging off. CPU isolation for non-critical processes might help too. Keep kernel processes at a higher priority than your SOEM application (SOEM needs the kernel to be responsive).

Priority stack, highest first (see the sketch below):
1. Hardware drivers
2. Kernel
3. SOEM + application
4. All others
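
As a hedged illustration of this stack: instead of taking the maximum priority as in the snippet above, pick a SCHED_FIFO priority below the kernel's threaded interrupt handlers (which default to priority 50 on preempt-rt); the value 45 is only an example:

#include <pthread.h>
#include <sched.h>

// Example only: run the SOEM cycle below the threaded IRQ handlers
// (default SCHED_FIFO priority 50 on preempt-rt) but above everything else.
static int set_soem_priority(void)
{
    struct sched_param param;
    param.sched_priority = 45;   // illustrative value, below the IRQ threads
    return pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
}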

miscz commented Sep 6, 2019

Thank you for your answer! I will try to reconfigure the kernel.

Regarding the DC: can you tell me which register the DC uses, so I can check whether it is 32 or 64 bits long?

PaninP commented Sep 12, 2019

May I ask how you obtained the histogram data from the Ethernet port? I want to investigate my timing issue too.
Thank you for your help.

tecodrive commented Nov 7, 2019

I use Ubuntu Server 18.04 with the low-latency kernel. Important: no monitor or USB device (keyboard, mouse) connected. The NIC is an Intel i210-AT or i211-AT, and I use the isolcpus kernel parameter to isolate a physical core. If hyperthreading is active, you must isolate the two logical CPUs that share a physical core, e.g. (0 & 2) or (1 & 3). The rx & tx interrupts of the NIC should be assigned to the isolated CPU. The RT thread runs on the isolated CPU; all non-RT threads and interrupts should run on the non-isolated CPUs. My machine runs with a 250 us cycle time and a maximum latency below 5 us.

Within each cycle I first use a while loop to catch the exact next clock edge, send and receive the EtherCAT PDOs, and then spend roughly 70% of the cycle time in clock_nanosleep (a sketch follows below).
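
A rough sketch of that cycle structure, assuming a 250 us cycle and the standard SOEM calls ec_send_processdata()/ec_receive_processdata(); the 70/30 split and all names are illustrative:

#include <time.h>
#include "ethercat.h"

#define NSEC_PER_SEC 1000000000L
#define CYCLE_NS     250000L      // 250 us cycle time (example)

static void add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) { t->tv_nsec -= NSEC_PER_SEC; t->tv_sec++; }
}

void cyclic_task(void)
{
    struct timespec cycle_start, wake, now;
    clock_gettime(CLOCK_MONOTONIC, &cycle_start);

    for (;;)
    {
        // exchange process data right at the cycle edge
        ec_send_processdata();
        ec_receive_processdata(EC_TIMEOUTRET);

        // sleep for roughly 70% of the cycle ...
        wake = cycle_start;
        add_ns(&wake, (CYCLE_NS * 70) / 100);
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &wake, NULL);

        // ... then busy-wait to catch the exact start of the next cycle
        add_ns(&cycle_start, CYCLE_NS);
        do {
            clock_gettime(CLOCK_MONOTONIC, &now);
        } while (now.tv_sec < cycle_start.tv_sec ||
                 (now.tv_sec == cycle_start.tv_sec && now.tv_nsec < cycle_start.tv_nsec));
    }
}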

The I219 is very bad because of the very high latency on the receiving side. There is no chance of getting good performance with it.

Note the following: do not use SDOwrite or SDOread in the same RT thread. Both send several EtherCAT frames and wait for the answer, so your RT thread is blocked. However, running SDOwrite or SDOread in a separate, non-RT thread also gives you high latencies: if non-RT threads access the NIC and take a mutex, the RT thread is blocked. The trick is a small patch in SOEM that introduces a delay into the SDO commands, so the RT thread reads all incoming frames (SDO & PDO) from ec_receive_processdata.

Linux attempts to raise the priority of a non-RT thread holding a mutex when an RT thread requests that mutex (priority inheritance). However, this process is extremely non-deterministic in time. I tried it and ran into exactly these problems.

preempt-rt is unnecessary. Everything important from the patch for Intel processors is already contained in the low-latency kernel.

nakarlsson (Contributor) commented

Can we close this issue?

nakarlsson (Contributor) commented

Inactive
