Avoid bloat of CuiAbsolute CAN messages #217

PeterBowman · 2019-07-09T13:58:35Z

CuiAbsolute nodes are configured in the CanBusControlboard initialization step to start in continuous publishing mode (ref). We only need these devices to query absolute encoder reads on start; all subsequent measurements from the built-in relative encoders would take this initial offset into account.

Options:

Configure Cui nodes in pull (aka RPC) mode, that is, don't send an encoder read unless requested. This option is preferable, since reads would be performed on CanBusControlboard start only.
Enable Cui firmware to handle a stop publishing command.
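
The start-up offset idea described above can be sketched in C as follows. This is a minimal illustration, not the actual CanBusControlboard code: the `joint_sync_t` type and both function names are hypothetical, and angles are assumed to be expressed in degrees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: stores the offset captured once at start-up. */
typedef struct {
    double offset_deg; /* absolute minus relative reading at start-up */
} joint_sync_t;

/* Capture the offset from a single absolute encoder read (pull mode). */
joint_sync_t joint_sync_init(double absolute_deg, double relative_deg)
{
    joint_sync_t sync;
    sync.offset_deg = absolute_deg - relative_deg;
    return sync;
}

/* Every later position is the relative encoder read plus the stored offset. */
double joint_position(const joint_sync_t *sync, double relative_deg)
{
    assert(sync != NULL);
    return relative_deg + sync->offset_deg;
}
```

With this scheme the absolute encoder is read once, so no continuous stream is needed after initialization.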

PeterBowman · 2019-07-25T09:38:15Z

ASWJμ (ASW Juanmi), make sure that a relative-encoder-only configuration does not lead to an accumulation of errors in the long run, which would be mitigated with periodic checks by reading absolute encoders.

PeterBowman · 2019-07-31T21:02:07Z

If I'm correct, each Cui publishes encoder reads every 8 microseconds:

yarp-devices/firmware/CuiAbsolute/Pic_source/main.c

Lines 76 to 82 in 89e22a7

    
           /* SEND DELAY (value that Delay10TCYx will use when sending; recommended value: 1 to 100)
            * Byte 3 (data[2]) received by the PIC (a value in the range [0-255]) is multiplied by the time
            * it takes to execute the delay set by Delay10TCYx(sendDelay)
            * Keep in mind:
              - Each instruction cycle takes 0.8 microseconds to execute
              - Delay10TCYx(i) -> 10.Tcy.i produces a delay of 10 instruction cycles * i. Therefore Delay10TCYx(1) equals 8 microseconds (10 clock cycles) */
           BYTE sendDelay = 1; //Default: 1

yarp-devices/firmware/CuiAbsolute/Pic_source/main.c

Lines 180 to 190 in 89e22a7

    
           while(!stop_flag)
           {
               send();	// -- send
               for( i=0; (i<= data[2]) && (!stop_flag) ; i++ )  // DELAY: data[2] will hold a value in [0 - 255]
               {
                   Delay10TCYx(sendDelay);
                   ECANReceiveMessage(&picId, data, &dataLen, &rxflags);
                   if((data[0]==0x02 && data[1]==0x01 && data[3]==0x00 && data[4]==0x00 && data[5]==0x00 && data[6]==0x00 && data[7]==0x00) && (picId == canId-384))
                   	stop_flag=1;
               }
           }
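
The stop condition inside the loop above can be read as a small predicate. The sketch below restates it as a host-testable C function; `is_stop_command` and `STOP_ID_OFFSET` are names made up for illustration, and the frame is assumed to always carry 8 data bytes. Note that data[2] is not part of the check, exactly as in the firmware snippet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Id relation taken from the `canId-384` expression in the snippet
 * (384 == 0x180). */
#define STOP_ID_OFFSET 384u

/* Returns 1 if the received frame matches the stop pattern, 0 otherwise.
 * Hypothetical helper: it mirrors the `if` in the firmware loop. */
int is_stop_command(const uint8_t data[8], uint32_t rx_id, uint32_t can_id)
{
    size_t k;

    assert(data != NULL);

    if (rx_id != can_id - STOP_ID_OFFSET)
        return 0;
    if (data[0] != 0x02 || data[1] != 0x01)
        return 0;
    /* data[2] is deliberately skipped: the firmware does not test it. */
    for (k = 3; k < 8; k++) {
        if (data[k] != 0x00)
            return 0;
    }
    return 1;
}
```

Isolating the check this way makes it possible to exercise the frame matching on a PC without the PIC toolchain.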

PeterBowman · 2019-07-31T21:09:58Z

The ICuiAbsolute interface allows setting a Cui encoder device in either push or pull mode, that is, continuous or on-request message publishing, respectively:

yarp-devices/libraries/YarpPlugins/ICuiAbsolute.h

Lines 24 to 26 in 6f5c2da

    
           virtual bool startContinuousPublishing(uint8_t time) = 0;
           virtual bool startPullPublishing() = 0;
           virtual bool stopPublishingMessages() = 0;

The default behavior is to enable push mode on device init, since 2ed1bff. @smcdiaz, @jgvictores why wasn't pull mode chosen instead? Perhaps due to #217 (comment)?

jgvictores · 2019-08-09T12:32:49Z

Perhaps due to #217 (comment)?

Nope. Something like this sequence happened:

@smcdiaz had some control schemes in mind that involved using both relative and absolute encoders
We set absolute encoders to push continuously (to potentially save bandwidth compared with constant polling)
We had to make a design decision because YARP methods only read from one type of encoder per joint. We decided to provide the reading of the relative encoder (especially since some absolute encoders fail sometimes) and to use the absolute stream only on init (and by then we had already forgotten that we should be using the pull mechanism for that use case)

Feel free to switch to the pull mechanism, IMHO it's more efficient for the current workflow.

PeterBowman · 2019-08-09T12:42:21Z

Thanks! I guess 5f66286 was an attempt to test the behavior of relative vs absolute encoders. Can we now delete the printRelEncs branch?

jgvictores · 2019-08-09T13:15:15Z

We can delete the printRelEncs branch. I do not recall the exact functionality, but it was definitely a hack for debugging. If we ever need a similar mechanism, we can try to open a specific issue and work on a better solution.

PeterBowman · 2019-09-05T17:28:19Z

@rsantos88 has measured the amount of Cui and non-Cui messages on normal operation per 60 seconds to estimate the CAN bus load:

36500 non-Cui (13%)
252255 Cui (87%)

This adds up to 288755 CAN messages. If we assume that all incoming frames consist of 108 bits (ref), the overall RX transfer rate is ~520 kbit/s, far from the 1 Mbps but not accounting for TX nor filtered ids. Also, 252255 frames transferred in 60 seconds equals one frame per ~240 microseconds, also far from the 8 microsecond period we extract from the firmware code.
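
The arithmetic above can be reproduced with a few lines of C. This is only a sketch: the 108-bit frame size is the assumption stated in the comment, and the function names are invented for the example.

```c
#include <assert.h>

/* Average RX bit rate in bit/s for `frames` frames of `bits_per_frame`
 * bits observed over `seconds`. */
double rx_bit_rate(unsigned long frames, unsigned bits_per_frame, double seconds)
{
    assert(seconds > 0.0);
    return (double)frames * bits_per_frame / seconds;
}

/* Mean time between frames, in microseconds. */
double mean_frame_period_us(unsigned long frames, double seconds)
{
    assert(frames > 0);
    return seconds * 1e6 / (double)frames;
}
```

Calling `rx_bit_rate(288755UL, 108, 60.0)` gives about 519.8 kbit/s, and `mean_frame_period_us(252255UL, 60.0)` gives about 238 microseconds per Cui frame.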

PS would you mind checking this again with a minor change, @rsantos88? There is a potential pitfall in 70b1fa9: you are never initializing the cuiMessages and otherMessages class members, therefore they may contain some garbage value before first use. It would suffice to put cuiMessages = otherMessages = 0; in DeviceDriverImpl.cpp, for instance.

rsantos88 · 2019-09-06T09:05:57Z

PS would you mind checking this again with a minor change, @rsantos88? There is a potential pitfall in 70b1fa9: you are never initializing the cuiMessages and otherMessages class members, therefore they may contain some garbage value before first use. It would suffice to put cuiMessages = otherMessages = 0; in DeviceDriverImpl.cpp, for instance.

Ok, thanks for your advice. That has been fixed now (f90abd0).
I've taken the measurements three times to be sure of the results, using a single thread and a single CAN bus (right arm + ID28 (head)), via oneCanBusOneWrapper --from oneCanBusOneWrapper-right-arm.ini (oneCanBusOneWrapper-right-arm.ini).

Here are the results:

Time: 60.000303
[CUI messages (198215)] 
[NON-CUI messages (39990)] 

Time: 60.000152
[CUI messages (200496)] 
[NON-CUI messages (40214)]

Time: 60.001211
[CUI messages (200176)] 
[NON-CUI messages (39380)]

rsantos88 · 2019-09-06T09:34:13Z

IDEA: We could use the PCAN-View for Linux software (Software for Displaying CAN and CAN FD Messages, page 4 of website driver) to monitor CAN packets

rsantos88 · 2019-09-06T10:36:03Z

Another IDEA: use the USB-to-CAN compact interface and canAnalyser 3 on Windows

rsantos88 · 2019-09-06T18:44:12Z

Regarding the latest tests carried out today, a few comments:

I have tested the PCAN-View for Linux software and ran into problems when trying to read from the CAN network while oneCanBusOneWrapper is running, which causes it to block as soon as it connects to the CAN network. This program lets you read the packets being received over the network (continuous publishing mode of the absolute encoders), as well as the latest status packets received from the drivers. You can clearly distinguish the ID of each node's messages, the message header, the number of messages arriving, their contents, and so on. Apparently, though, the software does not support real-time monitoring while messages are being sent through the same CanBusPeak device.
Because of this, I have run tests connecting the USB-to-CAN compact interface to the CAN network of the left arm, so that this device acts as one more node of the network and can monitor all traffic. Some of the results obtained with this software may be worth analyzing. I have recorded a short video of a few seconds that shows some details:

As you can see, the first 6 packets announce the presence of the 6 drivers of the left arm
The network does not seem to show signs of saturation: the bus load reaches a maximum of 63% during the run.
A high number of error packets can be seen (total error counter: 67410). What might be causing them?
There is an overflow of 1403 packets. Why?
Finally, the software let me save the last 10,000 packets captured in this test. I leave them here for further analysis:
IXXAT_canAnalyser3_Mini_19-09-06_145740

PS: I don't know if it's the best location for this extensive issue. If necessary, you can move it to a better place

PeterBowman · 2019-09-07T11:30:46Z

Great analysis!

PS: I don't know if it's the best location for this extensive issue. If necessary, you can move it to a better place

I'm forking this issue into #231 to delegate bus load discussions. Let's focus here on Cui devices, push/pull modes and send rates.

rsantos88 · 2019-09-10T13:02:16Z

I have been looking into the improvement obtained by reducing the frequency at which the CUIs send messages (or, equivalently, increasing the period between messages) and its effect on the statistics we get from canAnalyser3 Mini. FYI, there are 3 versions of this software and we have the simplest one: mini. Here you can see the limitations of this version. Anyway, mini is free. The period between messages sent by the PICs is controlled by the function Delay10TCYx(unsigned char), with Delay10TCYx(1) being equivalent to 8µs = 10 clock cycles. This is the current default delay between messages, obtained by passing 0 as the parameter of startContinuousPublishing(0).
In this code we can see that Delay10TCYx(1) is called as many times as the parameter sent to startContinuousPublishing plus 1, and that a possible stop request is checked between wait iterations.

Here are the results of different tests on pcan2 (left arm + ID27):

Video 1: Period between sending CUI messages: 8µs using startContinuousPublishing (0)
It's the example seen above. Current operation of the encoders, sending messages at the maximum frequency allowed by the PIC of CUI.
Results:

Receive counter: 126536
Error counter: 67410
Overruns: 1403

Video 2: Period between sending CUI messages: 16µs using startContinuousPublishing (1)
Results:

Receive counter: 405496
Error counter: 0
Overruns: 818

Video 3 : Period between sending CUI messages: 24µs using startContinuousPublishing (2)
Results:

Receive counter: 398003
Error counter: 2
Overruns: 0

Video 4 : Period between sending CUI messages: 32µs using startContinuousPublishing (3) no errors or overruns
Results:

Receive counter: 320016
Error counter: 0
Overruns: 0

Video 5: Without CUIs, no errors or overruns. As you can see, the maximum bus load is 17%

We can conclude that, with the default option, the CUIs send so frequently that a large number of CAN errors occur and many packets are lost (only 126536 correct packets are received in 1 minute). In the following tests, between 300,000 and 400,000 correct packets are received in 1 minute with few or no errors and overruns. The optimal configuration for push mode without errors or losses would therefore be a period of 32µs per message, changing startContinuousPublishing(0) to startContinuousPublishing(3).

PeterBowman · 2019-09-17T11:20:31Z

https://mrchunckuee.blogspot.com/2014/09/mplab-x-y-c18-uso-de-la-libreria-delaysh.html

rsantos88 · 2019-09-17T14:21:06Z

Going deeper, I am trying to understand the results I get (total number of packets) depending on the period between messages (push mode), which is controlled by Delay10TCYx(unsigned char) in the CUI code. The code running on the CUIs has been cleaned up and simplified (thanks @PeterBowman for the help); here is the result (still WIP).
By testing with testCuiAbsolute, modified for this purpose, and varying the send period with startContinuousPublishing, I've obtained the following results:

Note: Debugger CUI was a way to confirm that the USB-CAN interface didn't lose packets: an internal counter in the CUI code itself counts the number of calls to send(). This value matches the USB-CAN count once two packets (push and stop) are added.

using Delay10TCYx(unsigned char):

startContinuousPublishing(x)	testCuiAbsolute	USB-CAN interface	Debugger CUI
1	949	1048	1046
10	958	1058	1056
100	1049	1158	1156
200	1175	1297	1295

using Delay100TCYx(unsigned char):

startContinuousPublishing(x)	testCuiAbsolute	USB-CAN interface	Debugger CUI
1	957	1057	1055
10	1049	1158	1156
100	425	470	468

rsantos88 · 2019-09-17T14:37:59Z

After these tests, and having discussed with @PeterBowman the possibility that we do not really understand how the Delay10TCYx(unsigned char) function works with respect to the crystal oscillation frequency (20.000Hz ??), I've simplified the PIC code as much as possible, removing the reception and message-construction functions so that as few instruction cycles as possible are consumed:

while(1)
{
  ECANSendMessage(canId, &degrees, sizeof(degrees), txFlags); 
  Delay100TCYx(delay);
}

The results are:

delay	packages / sec
100	474
10	3363
1	7400
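
One way to read this table is to compare each measured period with the nominal delay. The sketch below assumes the 20 MHz crystal reported further down in this thread and the usual PIC18 relation of four oscillator periods per instruction cycle (Tcy = 4/Fosc); the function names are hypothetical.

```c
#include <assert.h>

#define FOSC_HZ 20000000.0          /* assumed crystal frequency */
#define TCY_US  (4.0e6 / FOSC_HZ)   /* instruction cycle: 0.2 us */

/* Nominal duration of Delay100TCYx(i) in microseconds. */
double delay100tcyx_us(unsigned i)
{
    return 100.0 * TCY_US * (double)i;
}

/* Time per loop iteration not explained by the delay call, in
 * microseconds, given a measured rate in packets per second. */
double loop_overhead_us(double packets_per_sec, unsigned i)
{
    assert(packets_per_sec > 0.0);
    return 1.0e6 / packets_per_sec - delay100tcyx_us(i);
}
```

Under these assumptions the three rows give an overhead of roughly 110, 97 and 115 microseconds, that is, about 0.1 ms per iteration on top of the requested delay.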

rsantos88 · 2019-09-17T14:47:48Z

Another test I've done was to remove the delay function in order to find out how many packets are sent in one second, taking into account only the time spent in ECANReceiveMessage and in the send() function, which builds the message and sends it.
The result was 1132 packets in 1 second, which is equivalent to a period of 0.88 ms per message sent.

rsantos88 · 2019-09-17T15:57:49Z

Testing this code:

    while(1)
    {
        ECANSendMessage(canId, &degrees_1, sizeof(degrees_1), txFlags);
        Delay10KTCYx(100);
        ECANSendMessage(canId, &degrees_2, sizeof(degrees_2), txFlags);
    }

The result is a 200 ms gap between the two messages. According to the equation from this link, the crystal oscillation frequency is therefore 20 MHz
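
The step from the measured 200 ms to a 20 MHz crystal can be written out explicitly. The sketch assumes the PIC18 relation Tcy = 4/Fosc and that Delay10KTCYx(i) waits 10000 * i instruction cycles; `fosc_from_delay` is a made-up name.

```c
#include <assert.h>

/* Oscillator frequency in Hz implied by a measured delay of `seconds`
 * produced by Delay10KTCYx(i), assuming 4 oscillator periods per
 * instruction cycle. */
double fosc_from_delay(unsigned i, double seconds)
{
    double cycles = 10000.0 * (double)i; /* instruction cycles waited */

    assert(seconds > 0.0);
    return 4.0 * cycles / seconds;
}
```

For i = 100 and a 0.2 s gap this yields 4 * 1,000,000 / 0.2 = 20,000,000 Hz.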

PeterBowman · 2019-09-18T10:44:28Z

the crystal oscillation frequency is 20 MHz

Verified with the Yokogawa oscilloscope.

PeterBowman · 2019-09-18T21:49:01Z

By applying the formula given at ref, I assembled a table for this particular crystal (20 MHz) unifying all available delay functions. First, I obtain the number of iterations i necessary to (theoretically) perform a 1 second delay. The third and fourth columns show the minimum and maximum delays, for i=1 and i=255 respectively (since the input parameter type is unsigned char):

K (cycles per unit in DelayKTCYx(i))	i for a 1 s delay	T (i=1) [ms]	T (i=255) [ms]
1	5000000	0.0002	0.051
10	500000	0.002	0.51
100	50000	0.02	5.1
1000	5000	0.2	51
10000	500	2	510

As @rsantos88 pointed out, even with i=0 there is an implicit ~1 millisecond delay due to low-level PIC state transitions and hardcoded delays. The i=500 case would fit; however, I doubt we are ever going to need a 510 millisecond delay. Since 51 milliseconds is already high enough, and the 0.2 millisecond resolution is also fine, I'd choose the Delay1KTCYx function.

PS currently using Delay10TCYx(1), which translates to 2 microseconds (not accounting for the implicit PIC delay).
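
The table can be regenerated with a short piece of C. This is a sketch under the same 20 MHz assumption, with Tcy = 4/Fosc; the multipliers 10, 100, 1000 and 10000 correspond to Delay10TCYx, Delay100TCYx, Delay1KTCYx and Delay10KTCYx, while `delay_ms` and `print_delay_table` are illustrative names, not C18 functions.

```c
#include <assert.h>
#include <stdio.h>

#define FOSC_HZ 20000000.0 /* crystal frequency verified in this thread */

/* Delay in milliseconds produced by `multiplier * i` instruction cycles. */
double delay_ms(unsigned multiplier, unsigned i)
{
    double tcy_ms = 4.0e3 / FOSC_HZ; /* one instruction cycle: 0.0002 ms */

    assert(i >= 1 && i <= 255);      /* range used in the table above */
    return (double)multiplier * (double)i * tcy_ms;
}

/* Print resolution (i = 1) and maximum (i = 255) for each multiplier. */
void print_delay_table(void)
{
    static const unsigned multipliers[] = { 1, 10, 100, 1000, 10000 };
    size_t k;

    for (k = 0; k < sizeof multipliers / sizeof multipliers[0]; k++) {
        printf("%u\t%g\t%g\n", multipliers[k],
               delay_ms(multipliers[k], 1), delay_ms(multipliers[k], 255));
    }
}
```

For the multiplier 1000 this gives a 0.2 ms resolution and a 51 ms maximum, which is the trade-off behind the Delay1KTCYx choice.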

rsantos88 · 2019-09-19T13:43:36Z

Send code (permalink) based on documentation (page 3,4,5) (copy, copy (permalink))

PeterBowman · 2019-09-20T21:46:05Z

Another test I've done was to remove the delay function in order to find out how many packets are sent in one second, taking into account only the time spent in ECANReceiveMessage and in the send() function, which builds the message and sends it.
The result was 1132 packets in 1 second, which is equivalent to a period of 0.88 ms per message sent.

We learned today that this result is unpredictable, but seems to entail no more than a 1 ms delay due to internal hardware message buffering/queueing. By adding back the delay function call, messages are sent with the expected frequency.

Another issue arose: sometimes, the start push command gets no response from the PIC firmware. Also, we noticed that the traffic analyzer observes one or two CAN error frames right on application start. I presume some queued message originating from the previous run is malformed and sent as such by some device.

PeterBowman · 2019-09-21T10:19:02Z

Current state of the apocanlypse branch fulfills the goals of the CAN-TEO project as stated in the issue description:

a123bc9 refactors the CuiAbsolute device and wraps it in TechnosoftIpos so that absolute encoder reads are only requested on initial configuration; then, this value is used on normal operation
a9d0ed1 accomplishes Expose Cui reset encoder command in firmware #233 (comment):

We want to store absolute encoder readings on device init (Avoid bloat of CuiAbsolute CAN messages #217) so that the resulting offset value is added to the relative encoder reads on runtime. It's certainly a bit less of a trouble to call setEncoder on init to synchronize relative and absolute encoders instead. There is no need to involve a local variable (which should be properly set/reset by setEncoder in that scenario).

rsantos88 · 2019-09-23T14:31:17Z

With i=5, between 810 and 819 packets are received in 1 second. Taking the highest value, the period is 1.22 ms per message sent.

rsantos88 · 2019-09-24T09:24:48Z

Due to the new code modifications, tests will be carried out on a single TEO limb (for example, the left arm) to check the CAN traffic level with a startPushPublishing(5) value.
Rescued issues with problems encountered when reprogramming PICs (https://github.com/roboticslab-uc3m/teo-hardware-issues/issues/38)

jgvictores · 2019-10-05T09:26:48Z

Small note regarding amt203-dmk-appnote.pdf (a.k.a. AMT203_AppNote.pdf): was exported at commit d8b2c2b, imported at commit https://github.com/roboticslab-uc3m/datasheets-and-manuals/commit/f7354a4e80a362cebcc75526a61d7594d7239083.
New location: https://github.com/roboticslab-uc3m/datasheets-and-manuals/blob/master/cui/amt203-dmk-appnote.pdf (permalink)

PeterBowman · 2019-10-05T14:03:33Z

Pull command (re)implemented and successfully tested by @rsantos88, see #233 (comment).

PeterBowman · 2019-12-29T22:19:35Z

Commit 8bba682 introduced retries on failed CAN transfers (current default: 5 retries).
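
A retry policy of this kind can be sketched generically in C. Nothing here is taken from commit 8bba682: the callback type, the function names and the reading of "5 retries" as five extra attempts after the first failure are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transfer callback: returns nonzero on success. */
typedef int (*can_transfer_fn)(void *ctx);

/* Try the transfer once, then up to `max_retries` more times.
 * Returns the number of attempts used on success, or 0 on failure. */
unsigned transfer_with_retries(can_transfer_fn transfer, void *ctx,
                               unsigned max_retries)
{
    unsigned attempt;

    assert(transfer != NULL);

    for (attempt = 1; attempt <= max_retries + 1u; attempt++) {
        if (transfer(ctx))
            return attempt;
    }
    return 0;
}

/* Example callback for demonstration: fails until the counter in ctx
 * reaches zero, then succeeds. */
int fail_until_zero(void *ctx)
{
    unsigned *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return 0;
    }
    return 1;
}
```

With a counter of 2 and `max_retries` set to 5, the transfer succeeds on the third attempt; with a counter of 10 it gives up after six attempts and returns 0.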

PeterBowman added the firmware label Jul 31, 2019

PeterBowman added dev: CuiAbsolute question labels Aug 9, 2019

PeterBowman mentioned this issue Sep 7, 2019

Investigate CAN bus load #231

Closed

PeterBowman assigned PeterBowman and rsantos88 Sep 7, 2019

PeterBowman mentioned this issue Sep 20, 2019

Major CAN rework #229

Merged

26 tasks

PeterBowman mentioned this issue Sep 20, 2019

Expose Cui reset encoder command in firmware #233

Closed

rsantos88 added the blocked label Sep 24, 2019

rsantos88 removed the blocked label Oct 8, 2019

PeterBowman closed this as completed in #229 Dec 31, 2019
