Avoid bloat of CuiAbsolute CAN messages #217

Closed
PeterBowman opened this issue Jul 9, 2019 · 28 comments · Fixed by #229

Comments

@PeterBowman
Member

CuiAbsolute nodes are configured in the CanBusControlboard initialization step to start in continuous publishing mode: ref. We only need these devices to query absolute encoder reads on start; all subsequent measurements of the built-in relative encoders would take this initial offset into account.

Options:

  • Configure Cui nodes in pull (aka RPC) mode, that is, don't send an encoder read unless requested. This option is preferable, since reads would be performed on CanBusControlboard start only.

  • Enable Cui firmware to handle a stop publishing command.
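A minimal sketch of the offset idea described above, assuming a single absolute read on start and relative reads afterwards. The function and parameter names are hypothetical and do not come from the repository:

```cpp
// Hypothetical helper, not part of the actual codebase: combines one absolute
// encoder read taken on start with later reads of the relative encoder.
constexpr double jointPosition(double absoluteOnStart, double relativeOnStart, double relativeNow)
{
    // The offset is fixed once, on start; afterwards only the relative
    // encoder is queried, so no further Cui traffic is needed.
    return absoluteOnStart + (relativeNow - relativeOnStart);
}
```

For instance, `jointPosition(30.0, 0.0, 12.5)` evaluates to 42.5: the joint started at 30 degrees and the relative encoder has advanced 12.5 degrees since then.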

@PeterBowman
Member Author

ASWJμ (ASW Juanmi), make sure that a relative encoder-only configuration does not lead to accumulation of errors in the long run; this could be mitigated with periodic checks against the absolute encoders.

@PeterBowman
Member Author

If I'm correct, each Cui publishes encoder reads every 8 microseconds:

/* SEND DELAY (value passed to Delay10TCYx when sending; recommended value: 1 to 100)
 * Byte 3 (data[2]) received by the PIC (a value in the range [0-255]) is multiplied by the time
 * it takes to execute the delay set by Delay10TCYx(sendDelay).
 * Keep in mind:
 - Each instruction cycle takes 0.8 microseconds to execute
 - Delay10TCYx(i) -> 10.Tcy.i produces a delay of 10 instruction cycles * i. Therefore Delay10TCYx(1) equals 8 microseconds (10 clock cycles) */
BYTE sendDelay = 1; // Default: 1

while (!stop_flag)
{
    send(); // -- send
    for (i = 0; (i <= data[2]) && (!stop_flag); i++) // DELAY: data[2] will hold a value in the range [0 - 255]
    {
        Delay10TCYx(sendDelay);
        ECANReceiveMessage(&picId, data, &dataLen, &rxflags);
        if ((data[0]==0x02 && data[1]==0x01 && data[3]==0x00 && data[4]==0x00 && data[5]==0x00 && data[6]==0x00 && data[7]==0x00) && (picId == canId-384))
            stop_flag = 1;
    }
}

@PeterBowman
Member Author

The ICuiAbsolute interface allows setting a Cui encoder device in either push or pull mode, that is, continuous or on-request message publishing, respectively:

virtual bool startContinuousPublishing(uint8_t time) = 0;
virtual bool startPullPublishing() = 0;
virtual bool stopPublishingMessages() = 0;

The default behavior is to enable push mode on device init, since 2ed1bff. @smcdiaz, @jgvictores why wasn't pull mode chosen instead? Perhaps due to #217 (comment)?

@jgvictores
Member

Perhaps due to #217 (comment)?

Nope. Something like this sequence happened:

  1. @smcdiaz had some control schemes in mind that involved using both relative and absolute encoders
  2. We set absolute encoders to push continuously (to potentially save the bandwidth spent on constant polling)
  3. We had to make a design decision because YARP methods involve reading from only 1 type of encoder per joint. We decided to provide the reading of the relative encoder (especially since some absolute encoders fail sometimes) and to use the absolute stream only on init (and by then we had already forgotten that we should be using the pull mechanism for that use case)

Feel free to switch to the pull mechanism, IMHO it's more efficient for the current workflow.

@PeterBowman
Member Author

Thanks! I guess 5f66286 was an attempt to test the behavior of relative vs absolute encoders. Can we now delete the printRelEncs branch?

@jgvictores
Member

We can delete the printRelEncs branch. I do not recall the exact functionality, but it was definitely a hack for debugging. If we ever need a similar mechanism, we can try to open a specific issue and work on a better solution.

@PeterBowman
Member Author

PeterBowman commented Sep 5, 2019

@rsantos88 has measured the amount of Cui and non-Cui messages on normal operation per 60 seconds to estimate the CAN bus load:

  • 36500 non-Cui (13%)
  • 252255 Cui (87%)

This adds up to 288755 CAN messages. If we assume that all incoming frames consist of 108 bits (ref), the overall RX transfer rate is ~520 kbit/s, well below the 1 Mbps limit, although this does not account for TX or filtered ids. Also, 252255 frames transferred in 60 seconds equals one frame per ~240 microseconds, which is also far from the 8 microsecond period we extract from the firmware code.
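The arithmetic behind these two estimates can be reproduced with a short sketch. The helper names are made up for illustration, and the 108 bits per frame figure is the assumption stated above:

```cpp
// Average RX bit rate in bit/s, assuming every frame has the same length.
constexpr long long rxBitsPerSecond(long long frames, long long bitsPerFrame, long long seconds)
{
    return frames * bitsPerFrame / seconds;
}

// Average spacing between frames in microseconds (integer division, rounds down).
constexpr long long framePeriodMicros(long long frames, long long seconds)
{
    return seconds * 1000000LL / frames;
}
```

With the measured values, `rxBitsPerSecond(288755, 108, 60)` gives 519759 bit/s and `framePeriodMicros(252255, 60)` gives 237 microseconds between Cui frames.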

PS would you mind checking this again with a minor change, @rsantos88? There is a potential pitfall in 70b1fa9: you are never initializing the cuiMessages and otherMessages class members, therefore they may contain some garbage value before first use. It would suffice to put cuiMessages = otherMessages = 0; in DeviceDriverImpl.cpp, for instance.
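As an alternative to the assignment suggested above, default member initializers would also rule out garbage values. This is only a sketch: the struct name is hypothetical, and only the member names cuiMessages and otherMessages come from the comment:

```cpp
// Hypothetical stand-in for the class that holds the counters.
struct MessageCounters
{
    // Default member initializers guarantee a known starting value,
    // whichever constructor ends up being used.
    unsigned int cuiMessages = 0;
    unsigned int otherMessages = 0;
};
```

A freshly created `MessageCounters` then starts with both counters at zero without any extra code in the device initialization.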

@rsantos88
Contributor

PS would you mind checking this again with a minor change, @rsantos88? There is a potential pitfall in 70b1fa9: you are never initializing the cuiMessages and otherMessages class members, therefore they may contain some garbage value before first use. It would suffice to put cuiMessages = otherMessages = 0; in DeviceDriverImpl.cpp, for instance.

Ok, thanks for your advice. That has been fixed now (f90abd0).
I've taken measurements three times to be sure of the results, using a single thread and a single CAN bus (right arm + ID28 (head)) via oneCanBusOneWrapper --from oneCanBusOneWrapper-right-arm.ini (oneCanBusOneWrapper-right-arm.ini).

Here are the results:

Time: 60.000303
[CUI messages (198215)] 
[NON-CUI messages (39990)] 

Time: 60.000152
[CUI messages (200496)] 
[NON-CUI messages (40214)]

Time: 60.001211
[CUI messages (200176)] 
[NON-CUI messages (39380)] 

@rsantos88
Contributor

IDEA: We could use the PCAN-View software for Linux (Software for Displaying CAN and CAN FD Messages, page 4 of website driver) to monitor CAN packets.

@rsantos88
Contributor

Another IDEA: use the USB-to-CAN compact interface and canAnalyser 3 on Windows.

@rsantos88
Contributor

Regarding the latest tests carried out today, I'd like to comment on the following points:

  1. I have tested PCAN-View for Linux and ran into problems when trying to read from the CAN network while oneCanBusOneWrapper is running: the latter blocks as soon as PCAN-View connects to the CAN network. PCAN-View lets you read the packets being received over the network (continuous publishing mode of the absolute encoders), as well as the last packets received with the drivers' status. You can clearly tell apart the ID of the messages of each node, the message header, the number of incoming messages, their content... but apparently the software cannot monitor in real time while another application sends messages through the same CanBusPeak device.
  2. Because of this, I have run tests connecting the USB-to-CAN compact interface to the CAN network of the left arm, so that this device acts as one more node of the network and can monitor all traffic. It may be interesting to analyze some of the results obtained with this software. I have recorded a short video of a few seconds so that some details can be appreciated:

[Video: canAnalyser3 Mini capture of the left arm CAN network]

  • As you can see, the first 6 packets announce the presence of the 6 drivers of the left arm.
  • The network does not seem to show signs of saturation; the Busload reaches a maximum value of 63% during the run.
  • A high number of error packets can be seen (total error counter: 67410). What might be causing them?
  • There is an overflow of 1403 packets. Why?
  • Finally, the software let me save the last 10,000 packets captured in this test. I leave them here for further analysis:
    IXXAT_canAnalyser3_Mini_19-09-06_145740

PS: I don't know if it's the best location for this extensive issue. If necessary, you can move it to a better place

@PeterBowman
Member Author

Great analysis!

PS: I don't know if it's the best location for this extensive issue. If necessary, you can move it to a better place

I'm forking this issue into #231 to delegate bus load discussions. Let's focus here on Cui devices, push/pull modes and send rates.

@rsantos88
Contributor

I have been looking into the improvement obtained by reducing the frequency at which the CUIs send messages (or, equivalently, increasing the send period) and its effect on the statistics we get in canAnalyser3 Mini. FYI, there are 3 versions of this software and we have the simplest one: Mini. Here you can see the limitations of this version. Anyway, Mini is free. The send period of the PICs is regulated with the function Delay10TCYx(unsigned char), where Delay10TCYx(1) is equivalent to 8µs = 10 clock cycles. This is the current default delay between messages, obtained by passing 0 as the parameter of startContinuousPublishing(0).
In this code we can see that Delay10TCYx(1) is called as many times as the parameter sent to startContinuousPublishing plus 1, checking for a possible stop command on each wait iteration.
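The relation between the startContinuousPublishing parameter and the nominal send period can be sketched as follows. It relies on the 0.8 microsecond instruction cycle quoted in the firmware comment (a figure that is questioned later in this thread) and ignores the time spent in send() and ECANReceiveMessage; the names are hypothetical:

```cpp
// Instruction cycle time as stated in the firmware comment, in nanoseconds
// (0.8 us). This is an assumption, not a measured value.
constexpr long long assumedCycleNanos = 800;

// Nominal delay between two sends for startContinuousPublishing(param):
// the firmware loop calls Delay10TCYx(1) a total of (param + 1) times,
// and each call waits 10 instruction cycles.
constexpr long long nominalSendDelayNanos(long long param)
{
    return (param + 1) * 10 * assumedCycleNanos;
}
```

Under that assumption, parameters 0, 1, 2 and 3 map to 8, 16, 24 and 32 microseconds respectively.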

Here are the results obtained from different tests on pcan2 (left arm + ID27):

  • Video 1: Period between sending CUI messages: 8µs using startContinuousPublishing (0)
    This is the example seen above: the current operation of the encoders, sending messages at the maximum frequency allowed by the PIC of the CUI.
    Results:
    Receive counter: 126536
    Error counter: 67410
    Overruns: 1403
  • Video 2: Period between sending CUI messages: 16µs using startContinuousPublishing (1)
    Results:
    Receive counter: 405496
    Error counter: 0
    Overruns: 818
  • Video 3: Period between sending CUI messages: 24µs using startContinuousPublishing (2)
    Results:
    Receive counter: 398003
    Error counter: 2
    Overruns: 0
  • Video 4: Period between sending CUI messages: 32µs using startContinuousPublishing (3) (no errors or overruns)
    Results:
    Receive counter: 320016
    Error counter: 0
    Overruns: 0
  • Video 5: Without CUIs (no errors or overruns). As you can see, the maximum Busload value is 17%

It can be concluded that, with the default option, the CUIs send so frequently that a large number of errors are produced on the CAN bus and many packets are lost (in 1 minute, only 126536 correct packets are received). In contrast, the following tests receive between 300,000 and 400,000 correct packets in 1 minute, and with startContinuousPublishing (3) there are no errors or overruns at all. The optimal configuration for push mode without errors or losses would therefore be a period of 32µs per message, changing startContinuousPublishing (0) to startContinuousPublishing (3).

@rsantos88
Contributor

Going deeper, I am trying to understand the results I get (total number of packets) depending on the send period between messages (push mode), which is controlled by Delay10TCYx(unsigned char) in the CUI code. The code running on the CUIs has been cleaned up and simplified (thanks @PeterBowman for the help); here is the result (still WIP).
By testing with a version of testCuiAbsolute modified for this purpose and varying the send period with startContinuousPublishing, I've obtained the following results:

Note: "Debugger CUI" was a way to confirm that the USB-CAN interface didn't lose packets, using an internal counter in the CUI code itself that tracks the number of calls to send(). This value matches the USB-CAN count once the two extra packets (push and stop) are added.

  • using Delay10TCYx(unsigned char):
    startContinuousPublishing(x)   testCuiAbsolute   USB-CAN interface   Debugger CUI
    1                              949               1048                1046
    10                             958               1058                1056
    100                            1049              1158                1156
    200                            1175              1297                1295
  • using Delay100TCYx(unsigned char):
    startContinuousPublishing(x)   testCuiAbsolute   USB-CAN interface   Debugger CUI
    1                              957               1057                1055
    10                             1049              1158                1156
    100                            425               470                 468

@rsantos88
Contributor

rsantos88 commented Sep 17, 2019

After these tests, and after discussing with @PeterBowman the possibility that we don't really understand how Delay10TCYx(unsigned char) works with respect to the oscillation frequency of the crystal (20.000Hz ??), I've simplified the PIC code as much as possible, removing the reception and message-building functions so that each iteration consumes as few cycles as possible:

while(1)
{
  ECANSendMessage(canId, &degrees, sizeof(degrees), txFlags); 
  Delay100TCYx(delay);
} 

The results are:

delay   packages / sec
100     474
10      3363
1       7400
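One way to read these figures is to compare the measured period with the nominal delay. The sketch below assumes the 20 MHz crystal reported further down in this thread and the usual PIC18 relation of 4 oscillator periods per instruction cycle (0.2 microseconds per cycle); all names are hypothetical:

```cpp
// Assumption: Tcy = 0.2 us, so Delay100TCYx(delay) nominally lasts
// delay * 100 * 0.2 us = delay * 20 us.
constexpr long long nominalDelayMicros(long long delay)
{
    return delay * 20;
}

// Measured period in microseconds from a packet rate
// (integer division, rounds down).
constexpr long long measuredPeriodMicros(long long packetsPerSecond)
{
    return 1000000LL / packetsPerSecond;
}

// Time per loop iteration that the delay call alone does not explain.
constexpr long long overheadMicros(long long delay, long long packetsPerSecond)
{
    return measuredPeriodMicros(packetsPerSecond) - nominalDelayMicros(delay);
}
```

Under these assumptions the three rows leave roughly 0.1 ms unexplained per iteration (109, 97 and 115 microseconds), which would be consistent with a fixed per-message cost rather than with a wrong delay function.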

@rsantos88
Contributor

Another test I've done was to remove the delay function in order to find out how many packets would be sent per second, taking into account only the time spent in ECANReceiveMessage and in the send() function that builds and sends the message.
The result was 1132 packets in 1 second, which is equivalent to a period of 0.88 ms per message sent.

@rsantos88
Contributor

Testing this code :

    while(1)
    {
	ECANSendMessage(canId, &degrees_1, sizeof(degrees_1), txFlags); 
	Delay10KTCYx(100);
	ECANSendMessage(canId, &degrees_2, sizeof(degrees_2), txFlags); 
    }

The result is a 200 ms gap between the two messages. According to the equation in this link, the crystal oscillation frequency is therefore 20 MHz.
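The step from the measured gap to the crystal frequency can be sketched like this, assuming the usual PIC18 timing in which one instruction cycle takes 4 oscillator periods. The function name is hypothetical:

```cpp
// Assumption: one instruction cycle (Tcy) takes 4 oscillator periods.
// Delay10KTCYx(n) waits n * 10000 instruction cycles, so
// Fosc = 4 * cycles / measured delay.
constexpr long long crystalFrequencyHz(long long n, long long measuredDelayMillis)
{
    return 4LL * n * 10000LL * 1000LL / measuredDelayMillis;
}
```

For `Delay10KTCYx(100)` and a 200 ms gap, `crystalFrequencyHz(100, 200)` yields 20000000 Hz.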

@PeterBowman
Member Author

the crystal oscillation frequency is 20 MHz

Verified with the Yokogawa oscilloscope.

@PeterBowman
Member Author

PeterBowman commented Sep 18, 2019

By applying the formula given at ref, I assembled a table for this particular crystal (20 MHz) unifying all available delay functions. First, I obtain the i iterations necessary to (theoretically) perform a 1 second delay. The third and fourth columns show the minimum and maximum delays, for i=1 and i=255 respectively (since the input parameter type is unsigned char):

Delay<N>TCYx(i), N   i for 1 s   T (i=1) [ms]   T (i=255) [ms]
1                    5000000     0.0002         0.051
10                   500000      0.002          0.51
100                  50000       0.02           5.1
1000                 5000        0.2            51
10000                500         2              510

As @rsantos88 pointed out, even with i=0 there is an implicit ~1 millisecond delay due to low-level PIC state transitions and hardcoded delays. The i=500 case would fit; however, I doubt we are ever going to need a 510 millisecond delay. Since 51 milliseconds is already high enough, and the 0.2 millisecond resolution is also fine, I'd choose the Delay1KTCYx function.

PS currently using Delay10TCYx(1), which translates to 2 microseconds (not accounting for the implicit PIC delay).
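The table values can be recomputed with a few lines, assuming the PIC18 relation Tcy = 4 / Fosc and the measured 20 MHz crystal. The names below are illustrative only:

```cpp
constexpr double crystalHz = 20e6;                     // measured crystal frequency
constexpr double cycleMillis = 4.0 / crystalHz * 1e3;  // Tcy = 4 / Fosc, in ms (assumed)

// Delay produced by Delay<multiplier>TCYx(i), in milliseconds.
constexpr double delayMillis(int multiplier, int i)
{
    return multiplier * i * cycleMillis;
}

// Value of i that would yield a 1 second delay; it may exceed the
// unsigned char range accepted by the real delay functions.
constexpr double iterationsForOneSecond(int multiplier)
{
    return 1000.0 / (multiplier * cycleMillis);
}
```

For example, `delayMillis(1000, 1)` is about 0.2 ms and `delayMillis(1000, 255)` about 51 ms, which is the range quoted for Delay1KTCYx, while `delayMillis(10, 1)` is about 0.002 ms, that is, 2 microseconds.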

@rsantos88
Contributor

rsantos88 commented Sep 19, 2019

Send code (permalink) based on documentation (page 3,4,5) (copy, copy (permalink))

@PeterBowman
Member Author

Another test that I've done has been remove the delay function in order to find out the number of packets that I would send in a second, taking into account the time spent processing the functions of ECANReceiveMessage and the send() function dedicated of constructing the message and sending it.
The result has been 1132 packages in 1 second, which would be equivalent to a period of 0.88 ms per message sent.

We learned today that this result is unpredictable, but seems to entail no more than a 1 ms delay due to internal hardware message buffering/queueing. By adding back the delay function call, messages are sent with the expected frequency.

Another issue arose: sometimes, the start push command renders no response from the PIC firmware. Also, we noticed that one or two error CAN frames are observed by the traffic analyzer right on application start. I presume some queued message originating from the previous run is malformed and sent in such shape by whatever device.

@PeterBowman
Member Author

Current state of the apocanlypse branch fulfills the goals of the CAN-TEO project as stated in the issue description:

  • a123bc9 refactors the CuiAbsolute device and wraps it in TechnosoftIpos so that absolute encoder reads are only requested on initial configuration; then, this value is used on normal operation

  • a9d0ed1 accomplishes Expose Cui reset encoder command in firmware #233 (comment):

    We want to store absolute encoder readings on device init (Avoid bloat of CuiAbsolute CAN messages #217) so that the resulting offset value is added to the relative encoder reads on runtime. It's certainly a bit less of a trouble to call setEncoder on init to synchronize relative and absolute encoders instead. There is no need to involve a local variable (which should be properly set/reset by setEncoder in that scenario).

@rsantos88
Contributor

With i=5, the total number of packets received in 1 second is between 810 and 819. Taking the highest value, the period is 1.22 ms per message sent.

@rsantos88
Contributor

Due to the new code modifications, tests will be carried out on a single TEO limb (for example, the left arm) to check the CAN traffic level with startPushPublishing (5).
Also rescued the issue about problems encountered when reprogramming PICs (https://github.com/roboticslab-uc3m/teo-hardware-issues/issues/38)

@PeterBowman
Member Author

PeterBowman commented Oct 5, 2019

Pull command (re)implemented and successfully tested by @rsantos88, see #233 (comment).

@rsantos88 rsantos88 removed the blocked label Oct 8, 2019
@PeterBowman
Member Author

Commit 8bba682 introduced retries on failed CAN transfers (current default: 5 retries).
