Can Bus off (IDFGH-393) #2519

Closed
MrAlexandrKolesnikov opened this issue Oct 4, 2018 · 18 comments
Labels: Resolution: Done; Status: Resolved

Comments

@MrAlexandrKolesnikov

MrAlexandrKolesnikov commented Oct 4, 2018

Hi! Is there any way to restart the CAN module after a bus-off without waiting for the 128 bus-free signals?
Entering reset mode or doing a full re-init has no effect.
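
For context, the wait mentioned in the question comes from the CAN fault confinement rules: a bus-off node may rejoin the bus only after it has monitored 128 occurrences of 11 consecutive recessive bits. A minimal sketch of that counting rule in plain C follows. It is a simplified software model, not ESP32 register code, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RECESSIVE_RUN_BITS   11   /* consecutive recessive bits per "bus free" occurrence */
#define RECOVERY_OCCURRENCES 128  /* occurrences required before leaving bus-off */

/* Hypothetical bookkeeping for one bus-off recovery attempt. */
typedef struct {
    uint8_t  run;          /* current run of consecutive recessive bits */
    uint16_t occurrences;  /* completed 11-bit recessive runs so far */
} busoff_recovery_t;

/* Feed one sampled bus bit (true = recessive).
 * Returns true once 128 runs of 11 recessive bits have been seen. */
static bool busoff_recovery_feed(busoff_recovery_t *r, bool recessive)
{
    assert(r != NULL);
    if (!recessive) {
        r->run = 0;  /* a dominant bit breaks the current run */
    } else if (++r->run == RECESSIVE_RUN_BITS) {
        r->run = 0;
        if (r->occurrences < RECOVERY_OCCURRENCES) {
            r->occurrences++;
        }
    }
    return r->occurrences >= RECOVERY_OCCURRENCES;
}
```

On a completely idle bus this takes 128 × 11 = 1408 bit times, which is roughly 2.8 ms at 500 kbit/s. Any traffic from other nodes resets the current run and lengthens the wait.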

@Alvin1Zhang Alvin1Zhang changed the title Can Bus off [TW#26640] Can Bus off Oct 8, 2018
@Dazza0 Dazza0 self-assigned this Oct 8, 2018
@Dazza0
Collaborator

Dazza0 commented Oct 9, 2018

@MrAlexandrKolesnikov I'm able to restart the CAN module and send messages after a bus-off by doing the following:

can_driver_uninstall();
can_driver_install(&g_config, &t_config, &f_config);
can_start();

Could you send a code snippet of your bus off handling code to recreate the issue you mentioned above?

@MrAlexandrKolesnikov
Author

MrAlexandrKolesnikov commented Oct 9, 2018

I only use can_types.h on IDF version 3.0. Is there any way to restart CAN by clearing and setting the CAN module registers?

@Dazza0
Collaborator

Dazza0 commented Oct 9, 2018

I'm not sure where you got can_types.h. However, you can try the following:

#include "driver/periph_ctrl.h"
...

periph_module_disable(PERIPH_CAN_MODULE);
periph_module_enable(PERIPH_CAN_MODULE);

This should have the effect of resetting the CAN module on the hardware level (gates the clock signal and sets the reset line) after which the registers in the CAN module should have the same default values as a power-on reset.

@MrAlexandrKolesnikov
Author

MrAlexandrKolesnikov commented Oct 9, 2018

I only use direct access to the CAN registers, like the CAN object in your code, and CAN works fine. But sometimes I get a bus-off interrupt from it. I tried clearing the registers and setting them as you do in

can_driver_uninstall();
can_driver_install(&g_config, &t_config, &f_config);
can_start();

but it has no effect. I also tried:

#include "driver/periph_ctrl.h"
periph_module_disable(PERIPH_CAN_MODULE);
periph_module_enable(PERIPH_CAN_MODULE);

and then called my init function, but that has no effect either.

@Dazza0
Collaborator

Dazza0 commented Oct 9, 2018

Do you have a code snippet you could show of which registers you set/clear after a bus off?
Readouts of the mode, status, interrupt reason, and error counter registers after your re-initialization would also be helpful.

@MrAlexandrKolesnikov
Author

MrAlexandrKolesnikov commented Oct 9, 2018

Of course!

When I get the bus-off interrupt I do:
CAN.mode_reg.reset = 1;
(void)CAN.interrupt_reg.val;
(void)CAN.arbitration_lost_captue_reg.val;
(void)CAN.error_code_capture_reg.val;
periph_module_disable(PERIPH_CAN_MODULE);
periph_module_enable(PERIPH_CAN_MODULE);
init();

init() is called at ESP startup to initialize the CAN registers and always works fine.

@Dazza0
Collaborator

Dazza0 commented Oct 10, 2018

I'm still unable to recreate the issue. Could you send your init() function as well? I suspect the error may be due to a faulty re-initialization.

@MrAlexandrKolesnikov
Author

MrAlexandrKolesnikov commented Oct 10, 2018

Sorry, I can't show you everything, but this is the whole part responsible for initializing the driver, along with our register description:
can_driver.txt
can_types.txt

@Dazza0
Collaborator

Dazza0 commented Oct 10, 2018

I don't see anything wrong with your init. Try running the following code, then check the points listed below.

periph_module_disable(PERIPH_CAN_MODULE);
periph_module_enable(PERIPH_CAN_MODULE);
init();
  • Check that the CAN module is in the correct state after re-enabling and re-initializing by reading the values of the mode register, status register, and both error counter registers. Confirm that reset mode has been exited, the bus-status bit is 0, and that both error counters have been reset to 0.
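
The check described in this bullet can be sketched as a small helper in plain C. This is a minimal sketch under assumptions: the snapshot struct and function name are hypothetical, and the bit positions (reset mode at bit 0 of the mode register, bus status at bit 7 of the status register) are taken from the SJA1000 PeliCAN layout and should be verified against the ESP32 technical reference manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions (SJA1000 PeliCAN layout); verify against the ESP32 TRM. */
#define MODE_RESET_BIT      (1u << 0)  /* mode register: 1 = reset mode */
#define STATUS_BUS_OFF_BIT  (1u << 7)  /* status register: 1 = bus-off */

/* Hypothetical snapshot of the registers read back after re-initialization. */
typedef struct {
    uint8_t mode;        /* mode register */
    uint8_t status;      /* status register */
    uint8_t rx_err_cnt;  /* RX error counter */
    uint8_t tx_err_cnt;  /* TX error counter */
} can_snapshot_t;

/* True if reset mode has been exited, the bus-status bit is 0,
 * and both error counters read 0. */
static bool can_reinit_looks_ok(const can_snapshot_t *s)
{
    assert(s != NULL);
    return (s->mode & MODE_RESET_BIT) == 0
        && (s->status & STATUS_BUS_OFF_BIT) == 0
        && s->rx_err_cnt == 0
        && s->tx_err_cnt == 0;
}
```

Filling the snapshot from the real registers right after init() and logging the four raw values alongside the result gives exactly the readouts requested earlier in the thread.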

How do you determine that the re-init was unsuccessful? Are you attempting to transmit/receive a frame after re-initialization with no results? If so, check the following:

  • Is the condition that caused the bus-off still present on the bus (i.e. is another node pulling the bus into a dominant state)? What is the state of the CAN bus after re-initialization? Please verify with a logic analyzer.
  • Is there any activity on the CAN bus when attempting to transmit or receive after re-initialization? Please verify with a logic analyzer.
  • Try re-initializing into self-test mode and transmitting a frame to verify that the connection between the CAN module and the external transceiver is still operating correctly (i.e. the GPIOs have been configured correctly). A faulty connection can cause the CAN module to constantly interpret an attempted transmission as a loss of arbitration, resulting in no activity on the CAN bus.

@MrAlexandrKolesnikov
Author

MrAlexandrKolesnikov commented Oct 10, 2018

OK, I see that the RX error counter is not cleared after enable/disable. I tried to clear it but it is still equal to 128. I checked the line and see that CAN H and CAN L are at the same level. Do you have an example of how to init into self-test mode?
Also, I see that the Receive Status register equals 0x1 after re-init, but I double-checked the line and the other nodes: none of the nodes is transmitting. And Arbitration Lost Capture is not cleared if I read it in reset mode. On init this register equals 0, but when I get bus-off and re-init it equals 7, and then it always stays at 7.

Yes, after re-init I can't transmit/receive a frame, but the other nodes can communicate with each other.

@Dazza0
Collaborator

Dazza0 commented Oct 10, 2018

CAN Self Test Example

Here's what should ideally occur

  1. When bus off occurs, the CAN module should automatically be put into reset mode. The RX error counter should be set to 0 and the TX error counter is set to 128.
  2. After a hardware reset (module disable then enable), the CAN module should automatically be put into reset mode. Both error counters, ALC and ECC should be cleared to zero. TX and RX status should both be set. The rest of the configuration registers should remain unchanged.
  3. After reset mode exits (i.e. after calling init()), TX and RX status will be unset after observing 11 consecutive recessive bits on the bus.

Could you verify at which stage the RX error counter becomes 128 and ALC becomes 7? If the RX counter is 0 after a hardware reset and becomes 128 after reset mode exits, it suggests that something on the bus or the RX pin is causing the CAN module to register RX errors after re-initialization.

If the RX error counter becomes 128 after calling init() (i.e. after reset mode has exited) WITHOUT transmitting anything, this suggests that something external to the CAN module is causing the increase. Double-check that your selected GPIOs are not being shared with any other functionality (e.g. UART, SPI) and that your GPIO reconfiguration is correct. Anything that pulls the RX pin low can cause the RX error counter to increase (e.g. the GPIO pad being set to a pull-down).

If the RX error counter becomes 128 after calling init() whilst something is being transmitted, the attempted transmission could be what is causing the RX error counter to increase.
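
The three cases above amount to a small decision procedure, sketched here in plain C. The enum, the function name, and the use of 128 as a "suspicious" threshold are illustrative assumptions; the inputs are RX error counter values you would sample yourself at each stage.

```c
#include <stdbool.h>
#include <stdint.h>

#define REC_SUSPECT 128  /* value reported in this thread; assumed threshold */

/* Hypothetical classification of where the RX error counter goes wrong. */
typedef enum {
    REC_DIAG_OK,              /* counter stays low at every stage */
    REC_DIAG_RESET_FAILED,    /* already high right after the hardware reset */
    REC_DIAG_EXTERNAL_CAUSE,  /* rises after init() with nothing transmitted */
    REC_DIAG_TX_RELATED       /* rises after init() while a frame is being sent */
} rec_diag_t;

static rec_diag_t diagnose_rec(uint8_t rec_after_hw_reset,
                               uint8_t rec_after_init,
                               bool transmitting)
{
    if (rec_after_hw_reset >= REC_SUSPECT) {
        return REC_DIAG_RESET_FAILED;    /* reset did not clear the counter */
    }
    if (rec_after_init < REC_SUSPECT) {
        return REC_DIAG_OK;
    }
    /* Counter was fine after reset but rose once reset mode was exited. */
    return transmitting ? REC_DIAG_TX_RELATED : REC_DIAG_EXTERNAL_CAUSE;
}
```

An external cause points at the bus or the RX pin (shared GPIOs, pull-downs), while a transmit-related result points at the attempted transmission itself.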

@MrAlexandrKolesnikov
Author

After a hardware reset ECC is cleared to zero, but ALC is not cleared. My bad: the RX error counter is cleared to zero after init, but the Receive Status register equals 0x1.

@Dazza0
Collaborator

Dazza0 commented Oct 10, 2018

Is the status register = 0x1 or just the receive status bit in the status register?
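
The distinction matters because the two readings mean different things. A minimal decoding sketch in plain C, assuming the SJA1000 PeliCAN status register layout (bit 0 receive buffer status, bit 4 receive status, bit 5 transmit status, bit 7 bus status); the struct and function names are hypothetical, and the bit positions should be checked against the ESP32 technical reference manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed status register bits (SJA1000 PeliCAN layout). */
#define SR_RX_BUFFER  (1u << 0)  /* RBS: a received frame is waiting in the buffer */
#define SR_RX_STATUS  (1u << 4)  /* RS: a frame is currently being received */
#define SR_TX_STATUS  (1u << 5)  /* TS: a frame is currently being transmitted */
#define SR_BUS_STATUS (1u << 7)  /* BS: 1 = bus-off */

/* Hypothetical decoded view of the raw status register value. */
typedef struct {
    bool frame_pending;
    bool receiving;
    bool transmitting;
    bool bus_off;
} can_status_t;

static can_status_t can_decode_status(uint8_t sr)
{
    can_status_t s;
    s.frame_pending = (sr & SR_RX_BUFFER) != 0;
    s.receiving     = (sr & SR_RX_STATUS) != 0;
    s.transmitting  = (sr & SR_TX_STATUS) != 0;
    s.bus_off       = (sr & SR_BUS_STATUS) != 0;
    return s;
}
```

Under this layout a whole-register value of 0x01 would mean a frame is pending in the receive buffer, whereas a set receive status bit would show up as 0x10 in the raw value.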

@MrAlexandrKolesnikov
Author

The receive status bit is still equal to 0x1.

@Dazza0
Collaborator

Dazza0 commented Oct 10, 2018

As mentioned above, once init() is called, the receive status bit and the transmit status bit should both be set to 0x1 until 11 consecutive recessive bits are observed on the bus. Try putting a short delay (e.g. 500 ms) after init() and check the status register again. If the receive status bit is still 0x1 after the delay, this indicates that the CAN module is receiving a message. If that is the case, you should check the state of your RX pin.

@Alvin1Zhang
Collaborator

@MrAlexandrKolesnikov Could you share any updates on this issue? Thanks.

@projectgus projectgus changed the title [TW#26640] Can Bus off Can Bus off (IDFGH-393) Mar 12, 2019
@Dazza0
Collaborator

Dazza0 commented Aug 19, 2019

@MrAlexandrKolesnikov closing this issue due to lack of updates. Please feel free to open a new issue if you are still experiencing problems and suspect it is a bug in ESP-IDF. Please also take a look at the ESP32 ECO document to see if any of the CAN Errata are relevant to the issues you are facing.

@Dazza0 Dazza0 closed this as completed Aug 19, 2019
@Dazza0 Dazza0 reopened this May 21, 2020
@Dazza0
Collaborator

Dazza0 commented May 21, 2020

Reopening as further testing has shown a possible edge case that can cause Bus Off recovery to get stuck. When the Bus Off interrupt occurs, the REC can still increase after Bus Off. The ISR handles this by freezing both counters. However, if the ISR is delayed in running, the REC will be non-zero when Bus Off recovery begins. This in turn causes the interrupt that indicates Bus Off recovery completion to fail to trigger.

Workaround: in the ISR that handles the Bus Off condition, set the TEC to 0 and then back to 255 immediately. I'll push a commit to fix this shortly.
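
A toy model of the workaround sequence in plain C. The struct and the write helper are hypothetical stand-ins, not the ESP32 register block. The modelled hardware behaviour (leaving bus-off when the TEC is written to 0, and re-entering it with the REC cleared when the TEC is written to 255) is an assumption made for illustration, based on the earlier description in this thread that the RX error counter is set to 0 when bus-off occurs. Check the actual fix commit and the ESP32 errata before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the controller state; NOT the real register layout. */
typedef struct {
    uint8_t tec;      /* transmit error counter */
    uint8_t rec;      /* receive error counter */
    bool    bus_off;  /* modelled bus-off state */
} mock_can_t;

/* Assumed behaviour: writing 255 to the TEC (re-)triggers bus-off,
 * which clears the REC; any other value leaves the bus-off state. */
static void mock_write_tec(mock_can_t *c, uint8_t value)
{
    assert(c != NULL);
    c->tec = value;
    if (value == 255) {
        c->bus_off = true;
        c->rec = 0;
    } else {
        c->bus_off = false;
    }
}

/* The sequence from the workaround: TEC to 0, then straight back to 255. */
static void bus_off_isr_workaround(mock_can_t *c)
{
    mock_write_tec(c, 0);
    mock_write_tec(c, 255);
}
```

In this model, a controller that entered bus-off with a non-zero REC (because the ISR ran late) ends the workaround in bus-off again with the REC at 0, which is the precondition the comment above describes for the recovery-complete interrupt to fire.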

@espressif-bot espressif-bot added Status: In Progress Work is in progress Status: Reviewing Issue is being reviewed and removed Status: In Progress Work is in progress labels Mar 26, 2021
@espressif-bot espressif-bot added Resolution: Done Issue is done internally Status: Resolved Issue is done internally and removed Status: Reviewing Issue is being reviewed labels Apr 8, 2021
espressif-bot pushed a commit that referenced this issue May 8, 2021
This commit adds handling for FIFO overruns and
adds workarounds for HW errata on the ESP32.

Closes #2519
Closes #4276
espressif-bot pushed a commit that referenced this issue May 22, 2022
This commit adds handling for FIFO overruns and
adds workarounds for HW errata on the ESP32.

Closes #2519
Closes #4276