Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers: can: mcan: Add CAN statistics #62758

Merged
merged 1 commit into from Sep 21, 2023

Conversation

gramsay0
Copy link
Contributor

@gramsay0 gramsay0 commented Sep 18, 2023

I had a go at adding MCAN statistics, although I ran into a couple of issues:

  • I tested a TX with nothing connected to the CAN transceiver and got endless AckError ISRs (>100k in 30 seconds)
    • i.e. shouldn't this stop when the controller enters an error state?
  • The PSR Protocol Exception Event (PXE) bit never seems to get set - I misunderstood what this is used for

I am testing with this MCAN version:

can_mcan: IP rel: 3.2.3 08.6.8

Any thoughts/ideas on these?

@henrikbrixandersen

@gramsay0
Copy link
Contributor Author

gramsay0 commented Sep 19, 2023

I tested this with Linux MCAN and got the same result, hundreds of thousands of bus errors:

image

Interestingly, if I disable the transceiver, lower the bitrate (testing with 250k) or sneeze then the behavior is different (usually the ISR is called just once and the controller goes to bus-off).

The front end of the CAN transceiver is terminated:
image

Does the ISR maybe need a hold-off time to prevent chewing up CPU cycles under certain circumstances?

@henrikbrixandersen
Copy link
Member

  • I tested a TX with nothing connected to the CAN transceiver and got endless AckError ISRs (>100k in 30 seconds)
    • i.e. shouldn't this stop when the controller enters an error state?

No, that is expected. The Bosch M_CAN will attempt retransmission when it sees a missing ACK (and trigger the AckError interrupt). The retransmission also sees a missing ACK and so it continues.

drivers/can/can_mcan.c Outdated Show resolved Hide resolved
include/zephyr/drivers/can/can_mcan.h Show resolved Hide resolved
drivers/can/can_mcan.c Outdated Show resolved Hide resolved
drivers/can/can_mcan.c Outdated Show resolved Hide resolved
@henrikbrixandersen
Copy link
Member

I wonder how you verified this? Did you use an out-of-tree driver front-end? You neglected to update all the in-tree driver front-ends from using DEVICE_DT_INST_DEFINE() to using CAN_DEVICE_DT_INST_DEFINE().

@gramsay0
Copy link
Contributor Author

gramsay0 commented Sep 19, 2023

  • I tested a TX with nothing connected to the CAN transceiver and got endless AckError ISRs (>100k in 30 seconds)

    • i.e. shouldn't this stop when the controller enters an error state?

No, that is expected. The Bosch M_CAN will attempt retransmission when it sees a missing ACK (and trigger the AckError interrupt). The retransmission also sees a missing ACK and so it continues.

Yes, although I thought the controller would go to bus-off after enough failures.

I tested this with a zero byte CAN payload at 1Mbps, it generated an ISR every 54us until the controller state went to error-passive, and then every 62us indefinitely.

Do you think this could cause loading/performance issues on some CPUs (especially with external controllers like tcan4x5x over SPI)?

On second thought, this driver is only really an interface to the hardware, and that's just what the hardware does 🤷
Maybe a warning in the CAN_STATS kconfig would suffice, and let the user decide if it meets their requirements

@zephyrbot zephyrbot added platform: Microchip SAM Microchip SAM Platform (formerly Atmel SAM) platform: STM32 ST Micro STM32 platform: NXP NXP labels Sep 19, 2023
@gramsay0 gramsay0 changed the title dnm: drivers: can: mcan: Add CAN statistics drivers: can: mcan: Add CAN statistics Sep 19, 2023
drivers/can/can_mcan.c Outdated Show resolved Hide resolved
drivers/can/can_mcan.c Outdated Show resolved Hide resolved
@henrikbrixandersen
Copy link
Member

Yes, although I thought the controller would go to bus-off after enough failures.

Do you have CAN_AUTO_BUS_OFF_RECOVERY=y?

@gramsay0
Copy link
Contributor Author

Yes, although I thought the controller would go to bus-off after enough failures.

Do you have CAN_AUTO_BUS_OFF_RECOVERY=y?

I did try toggling that, same result.

Is auto-recovery implemented for MCAN, it looked like the config only optionally compiled the recover function?

Copy link
Member

@henrikbrixandersen henrikbrixandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! One thing missing is a call to CAN_STATS_RESET() in can_mcan_start(). See

CAN_STATS_RESET(dev);
for inspiration.

@henrikbrixandersen
Copy link
Member

Is auto-recovery implemented for MCAN, it looked like the config only optionally compiled the recover function?

That implementation does look a bit fishy. We will need to look into that. From a quick test here, it looks like the controller is going into bus-off, but auto-recovers. I think this explains the jump from 54 usec to 62 usec.

@gramsay0
Copy link
Contributor Author

That implementation does look a bit fishy. We will need to look into that.

👍

I think this explains the jump from 54 usec to 62 usec.

I think it's the "8 bit suspend transmission time" when in error-passive state (1Mbps == 1bit/us):
image
Bus-off recovery should be 128*11 = 1408us

@henrikbrixandersen
Copy link
Member

I think it's the "8 bit suspend transmission time" when in error-passive state (1Mbps == 1bit/us):

Ahh, right. Good find. I still think it is okay behaviour for now. If you can add the call to reset the statistics on start, I think it's ready to go in.

Add CAN stats for MCAN drivers.

Update MCAN drivers to use CAN_DEVICE_DT_INST_DEFINE
which initialises and registers CAN stats if enabled.

Signed-off-by: Grant Ramsay <gramsay@enphaseenergy.com>
@carlescufi carlescufi merged commit 663826a into zephyrproject-rtos:main Sep 21, 2023
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: CAN platform: Microchip SAM Microchip SAM Platform (formerly Atmel SAM) platform: NXP NXP platform: STM32 ST Micro STM32
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants