Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

socketcan write() errors in csp_can_tx_frame(): ENOBUFS vs EAGAIN #453

Open
Buzzzz opened this issue Sep 20, 2023 · 2 comments
Open

socketcan write() errors in csp_can_tx_frame(): ENOBUFS vs EAGAIN #453

Buzzzz opened this issue Sep 20, 2023 · 2 comments
Assignees

Comments

@Buzzzz
Copy link

Buzzzz commented Sep 20, 2023

I'm using a Raspi 3B+ as a CAN bus participant, with the can0 interface. I have noticed that when sending many CSP packets, the write() in csp_can_tx_frame() eventually returns with EAGAIN = Resource temporarily unavailable. But the code only tests for ENOBUFS for retrying until timeout.
Is is problematic to check for both errno values? I have changed the code to this and so far have no problem. The EAGAIN condition goes away within the timeout. I have never seen write() fail with ENOBUFS.

Origial code in src/drivers/can/can_socketcan.c, function csp_can_tx_frame():

    while (write(ctx->socket, &frame, sizeof(frame)) != sizeof(frame)) {
        if ((errno != ENOBUFS) || (elapsed_ms >= 1000)) {

My new code:

    while (write(ctx->socket, &frame, sizeof(frame)) != sizeof(frame)) {
        if (((errno != ENOBUFS) && (errno != EAGAIN)) || (elapsed_ms >= 1000)) {

Is ENOBUFS actually a possible error for an AF_CAN socket? The man page says "ENOBUFS No buffer space available (POSIX.1 (XSI STREAMS option))."

@yashi
Copy link
Member

yashi commented Sep 20, 2023

I'd do

  1. check whether write() returned negative, if so get out.
  2. check the returned value against sizeof(frame). Because this is writing to a network socket, write() could return with a partial write. I dout it though since the frame size is 8 bytes. Handle it if we need to. write() could return 1 every time it's called then you are sending one byte for 1000 times. -- edit: I assume we want a partial write to be a failure.

Note that,

  • Adding yet another error number doesn't catch all erros. A manpage for write(2) lists many more including EINTR.
  • errno is basically a thread-local global variable, meaning you have to set to 0 before calling a system call. Apearently we don't. If system call failed with -1 it is guaranteed to be update. No functions set it to 0 in any situations.
  • It might be better to change usleep() to a monotinic timer because using usleep() doesn't count while reading. This might not be important since it desn't take long to write or it doesn't loop that many.

Just my two cents.

@johandc
Copy link
Contributor

johandc commented Sep 28, 2023

ENOBUFS will come when the TX queue on the CAN bus becomes full, this happens quite frequently, unless the can interface init increases the number of TX buffers. We run the TX buffers at 1000. But we do se ENOBUFFS after loosing the connection to the CAN bus for a very long time.

I think adding the EAGAIN to the list does not hurt, if thats what your system is giving you. The philosophical troubles of the ERRNO is annoying, and indead we should have something like write_r which returns its reentrant error codes. That would be better, but its not standardised. I think instead using TLS (thread local storage) is the preferred method, but i might be wrong.

The usleep does not exactly need to be precise, so for simplicity i think we should keep it.

@yashi yashi self-assigned this Apr 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants