Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create new SocketCAN-capable device #251

Closed
PeterBowman opened this issue Jun 13, 2020 · 17 comments
Closed

Create new SocketCAN-capable device #251

PeterBowman opened this issue Jun 13, 2020 · 17 comments
Assignees
Projects

Comments

@PeterBowman
Copy link
Member

PeterBowman commented Jun 13, 2020

In my process of working on a new PCAN-based YARP device at #146, I considered the creation of a more generic SocketCAN device as well. It did not prosper at the time (beyond some preliminary work on the fix-146-CanBusSocket branch), although it's definitely on my radar as the currently empty CanBusSocket directory suggests.

I'm tagging this as low priority given that it's not a pressing matter for our work on TEO; note we are currently fine with the CanBusPeak device. However, it is definitely worth exploring the non-vendor widespread solution that most Linux kernels support nowadays.

See also:

@PeterBowman PeterBowman added this to To Do in CAN-TEO Jun 13, 2020
@PeterBowman PeterBowman self-assigned this Jul 16, 2021
@PeterBowman PeterBowman moved this from To Do to In Progress in CAN-TEO Jul 16, 2021
@PeterBowman
Copy link
Member Author

PeterBowman commented Jul 19, 2021

I'd like to retrieve certain information on the CAN-enabled sockets, such as current state (BUSOFF, errors...) and bitrate. This can be achieved via command line with ip -details -statistics link show <iface>. How can I translate this to C code?

See https://www.kernel.org/doc/html/latest/networking/can.html (text version).

Edit: answered below at #251 (comment).

@PeterBowman
Copy link
Member Author

PeterBowman commented Jul 20, 2021

For now, I'm just giving up on providing bitrate information to CanBusSocket users. In this scenario, the CAN bus brokers should be configured to not instantiate a bus load monitor by either not providing a busLoadPeriod option (example) or disabling YARP ports altogether (no name option). Bus load statistic are supposed to be easily obtained via the can-utils tool suite, namely using the canbusload app.

Edit: nah, let's keep that window open. Clients still can pass a bitrate value via config options and retrieve it later through the interface's getter. The setter is disabled, of course.

@PeterBowman
Copy link
Member Author

PeterBowman commented Jul 21, 2021

Regarding socket buffers: https://stackoverflow.com/q/1803215/10404307.

Regarding block-with-timeout socket R/W: https://stackoverflow.com/q/47425873/10404307 and https://ubuntuforums.org/showthread.php?t=1565651&p=9794579#post9794579 (from #146 (comment)).

@PeterBowman
Copy link
Member Author

PeterBowman commented Jul 21, 2021

CAN ID filtering implemented following the advices at https://stackoverflow.com/q/57790381/10404307. I'm assuming that each setsockopt call resets the internal list of filters, thus overwriting previous ones, hence the need of storing a list of active filters in a vector. When a filter reset is requested, documentation says that the device enters a send-only state, i.e. it can't receive any message. Normally, a default filter is active such that all IDs are allowed. I have tried to replicate this by setting a filter with ID 0 | CAN_INV_FILTER, not tested yet.

In any case, we are not using filters in CanBusControlboard.

Edit: tested on a virtual CAN interface and working nicely. I'm actually prone to re-enable filters in CanBusControlboard.

@PeterBowman
Copy link
Member Author

PeterBowman commented Jul 22, 2021

Thankfully, virtual CAN interfaces are a thing (docs)! Maybe I can test this device from my own home PC. To start such an interface, type:

sudo ip link add dev vcan0 type vcan
sudo ip link set vcan0 up

In fact, cansend, candump and canbusload seem to work nicely. Also, cangw can help emulate one or several nodes on a virtual bus: https://stackoverflow.com/a/33629359/10404307.

sudo ip link add dev vcan0 type vcan
sudo ip link set vcan0 up
sudo ip link add dev vcan1 type vcan
sudo ip link set vcan1 up
sudo modprobe can-gw
sudo cangw -A -s vcan1 -d vcan0 -e

I'm using the previous commands to redirect traffic from vcan1 to vcan0 (so vcan1 here plays the role of a CAN node, while vcan0 is the master), then listen to it via candump vcan0.

@PeterBowman
Copy link
Member Author

PeterBowman commented Jul 26, 2021

Blocking and non-blocking reads and writes successfully tested with the following sample application (see previous comment regarding how to initiate two virtual CAN networks and bind them together):

#include <yarp/conf/numeric.h>

#include <yarp/os/LogStream.h>
#include <yarp/os/Property.h>
#include <yarp/os/SystemClock.h>

#include <yarp/dev/CanBusInterface.h>
#include <yarp/dev/PolyDriver.h>

int main()
{
    constexpr auto id = 25;
    constexpr auto block = false;
    constexpr auto BUFFER_SIZE = 10;

    yarp::os::Property options = {
        {"device", yarp::os::Value("CanBusSocket")},
        {"blockingMode", yarp::os::Value(block)},
        {"port", yarp::os::Value("vcan0")}
    };

    yarp::dev::PolyDriver device;

    if (!device.open(options))
    {
        yError() << "Cannot open";
        return 1;
    }

    yarp::dev::ICanBus * iCanBus;
    yarp::dev::ICanBufferFactory * iCanBufferFactory;

    if (!device.view(iCanBus) || !device.view(iCanBufferFactory))
    {
        yError() << "Cannot view";
        return 1;
    }

    yInfo() << "canIdAdd()" << iCanBus->canIdAdd(id);

    auto sendBuffer = iCanBufferFactory->createBuffer(BUFFER_SIZE);
    unsigned int sent;

    for (auto i = 0; i < BUFFER_SIZE; i++)
    {
        sendBuffer[i].setId(id);
        sendBuffer[i].setLen(2);
        unsigned char data[] = {0x01, 0x02};
        std::memcpy(sendBuffer[i].getData(), data, sizeof(data));
    }

    yInfo() << "canWrite():" << iCanBus->canWrite(sendBuffer, BUFFER_SIZE, &sent, block);
    yInfo() << "Sent" << sent << "messages";

    auto readBuffer = iCanBufferFactory->createBuffer(BUFFER_SIZE);
    unsigned int read = 0;

    do
    {
        iCanBus->canRead(readBuffer, BUFFER_SIZE, &read, block);
        yarp::os::SystemClock::delaySystem(0.005);
    }
    while (!block && read == 0);

    yInfo() << "Got" << read << "messages";

    for (auto i = 0; i < read; i++)
    {
        auto && log = yInfo();
        log << "id:" << readBuffer[i].getId() << "len:" << readBuffer[i].getLen() << "data:";

        for (auto j = 0; j < readBuffer[i].getLen(); j++)
        {
            log << yarp::conf::numeric::to_hex_string(readBuffer[i].getData()[j]);
        }
    }

    iCanBufferFactory->destroyBuffer(sendBuffer);
    iCanBufferFactory->destroyBuffer(readBuffer);
    yInfo() << "canIdDelete()" << iCanBus->canIdDelete(id);
    return 0;
}

@PeterBowman
Copy link
Member Author

@PeterBowman
Copy link
Member Author

According to https://stackoverflow.com/q/13021796/10404307, sockets are full-duplex and bi-directional, therefore lock mechanisms (e.g. a mutex) should not be necessary. In any case, reads and writes are carried out by separate CanBusControlboard threads.

@PeterBowman
Copy link
Member Author

Mainline kernel implementations for PEAK hardware (ref, see sources in previous comment):

  • peak_pci (since Kernel v3.2) is in charge of all SJA1000 based PCI/PCIe-like PC interfaces (PCAN-PCI, PCAN-PCI Express, PCAN-miniPCIe, ...).
  • peak_pcmcia (since Kernel v3.4) is in charge of our single or dual channel PCAN-PC Card interfaces.
  • peak_usb (since Kernel v3.4, CAN FD support since Kernel v4.0) is in charge of all our USB PC interfaces (PCAN-USB, PCAN-USB Pro, PCAN-USB FD, PCAN-USB Pro FD, PCAN-Chip USB, and PCAN-USB X6).
  • peak_pciefd (since Kernel v4.12) is in charge of our CAN FD PCIe-like PC interfaces (PCAN-PCI Express FD, PCAN-miniPCIe FD, and PCAN-M.2).

In TEO, we'd use peak_pciefd according to that.

@PeterBowman
Copy link
Member Author

CAN bus states along with the error count thresholds (ref):

/*
 * CAN operational and error states
 */
enum can_state {
	CAN_STATE_ERROR_ACTIVE = 0,	/* RX/TX error count < 96 */
	CAN_STATE_ERROR_WARNING,	/* RX/TX error count < 128 */
	CAN_STATE_ERROR_PASSIVE,	/* RX/TX error count < 256 */
	CAN_STATE_BUS_OFF,		/* RX/TX error count >= 256 */
	CAN_STATE_STOPPED,		/* Device is stopped */
	CAN_STATE_SLEEPING,		/* Device is sleeping */
	CAN_STATE_MAX
};

@PeterBowman
Copy link
Member Author

The previous enumeration originates from linux/can/netlink.h. This protocol is said to be "a more flexible successor to ioctl" (Wikipedia). It grants access to otherwise inaccessible CAN interface management such as: starting an interface, setting/getting the bitrate, querying current bus state, etc. Ideally, one would use the libsocketcan wrapper library to abstract from the details of the underlying calls.

More info:

NIce thing is: we would not need to type (or configure on start) ip link set canX up type can bitrate xxx prior to using the CAN bus as this could be done programatically.

@PeterBowman
Copy link
Member Author

PeterBowman commented Jul 28, 2021

My findings on netlink and libsocketcan:

  • It provides no further means of sending or receiving messages. A separate socket must be opened anyway for that matter. I thought this protocol/interface/library would return a reusable socket descriptor, but it's not the case, apparently a new temporary socket is opened for any of these methods.
  • Somewhat along the lines of what I have read, I think that starting a virtual CAN (vcan) is not supported. The "can" module is hardcoded here.
  • Being able to open/close a CAN interface, set its bitrate and optionally configure an automated restart on bus-off conditionally is something that can be done either via command line (ip link) or programatically using libsocketcan. Because of that, having C++ code do the same is not that much of an advantage. Moreover, sometimes we may want to start candump or any other listening app before running launchCanBus; this would not be possible if the CAN interface is opened during the execution of launchCanBus.
  • What does libsocketcan actually offer? We can access bus state via can_get_state, query its bitrate via can_get_bittiming, and retrieve error counters via can_get_berr_counter, can_device_stats or can_get_link_stats (details on the last one). These enable a potentially exhaustive implementation of ICanBusErrors.

Is libsocketcan worth it? I am mostly interested in solving #246, which might be doable through other means. I just need to monitor the bus-off condition, I don't care about overflow and error counts stored along with that single Boolean in yarp::dev::CanErrors.

For future reference (maybe I decide to ditch this idea), I have written a find module for libsocketcan at 46818a8.

@PeterBowman
Copy link
Member Author

PeterBowman commented Jul 29, 2021

Current non-netlink WIP implementation reaches error-passive state on TX upon pressing the emergency button. This fires up at exactly 128 TX errors (#251 (comment)). The output is as follows:

[WARNING] |rl.CanBusSocket| Controller: reached warning level for TX errors (can3)
[WARNING] |rl.CanBusSocket| Controller: reached error passive status TX (can3)

Then, all subsequent CAN writes fail with:

[ERROR] |rl.CanBusSocket| write() failed: No buffer space available

Bus statistics can be obtained via sudo ip -details link show can3:

6: can3: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
    link/can  promiscuity 0 minmtu 0 maxmtu 0 
    can state ERROR-PASSIVE (berr-counter tx 128 rx 1) restart-ms 0 
	  bitrate 1000000 sample-point 0.750 
	  tq 12 prop-seg 29 phase-seg1 30 phase-seg2 20 sjw 1
	  peak_canfd: tseg1 1..256 tseg2 1..128 sjw 1..128 brp 1..1024 brp-inc 1
	  peak_canfd: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..1024 dbrp-inc 1
	  clock 80000000 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

It never goes beyond those 128 TX errors, which means bus-off state (256 errors) is never reached. We need it to prevent spurious writes from happening, see #246 and:

//-- Query bus state.
if (!iCanBusErrors->canGetErrors(errors) || errors.busoff)
{
//-- Bus off, reset TX queue.
preparedMessages = 0;
return;
}

Output of ifconfig command:

can3: flags=193<UP,RUNNING,NOARP>  mtu 16
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 10  (UNSPEC)
        RX packets 73798  bytes 380798 (380.7 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 10036  bytes 3985 (3.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 16

Note the txqueuelen 10 (UNSPEC) part. My hackish solution was to increase this default TX queue length, e.g. via sudo ifconfig can3 txqueuelen 1000 (too low values, e.g. 200, are not fine). In this case, write() keeps trying to send stuff, but nothing actually gets sent (according to candump) and the TX error counter doesn't get increased either (according to ip -details link).

We can conclude that netlink/libsocketcan would have made no difference since it returns the same information as ip -details link.

@PeterBowman
Copy link
Member Author

PeterBowman commented Jul 29, 2021

Done at c86fc25. Configuration files have been not prepared yet, but it's a trivial task. I'm going to consider a migration from CanBusPeak to CanBusSocket as the default CAN device in September. The commands that need to be added on start for each canX interface are:

sudo ip link set canX up type can bitrate 1000000
sudo ifconfig canX txqueuelen 1000

Prior to that, if proprietary PCAN chardev drivers are decided to be kept, the mainline "peak_pciefd" module must be un-blacklisted (commented out) from /etc/modprobe.d/blacklist-peak.conf. Perhaps it's enough to just type sudo modprobe peak_pciefd, I'm not sure.

CAN-TEO automation moved this from In Progress to Done Jul 29, 2021
@PeterBowman
Copy link
Member Author

PeterBowman commented Sep 14, 2021

SocketCAN bus interfaces can be enabled via systemd using this config file located at /etc/systemd/system/canX.service (ref):

[Unit]
Description=SocketCAN interface canX with a baudrate of 1000000
After=multi-user.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/ip link set canX up txqueuelen 1000 type can bitrate 1000000 restart-ms 100
ExecReload=/sbin/ifconfig canX down ; /sbin/ip link set canX up txqueuelen 1000 type can bitrate 1000000 restart-ms 100
ExecStop=/sbin/ifconfig canX down

[Install]
WantedBy=multi-user.target

Replace canX with can0, can1... Make sure to chmod u+x this file. Then use these commands:

  • sudo systemctl start canX to start this service
  • sudo systemctl enable canX to enable it on startup
  • sudo systemctl status canX whenever you need to check its state
  • stop, restart, disable - as expected

Discarded solutions:

If you want to disable PeakCAN and enable SocketCAN:

  1. Comment out peak_pciefd in /etc/modprobe.d/blacklist-peak.conf
  2. sudo modprobe -r pcan
  3. sudo modprobe peak_pciefd
  4. sudo systemctl start canX
  5. sudo systemctl enable canX

If you want to disable SocketCAN and enable PeakCAN:

  1. Uncomment peak_pciefd in /etc/modprobe.d/blacklist-peak.conf
  2. sudo systemctl stop canX
  3. sudo systemctl disable canX
  4. sudo modprobe -r peak_pciefd
  5. sudo modprobe pcan

Step 1 and enable/disable are necessary to make the changes persist across reboots.

@PeterBowman
Copy link
Member Author

PeterBowman commented Sep 15, 2021

For future reference, left arm (7 nodes) CAN bus usage at 2 ms is 75% (or 68% according to canbusload -e, which takes into account stuffed bits) and CPU usage around 9-11% with CanBusPeak and 7-9% with CanBusSocket.

I have merged roboticslab-uc3m/teo-configuration-files@3627e52. The default CAN bus device was switched from CanBusPeak to CanBusSocket on both manipulation and locomotion PCs. See above comment for instructions on how to easily switch between CAN devices and networks. Mind that the proprietary PEAK module might become obsolete due to Linux header upgrades and thus require a rebuild. Also, launch files need to be tweaked in order to pick the correct device implementation (replace socket-XXX with peak-XXX or viceversa in these lines).

In order to keep the workspace clean, it is best to use YARP's resource finder system and locally installed files (as opposed to copy-pasting lots of .ini files in /home/teo). Please refer to yarp-config for usage notes.

cc @rsantos88 @smcdiaz

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
CAN-TEO
  
Done
Development

No branches or pull requests

1 participant