Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Add machine.CAN low-level API docs. #13149

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

projectgus
Copy link
Contributor

@projectgus projectgus commented Dec 7, 2023

First step to implementing a cross-port set of machine.CAN drivers, as outlined in #12337. Refer to the linked PR for the whole plan.


This is a draft of a new API at this point. There might still be significant changes before it can be accepted.


Outline

The proposed API implements machine.CAN as a low-level hardware abstraction layer that provides three things:

  1. A priority hardware queue of outgoing CAN messages.
  2. A FIFO hardware queue of incoming CAN messages, with optional hardware filters.
  3. Bus error & state change reporting and control.

machine.CAN is enough to do some very basic CAN operations, even more basic than pyb.CAN. The idea is to layer micropython-lib modules for can and aiocan on top of machine.CAN. These modules will provide extra functions like software message queues, a more "Pythonic" API, etc.

The machine.CAN API is also simple enough that it should be possible to implement pure Python drivers for external CAN controllers like MCP2515, MCP2518FD, etc. and use these with can and aiocan.

can & aiocan

Currently this higher layer in micropython-lib (not part of this PR) is less well defined, but I''d like it to look quite similar to python-can, at least for the synchronous API. Anticipating the code in the higher layer will look something like:

import can
import machine

mcan = machine.CAN(0, bitrate=1_000_000, mode=machine.CAN.NORMAL)
bus = can.Bus(mcan, send_depth=8, recv_depth=16)  # Set explicit software message queue depths

msg = can.Message(0x300, data=bytearray(8), is_extended=False)
counter = 0

def send_message(_timer):
    global counter
    counter += 1
    struct.pack_into("I", msg.data, 0, counter)
    bus.send(msg)

tim = machine.Timer(-1)
tim.init(period=100, callback=send_message)

r_msg = None

while True:
    r_msg = bus.recv(timeout=None, message=r_msg)  # reuse Message object
    if r_msg.identifier < 0x300:
       print(hex(r_msg.identifier, r_msg.data.hex())

Or async:

import asyncio
import aiocan
import machine

mcan = machine.CAN(0, bitrate=1_000_000, mode=machine.CAN.NORMAL)
bus = aiocan.Bus(mcan, send_depth=8, recv_depth=16)

bus.filter(aiocan.Filter(message=0x100, mask=0x100, is_extended=False))

async def send():
    msg = aiocan.Message(0x300, data=bytearray(8), is_extended=False)
    counter = 0
    while True:
        await asyncio.sleep_ms(100)
        counter += 1
        struct.pack_into("I", msg.data, 0, counter)
        await bus.send(msg, timeout=None)

async def recv():
    r_msg = None
    while True:
        r_msg = await aio.recv(timeout=None, message=r_msg)
        if r_msg.identifier < 0x300:
           print(r_msg)

async def main():
    await asyncio.gather(send(), recv())

asyncio.run(main())

Limitations

  • Dealing with multiple hardware FIFOs is left for the driver to manage. The caller sees incoming CAN messages in FIFO order.
  • Filtering may be less capable than what some hardware is able to do.

Concerns

  • Python code may not be a fast enough "message pump" to keep the hardware TX queue full, and the RX queue empty. I think it will be, but remains to be proven.
  • CAN controller differences may make a common machine.CAN API too difficult to implement, with too many "quirks" of behaviour.
  • Not including a recv() method (only a callback) in the low-level machine.CAN may be too restrictive. However, the goal is for everyone to use the higher level can or aiocan modules.

Credit

This API is variously inspired by MicroPython aioble, pyb.CAN, CANPico, and the Rust bxcan crate. Misunderstandings and bad takes are 100% my own work, though!

This work was funded through GitHub Sponsors.

@projectgus
Copy link
Contributor Author

Regretfully I am putting this up and vanishing on holiday for a few weeks. 🏖️ I will be back around Jan 2 and will carefully read any comments then. Looking forward to pushing ahead with the CAN API!

@projectgus projectgus changed the title docs: Add machine.CAN API docs. docs: Add machine.CAN low-level API docs. Dec 7, 2023
@robert-hh
Copy link
Contributor

I am putting this up and vanishing on holiday for a few weeks.

This is quite a cliff-hanger ...

Copy link
Sponsor Contributor

@kwagyeman kwagyeman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API seems to work. I would reconsider adding a recv() method so folks don't have to change their code too much. If the higher level CAN module lands at the same time then I think there's not a problem.

Anyway, after this is approved I can modify the imxrt can implementation to follow this spec pretty easily.

docs/library/machine.CAN.rst Outdated Show resolved Hide resolved
docs/library/machine.CAN.rst Outdated Show resolved Hide resolved
docs/library/machine.CAN.rst Outdated Show resolved Hide resolved
docs/library/machine.CAN.rst Show resolved Hide resolved
@kdschlosser
Copy link

There should be 2 distinct classes for the CAN. It should be CAN and CANFD. they should not be combined. I would also recommend not using "CAN" or "CANFD" as these are copyrighted by Bosch. I believe the copyright has expired in the US but it has not in Europe and use of those phrases could cause an issue. This is the reason why Espressif uses TWAI.

Regardless of what it is named there needs to be a distinction made between the 2 network types. There should also be a "Message" class. This class represents a CAN frame. It contains the arbitration ID's, the data contained in the packet. along with any other information that is attached to a frame. The information that is going to be available in a message is going to vary based on the flavor and version of CAN that is being used nd if something in a message is not used then it would be set to None. The message class is what gets handed back to a user when collecting the received messages. This is what gets passed to the transmit function. only 2 things should get passed to the transmit function, one of these messages and a timeout if wanting to make a blocking transmit call. If a call is non blocking (user has a registered callback for transmit errors) then timeout is optional and would carry a default value of -1.

Few years back I wrote a program that you can feed in a bitrate and some static limits based on the CAN IC being used and it would return the registers needed to set the bitrate for the CAN IC being used. This is something that MUST be available. A user really should not have to know things like BRP, TSEG1, TSEG2, SJW, PHSEG1, PHSEG2, TQ, PROPSEG, (CNF1, CNF2 and CNF3), (BTR0 and BTR1), DCANBTR, (BTP and FBTP), (CiCFG1 and CiCFG2), CxBCR, CnBTR, CiCONR, etc.... The user should be able to pass to the constructor the target bitrate, bus length(optional), transceiver timings (optional) and number of nodes (optional) and the software will calculate the required registers needed to make it work. Optionally There can be BRP, TSEG1, TSEG2, SJW and sample point parameters can be used. The software should have the ability to take those optional parameters and turn them into the timing registers for the CAN interface that is being used.

I recommend splitting the CAN implementation into 2 pieces. the first piece is what the user interacts with. it is the classes. The other is the bus. The only exposed portion of the bus is the constructor. The user will not have access to any other portion of the bus. This would allow for MCU's that do not have built in CAN interfaces to be able to connect an external one over I2C or SPI. This would keep the exposed user API identical across the board. The "busses" would need to have specific functions that the CAN classes would call to perform tasks. The API for those functions would be defined as well. The bus would be responsible for carrying out the task that it has been instructed to do and it must do it with the information that has been given.

The hardest part is the initialization of a bus. There is no standard so the information that is needed by an IC is going to be all over the place based on model and manufacturer of the IC. That is why it is imperative that code like what I wrote is able to calculate the timing registers using a standardized set of inputs. Without having that available you will not be able to have a common API between the different IC's.

@projectgus
Copy link
Contributor Author

@kdschlosser Thanks for the in-depth comments here.

I would also recommend not using "CAN" or "CANFD" as these are copyrighted by Bosch. I believe the copyright has expired in the US but it has not in Europe and use of those phrases could cause an issue. This is the reason why Espressif uses TWAI.

Perhaps you're thinking of patents and/or trademarks? It's the patent on CAN which has expired. CAN and CAN-FD are still Registered Trademarks of Bosch, and IANAL but as I understand it their licensing concerns are around hardware/gateware implementations and not software drivers for those implementations. For all controllers except for Espressif TWAI, we're writing drivers for CAN controller hardware (presumably with any applicable hardware IP licensing in place), so I think it's reasonable to name the driver as such. (See also: CAN drivers in open source operating systems like Zephyr, Linux, NuttX.) For the particular case of Espressif then I guess we can add some disclaimer that it's not actual CAN controller hardware in that case, and/or the low level part can be machine.TWAI.

A user really should not have to know things like BRP, TSEG1, TSEG2, SJW, PHSEG1, PHSEG2, TQ, PROPSEG, (CNF1, CNF2 and CNF3), (BTR0 and BTR1), DCANBTR, (BTP and FBTP), (CiCFG1 and CiCFG2), CxBCR, CnBTR, CiCONR, etc.... The user should be able to pass to the constructor the target bitrate, bus length(optional), transceiver timings (optional) and number of nodes (optional) and the software will calculate the required registers needed to make it work. Optionally There can be BRP, TSEG1, TSEG2, SJW and sample point parameters can be used

You make a very good point, fully agree. I note @kwagyeman said similar, too. Regarding exposing non-hardware-specific constructor options for bus length and transceiver timings: do you know if most manufacturers provide reasonable algorithms in order to calculate low-level configuration register values from these parameters? If so, then it seems reasonable to require drivers to expose these as common arguments. If not, then falling back to controller-specific register arguments may be the best (i.e. minimal cross-hardware common settings, maximum hardware-specific flexibility). I'll dig into this myself as well.

There should also be a "Message" class. This class represents a CAN frame. [...]
I recommend splitting the CAN implementation into 2 pieces. the first piece is what the user interacts with. it is the classes. The other is the bus. The only exposed portion of the bus is the constructor. The user will not have access to any other portion of the bus.

Unless I'm missing something then this is essentially the design proposed in this PR. The part in the machine.CAN.rst file is the low level hardware-specific API, and the idea is to ship a high-level Pythonic API in micropython-lib that is what most folks will interact with (the sample code in the description of this PR is to give a taste of that API).

We don't specifically disallow using the lower level classes directly (it is Python not C++ after all), but we'll try to ship the high level classes at the same time, document them well, etc. The question mentioned above of whether or not to include a simple recv() in the low-level API at all is probably the outstanding question of how much we want to "nudge" versus "shove" people towards always using the high-level classes.

If you think I've missed another distinction compared to the design you've described, please let me know.

There should be 2 distinct classes for the CAN. It should be CAN and CANFD. they should not be combined.

I'm not sure I follow the reasoning here, especially as CAN and CAN-FD nodes can exist on the same bus so the controller needs to be aware of both protocols for many applications. Can you please explain your reasoning some more, in particular the pitfalls of a combined design? Particularly if most users are going to be working with a higher level common API anyhow (if I understand your comments correctly, you're saying that the higher layer API should be combined.)

@projectgus projectgus force-pushed the doc/machine_can branch 2 times, most recently from 564f399 to eda34bc Compare January 10, 2024 06:42
@projectgus
Copy link
Contributor Author

@IhorNehrutsa @obdevel @andrewleech @kentindell Do any of you have time to take a look at this proposed API? Additional feedback or questions would be very welcome! Cheers.

@kentindell
Copy link

A few comments.

  • Very pleased to see a priority queue. Nearly everyone gets this wrong and breaks their CAN implementation as a result. But there is also a need to feed a priority queue from one or more FIFO queues (ISO TP messages will need this, for example). It's better to define the way to do that than have everyone roll their own.
  • On the timing properties of CAN, you shouldn't take a cue from what the hardware does, but rather specify exactly the parameters defined in ISO 11898. These are complicated, and people just want to get started, so define the common bit rates as profiles and let people choose one. This has the benefit of making it harder to screw up (people always get the sample point in the wrong place because they're thinking of UARTs and reckon 50% is good - which it very much isn't). I really wouldn't try to calculate these fields dynamically: it's a complex problem. Better to have simple quickstart settings and then anyone who wants to go off-route can do their own thing with ISO standard parameters.
  • The above goes double for CAN FD!
  • ISO 11898 defines some options that the hardware may choose to support, like disabling error re-transmission or ignoring CAN FD frames. I would include those as defined options because they're standard ones, and then ignore them where the hardware doesn't support them.
  • I don't think you'll get away with Python only for the low level drivers for 3 TX buffer controllers like the bxCAN. Building a priority queue in software requires the ISR responding to the TX / Abort interrupt to juggle buffers very quickly. But I guess we'll have to see.
  • It would be nice to support time-stamping in a standard way: most modern hardware does it, and it's needed for someone doing time-triggered CAN in software (I don't think TTCAN solves real-time problems, but that's another story).

@andrewleech
Copy link
Sponsor Contributor

Do any of you have time to take a look at this proposed API?

Unfortunately I've got very little actual knowledge about can usage, my interest is more in getting canopen working eventually to broaden the appeal of micropython at work :-D

@projectgus
Copy link
Contributor Author

projectgus commented Jan 10, 2024

Thanks @kentindell for weighing in here.

Very pleased to see a priority queue. Nearly everyone gets this wrong and breaks their CAN implementation as a result. But there is also a need to feed a priority queue from one or more FIFO queues (ISO TP messages will need this, for example). It's better to define the way to do that than have everyone roll their own.

In the lower layer, that's the intention of the send(..., fifo_equal=True) parameter. By default, send shouldn't requeue a message of equal priority and the caller will be able to get FIFO transmission. To requeue a newer message with equal priority over the top of one already in the hardware buffer requires setting fifo_equal=False. However this doesn't account for the possibility that you might want FIFO transmission of messages with different priorities (is that ever a thing?)

On the higher layer, the "FIFO send" API is not yet defined. For ergonomics, it might need to be made more explicit with a separate queue or a different API.

How does that sound?

On the timing properties of CAN, you shouldn't take a cue from what the hardware does, but rather specify exactly the parameters defined in ISO 11898. [..]
The above goes double for CAN FD!

Understood. See the threaded discussion above for more on this one.

ISO 11898 defines some options that the hardware may choose to support, like disabling error re-transmission or ignoring CAN FD frames. I would include those as defined options because they're standard ones, and then ignore them where the hardware doesn't support them.

Thanks, will look at these.

I don't think you'll get away with Python only for the low level drivers for 3 TX buffer controllers like the bxCAN. Building a priority queue in software requires the ISR responding to the TX / Abort interrupt to juggle buffers very quickly. But I guess we'll have to see.

Agree, we'll see. I'm confident the higher level Python ISRs won't need to allocate, so on STM32 this means they can be a "hard" ISR. But if we end up needing to implement the software priority queue in C, that's fine (it can still be a generic piece of code shared by all drivers, which is a win.)

It would be nice to support time-stamping in a standard way: most modern hardware does it, and it's needed for someone doing time-triggered CAN in software (I don't think TTCAN solves real-time problems, but that's another story).

Do you mean on the sending side, so you could schedule a message to send in a particular window for TTCAN-like functions? This feels like it's best to leave above the individually controller driver layer, even if that reduces timing tolerance a bit. (From what little I know about TTCAN it seems like the sort of thing where you probably need a harder real-time system than MicroPython to pull it off, anyway?)

On the receive side, what's written in there now is to have all drivers timestamp incoming messages with time.ticks_us(), and the driver can do its best to accurately calculate that timestamp. Do you think that's precise enough? In Python applications it's pretty difficult to do anything with much finer timing.

@kentindell
Copy link

I think microsecond accuracy timestamps are fine (and a natural human readable unit). I just hardwired the timestamps in the CANPico drivers to this. It's enough accuracy to spot crystal clock drift between devices (the reason for collecting the transmit timestamp too is that you can send a "follow up" message of that timestamp, and devices can sync their local clocks and then coordinate actions at the same time).

I don't think it's worth supporting FIFOs with different IDs, by the way: horrible side effects and don't think it's helpful. But multiple FIFOs into a single priority transmit queue is important (multiple ISO TP messages, and MQTT over CAN).

Someone mentioned using the HAL to do the drivers. I can tell you that's a terrible idea: I've never seen a HAL drive CAN properly. They routinely suck, almost no-one who writes CAN drivers has heard about priority inversion, and I've never seen anyone write bxCAN drivers properly (I have some proper bxCAN drivers you can have if you want: supports up to 32 frames in a priority queue in RAM).

My other advice is not to separate FD from CAN. No-one using MicroPython is going to worry about the buffer sizing issues, and you can in any case use what the CAN hardware provides (in the 2518FD hardware I just make the priority transmit queue have a depth of 20). For the receive FIFO it's just a FIFO and different frame sizes aren't a problem.

When CAN XL hardware turns up with its 2048 byte frames, then we can revisit this problem.

@kdschlosser
Copy link

kdschlosser commented Jan 11, 2024

to give a basic idea of how to go about doing this where the front end API would remain a constant across all MCU's

CAN.zip

@projectgus
Copy link
Contributor Author

projectgus commented Jan 11, 2024

MicroPython priorities RAM usage heavily, and there are more than two possible ways to approach this. It should be entirely possible to have a single controller driver that pre-allocates buffers of different sizes depending on whether or not CAN FD is in use.

@kdschlosser

This comment was marked as off-topic.

@projectgus
Copy link
Contributor Author

I've edited this thread to remove some comments. MicroPython has a code of conduct, and we will enforce it if necessary. Making constructive comments about the specific technical merits (or not) of an idea is very welcome, but it's essential to do so respectfully.

First stage to implementing a cross-port set of machine.CAN driver.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
@projectgus
Copy link
Contributor Author

I've made some more updates based on the feedback here, thanks everyone for your time looking over this doc.

Next will try implementing an initial driver. Will use the STM32 CAN controller as we have the existing pyboard implementation.

@projectgus projectgus marked this pull request as ready for review January 18, 2024 02:23
@kentindell
Copy link

Next will try implementing an initial driver. Will use the STM32 CAN controller as we have the existing pyboard implementation.

That only supports 3 transmit slots (i.e. directly maps to hardware). That's OK to start with, I guess, but is very limiting (like you queue a handful of frames and it barfs with an overflow). I think you'll need to implement a "shuffling" driver that keeps (say) 32 frames in a priority queue in RAM and then uses the abort mechanism to ensure the highest 3 are in the 3 hardware buffer slots.

This shuffling/abort mechanism is very common: it's supported by most* small CAN controllers (no surprise: I invented the concept and talked to a bunch of silicon vendors). This means that you can implement a generic shuffling/abort driver and then instantiate it for different CAN controller implementations. If the drivers are in Python that should map nicely to an abstract base class (I'm wondering now if a priority list can be implemented very efficiently using MP primitives without triggering the heap when you come to add and remove items in an ISR or with interrupts locked).

*The notable ones that don't are the Microchip 2515 controller and the ones that are clones of the ancient 82C200 controller.

@projectgus
Copy link
Contributor Author

@kentindell Yes, for sure - I should have given more details. I'm hoping we can make the hardware controller specific parts a thin wrapper around the hardware queue, and the software queueing mechanism will be in the layer above (which will be Python). If that's too slow then we'll bump the software queueing part into C, but implemented in an abstract way so that the same implementation can work across different hardware.

BTW, you mentioned that you have a "proper" bxCAN driver implementation that skips the common pitfalls. If you're able to give me a peek, please send it through - I'd be interested to look.

@kwagyeman
Copy link
Sponsor Contributor

@projectgus - In the interest of having something versus nothing... are you ready to say the API is finalized? I'd like to not be in limbo moving forwards as we've got users wanting a stable CAN API.

@projectgus
Copy link
Contributor Author

are you ready to say the API is finalized? I'd like to not be in limbo moving forwards as we've got users wanting a stable CAN API.

I think this is very close to what the Python API for low-level CAN controllers will be. I'd hesitate to say it's final until it's been tested in an implementation.

From the perspective of writing a driver, the question of where software queueing sits (Python, C) may change if we can't meet decent performance with this part written in Python. Moving that may require small changes to the Python API documented here, and likely change the controller driver implementation substantially (i.e. which parts are C code included from extmod/ versus implemented in higher level Python).

From the user's perspective, the low-level controller API defined here is fairly bare bones and won't be particularly ergonomic until there's a high-level can module to sit on top of it.

So, on balance, I'd recommend waiting until there's an initial driver implementation available with PRs into both micropython and micropython-lib before refactoring any existing CAN drivers - just in case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants