Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers: It is not possible to use transceiver drivers on a shared SPI-bus #2769

Closed
PeterKietzmann opened this issue Apr 9, 2015 · 13 comments
Assignees
Labels
Area: drivers Area: Device drivers Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation

Comments

@PeterKietzmann
Copy link
Member

This issue results from PR #2756.

Problem:
Allowing SPI-transfers for example to receive data in interrupt context could lead to unwanted behavior. Assuming there is an ongoing SPI-transfer and at the same time an external interrupt is triggered by the transceiver. This will lead to two enabled chip selects and undefined behavior on the bus.

Former solution approaches:

  1. Ensure that the transceiver is the only device on the bus to guaranty exclusive access to the SPI-bus.
  2. Remove all SPI-transfers form the ISR and run them in another thread context.
  3. Make sure that an interrupt can only be triggered by the radio during "safe" periods (deactivate interrupt during SPI read/write).
  4. Enable only one radio interrupt source at the same time
  5. Like 2) but additionally increase the priority of a thread that handles the radios and their IRQs

Criticism of the above proposals:
1)

  • This is not the general idea of a bus
  • There may be future devices with more than one integrated peripherals on an internal SPI-bus
  • Could lead to timing difficulties
  • Time critical actions could lead to problems due to slower and unpredictable task switching time spans
  • Note: maxACK-timeout is in the range of 700-900us in 2.5Ghz devices (information source?)
  • Could introduces additional overhead and task switches
  • Could also introduce timing errors
  • Can miss deadlines for ACK packets cause it may take too long to re-enable the IRQs
  • The ISR-handler would have to decide accordingly to the current driver state
  • For some states there are several possible results
  • Could introduces additional overhead and task switches
  • ?
@PeterKietzmann PeterKietzmann added Area: drivers Area: Device drivers NSTF labels Apr 9, 2015
@PeterKietzmann
Copy link
Member Author

@jremmert-phytec-iot , @haukepetersen , @gebart would be nice if you can check and extend the description if I missed relevant informations!

@PeterKietzmann
Copy link
Member Author

Should we add the "bug" or "enhancement" label?

@jonas-rem
Copy link
Contributor

I discussed that topic with Johann today. We came to the conclusion that 1) could be the a reasonable solution for devices with an exclusive SPI interface for the radio due to the following reasons:

  • The KW2x device is a SiP that integrates the radio in the same package as the MCU; there are two seperate SPI, one for periph and one for the radio. In this SiP exclusive access is guaranted. As there is no dedicated radio-module available, we currently do not see the necessity for an other solution that has significant drawbacks.
  • SPI aquire is not neccessary in this configuration and will be omitted.
  • Due to exclusive access in hardware (splitted hardware interfaces) the interrupts doesn't need to be disabled during SPI read/write. Only the GPIO-ISR source from the driver will be masked during SPI read/write. We don't see any drawbacks from that.
  • Potential drawback: This solution will block the SPI-Bus for peripheral reasons other than the radio in future devices with one shared SPI (Radio+Periph).

Maybe @haukepetersen will have more ideas for devices without exclusive access when he is implementing the atmel driver.

Calculation of the maxACK-timeout (This could be an important fact for the final decision):
In [1] listed as macAckWaitDuration:
macAckWaitDuration = aUnitBackoffPeriod + aTurnaroundTime + phySHRDuration + ceiling ( 6 × phySymbolsPerOctet )

Whereas:

  • aUnitBackoffPeriod = 20 * Symbol_length (all PHY)
  • aTurnaroundTime = 12 * Symbol_length (all PHY)
  • phySHRDuration = Preamble_Duration + SFD_Duration) = (8+2) * Symbol_length (O-QPSK PHY, 250kps)
  • ceiling (6 x phySymbolsPerOctet) = ceiling(12) = 12
  • Symbol_length = 16µs (O-QPSK PHY, 250kbps)

macAckWaitDuration = 320µs + 192µs + 160µs + 192µs = 864µs

[1] https://standards.ieee.org/getieee802/download/802.15.4-2011.pdf

@PeterKietzmann
Copy link
Member Author

@jremmert-phytec-iot thanks for the detailed response and the link. In your last point, what do you mean with "block the SPI-Bus for peripheral reasons "? Is it "hardware-blocking" so no other device can use the bus or what was your suggestion?
I remember there was a discussion about a radio with an onboard BLE and 802.15.4 transceiver. Unfortunately I forgot the name. This could be a potential candidate for a device with more than one integrated components on one internal SPI-bus. @gebart did you have some hardware in mind when stating this problem?

@jonas-rem
Copy link
Contributor

Oh true, that sentence was confusing.
In the suggested method the radio must have exclusive SPI-bus access. Consequently, if using method 1) devices with just one SPI can not handling other peripheral SPI-devices because this SPI is reserved (blocked) for the radio.

@jnohlgard
Copy link
Member

@PeterKietzmann I did not have any particular hardware in mind, at least the kw2x radio will work without locking as long as only one single thread is ever used to communicate with it.

The most likely issue that I see is when an SPI transfer is already in progress from thread context and an interrupt occurs and the ISR wants to perform an SPI transfer as well, which may or may not work correctly, depending on the implementation and SPI hardware used.

One example off the top of my head: If using the hardware CS feature of the Kinetis processors the kw2x driver can spinlock inside the ISR until the SPI status register says there is free space in the TX buffer and then perform the transfer, but this will probably break if using software controlled CS from thread context as the CS pin will already be asserted when the ISR wants to assert it, and the radio will likely ignore the command.

Even if using hardware CS it is still a problem to figure out which of the RX bytes belong to which transfer, since the ISR does not know anything about the transfer that was in progress when the interrupt came.

Thinking about this I see only two possible solutions to this:

  • Never do SPI transfers from interrupt context
  • or, disable (some) interrupts while performing SPI transfers in thread context. This means that all drivers on the same bus must disable this IRQ, not just the radio driver itself. (think of the case of a radio and a temperature sensor on the same bus, the temperature sensor driver must disable the radio IRQ every time it does a transfer, ugly cross-dependency problems occur)

@haukepetersen
Copy link
Contributor

After looking a little bit into the Atmel AT86RF2xx driver, I think I tend to actually go with solution 2). This makes sure there is no colliding bus access and we don't prevent the use of multiple devices on the bus.

Correct me if I am wrong, but at least for 802.15.4 devices I think all of them support hardware auto-acking, right? So the sending of ACKs we can just leave to the hardware and therefore do not have to worry about timings.

A workaround for other time critical events could be to use a timestamp (brainstorming here): in the interrupt routine save the value of hwtimer_now() into a field in the device descriptor, and once we are in thread context we can at least compensate for any jittering delays.

What do you think?

@jonas-rem
Copy link
Contributor

Ok seems so, then for Ack packets we should be fine. They do not throw an interrupt at all and are handled completely in HW.

I have thought about the time-span e.g. between an incoming Beacon and the outgoing response. As far as I understand, the timing here is not that important. There is even a minimum time spacing time, LIFS (Long interframe Spacing Period), which is 40symbols (640µs).

The timestamps could be essential to enable slottet CSMA. An other possibility beside the hwtimer_now() timestamping could be to use the hardware timestamping, some IEEE802.15.4 radios provide. Since the timestamps are stored in the radios internal registers, they can also be read out after a certain delay.

But I am not sure if I got everything correct. E.g. what happens if the message queue is partly or complete filled with TX packets from upper layer. As far as I understand, the msg queue is a LIFO? As soon as the mac-layer thread completes the current action (executes code until it reaches msg_recv() the next time), it will process the most recent msg; which is the pending interrupt message. All other network stack related threads running with lower priority than the mac-thread.
For info: The above was partly confusion of mine, partly an design issue when changing the implementation of the csma_mac layer. The message queue in msg.c should be implemented as FIFO. Under normal use cases the MAC layer-thread will not not be swamped with messages. The radio transceiver should be fast enough to send the messages out faster than they are coming in. However if the radio is not fast enough, the mac layer can receive the message and push the message pointer to the packetqueue interface. If this is full upper layer are automatially informed about this. Consequently message dropping should not be necessary.

If that summary of my thoughts is correct, I think I am fine ;).

@OlegHahm
Copy link
Member

Correct me if I am wrong, but at least for 802.15.4 devices I think all of them support hardware auto-acking, right?

At least if I think of very cheap transceivers, that may support 802.15.4 PHY, but are not necessarily designed for 802.15.4 L2 this may not hold for every device.

@haukepetersen
Copy link
Contributor

Then these devices would have to handle that in their IRQ with all the implications that come with it for their SPI bus usage. It is still possible to do this - we should just try not to do it if not really really needed.

@PeterKietzmann PeterKietzmann added Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation and removed NSTF labels May 21, 2015
@OlegHahm OlegHahm modified the milestone: Release 2015.06 May 28, 2015
@OlegHahm
Copy link
Member

OlegHahm commented Sep 2, 2015

Any final conclusion?

@kaspar030
Copy link
Contributor

didn't we fix most drivers? @haukepetersen?

@haukepetersen
Copy link
Contributor

IMHO this is solved by (i) coding convention not to use SPI in interrupt context and (ii) the introduction of spi_acquire and spi_release. -> close

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: drivers Area: Device drivers Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation
Projects
None yet
Development

No branches or pull requests

6 participants