New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
This patch is to deal with issue #617 Radio Race Condition. #758
Conversation
The issue is that if the RF233 has a frame in its memory and starts receiving another frame, the second one can overwrite the first. If software does not read the first one out fast enough, this can lead to corrupted packets. While the radio has some hardware features to try to protect against this (Sec 11.8 of manual, Dynamic Frame Buffer Protection), its use requires a very specific usage model of the SPI bus that Tock's SPI abstraction does not support: it requires that the software read a single byte from packet memory over the SPI bus, and without releasing the chip select line, read the rest of the packet (based on the value of this first byte). This patch explicitly disables reception after a packet is received, and re-enables it after the packet is successfully read out of the radio. It's been tested with the ieee802154 test applications, including at higher packet transmit rates. However, it is possible that there are edge cases on interleaved RX/TX events that might put the radio into a wedged state. I'll do a careful read-over of the code again in a few days once it's cleared from my head.
I ran the logic analyzer and the stack is definitely transitioning to PLL_ON after the interrupt but before reading the packet, and transitioning to RX_AACK_ON after reading the packet. |
I pulled this commit into ptcrews/sixlowpan_library branch and ran it with the test suite. When i run it with the test suite that has built in delays between frames, the test suite runs as expected and reports success. When I run it with the test suite with delays removed, the receiving imix hangs (as it misses fragments where before it was overwriting the previous buffer, and therefore never receives an entire packet). This leads me to believe that this fix works as advertised. I suppose a better test would be the sixlowpan test suite but with link layer acks, but no such test suite exists as of now. |
Cool — link layer acks are pretty critical. I’d suggest just putting a thin shim on top of the radio that retransmits (a fixed number of times) if a frame isn’t acknowledged.
Phil
sorry shift broken sometimes
… On Feb 28, 2018, at 5:34 PM, hudson-ayers ***@***.***> wrote:
I pulled this commit into ptcrews/sixlowpan_library branch and ran it with the test suite. When i run it with the test suite that has built in delays between frames, the test suite runs as expected and reports success. When I run it with the test suite with delays removed, the receiving imix hangs (as it misses fragments where before it was overwriting the previous buffer, and therefore never receives an entire packet). This leads me to believe that this fix works as advertised. I suppose a better test would be the sixlowpan test suite but with link layer acks, but no such test suite exists as of now.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
OK -- if you're good with it, can you merge it? |
85: Imix app: dont print error for enoack r=ppannuto a=hudson-ayers Currently, the Imix test app will print `Error sending packet -13` if a receiving device is not also active nearby and acking packets sent by the sender. This is a result of the 15.4 stack now correctly returning ENOACK when packets are transmitted successfully but not Acked (tock/tock#758). This PR fixes the Imix app to not indicate an error if TOCK_ENOACK is returned when sending a 15.4 packet. It also removes the "kernel" comment to reflect that LED(0) is no longer the led labeled "kernel" as that LED is no longer accessible from userspace. Co-authored-by: Hudson Ayers <hayers@stanford.edu>
85: Imix app: dont print error for enoack r=ppannuto a=hudson-ayers Currently, the Imix test app will print `Error sending packet -13` if a receiving device is not also active nearby and acking packets sent by the sender. This is a result of the 15.4 stack now correctly returning ENOACK when packets are transmitted successfully but not Acked (tock/tock#758). This PR fixes the Imix app to not indicate an error if TOCK_ENOACK is returned when sending a 15.4 packet. It also removes the "kernel" comment to reflect that LED(0) is no longer the led labeled "kernel" as that LED is no longer accessible from userspace. Co-authored-by: Hudson Ayers <hayers@stanford.edu>
Pull Request Overview
The issue is that if the RF233 has a frame in its memory and starts
receiving another frame, the second one can overwrite the first. If software
does not read the first one out fast enough, this can lead to corrupted
packets. While the radio has some hardware features to try to protect
against this (Sec 11.8 of manual, Dynamic Frame Buffer Protection),
its use requires a very specific usage model of the SPI bus that Tock's
SPI abstraction does not support: it requires that the software read a
single byte from packet memory over the SPI bus, and without releasing
the chip select line, read the rest of the packet (based on the value of
this first byte).
This patch explicitly disables reception after a packet is received (transition to PLL_ON state), and
re-enables it after the packet is successfully read out of the radio
This also fixes a bug in the 802.15.4 userland in which TOCK_ENOACK was not being returned when a packet is not acknowledged.
Testing Strategy
It's been tested with the ieee802154 test applications, including at
higher packet transmit rates. However, it is possible that there are
edge cases on interleaved RX/TX events that might put the radio into
a wedged state. I'll do a careful read-over of the code again in a few
days once it's cleared from my head.
TODO or Help Wanted
It needs me to take a look over the state transitions again, especially what happens if interrupts arrive while in the intermediate turning-on, turning-off states.
Anyone with experience with the radio stack taking a look would also be great.
Documentation Updated
/docs
have been updated or no updates are required.Formatting
make formatall
has been run.