New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

This patch is to deal with issue #617 Radio Race Condition. #758

Merged
merged 3 commits into from Mar 1, 2018

Conversation

Projects
None yet
2 participants
@phil-levis
Collaborator

phil-levis commented Feb 28, 2018

Pull Request Overview

The issue is that if the RF233 has a frame in its memory and starts
receiving another frame, the second one can overwrite the first. If software
does not read the first one out fast enough, this can lead to corrupted
packets. While the radio has some hardware features to try to protect
against this (Sec 11.8 of manual, Dynamic Frame Buffer Protection),
its use requires a very specific usage model of the SPI bus that Tock's
SPI abstraction does not support: it requires that the software read a
single byte from packet memory over the SPI bus, and without releasing
the chip select line, read the rest of the packet (based on the value of
this first byte).

This patch explicitly disables reception after a packet is received (transition to PLL_ON state), and
re-enables it after the packet is successfully read out of the radio

This also fixes a bug in the 802.15.4 userland in which TOCK_ENOACK was not being returned when a packet is not acknowledged.

Testing Strategy

It's been tested with the ieee802154 test applications, including at
higher packet transmit rates. However, it is possible that there are
edge cases on interleaved RX/TX events that might put the radio into
a wedged state. I'll do a careful read-over of the code again in a few
days once it's cleared from my head.

TODO or Help Wanted

It needs me to take a look over the state transitions again, especially what happens if interrupts arrive while in the intermediate turning-on, turning-off states.

Anyone with experience with the radio stack taking a look would also be great.

Documentation Updated

  • Kernel: The relevant files in /docs have been updated or no updates are required.
  • Userland: The application README has been added, updated, or no updates are required.

Formatting

  • make formatall has been run.

phil-levis added some commits Feb 28, 2018

This patch is to deal with issue #617 Radio Race Condition.
The issue is that if the RF233 has a frame in its memory and starts
receiving another frame, the second one can overwrite the first. If software
does not read the first one out fast enough, this can lead to corrupted
packets. While the radio has some hardware features to try to protect
against this (Sec 11.8 of manual, Dynamic Frame Buffer Protection),
its use requires a very specific usage model of the SPI bus that Tock's
SPI abstraction does not support: it requires that the software read a
single byte from packet memory over the SPI bus, and without releasing
the chip select line, read the rest of the packet (based on the value of
this first byte).

This patch explicitly disables reception after a packet is received, and
re-enables it after the packet is successfully read out of the radio.

It's been tested with the ieee802154 test applications, including at
higher packet transmit rates. However, it is possible that there are
edge cases on interleaved RX/TX events that might put the radio into
a wedged state. I'll do a careful read-over of the code again in a few
days once it's cleared from my head.
@phil-levis

This comment has been minimized.

Show comment
Hide comment
@phil-levis

phil-levis Feb 28, 2018

Collaborator

I ran the logic analyzer and the stack is definitely transitioning to PLL_ON after the interrupt but before reading the packet, and transitioning to RX_AACK_ON after reading the packet.

Collaborator

phil-levis commented Feb 28, 2018

I ran the logic analyzer and the stack is definitely transitioning to PLL_ON after the interrupt but before reading the packet, and transitioning to RX_AACK_ON after reading the packet.

@hudson-ayers

This comment has been minimized.

Show comment
Hide comment
@hudson-ayers

hudson-ayers Mar 1, 2018

Collaborator

I pulled this commit into ptcrews/sixlowpan_library branch and ran it with the test suite. When i run it with the test suite that has built in delays between frames, the test suite runs as expected and reports success. When I run it with the test suite with delays removed, the receiving imix hangs (as it misses fragments where before it was overwriting the previous buffer, and therefore never receives an entire packet). This leads me to believe that this fix works as advertised. I suppose a better test would be the sixlowpan test suite but with link layer acks, but no such test suite exists as of now.

Collaborator

hudson-ayers commented Mar 1, 2018

I pulled this commit into ptcrews/sixlowpan_library branch and ran it with the test suite. When i run it with the test suite that has built in delays between frames, the test suite runs as expected and reports success. When I run it with the test suite with delays removed, the receiving imix hangs (as it misses fragments where before it was overwriting the previous buffer, and therefore never receives an entire packet). This leads me to believe that this fix works as advertised. I suppose a better test would be the sixlowpan test suite but with link layer acks, but no such test suite exists as of now.

@phil-levis

This comment has been minimized.

Show comment
Hide comment
@phil-levis

phil-levis Mar 1, 2018

Collaborator
Collaborator

phil-levis commented Mar 1, 2018

@phil-levis phil-levis added the P-Upkeep label Mar 1, 2018

@phil-levis

This comment has been minimized.

Show comment
Hide comment
@phil-levis

phil-levis Mar 1, 2018

Collaborator

OK -- if you're good with it, can you merge it?

Collaborator

phil-levis commented Mar 1, 2018

OK -- if you're good with it, can you merge it?

@hudson-ayers hudson-ayers merged commit e2ec7de into master Mar 1, 2018

2 of 3 checks passed

deploy/netlify Deploy preview failed.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@hudson-ayers hudson-ayers deleted the rf233-issue-617 branch Mar 1, 2018

@bradjc bradjc referenced this pull request Mar 5, 2018

Closed

Radio Race Condition #617

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment