Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

at86rf2xx: introduce pending TX counter #5261

Merged
merged 1 commit into from
Apr 8, 2016

Conversation

OlegHahm
Copy link
Member

@OlegHahm OlegHahm commented Apr 7, 2016

This is another (better) workaround to solve #5257 compared to #5258.

The problem is that one might want to send several packets back to back. The netdev2 event loop will now queue these sending requests in the message queue. Once the first TX is completed it will queue another event in the queue. Once this is handled it would reset the state to idle (RX) again. Now the next TX completed event is handled and interpret the IRQ as an RX event, because it was received in RX state.

This PR counts the currently pending TX calls to only return to idle (RX) after all TX calls are handled.

A better solution for this would be to modify the driver to use thread_flags instead of msg, but if we cannot find the time to include this for the release, I would propose to take this intermediate solution.

@OlegHahm OlegHahm added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Area: network Area: Networking Area: drivers Area: Device drivers ugly labels Apr 7, 2016
@OlegHahm OlegHahm added this to the Release 2016.04 milestone Apr 7, 2016
at86rf2xx_set_state(dev, dev->idle_state);
/* check for more pending TX calls and return to idle state if
* there are none */
if (--dev->pending_TX <= 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An unsigned variable is unlikely to be less than 0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smartass! ;-b

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

@OlegHahm OlegHahm added the CI: needs squashing Commits in this PR need to be squashed; If set, CI systems will mark this PR as unmergable label Apr 7, 2016
@@ -163,6 +163,9 @@ typedef struct {
uint8_t page; /**< currently used channel page */
#endif
uint8_t idle_state; /**< state to return to after sending */
uint8_t pending_TX; /**< keep track of pending TX calls
this is required to known when to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/known/know/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and please TX->tx

@kaspar030
Copy link
Contributor

This also fixes #4725. Thanks Oleg for looking into it! I would ACK but maybe @authmillenon should take another look?

@kaspar030 kaspar030 assigned miri64 and unassigned kaspar030 Apr 7, 2016
@thomaseichinger
Copy link
Member

I tried to bend my head around this but can't see the situation described above.
Taken two packets get queued for the MAC event loop: The first packet would take the usual way through at86rf2xx_send. After that the next message will be dequeued and this at86rf2xx_send will poll in at86rf2xx_tx_prepare on the busy ongoing transmission. Then the first interrupt occurs and resets the state to RX. This saved and the TX procedure starts over. To me it is not possible two radio interrupts queue up.
Do I have a mishap in my code path? Where? Especially as this workaround reportedly solves this according to @OlegHahm and @kaspar030.

@OlegHahm
Copy link
Member Author

OlegHahm commented Apr 7, 2016

@thomaseichinger, the state is reset to RX in netdev2->isr(). This functions is called in the same thread as at86rf2xx_tx_prepare(). Hence, the first at86rf2xx_tx_prepare() goes through without problems, then the next at86rf2xx_tx_prepare() has to wait for the transceiver (not the driver) leaves the TX busy state again. After the first packet has been sent, it continues. In the meantime, both NETDEV2_EVENT_ISR got queued (because the thread was still busy/blocked with at86rf2xx_tx_prepare()). Now the first event is handled and resets the driver's state to RX. Then the second event is handled and, boom, thinks the transceiver would be in RX state and handles the incoming interrupt as a reception.

More understandable?

@OlegHahm
Copy link
Member Author

OlegHahm commented Apr 7, 2016

Addressed comments.

Btw. thanks to @haukepetersen for his hints while debugging.

@thomaseichinger
Copy link
Member

Aah! _isr doesn't get executed in interrupt context. @OlegHahm Thank you for pointing this out.

@OlegHahm
Copy link
Member Author

OlegHahm commented Apr 7, 2016

Yes, I know, it's a bit confusing that a function named ISR is not called in interrupt context. ;)

@thomaseichinger
Copy link
Member

It is, we might should consider renaming it for clarity. Despite of this, ugliness aside, I like this change because it massively saves time we would poll against state changes.

@miri64
Copy link
Member

miri64 commented Apr 8, 2016

I second @kaspar030's ACK.

@OlegHahm
Copy link
Member Author

OlegHahm commented Apr 8, 2016

wait a second - reading @haukepetersen's thoughts in #5257 I wonder what happens if the message queue is too small (i.e. too many sending calls arrive too quickly). Wouldn't we fail to handle the isr in that case?

@haukepetersen
Copy link
Contributor

I think not, as we still actively block in _tx_prepare(), so the queue should not be filled up.

@OlegHahm
Copy link
Member Author

OlegHahm commented Apr 8, 2016

Hm, yes, I guess you're right.

@OlegHahm
Copy link
Member Author

OlegHahm commented Apr 8, 2016

Should I squash then?

@haukepetersen
Copy link
Contributor

let's see, two possible orders:

  • pkt_snd msg from upper layer (queue fill: 1)
  • msg is handled (qf: 0)
  • tx_prep does not block, tx ongoing, thread back to msg_receive
  • pkt_snd msg from upper layer (qf: 1)
  • TX_DONE isr msg (qf: 2)
  • pkt_snd msg is handled (qf: 1)
  • tx_prep does not block, tx ongoing, thread back to msg_receive
  • pkt_send msg from above (qf: 2)
  • pkt_send msg is handled (qf: 1)
  • tx_prep blocks, as the tx is ongoing
  • TX_DONE isr msg (qf: 2)
  • tx_prep unblocks, back to msg_receive
  • first TX_DONE handled
    ...

So it seems to me, that we always actively block the thread before we can overfill the msg queue.

@haukepetersen
Copy link
Contributor

yes, please

@OlegHahm
Copy link
Member Author

OlegHahm commented Apr 8, 2016

At least if our queue can hold more than one packet.

This counter is necessary for the current concept to tell the driver when to return to idle after sending.
@OlegHahm OlegHahm added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: needs squashing Commits in this PR need to be squashed; If set, CI systems will mark this PR as unmergable labels Apr 8, 2016
@OlegHahm
Copy link
Member Author

OlegHahm commented Apr 8, 2016

squashed

@haukepetersen
Copy link
Contributor

At least if our queue can hold more than one packet.

true :-) So our current setting with 8 packets should be sufficient

@OlegHahm OlegHahm added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Apr 8, 2016
@OlegHahm
Copy link
Member Author

OlegHahm commented Apr 8, 2016

Apparently OlegHahm added Ready for CI build and removed NEEDS SQUASHING labels an hour ago was not enough for Murdock.

@OlegHahm OlegHahm merged commit e2cb553 into RIOT-OS:master Apr 8, 2016
@OlegHahm OlegHahm deleted the at86rf2xx_pending_TX branch April 8, 2016 11:08
@miri64
Copy link
Member

miri64 commented Apr 25, 2016

This PR made lwIP not able to receive anymore (the interrupt on receive isn't even fired). Any idea why?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: drivers Area: Device drivers Area: network Area: Networking CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants