Commit 102b55e
net: sched: fix tx action rescheduling issue during deactivation
Currently qdisc_run() checks STATE_DEACTIVATED of a lockless
qdisc before calling __qdisc_run(), which ultimately clears
STATE_MISSED when all the skbs are dequeued. If STATE_DEACTIVATED
is set before STATE_MISSED is cleared, net_tx_action() may be
rescheduled at the end of qdisc_run_end(), see below:
CPU0(net_tx_action)  CPU1(__dev_xmit_skb)  CPU2(dev_deactivate)
          .                       .                     .
          .            set STATE_MISSED                 .
          .           __netif_schedule()                .
          .                       .           set STATE_DEACTIVATED
          .                       .                qdisc_reset()
          .                       .                     .
          .<---------------       .              synchronize_net()
clear __QDISC_STATE_SCHED  |      .                     .
          .                |      .                     .
          .                |      .            some_qdisc_is_busy()
          .                |      .               return *false*
          .                |      .                     .
  test STATE_DEACTIVATED   |      .                     .
__qdisc_run() *not* called |      .                     .
          .                |      .                     .
   test STATE_MISSED       |      .                     .
 __netif_schedule()--------|      .                     .
          .                       .                     .
          .                       .                     .
__qdisc_run() is not called by net_tx_action() on CPU0 because
CPU2 has set the STATE_DEACTIVATED flag during dev_deactivate(). As
STATE_MISSED is only cleared in __qdisc_run(), __netif_schedule()
is called at the end of qdisc_run_end(), causing the tx action
rescheduling problem.
qdisc_run() called by net_tx_action() runs in softirq context,
which should have the same semantics as the qdisc_run() called by
__dev_xmit_skb() under rcu_read_lock_bh(). Since there is a
synchronize_net() between the STATE_DEACTIVATED flag being set and
qdisc_reset()/some_qdisc_is_busy() in dev_deactivate(), we can safely
bail out for a deactivated lockless qdisc in net_tx_action(), and
qdisc_reset() will reset all skbs not dequeued yet.
So add rcu_read_lock() explicitly to protect qdisc_run()
and do the STATE_DEACTIVATED check in net_tx_action() before
calling qdisc_run_begin(). Another option is to do the check in
qdisc_run_end(), but that would add unnecessary overhead for the
non-tx_action case: __dev_queue_xmit() will not see a qdisc
with STATE_DEACTIVATED after synchronize_net(), so a qdisc with
STATE_DEACTIVATED can only be seen by net_tx_action(), because of
__netif_schedule().
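The effect of moving the check can be illustrated with a simplified, single-threaded userspace model. This is only a sketch under stated assumptions, not the kernel code: every `model_` / `MODEL_` name is hypothetical, locking and RCU are left out, and the "clear STATE_MISSED, then reschedule" ordering in the modelled qdisc_run_end() is assumed from the V8 note in this log.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: bit names follow the commit message, values are arbitrary. */
enum {
	MODEL_STATE_SCHED       = 1u << 0,
	MODEL_STATE_MISSED      = 1u << 1,
	MODEL_STATE_DEACTIVATED = 1u << 2,
};

struct model_qdisc {
	atomic_uint state;
	int queued;       /* skbs still waiting in the qdisc */
	int reschedules;  /* how often the modelled __netif_schedule() fired */
};

static void model_qdisc_init(struct model_qdisc *q, unsigned int state, int queued)
{
	atomic_init(&q->state, state);
	q->queued = queued;
	q->reschedules = 0;
}

/* Models __netif_schedule(): queue the qdisc unless it is already queued. */
static void model_netif_schedule(struct model_qdisc *q)
{
	unsigned int prev = atomic_fetch_or(&q->state, MODEL_STATE_SCHED);

	if (!(prev & MODEL_STATE_SCHED))
		q->reschedules++;
}

/* Models qdisc_run_end(): a pending STATE_MISSED triggers a reschedule.
 * Clearing the bit first is an assumption taken from the V8 note.
 */
static void model_qdisc_run_end(struct model_qdisc *q)
{
	if (atomic_load(&q->state) & MODEL_STATE_MISSED) {
		atomic_fetch_and(&q->state, ~(unsigned int)MODEL_STATE_MISSED);
		model_netif_schedule(q);
	}
}

/* Models __qdisc_run(): dequeue everything, which clears STATE_MISSED. */
static void model_qdisc_dequeue_all(struct model_qdisc *q)
{
	q->queued = 0;
	atomic_fetch_and(&q->state, ~(unsigned int)MODEL_STATE_MISSED);
}

/* Pre-fix flow: the DEACTIVATED test sits inside qdisc_run(). */
static void model_tx_action_old(struct model_qdisc *q)
{
	assert(atomic_load(&q->state) & MODEL_STATE_SCHED);
	atomic_fetch_and(&q->state, ~(unsigned int)MODEL_STATE_SCHED);

	if (!(atomic_load(&q->state) & MODEL_STATE_DEACTIVATED))
		model_qdisc_dequeue_all(q);
	model_qdisc_run_end(q); /* still sees STATE_MISSED when deactivated */
}

/* Post-fix flow: test DEACTIVATED before entering qdisc_run() at all. */
static void model_tx_action_new(struct model_qdisc *q)
{
	assert(atomic_load(&q->state) & MODEL_STATE_SCHED);

	if (atomic_load(&q->state) & MODEL_STATE_DEACTIVATED) {
		/* Bail out early; the reset path drops the remaining skbs. */
		atomic_fetch_and(&q->state, ~(unsigned int)MODEL_STATE_SCHED);
		return;
	}

	atomic_fetch_and(&q->state, ~(unsigned int)MODEL_STATE_SCHED);
	model_qdisc_dequeue_all(q);
	model_qdisc_run_end(q);
}
```

Starting both flows from the state SCHED | MISSED | DEACTIVATED, the old flow skips the dequeue but still reaches the modelled qdisc_run_end() and reschedules once, while the new flow returns before any of that runs.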
The STATE_DEACTIVATED check in qdisc_run() exists to avoid a race
between net_tx_action() and qdisc_reset(), see:
commit d518d2e ("net/sched: fix race between deactivation
and dequeue for NOLOCK qdisc"). As the bailout added above for a
deactivated lockless qdisc in net_tx_action() provides better
protection against the race without calling qdisc_run() at all,
remove the STATE_DEACTIVATED check in qdisc_run().
After qdisc_reset(), there is no skb left in the qdisc to be dequeued,
so clear STATE_MISSED in dev_reset_queue() too.
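As a rough illustration of this last step, the following userspace sketch models the reset path. It is not the kernel implementation: the `model_` / `MODEL_` names are hypothetical, locking is omitted, and the precondition that STATE_DEACTIVATED is already set relies on the ordering described in this log (flag set, synchronize_net(), then reset).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model: bit names follow the commit message, values are arbitrary. */
enum {
	MODEL_STATE_MISSED      = 1u << 1,
	MODEL_STATE_DEACTIVATED = 1u << 2,
};

struct model_qdisc {
	atomic_uint state;
	int queued; /* skbs still waiting in the qdisc */
};

/* Models qdisc_reset(): drop every skb that is still queued. */
static void model_qdisc_reset(struct model_qdisc *q)
{
	q->queued = 0;
}

/* Models the reset step for a lockless qdisc during deactivation. */
static void model_dev_reset_queue(struct model_qdisc *q)
{
	/* Assumed precondition: the qdisc was flagged as deactivated earlier. */
	assert(atomic_load(&q->state) & MODEL_STATE_DEACTIVATED);

	model_qdisc_reset(q);

	/* Nothing is left to dequeue, so a stale "missed" hint is dropped too. */
	atomic_fetch_and(&q->state, ~(unsigned int)MODEL_STATE_MISSED);
}
```

Without the extra clear, the modelled qdisc would keep STATE_MISSED set after the reset even though it holds no skb that could justify another run.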
Fixes: 6b3ba91 ("net: sched: allow qdiscs to handle locking")
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
V8: Clearing STATE_MISSED before calling __netif_schedule() has
avoided the endless rescheduling problem, but there may still
be an unnecessary rescheduling, so adjust the commit log.
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent a90c57f commit 102b55e
3 files changed, 26 insertions(+), 11 deletions(-)
Diff content was not captured; only the hunk ranges are recoverable:
File 1: @@ -128,12 +128,7 @@ (1 addition, 6 deletions)
File 2: @@ -5025,25 +5025,43 @@ (22 additions, 4 deletions)
File 3: @@ -1177,8 +1177,10 @@ (3 additions, 1 deletion)