APB: Check your mt7612u and mt7610u based adapters with Linux kernel 6.1 rc* #142

Open
morrownr opened this issue Oct 31, 2022 · 38 comments

Comments

@morrownr
Owner

morrownr commented Oct 31, 2022

The reason for this message is that I am seeing a system lockup when trying to use mt7612u based adapters. Yes, device drivers can do that. I saw it in kernel 6.1 rc1 and am still seeing it in 6.1 rc3. I would like confirmation from others. This only started with kernel 6.1.

Thanks,

Nick

@morrownr
Owner Author

morrownr commented Oct 31, 2022

Note: This could be something else going on in this specific test system so I am asking others that have the ability to upgrade to kernel 6.1 to test before I spend time gathering info to file a bug report.

@lestcape

I have that problem too, but I thought it was caused by another patch I was testing. Can this be related somehow?

https://gitlab.freedesktop.org/drm/amd/-/issues/2236

@morrownr
Owner Author

morrownr commented Oct 31, 2022

@lestcape

Can this be related somehow?

I don't know but I don't think so. The problem I am seeing is only with adapters based on the mt7612u and mt7610u chipsets. My adapter with a mt7921au chipset is working better than ever on the same kernel 6.1. In fact, I have several adapters with various chipsets and I am only seeing this lock up problem with mt7612u and mt7610u chipset based adapters on kernel 6.1.

Hopefully some other folks will roll in with additional information so we can see if this is widespread and specific to the chipset and kernel we are talking about. There could have been a patch that is not working well with the mt761Xu chipsets.

@bjlockie

bjlockie commented Nov 1, 2022

This was before I associated:

$ uname -a
Linux me-aspiretc281 6.1.0-060100rc3-generic #202210301931 SMP PREEMPT_DYNAMIC Sun Oct 30 23:40:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

[  354.214752] usb 5-3: New USB device strings: Mfr=2, Product=3, SerialNumber=4
[  354.214758] usb 5-3: Product: Wireless 
[  354.214762] usb 5-3: Manufacturer: MediaTek Inc.
[  354.214766] usb 5-3: SerialNumber: 000000000
[  354.488160] usb 5-3: reset SuperSpeed USB device number 3 using xhci_hcd
[  354.514207] mt76x2u 5-3:1.0: ASIC revision: 76120044
[  354.660293] mt76x2u 5-3:1.0: ROM patch build: 20141115060606a
[  354.876618] mt76x2u 5-3:1.0: Firmware Version: 0.0.00
[  354.876632] mt76x2u 5-3:1.0: Build: 1
[  354.876637] mt76x2u 5-3:1.0: Build Time: 201507311614____
[  356.321129] ieee80211 phy1: Selected rate control algorithm 'minstrel_ht'
[  356.323917] usbcore: registered new interface driver mt76x2u
[  356.424442] mt76x2u 5-3:1.0 wlx00c0caaa9c2b: renamed from wlan0

That is the Alfa.

I associated and the system locked up within seconds.

@morrownr
Owner Author

morrownr commented Nov 1, 2022

I associated and the system locked up within seconds.

Exactly what I am seeing. Within seconds of association, hard lock, pull the plug.

The only combination I am seeing this with is kernel 6.1 and mt761Xu chipsets.
Distro: Mint 21 (based on Ubuntu 22.04)
Kernel 6.1 rc1-rc3
Intel i7
Did not see this problem with kernel 6.0 and earlier so it appears that a patch that went into 6.1 is the cause.

We really don't want 6.1 (LTS) released with this issue. @bjlockie , do you have time to post a [bug - mt7612u] to linux-wireless? You can reference this thread.

@morrownr
Owner Author

morrownr commented Nov 1, 2022

More testing: This time with a mt7610u chipset based adapter (Alfa ACHM to be exact).

With kernel 6.1, I am seeing the same problem: about 3-4 seconds after association with an AP, hard lock, pull the plug.

So this issue appears to not just be with the mt7612u chipset. It appears with both the mt7612u and mt7610u chipsets.
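
For anyone unsure whether their adapter falls into this group, here is a minimal sketch. It assumes the in-kernel module names mt76x2u (the mt7612u driver, as seen in the dmesg output earlier in the thread) and mt76x0u (the mt7610u driver), and that lsmod is available. The function name is only illustrative.

```shell
#!/bin/sh
# Print any loaded mt76 USB driver modules for the two chipsets in this thread.
# Reads lsmod-style text on stdin so it can also be fed a saved listing.
affected_modules() {
    awk '$1 == "mt76x2u" || $1 == "mt76x0u" { print $1 }'
}

# Only query the running system if lsmod exists.
if command -v lsmod >/dev/null 2>&1; then
    lsmod 2>/dev/null | affected_modules || true
fi
```

No output means neither module is currently loaded, which is expected if the adapter is unplugged.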

@morrownr
Owner Author

morrownr commented Nov 1, 2022

@bjlockie

Thanks for posting the bug. I guess with recent guidance it should have been [BUG] wifi: MT761Xu.

It sure looks to me at this point that some patch for 6.1 is causing a problem.

@morrownr
Owner Author

morrownr commented Nov 6, 2022

@bjlockie

This morning I saw the following patch on linux-wireless. I see it is cc'ed to you but others might be interested. We really need this problem fixed before 6.1 is released.

[PATCH wireless] wifi: mac80211: fix possible oob access in ieee80211_get_rate_duration
Lorenzo Bianconi <lorenzo@kernel.org> Sun, Nov 6, 2022 at 4:30 AM
To: linux-wireless@vger.kernel.org
Cc: bjlockie@lockie.ca, toke@toke.dk, johannes@sipsolutions.net, nbd@nbd.name
Fix possible out-of-bound access in ieee80211_get_rate_duration routine
as reported by the following UBSAN report:
UBSAN: array-index-out-of-bounds in net/mac80211/airtime.c:455:47
index 15 is out of range for type 'u16 [12]'
CPU: 2 PID: 217 Comm: kworker/u32:10 Not tainted 6.1.0-060100rc3-generic
Hardware name: Acer Aspire TC-281/Aspire TC-281, BIOS R01-A2 07/18/2017
Workqueue: mt76 mt76u_tx_status_data [mt76_usb]
Call Trace:
<TASK>
show_stack+0x4e/0x61
dump_stack_lvl+0x4a/0x6f
dump_stack+0x10/0x18
ubsan_epilogue+0x9/0x43
__ubsan_handle_out_of_bounds.cold+0x42/0x47
ieee80211_get_rate_duration.constprop.0+0x22f/0x2a0 [mac80211]
? ieee80211_tx_status_ext+0x32e/0x640 [mac80211]
ieee80211_calc_rx_airtime+0xda/0x120 [mac80211]
ieee80211_calc_tx_airtime+0xb4/0x100 [mac80211]
mt76x02_send_tx_status+0x266/0x480 [mt76x02_lib]
mt76x02_tx_status_data+0x52/0x80 [mt76x02_lib]
mt76u_tx_status_data+0x67/0xd0 [mt76_usb]
process_one_work+0x225/0x400
worker_thread+0x50/0x3e0
? process_one_work+0x400/0x400
kthread+0xe9/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
Reported-by: bjlockie@lockie.ca
Fixes: db3e1c40cf2f ("mac80211: Import airtime calculation code from mt76")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
net/mac80211/airtime.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/mac80211/airtime.c b/net/mac80211/airtime.c
index 2e66598fac79..4ed05988131d 100644
--- a/net/mac80211/airtime.c
+++ b/net/mac80211/airtime.c
@@ -452,6 +452,9 @@ static u32 ieee80211_get_rate_duration(struct ieee80211_hw *hw,
                        (status->encoding == RX_ENC_HE && streams > 8)))
                return 0;

+       if (WARN_ON_ONCE(idx >= MCS_GROUP_RATES))
+               return 0;
+
        duration = airtime_mcs_groups[group].duration[idx];
        duration <<= airtime_mcs_groups[group].shift;
        *overhead = 36 + (streams << 2);
--
2.38.1

morrownr changed the title from "APB: Check your mt7612u based adapter with Linux kernel 6.1 rc* (if you have time)" to "APB: Check your mt7612u and mt7610u based adapters with Linux kernel 6.1 rc*" on Nov 6, 2022
@bjlockie

bjlockie commented Nov 7, 2022

Thanks, I missed that because I delete all the patches on that list. :-)
I'll try rc4. :-)

@morrownr
Owner Author

morrownr commented Nov 7, 2022

I don't think it will be in RC4. There are follow on messages now where another guy is questioning whether mac80211 is the right place to fix the problem. I'm reading between the lines here but I get the feeling that a mac80211 patch has broken a few drivers and Lorenzo is patching mac80211 as a short term solution because the real fix may require firmware patches which come from the company. I suspect there are more than MT76 drivers that are broken. In fact, if I have time this week, I'm going to test everything I have here which is a significant job. I don't like hard lock, pull the plug situations.

@morrownr
Owner Author

@bjlockie

I just tested with 6.1 rc5 and hard lock, pull the plug. This is a nasty bug.

I haven't seen a V2 of the initial patch come through on linux-wireless. No idea what the status is.

Nick

@bjlockie

bjlockie commented Nov 15, 2022 via email

@morrownr
Owner Author

No, the patch is not in rc5. Not sure what the status is.

@morrownr morrownr closed this as completed Dec 8, 2022
@randomcodepanda

I'm using an MT7610U and updated to 6.1.1 on Arch Linux, and it's a complete lockup a few seconds after connecting, as other people have said. I'll use the LTS kernel (5.15) in the meantime, but this is probably going to hit a lot of people who use distros with newer kernels.

@bjlockie

bjlockie commented Dec 23, 2022 via email

@morrownr morrownr reopened this Dec 23, 2022
@morrownr
Owner Author

@bjlockie

I had hoped it would be fixed by now.

Me too. I'm reopening this issue so folks can see it.

Are you able to grab your old report to linux-wireless and add the above report to it while pointing out that this is a show stopper?

@morrownr
Owner Author

@bjlockie

I think the proper title for a report like this would be:

[bug report] wifi: mt76: mt7612u/mt7610u 6.1.x - hard locking systems

@morrownr
Owner Author

@bjlockie

If you are busy, I can send the report. Let me know.

@morrownr
Owner Author

@bjlockie

I have some time this afternoon so I'll go ahead and report this.

@bjlockie

bjlockie commented Dec 24, 2022 via email

@morrownr
Owner Author

No problem. It is a time of year when most of us are busy to one degree or another.

Does the patch get into a released kernel only after someone verifies that it fixes the problem?

I've been looking at the patch. The patch is not to the mt76 driver. It is a patch to mac80211 which affects almost all wireless so it has to be approved.

I'm traveling so it is not a good time for me to test the patch.

@morrownr
Owner Author

@randomcodepanda

The proposed patch has to be approved by Johannes Berg as it is not a patch to the mt76 driver. It is a patch to the mac80211 driver which is a system wide wifi driver so care has to be taken. The patch was provided by a Mediatek dev.

I can't say how much time this will take but the devs are very aware of it.

Nick

@randomcodepanda

That's great to hear @morrownr and @bjlockie , you guys do some awesome work for the Linux wifi users here.
I'll update if there's a kernel update in the meanwhile.

@morrownr
Owner Author

@bjlockie @randomcodepanda

I had time to apply the patch that needed to be tested. It worked. I could not see any side effects so we'll just have to see.

I recommend staying with kernel 6.0 or earlier until that patch works its way into 6.1 and later.

For what it's worth, that patch is not to the Mediatek driver. It is to mac80211, which is part of the wifi supporting stack. The patch was written by a Mediatek dev.

Nick

@bjlockie

bjlockie commented Dec 27, 2022 via email

@morrownr
Owner Author

It may not just affect Mediatek devices. I looked at the patch and it is not a trivial patch. We need more testing. I'm wondering if it would help if I posted a guide that would give specific instructions how to test.

@bjlockie

bjlockie commented Dec 28, 2022 via email

@morrownr
Owner Author

If mediatek devices are the only ones that cause a crash then wouldn't the developers know?

Depends on what the devs are working on and whether what caused the problem is something that came from them. This problem does not appear to have been caused by a patch from a Mediatek dev.

Can this be tested in a VM?

It probably can but I'm the wrong person to answer as I run 100% bare metal here.

@morrownr
Owner Author

V5 of the patch was submitted today. Not sure when it will be accepted:

[PATCH v5] wifi: mac80211: fix initialization of rx->link and rx->link_sta


There are some codepaths that do not initialize rx->link_sta properly. This
causes a crash in places which assume that rx->link_sta is valid if rx->sta
is valid.
One known instance is triggered by __ieee80211_rx_h_amsdu being called from
fast-rx. It results in a crash like this one:

 BUG: kernel NULL pointer dereference, address: 00000000000000a8
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page PGD 0 P4D 0
 Oops: 0002 [#1] PREEMPT SMP PTI
 CPU: 1 PID: 506 Comm: mt76-usb-rx phy Tainted: G            E      6.1.0-debian64x+1.7 #3
 Hardware name: ZOTAC ZBOX-ID92/ZBOX-IQ01/ZBOX-ID92/ZBOX-IQ01, BIOS B220P007 05/21/2014
 RIP: 0010:ieee80211_deliver_skb+0x62/0x1f0 [mac80211]
 Code: 00 48 89 04 24 e8 9e a7 c3 df 89 c0 48 03 1c c5 a0 ea 39 a1 4c 01 6b 08 48 ff 03 48
       83 7d 28 00 74 11 48 8b 45 30 48 63 55 44 <48> 83 84 d0 a8 00 00 00 01 41 8b 86 c0
       11 00 00 8d 50 fd 83 fa 01
 RSP: 0018:ffff999040803b10 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffffb9903f496480 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: ffff999040803ce0 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8d21828ac900
 R13: 000000000000004a R14: ffff8d2198ed89c0 R15: ffff8d2198ed8000
 FS:  0000000000000000(0000) GS:ffff8d24afe80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00000000000000a8 CR3: 0000000429810002 CR4: 00000000001706e0
 Call Trace:
  <TASK>
  __ieee80211_rx_h_amsdu+0x1b5/0x240 [mac80211]
  ? ieee80211_prepare_and_rx_handle+0xcdd/0x1320 [mac80211]
  ? __local_bh_enable_ip+0x3b/0xa0
  ieee80211_prepare_and_rx_handle+0xcdd/0x1320 [mac80211]
  ? prepare_transfer+0x109/0x1a0 [xhci_hcd]
  ieee80211_rx_list+0xa80/0xda0 [mac80211]
  mt76_rx_complete+0x207/0x2e0 [mt76]
  mt76_rx_poll_complete+0x357/0x5a0 [mt76]
  mt76u_rx_worker+0x4f5/0x600 [mt76_usb]
  ? mt76_get_min_avg_rssi+0x140/0x140 [mt76]
  __mt76_worker_fn+0x50/0x80 [mt76]
  kthread+0xed/0x120
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x22/0x30

Since the initialization of rx->link and rx->link_sta is rather convoluted
and duplicated in many places, clean it up by using a helper function to
set it.

Fixes: ccdde7c74ffd ("wifi: mac80211: properly implement MLO key handling")
Fixes: b320d6c456ff ("wifi: mac80211: use correct rx link_sta instead of default")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
---
v5: fix sdata link assignment, fix sdata when receiving on multiple interfaces
v4: fix regression in handling mgmt frames with AP_VLAN
v3: include crash log
v2: fix uninitialized variable
 net/mac80211/rx.c | 222 +++++++++++++++++++++-------------------------
 1 file changed, 99 insertions(+), 123 deletions(-)

diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 7e3ab6e1b28f..6ccc487bad7f 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -4049,6 +4049,58 @@ static void ieee80211_invoke_rx_handlers(struct ieee80211_rx_data *rx)
 #undef CALL_RXH
 }

+static bool
+ieee80211_rx_is_valid_sta_link_id(struct ieee80211_sta *sta, u8 link_id)
+{
+       if (!sta->mlo)
+               return false;
+
+       return !!(sta->valid_links & BIT(link_id));
+}
+
+static bool ieee80211_rx_data_set_link(struct ieee80211_rx_data *rx,
+                                      u8 link_id)
+{
+       rx->link_id = link_id;
+       rx->link = rcu_dereference(rx->sdata->link[link_id]);
+
+       if (!rx->sta || !rx->sta->sta.mlo)
+               return rx->link;
+
+       if (!ieee80211_rx_is_valid_sta_link_id(&rx->sta->sta, link_id))
+               return false;
+
+       rx->link_sta = rcu_dereference(rx->sta->link[link_id]);
+
+       return rx->link && rx->link_sta;
+}
+
+static bool ieee80211_rx_data_set_sta(struct ieee80211_rx_data *rx,
+                                     struct ieee80211_sta *pubsta,
+                                     int link_id)
+{
+       struct sta_info *sta;
+
+       sta = container_of(pubsta, struct sta_info, sta);
+
+       rx->link_id = link_id;
+       rx->sta = sta;
+
+       if (sta) {
+               rx->local = sta->sdata->local;
+               if (!rx->sdata)
+                       rx->sdata = sta->sdata;
+               rx->link_sta = &sta->deflink;
+       }
+
+       if (link_id < 0)
+               rx->link = &rx->sdata->deflink;
+       else if (!ieee80211_rx_data_set_link(rx, link_id))
+               return false;
+
+       return true;
+}
+
 /*
  * This function makes calls into the RX path, therefore
  * it has to be invoked under RCU read lock.
@@ -4057,16 +4109,19 @@ void ieee80211_release_reorder_timeout(struct sta_info *sta, int tid)
 {
        struct sk_buff_head frames;
        struct ieee80211_rx_data rx = {
-               .sta = sta,
-               .sdata = sta->sdata,
-               .local = sta->local,
                /* This is OK -- must be QoS data frame */
                .security_idx = tid,
                .seqno_idx = tid,
-               .link_id = -1,
        };
        struct tid_ampdu_rx *tid_agg_rx;
-       u8 link_id;
+       int link_id = -1;
+
+       /* FIXME: statistics won't be right with this */
+       if (sta->sta.valid_links)
+               link_id = ffs(sta->sta.valid_links) - 1;
+
+       if (!ieee80211_rx_data_set_sta(&rx, &sta->sta, link_id))
+               return;

        tid_agg_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
        if (!tid_agg_rx)
@@ -4086,10 +4141,6 @@ void ieee80211_release_reorder_timeout(struct sta_info *sta, int tid)
                };
                drv_event_callback(rx.local, rx.sdata, &event);
        }
-       /* FIXME: statistics won't be right with this */
-       link_id = sta->sta.valid_links ? ffs(sta->sta.valid_links) - 1 : 0;
-       rx.link = rcu_dereference(sta->sdata->link[link_id]);
-       rx.link_sta = rcu_dereference(sta->link[link_id]);

        ieee80211_rx_handlers(&rx, &frames);
 }
@@ -4105,7 +4156,6 @@ void ieee80211_mark_rx_ba_filtered_frames(struct ieee80211_sta *pubsta, u8 tid,
                /* This is OK -- must be QoS data frame */
                .security_idx = tid,
                .seqno_idx = tid,
-               .link_id = -1,
        };
        int i, diff;

@@ -4116,10 +4166,8 @@ void ieee80211_mark_rx_ba_filtered_frames(struct ieee80211_sta *pubsta, u8 tid,

        sta = container_of(pubsta, struct sta_info, sta);

-       rx.sta = sta;
-       rx.sdata = sta->sdata;
-       rx.link = &rx.sdata->deflink;
-       rx.local = sta->local;
+       if (!ieee80211_rx_data_set_sta(&rx, pubsta, -1))
+               return;

        rcu_read_lock();
        tid_agg_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[tid]);
@@ -4506,15 +4554,6 @@ void ieee80211_check_fast_rx_iface(struct ieee80211_sub_if_data *sdata)
        mutex_unlock(&local->sta_mtx);
 }

-static bool
-ieee80211_rx_is_valid_sta_link_id(struct ieee80211_sta *sta, u8 link_id)
-{
-       if (!sta->mlo)
-               return false;
-
-       return !!(sta->valid_links & BIT(link_id));
-}
-
 static void ieee80211_rx_8023(struct ieee80211_rx_data *rx,
                              struct ieee80211_fast_rx *fast_rx,
                              int orig_len)
@@ -4625,7 +4664,6 @@ static bool ieee80211_invoke_fast_rx(struct ieee80211_rx_data *rx,
        struct sk_buff *skb = rx->skb;
        struct ieee80211_hdr *hdr = (void *)skb->data;
        struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb);
-       struct sta_info *sta = rx->sta;
        int orig_len = skb->len;
        int hdrlen = ieee80211_hdrlen(hdr->frame_control);
        int snap_offs = hdrlen;
@@ -4637,7 +4675,6 @@ static bool ieee80211_invoke_fast_rx(struct ieee80211_rx_data *rx,
                u8 da[ETH_ALEN];
                u8 sa[ETH_ALEN];
        } addrs __aligned(2);
-       struct link_sta_info *link_sta;
        struct ieee80211_sta_rx_stats *stats;

        /* for parallel-rx, we need to have DUP_VALIDATED, otherwise we write
@@ -4740,18 +4777,10 @@ static bool ieee80211_invoke_fast_rx(struct ieee80211_rx_data *rx,
  drop:
        dev_kfree_skb(skb);

-       if (rx->link_id >= 0) {
-               link_sta = rcu_dereference(sta->link[rx->link_id]);
-               if (!link_sta)
-                       return true;
-       } else {
-               link_sta = &sta->deflink;
-       }
-
        if (fast_rx->uses_rss)
-               stats = this_cpu_ptr(link_sta->pcpu_rx_stats);
+               stats = this_cpu_ptr(rx->link_sta->pcpu_rx_stats);
        else
-               stats = &link_sta->rx_stats;
+               stats = &rx->link_sta->rx_stats;

        stats->dropped++;
        return true;
@@ -4769,8 +4798,8 @@ static bool ieee80211_prepare_and_rx_handle(struct ieee80211_rx_data *rx,
        struct ieee80211_local *local = rx->local;
        struct ieee80211_sub_if_data *sdata = rx->sdata;
        struct ieee80211_hdr *hdr = (void *)skb->data;
-       struct link_sta_info *link_sta = NULL;
-       struct ieee80211_link_data *link;
+       struct link_sta_info *link_sta = rx->link_sta;
+       struct ieee80211_link_data *link = rx->link;

        rx->skb = skb;

@@ -4792,35 +4821,6 @@ static bool ieee80211_prepare_and_rx_handle(struct ieee80211_rx_data *rx,
        if (!ieee80211_accept_frame(rx))
                return false;

-       if (rx->link_id >= 0) {
-               link = rcu_dereference(rx->sdata->link[rx->link_id]);
-
-               /* we might race link removal */
-               if (!link)
-                       return true;
-               rx->link = link;
-
-               if (rx->sta) {
-                       rx->link_sta =
-                               rcu_dereference(rx->sta->link[rx->link_id]);
-                       if (!rx->link_sta)
-                               return true;
-               }
-       } else {
-               if (rx->sta)
-                       rx->link_sta = &rx->sta->deflink;
-
-               rx->link = &sdata->deflink;
-       }
-
-       if (unlikely(!is_multicast_ether_addr(hdr->addr1) &&
-                    rx->link_id >= 0 && rx->sta && rx->sta->sta.mlo)) {
-               link_sta = rcu_dereference(rx->sta->link[rx->link_id]);
-
-               if (WARN_ON_ONCE(!link_sta))
-                       return true;
-       }
-
        if (!consume) {
                struct skb_shared_hwtstamps *shwt;

@@ -4840,7 +4840,7 @@ static bool ieee80211_prepare_and_rx_handle(struct ieee80211_rx_data *rx,
                shwt->hwtstamp = skb_hwtstamps(skb)->hwtstamp;
        }

-       if (unlikely(link_sta)) {
+       if (unlikely(rx->sta && rx->sta->sta.mlo)) {
                /* translate to MLD addresses */
                if (ether_addr_equal(link->conf->addr, hdr->addr1))
                        ether_addr_copy(hdr->addr1, rx->sdata->vif.addr);
@@ -4870,6 +4870,7 @@ static void __ieee80211_rx_handle_8023(struct ieee80211_hw *hw,
        struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb);
        struct ieee80211_fast_rx *fast_rx;
        struct ieee80211_rx_data rx;
+       int link_id = -1;

        memset(&rx, 0, sizeof(rx));
        rx.skb = skb;
@@ -4886,12 +4887,8 @@ static void __ieee80211_rx_handle_8023(struct ieee80211_hw *hw,
        if (!pubsta)
                goto drop;

-       rx.sta = container_of(pubsta, struct sta_info, sta);
-       rx.sdata = rx.sta->sdata;
-
-       if (status->link_valid &&
-           !ieee80211_rx_is_valid_sta_link_id(pubsta, status->link_id))
-               goto drop;
+       if (status->link_valid)
+               link_id = status->link_id;

        /*
         * TODO: Should the frame be dropped if the right link_id is not
@@ -4900,19 +4897,8 @@ static void __ieee80211_rx_handle_8023(struct ieee80211_hw *hw,
         * link_id is used only for stats purpose and updating the stats on
         * the deflink is fine?
         */
-       if (status->link_valid)
-               rx.link_id = status->link_id;
-
-       if (rx.link_id >= 0) {
-               struct ieee80211_link_data *link;
-
-               link =  rcu_dereference(rx.sdata->link[rx.link_id]);
-               if (!link)
-                       goto drop;
-               rx.link = link;
-       } else {
-               rx.link = &rx.sdata->deflink;
-       }
+       if (!ieee80211_rx_data_set_sta(&rx, pubsta, link_id))
+               goto drop;

        fast_rx = rcu_dereference(rx.sta->fast_rx);
        if (!fast_rx)
@@ -4930,6 +4916,8 @@ static bool ieee80211_rx_for_interface(struct ieee80211_rx_data *rx,
 {
        struct link_sta_info *link_sta;
        struct ieee80211_hdr *hdr = (void *)skb->data;
+       struct sta_info *sta;
+       int link_id = -1;

        /*
         * Look up link station first, in case there's a
@@ -4939,24 +4927,19 @@ static bool ieee80211_rx_for_interface(struct ieee80211_rx_data *rx,
         */
        link_sta = link_sta_info_get_bss(rx->sdata, hdr->addr2);
        if (link_sta) {
-               rx->sta = link_sta->sta;
-               rx->link_id = link_sta->link_id;
+               sta = link_sta->sta;
+               link_id = link_sta->link_id;
        } else {
                struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb);

-               rx->sta = sta_info_get_bss(rx->sdata, hdr->addr2);
-               if (rx->sta) {
-                       if (status->link_valid &&
-                           !ieee80211_rx_is_valid_sta_link_id(&rx->sta->sta,
-                                                              status->link_id))
-                               return false;
-
-                       rx->link_id = status->link_valid ? status->link_id : -1;
-               } else {
-                       rx->link_id = -1;
-               }
+               sta = sta_info_get_bss(rx->sdata, hdr->addr2);
+               if (status->link_valid)
+                       link_id = status->link_id;
        }

+       if (!ieee80211_rx_data_set_sta(rx, &sta->sta, link_id))
+               return false;
+
        return ieee80211_prepare_and_rx_handle(rx, skb, consume);
 }

@@ -5015,19 +4998,15 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw,

        if (ieee80211_is_data(fc)) {
                struct sta_info *sta, *prev_sta;
-               u8 link_id = status->link_id;
+               int link_id = -1;

-               if (pubsta) {
-                       rx.sta = container_of(pubsta, struct sta_info, sta);
-                       rx.sdata = rx.sta->sdata;
+               if (status->link_valid)
+                       link_id = status->link_id;

-                       if (status->link_valid &&
-                           !ieee80211_rx_is_valid_sta_link_id(pubsta, link_id))
+               if (pubsta) {
+                       if (!ieee80211_rx_data_set_sta(&rx, pubsta, link_id))
                                goto out;

-                       if (status->link_valid)
-                               rx.link_id = status->link_id;
-
                        /*
                         * In MLO connection, fetch the link_id using addr2
                         * when the driver does not pass link_id in status.
@@ -5045,7 +5024,7 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw,
                                if (!link_sta)
                                        goto out;

-                               rx.link_id = link_sta->link_id;
+                               ieee80211_rx_data_set_link(&rx, link_sta->link_id);
                        }

                        if (ieee80211_prepare_and_rx_handle(&rx, skb, true))
@@ -5061,30 +5040,27 @@ static void __ieee80211_rx_handle_packet(struct ieee80211_hw *hw,
                                continue;
                        }

-                       if ((status->link_valid &&
-                            !ieee80211_rx_is_valid_sta_link_id(&prev_sta->sta,
-                                                               link_id)) ||
-                           (!status->link_valid && prev_sta->sta.mlo))
+                       rx.sdata = prev_sta->sdata;
+                       if (!ieee80211_rx_data_set_sta(&rx, &prev_sta->sta,
+                                                      link_id))
+                               goto out;
+
+                       if (!status->link_valid && prev_sta->sta.mlo)
                                continue;

-                       rx.link_id = status->link_valid ? link_id : -1;
-                       rx.sta = prev_sta;
-                       rx.sdata = prev_sta->sdata;
                        ieee80211_prepare_and_rx_handle(&rx, skb, false);

                        prev_sta = sta;
                }

                if (prev_sta) {
-                       if ((status->link_valid &&
-                            !ieee80211_rx_is_valid_sta_link_id(&prev_sta->sta,
-                                                               link_id)) ||
-                           (!status->link_valid && prev_sta->sta.mlo))
+                       rx.sdata = prev_sta->sdata;
+                       if (!ieee80211_rx_data_set_sta(&rx, &prev_sta->sta,
+                                                      link_id))
                                goto out;

-                       rx.link_id = status->link_valid ? link_id : -1;
-                       rx.sta = prev_sta;
-                       rx.sdata = prev_sta->sdata;
+                       if (!status->link_valid && prev_sta->sta.mlo)
+                               goto out;

                        if (ieee80211_prepare_and_rx_handle(&rx, skb, true))
                                return;
--
2.38.1

@randomcodepanda

Updated today to 6.1.2 and it's still freezing, hopefully it makes it on the next release.

@morrownr
Owner Author

morrownr commented Jan 2, 2023

It sure would be good for the patch to go into 6.1 and 6.2 as soon as possible, but we need to respect that this is going into a part of the kernel that is heavily used by many drivers, so a mistake would not be good.

@jip149

jip149 commented Jan 4, 2023

Could this apply to mt7921 as well? I have a recently acquired wifi adapter that I can't get to work. I'm trying to use it in ap mode, hostapd runs fine but when another computer tries to connect the AP freezes.

@morrownr
Owner Author

morrownr commented Jan 4, 2023

Could this apply to mt7921 as well? I have a recently acquired wifi adapter that I can't get to work. I'm trying to use it in ap mode, hostapd runs fine but when another computer tries to connect the AP freezes.

@Giga

I have not seen this problem with the mt7921. Recommend you start a new issue and describe what you are seeing. A new issue with a good title will attract users that may be able to help.

@sha5672

sha5672 commented Jan 9, 2023

@morrownr

Hello, I am using Arch Linux 6.1.4 with a new Alfa AWUS036ACM and am experiencing a crash within a few seconds of logging in. What can I do until the issue is fixed? Is there any way that I can downgrade to 6.0?

@morrownr
Owner Author

morrownr commented Jan 9, 2023

Hi @sha5672

Hello, I am using Arch Linux 6.1.4 with a new Alfa AWUS036ACM and am experiencing a crash within a few seconds of logging in

Yes, this is an ugly issue. Programmers call it a null pointer dereference. On my systems, it is pull the plug time...and remove the adapter before rebooting. The problem is happening with the 7612u and 7610u chipsets. It started with kernel 6.1 and was caused by a patch from an Intel dev. A patch I will link below has been submitted by a Mediatek dev, but the patch is not to a Mediatek driver. It is to mac80211, which is a Linux wireless system stack driver, and that is delaying the merge to mainline because a gatekeeper has to be satisfied that it is not going to cause system wide problems. In looking at the patch and the reasons behind it, I tend to agree with what is being done. It would have helped if it had not happened during the holiday season. Hopefully the patch will be merged soon.

What can I do until the issue is fixed?

Downgrade to any kernel prior to 6.1.

Is there any way that I can downgrade to 6.0?

I'm sure there is but my expertise is with Debian/Ubuntu based systems so you don't want me to help you downgrade your kernel. You probably should go to the appropriate Arch forum and ask how to downgrade.

Below is a link to the patch:

https://lore.kernel.org/all/20221230200747.19040-1-nbd@nbd.name/

I have tested it on 6.1 and 6.2-rc2 and it works. I hope it is merged soon as this could get ugly as more distros upgrade to 6.1.

Nick
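
For those who want to test the fix before it reaches a release, a rough sketch of one way to do it follows. It assumes you are inside a kernel source tree checked out at the version you want to test, that curl and git are installed, and that lore's per-message /raw endpoint serves the patch as an mbox. The file name mac80211-fix.mbox and the function names are arbitrary, and this is not necessarily how the patch was tested in this thread. Commands are only printed by default; set DRY_RUN=0 to run them (the install steps need root).

```shell
#!/bin/sh
# Dry-run helper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Message-ID taken from the lore link posted above.
MSGID='20221230200747.19040-1-nbd@nbd.name'

# Download the patch email and apply it to the current source tree.
fetch_and_apply() {
    run curl -fsSL -o mac80211-fix.mbox "https://lore.kernel.org/all/${MSGID}/raw"
    run git am mac80211-fix.mbox
}

# Rebuild and install the patched kernel, reusing the existing .config.
build_and_install() {
    run make olddefconfig
    run make -j"$(nproc)"
    run make modules_install
    run make install
}

fetch_and_apply
build_and_install
```

After booting the patched kernel, associate with an AP and watch for the lockup that normally appears within a few seconds.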

@randomcodepanda

@morrownr

Hello, I am using Arch Linux 6.1.4 with a new Alfa AWUS036ACM and am experiencing a crash within a few seconds of logging in. What can I do until the issue is fixed? Is there any way that I can downgrade to 6.0?

If you are not on some bleeding edge hardware that needs newer kernels the best bet for stability in Arch right now would be to install the linux-lts package and update your bootloader so you can boot with linux-lts which is on 5.15.

It is possible to downgrade to 6.0 if you still have the old package in your cache, or if you download it from the Arch archive and install it with pacman -U, but then you would have to add the linux package to the IgnorePkg line in pacman.conf so it doesn't get upgraded on a regular upgrade.

It's easier to just use the linux-lts kernel if that is compatible with the rest of your hardware.
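
The two options above could be sketched roughly as follows. This assumes GRUB as the bootloader (other bootloaders need their own entry for linux-lts), and the package path passed to the downgrade function is whatever 6.0 file you actually have in /var/cache/pacman/pkg. The function names are illustrative. Commands are only printed by default; set DRY_RUN=0 and run as root to execute them.

```shell
#!/bin/sh
# Dry-run helper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Option 1: install the LTS kernel alongside the current one.
install_lts() {
    run pacman -S --needed linux-lts linux-lts-headers
    # Assumption: GRUB is the bootloader; regenerate its menu.
    run grub-mkconfig -o /boot/grub/grub.cfg
}

# Option 2: reinstall an older kernel package from a local file.
# $1 is the path to the old package, supplied by the caller.
downgrade_from_cache() {
    run pacman -U "$1"
}

install_lts
```

For option 2, also add `IgnorePkg = linux` under `[options]` in /etc/pacman.conf, and remove it again once a fixed kernel is available.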

@adelias

adelias commented Jan 30, 2023

This is now fixed in Debian kernel 6.1.8-1. From the changelog:
- wifi: mac80211: fix initialization of rx->link and rx->link_sta (Closes: #1029816)

@patrakov

I can confirm that linux-6.1.8 fixes it.
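
To check whether a given kernel falls in the range discussed here, a small sketch follows. It assumes GNU sort with -V (version sort) and treats every 6.1 release before 6.1.8 as affected, which the reports in this thread suggest but do not prove for each point release. It does not classify 6.2 release candidates, which were also reported as affected before the patch. On Debian, `uname -r` shows the ABI name rather than the upstream point release, so check `uname -v` there instead. The function names are illustrative.

```shell
#!/bin/sh
# version_lt A B: succeed when A sorts strictly before B (GNU sort -V).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# kernel_status RELEASE: classify a `uname -r` style string.
kernel_status() {
    base=${1%%-*}   # drop the distro suffix, e.g. 6.1.4-arch1-1 -> 6.1.4
    if ! version_lt "$base" "6.1" && version_lt "$base" "6.1.8"; then
        echo "affected-range"
    else
        echo "outside-range"
    fi
}

kernel_status "$(uname -r)"
```

For example, `kernel_status 6.1.4-arch1-1` reports the affected range, while `kernel_status 6.1.8` does not.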
