kernel: match pci devs or drvs against DT nodes #11345

Open: wants to merge 1 commit into base: main

InsaneKnight
Contributor

@InsaneKnight InsaneKnight commented Nov 26, 2022

A TP-Link Archer C7 v2 running OpenWrt v22.03+ enters a reboot loop after its internal but socketed ath10k mini PCIe WLAN card is replaced with an ath9k one (namely an AR9580 that works well in my laptop). Further observation in OpenWrt's failsafe mode shows that the router reboots right after the ath9k driver is manually loaded. However, after flashing a firmware image whose only change is the removal of the node corresponding to the WLAN card (the node wifi@0,0) from the device tree, everything works fine.

The node wifi@0,0 should only be compatible with "qcom,ath10k", but the reboot seems to be caused by the ath9k driver loading the "calibration" data declared in this node (which targets ath10k devices) for the ath9k card installed in its place, regardless of the compatible string.

In order to resolve this issue, a PCI driver should check whether its
positionally corresponding device tree node is compatible before loading
any data from the node.

This commit contains 3 kernel patches:
931 adds routines to match pci devices or drivers against device tree nodes,
602 and 991 add checks to ath9k and ath10k respectively.

Signed-off-by: Edward Chow <equu@openmail.cc>
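
The check described above can be illustrated with a small userspace model. This is only a sketch of the idea, not the code from the patches: `struct dt_node`, `node_matches_driver` and `should_load_calibration` are hypothetical names, and the match lists in the usage note are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a device tree node. */
struct dt_node {
    const char *name;           /* e.g. "wifi@0,0" */
    const char *compatible[4];  /* compatible strings, NULL-terminated if fewer than 4 */
    bool has_calibration;       /* node references a "calibration" cell */
};

/* Return true if any compatible string of the node appears in the
 * driver's own NULL-terminated match list. A missing node never matches. */
static bool node_matches_driver(const struct dt_node *node,
                                const char *const *driver_ids)
{
    if (!node)
        return false;
    for (size_t i = 0; i < 4 && node->compatible[i]; i++)
        for (size_t j = 0; driver_ids[j]; j++)
            if (strcmp(node->compatible[i], driver_ids[j]) == 0)
                return true;
    return false;
}

/* Only take calibration data from a node that is declared compatible. */
static bool should_load_calibration(const struct dt_node *node,
                                    const char *const *driver_ids)
{
    return node_matches_driver(node, driver_ids) && node->has_calibration;
}
```

With a node `wifi@0,0` that lists only "qcom,ath10k", a driver whose match list contains "qcom,ath10k" would load the calibration data, while a driver with a different match list would skip it and fall back to whatever the card itself provides.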

@github-actions github-actions bot added the "core packages" (pull request/issue for core (in-tree) packages) and "target/ath79" (pull request/issue for ath79 target) labels Nov 26, 2022
@hauke
Member

hauke commented Nov 26, 2022

I do not think we will be able to get this integrated into the upstream Linux kernel.
ath10k and ath9k should not run into a kernel panic when you provide the wrong calibration data. The drivers should validate the data and then just fail to load. This looks like a bug in the driver.

When you add a compatibility string, it should only load for one card when it is possible to add multiple PCIe cards into the slot. See: compatible = "pci168c,0033";
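
A string such as "pci168c,0033" encodes the PCI vendor and device ID, so it can be compared against the IDs the card actually reports. The sketch below assumes the four-digit, zero-padded, lowercase hex form used in the quoted string; `pci_compatible` and `pci_ids_match` are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build "pciVVVV,DDDD" from PCI vendor/device IDs.
 * Assumption: four-digit zero-padded lowercase hex, as in "pci168c,0033". */
static int pci_compatible(char *buf, size_t len, uint16_t vendor, uint16_t device)
{
    return snprintf(buf, len, "pci%04x,%04x", (unsigned)vendor, (unsigned)device);
}

/* Return true if the compatible string names exactly this vendor/device pair. */
static bool pci_ids_match(const char *compatible, uint16_t vendor, uint16_t device)
{
    char expected[16];

    pci_compatible(expected, sizeof(expected), vendor, device);
    return strcmp(compatible, expected) == 0;
}
```

A node carrying such a string would then only be used for a card reporting vendor 0x168c and device 0x0033; any other card in the same slot would not match.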

@InsaneKnight
Contributor Author

InsaneKnight commented Nov 27, 2022 via email

@InsaneKnight
Contributor Author

InsaneKnight commented Nov 27, 2022 via email

@mpratt14
Contributor

something more simple is wrong with the DTS

nvmem-cell-names gives a categorical label to the cell, it's not meant to be a unique identifier like what's called a "label" in DTS

all device nodes are instantiated as kernel objects at some point, and they are all unique pointers regardless of what names are used, and standard DTS labels are unique in order to point to a specific node without using properties.

nodes don't even need a name, but they need a "compatible" and a name for each cell it has a pointer to so the driver knows which is which.

I recommend you just post your DTS somewhere like on the forum and ask for help

I just made the switch on a device I have and I had no issues with the following format

&art {
	compatible = "nvmem-cells";

	macaddr_art_0: macaddr@0 {
		reg = <0x0 0x6>;
	};

	calibration_ath9k: calibration@1000 {
		reg = <0x1000 0x440>;
	};

	calibration_ath10k: calibration@5000 {
		reg = <0x5000 0x844>;
	};
};
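
Each `reg = <offset size>` pair above describes a byte range inside the art partition. As a rough userspace illustration of what reading such a cell amounts to, the sketch below copies a cell out of an in-memory copy of the partition with bounds checks. `struct nvmem_cell_desc`, `art_cells` and `read_cell` are hypothetical names, and the lookup by DTS label is a simplification for the example, not how the kernel resolves cells.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one cell: label, byte offset, byte size. */
struct nvmem_cell_desc {
    const char *label;
    size_t offset;
    size_t size;
};

/* Offsets and sizes taken from the DTS fragment above. */
static const struct nvmem_cell_desc art_cells[] = {
    { "macaddr_art_0",      0x0,    0x6   },
    { "calibration_ath9k",  0x1000, 0x440 },
    { "calibration_ath10k", 0x5000, 0x844 },
};

/* Copy the named cell out of the partition image.
 * Returns 0 on success, -1 if the cell is unknown, does not fit inside
 * the partition, or does not fit into the output buffer. */
static int read_cell(const uint8_t *part, size_t part_len, const char *label,
                     uint8_t *out, size_t out_len)
{
    for (size_t i = 0; i < sizeof(art_cells) / sizeof(art_cells[0]); i++) {
        const struct nvmem_cell_desc *c = &art_cells[i];

        if (strcmp(c->label, label) != 0)
            continue;
        if (c->offset > part_len || c->size > part_len - c->offset)
            return -1;
        if (out_len < c->size)
            return -1;
        memcpy(out, part + c->offset, c->size);
        return 0;
    }
    return -1;
}
```

The point of the separate labels is that the ath9k and ath10k ranges are distinct cells, so a consumer that asks for the wrong one gets data of the wrong size and layout.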

@InsaneKnight
Contributor Author

InsaneKnight commented Nov 27, 2022 via email

@InsaneKnight
Contributor Author

InsaneKnight commented Nov 27, 2022 via email

@john-tho
Contributor

After it is replaced with a stand-alone ath9k card (e.g. ar9580)

When you replace the pcie card, update the DTS to reflect the new card? Otherwise, your DTS is (now) wrong for your hardware.

@InsaneKnight
Contributor Author

InsaneKnight commented Nov 27, 2022 via email

@mpratt14
Contributor

ok I understand now

in my opinion the proper fix would be to edit ath9k and ath10k drivers to look for their own compatible property in the nodes and otherwise ignore the device

@InsaneKnight
Contributor Author

InsaneKnight commented Nov 27, 2022 via email

@mpratt14
Contributor

it would not hurt to just add another generic compatible "atheros,ath9k" or similar. even if upstream wouldn't like that, it's small enough to keep in the "hack" category without issues
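
A compatible property can hold several strings, stored back to back with a NUL byte after each, so a generic entry can sit next to a specific one. The sketch below searches such a value for one string; `stringlist_contains` is a hypothetical helper written for this example, and "atheros,ath9k" is only the illustrative name suggested in the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Search a property value laid out as consecutive NUL-terminated strings.
 * Returns false if the wanted string is absent or the value is malformed
 * (last entry not NUL-terminated within prop_len bytes). */
static bool stringlist_contains(const char *prop, size_t prop_len,
                                const char *wanted)
{
    size_t pos = 0;

    while (pos < prop_len) {
        const char *end = memchr(prop + pos, '\0', prop_len - pos);

        if (!end)
            return false;           /* unterminated entry */
        if (strcmp(prop + pos, wanted) == 0)
            return true;
        pos = (size_t)(end - prop) + 1;  /* skip past the NUL */
    }
    return false;
}
```

A driver that matches on the generic entry would accept the node whichever specific card string comes first, while a driver looking for "qcom,ath10k" would not find it in a list holding only ath9k entries.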

@InsaneKnight
Contributor Author

InsaneKnight commented Nov 27, 2022 via email

@InsaneKnight
Contributor Author

InsaneKnight commented Nov 28, 2022 via email

@github-actions github-actions bot added the "kernel" (pull request/issue with Linux kernel related changes) label and removed the "target/ath79" (pull request/issue for ath79 target) label Nov 28, 2022
@InsaneKnight InsaneKnight changed the title from "ath9k: Use a cal cell name different with ath10k" to "kernel: match pci devs or drvs against DT nodes" Dec 7, 2022
@InsaneKnight
Contributor Author

I have pushed another revision, but it seems the backported mac80211 drivers cannot properly use symbols newly exported from the Linux kernel proper.

@mpratt14
Contributor

this looks really good now, just need to solve the build problem and test

@hauke
Member

hauke commented Jan 6, 2023

Refresh the patches to make the CI happy:

make target/linux/{clean,refresh} V=99 -j5

The kernel patch in target/linux/generic/hack-5.15/931-pci-add-functions-to-match-pci-dev-against-OF-DT-node.patch is also needed for kernel 5.10 because OpenWrt currently uses both.

Could you also send this to the upstream Linux kernel community to get it integrated into the mainline Linux kernel? I think this looks like a useful feature for other Linux distributions as well. If you do so, please share the link to the mailing list discussion or patchwork.

@InsaneKnight
Contributor Author

InsaneKnight commented Jan 7, 2023 via email

Currently, the nvmem cell name "calibration" is identical for the
calibration data of ath9k and ath10k, which could cause a kernel panic
when ath10k calibration data is applied to an ath9k card, for example
if the ath10k card within a TP-Link Archer C7 v2 is replaced with an
ath9k card that carries its own calibration data.

This commit contains 3 kernel patches:
921 adds routines to match pci devices against device tree nodes, 602 and
991 add checks to ath9k and ath10k respectively.

Signed-off-by: Edward Chow <equu@openmail.cc>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 2, 2023
As reported in openwrt/openwrt#11345 , ath9k
would load calibration data from a device tree node declared
incompatible.

Now, ath9k will first check whether the device tree node is compatible
with it, using the functionality introduced with the first patch of
this series, ("PCI: of: Match pci devices or drivers against OF DT
nodes") and only proceed loading calibration data from compatible node.

Signed-off-by: Edward Chow <equu@openmail.cc>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 2, 2023
ath10k might also be sensitive to the issue reported on
openwrt/openwrt#11345 , loading calibration
data from a device tree node declared incompatible.

ath10k will first check whether the device tree node is compatible
with it, using the functionality introduced with the first patch of
this series, ("PCI: of: Match pci devices or drivers against OF DT
nodes") and only proceed loading calibration data from compatible node.

Signed-off-by: Edward Chow <equu@openmail.cc>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 2, 2023
ath10k might also be sensitive to the issue reported on
openwrt/openwrt#11345 , loading calibration
data from a device tree node declared incompatible.

ath10k will first check whether the device tree node is compatible
with it, using the functionality introduced with the first patch of
this series, ("PCI: of: Match pci devices or drivers against OF DT
nodes") and only proceed loading calibration data from compatible node.

Signed-off-by: Edward Chow <equu@openmail.cc>
Reported-by: kernel test robot <lkp@intel.com>
@InsaneKnight
Contributor Author

InsaneKnight commented Feb 9, 2023 via email

fengmushu pushed a commit to fengmushu/openwrt that referenced this pull request Nov 18, 2023
Labels
core packages pull request/issue for core (in-tree) packages kernel pull request/issue with Linux kernel related changes

4 participants