[bugfix] Fix several serious IOMMU bugs #144

Merged
sterling-teng merged 40 commits into RVCK-Project:rvck-6.6 from uestc-gr:iommu-fix on Oct 16, 2025

@uestc-gr uestc-gr commented Sep 25, 2025

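See issue #142 for details of the problems.

Verification
Problem 1:
1. Boot a UEFI image with virtio-scsi and the IOMMU configured. The image uses systemd as PID 1, and the system boots into a shell successfully.
Problem 2:
1. VFIO can be configured successfully.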


github-actions Bot commented Sep 25, 2025


Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/18002493393

Parsed arguments

| args | value |
| --- | --- |
| repository | RVCK-Project/rvck |
| head ref | pull/144/head |
| base ref | rvck-6.6 |
| LAVA repo | RVCK-Project/lavaci |
| LAVA Template | lava-job-template/qemu/qemu-ltp.yaml |
| Testcase path | lava-testcases/common-test/ltp/ltp.yaml |

Test complete

Detailed results:

RVCK result

| check | result |
| --- | --- |
| kunit-test | success |
| kernel-build | success |
| lava-trigger | success |
| check-patch | success |

Kunit Test Result

[09:10:15] Testing complete. Ran 455 tests: passed: 443, skipped: 12

Kernel Build Result

Kernel build succeeded: RVCK-Project/rvck/144/

f1b81d77d15636b43e580f1381239e8e /srv/guix_result/233ca7385d977d2d989a83e4779e8608ceaadb5e/Image
aa42a75dd7800f9ae04b46d5893f56e5 /root/initramfs.img

LAVA Check

args:

result:

Lava check done! lava log: https://lava.oerv.ac.cn/scheduler/job/773

lava result count: [fail]: 174, [pass]: 1434, [skip]: 291

Check Patch Result

Total Errors 0
Total Warnings 22


github-actions Bot commented Sep 26, 2025


Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/18029770770

Parsed arguments

| args | value |
| --- | --- |
| repository | RVCK-Project/rvck |
| head ref | pull/144/head |
| base ref | rvck-6.6 |
| LAVA repo | RVCK-Project/lavaci |
| LAVA Template | lava-job-template/qemu/qemu-ltp.yaml |
| Testcase path | lava-testcases/common-test/ltp/ltp.yaml |

Test complete

Detailed results:

RVCK result

| check | result |
| --- | --- |
| kunit-test | success |
| kernel-build | success |
| lava-trigger | success |
| check-patch | success |

Kunit Test Result

[06:29:23] Testing complete. Ran 455 tests: passed: 443, skipped: 12

Kernel Build Result

Kernel build succeeded: RVCK-Project/rvck/144/

3b81513c0b77e60c0f3d5c3eea547ddf /srv/guix_result/852490f4b92f39dfcafb7e9820ecaf7804f4118a/Image
9a82b1599f7b2eb475ce31f324c36dd0 /root/initramfs.img

LAVA Check

args:

result:

Lava check done! lava log: https://lava.oerv.ac.cn/scheduler/job/787

lava result count: [fail]: 173, [pass]: 1435, [skip]: 291

Check Patch Result

Total Errors 3
Total Warnings 93


github-actions Bot commented Sep 26, 2025


Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/18029879928

Parsed arguments

| args | value |
| --- | --- |
| repository | RVCK-Project/rvck |
| head ref | pull/144/head |
| base ref | rvck-6.6 |
| LAVA repo | RVCK-Project/lavaci |
| LAVA Template | lava-job-template/qemu/qemu-ltp.yaml |
| Testcase path | lava-testcases/common-test/ltp/ltp.yaml |

Test complete

Detailed results:

RVCK result

| check | result |
| --- | --- |
| kunit-test | success |
| kernel-build | success |
| lava-trigger | success |
| check-patch | success |

Kunit Test Result

[06:36:05] Testing complete. Ran 455 tests: passed: 443, skipped: 12

Kernel Build Result

Kernel build succeeded: RVCK-Project/rvck/144/

585f3a7c4a4135a44ce51928412e60e7 /srv/guix_result/04a298e9dcaafece38eb295a58adb52d3b391c1b/Image
6a8364ea4a9a1917ac53b5274c45f16f /root/initramfs.img

LAVA Check

args:

result:

Lava check fail! lava log: https://lava.oerv.ac.cn/scheduler/job/788

lava result count: call: 1

Check Patch Result

Total Errors 0
Total Warnings 22


github-actions Bot commented Sep 30, 2025


Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/18121544652

Parsed arguments

| args | value |
| --- | --- |
| repository | RVCK-Project/rvck |
| head ref | pull/144/head |
| base ref | rvck-6.6 |
| LAVA repo | RVCK-Project/lavaci |
| LAVA Template | lava-job-template/qemu/qemu-ltp.yaml |
| Testcase path | lava-testcases/common-test/ltp/ltp.yaml |

Test complete

Detailed results:

RVCK result

| check | result |
| --- | --- |
| kunit-test | success |
| kernel-build | failure |
| lava-trigger | skipped |
| check-patch | success |

Kunit Test Result

[07:06:49] Testing complete. Ran 455 tests: passed: 443, skipped: 12

Kernel Build Result

Kernel build failed.

Check Patch Result

Total Errors 0
Total Warnings 40

@sterling-teng

The development tree has been rebased; please rebase as soon as possible.

mainline inclusion
from mainline-v6.7-rc4
commit 48ed127
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

The pattern for picking the first device out of the group list is
repeated a few times now, so it's clearly worth factoring out, which
also helps hide the iommu_group_dev detail from places that don't need
to know. Similarly, the safety check for dev_iommu_ops() at certain
public interfaces starts looking a bit repetitive, and might not be
completely obvious at first glance, so let's factor that out for clarity
as well, in preparation for more uses of both.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/566cbd161546caa6aed49662c9b3e8f09dc9c3cf.1700589539.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
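
As a rough illustration of the factoring described in commit 48ed127 above, the following userspace sketch models the two helpers: one that picks the first device out of a group and one that wraps the "does this device really have an IOMMU?" check. Every `toy_` name and structure is a hypothetical stand-in invented for this example; it is not the kernel code and omits locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; every name here is hypothetical. */
struct toy_iommu_device { int id; };

struct toy_dev_iommu {
	struct toy_iommu_device *iommu_dev; /* set once an IOMMU instance claims the device */
};

struct toy_device {
	const char *name;
	struct toy_dev_iommu *iommu; /* NULL until per-device IOMMU data is allocated */
};

struct toy_group_device {
	struct toy_device *dev;
	struct toy_group_device *next;
};

struct toy_group {
	struct toy_group_device *devices; /* head of a singly linked member list */
};

/* Single home for the "is this device managed by an IOMMU?" safety check. */
static bool toy_dev_has_iommu(const struct toy_device *dev)
{
	assert(dev != NULL);
	return dev->iommu != NULL && dev->iommu->iommu_dev != NULL;
}

/* Single home for "pick the first device of the group"; NULL if empty. */
static struct toy_device *toy_group_first_dev(const struct toy_group *group)
{
	assert(group != NULL);
	return group->devices ? group->devices->dev : NULL;
}
```

Callers then no longer need to know how group members are linked together, which is the encapsulation benefit the commit message mentions.
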
mainline inclusion
from mainline-v6.7-rc4
commit 1d8d43b
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Much as I'd like to remove iommu_present(), the final remaining users
are proving stubbornly difficult to clean up, so kick that can down the
road and just rework it to preserve the current behaviour without
depending on bus ops. Since commit 57365a0 ("iommu: Move bus setup
to IOMMU device registration"), any registered IOMMU instance is already
considered "present" for every entry in iommu_buses, so it's simply a
case of validating the bus and checking we have at least one IOMMU.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/caa93680bb9d35a8facbcd8ff46267ca67335229.1700589539.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.7-rc4
commit a9c362d
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Before we can allow drivers to coexist, we need to make sure that one
driver's domain ops can't misinterpret another driver's dev_iommu_priv
data. To that end, add a token to the domain so we can remember how it
was allocated - for now this may as well be the device ops, since they
still correlate 1:1 with drivers. We can trust ourselves for internal
default domain attachment, so add checks to cover all the public attach
interfaces.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/097c6f30480e4efe12195d00ba0e84ea4837fb4c.1700589539.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
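
The ownership token described in commit a9c362d above can be sketched in userspace as follows. This is a minimal model under stated assumptions: the `toy_` types and functions are hypothetical, and the choice of `-EINVAL` for a mismatch is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's ops table. */
struct toy_iommu_ops { const char *driver_name; };

struct toy_domain {
	const struct toy_iommu_ops *owner; /* token recorded at allocation time */
};

struct toy_device {
	const struct toy_iommu_ops *ops; /* ops of the driver managing this device */
	struct toy_domain *attached;
};

/* Remember which driver allocated the domain. */
static void toy_domain_init(struct toy_domain *domain,
			    const struct toy_iommu_ops *ops)
{
	assert(domain != NULL && ops != NULL);
	domain->owner = ops;
}

/*
 * Public attach path: refuse to hand a domain to a device that is managed
 * by a different driver, so one driver never interprets another driver's
 * private per-device data.
 */
static int toy_attach_device(struct toy_domain *domain, struct toy_device *dev)
{
	assert(domain != NULL && dev != NULL);
	if (dev->ops != domain->owner)
		return -EINVAL; /* assumed error code for this sketch */
	dev->attached = domain;
	return 0;
}
```

Comparing pointers to the ops table is enough here because, as the commit message notes, ops still correlate 1:1 with drivers.
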
mainline inclusion
from mainline-v6.7-rc4
commit b4c0497
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

As the final remaining piece of bus-dependent API, iommu_domain_alloc()
can now take responsibility for the "one iommu_ops per bus" rule for
itself. It turns out we can't safely make the internal allocation call
any more group-based or device-based yet - that will have to wait until
the external callers can pass the right thing - but we can at least get
as far as deriving "bus ops" based on which driver is actually managing
devices on the given bus, rather than whichever driver won the race to
register first.

This will then leave us able to convert the last of the core internals
over to the IOMMU-instance model, allow multiple drivers to register and
actually coexist (modulo the above limitation for unmanaged domain users
in the short term), and start trying to solve the long-standing
iommu_probe_device() mess.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/6c7313009aae0e39ae2855920990ebf85af4662f.1700589539.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.7-rc4
commit 17de3f5
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

With the rest of the API internals converted, it's time to finally
tackle probe_device and how we bootstrap the per-device ops association
to begin with. This ends up being disappointingly straightforward, since
fwspec users are already doing it in order to find their of_xlate
callback, and it works out that we can easily do the equivalent for
other drivers too. Then shuffle the remaining awareness of iommu_ops
into the couple of core headers that still need it, and breathe a sigh
of relief.

Ding dong the bus ops are gone!

CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/a59011ef65b4b6657cb0b7a388d786b779b61305.1700589539.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.7-rc4
commit e708066
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Some drivers already implement their own defence against the possibility
of being given someone else's device. Since this is now taken care of by
the core code (and via a slightly different path from the original
fwspec-based idea), let's clean them up.

Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/58a9879ce3f03562bb061e6714fe6efb554c3907.1700589539.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.7-rc6
commit 4720287
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

This is not being used to pass ops, it is just a way to tell if an
iommu driver was probed. These days this can be detected directly via
device_iommu_mapped(). Call device_iommu_mapped() in the two places that
need to check it and remove the iommu parameter everywhere.

Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Rob Herring <robh@kernel.org>
Tested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.7-rc6
commit 6ff6e18
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Nothing needs this pointer. Return a normal error code with the usual
IOMMU semantic that ENODEV means 'there is no IOMMU driver'.

Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Rob Herring <robh@kernel.org>
Tested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.7-rc6
commit 5b4ea8b
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Instead of returning 1 and trying to handle positive error codes, just
stick to the convention of returning -ENODEV. Remove references to ops
from of_iommu_configure(); a NULL ops will already generate an error code.

There is no reason to check dev->bus: if err=0 at this point, then the
called configure functions thought there was an iommu and we should try to
probe it. Remove the check.

Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Tested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
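
The return-code convention from commits 6ff6e18 and 5b4ea8b above (a plain negative errno, with `-ENODEV` meaning "there is no IOMMU driver") can be modelled in userspace like this. All `toy_` names are hypothetical. `TOY_EPROBE_DEFER` mirrors the kernel-internal `EPROBE_DEFER` value (517), which userspace `errno.h` does not provide, and the fallback policy in `toy_dma_configure()` is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EPROBE_DEFER 517 /* mirrors the kernel-internal EPROBE_DEFER */

struct toy_device {
	bool has_iommu_phandle;  /* firmware describes an IOMMU for this device */
	bool iommu_driver_ready; /* that IOMMU's driver has already registered */
};

/* Returns 0 or a negative errno; never a positive "soft" status. */
static int toy_iommu_configure(const struct toy_device *dev)
{
	assert(dev != NULL);
	if (!dev->has_iommu_phandle)
		return -ENODEV; /* no IOMMU for this device */
	if (!dev->iommu_driver_ready)
		return -TOY_EPROBE_DEFER; /* try again later */
	return 0;
}

/* Caller: -ENODEV is not fatal, it just means "use direct DMA". */
static int toy_dma_configure(const struct toy_device *dev, bool *use_iommu)
{
	int ret = toy_iommu_configure(dev);

	assert(use_iommu != NULL);
	*use_iommu = (ret == 0);
	if (ret == -ENODEV)
		return 0;
	return ret;
}
```

With a single negative-errno convention, the caller no longer needs a special case for a positive return value or an ops pointer used as a status flag.
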
mainline inclusion
from mainline-v6.7-rc6
commit 64945d1
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Allocation of dev->iommu must be done under the
iommu_probe_device_lock. Mark this with lockdep to discourage future
mistakes.

Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Hector Martin <marcan@marcan.st>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.7-rc6
commit eda1a94
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

A perfect driver would only call dev_iommu_priv_set() from its probe
callback. We've made it functionally correct to call it from the of_xlate
by adding a lock around that call.

Add a lockdep assertion that iommu_probe_device_lock is held to discourage misuse.

Exclude PPC kernels with CONFIG_FSL_PAMU turned on because FSL_PAMU uses a
global static for its priv and abuses priv for its domain.

Remove the pointless stores of NULL, all these are on paths where the core
code will free dev->iommu after the op returns.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.7-rc6
commit cdbc723
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Nothing needs this pointer. Return a normal error code with the usual
IOMMU semantic that ENODEV means 'there is no IOMMU driver'.

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Tested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 3f7c320
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

There's no real need for callers to resolve ops from a fwnode in order
to then pass both to iommu_fwspec_init() - it's simpler and more sensible
for that to resolve the ops itself. This in turn means we can centralise
the notion of checking for a present driver, and enforce that fwspecs
aren't allocated unless and until we know they will be usable.

Also use this opportunity to modernise with some "new" helpers that
arrived shortly after this code was first written; the generic
fwnode_handle_get() clears up that ugly get/put mismatch, while
of_fwnode_handle() can now abstract those open-coded dereferences.

Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0e2727adeb8cd73274425322f2f793561bdc927e.1719919669.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 78596b5
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Now that iommu_fwspec_init() can signal for probe deferral directly,
acpi_iommu_fwspec_ops() is unneeded and can be cleaned up.

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/011e39e275aba3ad451c5a1965ca8ddf20ed36c2.1719919669.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 5f937bc
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

We no longer have a notion of partially-initialised fwspecs existing,
and we also no longer need to use an iommu_ops pointer to return status
to of_dma_configure(). Clean up the remains of those, which lends itself
to clarifying the logic around the dma_range_map allocation as well.

Acked-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/61972f88e31a6eda8bf5852f0853951164279a3c.1719919669.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 3e36c15
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

The ops in iommu_fwspec are only needed for the early configuration and
probe process, and by now are easy enough to derive on-demand in those
couple of places which need them, so remove the redundant stored copy.

Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/55c1410b2cd09531eab4f8e2f18f92a0faa0ea75.1719919669.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.14-rc7
commit 29c6e1c
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

The drivers doing their own fwspec parsing have no need to call
iommu_fwspec_free() since fwspecs were moved into dev_iommu, as
returning an error from .probe_device will tear down the whole lot
anyway. Move it into the private interface now that it only serves
for of_iommu to clean up in an error case.

I have no idea what mtk_v1 was doing in effectively guaranteeing
a NULL fwspec would be dereferenced if no "iommus" DT property was
found, so add a check for that to at least make the code look sane.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/36e245489361de2d13db22a510fa5c79e7126278.1740667667.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.14-rc7
commit b46064a
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

It turns out that deferred default domain creation leaves a subtle
race window during iommu_device_register() wherein a client driver may
asynchronously probe in parallel and get as far as performing DMA API
operations with dma-direct, only to be switched to iommu-dma underfoot
once the default domain attachment finally happens, with obviously
disastrous consequences. Even the wonky of_iommu_configure() path is at
risk, since iommu_fwspec_init() will no longer defer client probe as the
instance ops are (necessarily) already registered, and the "replay"
iommu_probe_device() call can see dev->iommu_group already set and so
think there's nothing to do either.

Fortunately we already have the right tool in the right place in the
form of iommu_device_use_default_domain(), which just needs to ensure
that said default domain is actually ready to *be* used. Deferring the
client probe shouldn't have too much impact, given that this only
happens while the IOMMU driver is probing, which is thus due to kick the
deferred probe list again once it finishes.

Reported-by: Charan Teja Kalla <quic_charante@quicinc.com>
Fixes: 98ac73f ("iommu: Require a default_domain for all iommu drivers")
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e88b94c9b575034a2c98a48b3d383654cbda7902.1740753261.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
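
The fix described in commit b46064a above boils down to "do not let a client driver proceed until the group's default domain exists". A minimal userspace sketch of that gate follows. The `toy_` names are hypothetical, `TOY_EPROBE_DEFER` mirrors the kernel-internal `EPROBE_DEFER` value (517), and treating a missing group as "not behind an IOMMU" is an assumption of this model.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_EPROBE_DEFER 517 /* mirrors the kernel-internal EPROBE_DEFER */

struct toy_group {
	void *default_domain;   /* NULL while the IOMMU driver is still probing */
	unsigned int owner_cnt; /* number of client drivers using the default domain */
};

/*
 * Called on behalf of a client driver before it starts doing DMA.
 * If the default domain is not ready yet, ask for the probe to be retried
 * instead of letting the client run with direct DMA and be switched over
 * to IOMMU-backed DMA later.
 */
static int toy_use_default_domain(struct toy_group *group)
{
	if (group == NULL)
		return 0; /* assumed: device is not behind an IOMMU */
	if (group->default_domain == NULL)
		return -TOY_EPROBE_DEFER;
	assert(group->owner_cnt < UINT_MAX); /* toy overflow guard */
	group->owner_cnt++;
	return 0;
}
```

Because the deferral can only happen while the IOMMU driver itself is probing, the retry is expected to arrive soon after that probe finishes.
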
mainline inclusion
from mainline-v6.14-rc7
commit fd598f7
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Since iommu_init_device() was factored out, it is in fact the only
consumer of the ops which __iommu_probe_device() is resolving, so let it
do that itself rather than passing them in. This also puts the ops
lookup at a more logical point relative to the rest of the flow through
__iommu_probe_device().

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/fa4b6cfc67a352488b7f4e0b736008307ce9ac2e.1740753261.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.14-rc7
commit 3832862
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

At the moment, if of_iommu_configure() allocates dev->iommu itself via
iommu_fwspec_init(), then suffers a DT parsing failure, it cleans up the
fwspec but leaves the empty dev_iommu hanging around. So far this is
benign (if a tiny bit wasteful), but we'd like to be able to reason
about dev->iommu having a consistent and unambiguous lifecycle. Thus
make sure that the of_iommu cleanup undoes precisely whatever it did.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/d219663a3f23001f23d520a883ac622d70b4e642.1740753261.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.14-rc7
commit bcb81ac
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

In hindsight, there were some crucial subtleties overlooked when moving
{of,acpi}_dma_configure() to driver probe time to allow waiting for
IOMMU drivers with -EPROBE_DEFER, and these have become an
ever-increasing source of problems. The IOMMU API has some fundamental
assumptions that iommu_probe_device() is called for every device added
to the system, in the order in which they are added. Calling it in a
random order or not at all dependent on driver binding leads to
malformed groups, a potential lack of isolation for devices with no
driver, and all manner of unexpected concurrency and race conditions.
We've attempted to mitigate the latter with point-fix bodges like
iommu_probe_device_lock, but it's a losing battle and the time has come
to bite the bullet and address the true source of the problem instead.

The crux of the matter is that the firmware parsing actually serves two
distinct purposes; one is identifying the IOMMU instance associated with
a device so we can check its availability, the second is actually
telling that instance about the relevant firmware-provided data for the
device. However the latter also depends on the former, and at the time
there was no good place to defer and retry that separately from the
availability check we also wanted for client driver probe.

Nowadays, though, we have a proper notion of multiple IOMMU instances in
the core API itself, and each one gets a chance to probe its own devices
upon registration, so we can finally make that work as intended for
DT/IORT/VIOT platforms too. All we need is for iommu_probe_device() to
be able to run the iommu_fwspec machinery currently buried deep in the
wrong end of {of,acpi}_dma_configure(). Luckily it turns out to be
surprisingly straightforward to bootstrap this transformation by pretty
much just calling the same path twice. At client driver probe time,
dev->driver is obviously set; conversely at device_add(), or a
subsequent bus_iommu_probe(), any device waiting for an IOMMU really
should *not* have a driver already, so we can use that as a condition to
disambiguate the two cases, and avoid recursing back into the IOMMU core
at the wrong times.

Obviously this isn't the nicest thing, but for now it gives us a
functional baseline to then unpick the layers in between without many
more awkward cross-subsystem patches. There are some minor side-effects
like dma_range_map potentially being created earlier, and some debug
prints being repeated, but these aren't significantly detrimental. Let's
make things work first, then deal with making them nice.

With the basic flow finally in the right order again, the next step is
probably turning the bus->dma_configure paths inside-out, since all we
really need from bus code is its notion of which device and input ID(s)
to parse the common firmware properties with...

Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci-driver.c
Acked-by: Rob Herring (Arm) <robh@kernel.org> # of/device.c
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/e3b191e6fd6ca9a1e84c5e5e40044faf97abb874.1740753261.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
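
Commit bcb81ac above relies on "is a driver already bound?" to tell the two callers of the shared path apart and to avoid recursing between the firmware-parsing code and the IOMMU core. The userspace sketch below models only that idea. The `toy_` functions, the counters, and the exact call structure are hypothetical and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_device {
	bool driver_bound;     /* models dev->driver being set */
	int fw_parse_calls;    /* how often firmware parsing ran */
	int iommu_probe_calls; /* how often the IOMMU probe ran */
};

static void toy_iommu_probe_device(struct toy_device *dev);

/* Firmware parsing, reachable from both device add and client driver probe. */
static void toy_dma_configure(struct toy_device *dev)
{
	assert(dev != NULL);
	dev->fw_parse_calls++;
	/* Only the client-driver-probe caller replays the IOMMU probe. */
	if (dev->driver_bound)
		toy_iommu_probe_device(dev);
}

/* IOMMU core probe, run at device add or when an IOMMU instance registers. */
static void toy_iommu_probe_device(struct toy_device *dev)
{
	assert(dev != NULL);
	dev->iommu_probe_calls++;
	/* A bound driver means firmware parsing already ran or is running now. */
	if (!dev->driver_bound)
		toy_dma_configure(dev);
}
```

Whichever entry point is used, each function runs exactly once per device in this model, because the same flag gates the two call directions in opposite ways.
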
mainline inclusion
from mainline-v6.10-rc7
commit a27bf27
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Commit <17de3f5fdd35> ("iommu: Retire bus ops") removes iommu ops from
bus. The iommu subsystem no longer relies on bus for operations. So the
bus parameter in iommu_domain_alloc() is no longer relevant.

Add a new interface named iommu_paging_domain_alloc(), which explicitly
indicates the allocation of a paging domain for DMA managed by a kernel
driver. The new interface takes a device pointer as its parameter, which
aligns better with the current iommu subsystem.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240610085555.88197-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
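
To show the shape of a device-based allocator like the one commit a27bf27 above introduces, here is a userspace sketch that reports failure through an error-encoded pointer rather than NULL. The `toy_` helpers imitate the kernel's `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` macros. All names, the `has_iommu` flag, and the specific error codes are assumptions of this sketch, not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *toy_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long toy_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline bool toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_device { bool has_iommu; };
struct toy_domain { const struct toy_device *dev; };

/* Allocation is keyed on the device, not on a bus. */
static struct toy_domain *toy_paging_domain_alloc(const struct toy_device *dev)
{
	struct toy_domain *domain;

	assert(dev != NULL);
	if (!dev->has_iommu)
		return toy_err_ptr(-ENODEV);
	domain = calloc(1, sizeof(*domain));
	if (domain == NULL)
		return toy_err_ptr(-ENOMEM);
	domain->dev = dev;
	return domain;
}
```

A caller checks the result with `toy_is_err()` and extracts the code with `toy_ptr_err()`, instead of testing for NULL as callers of a bus-based allocator would.
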
mainline inclusion
from mainline-v6.10-rc7
commit d5b7485
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

An iommu domain is allocated in ath10k_fw_init() and is attached to
ar_snoc->fw.dev in the same function. Use iommu_paging_domain_alloc() to
make it explicit.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240610085555.88197-11-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit ef50d41
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

An iommu domain is allocated in ath11k_ahb_fw_resources_init() and is
attached to ab_ahb->fw.dev in the same function.

Use iommu_paging_domain_alloc() to make it explicit.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240610085555.88197-12-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit bf7835f
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

In nvkm_device_tegra_probe_iommu(), a paging domain is allocated for @dev
and attached to it on success. Use iommu_paging_domain_alloc() to make it
explicit.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240902014700.66095-2-baolu.lu@linux.intel.com
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 9719c7b
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

An iommu domain is allocated in host1x_iommu_attach() and is attached to
host->dev. Use iommu_paging_domain_alloc() to make it explicit.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240610085555.88197-8-baolu.lu@linux.intel.com
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812071605.9513-1-baolu.lu@linux.intel.com
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 93ee2d7
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

An iommu domain is allocated in tegra_vde_iommu_init() and is attached to
vde->dev. Use iommu_paging_domain_alloc() to make it explicit.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240610085555.88197-9-baolu.lu@linux.intel.com
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 7ce5552
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

An iommu domain is allocated in venus_firmware_init() and is attached to
core->fw.dev in the same function. Use iommu_paging_domain_alloc() to
make it explicit.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240610085555.88197-10-baolu.lu@linux.intel.com
Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 8a8622b
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

An iommu domain is allocated in rproc_enable_iommu() and is attached to
rproc->dev.parent in the same function.

Use iommu_paging_domain_alloc() to make it explicit.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240812072811.9737-1-baolu.lu@linux.intel.com
Acked-by: Beleswar Padhi <b-padhi@ti.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 77a1a51
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

An iommu domain is allocated in portal_set_cpu() and is attached to
pcfg->dev in the same function.

Use iommu_paging_domain_alloc() to make it explicit.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240610085555.88197-14-baolu.lu@linux.intel.com
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 3b10f25
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

usnic_uiom_alloc_pd() allocates a paging domain for a given device.
In this case, iommu_domain_alloc(dev->bus) is equivalent to
iommu_paging_domain_alloc(dev). Replace it as iommu_domain_alloc()
has been deprecated.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240610085555.88197-15-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
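
For readers following the series, here is a minimal userspace sketch of the caller-side pattern these conversions share. It assumes the upstream convention that iommu_paging_domain_alloc() takes the device itself and reports failure as an ERR_PTR() value, whereas iommu_domain_alloc() took a bus and returned NULL. The struct layouts, the `has_iommu` flag, the stub allocator body, and the `alloc_pd()` helper are hypothetical stand-ins so the example compiles outside the kernel; they are not the driver code touched by this PR.

```c
/*
 * Userspace mock: every type and stub body here is a hypothetical stand-in,
 * not kernel code. Only the calling convention mirrors the kernel API.
 */
#include <errno.h>
#include <stdint.h>

struct bus_type { const char *name; };
struct device { const struct bus_type *bus; int has_iommu; };
struct iommu_domain { const struct device *owner; };

/* Minimal ERR_PTR helpers modelled on the kernel's include/linux/err.h. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static struct iommu_domain mock_domain;

/* Stub allocator (assumption): device-based, fails with ERR_PTR, never NULL. */
static struct iommu_domain *iommu_paging_domain_alloc(struct device *dev)
{
	if (!dev->has_iommu)
		return ERR_PTR(-ENODEV);
	mock_domain.owner = dev;
	return &mock_domain;
}

/* Caller pattern after the conversion. */
static int alloc_pd(struct device *dev, struct iommu_domain **out)
{
	/* Before: domain = iommu_domain_alloc(dev->bus); if (!domain) ... */
	struct iommu_domain *domain = iommu_paging_domain_alloc(dev);

	if (IS_ERR(domain))
		return (int)PTR_ERR(domain); /* propagate the encoded errno */

	*out = domain;
	return 0;
}
```

The point to check when backporting such a conversion is the error test: a leftover `if (!domain)` would never fire against an ERR_PTR() return, so the caller would go on to use an invalid pointer.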
mainline inclusion
from mainline-v6.10-rc7
commit d8c07be
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Commit <421be3ee36a4> ("drm/rockchip: Refactor IOMMU initialisation") has
refactored rockchip_drm_init_iommu() to pass a device that the domain is
allocated for. Replace iommu_domain_alloc() with
iommu_paging_domain_alloc() to retire the former.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Andy Yan <andyshrk@163.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240902014700.66095-3-baolu.lu@linux.intel.com
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 45c690a
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

Commit <17de3f5fdd35> ("iommu: Retire bus ops") removes iommu ops from
the bus structure. The iommu subsystem no longer relies on bus for
operations. So iommu_domain_alloc() interface is no longer relevant.

Replace iommu_domain_alloc() with iommu_paging_domain_alloc() which takes
the physical device from which the host1x_device virtual device was
instantiated. This physical device is a common parent to all physical
devices that are part of the virtual device.

Suggested-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240902014700.66095-4-baolu.lu@linux.intel.com
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit 6632863
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

The last callsite of iommu_present() is removed by commit <45c690aea8ee>
("drm/tegra: Use iommu_paging_domain_alloc()"). Remove it to avoid dead
code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20241009051808.29455-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
mainline inclusion
from mainline-v6.10-rc7
commit f6440fc
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

The iommu_domain_alloc() interface is no longer used in the tree anymore.
Remove it to avoid dead code.

There is increasing demand for supporting multiple IOMMU drivers, and this
is the last bus-based thing standing in the way of that.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241009041147.28391-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
@github-actions

github-actions Bot commented Oct 9, 2025


Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/18364440019

Parsed arguments
args value
repository RVCK-Project/rvck
head ref pull/144/head
base ref rvck-6.6
LAVA repo RVCK-Project/lavaci
LAVA Template lava-job-template/qemu/qemu-ltp.yaml
Testcase path lava-testcases/common-test/ltp/ltp.yaml

Testing complete

Detailed results:

RVCK result

check result
kunit-test success
kernel-build success
lava-trigger success
check-patch success

Kunit Test Result

[03:22:54] Testing complete. Ran 455 tests: passed: 443, skipped: 12

Kernel Build Result

Kernel build succeeded: RVCK-Project/rvck/144/

344c6523aa864f66394eec57a2a63bd4 /srv/guix_result/5c0009df0bcabc3555d3c8d79971f1ff502892a4/Image
43425ddf5d7a7a3c076023bee1bd50d4 /root/initramfs.img

LAVA Check

args:

result:

Lava check done! lava log: https://lava.oerv.ac.cn/scheduler/job/817

lava result count: [fail]: 173, [pass]: 1435, [skip]: 291

Check Patch Result

Total Errors 0
Total Warnings 40

mainline inclusion
from mainline-v6.14-rc7
commit 73d2f10
category: bugfix
bugzilla: RVCK-Project#142

--------------------------------

The warning for suspect probe conditions inadvertently got moved too
early in a prior respin - it happened to work out OK for fwspecs, but in
general still needs to be after the ops->probe_device call so drivers
which filter devices for themselves have a chance to do that.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Fixes: bcb81ac ("iommu: Get DT/ACPI parsing into the proper probe path")
Link: https://lore.kernel.org/r/72a4853e7ef36e7c1c4ca171ed4ed8e1a463a61a.1741791691.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: gaorui <gao.rui@zte.com.cn>
@github-actions

github-actions Bot commented Oct 13, 2025


Test started. Log: https://github.com/RVCK-Project/rvck/actions/runs/18456842515

Parsed arguments
args value
repository RVCK-Project/rvck
head ref pull/144/head
base ref rvck-6.6
LAVA repo RVCK-Project/lavaci
LAVA Template lava-job-template/qemu/qemu-ltp.yaml
Testcase path lava-testcases/common-test/ltp/ltp.yaml

Testing complete

Detailed results:

RVCK result

check result
kunit-test success
kernel-build success
lava-trigger success
check-patch success

Kunit Test Result

[06:26:27] Testing complete. Ran 455 tests: passed: 443, skipped: 12

Kernel Build Result

Kernel build succeeded: RVCK-Project/rvck/144/

fc00f55ff9389ee390ecb8160867c432 /srv/guix_result/4a17a35b67a26338aacd52647d3bba68fe7b73f8/Image
2cd3e0ff8c5334b0b56946682583ffec /root/initramfs.img

LAVA Check

args:

result:

Lava check fail! lava log: https://lava.oerv.ac.cn/scheduler/job/847

lava result count: call: 1

Check Patch Result

Total Errors 0
Total Warnings 41

@uestc-gr
Contributor Author

Please prioritize reviewing this PR; the upcoming IOMMU development depends on these fixes. Thanks, @sterling-teng

@sterling-teng
Contributor

> Please prioritize reviewing this PR; the upcoming IOMMU development depends on these fixes. Thanks, @sterling-teng

OK. Please reference the issue.

@uestc-gr
Contributor Author

> > Please prioritize reviewing this PR; the upcoming IOMMU development depends on these fixes. Thanks, @sterling-teng
>
> OK. Please reference the issue.

The issue is #142

@sterling-teng
Contributor

Testing on a physical machine passed.
