ISP#181

Closed
eiln wants to merge 613 commits into AsahiLinux:asahi-wip from eiln:isp

@eiln eiln commented Aug 31, 2023

Signed-off-by: Eileen Yoon <eyn@gmx.com>

hoshinolina and others added 30 commits July 20, 2023 11:33
Signed-off-by: Asahi Lina <lina@asahilina.net>
This lets us play around with unknown stuff from userspace, without
having to commit to adding it to the UAPI (or knowing whether it's safe
to expose at all).

Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Asahi Lina <lina@asahilina.net>
This isn't available on t602x, so it was probably a bad guess. Just use
the known die count for the userspace info, from our HW info tables.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Hector Martin <marcan@marcan.st>
This can't work, and should not be needed in these cases.

Signed-off-by: Hector Martin <marcan@marcan.st>
In the next patch this will be used a third time.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This is required.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
The iommus for the display processors on Apple silicon machines have
locked TTBR registers. To support iommu domain switching use a shadow
L1 page table and sync it on flush to the HW L1 table.

TODO: investigate if it's possible / necessary to optimize the syncing

Signed-off-by: Janne Grunau <j@jannau.net>
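The shadow-table idea in the commit message above can be pictured with a small userspace model. This is only a sketch under assumptions: `struct shadow_l1`, `shadow_l1_attach`, `shadow_l1_flush` and the table size are hypothetical names and values chosen for illustration, not the driver's data structures.

```c
#include <stdint.h>
#include <string.h>

#define L1_ENTRIES 4 /* illustrative size only, not the hardware value */

/* Hypothetical model: one editable copy plus the table the hardware walks. */
struct shadow_l1 {
	uint64_t shadow[L1_ENTRIES]; /* L1 entries of the currently attached domain */
	uint64_t hw[L1_ENTRIES];     /* stands in for the table behind the locked TTBR */
};

/* Switching domains rewrites only the shadow copy; the HW table is untouched. */
void shadow_l1_attach(struct shadow_l1 *t, const uint64_t *domain_l1)
{
	memcpy(t->shadow, domain_l1, sizeof(t->shadow));
}

/* On flush, the shadow copy is synced into the HW L1 table. */
void shadow_l1_flush(struct shadow_l1 *t)
{
	memcpy(t->hw, t->shadow, sizeof(t->hw));
}
```

Copying the whole table on every flush is the simplest possible sync, which is presumably what the TODO about optimizing the syncing refers to.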
Without the RPM_ACTIVE flag, runtime PM core only seems to consider
the link insofar as it prevents the DART from suspending in case of
consumers *considered active by runtime PM*. Other devices, like those
on which runtime PM has yet to be enabled, or which lack any runtime PM
support, are not considered in preventing the DART from suspending.

A DART going through a suspend/resume cycle with active consumers can
break the consumers' operation, among other things because the DART is
reset in its resume path.

Add RPM_ACTIVE flag to the link to have the consumer in the link prevent
the DART from being suspended, unless the consumer itself is runtime PM
suspended. This supersedes an earlier PCIe-only workaround.

(TODO: Does this mean devices without bound drivers will keep their
DARTs up indefinitely? This depends on the timing of the iommu
probe_device/release_device calls. Investigate.)

Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
DARTs on t602x SoCs are of the t8110 variant but have an IAS of 42,
which means optional support for an extra page table level.

Refactor the PTE management to support an arbitrary level count, and
then calculate how many levels we need for any given configuration.

Signed-off-by: Hector Martin <marcan@marcan.st>
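As a rough illustration of "calculate how many levels we need", the arithmetic can be sketched as below. The helper `dart_levels_needed` is hypothetical, and it assumes 8-byte PTEs and tables that are exactly one page in size. The real driver may account for TTBR count and other details differently.

```c
#include <assert.h>

/*
 * Hypothetical helper, not taken from the driver: number of page table
 * levels needed to translate `ias` bits of input address, assuming
 * 8-byte PTEs and tables that are exactly one page in size.
 */
unsigned int dart_levels_needed(unsigned int ias, unsigned int page_shift)
{
	unsigned int bits_per_level;
	unsigned int va_bits;

	assert(page_shift > 3 && ias > page_shift);
	bits_per_level = page_shift - 3; /* log2(page size / 8 bytes per PTE) */
	va_bits = ias - page_shift;      /* bits left after the page offset */

	/* Round up: a partially used top level still counts as a level. */
	return (va_bits + bits_per_level - 1) / bits_per_level;
}
```

Under these assumptions, with 16 KiB pages (`page_shift` of 14) an IAS of 36 needs two levels, while an IAS of 42 leaves 28 bits to translate and so needs a third level.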
These registers exist at least on the t602x variant, and if not cleared
the IRQ will never clear.

Signed-off-by: Hector Martin <marcan@marcan.st>
The registers are 32-bit and the offsets definitely don't need 64 bits
either, these should've been u32s.

Signed-off-by: Hector Martin <marcan@marcan.st>
The T8110 variant DART implementation on T602x SoCs indicates an IAS of
42, which requires an extra page table level. The extra level is
optional, but let's implement it.

Later it might be useful to restrict this based on the actual attached
devices, since most won't need that much address space anyway.

Signed-off-by: Hector Martin <marcan@marcan.st>
Apple DARTs are often connected directly to devices that expect only a
portion of their address space to be used for DMA (for example, because
other ranges are mapped directly to something else). Add an
apple,dma-range property to allow specifying this range.

This range *can* be outside of the DART's IAS. In that case, it is
assumed that the hardware truncates addresses and the page tables will
only map the lower bits of the address. However, the specified range
cannot straddle an IAS boundary (you cannot cover more than IAS worth
of address space nor wrap).

This corresponds to the vm-base and vm-size properties on the Apple
device tree side of things.

Signed-off-by: Hector Martin <marcan@marcan.st>
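The constraint on the range described above (it may lie outside the IAS, but may not cover more than IAS worth of address space, straddle an IAS boundary, or wrap) can be sketched as a small check. The function names here are hypothetical and this is not the driver's validation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does [base, base + size) fit in one IAS-sized window? */
bool dma_range_valid(uint64_t base, uint64_t size, unsigned int ias)
{
	uint64_t last;

	assert(ias > 0 && ias < 64);
	if (size == 0 || size > (UINT64_C(1) << ias))
		return false; /* empty, or more than IAS worth of address space */
	if (base > UINT64_MAX - (size - 1))
		return false; /* would wrap around the 64-bit address space */
	last = base + size - 1;

	/* Equal upper bits at both ends means no IAS boundary is crossed. */
	return (base >> ias) == (last >> ias);
}

/* The low bits of an address, i.e. the part the page tables would map. */
uint64_t dma_truncate(uint64_t addr, unsigned int ias)
{
	return addr & ((UINT64_C(1) << ias) - 1);
}
```

For example, with a 32-bit IAS a 4 GiB range starting at 0x100000000 passes the check, while the same size starting at 0x80000000 does not, because it crosses the boundary at 4 GiB.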
Signed-off-by: Janne Grunau <j@jannau.net>
Signed-off-by: Janne Grunau <j@jannau.net>
Signed-off-by: Janne Grunau <j@jannau.net>
An atomic_commit for brightness changes will consume a DCP swap without
frame buffer updates and will result in a lost frame. After updating
the next brightness values, wait for 1 frame duration (at 23.976 fps).
Then check if the brightness update still needs to be sent to DCP or if
a swap did that in the meantime.

Signed-off-by: Janne Grunau <j@jannau.net>
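For a sense of scale, the wait mentioned above is a little under 42 ms. The arithmetic can be sketched as follows, assuming 23.976 fps means the usual 24000/1001 rational rate. `frame_duration_us` is a hypothetical helper, not a function from the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: duration of one frame in microseconds for a
 * frame rate of fps_num / fps_den, rounded up so the wait is never
 * shorter than a full frame.
 */
uint64_t frame_duration_us(uint64_t fps_num, uint64_t fps_den)
{
	assert(fps_num > 0 && fps_den > 0);
	return (fps_den * UINT64_C(1000000) + fps_num - 1) / fps_num;
}
```

Under that assumption, `frame_duration_us(24000, 1001)` works out to 41709 microseconds.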
It's only needed to configure the display controller's iommu to share
buffers between the DCP co-processor and the display controller.
A possible concern is runtime PM for it and its iommu. If we don't set
it up, the power domain might never go to lower power states even if it
could.

Signed-off-by: Janne Grunau <j@jannau.net>
The current use of dma_get_sgtable/dma_map_sgtable is deemed unsafe.
Replace it with an unmanaged iommu domain for the piodma iommu to map
the buffers.

Signed-off-by: Janne Grunau <j@jannau.net>
The iommu scatter table/list mapping can only map full iommu page size
extents. Just align the actual allocation to the iommu page size.
This could be handled differently using DARTs subpage protection but
there's no easy way to integrate that.

Signed-off-by: Janne Grunau <j@jannau.net>
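Aligning an allocation to the iommu page size, as the commit message above describes, amounts to rounding the size up to a multiple of that page size. A minimal sketch, assuming the page size is a power of two (`align_up` is a hypothetical stand-in for the kernel's own alignment macros):

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of page_size (a power of two). */
size_t align_up(size_t size, size_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (size + page_size - 1) & ~(page_size - 1);
}
```

With a 16 KiB iommu page size, a 20 KiB buffer would be padded to 32 KiB so that it can be mapped as two full pages.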
Some CPUs expose IMPDEF features in ACTLR_EL1 that can be meaningfully
controlled per-thread (like TSO control on Apple cores). Add the basic
scaffolding to save/restore this register as part of context switching.

This mechanism is disabled by default both by config symbol and via a
runtime check, which ensures it is never triggered unless the system is
known to need it for some feature (which also implies that the layout of
ACTLR_EL1 is uniform between all CPU core types).

Signed-off-by: Hector Martin <marcan@marcan.st>
On some architectures, it is possible to query and/or change the CPU
memory model. This allows userspace to switch to a stricter memory model
for performance reasons, such as when emulating code for another
architecture where that model is the default.

Introduce two prctls to allow userspace to query and set the memory
model for a thread. Two models are initially defined:

- PR_SET_MEM_MODEL_DEFAULT requests the default memory model for the
  architecture.
- PR_SET_MEM_MODEL_TSO requests the x86 TSO memory model.

PR_SET_MEM_MODEL is allowed to set a stricter memory model than
requested if available, in which case it will return successfully. If
the requested memory model cannot be fulfilled, it will return an error.
The memory model that was actually set can be queried by a subsequent
call to PR_GET_MEM_MODEL.

Examples:
- On a CPU with no support for a memory model at least as strong as
  TSO, PR_SET_MEM_MODEL(PR_SET_MEM_MODEL_TSO) fails.
- On a CPU with runtime-configurable TSO support, PR_SET_MEM_MODEL can
  toggle the memory model between DEFAULT and TSO at will.
- On a CPU where the only memory model is at least as strict as TSO,
  PR_GET_MEM_MODEL will return PR_SET_MEM_MODEL_DEFAULT, and
  PR_SET_MEM_MODEL(PR_SET_MEM_MODEL_TSO) will return success but leave
  the memory model at PR_SET_MEM_MODEL_DEFAULT. This implies that the
  default is in fact at least as strict as TSO.

Signed-off-by: Hector Martin <marcan@marcan.st>
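The three example cases in the commit message above can be modelled in plain C. This is a userspace sketch of the described semantics only: `enum cpu_kind`, `struct thread_model`, `model_set` and `model_get` are hypothetical names, and nothing here calls the real prctl interface or uses its constants.

```c
#include <assert.h>

/* Hypothetical stand-ins for PR_SET_MEM_MODEL_DEFAULT / PR_SET_MEM_MODEL_TSO. */
enum mem_model { MODEL_DEFAULT = 0, MODEL_TSO = 1 };

enum cpu_kind {
	CPU_WEAK_ONLY,      /* nothing at least as strong as TSO is available */
	CPU_SWITCHABLE_TSO, /* TSO can be toggled at runtime */
	CPU_ALWAYS_STRONG,  /* the only model is at least as strict as TSO */
};

struct thread_model {
	enum cpu_kind cpu;
	enum mem_model current;
};

/* Models PR_SET_MEM_MODEL: 0 on success, -1 if the request cannot be met. */
int model_set(struct thread_model *t, enum mem_model requested)
{
	if (requested == MODEL_DEFAULT) {
		t->current = MODEL_DEFAULT;
		return 0;
	}
	switch (t->cpu) {
	case CPU_WEAK_ONLY:
		return -1; /* no model as strong as TSO: fail */
	case CPU_SWITCHABLE_TSO:
		t->current = MODEL_TSO;
		return 0;
	case CPU_ALWAYS_STRONG:
		return 0; /* default already satisfies TSO: succeed, stay DEFAULT */
	}
	return -1;
}

/* Models PR_GET_MEM_MODEL: report the model that was actually set. */
enum mem_model model_get(const struct thread_model *t)
{
	return t->current;
}
```

The point of the third case is that a successful set followed by a get returning the default model is not an error; it tells the caller that the default is already strict enough.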
Signed-off-by: Eileen Yoon <eyn@gmx.com>
s/asc/coproc/

Signed-off-by: Eileen Yoon <eyn@gmx.com>
Signed-off-by: Eileen Yoon <eyn@gmx.com>
Signed-off-by: Eileen Yoon <eyn@gmx.com>
Signed-off-by: Eileen Yoon <eyn@gmx.com>
Signed-off-by: Eileen Yoon <eyn@gmx.com>
Not really anymore.

Signed-off-by: Eileen Yoon <eyn@gmx.com>
@marcan marcan force-pushed the asahi-wip branch 4 times, most recently from fea774d to b486fd3 Compare September 22, 2023 14:05
@marcan marcan force-pushed the asahi-wip branch 11 times, most recently from f159fe3 to 618f14c Compare October 11, 2023 17:50
@marcan marcan force-pushed the asahi-wip branch 4 times, most recently from 837bf5a to 0f143d4 Compare November 3, 2023 12:35

marcan commented Nov 3, 2023

This was manually merged :)

@marcan marcan closed this Nov 3, 2023
dberlin pushed a commit to dberlin/linux that referenced this pull request Nov 12, 2023
[ Upstream commit e502801 ]

Running the following command on Juno triggers the warning:

 $ perf record -e cs_etm// -m ,128M ...

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 412 at mm/page_alloc.c:4453 __alloc_pages+0x334/0x1420
 CPU: 1 PID: 412 Comm: perf Not tainted 6.5.0-rc3+ AsahiLinux#181
 Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Feb  1 2019
 pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : __alloc_pages+0x334/0x1420
 lr : dma_common_alloc_pages+0x108/0x138
 sp : ffffffc087fb7440
 x29: ffffffc087fb7440 x28: 0000000000000000 x27: ffffffc07e48fba0
 x26: 0000000000000001 x25: 000000000000000f x24: ffffffc081f24880
 x23: 0000000000000cc0 x22: ffffff88012b6f08 x21: 0000000008000000
 x20: ffffff8801433000 x19: 0000000000000000 x18: 0000000000000000
 x17: ffffffc080316e5c x16: ffffffc07e46406c x15: ffffffc0803af580
 x14: ffffffc08036b460 x13: ffffffc080025cbc x12: ffffffb8108c3fc4
 x11: 1ffffff8108c3fc3 x10: 1ffffff810ff6eac x9 : 00000000f204f204
 x8 : 000000000000f204 x7 : 00000000f2f2f2f2 x6 : 00000000f3f3f3f3
 x5 : 0000000000000001 x4 : 0000000000000000 x3 : 0000000000000000
 x2 : 0000000000000cc0 x1 : 0000000000000000 x0 : ffffffc085333000
 Call trace:
  __alloc_pages+0x334/0x1420
  dma_common_alloc_pages+0x108/0x138
  __dma_alloc_pages+0xf4/0x108
  dma_alloc_pages+0x18/0x30
  tmc_etr_alloc_flat_buf+0xa0/0x190 [coresight_tmc]
  tmc_alloc_etr_buf.constprop.0+0x124/0x298 [coresight_tmc]
  alloc_etr_buf.constprop.0.isra.0+0x88/0xc8 [coresight_tmc]
  tmc_alloc_etr_buffer+0x164/0x2f0 [coresight_tmc]
  etm_setup_aux+0x32c/0x520 [coresight]
  rb_alloc_aux+0x29c/0x3f8
  perf_mmap+0x59c/0xce0
  mmap_region+0x340/0x10e0
  do_mmap+0x48c/0x580
  vm_mmap_pgoff+0x160/0x248
  ksys_mmap_pgoff+0x1e8/0x278
  __arm64_sys_mmap+0x8c/0xb8

With the flat mode, we only attempt to allocate large memory if there is an IOMMU
connected to the ETR. If the allocation fails, we always have a fallback path
and return an error if nothing else worked. So, suppress the warning for flat
mode allocations.

Cc: Mike Leach <mike.leach@linaro.org>
Cc: James Clark <james.clark@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20230817161951.658534-1-suzuki.poulose@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>