intel/gpu: Add support for Gen9+ #4260
Offer GPU session for testing 3D on Genode. Note, this will also work in Qemu because the multiplexer will probe for supported GPUs during startup. issue genodelabs#4260
* Wait for completion before returning from 'execbuffer2'. This makes buffer execution synchronous.
* Because the Iris driver manages the virtual address space of the GPU and creates one GEM context for each batch buffer, we have to map/unmap all buffer objects before and after batch buffer execution.

issue genodelabs#4260
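The map, execute, wait, unmap sequence described in this commit message could be sketched roughly as follows. This is only an illustration: `Buffer_sketch`, `exec_sync_sketch`, and the `submit_and_wait` callback are hypothetical names, not identifiers from the Mesa/Iris port or the GPU multiplexer.

```cpp
#include <functional>
#include <vector>

/* Hypothetical stand-in for a buffer object; only tracks its mapping state. */
struct Buffer_sketch
{
	bool mapped = false;
};

/*
 * Map all buffer objects, submit the batch buffer and block until it has
 * completed, then unmap all buffer objects again. 'submit_and_wait' is an
 * assumed callback that returns true once the batch finished successfully.
 */
inline bool exec_sync_sketch(std::vector<Buffer_sketch> &buffers,
                             std::function<bool()> const &submit_and_wait)
{
	for (auto &buffer : buffers)
		buffer.mapped = true;              /* map before execution */

	bool const ok = submit_and_wait();     /* synchronous execution */

	for (auto &buffer : buffers)
		buffer.mapped = false;             /* unmap after execution */

	return ok;
}
```

Because the call returns only after completion, the caller may assume that none of the buffers is still in use by the GPU when they are unmapped.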
This commit contains features and bug fixes:

* Catch errors during resource allocation.
* Because Mesa tries to allocate fence (hardware) registers for each batch buffer execution, do not allocate new fences for buffer objects that are already fenced.
* Add support for the global hardware status page. Each context additionally has a per-process hardware status page, which we used to set the global hardware status page during Vgpu switch. This was obviously wrong. There is only one global hardware status page (set once during initialization) and a distinct per-process page for contexts.
* Write the sequence number of the currently executing batch buffer to dword 52 of the per-process hardware status page. We use the PIPE_CONTROL command with QW_WRITE (quad word write), GLOBAL_GTT_IVB disabled (address space is the per-process address space), and STORE_DATA_INDEX enabled (write goes to an offset within the hardware status page). This command used to write to the scratch page, but Linux now uses the first reserved word of the per-process hardware status page.
* Add the Gen9+ WaEnableGapsTsvCreditFix workaround. This sets the "GAPS TSV Credit fix Enable" bit of the Arbiter control register (GARBCNTLREG). According to the documentation, this bit should be set by the BIOS, but it is not set on most Gen9/9.5 platforms. Not setting this bit leads to random GPU hangs.
* Increase the context size from 20 to 22 pages for Gen9. On Gen8 the hardware context is 20 pages (1 hardware status page + 19 ring context register pages). On Gen9 the size of the ring context registers has increased by two pages to 21 pages, or 81.3125 KBytes as the IGD documentation states.
* The logical ring size in the ring buffer control of the execlist context has to be programmed with the number of pages - 1, so 0 means 1 page. We programmed the actual number of pages before, leading to ring buffer execution of NOOPs if the page behind our ring buffer was empty, or to GPU hangs if there was data on the page.

issue genodelabs#4260
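The page arithmetic behind the last two points can be checked with a small sketch. All names here (`page_size`, `pages_for`, `ring_size_field`, and so on) are hypothetical and chosen for illustration, and a 4 KiB page size is assumed. The field's bit position inside the ring buffer control register is deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>

/* Assumed GPU page size of 4 KiB */
constexpr std::size_t page_size = 4096;

/* Round a byte count up to whole pages */
constexpr std::size_t pages_for(std::size_t bytes)
{
	return (bytes + page_size - 1) / page_size;
}

/* Gen9 ring context registers: 81.3125 KBytes per the commit message */
constexpr std::size_t gen9_ring_context_bytes = 83264; /* 81.3125 * 1024 */

/* Every hardware context starts with one hardware status page */
constexpr std::size_t hw_status_pages = 1;

constexpr std::size_t gen8_context_pages = hw_status_pages + 19;
constexpr std::size_t gen9_context_pages =
	hw_status_pages + pages_for(gen9_ring_context_bytes);

static_assert(gen8_context_pages == 20, "Gen8 context is 20 pages");
static_assert(gen9_context_pages == 22, "Gen9 context is 22 pages");

/*
 * Value for the logical ring size field: number of pages minus one,
 * so a one-page ring is encoded as 0. Expects ring_pages >= 1.
 */
constexpr std::uint32_t ring_size_field(std::uint32_t ring_pages)
{
	return ring_pages - 1;
}

static_assert(ring_size_field(1) == 0, "one page is encoded as 0");
```

Programming the raw page count instead of `ring_size_field(pages)` tells the hardware the ring is one page longer than it is, which matches the NOOP execution and hangs described above.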
The commits above add support for Intel Gen9/9.5 GPUs to Genode.
* Add 'Mi_arb_on_off' and 'Mi_arb_check' to commands.h
* Use Mi_* commands from commands.h

issue genodelabs#4260
This is a leftover from Mesa-11; we replaced it with 'wait_and_dispatch_one_io_signal' for synchronous signal waits. issue genodelabs#4260
and don't assume 8M, which leads to Region_conflicts if size is >8M (X201). Issue genodelabs#4260
- Lenovo T470p, T490, T490s Issue genodelabs#4260
* use Gpu::addr_t (64 Bit) where necessary instead of Genode::addr_t. issue genodelabs#4260
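The commit message does not spell out the motivation. One plausible reading is that GPU virtual addresses may not fit into a narrower machine-word address type. The sketch below illustrates that truncation under this assumption; `gpu_addr_t` and `word32_t` are hypothetical stand-ins, not the actual Genode type definitions.

```cpp
#include <cstdint>

namespace Sketch {

	/* Stand-in for a GPU address type that is always 64 bit wide */
	using gpu_addr_t = std::uint64_t;

	/* Stand-in for a machine-word address type on a 32-bit platform */
	using word32_t = std::uint32_t;

	/* Storing a GPU virtual address in the narrow type drops the upper bits */
	constexpr word32_t truncate_to_word(gpu_addr_t va)
	{
		return static_cast<word32_t>(va);
	}
}

/* An address just above 4 GiB loses its upper bits when narrowed */
static_assert(Sketch::truncate_to_word(0x100001000ull) == 0x1000u,
              "upper 32 bits are lost");
```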
Adjust drivers_interactive, drivers_managed and sculpt accordingly. Issue genodelabs#4260
@alex-ab I have no objections to merging the commit, but I'd like to ask you to move gpu_drv.config into gpu/intel. It feels more natural to host this driver-specific information in the driver directory, as we did with the event-filter chargen files.
This patch reverts the commit "driver_interactive-pc: start GPU multiplexer". The addition of the GPU service introduces inconsistencies with boards other than PC, and increases the quota demand of the drivers subsystem, which breaks run scripts like demo.run on OKL4 or seL4. The generic drivers_interactive subsystems should remain deliberately limited to plain input and framebuffer devices. To cater for additional peripherals like networking, audio, or GPU, we should use Sculpt as a testbed or create board-specific run scripts. Issue #4260
Commit 395ae75 reverts the addition of the GPU service to the drivers_interactive-pc subsystem, restoring the demo.run tests on sel4, hw, and okl4.
Adjust drivers_managed and sculpt accordingly. Issue #4260
Fixed in master.
Currently the GPU multiplexer only supports IGT-devices found on Broadwell (Gen8). We want to add support for Skylake (Gen9) and Kabylake (Gen9 and Gen9.5) as well.