Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/hardware.display: init module #279789

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

nazarewk
Copy link
Member

@nazarewk nazarewk commented Jan 9, 2024

fixes #279739

Description of changes

While preparing a bug report for root cause of #279739 I came up with a possible way to fix it and confirmed it is indeed working.

https://github.com/NixOS/nixpkgs/blob/580d12f7dc941cad010f3295eda1ed0cb513611c/pkgs/build-support/kernel/modules-closure.sh seems to ignore any files that are not kernel modules

As a part of the discussion feedback I have created a NixOS module out of the whole thing for handling edid files and additional kernel-level display configuration.

Things done

Using it for a few weeks in my configurations https://github.com/nazarewk-iac/nix-configs

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
related to NixOS#279739
nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
@nazarewk nazarewk marked this pull request as draft January 9, 2024 12:10
nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
@nazarewk nazarewk changed the title modules-closure: include /lib/firmware/edid makeModulesClosure: include /lib/firmware/edid Jan 9, 2024
@nazarewk nazarewk marked this pull request as ready for review January 9, 2024 13:01
nazarewk added a commit to nazarewk/nixpkgs that referenced this pull request Jan 9, 2024
this allows us not to pass `compressFirmware = false;`
 inside EDID derivations as this mechanic very tricky to discover.

related to NixOS#279789
Copy link
Member

@soupglasses soupglasses left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works correctly with nice logging. Only nitpick i have is to document this specific handling of lib/firmware/edid somewhere. As it might not be clear to an end user that edids are required to be put into the edid folder to function correctly.

@nazarewk
Copy link
Member Author

nazarewk commented Jan 9, 2024

Works correctly with nice logging. Only nitpick i have is to document this specific handling of lib/firmware/edid somewhere. As it might not be clear to an end user that edids are required to be put into the edid folder to function correctly.

Yeah, I also had trouble with discovering where to put edids and how to link them together on the system, but I have no idea where to document it.

@soupglasses
Copy link
Member

I am not sure either. But i could see a subsection like Using Custom EDID files under https://nixos.org/manual/nixos/stable/#sec-kernel-config could be possible. As its very close to where boot.initrd.kernelModules is taught.

@K900
Copy link
Contributor

K900 commented Jan 9, 2024

At this point, why not a NixOS option that does all the right plumbing?

Copy link
Member

@SuperSandro2000 SuperSandro2000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no clue about edit but looking pretty good to my naive eyes and we can probably merge this with those small cleanups.

Comment on lines 15 to 20
# let
# edids = (linuxhw-edid-fetcher.override {
# displays = {
# PG278Q_2014 = [ "PG278Q" "2014" ];
# };
# });
# in
# "${edids}/lib/firmware/edid/PG278Q_2014.bin";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# let
# edids = (linuxhw-edid-fetcher.override {
# displays = {
# PG278Q_2014 = [ "PG278Q" "2014" ];
# };
# });
# in
# "${edids}/lib/firmware/edid/PG278Q_2014.bin";
# let
# edids = linuxhw-edid-fetcher.override {
# displays.PG278Q_2014 = [ "PG278Q" "2014" ];
# };
# in
# "${edids}/lib/firmware/edid/PG278Q_2014.bin";

we can minimize that example a bit further, to make it easier to understand

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I added (unnecessary) 2560x1440 to make it obvious PG278Q_2014.bin is derived from the object key, not a search path.

''}
'';

passthru.updateScript = nix-update-script { extraArgs = [ "--version=branch=master" ]; };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
passthru.updateScript = nix-update-script { extraArgs = [ "--version=branch=master" ]; };
passthru.updateScript = nix-update-script { extraArgs = [ "--version=branch=master" ]; };

test -d "$1" && test -f "$1/AnalogDisplay.md" && test -f "$1/DigitalDisplay.md"
}

main() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just drop the main function

Copy link
Member Author

@nazarewk nazarewk Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to have an explicit entry point in form of main() because:

  • it's hard to follow scripts interweaving running code with function definitions
  • it's a lot harder to locate where the code starts executing

I just apply this "policy" to anything larger than ~10 lines of bash, as soon as you reach it you need helper functions etc.

repo="${TMPDIR:-/tmp}/edid"
log "Not running inside 'https://github.com/linuxhw/EDID', downloading content to ${repo}"
if ! check_repo "$repo"; then
curl -L https://github.com/linuxhw/EDID/tarball/master | tar -zx -C "${repo}" --strip-components=1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
curl -L https://github.com/linuxhw/EDID/tarball/master | tar -zx -C "${repo}" --strip-components=1
curl -L https://github.com/linuxhw/EDID/tarball/HEAD | tar -zx -C "${repo}" --strip-components=1

just in case they ever change the default branch

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, wasn't aware of this

exit 1
fi

: "${repo:="$PWD"}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we rather use a tmp file for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just for detecting whether current directory is already checked out repository, if it's not it downloads stuff to /tmp/edid

};
}
)
(lib.mkIf (cfg.edid.packages != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably hide this behind en enable flag. Some modules, eg in nixos-hardware might fill this and having to use mkForce to clear it, is not that nice.

@nazarewk
Copy link
Member Author

nazarewk commented Apr 18, 2024

@SuperSandro2000 thanks for taking interest in it, I'm currently working on a separate script for doing the runtime configuration and would LOVE to merge after I got it to work.

Yesterday I updated nixos-unstable in my configs + swapped Radeon RX550 to Pro W6600 and the current script stopped working completely because kernel/amdgpu started doing additional wild stuff like trying to load it multiple times and turning off the display after failing to load EDID even though it was force enabled:

[   25.456331] snd_hda_intel 0000:0c:00.1: bound 0000:0c:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[   25.770685] [drm] forcing DP-4 connector on
[   25.881959] amdgpu 0000:0c:00.0: [drm] *ERROR* [CONNECTOR:122:DP-4] Requesting EDID firmware "edid/PG278Q_120.bin" failed (err=-22)
[   25.890385] [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
[   25.939326] [drm] kiq ring mec 2 pipe 1 q 0
[   25.944541] [drm] VCN decode and encode initialized successfully(under DPG Mode).
--
[   25.971560] amdgpu 0000:0c:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[   25.973019] [drm] Initialized amdgpu 3.54.0 20150101 for 0000:0c:00.0 on minor 1
[   25.980115] amdgpu 0000:0c:00.0: [drm] *ERROR* [CONNECTOR:122:DP-4] Requesting EDID firmware "edid/PG278Q_120.bin" failed (err=-22)
[   25.980123] [drm:amdgpu_dm_connector_mode_valid [amdgpu]] *ERROR* No EDID firmware found on connector: DP-4 ,forcing to OFF!
[   25.983103] fbcon: amdgpudrmfb (fb0) is primary device
[   25.983269] [drm] DSC precompute is not needed.
--
[   37.626741] hid-generic 0003:1050:0407.000E: input,hidraw8: USB HID v1.10 Keyboard [Yubico YubiKey OTP+FIDO+CCID] on usb-0000:07:00.1-4.3/input0
[ 1485.826017] logitech-hidpp-device 0003:046D:406D.000A: HID++ 4.5 device connected.
[ 2093.540565] amdgpu 0000:0c:00.0: [drm] *ERROR* [CONNECTOR:122:DP-4] Requesting EDID firmware "edid/PG278Q_120.bin" failed (err=-22)
[ 2093.548974] [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
[ 2093.831415] amdgpu 0000:0c:00.0: [drm] *ERROR* [CONNECTOR:122:DP-4] Requesting EDID firmware "edid/PG278Q_120.bin" failed (err=-22)
[ 2093.831421] [drm:amdgpu_dm_connector_mode_valid [amdgpu]] *ERROR* No EDID firmware found on connector: DP-4 ,forcing to OFF!
[ 2093.844356] nvme 0000:01:00.0: VPD access failed.  This is likely a firmware bug on this device.  Contact the card vendor for a firmware update
[ 2094.288519] [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
[ 2205.443882] usb 5-3: USB disconnect, device number 3
[ 2205.443890] usb 5-3.2: USB disconnect, device number 5

Needed to add a few additional guards against it and thought it'll be easier to develop it in my own repo then upstream it.

@nazarewk nazarewk force-pushed the hardware-firmware-edid-fix branch 2 times, most recently from 62dc18b to 0fe1957 Compare April 18, 2024 12:40
@nazarewk nazarewk force-pushed the hardware-firmware-edid-fix branch 3 times, most recently from cb0b041 to 0467e26 Compare April 18, 2024 14:35
@nazarewk
Copy link
Member Author

ok, I've created and tested in my configs edido script which applies (selected) video= and complete drm.edid_firmware= kernel parameters during runtime. It loads /proc/cmdline automatically unless you pass it arguments directly.

@meutraa
Copy link
Contributor

meutraa commented Apr 19, 2024

I have issues with the latest version of the display-edid-apply script (using applyAtRuntime, as the kernel parameters is still broken).

With applyWithKernelParameters false, the only log line displayed is:

log "loading kernel parameters from /proc/cmdline"

In this case the edid is not applied.


With applyWithKernelParameters true it ends with:

log "EDID is already up to date with '%s'" "${edidPath}"

I checked the hash of the current-firmware edid.bin and it did indeed match that of the edid_override. So CHANGED remains 0, and the trigger hotplug doesn't occur. With a manual trigger hotplug, the edid is functioning. I'm really not sure what is going on here. Perhaps something odd with the kernel where it loads it, but doesn't really.

@nazarewk
Copy link
Member Author

Yeah, the kernel has major issues loading EDID files over last few weeks, just getting worse with each update.

Good catch with having no boot.kernelParams unless applyWithKernelParameters is set.

@billksun
Copy link
Contributor

billksun commented May 2, 2024

Never thought I would need this, but yes, I need this!

modules-closure.sh seems to consider everything under
 /lib/firmware to be a kernel module,
 this change adds a special handling of `edid` directory,
 which does not contain kernel modules

fixes NixOS#279739
@jys1670
Copy link

jys1670 commented May 12, 2024

Seems that errors like
amdgpu 0000:e7:00.0: [drm] *ERROR* [CONNECTOR:111:DP-5] Requesting EDID firmware "edid/edid.bin" failed (err=-22)
are caused by 1ed5df1 (and unrelated to kernel version bumps)
With its revert, applyWithKernelParameters = true; works fine on my machine.

@K900
Copy link
Contributor

K900 commented May 12, 2024

Complete shot in the dark, but if you use the unmodified kernel and add initcall_blacklist=simpledrm_platform_driver_init, does it work?

@jys1670
Copy link

jys1670 commented May 12, 2024

Complete shot in the dark, but if you use the unmodified kernel and add initcall_blacklist=simpledrm_platform_driver_init, does it work?

Nope, I can't even boot with this one, sadly.

pathsToLink = [ "/lib/firmware/edid" ];
ignoreCollisions = true;
}) // {
compressFirmware = false;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't actually need compressFirmware = false;, since the kernel is built with compressed firmware support by default. While this looks trivial either way (compression vs nixos config complexity), I still think it should be removed if it isn't needed. If it is needed, then I think that means there are other problems with how nixos is configuring the kernel.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is needed, I think code paths in kernel/drivers do not decompress EDID files or at least not properly. Don't remember the details, but i tried to investigate quite extensively (without much knowledge of C/C++ itself)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But my archlinux partition is using a zstd compressed edid firmware file without issue. Using Archlinux vanilla kernel v6.8.9.arch1-2. So it isn't a problem with the kernel, but with Nixos kernel configuration.

$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=eb3729f4-547c-42a7-86b4-6020f9a31762 rw loglevel=3 quiet nowatchdog intel_iommu=on iommu=pt zswap.enabled=0 ibt=off amdgpu.ppfeaturemask=0xffffffff drm.edid_firmware=edid/g70nc_edid.bin

$ ls /usr/lib/firmware/edid/
g70nc_edid.bin.zst

I compressed it manually with zstd -19 g70nc_edid.bin

@FlyingWombat
Copy link

Seems that errors like amdgpu 0000:e7:00.0: [drm] *ERROR* [CONNECTOR:111:DP-5] Requesting EDID firmware "edid/edid.bin" failed (err=-22) are caused by 1ed5df1 (and unrelated to kernel version bumps) With its revert, applyWithKernelParameters = true; works fine on my machine.

That's odd that commit is causing the error, since the kernel configs in that commit match what is set in my Archlinux partition (dual boot), and it loads edid_firmware (compressed) without issue. It must be failing in combination with some other options nixos is setting.

@FlyingWombat
Copy link

I noticed Archlinux has this patch in their kernel, to "skip simpledrm if nvidia-drm.modeset=1 is set". It could be relevant to the issues that @jys1670 mentioned were introduced with 1ed5df1 -- if the user has an Nvidia gpu.

Though, I have an AMD gpu and still get the same err=-22 when loading edid firmware.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

amdgpu: cannot set modified EDID binaries. "Requesting EDID firmware ... failed (err=-2)"