Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

X63 support #7

Closed
minego opened this issue May 11, 2021 · 40 comments · Fixed by #66
Closed

X63 support #7

minego opened this issue May 11, 2021 · 40 comments · Fixed by #66

Comments

@minego
Copy link

minego commented May 11, 2021

I'd like to be able to access the liquid temperature reported by my Kraken X63, so I can see it with the sensors command and use it as an input for fancontrol.

@jonasmalacofilho
Copy link
Member

While the X3 coolers are supported by liquidctl, the protocol is not very well understood. And hwmon kernel drivers work in a significantly different manner than liquidctl.

It's very likely that a lot of experimentation will be necessary, and I don't have one of these devices. But if you're willing to be the main author of the driver, I can certainly help.

@minego
Copy link
Author

minego commented May 11, 2021 via email

@jonasmalacofilho
Copy link
Member

Well, take a look at the code in the kraken3-basic-x3-support branch.

It's completely untested, and quite ugly (the main driver structure hasn't been adjusted for the X3 models, nor for future support of the Z3 models)... but it may be close to functional.


I suggest you try it in QEMU first (passing through the device), as it's not fun to crash drivers on your running kernel. Personally, I use virtme as a helper, which avoids having to build a FS image.

When using virtme/QEMU it will probably be easier to build a full (but minimal) kernel. For this you can use my w-hwmon-add-nzxt-kraken3 branch (from my linux repository), which already has necessary Kconfig/Makefile changes.

# clone the appropriate branch from jonasmalacofilho/linux and cd into the directory
git clone https://github.com/jonasmalacofilho/linux/ --branch w-hwmon-add-nzxt-kraken3
cd linux

# create a base kernel config with support for KVM (for QEMU/virtme)
make defconfig
virtme-configkernel --update

# enable the nzxt-kraken3 driver in the kernel config
echo CONFIG_SENSORS_NZXT_KRAKEN3=y >> .config
make olddefconfig

# build the kernel
# (adjust the -j parameter to the number of hardware threads)
make -j12

# run the kernel in QEMU with virtme
# (adjust USB hostbus and hostaddr to match your setup)
# (to exit QEMU: Ctrl-a + x)
virtme-run --kdir . --mods=auto --cpus 4 -q-usb -q-device -qusb-host,hostbus=1,hostaddr=5 -adebug --pwd

@jonasmalacofilho
Copy link
Member

jonasmalacofilho commented May 14, 2021

@minego have you been able to take a look at the code in the kraken3-basic-x3-support branch?

By the way, if if seems that no response is ever processed, it may be that the check for STATUS_REPORT_ID needs to be removed. Taking a second look at Tom's driver for liquidctl, it's possible that report IDs/numbers are not used by this device.

@minego
Copy link
Author

minego commented May 14, 2021 via email

@stalkerg
Copy link
Contributor

stalkerg commented Jul 5, 2021

I also interesting in because this device does not support FAN control and we need to emulate somehow FAN control depends on water temp instead of CPU temp. It's why this driver very important.

@stalkerg
Copy link
Contributor

stalkerg commented Jul 5, 2021

nzxt-kraken3 driver looks very simple, I can help you.

@stalkerg
Copy link
Contributor

stalkerg commented Jul 5, 2021

@jonasmalacofilho not really work:

kraken3-hid-3-4
Adapter: HID adapter
Pump:             N/A
Coolant:          N/A  

but init is fine:

[77276.830518] nzxt-kraken3 0003:1E71:2007.0004: hidraw1: USB HID v1.11 Device [NZXT. - Inc. NZXT USB Device] on usb-0000:03:00.0-11/input0

liquidctl at the same time working fine.

@stalkerg
Copy link
Contributor

stalkerg commented Jul 5, 2021

Ok, I can see only one and wrong raw event:
size:64 id:0x71 data15:0x0 data16:0x1
looks like we wrongly init the device.

@stalkerg
Copy link
Contributor

stalkerg commented Jul 5, 2021

ok, I know why... let's test it

@stalkerg
Copy link
Contributor

stalkerg commented Jul 5, 2021

it's working! I will make PR.

kraken3-hid-3-4
Adapter: HID adapter
Pump:        1869 RPM
Coolant:      +31.9°C 

@stalkerg
Copy link
Contributor

stalkerg commented Jul 6, 2021

@jonasmalacofilho I am interesting in upstream it.

@jonasmalacofilho
Copy link
Member

That's great, @stalkerg!

Let me know if you need help preparing or sending the patch.

@jonasmalacofilho
Copy link
Member

Actually, don't we want to implement pump control on this driver before upstreaming it?

The older Kraken X2 coolers had some limitations that made fan/pump control unsuitable for a hwmon driver, but I don't think this necessarily applies anymore for X3 [and later Z3] coolers.

@stalkerg
Copy link
Contributor

stalkerg commented Jul 7, 2021

Actually, don't we want to implement pump control on this driver before upstreaming it?

Yes, we can try, but we should put some time limits. I mean, upstreaming still much more important than pump control.
Also, as I understand, the liquidctl didn't use direct control for PWM(?).

@jonasmalacofilho
Copy link
Member

Also, as I understand, the liquidctl didn't use direct control for PWM(?).

Do you mean that we set a PWM x temperature control curve? If so, that's correct, but the hwmon subsystem supports that as well; the hwmon documentations refers to those as "auto points".

Additionally, it's possible that the device also features a "direct" mode, The previous generation (X2) does, albeit with some caveats.

@stalkerg
Copy link
Contributor

stalkerg commented Aug 4, 2021

auto points look tricky, we should create a custom attribute group... also currently we write (59+1)-20 = 40 points what unusual for hwmon because for each point we create two sysfs files. We can also use interpretation but in that case, we should decide how many points we want to see and it will be static.

@jonasmalacofilho
Copy link
Member

auto points look tricky, we should create a custom attribute group...

The hwmon documentation explicitly instructs avoiding custom attributes.

for each point we create two sysfs files

I don't follow.

We can also use interpretation but in that case, we should decide how many points we want to see and it will be static.

I don't think that's necessary, as the available set of points at the protocol level is static.

@stalkerg
Copy link
Contributor

stalkerg commented Aug 11, 2021

The hwmon documentation explicitly instructs avoiding custom attributes.

At the same time, I can't find examples with auto points and without custom attributes. (I checked probably all drivers) Also, official documentation not talking much about auto points, they provide some sysfs patterns as is. If you can find more will be interesting to looking into.

I don't think that's necessary, as the available set of points at the protocol level is static.

I hope too but again, usually, all other drivers have ~4-5 points maximum but now we should provide 40.

@jonasmalacofilho
Copy link
Member

At the same time, I can't find examples with auto points and without custom attributes. (I checked probably all drivers)

I didn't know that.

What custom attributes do you have in mind?

And it might be worth discussing on the hwmon mailling list about this before working on the actual patch.

@stalkerg
Copy link
Contributor

And it might be worth discussing on the hwmon mailling list about this before working on the actual patch.

yes, it's a good idea, probably it's possible just a lack of documentation this auto point is not so popular thing in hwmon.

@aleksamagicka
Copy link
Member

Both sensors and curves have been implemented for some time for the X63 and others. I think that this issue can be closed?

@stalkerg
Copy link
Contributor

stalkerg commented Jan 5, 2023

I think so, but I would like to check "write" function. I will try to do it today.

@stalkerg
Copy link
Contributor

stalkerg commented Jan 5, 2023

hmm pwm1 produce invalide argument for values 0-50. fan2go because of this can't build the curve. Also, sometimes
echo '1' > pwm1_enable also produce invalid argument

@stalkerg
Copy link
Contributor

stalkerg commented Jan 5, 2023

Okey seems like if we have already pwm1_enable is 1 we produce an error if we try to set it again, I suppose it's wrong behavior.

@stalkerg
Copy link
Contributor

stalkerg commented Jan 5, 2023

even if I fixed set too small and too big values (instead return an error we must set such boundary), I still have issues:
изображение

Basically, fan2go can't measure the correlation between RPM and PWM.

@stalkerg
Copy link
Contributor

stalkerg commented Jan 5, 2023

okey, we have two issues:

  1. internally, it's a percentage, which means the write value is not equal to the read value in many cases.
  2. we a getting the PWM update value with significant delay, userspace apps expected what if they did write they can immediately ready basically write should be a sync operation.

@aleksamagicka
Copy link
Member

Thanks, I'll look into it. For values 0-50, that's because the firmware mandates that the minimum duty for the pump is 20%.

@stalkerg
Copy link
Contributor

stalkerg commented Jan 5, 2023

thanks, yes, I understand, but PWM from userspace should be "virtual" (from my point of view). If I settle 20, I should read 20 even if the device receives 50 instead or any other valid minimum value.
Basically, only the correlation between PWM and RPMs can be not linear and working async (e.g., RPM will be reached after some time after settled PWM)

@aleksamagicka
Copy link
Member

The PWM is converted from the duty which is read directly from the device, we can't lie about it.

@jonasmalacofilho
Copy link
Member

@stalkerg,

  1. internally, it's a percentage, which means the write value is not equal to the read value in many cases.

Do you remember an exact instance of this happening, and what were both of the values (written and read back) in question?


  1. we a getting the PWM update value with significant delay, userspace apps expected what if they did write they can immediately ready basically write should be a sync operation.

thanks, yes, I understand, but PWM from userspace should be "virtual" (from my point of view). If I settle 20, I should read 20 even if the device receives 50 instead or any other valid minimum value. [...]

I agree with @aleksamagicka that the driver should present the current PWM value (as) reported by the device.

And my interpretation of the hwmon sysfs ABI is that it would be wrong for userspace to expect otherwise, since hwmon drivers should match device behavior and state as closely as possible.

@aleksamagicka
Copy link
Member

internally, it's a percentage, which means the write value is not equal to the read value in many cases.

I see I didn't explain this. @stalkerg is referring to the cases when you write (for example) 229 into pwm1 and get 230 when you read it back. That's due to how DIV_ROUND_CLOSEST() is behaving with the data its given.

229 on a pure PWM->[0, 100] scale is 89.803... %. The device however only accepts integer percentages, so that's rounded to 90%. When reading that back, we get 90*255/100, which is 229.5, and since we can't have decimal values in hwmon that gets rounded to 230.

This happens because the devices don't offer enough precision for the percentage, and in that case mapping between 0-100 range and 0-255 range can't be completely retained. In the case of Aquacomputer devices and the aquacomputer_d5next driver, centi-percents are used, so the calculation looks like this:

229 is converted into 229*(100*100)/255 = 8980 and that's written to the device. When we read 8980 back, it's converted to PWM: 8980*255/(100*100) = 228.99, which is rounded to 229, the exact value we started with because we have a tighter mapping with more precision.

@jonasmalacofilho
Copy link
Member

@aleksamagicka, oh, that makes sense. Thanks for the explanation.

@jonasmalacofilho
Copy link
Member

Closing: we can't address @stalkerg's point #1 (it's a device limitation), and we don't think we should change things regarding point #2.

@stalkerg
Copy link
Contributor

stalkerg commented Jan 8, 2023

@jonasmalacofilho

hwmon drivers should match device behavior and state as closely as possible.

Can you point this into documentation or so? I need arguments for the fan2go developer. He made the daemon and program from entirely different assumption.

@amezin
Copy link
Member

amezin commented Jan 8, 2023

@aleksamagicka

The PWM is converted from the duty which is read directly from the device, we can't lie about it.

@jonasmalacofilho

and we don't think we should change things regarding point #2.

Actually, nzxt-smart2 driver does "lie" about pwm a bit: https://github.com/liquidctl/liquidtux/blob/master/nzxt-smart2.c#L490-L503 Yes, the version in the kernel tree too: https://github.com/torvalds/linux/blob/master/drivers/hwmon/nzxt-smart2.c#L484-L497

Because otherwise pwmconfig and fancontrol don't work well (see the comment).

@jonasmalacofilho
Copy link
Member

jonasmalacofilho commented Jan 8, 2023

@amezin,

Actually, nzxt-smart2 driver does "lie" about pwm a bit:

liquidtux/nzxt-smart2.c

Lines 490 to 503 in 8524d61

/*
* pwmconfig and fancontrol scripts expect pwm writes to take effect
* immediately (i. e. read from pwm* sysfs should return the value
* written into it). The device seems to always accept pwm values - even
* when there is no fan connected - so update pwm status without waiting
* for a report, to make pwmconfig and fancontrol happy. Worst case -
* if the device didn't accept new pwm value for some reason (never seen
* this in practice) - it will be reported incorrectly only until next
* update. This avoids "fan stuck" messages from pwmconfig, and
* fancontrol setting fan speed to 100% during shutdown.
*/
spin_lock_bh(&drvdata->wq.lock);
drvdata->fan_duty_percent[channel] = duty_percent;
spin_unlock_bh(&drvdata->wq.lock);

Oh, I forgot about that. Thanks for reminding us.

@aleksamagicka, in light of that, do you want to do the same here?

Additionally, according to that comment, the issue also affects fancontrol, which is more or less the cannonical userspace for hwmon PWM control.


@stalkerg,

Please disregard that statement. I can't find anything the hwmon documentation to really back that up... maybe I got that impression from watching the mailing list, or maybe I'm just plainly wrong.

@amezin
Copy link
Member

amezin commented Jan 8, 2023

Although the comment isn't accurate... nzxt-smart2 performs rounding too, so you won't always read back exactly the same pwm value you've written. And if you changed it by just 1 or 2, you may read the same value - the difference might be lost during the conversion to percent and back.

But when the userspace changes the value significantly, the driver should respond - i. e. after write() returned to userspace, all read()s should read a new changed value.

At least that's what was necessary to make fancontrol behave reasonably well.

@stalkerg
Copy link
Contributor

stalkerg commented Jan 8, 2023

Sorry, I want to add a more general topic:
The hwmon it's an abstraction interface from sensors and fan controls in the kernel into userspace. How is it possible to write an effective userspace application if this API behavior very depends on a specific driver and specific hardware behavior?
Sorry, but fancontrol working is much more stable because it's primitive. https://github.com/markusressel/fan2go doing much more interesting things and much more accurate from user's point of view.
But even fancontrol can't work properly if hwmon drivers can't keep consistent behavior. Yes, we can leave with rounded numbers but it will be better to work without them.
Also, set value 0 never should provide error, it should set a minimum value! fan2go just break on it and I suppose pwmconfig can as well (both trying to check pwm/fan rpm dependencies just increase numbers from 0 to 255, and if found error stops the process (what else they should do?))

I also fixed a similar bug with rounded in amdgpu for vega (they did even more - simulate PWM based on RPM and depend on firmware it show noise).
hwmon API should provide more consistent high-level API, current just has no sense for userspace.

Sorry, for the "offtopic" and some emotions.

@aleksamagicka
Copy link
Member

aleksamagicka commented Jan 8, 2023

Ah, had no idea that was allowed. Thanks for the pointer in nzxt-smart2. Like you, I got the same impression of no leeway from the docs.

Noted on the 0, will look into that as well. Reopening for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants