ERROR: Can't get value of subfeature temp1_input: Can't read #445

Closed
mikeSimonson opened this issue Jun 4, 2016 · 37 comments
Labels
sensors: Related to the use of lm-sensors to get temperature data
temperature: Related to temperature data or display
Milestone

Comments

@mikeSimonson

When I update to the tag v_1.10 I get the error above.
The error is not present in v_1.9.

@dolmen
Collaborator

dolmen commented Jun 4, 2016

On which operating system? Linux?

Could you give me the output of these commands:

LANG=C sensors -u
LANG=C acpi -t

@mikeSimonson
Author

mikeSimonson commented Jun 5, 2016

Linux indeed with bash.

I had to install acpi.
I don't know if it's also used by liquidprompt.

The output is coming from version v_1.10.
It's the same on v_1.9 minus the error message.

$ LANG=C sensors -u
acpitz-virtual-0
Adapter: Virtual device
temp1:
  temp1_input: 48.000
  temp1_crit: 103.000

nouveau-pci-0100
Adapter: PCI adapter
temp1:
ERROR: Can't get value of subfeature temp1_input: Can't read
  temp1_max: 95.000
  temp1_max_hyst: 3.000
  temp1_crit: 105.000
  temp1_crit_hyst: 5.000
  temp1_emergency: 135.000
  temp1_emergency_hyst: 5.000

thinkpad-isa-0000
Adapter: ISA adapter
fan1:
  fan1_input: 2136.000

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:
  temp1_input: 47.000
  temp1_max: 87.000
  temp1_crit: 105.000
  temp1_crit_alarm: 0.000
Core 0:
  temp2_input: 47.000
  temp2_max: 87.000
  temp2_crit: 105.000
  temp2_crit_alarm: 0.000
Core 1:
  temp3_input: 44.000
  temp3_max: 87.000
  temp3_crit: 105.000
  temp3_crit_alarm: 0.000
Core 2:
  temp4_input: 45.000
  temp4_max: 87.000
  temp4_crit: 105.000
  temp4_crit_alarm: 0.000
Core 3:
  temp5_input: 43.000
  temp5_max: 87.000
  temp5_crit: 105.000
  temp5_crit_alarm: 0.000

ERROR: Can't get value of subfeature temp1_input: Can't read


$ LANG=C acpi -t
Thermal 0: ok, 50.0 degrees C
ERROR: Can't get value of subfeature temp1_input: Can't read

Thanks for your response

@dolmen
Collaborator

dolmen commented Jun 5, 2016

The ERROR: Can't get value of subfeature temp1_input: Can't read message doesn't come from Liquid Prompt: it comes from the utilities that read temperature information, which probably get it from the kernel.

This may be a hardware problem (see for example this issue, which may or may not be related to your problem), but I'm not an expert in this domain. In any case, you should investigate the issue to find out how to fix it.

Hiding this problem (as was done in v_1.9) was not a good service to users, so that behavior has been removed.

@dolmen added the wontfix, temperature, and sensors labels Jun 5, 2016 (wontfix: a bug report that is rejected because the fix should be external)
@mikeSimonson
Author

As far as I can guess, the error comes from my nvidia graphics card. Do you know how I could make sure that it doesn't get queried by liquidprompt?

@rahilwazir

I'm getting the same error.

@mikeSimonson Mine is not nvidia though, it is Intel Ivybridge Mobile

Here's my output:

$ LANG=C sensors -u
acpitz-virtual-0
Adapter: Virtual device
temp1:
  temp1_input: 53.000
  temp1_crit: 128.000
temp2:
  temp2_input: 0.000
  temp2_crit: 128.000
temp3:
  temp3_input: 49.000
  temp3_crit: 128.000
temp4:
  temp4_input: 54.000
  temp4_crit: 128.000
temp5:
  temp5_input: 38.000
  temp5_crit: 128.000
temp6:
  temp6_input: 127.000
  temp6_crit: 128.000

radeon-pci-0100
Adapter: PCI adapter
temp1:
ERROR: Can't get value of subfeature temp1_input: Can't read
  temp1_crit: 120.000
  temp1_crit_hyst: 90.000

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:
  temp1_input: 54.000
  temp1_max: 87.000
  temp1_crit: 105.000
  temp1_crit_alarm: 0.000
Core 0:
  temp2_input: 53.000
  temp2_max: 87.000
  temp2_crit: 105.000
  temp2_crit_alarm: 0.000
Core 1:
  temp3_input: 54.000
  temp3_max: 87.000
  temp3_crit: 105.000
  temp3_crit_alarm: 0.000

ERROR: Can't get value of subfeature temp1_input: Can't read

Linux Kernel: 4.4.0-21-generic
Ubuntu: 16.04
Shell: Bash

@mikeSimonson
Author

@rahilwazir From what I can see the error comes from your radeon card.

@rahilwazir

@mikeSimonson Ah yeah forgot that 😄, so do you think this error comes from the external gpu?

@mikeSimonson
Author

yep, I want to find a way to power it off completely

@mikeSimonson
Author

@rahilwazir Going into the BIOS and completely disabling the nvidia card worked perfectly.

@dolmen
Collaborator

dolmen commented Jun 7, 2016

We are currently averaging all the temperatures we collect.
I discovered that over the last few years the Linux kernel has exposed more and more temperature sensors. Averaging was OK when we only got temperatures from 2 CPUs. But averaging the temperatures of a CPU and a GPU doesn't make sense.

The temperature indicator should be completely redesigned. Cc: @nojhan
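
To illustrate the point about averaging, here is a minimal sketch that reports the hottest reading per chip instead of one average across every chip. It is not Liquid Prompt's code: the function name max_temp_per_chip is hypothetical, and it assumes the sensors -u layout pasted earlier in this thread (a chip name, then an Adapter: line, then indented tempN_input lines).

```shell
# Hypothetical helper, not part of Liquid Prompt.
# Reads `sensors -u` output on stdin and prints "<chip> <hottest tempN_input>"
# for every chip that reports at least one temperature.
max_temp_per_chip() {
    awk '
        # The chip name is the line immediately before each "Adapter:" line.
        /^Adapter:/ { chip = prev }
        $1 ~ /^temp[0-9]+_input:$/ {
            value = $2 + 0
            if (!(chip in hottest) || value > hottest[chip]) hottest[chip] = value
            if (!(chip in seen)) { seen[chip] = 1; order[++n] = chip }
        }
        { prev = $0 }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %.1f\n", order[i], hottest[order[i]]
        }
    '
}
```

Piping LANG=C sensors -u into max_temp_per_chip would give one line per chip, so a CPU chip and a GPU chip could be displayed or compared against thresholds separately.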

@mikeSimonson
Author

Well, for an integrated Intel GPU it might make sense, but you're right that for a discrete card it doesn't make much sense.

@Rmano
Contributor

Rmano commented Dec 3, 2016

I think the problem is that power management has gotten much better in newer kernels. Discrete GPUs are switched off dynamically when not needed, so you can't read the temperature because the gizmo is off.
I really think that giving an error is a bug in sensors; but anyway, I also think that liquidprompt should really silence it...
Thank you anyway for the nice scripts; I'll try to work around it myself (without disabling the card, which is useful when needed).

For anyone concerned, you can simply change the line calling sensors (around line 1513 in the current version) to

 for i in $(sensors -u 2> /dev/null |

and you'll silence the error. I understand that this is hiding the problem, but well...

@dolmen
Collaborator

dolmen commented Dec 13, 2016

@Rmano No one had told me before that temperature sensors appear and disappear dynamically, and so that the errors printed by sensors may not happen every time.
I may reconsider my position with this new information.

@dolmen reopened this Dec 13, 2016
@Rmano
Contributor

Rmano commented Dec 13, 2016

@dolmen, I think (though I am not sure) that this is related to: http://askubuntu.com/a/469439/16395. On my laptop, which has a dynamically switched discrete GPU, it used to say -128°C when the GPU was powered off, but now it spits out the error (and I really think that both ways of dealing with it are wrong from the sensors point of view, but...)
I will check tomorrow whether this is the case, when I have the laptop handy and can trigger a GPU switch-on.

@Rmano
Contributor

Rmano commented Dec 14, 2016

OK. I can confirm that sensors -u prints the error only when the discrete GPU is off, and that there is no error when it's on.
The plain sensors output (without -u) is more reasonable: it never displays an error, and when the GPU is off it says:

radeon-pci-0100
Adapter: PCI adapter
temp1:            N/A  (crit = +120.0°C, hyst = +90.0°C)

@reedobrien

I have a similar problem, but when WiFi is turned off: sensors can't read the WiFi temperature.
While it is off, the error appears each time the prompt is redrawn. Relevant sensors -u output:

iwlwifi-virtual-0
Adapter: Virtual device
temp1:
ERROR: Can't get value of subfeature temp1_input: I/O error

@shermozle

Interestingly, I just switched from the Cinnamon to the MATE desktop environment in Linux Mint and this error message cropped up. The device is a Lenovo X1 Carbon 5th Gen. I switched to MATE because the compositing window manager was a bit flaky graphically.

So it does sound like something GPU-related.

@redbeardcreator

I can confirm @reedobrien's statement about WiFi. But I also have an nVidia card that doesn't report the temperature with the Nouveau drivers.

One thing that might be useful (for Linux at least) is to be able to specify which temperature sensors to use. For instance, on my laptop, I could use the custom Dell ones, which include the CPU, two ambient sensors, and a fourth, unidentified sensor. I suppose you could go so far as to say which sensors turn on and off dynamically.
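
A minimal sketch of that idea, assuming a hypothetical TEMP_CHIPS setting (Liquid Prompt has no such option in this thread). It relies on the fact that sensors accepts chip names as arguments and then reads only those chips, so errors from unselected chips never appear.

```shell
# Hypothetical setting: space-separated chip names to read.
# Leave it empty to read every chip.
TEMP_CHIPS="coretemp-isa-0000"

# Print the hottest tempN_input among the selected chips, in whole degrees.
# Prints nothing if no temperature could be read.
read_selected_temp() {
    # $TEMP_CHIPS is intentionally unquoted so each chip name becomes
    # its own argument to sensors.
    sensors -u $TEMP_CHIPS |
        awk '
            $1 ~ /^temp[0-9]+_input:$/ {
                t = $2 + 0
                if (!found || t > hottest) { hottest = t; found = 1 }
            }
            END { if (found) printf "%d\n", hottest }
        '
}
```

Errors are deliberately not redirected here: if a chip the user explicitly selected cannot be read, that message is probably one they want to see.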

@20uf

20uf commented Jan 16, 2018

I had the same error; installing acpi solved it for me:

sudo apt install acpi
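
For reference, a minimal sketch of reading the temperature from acpi rather than sensors. It assumes the acpi -t line format pasted earlier in this thread ("Thermal 0: ok, 50.0 degrees C"); the function name acpi_max_temp is hypothetical and this is not how Liquid Prompt itself parses the output.

```shell
# Hypothetical helper: print the hottest thermal zone reported by `acpi -t`,
# truncated to whole degrees. Prints nothing if no zone is reported.
acpi_max_temp() {
    # LANG=C keeps the wording of the output predictable.
    LANG=C acpi -t |
        awk '
            # Fields: "Thermal" "0:" "ok," "50.0" "degrees" "C"
            /^Thermal [0-9]+:/ {
                t = $4 + 0
                if (!found || t > hottest) { hottest = t; found = 1 }
            }
            END { if (found) printf "%d\n", hottest }
        '
}
```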

@GameKyuubi

GameKyuubi commented Mar 19, 2018

This problem really ruins the terminal experience, and I'm of the opinion that it's not liquidprompt's job to tell me that there's a problem with some unrelated part of my setup. It's liquidprompt's job to make the terminal more usable, not less in most circumstances. "What you need, when you need it" is the motto, is it not? This is not something I need to see every single time I perform a command, and it may not even be fixable since the error in my case stems from sensors -u trying to read the temp of a virtual wifi adapter (wtf) which could be related to my Windows VM. Installing acpi did nothing, and quite frankly I don't have the time or patience to hunt down what is causing this "problem" (if it's anything at all!) when I just want my terminal to be more usable. This is a problem with sensors or an unrelated part of a user's setup (virtual wifi adapter lol) which doesn't even need a temperature reading anyway. I would much rather the temperature readings in liquidprompt be inaccurate and leave it up to me to try to fix it (or choose not to fix it and just disable temp readings) than this problem which basically makes liquidprompt unusable until the source is edited. It's a small edit but not something the user should be expected to do since it's not in the config file. The message ERROR: Can't get value of subfeature temp1_input: Can't read should display either once when the bash instance starts or not at all. Displaying it every time a command is issued is unacceptable.

My system is AMD Ryzen 1800X, 2x GTX 1080 Ti, Arch

@Rycieos
Collaborator

Rycieos commented Mar 19, 2018

@GameKyuubi please remember that this project is open source and free.

https://github.com/nojhan/liquidprompt/blob/9c80396021a8106bfaeade9a1ea51b85152e951d/LICENSE#L589-L596

Contributors try to make the project bug free, but we have limited time and resources, and we don't have access to all the different machines and configurations that could cause issues. You can open a pull request if you want to help out.

If you don't want temperature display at all, that module can be disabled (removing the error) with the config option LP_ENABLE_TEMP.
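
As a small illustration of that option, the setting would go in the Liquid Prompt config file. The file locations in the comment are the usual ones but should be treated as an assumption for your installation.

```shell
# In ~/.liquidpromptrc or ~/.config/liquidpromptrc (location assumed;
# check where your installation reads its config from).
# 0 disables the temperature display, which also removes the error.
LP_ENABLE_TEMP=0
```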

@GameKyuubi does bring up a good point: could we show the error once when the prompt starts, then suppress it after that? Right now we do the opposite; suppressing the error when the prompt starts, then showing it after that:
https://github.com/nojhan/liquidprompt/blob/9c80396021a8106bfaeade9a1ea51b85152e951d/liquidprompt#L1545-L1546
https://github.com/nojhan/liquidprompt/blob/9c80396021a8106bfaeade9a1ea51b85152e951d/liquidprompt#L1569-L1572

I agree with @dolmen that we shouldn't be hiding helpful errors, but since sensors seems to be bugged and this is a widespread issue, maybe we need a bit of a filter.
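
A minimal sketch of the "show the error once, then suppress it" idea, under stated assumptions: the function and variable names are hypothetical, this is not the code at the links above, and it uses mktemp on every call, which a real prompt would likely want to avoid.

```shell
# Hypothetical state: set once a sensors error has been shown in this shell.
_temp_error_shown=
# Hypothetical result variable: stdout of the last `sensors -u` call.
_temp_output=

# Run `sensors -u`, store its stdout in $_temp_output, and forward its
# stderr only the first time an error occurs in this shell session.
read_sensors_once_noisy() {
    _temp_errfile=$(mktemp) || return 1
    # Only sensors runs in the command substitution, so the flag below is
    # set in the calling shell and survives until the next prompt.
    _temp_output=$(sensors -u 2>"$_temp_errfile")
    if [ -s "$_temp_errfile" ] && [ -z "$_temp_error_shown" ]; then
        cat "$_temp_errfile" >&2
        _temp_error_shown=1
    fi
    rm -f "$_temp_errfile"
}
```

The function must be called directly rather than inside $(...): a command substitution runs in a subshell, so the "already shown" flag would be lost and the error would reappear on every prompt.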

@adamryczkowski

I sympathize with @dolmen, but the proper fix is to fix the lm-sensors configuration so that it ignores the troublesome chip.

For the iwlwifi-virtual-0 virtual device, do the following:

Create the file /etc/sensors.d/iwlwifi with the following contents:

chip "iwlwifi-virtual-*"
    ignore temp1

For other devices, replace iwlwifi-virtual-* with the name of whichever chip throws errors when you run sensors -u.

@dolmen
Collaborator

dolmen commented Aug 28, 2018

Thanks @adamryczkowski for this great tip. We should definitely add this to the documentation.

@nocive

nocive commented Nov 12, 2018

Thanks @adamryczkowski, that was definitely the proper fix for me 👍

@Rodsevich

Rodsevich commented Oct 16, 2019

My error is:

ERROR: Can't get value of subfeature in0_input: Can't read
ERROR: Can't get value of subfeature in1_input: Can't read

I did the following:
sensors -u

...
amdgpu-pci-0800
Adapter: PCI adapter
vddgfx:
ERROR: Can't get value of subfeature in0_input: Can't read
vddnb:
ERROR: Can't get value of subfeature in1_input: Can't read
temp1:
...
ERROR: Can't get value of subfeature in0_input: Can't read
ERROR: Can't get value of subfeature in1_input: Can't read

From that output, take the chip name and the failing features (amdgpu, in0, and in1 here) and write the following:

#/etc/sensors.d/amdgpu
chip "amdgpu-*"
ignore in0
ignore in1
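
When it is not obvious which chip to put in the chip line, a sketch like the following may help. The error text names only the subfeature (for example temp1_input), not the chip, so this queries each chip on its own and reports the ones that print an error. The function name find_failing_chips is hypothetical; it assumes the sensors -u layout shown in this thread and that sensors accepts a chip name as an argument.

```shell
# Hypothetical helper: print "<chip>: <error line>" for every chip whose
# raw read produces an ERROR message on stderr.
find_failing_chips() {
    # First pass: collect chip names (the line before each "Adapter:" line).
    sensors -u 2>/dev/null |
        awk '/^Adapter:/ { print prev } { prev = $0 }' |
        while IFS= read -r chip; do
            # Second pass: read one chip at a time, keep only its stderr.
            sensors -u "$chip" 2>&1 >/dev/null |
                awk -v chip="$chip" '/^ERROR:/ { print chip ": " $0 }'
        done
}
```

Each reported chip and subfeature can then be turned into a chip/ignore stanza like the ones above (temp1_input becomes ignore temp1, in0_input becomes ignore in0).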

@jangorecki

Are there any plans to resolve this? The PR looks like a good way to address the problems discussed here.

@Rycieos
Collaborator

Rycieos commented Sep 7, 2020

This is by far the project's most active issue. It also should be my number 1 priority, but you will have to read on to see what the problem is.

Like I said, this is such a crazy common problem that I have added it to the FAQ. I'm going to try to keep the discussion focused on one issue thread, instead of responding to every single duplicate issue (#598, #611). There are also many related issues (#448, it doesn't make sense to include CPU and other sources together) (#610, there can be sensors that are grossly incorrect).

There seem to be 3 separate but related issues here:

  1. A device always errors when its temperature is read.
  2. A device errors when its temperature is read while the device is switched off (see @Rmano's comment and @reedobrien's comment). This seems common in laptops where the desktop manager dynamically disables a dedicated GPU or the user disables the WiFi card.
  3. A device always returns a grossly incorrect number (see Debug mode? #610).

By far, the most suggested fix is to redirect stderr from sensors to null. Both @dolmen (in the past) and I have been strongly against that. While it might seem that there won't be any side effects, lm-sensors is not an install-and-use tool. It needs configuration. If we hid all errors all the time, a new user of lm-sensors would not realize that the temps they are seeing in Liquidprompt are incorrect, or even worse, get no temps showing at all, and without an error message, have no idea why or how to fix it. We would see quite a few issues coming in that would be much harder to diagnose than this specific issue.

The other 2 solutions that I have suggested in the FAQ are to install acpi instead (which I haven't heard much feedback on except for this comment, so it worked for at least 5 people), or to configure lm-sensors to ignore a specific sensor. The latter works for a lot of people, but only for cases 1 and 3 above, not for case 2, as people still want that sensor; they just don't want the error when it is offline. Maybe some people don't care and are fine ignoring it, but I'm sure there are some who will want it.

This leads me to my preferred fix: lm-sensors should fix the issue of printing an error when the device is simply offline. @jnweiger opened a ticket with them (lm-sensors/lm-sensors/issues/229), but it has no replies. Would anyone with the problem volunteer to report it as an issue with their distro? I don't have the "device switched off" problem on any of my machines, so I can't report a bug with my distros.

Since that hasn't happened yet, my next preferred fix is to print the error once on prompt startup, and then suppress it from then on. Right now Liquidprompt does the opposite: it suppresses at startup (since it is testing if lm-sensors is working correctly), then prints errors on each prompt gen. This has its own issues: not only would this make testing lm-sensors more difficult, I know from experience that an error on each prompt startup is almost as annoying as on each prompt.

So, I'm looking for suggestions. I want this fixed, but so far I'm not happy with any of the solutions.

@jangorecki

Why not just make temperature readings an opt-in feature then? Any idea about percentage of users that are actually needing this feature?

@Rycieos
Collaborator

Rycieos commented Sep 7, 2020

Why not just make temperature readings an opt-in feature then?

Not a terrible idea, but would have to be something changed in v2.0. Good to point out that users can always disable it if they don't want it. Most users reporting this issue actually do want temperature I think.

Any idea about percentage of users that are actually needing this feature?

We don't collect metrics, so I have no idea. Since it's enabled by default, probably quite a few, but probably less than you might think since you need to have lm-sensors or acpi installed for anything to work, and most distros don't ship with those.

@jnweiger
Contributor

jnweiger commented Sep 7, 2020

I like to see the temperature there too.
In my opinion, liquidprompt should definitely hide the error message from sensors. liquidprompt relies on an external tool with a known defect. Fixing that depends on other parties to act.

From the discussion above, I understand that some users said they want to see error messages from the tools called by liquidprompt. In this case it is a message that is printed with every prompt. I find that intrusive to normal typing and not helpful, as it is seen out of context. I'd rather go back and discuss the origin of this 'never hide errors' concept.

@Rycieos
Collaborator

Rycieos commented Sep 7, 2020

I'd rather go back and discuss the origin of this 'never hide errors' concept.

Liquidprompt hides errors all the time. A search for /dev/null can show that.

The issue here is hiding useful errors. Yes, often they are not useful, as these many users are reporting, but that doesn't change that sometimes these specific errors are useful. And if we hide those errors all the time, that is a big problem for those times when the error is helpful.

This is why I want a better solution. We should hide the error when it is not helpful (when the user doesn't care) and show it when it is. Obviously this is difficult, which is why we don't have an acceptable solution yet. This is why my ideal solution is that lm-sensors fixes their bug, since it is easiest to tell there what is useful or not.

@jnweiger
Contributor

jnweiger commented Sep 7, 2020

@Rycieos Sorry for over-generalizing. I did not check how other tools are handled.
Regarding lm-sensors, I doubt that they are in any better position to tell a useful error from a not useful error.
Usefulness depends on context and expectations. They may have more technical context, but no post-processing context (aka expectations).

If their config labeled temp1_input as CPU, then liquidprompt could alert the user in a more meaningful way. Without labels we cannot distinguish between GPU, CPU, power supply, and unused values.

Two ideas:

  • Silence stderr of sensors -u. lm-sensors/lm-sensors#277 aims at bringing this specific error message back inline into stdout, leaving stderr for other 'useful' errors.
  • Treat values with default names temp\d+_input as less important and hide their errors. To see errors, users should opt in by assigning labels via /etc/sensors.d/*.conf files.

@Rycieos
Collaborator

Rycieos commented Sep 8, 2020

@jnweiger I had forgotten that labels are a supported feature of sensors; it had obviously been a while since I read man 5 sensors.conf. I added this to my config:

chip "k10temp-pci-00c3"
    label temp1 "CPU"

and got this out:

$ sensors -u
k10temp-pci-00c3
Adapter: PCI adapter
CPU:
  temp1_input: 19.250
  temp1_max: 70.000
  temp1_crit: 75.000
  temp1_crit_hyst: 74.000

It labels the first part of the section, but not the temp1_input line, which is what Liquidprompt is looking for. I think this is because of the description of the -u flag:

Raw output. This mode is suitable for debugging and for post-processing of the output by scripts. It is also useful when writing a configuration file because it shows the raw input names which must be referenced in the configuration file.

Not ideal, though maybe we could add more processing to the output stream. Anyway, good thought on labels, something to keep in mind.

The problems I see with them: we shouldn't completely rely on them as most users won't want to add configuration to get this feature to work. And besides that, some users won't be able to, as the /etc/sensors.d/ location needs admin access on the machine.

@jnweiger
Contributor

jnweiger commented Sep 8, 2020

Ah, I had looked at sensors output without -u. There we only see the label. With -u we actually see both, the label and the raw name. That is even better!
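
A minimal sketch of the extra processing this would need: pairing each label line with the tempN_input value that follows it. The function name label_temps is hypothetical, and the parsing assumes the sensors -u layout pasted above (an unindented "label:" line followed by indented subfeature lines).

```shell
# Hypothetical helper: read `sensors -u` output on stdin and print
# "<label>=<value>" for each tempN_input line.
label_temps() {
    awk '
        # A feature header is an unindented line ending in ":" (e.g. "CPU:").
        # "Adapter: ..." lines do not end in ":", so they are not matched.
        /^[^ ].*:$/ { label = substr($0, 1, length($0) - 1); next }
        $1 ~ /^temp[0-9]+_input:$/ { printf "%s=%s\n", label, $2 }
    '
}
```

With the k10temp output above, sensors -u | label_temps would print CPU=19.250. A feature without a custom label would show up under its default name, such as temp1.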

@jkugler

jkugler commented Sep 9, 2020

Another data point. This isn't just an issue with temperatures. On my laptop, I am getting ERROR: Can't get value of subfeature curr1_input: Can't read

It has to do with the current input for the laptop. When just running sensors it simply prints

curr1:            N/A  

with no error message.

However, when run with sensors -u we see the error message ERROR: Can't get value of subfeature curr1_input: Can't read

So, in this case, it isn't an error message really...or maybe it's a bug in the sensors package?

When the power is plugged in, I don't get an error, and I get a value for curr1.

@sabuhigr

The workaround doesn't help me. I get an error from a PCI adapter, like the following:

p8p1-pci-0200
Adapter: PCI adapter
PHY temperature:
ERROR: Can't get value of subfeature temp1_input: Can't read
root@corei7:/home/tux# cat /etc/sensors.d/p8p1-pci-0200
chip "p8p1-pci*"
    ignore temp1

@Rmano's method works for me for hiding errors like that:

root@corei7:/home/tux# for i in $(sensors -u 2> /dev/null);do echo $i;done | awk NR==10
temp1_max:
root@corei7:/home/tux# for i in $(sensors -u 2> /dev/null);do echo $i;done | awk NR==11
86.000

@Rycieos removed the wontfix label Dec 10, 2020
@Rycieos added this to the v2.0 milestone Dec 10, 2020
@Rycieos
Collaborator

Rycieos commented Dec 10, 2020

Fixed in 4a52696. See the commit message for the full details.

I apologize, I was an idiot on this one.
