"stuck" temperature reading #13

Open
corsac-s opened this issue Sep 29, 2022 · 17 comments
Labels
help wanted (Extra attention is needed), Stalled (This issue is stalled or blocked for some reason)

Comments

@corsac-s

Hi, I noticed I sometimes get the same temperature reading for multiple hours using the TEMPer2 external sensor:

pcsensordual_TEMPer2-day

I'm unsure if it's something in the firmware or in the way it's queried with Python. It looks a bit like that “old” issue in a different project: padelt/temper-python#61

I'm not sure anything can be done at the software level, but if other people are experiencing this or have a workaround, I'd be interested.

@eode
Collaborator

eode commented Nov 23, 2022

I've sent a message to the manufacturer, but am doubtful of getting a response.

@corsac-s
Author

> I've sent a message to the manufacturer, but am doubtful of getting a response.

Thanks! I'm not holding my breath indeed.

> Considering the following quote from #9 applies to this issue:
>
> I haven't run into this issue -- how frequent is it? Do you have a repeatable case we can use for testing?

Honestly I don't know. I run temper.py on various boxes during munin runs to graph temperature over time. So I don't really have a repeatable case besides running it every 5 minutes and checking whether the graphs look correct.

Over the last week there was an example of the external sensor being stuck around 11°C here:
image

> In the meantime, I've created the branch stuck_temp_fix that you can check out.
>
> Please let me know if it helps. If it does, I'll optimize it and include it (at least as a configurable option). It will probably be disabled by default, because it does incur a +50% overhead in run time for each device -- or, if optimized, would still incur a lot of irregularity in the call time.

Yes I'll report back. I don't think I'm too bothered by the overhead but I can understand it's an issue.

@eode
Collaborator

eode commented Nov 23, 2022

@corsac-s Thanks. It getting stuck like that is pretty weird, and is probably a hardware issue (considering it shows up in other projects as well). It looks like it's happening often enough that it should be confirmable in a week or two.

That branch automatically uses the reset, so you don't need any particular cli or function arguments.

Oh, BTW, just confirming relative to the PR and README -- the temper-hum 3.9 doesn't support temperature, just humidity, right?

@corsac-s
Author

> @corsac-s Thanks. It getting stuck like that is pretty weird, and is probably a hardware issue (considering it shows up in other projects as well). It looks like it's happening often enough that it should be confirmable in a week or two.

Yes, I'll report back

> Oh, BTW, just confirming relative to the PR and README -- the temper-hum 3.9 doesn't support temperature, just humidity, right?

I don't know, I only have Temper2 and TemperGold so no humidity sensor.

@eode
Collaborator

eode commented Nov 23, 2022

Ah. Was adding another one, I think it's my typo then.

@corsac-s
Author

> @corsac-s Thanks. It getting stuck like that is pretty weird, and is probably a hardware issue (considering it shows up in other projects as well). It looks like it's happening often enough that it should be confirmable in a week or two.
>
> Yes, I'll report back

I'll let it run a bit more but unfortunately it doesn't seem fixed:
image

@eode
Collaborator

eode commented Nov 26, 2022 via email

@eode
Collaborator

eode commented Nov 26, 2022

Still, since it's consistently the same incorrect temperature, we might be able to at least increase the accuracy by making it really work hard if it wants to return that value.

This would be a significant lag spike, so it would need to be optional, but if we get that specific value, we could poll it N times and see if it only returns that value or if it occasionally returns another one. Then, if it occasionally returns another one, use that value instead.

Of course, this depends on the behavior of the device, and I don't really know if it would vary, or if it's solidly out of commission during those times.
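
A minimal sketch of the re-polling idea described above, assuming the device can be read through some zero-argument callable. The names `read_with_recheck`, `read_temperature`, `suspect_value`, `attempts` and `tolerance` are hypothetical and are not part of temper.py:

```python
from typing import Callable


def read_with_recheck(
    read_temperature: Callable[[], float],
    suspect_value: float,
    attempts: int = 5,
    tolerance: float = 0.005,
) -> float:
    """Re-poll the sensor when the first reading equals a known 'stuck' value.

    read_temperature is a hypothetical stand-in for whatever function
    actually queries the device.
    """
    first = read_temperature()
    if abs(first - suspect_value) > tolerance:
        # Normal reading: return immediately, no extra polling cost.
        return first
    # Suspect reading: poll up to `attempts` more times and prefer
    # the first value that differs from the suspect one.
    for _ in range(attempts):
        value = read_temperature()
        if abs(value - suspect_value) > tolerance:
            return value
    # Every poll agreed, so the suspect value may simply be correct.
    return first
```

The extra polls only happen when the suspect value shows up, which is why the cost appears as an occasional lag spike rather than a constant overhead. Whether the device ever returns a different value while it is stuck is exactly the open question.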

@corsac-s
Author

So after a few days:
image

If you have ideas on how to debug or poke the device and can't reproduce it on yours, don't hesitate to ask; I can definitely run experimental code here.

@eode
Collaborator

eode commented Jan 12, 2023

Still on my radar, but I haven't had time recently to work on it.

I haven't heard back from PCSensor.

The only thing I can think of that might work (and I doubt it) is we can poll it multiple times if the result we get is 11c, and see if it gets anything else. But by the graphs, I'm not too hopeful about that.

@corsac-s
Author

corsac-s commented Jan 18, 2023

> The only thing I can think of that might work (and I doubt it) is we can poll it multiple times if the result we get is 11c, and see if it gets anything else. But by the graphs, I'm not too hopeful about that.

Yeah, I don't think it'll work; it gets stuck at multiple temperatures: 11C but also around 19C (not the same device):

image

I'm not sure what makes the thing “unstuck”.

@eode added the help wanted and Stalled labels on May 13, 2023
@eusoubrasileiro

eusoubrasileiro commented Sep 17, 2024

I have the same issue, @eode and @corsac-s, with a TEMPer2 v4.1.
I have been using it for a year already, and I got used to "fixing" it using a rolling window with pandas.
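
A rolling-window filter of this kind might look like the sketch below. It is not the commenter's actual code: the function name `mask_flat_windows`, the default window length and the choice to interpolate afterwards are all assumptions made for illustration.

```python
import pandas as pd


def mask_flat_windows(series: pd.Series, window: int = 12) -> pd.Series:
    """Blank out readings that end a window of `window` identical samples.

    With one sample every 5 minutes, window=12 means "unchanged for an hour".
    """
    roll = series.rolling(window)
    # A window is flat when its max equals its min. The first window-1
    # positions have no full window (NaN), so they are never flagged.
    flat = roll.max() == roll.min()
    return series.mask(flat)


# Hypothetical readings with a short stuck run at 22.56.
temps = pd.Series([22.5, 22.56, 22.56, 22.56, 22.56, 22.62, 22.69])
cleaned = mask_flat_windows(temps, window=3).interpolate()
```

The first `window - 1` samples of a stuck run are left untouched, because at that point there is no way to tell a stuck sensor from a stable room. A real temperature that genuinely sits still would also be masked, so the window length is a trade-off.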

I have a sqlite database with thousands of hours of readings with these kinds of errors.
Just updating: I made this new picture with the 150k readings I have in my database. The issue is with both sensors, in and out.
df[['temp_in', 'temp_out']].plot(figsize=(8,5), marker='.', markersize=0.8, alpha=0.6, linestyle='none', ylim=[22, 30])

image

Look at those lines of preferred readings for both sensors, in (internal) and out (external).
Exactly like what happened in this old issue. That's crazy...

Also, here is an example from last night, from 21:00 to 7 am this morning. temp_in and temp_out are from the TEMPer. temp_zb is from a Zigbee thermometer.

image

ChatGPT 4.0 suggested it could be an electrical or manufacturing defect that makes it 'fall back' to something close to a default value...

This site here points to some issues on low-powered systems; not sure if it's related. I use mine on an Orange Pi 5.

Any updates from the manufacturer?

@Taomyn

Taomyn commented Sep 17, 2024

Having similar issues with my sensors. I had an old TEMPer1V1.2 that I thought had gone defective when it got stuck at the same temperature (I think it was 23.5°C) unless the temperature went up. It was on a standard AMD PC, so I tried switching to another system, an RPI-4, and it did the same at the same temperature. So I ordered another, this time a TEMPer2_V4.1 with an external sensor, and it does the same for the external sensor, which is stuck at 22.56°C. The strange thing is that I have another sensor identical to the TEMPer1V1.2, and I now notice it also gets stuck, but for shorter periods, at 23.44°C.

image

@eusoubrasileiro

eusoubrasileiro commented Sep 19, 2024

I suspect it's related to this part of the code and this issue. I have changed the timeout values to 1 second and 2 seconds for temperature readings. In fact this will only delay the readings between the firmware and the temperature, but I suspect the problem is related to a bad response time from the sensor... I've also increased my readings to every 2 minutes...
I'm just taking shots in the dark... it didn't work.

@eode
Collaborator

eode commented Sep 20, 2024

This issue is one I've just about given up on, and I'm probably going to move to using a different sensor than pcSensor. But, I'm still not absolutely certain it's a hardware issue.

Thanks for exploring the issue @eusoubrasileiro . Your graphs are really enlightening.

What we see here looks like binary truncation. That is, the number is rounded down to a lower value, because some amount of the data is lopped off. You can see this in the graphs by both the solid bands and the clear spaces above the bands. The problem is that it doesn't seem to be consistently truncated, which smacks of a read error. But it occurs in situations where both the internal and the external temperature are present, which strongly smacks of not being a read error.

I'm starting to suspect that the second byte, which holds the fractional Celsius degrees, is not read correctly. Either this was an error in the original C code, or it was introduced in this library when it was translated out of the original C. But another possibility is that the fractional degree byte isn't populated correctly by the pcSensor hardware.
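
To illustrate what losing part of the fractional byte would do, here is a small sketch. The byte layout is an assumption (a signed big-endian 8.8 fixed-point value, in the style of LM75-type sensors) and may not match what this library or the device actually uses; `decode_fixed_point` and `decode_truncated` are hypothetical names.

```python
def decode_fixed_point(high: int, low: int) -> float:
    """Decode a signed 8.8 fixed-point temperature (assumed layout).

    `high` is the whole-degree byte, `low` the fractional byte.
    """
    raw = int.from_bytes(bytes([high, low]), byteorder="big", signed=True)
    return raw / 256.0


def decode_truncated(high: int, low: int, kept_bits: int = 0) -> float:
    """Same decode, but keep only the top `kept_bits` bits of the fractional byte."""
    mask = (0xFF << (8 - kept_bits)) & 0xFF
    return decode_fixed_point(high, low & mask)


# The same two bytes decoded with the full fraction, with only the top
# fractional bit kept, and with the whole fractional byte lost.
full = decode_fixed_point(0x16, 0x90)
half = decode_truncated(0x16, 0x90, kept_bits=1)
none = decode_truncated(0x16, 0x90, kept_bits=0)
```

Under this assumed layout, every reading between two surviving grid points collapses onto the lower one, which would produce a solid band with empty space above it, as in the graphs.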

@eusoubrasileiro

eusoubrasileiro commented Sep 20, 2024

Thanks a lot for your answer @eode and for the honor to receive it. And congratulations for the awesome project!
I will reread your answer many many times I am sure...

I still think like you too. The Windows software seems to work (although I never used it).

On this approach, yesterday I downloaded this old C code (https://github.com/shakemid/pcsensor-temper).
Unfortunately, even after I modified it to include my TEMPER_TYPE entry
{ 0x3553, 0xa001, "TEMPer2_V4.1", 1, 2, 0, decode_answer_fm75 }, // TEMPer2* eg. TEMPer2V4.1
and managed to compile and run it, I can't read the temperature...

I got errors related to USB reading: "Couldn't find the USB device, Exiting: 0". Maybe the device descriptor is wrong? Maybe the HID protocol? I understand almost nothing of this... even with ChatGPT4 I haven't yet managed to understand this subject, which seems quite complex with libusb.

Well if someone has time maybe that old piece of C, or even better your original code (you translated), could help us on this.

I'll certainly explore your idea of binary truncation using some LLM like ChatGPT4 or Claude.
Thanks a thousand!

@eusoubrasileiro

eusoubrasileiro commented Sep 20, 2024

For the 150k readings I have, I filtered for the densest region, 22 to 30 degrees (I'm from Brazil, a tropical country).

dfiltered = df[(df['temp_out'] < 30) & (df['temp_out'] > 22)]
unique = dfiltered.temp_out.unique()
unique.sort()
np.diff(unique)
array([0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07,
       0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06,
       0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06,
       0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06,
       0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07,
       0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06,
       0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06,
       0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06,
       0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07,
       0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06,
       0.07, 0.06, 0.06, 0.06, 0.07, 0.06, 0.06, 0.06, 0.19, 0.06, 0.07,
       0.06, 0.06, 0.06])

The max is 29.93 and the min is 22.06. If we calculate the step with 125+3 unique counts (added 3 because of the 0.19 gap), we get ~0.0620. My sensor is a TEMPer2_V4.1; from what I found on the internet, it uses an FM75.
From there, it says 0.0625 is one of the supported resolutions, and that matches exactly the unique values from 22.06 to 29.93 for the samples above.

The exercise above explains the fixed interval between the samples above...
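
The 0.0625 grid can be checked against the printed differences with a short simulation. This is only a sketch: the 0.0625 step is the resolution quoted above, and the assumption that logged values are truncated (not rounded) to two decimals is a guess that happens to fit min 22.06 and max 29.93. `STEP` and `simulated_readings` are hypothetical names.

```python
import numpy as np

STEP = 0.0625  # assumed sensor resolution in degrees Celsius


def simulated_readings(low: float = 22.0625, high: float = 29.9375) -> np.ndarray:
    """Every multiple of STEP between low and high, cut to two decimals."""
    first = int(round(low / STEP))
    last = int(round(high / STEP))
    exact = np.arange(first, last + 1) * STEP
    # Assumption: the logged value is truncated, not rounded, to two decimals.
    return np.floor(exact * 100) / 100


values = simulated_readings()
diffs = np.round(np.diff(values), 2)
print(diffs[:8])
print(len(values), values[0], values[-1])
```

Under these assumptions the simulated differences repeat as 0.06, 0.06, 0.07, 0.06, the same pattern as the `np.diff(unique)` output, and the range 22.06 to 29.93 spans 126 steps of 0.0625. A single missing pair of grid values would show up as one larger gap, like the 0.19 in the data.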

But the mystery of the preferred or "stuck" values is still open...

4 participants