1-wire doesn't work when using pigpiod #60

Closed
Mausy5043 opened this issue Apr 26, 2016 · 22 comments

@Mausy5043
Contributor

I'm having issues with pigpiod in combination with 1-wire.
When pigpiod is not running, the 1-wire sensor (DS18B20) can be read without any problems.

However, when I'm also reading measurements from a DHT22 -- using pigpiod and the DHTXXD executable -- I find that the 1-wire sensor file (w1_slave) occasionally disappears from /sys/bus/w1/devices/28*/. (DS18B20 id: 28-000006978d28)

Ultimately, this ends in a kernel "Oops" like this (sorry, the journalctl lines seem to have been truncated):

Apr 26 18:49:24 rbagain kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004
Apr 26 18:49:24 rbagain kernel: pgd = dd3e0000
Apr 26 18:49:24 rbagain kernel: [00000004] *pgd=1d2da831, *pte=00000000, *ppte=00000000
Apr 26 18:49:25 rbagain kernel: Internal error: Oops: 817 [#1] PREEMPT ARM
Apr 26 18:49:25 rbagain kernel: Modules linked in: cpufreq_stats nfsd nfs_acl rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 dns_res
Apr 26 18:49:25 rbagain kernel: CPU: 0 PID: 841 Comm: python Not tainted 3.18.0-trunk-rpi #1 Debian 3.18.5-1~exp1+rpi19
Apr 26 18:49:25 rbagain kernel: task: dd003200 ti: dc71e000 task.ti: dc71e000
Apr 26 18:49:25 rbagain kernel: PC is at w1_slave_show+0x2d8/0x398 [w1_therm]
Apr 26 18:49:25 rbagain kernel: LR is at vsnprintf+0x294/0x414
Apr 26 18:49:25 rbagain kernel: pc : [<bf02336c>]    lr : [<c02c8544>]    psr: 80000013
                                sp : dc71fe08  ip : bf0235e0  fp : dc71fe54
Apr 26 18:49:25 rbagain kernel: r10: dc71fe27  r9 : 00000017  r8 : dc71fe27
Apr 26 18:49:25 rbagain kernel: r7 : dd3d4650  r6 : dd9e7000  r5 : 00000fe5  r4 : 00000000
Apr 26 18:49:25 rbagain kernel: r3 : 00000000  r2 : 1007ff7f  r1 : 464b0129  r0 : 0000000d
Apr 26 18:49:25 rbagain kernel: Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Apr 26 18:49:25 rbagain kernel: Control: 00c5387d  Table: 1d3e0008  DAC: 00000015
Apr 26 18:49:25 rbagain kernel: Process python (pid: 841, stack limit = 0xdc71e1b0)
Apr 26 18:49:25 rbagain kernel: Stack: (0xdc71fe08 to 0xdc720000)
Apr 26 18:49:25 rbagain kernel: fe00:                   bf0235e8 dc71fe18 00000001 dc5a1c94 dc949040 297a41cc
Apr 26 18:49:25 rbagain kernel: fe20: 7f464b01 171007ff c011ed24 dc7dd600 bf023704 ddbe4200 00000fff 00001000
Apr 26 18:49:25 rbagain kernel: fe40: dd9e7000 c0575284 dc71fe6c dc71fe58 c0336928 bf0230a0 dc7dd600 dd3d4658
Apr 26 18:49:25 rbagain kernel: fe60: dc71fe9c dc71fe70 c01bba34 c0336908 c01bb998 00002000 dc71fec0 dc7dd600
Apr 26 18:49:25 rbagain kernel: fe80: dd029b40 00000001 00000001 dc71ff78 dc71feac dc71fea0 c01ba2dc c01bb9a4
Apr 26 18:49:25 rbagain kernel: fea0: dc71fefc dc71feb0 c016d970 c01ba2b4 dd029b48 00000000 dc7dd630 be993744
Apr 26 18:49:25 rbagain kernel: fec0: 00000000 00000000 dc71fefc dc71fed8 c0240000 ddbe4200 be993744 be993744
Apr 26 18:49:25 rbagain kernel: fee0: dc71ff78 00002000 00002000 be993744 dc71ff3c dc71ff00 c01badf0 c016d7b8
Apr 26 18:49:25 rbagain kernel: ff00: 571f9b6c 00000000 00000022 00000003 00000000 dd029b40 be993744 dc71e000
Apr 26 18:49:25 rbagain kernel: ff20: dc71ff78 00002000 dc71e000 be993744 dc71ff74 dc71ff40 c01498f4 c01bacd8
Apr 26 18:49:25 rbagain kernel: ff40: dc71ff5c dc71ff50 c0166930 00000000 00000000 dd029b40 dd029b40 00002000
Apr 26 18:49:25 rbagain kernel: ff60: dc71e000 be993744 dc71ffa4 dc71ff78 c0149a3c c0149864 00000000 00000000
Apr 26 18:49:25 rbagain kernel: ff80: 00000000 00002000 ffffffff 00000003 c000f884 00000000 00000000 dc71ffa8
Apr 26 18:49:25 rbagain kernel: ffa0: c000f640 c01499fc 00000000 00002000 00000004 be993744 00002000 00000000
Apr 26 18:49:25 rbagain kernel: ffc0: 00000000 00002000 ffffffff 00000003 020ca738 00002000 00002000 be993744
Apr 26 18:49:25 rbagain kernel: ffe0: 00000000 be9936a4 b6d9b234 b6df290c 60000010 00000004 00000000 00000000
Apr 26 18:49:25 rbagain kernel: [<bf02336c>] (w1_slave_show [w1_therm]) from [<c0336928>] (dev_attr_show+0x2c/0x58)
Apr 26 18:49:25 rbagain kernel: [<c0336928>] (dev_attr_show) from [<c01bba34>] (sysfs_kf_seq_show+0x9c/0x120)
Apr 26 18:49:25 rbagain kernel: [<c01bba34>] (sysfs_kf_seq_show) from [<c01ba2dc>] (kernfs_seq_show+0x34/0x38)
Apr 26 18:49:25 rbagain kernel: [<c01ba2dc>] (kernfs_seq_show) from [<c016d970>] (seq_read+0x1c4/0x4a0)
Apr 26 18:49:25 rbagain kernel: [<c016d970>] (seq_read) from [<c01badf0>] (kernfs_fop_read+0x124/0x16c)
Apr 26 18:49:25 rbagain kernel: [<c01badf0>] (kernfs_fop_read) from [<c01498f4>] (vfs_read+0x9c/0x198)
Apr 26 18:49:25 rbagain kernel: [<c01498f4>] (vfs_read) from [<c0149a3c>] (SyS_read+0x4c/0x98)
Apr 26 18:49:25 rbagain kernel: [<c0149a3c>] (SyS_read) from [<c000f640>] (ret_fast_syscall+0x0/0x30)
Apr 26 18:49:25 rbagain kernel: Code: eb4a9705 e5173004 e51b2031 e51b1035 (e5832004)
Apr 26 18:49:25 rbagain kernel: ---[ end trace 028330aecbc655fa ]---

Some basic system info:

$ uname -a
Linux rbagain 3.18.0-trunk-rpi #1 PREEMPT Debian 3.18.5-1~exp1+rpi19 (2015-08-08) armv6l GNU/Linux
$ cat /etc/modules
bcm2708-rng
w1-therm
w1-gpio

source: http://abyz.co.uk/rpi/pigpio/code/DHTXXD.zip

DHTXXD.h : 2015-11-15
DHTXXD.c : 2016-02-16
$ pigpiod -v
51
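
For reference, a minimal Python sketch of a defensive read of the w1_slave file described above. It assumes the usual two-line w1_therm output (a CRC line ending in YES or NO, then a line carrying t= in millidegrees Celsius); the function names are hypothetical. It only guards against a missing file or a failed CRC and cannot protect against the kernel Oops itself.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the "t=<millidegrees>" field on the second line of w1_slave.
_TEMP_RE = re.compile(r"t=(-?\d+)")


def parse_w1_slave(text: str) -> Optional[float]:
    """Return the temperature in degrees Celsius, or None for a bad read."""
    lines = text.strip().splitlines()
    if len(lines) < 2 or not lines[0].rstrip().endswith("YES"):
        return None  # truncated output, or the driver reported a CRC failure
    match = _TEMP_RE.search(lines[1])
    if match is None:
        return None
    return int(match.group(1)) / 1000.0


def read_ds18b20(device_dir: str) -> Optional[float]:
    """Read <device_dir>/w1_slave, treating a missing file as a failed read."""
    try:
        text = Path(device_dir, "w1_slave").read_text()
    except OSError:
        return None  # the file can vanish if the sensor drops off the bus
    return parse_w1_slave(text)


if __name__ == "__main__":
    # Sensor id taken from the report above.
    print(read_ds18b20("/sys/bus/w1/devices/28-000006978d28"))
```

Returning None instead of raising lets the calling script count read errors and carry on sampling.
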
@joan2937
Owner

That seems to be a 1-wire fault rather than a pigpio fault.

Which model Pi are you using? If I get a chance I'll try out a DS18B20.

@Mausy5043
Contributor Author

> That seems to be a 1-wire fault rather than a pigpio fault.

That would be very unfortunate.

> Which model Pi are you using? If I get a chance I'll try out a DS18B20.

Raspberry Pi 1B+.

@Mausy5043
Contributor Author

I was thinking maybe pigpiod and w1-therm/w1-gpio are fighting for access to pin 4?

@joan2937
Owner

pigpio does not change any GPIO unless requested. Sampling a GPIO level (whether it's an input or an output or in one of the other 6 modes) should not impinge on any other usage. There is no way a userland process like pigpiod should affect an unrelated kernel module. Personally I think the 1-wire module is a bit flaky.

@Mausy5043
Contributor Author

😞

@Mausy5043
Contributor Author

Alternative idea: Is there a way to use pigpio(d) to read a 1-wire sensor? Instead of using the kernel modules? Something like DHTXXD but for 1-wire?

@joan2937
Owner

Not with pigpio. What GPIO are you using for the DHT22?

@joan2937
Owner

I can't duplicate this failure. I have had two DS18B20s connected to GPIO 4 and a DHT22 connected to GPIO 11 running for an hour and forty minutes without a problem.

Did you connect your DHT22 to GPIO 4? A DHT22 is not a 1-wire bus device and I can see that would cause a problem.

@Mausy5043
Contributor Author

Mausy5043 commented Apr 27, 2016

@joan2937 : Thanks for your assistance. The DS18B20 is connected as follows:

# Sensor pin       : R-Pi B+ pin
# =================:==============
# VIN   (red)      : 01  = 3v3
# Data  (yellow)   : 07  = GPIO04 & R=4k7 --> 3v3
# GND   (blue)     : 09  = GND

The DHT22 is connected as follows:

# Wiring (facing frontside of DHT22, left to right):
# Sensor pin       : R-Pi B+ pin
# =================:==============
# PWR              : 01  = 3v3
# data (digital)   : 12  = GPIO18 & R=4k7 --> 3v3
# NC               : not connected
# GND              : 14  = GND

(I also have a BMP183 connected to the SPI pins but I'm not yet using that.)

The DS18B20 gives no problem when used by itself (pigpiod not started). It's been running all night sampling it every 12 seconds with no problems showing in the logs.

EDIT: This morning I (also) started the measurements from the DHT22. I'm using a Python script to call DHTXXD with subprocess.call(). The sampling time here is also 12 seconds.
EDIT2: Been running for an hour now. Two read errors on the 1-wire sensor in the last half hour.

@joan2937
Owner

I was using an original model Pi B (rev. 1). If anything it is less powerful than a Pi B+. You have quite old firmware (3.18.0). I was using 4.1.13+ #826 with Debian (soft float).

I'll set mine running again and leave it until it faults or for as many hours as I need to convince me it's not going to fault.

@Mausy5043
Contributor Author

Mausy5043 commented Apr 27, 2016

> I was using an original model Pi B (rev. 1). If anything it is less powerful than a Pi B+. You have quite old firmware (3.18.0). I was using 4.1.13+ #826 with Debian (soft float).

I'm using the official Raspbian kernel (not the Foundation one) as installed by raspbian-ua-netinst.

> I'll set mine running again and leave it until it faults or for as many hours as I need to convince me it's not going to fault.

Might take some time. It's been another hour here, without any problems. Yesterday (I seem to remember) it took the better part of the day until it "oopsed". The read failures on the 1-wire slowly started to increase. This could be indicative of a memory leak?

Do you think I might have more success using the Python version (your DHT22.py) instead of the C version (DHTXXD)? What's your opinion?

EDIT : The kernel just "Oops"ed again (see below). Since you are not having any problems with kernel 4.x I'm going to try rpi-update to upgrade the kernel and see how that works out.

Apr 27 16:25:26 rbagain kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004
Apr 27 16:25:26 rbagain kernel: pgd = dc94c000
Apr 27 16:25:26 rbagain kernel: [00000004] *pgd=1c7c7831, *pte=00000000, *ppte=00000000
Apr 27 16:25:26 rbagain kernel: Internal error: Oops: 817 [#1] PREEMPT ARM
Apr 27 16:25:26 rbagain kernel: Modules linked in: cpufreq_stats nfsd nfs_acl rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 dns_resolver nfs lockd grace sunrpc fscache cfg80211 rfkill snd_soc_wm8804 snd_soc_pcm512x_i2c snd_soc_tas5713 snd_soc_pcm512x regmap_spi regmap_i2c 8192cu snd_soc_bcm2708_i2s regmap_mmio snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer i2c_bcm2708 spi_bcm2708 snd soundcore w1_gpio w1_therm wire cn bcm2708_rng autofs4
Apr 27 16:25:26 rbagain kernel: CPU: 0 PID: 654 Comm: python Not tainted 3.18.0-trunk-rpi #1 Debian 3.18.5-1~exp1+rpi19
Apr 27 16:25:26 rbagain kernel: task: dd3e2d00 ti: dc914000 task.ti: dc914000
Apr 27 16:25:26 rbagain kernel: PC is at w1_slave_show+0x2d8/0x398 [w1_therm]
Apr 27 16:25:26 rbagain kernel: LR is at vsnprintf+0x294/0x414
Apr 27 16:25:26 rbagain kernel: pc : [<bf02336c>]    lr : [<c02c8544>]    psr: 80000013
                                sp : dc915e08  ip : bf0235e0  fp : dc915e54
Apr 27 16:25:26 rbagain kernel: r10: dc915e27  r9 : 0000004c  r8 : dc915e27
Apr 27 16:25:26 rbagain kernel: r7 : dc56a250  r6 : dd0cc000  r5 : 00000fe5  r4 : 00000000
Apr 27 16:25:26 rbagain kernel: r3 : 00000000  r2 : 1008ff7f  r1 : 464b0128  r0 : 0000000d
Apr 27 16:25:26 rbagain kernel: Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Apr 27 16:25:26 rbagain kernel: Control: 00c5387d  Table: 1c94c008  DAC: 00000015
Apr 27 16:25:26 rbagain kernel: Process python (pid: 654, stack limit = 0xdc9141b0)
Apr 27 16:25:26 rbagain kernel: Stack: (0xdc915e08 to 0xdc916000)
Apr 27 16:25:26 rbagain kernel: 5e00:                   bf0235e8 dc915e18 00000001 dc5c6094 dc7d96c0 2877d4e4
Apr 27 16:25:26 rbagain kernel: 5e20: 7f464b01 4c1008ff c011ed24 dd0121e0 bf023704 dd03f980 00000fff 00001000
Apr 27 16:25:26 rbagain kernel: 5e40: dd0cc000 c0575284 dc915e6c dc915e58 c0336928 bf0230a0 dd0121e0 dc56a258
Apr 27 16:25:26 rbagain kernel: 5e60: dc915e9c dc915e70 c01bba34 c0336908 c01bb998 00002000 dc915ec0 dd0121e0
Apr 27 16:25:26 rbagain kernel: 5e80: dd36b5a0 00000001 00000001 dc915f78 dc915eac dc915ea0 c01ba2dc c01bb9a4
Apr 27 16:25:26 rbagain kernel: 5ea0: dc915efc dc915eb0 c016d970 c01ba2b4 dd36b5a8 00000000 dd012210 be856744
Apr 27 16:25:26 rbagain kernel: 5ec0: 00000000 00000000 dc915efc dc915ed8 c0240000 dd03f980 be856744 be856744
Apr 27 16:25:26 rbagain kernel: 5ee0: dc915f78 00002000 00002000 be856744 dc915f3c dc915f00 c01badf0 c016d7b8
Apr 27 16:25:26 rbagain kernel: 5f00: 5720c9c4 00000000 00000022 00000003 00000000 dd36b5a0 be856744 dc914000
Apr 27 16:25:26 rbagain kernel: 5f20: dc915f78 00002000 dc914000 be856744 dc915f74 dc915f40 c01498f4 c01bacd8
Apr 27 16:25:26 rbagain kernel: 5f40: dc915f5c dc915f50 c0166930 00000000 00000000 dd36b5a0 dd36b5a0 00002000
Apr 27 16:25:26 rbagain kernel: 5f60: dc914000 be856744 dc915fa4 dc915f78 c0149a3c c0149864 00000000 00000000
Apr 27 16:25:26 rbagain kernel: 5f80: 00000000 00002000 ffffffff 00000003 c000f884 00000000 00000000 dc915fa8
Apr 27 16:25:26 rbagain kernel: 5fa0: c000f640 c01499fc 00000000 00002000 00000004 be856744 00002000 00000000
Apr 27 16:25:26 rbagain kernel: 5fc0: 00000000 00002000 ffffffff 00000003 00cba738 00002000 00002000 be856744
Apr 27 16:25:26 rbagain kernel: 5fe0: 00000000 be8566a4 b6dd3234 b6e2a90c 60000010 00000004 00000000 00000000
Apr 27 16:25:26 rbagain kernel: [<bf02336c>] (w1_slave_show [w1_therm]) from [<c0336928>] (dev_attr_show+0x2c/0x58)
Apr 27 16:25:26 rbagain kernel: [<c0336928>] (dev_attr_show) from [<c01bba34>] (sysfs_kf_seq_show+0x9c/0x120)
Apr 27 16:25:26 rbagain kernel: [<c01bba34>] (sysfs_kf_seq_show) from [<c01ba2dc>] (kernfs_seq_show+0x34/0x38)
Apr 27 16:25:26 rbagain kernel: [<c01ba2dc>] (kernfs_seq_show) from [<c016d970>] (seq_read+0x1c4/0x4a0)
Apr 27 16:25:26 rbagain kernel: [<c016d970>] (seq_read) from [<c01badf0>] (kernfs_fop_read+0x124/0x16c)
Apr 27 16:25:26 rbagain kernel: [<c01badf0>] (kernfs_fop_read) from [<c01498f4>] (vfs_read+0x9c/0x198)
Apr 27 16:25:26 rbagain kernel: [<c01498f4>] (vfs_read) from [<c0149a3c>] (SyS_read+0x4c/0x98)
Apr 27 16:25:26 rbagain kernel: [<c0149a3c>] (SyS_read) from [<c000f640>] (ret_fast_syscall+0x0/0x30)
Apr 27 16:25:26 rbagain kernel: Code: eb4a9705 e5173004 e51b2031 e51b1035 (e5832004)
Apr 27 16:25:26 rbagain kernel: ---[ end trace 358e9c373c3ff899 ]---

@joan2937
Owner

I am using DHTXXD rather than DHT22.py. I'm not monitoring bad 1-wire reads. I have no idea what a normal 1-wire error rate would be. I'm just waiting for a crash. It's been over 12 hours now.

@Mausy5043
Contributor Author

I decided to rpi-update the kernel.

  • In /boot/config.txt I commented out the kernel= and initramfs... lines and added dtoverlay=w1-gpio.
  • In /etc/modules I commented out w1-therm and w1-gpio.

I'm now on Linux rbagain 4.4.8+ #880 Fri Apr 22 21:27:42 BST 2016 armv6l GNU/Linux.

The 1-wire script is running flawlessly now (zero errors since 22:00 yesterday evening). I've found that when the 1-wire sensor is on a breadboard, noise from bad connections can disturb the 1-wire protocol. When soldered, however, the 1-wire protocol can be expected to be flawless, provided sample times are adhered to (obviously).

The DHTXXD binary now occasionally returns a time-out (output: 3 0.0 0.0). Between 02:00 and 09:00 today I've got 36 time-outs. So, the tables are turned 😉 Not a big problem, but something to keep an eye on.
The script running the DHTXXD keeps the measurements between 3 and 12 seconds apart.
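
A minimal Python sketch of how a calling script might treat these time-outs, for illustration only. The thread confirms only that "3 0.0 0.0" is a time-out; the field order (status, temperature, humidity) and status 0 meaning a good reading are assumptions, and every name below is hypothetical. The command to run is passed in by the caller rather than guessed here.

```python
import subprocess
import time
from typing import NamedTuple, Optional, Sequence

DHT_TIMEOUT = 3  # from the thread: "3 0.0 0.0" is a time-out
DHT_GOOD = 0     # assumption: status 0 marks a good reading


class DhtReading(NamedTuple):
    status: int
    temperature: float  # assumed to be the second field
    humidity: float     # assumed to be the third field


def parse_dhtxxd(output: str) -> Optional[DhtReading]:
    """Parse one line of DHTXXD output; return None if it is malformed."""
    fields = output.split()
    if len(fields) != 3:
        return None
    try:
        return DhtReading(int(fields[0]), float(fields[1]), float(fields[2]))
    except ValueError:
        return None


def read_dht(command: Sequence[str], retries: int = 3, pause: float = 3.0,
             run=subprocess.run) -> Optional[DhtReading]:
    """Run the command up to `retries` times, pausing between attempts."""
    for attempt in range(retries):
        if attempt:
            time.sleep(pause)  # keep readings a few seconds apart
        result = run(list(command), capture_output=True, text=True)
        reading = parse_dhtxxd(result.stdout)
        if reading is not None and reading.status == DHT_GOOD:
            return reading
    return None  # every attempt timed out or failed
```

The `run` parameter exists only so the retry logic can be exercised without the hardware attached.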

@joan2937
Owner

joan2937 commented Apr 28, 2016

I'm just over 23 hours. 1 DHTXXD (DHT22) timeout and 9 CRC failures on my two DS18B20. There is nothing else being run on my first revision Pi B.

Have you got anything else running on yours?

@Mausy5043
Contributor Author

Some Python scripts: lnxdiagd and domod. These gather various system data, nothing fancy. domod runs the scripts that sample the DHT22 and DS18B20. Both push their data to a MySQL database on the LAN and pull data from it to create some gnuplot graphs. Typically this all happens once every minute.

X is not installed.

@Mausy5043
Contributor Author

Mausy5043 commented Apr 28, 2016

As far as I'm concerned we can consider the issue resolved by the upgrade to kernel 4.4.x.
1-wire errors in the past 18 hours: 0

I may come back for the DHTXXD timeouts, but they aren't biting me right now (74 occurrences in the past 18 hours). They tend to come in groups, always at the start of a minute (xx:xx:00) and then occasionally also at xx:xx:12.
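
One way to check that clustering is to bucket the logged failure times by second-of-minute. A minimal Python sketch, assuming each failure is logged with a timestamp in the format shown; the function name and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable


def failures_by_second(timestamps: Iterable[str],
                       fmt: str = "%Y-%m-%d %H:%M:%S") -> Counter:
    """Count failures per second-of-minute (0..59)."""
    return Counter(datetime.strptime(ts, fmt).second for ts in timestamps)


if __name__ == "__main__":
    # Hypothetical log entries, not taken from the real system.
    sample = [
        "2016-04-28 02:00:00",
        "2016-04-28 02:01:00",
        "2016-04-28 02:01:12",
        "2016-04-28 02:05:37",
    ]
    for second, count in failures_by_second(sample).most_common():
        print(f"xx:xx:{second:02d}  {count}")
```

A strong peak at second 0 would point at the once-a-minute jobs rather than at the sensor.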

@joan2937
Owner

I expect the DHTXXD failures are simply time-outs because the system is especially busy. I am not seeing them.

Mine has been running for 36 hours during which I've had 3 timeouts and 1 checksum error among over 42 thousand good readings.

It would be useful if you could patch your DHTXXD.c and extend the time-out from 0.25 seconds to, say, 0.5 seconds. To do that, change line 337 from `for (i=0; i<5; i++) /* 0.25 seconds */` to `for (i=0; i<10; i++) /* 0.5 seconds */`.

I'll leave mine running overnight and if nothing happens I'll close this off as nothing to do with pigpio.
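
To make the arithmetic behind that patch explicit: 5 iterations giving 0.25 seconds implies about 0.05 seconds per iteration, so 10 iterations give 0.5 seconds. Below is a Python analogue of such a bounded polling loop. It is a sketch of the pattern only, not the actual DHTXXD.c code, and all names are hypothetical.

```python
import time
from typing import Callable


def wait_until_ready(is_ready: Callable[[], bool], polls: int = 5,
                     step: float = 0.05, sleep=time.sleep) -> bool:
    """Poll is_ready() up to `polls` times, sleeping `step` seconds before each check.

    The total time-out is polls * step: 5 * 0.05 = 0.25 s, 10 * 0.05 = 0.5 s.
    """
    for _ in range(polls):
        sleep(step)
        if is_ready():
            return True
    return False  # timed out
```

Raising `polls` only helps if the data arrives late; if the reading was missed entirely, a longer wait makes no difference.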

@Mausy5043
Contributor Author

I agree with your analysis.
Now running with the patched version of DHTXXD.

@joan2937
Owner

Okay. Two days running. Have now stopped the test.

Final stats.

DHT22

56296 DHT22 readings (one every 3.07 seconds, 3 seconds between readings)

56289 good
2 checksum fail
5 timeout

DS18B20

72908 DS18B20 readings (a pair was used, read every 4.74 seconds, 3 seconds between readings)

36448 28-000005d34cd2
36448 28-001414abbeff

12 CRC fails (which seemed to kill both readings)

Conclusion

pigpio doesn't affect 1-wire (any more than any other process running at the same time would).
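
As a rough cross-check of those figures, a short Python sketch that computes the failure rates from the final stats. It assumes the 12 CRC fails are included in the 72908 DS18B20 total, which matches the arithmetic (2 × 36448 + 12 = 72908) but is an inference; the function name is hypothetical.

```python
def failure_rate(failures: int, total: int) -> float:
    """Return failures as a percentage of total readings."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * failures / total


# DHT22 figures from the final stats.
dht22_total = 56296
dht22_failures = 2 + 5  # checksum fails + timeouts
assert 56289 + dht22_failures == dht22_total

# DS18B20 figures from the final stats.
ds18b20_total = 72908
ds18b20_failures = 12  # CRC fails
assert 2 * 36448 + ds18b20_failures == ds18b20_total  # inferred breakdown

if __name__ == "__main__":
    print(f"DHT22:   {failure_rate(dht22_failures, dht22_total):.4f}%")
    print(f"DS18B20: {failure_rate(ds18b20_failures, ds18b20_total):.4f}%")
```

Both rates come out below 0.02%.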

@Mausy5043
Contributor Author

FYI: Increasing the timeout in DHTXXD.c doesn't help (yes, I did re-compile 😉 ). I've tried up to 2.00 seconds.
At this point it is not a problem for me.
Thanks for your time and help. Much appreciated. 👍

@joan2937
Owner

That's a pity. When I get a chance I might try to see what is going on. It's unlikely to be any time soon though. I'll probably use piscope to capture the data concurrently and see if I can identify the failures (probably easiest if each reading is time-stamped) to see what is happening.

@Mausy5043
Contributor Author

Well, since the failures tend to occur simultaneously with high CPU load due to other programs' activities, I'm guessing that would be (part of) the cause.
