particle.io OneBus transfer failures #20

Merged
merged 6 commits into from Jun 26, 2019


jonpcar commented Jun 25, 2019

Sorry, this is my first ever use of GitHub, so I don't know exactly what I am doing.
See this thread for more detail:

https://community.particle.io/t/onewire-library-bug-proposed-fix-june-2019-affects-ds18b20-and-other-onewire-devices/50589

Here is the OneWire.cpp file with some changes that FIX the error; my changes can be found by searching for //JC
OneWireFIX.txt

The current particle.io implementation of the OneWire protocol has a unique problem: transfers on the OneWire are subject to being “interrupted” by higher level system functions/calls. When this occurs inside the critical window of a OneWire “bit transfer”, that transfer will fail and result in a CRC error.

Solutions should always be designed to deal with a low percentage of CRC errors, but the current particle.io implementation has a rate that is too high.
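As a rough illustration of "dealing with a low percentage of CRC errors", the sketch below pairs the Dallas/Maxim CRC-8 check with a bounded retry loop. The names `Scratchpad`, `readWithRetry`, and `readOnce` are hypothetical and not part of the OneWire library; `readOnce` stands in for whatever code resets the bus, selects a device, and reads its 9-byte scratchpad.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>

// Dallas/Maxim 1-Wire CRC-8: polynomial x^8 + x^5 + x^4 + 1,
// processed LSB first (reflected polynomial 0x8C), initial value 0.
uint8_t crc8(const uint8_t* data, std::size_t len) {
    uint8_t crc = 0;
    for (std::size_t i = 0; i < len; ++i) {
        uint8_t inbyte = data[i];
        for (int b = 0; b < 8; ++b) {
            uint8_t mix = (crc ^ inbyte) & 0x01;
            crc >>= 1;
            if (mix) crc ^= 0x8C;
            inbyte >>= 1;
        }
    }
    return crc;
}

// Hypothetical type: 8 data bytes followed by 1 CRC byte.
using Scratchpad = std::array<uint8_t, 9>;

// Hypothetical helper: call readOnce() up to maxAttempts times and accept
// the first result whose CRC byte matches the CRC of the first 8 bytes.
// Returns false if every attempt failed the check.
bool readWithRetry(const std::function<Scratchpad()>& readOnce,
                   int maxAttempts, Scratchpad& out) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        Scratchpad sp = readOnce();
        if (crc8(sp.data(), 8) == sp[8]) {
            out = sp;
            return true;
        }
    }
    return false;
}
```

A retry loop like this only masks occasional failures. If interrupted bit transfers are frequent, most attempts fail and the loop runs out of retries, which is the situation described here. Note also that an all-zero read passes this CRC, so real code may want an extra sanity check.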

Particularly troublesome, this “bit transfer fail” seems to cause havoc for the OneWire “Search for OneWire devices” algorithm which is “bit transfer” intensive. Many applications use this OneWire function to “find” and enumerate devices attached to the bus. With this “interrupt issue”, it is very common to see this algorithm fail to find the addresses of all attached devices, especially when there are 3+ devices attached to the bus.

SOLUTION:
Protect the critical windows of the OneWire protocol by REALLY blocking all interrupts during those critical “bit transfer” times. Although the OneWire library DOES implement a disable/enable interrupt procedure around these windows, higher level system operations are not prevented from making their way in.

For the following devices, there is a FIX that works. That FIX wraps the critical OneWire code in ATOMIC_BLOCK sections to protect the bit transfers. That FIX works for:

Photon, Argon (mesh disabled), Xenon (mesh disabled), Boron (mesh disabled).

You see the pattern: a solution for “mesh enabled” devices is currently being sought.
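The shape of the fix can be sketched off-device as follows. This is not the patched OneWire.cpp: `FakeInterruptGuard`, `drivePin`, `delayMicros`, and `pinLog` are hypothetical stand-ins so the example compiles on a host, and the microsecond values are typical 1-Wire write-slot timings used only for illustration. On a Particle device the guarded region would be the atomic block that the fix describes.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for real interrupt masking. It only counts nesting depth so the
// example can show which pin operations happen inside the protected window.
struct FakeInterruptGuard {
    static int depth;
    FakeInterruptGuard()  { ++depth; }
    ~FakeInterruptGuard() { --depth; }
};
int FakeInterruptGuard::depth = 0;

// Hypothetical pin layer: records each level change and whether it was guarded.
struct PinEvent { bool level; bool guarded; };
std::vector<PinEvent> pinLog;

void drivePin(bool level) {
    pinLog.push_back({level, FakeInterruptGuard::depth > 0});
}

void delayMicros(unsigned) {}  // no-op on the host; firmware would busy-wait

// Write one bit. The low pulse is the timing-critical window, so both edges
// of it sit inside the guard; the recovery time afterwards does not.
void writeBit(bool bit) {
    {
        FakeInterruptGuard guard;
        drivePin(false);               // pull the bus low to start the slot
        delayMicros(bit ? 10 : 65);    // short low = 1, long low = 0
        drivePin(true);                // release the bus
    }
    delayMicros(bit ? 55 : 5);         // finish the slot, interrupts allowed
}

// 1-Wire sends the least significant bit first.
void writeByte(uint8_t value) {
    for (int i = 0; i < 8; ++i) {
        writeBit((value >> i) & 0x01);
    }
}
```

Guarding each bit separately, rather than a whole byte or a whole search pass, keeps each protected window to tens of microseconds, so other system work can still run between bits.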

@Hotaman Hotaman merged commit 8ad17b2 into Hotaman:master Jun 26, 2019
Hotaman (Owner) commented Jun 26, 2019

Thanks for the pull request. I was under the impression Particle was going to add this lib into the firmware to avoid this kind of issue when they switched over to RTOS. That's one of the reasons I handed support over to them. Considering this is one of the most popular libs, I would think they would be a little more on top of it. I have mesh hardware, but I do not currently have anything using it.

I'm currently involved with three major projects so I won't have any time to work on mesh issues until late Aug. I would suggest that you use manual mode and do all enumerations on the 1wire bus at boot before turning on any radios/networking. Then your normal error handling/retry logic will take care of spurious errors during normal processing.

When I get some time I'll set up a multi-node mesh with 18B20s on the leaves to simulate a typical temp monitoring system to see if I can help smooth things out. Based on what I've read, the solution will probably involve some firmware changes to get the mesh code to 'pause' so that other time-critical code can run uninterrupted, if possible without killing the mesh completely. Perhaps someone else (Particle) will take on this task in the meantime.

jonpcar (Author) commented Jun 27, 2019
