[Gen 3] Fixes various issues caused by the gateway reset #1778

Merged
merged 6 commits into from May 16, 2019

@avtolstoy
Member

commented May 15, 2019

Problem

Mesh devices may be unable to connect to the cloud or appear to be connected but can neither send nor receive data to/from the cloud under certain conditions.

Solution

  1. The OpenThread version we upgraded to at the end of January moved SLAAC functionality into its core and enabled it by default, which interfered with our own SLAAC logic. A PR was later submitted to the OpenThread repo to make this configurable, so for now we simply pull in that PR and disable the built-in SLAAC: particle-iot/openthread@ea382ea
  2. Under certain conditions (see 'Steps to Test' to replicate), mesh devices don't notice that the gateway has been reset. This is resolved by artificially publishing the prefix with the preferred flag set to 0 (which means that addresses generated from this prefix are 'DEPRECATED' but may still be used for communication), waiting for this change to propagate through the network, and then re-publishing the prefix with the preferred flag set back to 1. This way the devices receive events from OpenThread about a potential change in upstream connection availability and explicitly check their cloud connection state by issuing a ping.
  3. When the state of IP addresses on the OpenThread interface changes, make sure that events are fired, so that system-level subscribers can tell that there is a potential change in connection availability.
  4. Make sure that the Pref::1/PrefLen address (rather than a random one) is always generated from the prefix announced by the gateway itself, regardless of the current role (child or router).
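
Items 2 and 3 can be illustrated with a toy model. This is only a sketch, under the assumption that a device raises an event only when the preferred state it observes actually changes. `MeshDevice`, `receive_prefix` and `republish_after_reset` are hypothetical names used for illustration, not Device OS or OpenThread APIs.

```python
from typing import Callable, List, Optional


class MeshDevice:
    """Toy model of a non-gateway device tracking one prefix's preferred flag."""

    def __init__(self, on_change: Callable[[bool], None]) -> None:
        self.preferred: Optional[bool] = None
        self.on_change = on_change  # e.g. a subscriber that pings the cloud

    def receive_prefix(self, preferred: bool) -> None:
        # Assumption: an event fires only when the observed state changes.
        if preferred != self.preferred:
            self.preferred = preferred
            self.on_change(preferred)


def republish_after_reset(devices: List[MeshDevice], wait: Callable[[], None]) -> None:
    """Deprecate the prefix, wait for propagation, then restore it."""
    for d in devices:
        d.receive_prefix(False)  # preferred = 0: addresses become DEPRECATED
    wait()                       # let the change propagate through the network
    for d in devices:
        d.receive_prefix(True)   # preferred = 1: addresses are preferred again


events: List[bool] = []
device = MeshDevice(on_change=events.append)
device.receive_prefix(True)      # state before the gateway reset
events.clear()

device.receive_prefix(True)      # plain re-publish: state unchanged, no event
naive_events = list(events)

republish_after_reset([device], wait=lambda: None)
fixed_events = list(events)      # one event per state change
```

In this model, a plain re-publish leaves `naive_events` empty, while the deprecate-then-restore sequence produces two change events that a subscriber could use as a trigger to ping the cloud.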

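The fixed address from item 4 can be derived from the announced prefix as sketched below. The function name `fixed_address_from_prefix` and the example prefix are illustrative assumptions; the actual firmware does this in its own networking code.

```python
import ipaddress


def fixed_address_from_prefix(prefix: str) -> ipaddress.IPv6Interface:
    """Return Pref::1/PrefLen for the given prefix (hypothetical helper)."""
    net = ipaddress.IPv6Network(prefix)
    # The first address after the network address is Pref::1.
    return ipaddress.IPv6Interface(f"{net.network_address + 1}/{net.prefixlen}")


# 2001:db8::/32 is the documentation range, used here only as an example.
addr = fixed_address_from_prefix("2001:db8:0:1::/64")
```

Because the address depends only on the prefix, it stays the same no matter which role the device currently has.
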
Steps to Test

The easiest way to test this is to create a mesh network of 3 devices.

  1. Power down all devices
  2. Power up non-gateway devices
  3. Make sure that one of them becomes the leader (for now, check the logs for that)
  4. Power on the gateway device
  5. Wait until all three are connected to the cloud and are breathing cyan
  6. Power off the gateway device for about a minute or so. The other two devices should stay breathing cyan (occasionally you might see some blinks due to ping timeouts and session resumption)
  7. Power the gateway back on BEFORE the other devices start blinking green
  8. All three should connect to the cloud. The non-gateway devices should not go through the blinking green state.
  9. Make sure you can reach your non-gateway devices from the cloud (e.g. particle nyan)
  10. Wait a few minutes; you should still be able to reach your non-gateway devices and they should be breathing cyan

Steps 9 - 10 should fail on develop and should work correctly with this PR.
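
The reachability check in steps 9 and 10 could be scripted roughly as follows. This is a sketch that assumes the Particle CLI is installed and logged in, and that `particle nyan <device> on` exits with a non-zero status when the device cannot be reached; that exit-code behavior is an assumption, not something verified here. The helper names are hypothetical.

```python
import subprocess
from typing import Callable, Dict, Sequence


def nyan_ok(device: str, run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> bool:
    """Return True if `particle nyan <device> on` exits with status 0."""
    result = run(["particle", "nyan", device, "on"], capture_output=True, text=True)
    # Assumption: a zero exit status means the cloud reached the device.
    return result.returncode == 0


def check_devices(devices: Sequence[str],
                  run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> Dict[str, bool]:
    """Map each device name to whether the nyan call succeeded."""
    return {name: nyan_ok(name, run) for name in devices}
```

Calling `check_devices` with the names of the two non-gateway devices right after step 8, and again a few minutes later, covers steps 9 and 10. The `run` parameter exists so the CLI call can be replaced with a stub when testing the script itself.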

Example App

N/A

References

  • [CH29981]

Completeness

  • User is totes amazing for contributing!
  • Contributor has signed CLA (Info here)
  • Problem and Solution clearly stated
  • Run unit/integration/application tests on device
  • Added documentation
  • Added to CHANGELOG.md after merging (add links to docs and issues)

  • [bugfix] [gen 3] Fixes various issues caused by the gateway reset #1778

@avtolstoy avtolstoy requested a review from sergeuz May 15, 2019

@avtolstoy avtolstoy added this to the 1.2.0-rc.1 milestone May 15, 2019

@sergeuz sergeuz force-pushed the fix/gen3-mesh-gw-loss-ch29981 branch from 92895ea to b6bf47a May 16, 2019

@sergeuz sergeuz merged commit c0931fe into develop May 16, 2019

1 check passed

continuous-integration/travis-ci/push The Travis CI build passed

@sergeuz sergeuz deleted the fix/gen3-mesh-gw-loss-ch29981 branch May 16, 2019
