MQTT Device Management #754

Closed
4 tasks done
sammachin opened this issue Jul 5, 2022 · 7 comments
Assignees
Labels
scope:device Agent feature for Gateways and PLCs size:XXL - 13 Sizing estimation point story A user-oriented description of a feature
Milestone

Comments

@sammachin
Contributor

sammachin commented Jul 5, 2022

Epic

#464

Description

As a: User

I want: my devices to communicate with the forge application over MQTT

So that: I have a realtime channel and do not rely on polling.

This is dependent on #464 delivering the MQTT broker infrastructure.

Dependencies

Acceptance Criteria

  • Update Forge application to communicate over MQTT
  • Update Device agent code to use MQTT
@sammachin sammachin added the story A user-oriented description of a feature label Jul 5, 2022
@sammachin
Contributor Author

Do we want to keep polling as a fallback mechanism?

@sammachin sammachin added this to the 0.8 milestone Jul 5, 2022
@sammachin sammachin added scope:device Agent feature for Gateways and PLCs v1 labels Jul 5, 2022
@knolleary
Member

Do we want to keep polling as a fallback mechanism?

I think so. I wouldn't choose to rip it out straight away as we'll have a period of 'legacy' device agents out there that only know to do polling.

@sammachin
Contributor Author

@ Does this still need its own design work or can it be estimated and scheduled (tentatively) for development in 0.8?

@knolleary
Member

It does require some design work, although I think much of the unknowns are fairly well understood at this point.

Happy for this to go in to 0.8

@knolleary knolleary self-assigned this Jul 8, 2022
@knolleary
Member

knolleary commented Jul 14, 2022

updated 21/7 to add project to status payload and update topics used
Let's start to fill out the design details for how devices will make use of the broker.

From #464 we have defined the top level topic structures.

  • Devices will subscribe to ff/v1/<team>/d/<device>/command to receive commands from the platform
  • Devices will publish to ff/v1/<team>/d/<device>/status to send their status to the platform
    • The team hashid does not need to be included in the credentials object devices are provided: the broker username includes the teamId, so we can extract it from that rather than add yet another field to the credentials object.
  • Devices will still use the existing HTTP endpoint to download project snapshots and settings - these will not be sent over MQTT.
  • Devices will subscribe to ff/v1/<team>/p/<project>/command to receive commands from the platform sent to all devices for a given project
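
To make the topic layout concrete, here is a minimal sketch of helpers that build these topic strings. The function names are hypothetical and are not taken from the FlowForge codebase; only the topic patterns come from the list above.

```javascript
// Hypothetical helpers: only the topic patterns come from the design above.
const TOPIC_PREFIX = 'ff/v1'

// Topic a device subscribes to for commands addressed to it alone
function deviceCommandTopic (team, device) {
    return `${TOPIC_PREFIX}/${team}/d/${device}/command`
}

// Topic a device publishes its status to
function deviceStatusTopic (team, device) {
    return `${TOPIC_PREFIX}/${team}/d/${device}/status`
}

// Topic a device subscribes to for commands sent to every device in a project
function projectCommandTopic (team, project) {
    return `${TOPIC_PREFIX}/${team}/p/${project}/command`
}
```

Keeping the patterns in one place means the agent and the platform cannot drift apart on the `/d/` and `/p/` segments.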

Status

The Device status event has the form:

{
   state: '<state>',
   snapshot: '<snapshot>',
   settings: '<settings>',
   health: {
      uptime: 123,
      snapshotRestartCount: 1
   }
}

This is the same structure as the existing HTTP Ping, which we need to stay consistent with. We haven't properly specified the health properties or formalised how they are used; we will need to come back to that.
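
As an illustration only, the status event could be assembled and serialised for publishing along these lines. `buildStatusPayload` is a hypothetical name, and the `'running'` state used in the example is an assumption, since the state values are not yet defined.

```javascript
// Hypothetical builder for the status event described above.
// Returns the JSON string that would be published to the device status topic.
function buildStatusPayload (state, snapshot, settings, health) {
    return JSON.stringify({
        state,
        snapshot,
        settings,
        health: {
            uptime: health.uptime,
            snapshotRestartCount: health.snapshotRestartCount
        }
    })
}

// Example values only; the real state names are still to be decided.
const examplePayload = buildStatusPayload('running', 'snapshot-id', 'settings-hash', {
    uptime: 123,
    snapshotRestartCount: 1
})
```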

It publishes status events whenever there is a change in the local status, including:

  • When the device agent first starts (with a random delay between 1-10 secs to avoid reconnection storms)
  • Any unexpected change in Node-RED state
  • Before/after updating the local snapshot/settings
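
The random start-up delay could be picked as in the sketch below. This is an assumption about how it might be done, not the agent's actual code; `startupDelayMs` is a hypothetical name, and the random source is injectable purely so the function can be tested.

```javascript
// Pick a start-up delay between 1 and 10 seconds, returned in milliseconds.
// `random` must behave like Math.random (a value in [0, 1)).
function startupDelayMs (random = Math.random) {
    const MIN_MS = 1000
    const MAX_MS = 10000
    return MIN_MS + Math.floor(random() * (MAX_MS - MIN_MS + 1))
}

// Possible usage (publishBirthStatus is hypothetical):
// setTimeout(publishBirthStatus, startupDelayMs())
```

Spreading the first publish over a window means a broker restart does not cause every agent to report at the same instant.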

The precise values of state are TBD.

Commands

Commands published by the platform are JSON blobs containing the type of command and any additional metadata the particular command provides.

Currently, the only command the platform may send the device is a notification that there is a new project snapshot to load.

update

{
   "command": "update",
   "project": "<project-id>",
   "snapshot": "<snapshot-id>",
   "settings": "<settings-hash>"
}

When the device receives this command it must compare the snapshot and settings values with its locally stored values. If either differs, then:

  1. publish an updating status message
  2. call the corresponding HTTP endpoint to get the new snapshot/settings
  3. apply the new values
  4. restart Node-RED
  5. publish a running status message
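
The steps above can be sketched as a single handler. Everything passed in through `deps` is hypothetical (the real agent will have its own functions for publishing, HTTP access and process control), and the `'updating'` and `'running'` strings are assumed state names.

```javascript
// Sketch of the update handling. All dependencies are injected and hypothetical:
//   publishStatus(state)  publishes a status message with the given state
//   fetchUpdate(command)  calls the HTTP endpoint for the new snapshot/settings
//   applyUpdate(update)   stores the new values locally
//   restartNodeRed()      restarts the local Node-RED instance
// Resolves to true if an update was applied, false if nothing changed.
async function handleUpdate (local, command, deps) {
    const changed = command.snapshot !== local.snapshot ||
        command.settings !== local.settings
    if (!changed) {
        return false
    }
    await deps.publishStatus('updating')
    const update = await deps.fetchUpdate(command)
    await deps.applyUpdate(update)
    await deps.restartNodeRed()
    await deps.publishStatus('running')
    return true
}
```

Injecting the dependencies keeps the ordering of the five steps testable without a broker or a running Node-RED.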

@knolleary knolleary added size:M - 3 Sizing estimation point size:XXL - 13 Sizing estimation point and removed size:M - 3 Sizing estimation point labels Jul 14, 2022
@knolleary
Member

This is proving to be more involved than simply adding MQTT instead of HTTP.

With the current topic structure design each device subscribes to its own 'command' topic. But the most common command to send is that the snapshot a device should be using has changed - and that has to be sent to all devices. This means the platform has to get a list of all devices and publish a message to each one. Having implemented it, it feels wrong - that's a lot of unnecessary work.

It would be better if the platform could publish one message to notify all devices assigned to the project of any change. But to achieve that, the devices would have to subscribe to a topic specific to the project - which in turn means:

  • devices need to know what project id they are assigned to (they don't currently get that info - just the snapshot id)
    • Add that to the status object returned by the deviceLive end-point and the 'update' messages
    • Store that information in the device project file
    • ...
  • devices need to dynamically subscribe/unsubscribe from the project topic
  • the ACL handling needs to consider whether a device is allowed to subscribe to a project status topic. We could relax that to say a device in a team can subscribe to any project status topic.

I have updated the topic table in #464 (comment) to reflect this.

  • Launchers (which haven't been implemented yet) will now use ff/v1/<team>/l/<project>/command and ff/v1/<team>/l/<project>/status - (note the /l/ to indicate this is for the launcher).
  • The topic ff/v1/<team>/p/<project>/command (note the /p/) is used to send commands to all devices assigned to this project.
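
A rough sketch of the relaxed ACL rule discussed above, applied to the `/p/` command topic: a device may subscribe to its own command topic and to the command topic of any project in its own team. The function name and its arguments are hypothetical, and rejecting wildcard subscriptions is an assumption made here, not something decided in this thread.

```javascript
// Hypothetical subscription check for a device belonging to `team`.
// Expected topic shape: ff/v1/<team>/<d|p>/<id>/command
function canDeviceSubscribe (team, device, topic) {
    const parts = topic.split('/')
    if (parts.length !== 6) return false
    const [root, version, topicTeam, kind, id, leaf] = parts
    if (root !== 'ff' || version !== 'v1' || leaf !== 'command') return false
    if (topicTeam !== team) return false
    // A device may only listen to its own command topic...
    if (kind === 'd') return id === device
    // ...or to a specific project in its team (assumption: no MQTT wildcards)
    if (kind === 'p') return id.length > 0 && id !== '+' && id !== '#'
    return false
}
```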

Device MQTT lifecycle

  1. Agent starts. Sees a broker config has been provided so enables the MQTT handler. If no broker config is provided, it will fall back to HTTP polling.
  2. Agent publishes to ff/v1/<team>/d/<device>/status - with its current snapshot/settings hashes (project can be inferred from snapshot) - with a state of <to be defined>. This is the 'birth' message. It does not start the project running at this point in time.
  3. When the platform receives a device status message, it validates the snapshot/settings hashes are correct. If the state is <to be defined>, or the snapshot/settings hashes are wrong, it publishes an update message to the device command topic - this includes the project id.
  4. When the agent receives an update message - on either its .../d/... or .../p/... topic, it compares the snapshot/settings/project with its local configuration.
    1. if everything matches, it starts the project if not running.
    2. if project has changed,
      1. unsubscribe from old project command topic
      2. stop launcher and delete old project
    3. if project is not null and snapshot/settings have changed,
      1. get settings/snapshot from platform
      2. if project changed, subscribe to new project command topic
      3. start new project
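
The decision logic in step 4 can be sketched as a pure function that returns the ordered actions the agent would take. The action names are hypothetical labels for the steps listed above, and skipping the start when no project is assigned is an assumption not stated in the list.

```javascript
// Returns the ordered list of actions for an update message.
// `local` is { project, snapshot, settings, running }; `update` is the command payload.
function planUpdate (local, update) {
    const projectChanged = update.project !== local.project
    const contentChanged = update.snapshot !== local.snapshot ||
        update.settings !== local.settings
    if (!projectChanged && !contentChanged) {
        // Everything matches: start the project if it is not already running.
        // Assumption: nothing to start when no project is assigned.
        return (local.running || local.project === null) ? [] : ['startProject']
    }
    const actions = []
    if (projectChanged && local.project !== null) {
        actions.push('unsubscribeOldProjectTopic', 'stopLauncher', 'deleteOldProject')
    }
    if (update.project !== null && contentChanged) {
        actions.push('fetchSnapshotAndSettings')
        if (projectChanged) actions.push('subscribeNewProjectTopic')
        actions.push('startProject')
    }
    return actions
}
```

A `project: null` update therefore only tears down the old project, which matches the project deleted case.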

| Event | Response |
| --- | --- |
| Device is added/removed from a project | Platform publishes to ff/v1/<team>/d/<device>/command to notify the device |
| Device settings modified (env vars) | Platform publishes to ff/v1/<team>/d/<device>/command to notify the device |
| Target Snapshot is changed (including when deleted) | Platform publishes to ff/v1/<team>/p/<project>/command to notify all devices |
| Project deleted | Platform publishes to ff/v1/<team>/p/<project>/command to notify all devices (with project: null) |
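
On the platform side, the event-to-topic mapping above could look roughly like this. The event names are hypothetical identifiers invented for this sketch; only the topics come from the mapping above.

```javascript
// Map a platform event to the command topic it should be published on.
// `ids` is { team, device, project }; event names are hypothetical.
function commandTopicForEvent (event, ids) {
    const base = `ff/v1/${ids.team}`
    switch (event) {
    case 'device-assignment-changed':
    case 'device-settings-modified':
        // Affects one device only
        return `${base}/d/${ids.device}/command`
    case 'target-snapshot-changed':
    case 'project-deleted':
        // Affects every device assigned to the project
        return `${base}/p/${ids.project}/command`
    default:
        throw new Error(`Unknown event: ${event}`)
    }
}
```

With this split, a snapshot change costs the platform one publish per project rather than one per device.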

@ZJvandeWeg
Member

@sammachin Can this issue be updated? I think the milestone is incorrect and it has been fully delivered?

3 participants