Leshan Cluster Module

Simon edited this page Mar 20, 2020 · 7 revisions

This is the documentation for our experimental (and now abandoned) leshan-server-cluster module. This implementation is based on Redis.
If you are interested in using Leshan Server in a cluster, you should read this first.

Should I use leshan-server-cluster in production now?

Development on this module is no longer active.
The last milestone version of Leshan containing this module is Leshan-1.0.0-M12, and the most recent version of the code is available in the cluster branch (see #679 for more details).
This is clearly experimental work and it would not be reasonable to use it in production. But you can use it as an example or as a starting point to implement your own.

You could even consider reviving the module by contributing to this part of the project.

What is the Northbound interface?

The northbound interface is the interface between the Leshan cluster and the end user, like an IoT backend.

When you have a device connected to a server, only this server (which contains all the DTLS states) is able to communicate with the device.

So when a northbound interface user sends a request, it must be "routed" to the server in charge of this device.

The proposed solution is to use fan-out publishing in a broker. The request is published from the user to all the Leshan server instances. If one of them sees that the request is for a device it currently manages, it must acknowledge back to the user that it is going to process the request, and then send the request to the device. Once the request is processed, the server publishes the result in the broker (still in fan-out).

The client puts a ticket ID (a token) on the request and uses it to correlate the request with the received fan-out responses.

(source: https://docs.google.com/presentation/d/1rGkpmPx_W37ojvlp6SL4oLuCkDlIAdd6XkTyHx2Li5k/edit?usp=sharing)
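
The routing scheme described above can be sketched with an in-memory stand-in for the broker. This is a minimal illustration, not the module's code: `FanOutBroker`, `ServerInstance`, and `NorthboundUser` are hypothetical names, the broker is a plain Python object rather than Redis, the ticket is assumed to be a random UUID in hex form, and the exchange with the device is replaced by a placeholder response.

```python
import uuid


class FanOutBroker:
    """In-memory stand-in for a fan-out broker (hypothetical, not Redis)."""

    def __init__(self):
        self.subscribers = {}  # channel name -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        # Fan-out: every subscriber of the channel receives the message.
        for callback in list(self.subscribers.get(channel, [])):
            callback(message)


class ServerInstance:
    """Hypothetical server instance that only answers for its own devices."""

    def __init__(self, broker, instance_id, managed_endpoints):
        self.broker = broker
        self.instance_id = instance_id
        self.managed = set(managed_endpoints)
        broker.subscribe("LESHAN_REQ", self.on_request)

    def on_request(self, message):
        if message["ep"] not in self.managed:
            return  # another instance is in charge, so stay silent
        ticket = message["ticket"]
        # Acknowledge first, then publish the result, both in fan-out.
        self.broker.publish("LESHAN_RESP", {"ticket": ticket, "ack": True})
        # Placeholder: a real instance would send the request to the device here.
        result = {"kind": message["req"]["kind"], "code": "CONTENT"}
        self.broker.publish("LESHAN_RESP", {"ticket": ticket, "resp": result})


class NorthboundUser:
    """Hypothetical user that correlates responses with its own tickets."""

    def __init__(self, broker):
        self.broker = broker
        self.pending = {}  # ticket -> messages received for that ticket
        broker.subscribe("LESHAN_RESP", self.on_response)

    def send(self, endpoint, request):
        ticket = uuid.uuid4().hex
        self.pending[ticket] = []
        self.broker.publish(
            "LESHAN_REQ", {"ep": endpoint, "ticket": ticket, "req": request}
        )
        return ticket

    def on_response(self, message):
        # Responses for tickets issued by other users are ignored.
        if message["ticket"] in self.pending:
            self.pending[message["ticket"]].append(message)


broker = FanOutBroker()
ServerInstance(broker, "instance-1", ["myEndpoint"])
ServerInstance(broker, "instance-2", ["otherEndpoint"])
user = NorthboundUser(broker)
other_user = NorthboundUser(broker)

ticket = user.send("myEndpoint", {"kind": "read", "path": "/3/0/1"})
messages = user.pending[ticket]
```

Both instances receive the request, but only the one managing `myEndpoint` answers, and only the user that issued the ticket keeps the acknowledgement and the response.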

How do Leshan server instances know which request they should handle?

As you can see, a Northbound request is broadcast to every Leshan server instance, and only one instance should handle it. In our current implementation we chose to give an ID to each Leshan server instance. When a client registers or sends a registration update, the instance which receives this request stores in Redis that this client is now handled by this server instance. On deregistration we remove the entry. (see RedisTokenHandler)
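
The bookkeeping described above can be sketched as follows. The real module keeps this mapping in Redis (see RedisTokenHandler); here a dictionary stands in for it, and the class name, method names, and instance IDs are assumptions made for illustration only.

```python
class OwnershipStore:
    """Dict-backed stand-in for the shared Redis store (hypothetical layout)."""

    def __init__(self):
        self._owner_by_endpoint = {}  # endpoint name -> server instance ID

    def on_registration_or_update(self, endpoint, instance_id):
        # The instance that received the registration or update takes charge.
        self._owner_by_endpoint[endpoint] = instance_id

    def on_deregistration(self, endpoint):
        self._owner_by_endpoint.pop(endpoint, None)

    def is_responsible(self, endpoint, instance_id):
        # An instance handles a Northbound request only for its own devices.
        return self._owner_by_endpoint.get(endpoint) == instance_id


store = OwnershipStore()
store.on_registration_or_update("myDevice", "instance-1")
first_owner_ok = store.is_responsible("myDevice", "instance-1")

# A later update received by another instance moves the device to it.
store.on_registration_or_update("myDevice", "instance-2")
moved = store.is_responsible("myDevice", "instance-2")

store.on_deregistration("myDevice")
still_owned = store.is_responsible("myDevice", "instance-2")
```

Because the last registration or update wins, the instance that most recently heard from the device is the one that answers Northbound requests for it.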

What happens if an instance dies?

  • If you are using Queue mode, the client should start its next communication with an update, so the instance which receives this update becomes the next instance in charge.
  • If you are not using Queue mode, all requests will be lost until the client does a new registration or a registration update. This is not exactly what one would expect. Note that even if we find a way to let another instance handle new requests, the DTLS session/connection is lost, so a new DTLS connection would have to be established, with the Leshan server acting as DTLS client and the device acting as DTLS server.

(see Server-Failover for more details)

Northbound API

This API will be based on Redis Pub/Sub.

Registration

New registration events are accessible on the LESHAN_REG_NEW channel. The payload is the new registration.

{
    regDate: 1467883514122,
    address: "127.0.0.1",
    port: 60071,
    regAddr: "0.0.0.0",
    regPort: 5683,
    lt: 30,
    ver: "1.0",
    bnd: "U",
    ep: "myDevice",
    regId: "ELP5Ql2v4b",
    objLink: [{
        url: "/", at: {rt: "oma.lwm2m"}
    }, {
        url: "/1/0", at: {}
    }, {
        url: "/3/0", at: {}
    }, {
        url: "/6/0", at: {}
    }],
    addAttr: {},
    root: "/",
    lastUp: 1467883514122
}

Registration update events are accessible on the LESHAN_REG_UP channel. The payload contains the updated registration (regUpdated) and the registration update (regUpdate).

{
    regUpdate: {
        regId: "ELP5Ql2v4b",
        address: "127.0.0.1",
        port: 60071
    },
    regUpdated: {
        regDate: 1467883514122,
        address: "127.0.0.1",
        port: 60071,
        regAddr: "0.0.0.0",
        regPort: 5683,
        lt: 30,
        ver: "1.0",
        bnd: "U",
        ep: "myEndpoint",
        regId: "ELP5Ql2v4b",
        objLink: [{
            url: "/", at: {rt: "oma.lwm2m"}
        }, {
            url: "/1/0", at: {}
        }, {
            url: "/3/0", at: {}
        }, {
            url: "/6/0", at: {}
        }],
        addAttr: {},
        root: "/",
        lastUp: 1467884189419
    }
}

De-registration events are accessible on the LESHAN_REG_DEL channel. The payload is the registration.

{
    regDate: 1467883514122,
    address: "127.0.0.1",
    port: 60071,
    regAddr: "0.0.0.0",
    regPort: 5683,
    lt: 30,
    ver: "1.0",
    bnd: "U",
    ep: "myDevice",
    regId: "ELP5Ql2v4b",
    objLink: [{
        url: "/", at: {rt: "oma.lwm2m"}
    }, {
        url: "/1/0", at: {}
    }, {
        url: "/3/0", at: {}
    }, {
        url: "/6/0", at: {}
    }],
    addAttr: {},
    root: "/",
    lastUp: 1467883514122
}
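
A Northbound consumer could use these three channels to maintain a local view of the registered devices. The sketch below assumes that each payload arrives as strict JSON text with quoted keys, which the samples above do not guarantee; `apply_registration_event` is a hypothetical helper, and the payloads in the usage part are trimmed to a few fields.

```python
import json


def apply_registration_event(registrations, channel, payload):
    """Update a local {regId: registration} table from one event payload."""
    event = json.loads(payload)
    if channel == "LESHAN_REG_NEW":
        registrations[event["regId"]] = event
    elif channel == "LESHAN_REG_UP":
        # Keep the full updated registration, not the partial update.
        updated = event["regUpdated"]
        registrations[updated["regId"]] = updated
    elif channel == "LESHAN_REG_DEL":
        registrations.pop(event["regId"], None)
    else:
        raise ValueError(f"unexpected channel: {channel}")
    return registrations


registrations = {}
new_reg = json.dumps(
    {"ep": "myDevice", "regId": "ELP5Ql2v4b", "lt": 30, "lastUp": 1467883514122}
)
apply_registration_event(registrations, "LESHAN_REG_NEW", new_reg)

update = json.dumps({
    "regUpdate": {"regId": "ELP5Ql2v4b", "address": "127.0.0.1", "port": 60071},
    "regUpdated": {
        "ep": "myDevice", "regId": "ELP5Ql2v4b", "lt": 30, "lastUp": 1467884189419
    },
})
apply_registration_event(registrations, "LESHAN_REG_UP", update)
last_up_after_update = registrations["ELP5Ql2v4b"]["lastUp"]

apply_registration_event(registrations, "LESHAN_REG_DEL", new_reg)
```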

Request/Response

The LESHAN_REQ channel is used to send requests. The payload contains a ticket for the request (ticket), the destination endpoint (ep), and the request to send (req).

{
  "ep":"myEndpoint",
  "ticket":"8c90592249c74a9b8a2da5754145dcc0",
  "req":{
    "kind":"read",
    "path":"/3/0/1",
    "contentFormat":1541
  }
}
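
Building such a message could look like the sketch below. `build_request` is a hypothetical helper, and generating the ticket as a random UUID in hex form is an assumption based on the shape of the sample ticket; the page does not say how tickets must be generated, only that they are used for correlation.

```python
import json
import uuid


def build_request(endpoint, request, ticket=None):
    """Return (ticket, JSON text) for a message to publish on LESHAN_REQ."""
    if ticket is None:
        # Assumption: any unique token works; here 32 hex characters.
        ticket = uuid.uuid4().hex
    message = {"ep": endpoint, "ticket": ticket, "req": request}
    return ticket, json.dumps(message)


ticket, payload = build_request(
    "myEndpoint",
    {"kind": "read", "path": "/3/0/1", "contentFormat": 1541},
)
decoded = json.loads(payload)
```

The caller keeps the returned ticket so that it can match the messages later received on LESHAN_RESP.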

The LESHAN_RESP channel is used to receive responses. Several kinds of message can be received on this channel.

An Ack message, which means that the request is being handled by one instance in the cluster.

{
  ticket:"8c90592249c74a9b8a2da5754145dcc0",
  ack:true
}

An Error message, which means that an error occurred on the instance which chose to handle the request.

{
  ticket:"8c90592249c74a9b8a2da5754145dcc0",
  err:{
    errorMessage:"an error message"
  }
}

A Response message with the response returned by the device.

{
  ticket:"8c90592249c74a9b8a2da5754145dcc0",
  resp:{
    kind: "read",
    code: "CONTENT",
    node: {
        kind:"singleresource",
        id:1,
        type: "string",
        value: "Lightweight M2M Client"
    }
  }
}

or

{
  ticket:"8c90592249c74a9b8a2da5754145dcc0",
  resp:{
    kind: "read",
    code: "NOT_FOUND",
    errorMessage:"a custom CoAP error message"
  }
}
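
On the receiving side, a user has to tell these message kinds apart and drop messages that belong to other tickets. The following is a minimal sketch that works on already decoded messages (Python dictionaries); `classify_response` and the labels it returns are hypothetical and not part of the module.

```python
def classify_response(message, ticket):
    """Return (kind, detail) for one decoded message from LESHAN_RESP."""
    if message.get("ticket") != ticket:
        return ("ignored", None)  # fan-out: this message is for another request
    if message.get("ack") is True:
        return ("ack", None)
    if "err" in message:
        return ("error", message["err"].get("errorMessage"))
    if "resp" in message:
        # Includes device-level failures such as code NOT_FOUND.
        return ("response", message["resp"])
    return ("unknown", message)


ticket = "8c90592249c74a9b8a2da5754145dcc0"
ack = classify_response({"ticket": ticket, "ack": True}, ticket)
err = classify_response(
    {"ticket": ticket, "err": {"errorMessage": "an error message"}}, ticket
)
resp = classify_response(
    {"ticket": ticket, "resp": {"kind": "read", "code": "NOT_FOUND"}}, ticket
)
other = classify_response({"ticket": "another-ticket", "ack": True}, ticket)
```

Note that an Error message reports a failure on the server instance, while a Response message with a code such as NOT_FOUND is a regular answer from the device.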

API still to define: timeout? cancel request? observe?