Simon edited this page Oct 30, 2019 · 15 revisions

⚠️ This page is deprecated! ⚠️ You should look at this instead!

Leshan Cluster

Today a Leshan server cannot easily be deployed in a cluster for high availability and scalability. This page lists the modifications to be made to Leshan and Californium and proposes an example of how to deploy Leshan as a cluster.

Southbound interface

The southbound interface is the interface for communication between the devices and the server (CoAP).

Load-Balancer

If we deploy Leshan on multiple machines, we need a front load balancer to send each incoming message to one of the Leshan instances in the cluster.

Today Leshan accepts CoAP+DTLS device communications. It's all based on UDP (maybe TCP later). One of the few load balancers supporting UDP is LVS (Linux Virtual Server). We will focus on providing a solution compatible with it. (If you know another level 3 load balancer usable in this situation, please mention it!) We don't explore DNS round-robin load balancing, because the amount of state to share would be larger (the whole DTLS state). We prefer level 3 routing based on source IP/port.
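The property we rely on from the load balancer (every packet from one source IP/port reaches the same Leshan instance) can be illustrated with a small sketch. This is not how LVS is implemented; the function name `pick_instance`, the instance names and the hashing scheme are assumptions made only for illustration.

```python
import hashlib


def pick_instance(src_ip, src_port, instances):
    """Map a UDP source (IP, port) to one instance, deterministically."""
    key = f"{src_ip}:{src_port}".encode("ascii")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer index.
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


# Hypothetical instance names.
instances = ["leshan-1", "leshan-2", "leshan-3"]
first = pick_instance("192.0.2.10", 60071, instances)
again = pick_instance("192.0.2.10", 60071, instances)
```

Both calls return the same instance, which is what lets the MID and most of the DTLS state stay local to that instance. A stateless hash like this one remaps devices when the instance list changes, so a real deployment would rather rely on the connection tracking of the load balancer.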

Proposed network infrastructure

States

  • A DTLS connection contains a lot of state (fragments, epoch, handshake state, master key, etc.)
  • CoAP states: MID, tokens
  • LWM2M: registrations, security parameters, observations

If we try to share all of this state, we are going to greatly reduce performance. We should try to keep some of it on the Leshan instance. For example, the MID and most of the DTLS state (apart from what is needed for DTLS resume) can be kept in the Leshan instance, because the level 3 stateful load balancer will push all the UDP packets coming from the same source IP/port to the same Leshan instance.

The remaining long-term state to share would be: CoAP observations (tokens), LWM2M registrations, and the parameters for starting a DTLS handshake.
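The split between local and shared state could look like the following sketch. The class names `SharedStore` and `Instance` are hypothetical, and a plain dictionary stands in for whatever shared store the cluster would actually use; only the division of state is taken from the text above.

```python
class SharedStore:
    """Stand-in for the store shared by all instances (a dict here)."""

    def __init__(self):
        self.registrations = {}  # endpoint -> registration
        self.observations = {}   # CoAP token -> (endpoint, path)


class Instance:
    """One Leshan instance: MIDs stay local, long-term state is shared."""

    def __init__(self, name, shared):
        self.name = name
        self.shared = shared
        self.next_mid = 0  # local only, never shared

    def allocate_mid(self):
        mid = self.next_mid
        self.next_mid = (self.next_mid + 1) % 65536  # CoAP MIDs are 16 bits
        return mid

    def register(self, endpoint, registration):
        self.shared.registrations[endpoint] = registration

    def observe(self, token, endpoint, path):
        self.shared.observations[token] = (endpoint, path)

    def resolve_notification(self, token):
        # Any instance can map a notification token back to its observation.
        return self.shared.observations.get(token)


shared = SharedStore()
a = Instance("leshan-1", shared)
b = Instance("leshan-2", shared)
a.register("myDevice", {"regId": "ELP5Ql2v4b", "lt": 30})
a.observe("a1b2", "myDevice", "/3/0/1")
resolved_on_b = b.resolve_notification("a1b2")
```

Here the second instance can resolve an observation created by the first one, while each instance keeps its own MID counter.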

As a first step, DTLS resume can be excluded, because we can force the client into a full re-handshake (which is not so costly with PSK schemes).

Code impacts

Leshan: the registration store is already good; the only problem is the observation registry.

Californium: observations are based on the "exchange" abstraction and are quite difficult to persist. This should be refactored, or we could also create a lower-level API for receiving .

Scandium: as a second step, we can extract the information needed for re-handshaking and share it.

Northbound interface

The northbound interface is the interface between the Leshan cluster and the end user, like an IoT backend.

When a device is connected to a server, only this server (which holds all the DTLS state) is able to communicate with the device.

So when a northbound interface user sends a request, it should be "routed" to the server in charge of this client.

The proposed solution is to use fan-out publishing in a broker. The request is published from the user to all the Leshan server instances. If one of the servers sees that the request is for one of its devices, it answers back to the client that it is going to process the request. Once the request is processed, the server publishes the result to the broker (still in fan-out).

The client puts a ticket ID (a token) on the request and uses it to correlate the request with the received fan-out responses.
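The fan-out exchange described above can be sketched with a tiny synchronous in-memory broker. Everything here is an assumption for illustration: the class names, the channel names "requests" and "responses", and the use of a UUID as ticket. Only the message flow (publish to all, acknowledge, publish the result, correlate by ticket) comes from the description.

```python
import uuid


class FanOutBroker:
    """Minimal in-memory broker: every subscriber of a channel gets every message."""

    def __init__(self):
        self.subscribers = {}  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        for callback in list(self.subscribers.get(channel, [])):
            callback(message)


class ServerInstance:
    def __init__(self, name, broker, devices):
        self.name = name
        self.broker = broker
        self.devices = set(devices)  # endpoints this instance is in charge of
        broker.subscribe("requests", self.on_request)

    def on_request(self, message):
        if message["ep"] not in self.devices:
            return  # another instance is in charge of this device
        ticket = message["ticket"]
        # First tell the user that the request is taken, then publish the result.
        self.broker.publish("responses", {"ticket": ticket, "ack": True})
        result = f"{self.name} handled {message['req']}"
        self.broker.publish("responses", {"ticket": ticket, "resp": result})


class User:
    def __init__(self, broker):
        self.broker = broker
        self.pending = {}  # ticket -> messages received so far
        broker.subscribe("responses", self.on_response)

    def send(self, ep, req):
        ticket = uuid.uuid4().hex
        self.pending[ticket] = []
        self.broker.publish("requests", {"ep": ep, "ticket": ticket, "req": req})
        return ticket

    def on_response(self, message):
        # Ignore fan-out responses that belong to other users' tickets.
        if message["ticket"] in self.pending:
            self.pending[message["ticket"]].append(message)


broker = FanOutBroker()
ServerInstance("leshan-1", broker, ["deviceA"])
ServerInstance("leshan-2", broker, ["deviceB"])
user = User(broker)
ticket = user.send("deviceB", "read /3/0/1")
received = user.pending[ticket]
```

After the call, `received` holds the acknowledgement followed by the result, both published by the only instance in charge of deviceB.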

(source: https://docs.google.com/presentation/d/1rGkpmPx_W37ojvlp6SL4oLuCkDlIAdd6XkTyHx2Li5k/edit?usp=sharing)

HTTP API and Demo

Since the Leshan server exposes its capabilities through a broker interface (Redis pub/sub, AMQP, etc.), the demo web UI will not be provided in the Leshan server.

So for demo/testing, the leshan-server-demo project can use an in-memory implementation of the broker and tunnel it to the JavaScript UI using WebSockets.

Redis

A first cluster implementation will be based on Redis.

Northbound API

This API will be based on Redis Pub/Sub. It was implemented experimentally but is no longer available in master. The old code is still available in the cluster branch, but it is not up to date.

Registration

New registration events will be accessible on the LESHAN_REG_NEW channel. The payload is the new registration.

{
    regDate: 1467883514122,
    address: "127.0.0.1",
    port: 60071,
    regAddr: "0.0.0.0",
    regPort: 5683,
    lt: 30,
    ver: "1.0",
    bnd: "U",
    ep: "myDevice",
    regId: "ELP5Ql2v4b",
    objLink: [{
        url: "/", at: {rt: "oma.lwm2m"}
    }, {
        url: "/1/0", at: {}
    }, {
        url: "/3/0", at: {}
    }, {
        url: "/6/0", at: {}
    }],
    addAttr: {},
    root: "/",
    lastUp: 1467883514122
}

Registration update events will be accessible on the LESHAN_REG_UP channel. The payload contains the updated registration (regUpdated) and the registration update (regUpdate).

{
    regUpdate: {
        regId: "ELP5Ql2v4b",
        address: "127.0.0.1",
        port: 60071
    },
    regUpdated: {
        regDate: 1467883514122,
        address: "127.0.0.1",
        port: 60071,
        regAddr: "0.0.0.0",
        regPort: 5683,
        lt: 30,
        ver: "1.0",
        bnd: "U",
        ep: "myEndpoint",
        regId: "ELP5Ql2v4b",
        objLink: [{
            url: "/", at: {rt: "oma.lwm2m"}
        }, {
            url: "/1/0", at: {}
        }, {
            url: "/3/0", at: {}
        }, {
            url: "/6/0", at: {}
        }],
        addAttr: {},
        root: "/",
        lastUp: 1467884189419
    }
}

De-registration events will be accessible on the LESHAN_REG_DEL channel. The payload is the registration.

{
    regDate: 1467883514122,
    address: "127.0.0.1",
    port: 60071,
    regAddr: "0.0.0.0",
    regPort: 5683,
    lt: 30,
    ver: "1.0",
    bnd: "U",
    ep: "myDevice",
    regId: "ELP5Ql2v4b",
    objLink: [{
        url: "/", at: {rt: "oma.lwm2m"}
    }, {
        url: "/1/0", at: {}
    }, {
        url: "/3/0", at: {}
    }, {
        url: "/6/0", at: {}
    }],
    addAttr: {},
    root: "/",
    lastUp: 1467883514122
}
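A northbound consumer could keep a local view of the registrations by applying the three events above. The following is a minimal sketch: the class `RegistrationView` is hypothetical, and it assumes that payloads arrive as valid JSON strings (with quoted keys). Subscribing to the Redis channels is not shown; the `on_message` method would be fed by whatever Redis client is used.

```python
import json


class RegistrationView:
    """Local view of registrations, kept up to date from the three channels."""

    def __init__(self):
        self.by_endpoint = {}  # ep -> latest registration

    def on_message(self, channel, payload):
        event = json.loads(payload)
        if channel == "LESHAN_REG_NEW":
            self.by_endpoint[event["ep"]] = event
        elif channel == "LESHAN_REG_UP":
            # The update payload carries the full updated registration.
            updated = event["regUpdated"]
            self.by_endpoint[updated["ep"]] = updated
        elif channel == "LESHAN_REG_DEL":
            self.by_endpoint.pop(event["ep"], None)


view = RegistrationView()
reg = {"ep": "myDevice", "regId": "ELP5Ql2v4b", "lt": 30, "lastUp": 1467883514122}
view.on_message("LESHAN_REG_NEW", json.dumps(reg))

updated = dict(reg, lastUp=1467884189419)
update_event = {"regUpdate": {"regId": "ELP5Ql2v4b"}, "regUpdated": updated}
view.on_message("LESHAN_REG_UP", json.dumps(update_event))
last_up = view.by_endpoint["myDevice"]["lastUp"]

view.on_message("LESHAN_REG_DEL", json.dumps(updated))
```

After the de-registration event the view is empty again.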

Request/Response

The LESHAN_REQ channel is used to send requests. The payload contains a ticket for the request (ticket), the destination endpoint (ep) and the request to send (req).

{
  ep:"myEndpoint",
  ticket:"8c90592249c74a9b8a2da5754145dcc0",
  req:{
    kind:"read",
    path:"/3/0/1",
    contentFormat:1541,
  }
}
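Building such a request message could look like the sketch below. The helper `build_read_request` is hypothetical, and generating the ticket with `uuid.uuid4().hex` is an assumption (the sample ticket above is 32 hexadecimal characters, but the API does not state how tickets must be generated). Publishing the resulting string on LESHAN_REQ is left to the Redis client in use.

```python
import json
import uuid


def build_read_request(endpoint, path, content_format=None):
    """Return (ticket, JSON payload) for a read request on LESHAN_REQ."""
    ticket = uuid.uuid4().hex  # assumption: any unique string would do
    req = {"kind": "read", "path": path}
    if content_format is not None:
        req["contentFormat"] = content_format
    message = {"ep": endpoint, "ticket": ticket, "req": req}
    return ticket, json.dumps(message)


ticket, payload = build_read_request("myEndpoint", "/3/0/1", content_format=1541)
decoded = json.loads(payload)
```

The caller keeps the ticket in order to match the messages later received on the response channel.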

The LESHAN_RESP channel is used to receive responses. Several kinds of message can be received on this channel.

An Ack message, which means that the request is handled by one instance in the cluster.

{
  ticket:"8c90592249c74a9b8a2da5754145dcc0",
  ack:true
}

An Error message, which means that an error occurred on the instance which chose to handle the request.

{
  ticket:"8c90592249c74a9b8a2da5754145dcc0",
  err:{
    errorMessage:"an error message",
  }
}

A Response message with the response returned by the device.

{
  ticket:"8c90592249c74a9b8a2da5754145dcc0",
  resp:{
    kind: "read",
    code: "CONTENT",
    node: {
        kind:"singleresource",
        id:1,
        type: "string",
        value: "Lightweight M2M Client",
    }
  }
}

or

{
  ticket:"8c90592249c74a9b8a2da5754145dcc0",
  resp:{
    kind: "read",
    code: "NOT_FOUND",
    errorMessage:"a custom CoAP error message",
  }
}
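A consumer of LESHAN_RESP has to tell these message kinds apart. A minimal sketch, assuming the payloads arrive as valid JSON strings; the function name `classify_response` and the returned labels are hypothetical.

```python
import json


def classify_response(payload):
    """Return (ticket, kind, detail) for one LESHAN_RESP message."""
    message = json.loads(payload)
    ticket = message["ticket"]
    if message.get("ack"):
        return ticket, "ack", None
    if "err" in message:
        # Error raised by the instance that chose to handle the request.
        return ticket, "error", message["err"].get("errorMessage")
    if "resp" in message:
        # Response from the device; the code tells success from failure.
        return ticket, "response", message["resp"].get("code")
    return ticket, "unknown", None


ack = classify_response('{"ticket": "t1", "ack": true}')
err = classify_response('{"ticket": "t1", "err": {"errorMessage": "an error message"}}')
ok = classify_response('{"ticket": "t1", "resp": {"kind": "read", "code": "CONTENT"}}')
```

A caller would typically drop messages whose ticket it did not issue, since every subscriber receives every response.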

API to define: timeout? cancel request? observe?