HIP: LongFi Semantics #3
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,213 @@ | ||
- Start Date: 2019-08-16 | ||
- HIP PR: <!-- leave this empty --> | ||
- Tracking Issue: <!-- leave this empty --> | ||
|
||
Table of Contents | ||
================= | ||
|
||
* [Summary](#summary) | ||
* [Regulatory](#regulatory) | ||
* [Protocol](#protocol) | ||
* [Versioning](#versioning) | ||
* [Joining](#joining) | ||
* [Datagram](#datagram) | ||
* [Uplink](#uplink) | ||
* [Downlink](#downlink) | ||
* [Fragmentation](#fragmentation) | ||
* [Channels](#channels) | ||
|
||
## Summary | ||
[summary]: #summary | ||
|
||
This whitepaper introduces the high-level semantics of LongFi, the Helium network's wireless protocol. | ||
|
||
Providing wide-area wireless connectivity is the Helium network's _raison d'être_. Providing this connectivity in a manner that is implementable by both Helium and third-parties requires a free and open protocol that devices, hotspots, and routers understand. | ||
|
||
This proposal is not an all-encompassing specification but lays the foundation for further HIPs which will serve as the specification. | ||
|
||
## Versioning | ||
[versioning]: #versioning | ||
|
||
LongFi is versioned so that it can be improved in future revisions without breaking backward compatibility. | ||
|
||
## Regulatory | ||
[regulatory]: #regulatory | ||
|
||
Regulations on intentional radiators vary region by region. These regulations inform much of LongFi's design, primarily... | ||
|
||
> TODO: | ||
> - time on air | ||
> - duty cycle | ||
|
||
## Protocol | ||
[protocol]: #protocol | ||
|
||
LongFi is a session-oriented protocol. However, unlike most wireless protocols, which operate within a network of trusted base stations, devices in the Helium network communicate _through_ untrusted hotspots. Therefore, sessions in the Helium network are between devices and routers. Sessions persist regardless of which or how many hotspots receive their packets. | ||
|
||
``` | ||
┌──────────┐ | ||
│ Router │ | ||
└──────────┘ | ||
▲ | ||
┌─────┴─────┐ | ||
▼ ▼ | ||
┌─────────┐ ┌─────────┐ | ||
│ Hotspot │ │ Hotspot │ | ||
└─────────┘ └─────────┘ | ||
▲ ▲ | ||
└─────┬─────┘ | ||
▼ | ||
┌────────────┐ | ||
│ Device │ | ||
└────────────┘ | ||
``` | ||
|
||
### Joining | ||
[joining]: #joining | ||
|
||
When a device starts up, it is session-less, that is, not connected to its organization's router. The process of establishing a session is called joining. The send and response layer is called a super frame. All call/response messages will have the following fields at a minimum: | ||
|
||
Datagram Kind (DGK) - OUI - Device ID (DID) - Fingerprint (FP) | ||
|
||
| DGK | OUI | DID | FP | | ||
|-----|-----|-----|----| | ||
|
||
Sessions have a finite lifetime. | ||
Comment: Based on what? A known timeout?
Comment: Once the second standard deviation of total payload size in addition to the payload (198% total) is collected without getting a full message, the session is terminated and it needs to be re-initiated right now. This puts a cap on chattiness of devices and gives us a fixed memory overhead required for embedded.
Comment: You maybe answered a different question, @refugeesus? I think the maximum length of a packet after encoding is different from the lifetime of a session, which I think @fvasquez is asking about. I don't think we have defined the lifetime of a session. It needs to be as infrequent as possible, to save on resources, but frequent enough to be secure. So it might depend on how many bytes you've sent using that session key?
Comment: I am working from payloads being collections of packets. The total lifetime of a session, given a payload 100 bytes in length, is 198 bytes collected or the maximum amount of time it would take to send this amount of data.
Comment: Unique session key per ACK'ed payload sounds expensive, as I mention here #3 (comment) |
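One reading of the 198% rule proposed in this thread can be sketched as follows. This is only an illustration under stated assumptions: `session_byte_budget`, `SessionTracker`, and the `"open"`/`"complete"`/`"expired"` states are hypothetical names, and the thread has not settled whether this limit is the session lifetime at all.

```python
def session_byte_budget(payload_size: int) -> int:
    """Bytes a session may collect before it is abandoned (198% of payload)."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return payload_size * 198 // 100


class SessionTracker:
    """Hypothetical router-side bookkeeping for one session."""

    def __init__(self, payload_size: int) -> None:
        self.budget = session_byte_budget(payload_size)
        self.collected = 0

    def on_data(self, nbytes: int, decoded: bool) -> str:
        """Record received bytes and report the resulting session state."""
        self.collected += nbytes
        if decoded:
            return "complete"   # full message recovered, an ACK can be sent
        if self.collected >= self.budget:
            return "expired"    # budget spent without a full message
        return "open"


tracker = SessionTracker(payload_size=100)
```

With a 100-byte payload the budget is 198 bytes, matching the figure quoted in the thread.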
||
|
||
Super frames contain necessary connection information and requested payload size needed to facilitate communication. The added fields are: Session ID (SID), Payload Size (PLS). The generic super frame structure is as follows: | ||
|
||
| DGK | OUI | DID | FP | SID | PLS | | ||
|-----|-----|-----|----|-----|-----| | ||
|
||
The send structure of an unconnected device has the following fields completed: | ||
|
||
| DGK | OUI | DID | FP | SID | PLS | | ||
|-----|-----|-----|----|-----|-----| | ||
| X | X | X | X | _ | X | | ||
|
||
The received structure for an unconnected device has the following fields completed: | ||
|
||
| DGK | OUI | DID | FP | SID | PLS | | ||
|-----|-----|-----|----|-----|-----| | ||
| X | X | X | X | X | X | | ||
|
||
Once the device receives a complete super frame structure (has been assigned a session ID) it is considered connected and can begin transmitting data frames. | ||
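As a rough illustration of the join exchange, the sketch below models a super frame whose SID is empty in the call and filled in by the response. Field types and the names `SuperFrame`, `join_call`, and `join_response` are assumptions; this HIP does not yet define field sizes or encodings.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SuperFrame:
    dgk: int             # datagram kind tag
    oui: int             # organizationally unique identifier
    did: int             # device identifier
    fp: bytes            # fingerprint (derivation is still a TODO in this HIP)
    sid: Optional[int]   # session ID; None while the device is unconnected
    pls: int             # requested payload size (unit assumed to be bytes)

    def is_complete(self) -> bool:
        """A frame is complete once a session ID has been assigned."""
        return self.sid is not None


def join_call(dgk: int, oui: int, did: int, fp: bytes, pls: int) -> SuperFrame:
    """Frame sent by an unconnected device: every field except SID."""
    return SuperFrame(dgk, oui, did, fp, None, pls)


def join_response(call: SuperFrame, sid: int) -> SuperFrame:
    """Frame returned to the device: the call with SID filled in."""
    return replace(call, sid=sid)


call = join_call(dgk=0, oui=1, did=42, fp=b"\x00" * 4, pls=100)
response = join_response(call, sid=7)
```

Here `call.is_complete()` is false and `response.is_complete()` is true, which corresponds to the device moving from unconnected to connected.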
Comment: I would suggest that the device allocate the session ID and make the acknowledgement optional so devices can be 'fire and forget'.
Comment: If you give one-shot transmissions a different DGK value the optional ack is implied and session ID (which should probably be the last field) can be ignored.
Comment: I think your definition of session ID is different than what I was understanding previously from @Vagabond. My previous understanding: you might "connect" and create a session ID once a day, let's say, and then use that key to send every packet during the day, with or without ACKs. From what you're writing, it sounds like anything that's not fire and forget requires a unique session key which expires after the payload is sent. Which sounds expensive to me... |
||
|
||
#### Connection Frame | ||
|
||
Before sending a payload, a device must broadcast identifying information and receive confirmation from a router, via a hotspot, that the router is ready to receive data. The call/response described above is repeated here: | ||
|
||
*Call* | ||
|
||
| DGK | OUI | DID | FP | SID | PLS | | ||
|-----|-----|-----|----|-----|-----| | ||
| X | X | X | X | _ | X | | ||
|
||
*Response* | ||
|
||
| DGK | OUI | DID | FP | SID | PLS | | ||
|-----|-----|-----|----|-----|-----| | ||
| X | X | X | X | X | X | | ||
|
||
A completed response indicates that a receiver is in range and a router is capable of receiving/forwarding data to a desired endpoint. | ||
|
||
#### Data Frame | ||
|
||
Once a connection has been established, the device, referred to as the sender, will transmit an upper-bounded number of packets to the receiver. The added fields are: Seed 1 (S1), Seed 2 (S2), Payload (PL). The general structure of the data frame is as follows: | ||
|
||
*Call* | ||
|
||
| DGK | OUI | DID | FP | S1 | S2 | PL | | ||
|-----|-----|-----|----|----|----|----| | ||
|
||
Once the router has successfully received the entire message (one or many transmissions by the sender), the device will receive a response data frame. The added field is: Acknowledge (ACK). The response data frame will have the following structure: | ||
|
||
*Response* | ||
|
||
| DGK | OUI | DID | FP | ACK | | ||
|-----|-----|-----|----|-----| | ||
|
||
> TODO: | ||
> - Detailed information on when ACKs can occur may be needed. Alternatively, should it be excluded for brevity in this doc? | ||
Comment: I don't think ACKs should be excluded from this document. IoT developers want some guidance on what to expect from a LongFi router.
Comment: I can include it from the prior doc then. |
||
> - Size of fields to be included | ||
> - S1/S2 may be converted to sequence number? | ||
Comment: Are S1 and S2 simply sequence numbers? If so then why are there two for each "Call" datagram? I thought LDPC didn't require sequence numbers to distinguish between droplets.
Comment: No, S1 and S2 are seed numbers used for the fountain code process. |
||
|
||
Comment: I know they're a pain but UML sequence diagrams might help others visualize session message flow.
Comment: yeah.... I'll get to it then |
||
@vagabond | ||
|
||
> TODO: | ||
> - diffie hellman? | ||
> - how long do sessions live for? | ||
Comment: I remember @refugeesus telling me that an LDPC decoder can determine how much of a message has been received. A device can decide to terminate a session and start a new one when a download stops making progress.
Comment: Yes, see second standard deviation comment below. |
||
> - explanations of each field | ||
|
||
--- | ||
**DGK** | ||
|
||
Datagram kind. A tag value indicating this datagram's variant type. | ||
|
||
> **TODO:** | ||
> - How many variants are needed? | ||
Comment: At the lowest level, a sensor can be thought of as a collection of registers that you can read from and write to. So I would start with
Comment: I'm not sure that is the kind of Type that should be here as it seems more relevant to Payload contents than Datagram function.
Comment: Fair enough. At the very least I think the protocol needs some way to distinguish between uplink and downlink datagrams so the network knows how to route them. That may not need to be a type encoded in the datagram. Could simply be whether the packet originated from the internet or a radio receiver.
Comment: Agreed, distinguishing uplink/downlink and/or first fragment vs not-first fragments would be good
Comment: We don't need to order fragments anymore. Why do you need to distinguish between uplink/downlink? A device will have to query if data is available before receiving some payload. If the query returns true, the device inherently knows it is a receiver and the gateway the sender. Otherwise the direction is implied to be the other way. The only other bi-directional communication is joining acknowledgement and a full message received acknowledgement.
Comment: I don't think there's enough information in the spec to state this.
There is, again, not enough information to indicate that uplink/downlink could be implied. I only propose those as potential types of
Comment: Fountain codes remove this problem.
DGK is included in generic format for now. |
||
> - Can this tag serve both versioning and variant disambiguation? | ||
Comment: I think versioning should be separate from DGK. That simplifies variant encoding and version parsing. |
||
> - Is this enough? | ||
|
||
--- | ||
**OUI** | ||
|
||
Organizationally unique identifier. A globally unique number which hotspots use to forward datagrams to the correct organization's router. | ||
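Forwarding by OUI amounts to a table lookup on the hotspot. A minimal sketch, assuming a hypothetical in-memory table and placeholder router addresses (how hotspots actually learn the OUI-to-router mapping is not specified in this HIP):

```python
from typing import Dict, Optional

# Hypothetical routing table: OUI -> router address. The addresses are
# placeholders under the reserved ".invalid" TLD, not real endpoints.
ROUTERS: Dict[int, str] = {
    1: "router.org-one.invalid",
    2: "router.org-two.invalid",
}


def route_for(oui: int, table: Dict[int, str]) -> Optional[str]:
    """Return the router address for an OUI, or None if it is unknown."""
    return table.get(oui)


destination = route_for(1, ROUTERS)
unknown = route_for(99, ROUTERS)
```

A datagram whose OUI has no entry (`unknown` above is `None`) would presumably be dropped, though this HIP does not say so.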
|
||
--- | ||
**DID** | ||
|
||
Device identifier. DIDs are assigned to devices by organizations. Every hardware device in an organization _should_ have a unique DID, but sharing DIDs is not forbidden. | ||
Comment: So an OUI combined with a DID constitute a unique device address on the network?
Comment: That's my thinking |
||
|
||
> TODO: | ||
> - Can DIDs really be shared? | ||
Comment: From our perspective, we don't care if organizations want to do weird things with DIDs. |
||
|
||
--- | ||
**PAY** | ||
|
||
Datagram payload. Payload lengths depend on spreading and coding, but not on their actual content. Payloads are intended to be encrypted and opaque to hotspots. | ||
Comment: Did you mean to say "Payload lengths [do not] depend on spreading and coding ..."?
|
||
|
||
> **TODO:** | ||
> - ~intended~ required to be encrypted? | ||
Comment: Again, we don't care what organizations want to do. They probably should encrypt the information but it's not required. Imagine open networks of weather stations, etc. |
||
|
||
--- | ||
**FP** | ||
Comment: This was previously called MAC/HMAC. I like Fingerprint better because it's more expressive, but I know the blog post by Dal and perhaps the white paper make reference to MAC. |
||
|
||
Fingerprint. Packet brokerage between hotspots and routers depends on fingerprints. They allow a hotspot to prove to a router that the hotspot has a datagram destined for that router, without divulging the datagram's payload. This ability is core to hotspots earning data credits for forwarding datagrams. | ||
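One way a hotspot could offer a fingerprint before delivering the payload is sketched below. Everything specific in it is an assumption: this HIP leaves the fingerprint derivation as a TODO, so a truncated HMAC-SHA256 over OUI, DID, and payload stands in for it, and `Router`, `wants`, and `accept` are hypothetical names.

```python
import hashlib
import hmac
from typing import Set

FP_LEN = 4  # hypothetical fingerprint length in bytes


def fingerprint(key: bytes, oui: int, did: int, payload: bytes) -> bytes:
    """Stand-in derivation: truncated HMAC-SHA256 keyed by a device secret."""
    header = oui.to_bytes(4, "big") + did.to_bytes(4, "big")
    return hmac.new(key, header + payload, hashlib.sha256).digest()[:FP_LEN]


class Router:
    """Hypothetical router that buys each distinct datagram at most once."""

    def __init__(self, device_key: bytes) -> None:
        self.device_key = device_key
        self.purchased: Set[bytes] = set()

    def wants(self, fp: bytes) -> bool:
        """Offer step: the hotspot shows only the fingerprint."""
        return fp not in self.purchased

    def accept(self, oui: int, did: int, payload: bytes, fp: bytes) -> bool:
        """Delivery step: check the payload against the offered fingerprint."""
        expected = fingerprint(self.device_key, oui, did, payload)
        if not hmac.compare_digest(expected, fp):
            return False
        self.purchased.add(fp)
        return True


key = b"k" * 16
router = Router(key)
fp = fingerprint(key, oui=1, did=42, payload=b"hello")
```

In this sketch a second hotspot offering the same fingerprint is declined by `wants`, which is one possible answer to the duplicate-packet question; it is not a settled design.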
Comment: Packet brokerage gets complicated. Do we reward hotspots for forwarding duplicate packets? Do hotspots stop forwarding packets they aren't being rewarded for? I don't know the answers to these questions.
Comment: If there is no hardware link layer session, it seems you can either pay for packets until they are no longer needed, or drop duplicates and make gateways eat the cost of doing the work.
Comment: I believe the model is:
The interaction is supposed to happen quickly and with a small amount of trust; i.e., the hotspot does not wait to see the transaction post before delivering the payload. The theory is that if a router continuously stiffs the hotspot, the hotspot will blacklist said router.
Comment: Works for me but I don't think the blacklist period should be permanent. A hotspot should eventually give a router another chance to pay in the event that router's favorite hotspot goes down.
Comment: I think the key thing is that the router must say "i like that fingerprint, give me the payload" if and only if it intends on burning a DC. Just providing the fingerprint and the router ignoring or NAK'ing should not be what makes a hotspot blacklist the router. In the case where a router has said "I like that fingerprint give me the payload" and repeatedly doesn't pay, this is pretty nefarious behavior and I think a very long or permanent blacklisting of that hotspot is in order.
Comment: Agreed. That is definitely bad behavior on the part of a router. So a router won't ask a hotspot for a packet payload if it already just received a packet from that same device via a different hotspot? I'm good with that.
Comment: We currently do not have any method to atomically swap credits for packets via the blockchain, so this is still all up in the air and might be unspecified.
Comment: My understanding was atomic swap is just unfeasible due to latencies, so it would be more of an untrusted microtransaction with eventual settlement. |
||
|
||
> **TODO:** | ||
> - What data are fingerprints derived from? | ||
> - Are they SHA or something more exotic? | ||
> - Is the fingerprint, along with non-payload data, a | ||
> zero-knowledge-proof (ZKP)? | ||
|
||
|
||
### Uplink | ||
[uplink]: #uplink | ||
|
||
Uplink communication is from device to router. | ||
|
||
> TODO: | ||
> - Unacknowledged vs acknowledged | ||
> - Listen before talk | ||
> - Spreading-factor vs dwell-time vs range | ||
> - Initial uplink spreading factor | ||
|
||
### Downlink | ||
Comment: Do we have two categories of devices?
Or do we say everything is "sync" but that some devices have a long dwell time waiting for downlink packets. If they pulse out every 6 hours and wait for 6 hours on some channel, they're pretty much always on and this might allow for more dynamic behavior (whereas "device category" feels pretty static).
Comment: That's a good question. The expectation that a device is "always on" seems unreasonable to me. Most downlink communications are what I call
Comment: Downlink communication will have to be initiated by a query from the device to router due to the class of device we deal with. Devices that are not dependent upon power conservation and network models which have stronger device-to-gateway association (sessions and handoff) are capable of asynchronously receiving in practice. We can't guarantee the former and explicitly do not have the latter for our devices. |
||
[downlink]: #downlink | ||
|
||
Downlink communication is from router to device. | ||
|
||
> TODO: | ||
> - Unacknowledged vs acknowledged | ||
> - Listen before talk | ||
> - Spreading-factor vs dwell-time vs range | ||
> - Initial downlink spreading factor | ||
|
||
### Fragmentation | ||
[fragmentation]: #fragmentation | ||
Comment: Based on what @refugeesus described to me in our meeting last week I believe the LDPC layer handles all message fragmentation, reassembly and retransmission.
Comment: Fragmentation, reassembly, and FEC are covered by the process linked in a previous comment. No need for retransmission anymore. The paper excerpt also covers the ACK and rolling window timeout concepts needed to make the link layer mostly complete. |
||
|
||
Datagrams are the fundamental unit of messaging in LongFi. Additionally, they have regulatory-imposed maximum payload sizes. This poses a problem for applications needing to send data of arbitrary length. The solution to this problem is fragmentation. Fragmentation is the process of decomposing large application-level messages into several datagrams and reassembling those fragments at the recipient's end of the link. A naive implementation of this process is fraught with peril when communicating over unreliable links. | ||
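To make the problem concrete, here is a naive fragmentation sketch. The `(index, total, chunk)` fragment layout and the function names are hypothetical and are not LongFi's wire format; the point is that losing a single fragment makes the whole message unrecoverable, which is the peril referred to above.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Fragment = Tuple[int, int, bytes]  # (index, total, chunk), hypothetical layout


def fragment(message: bytes, max_payload: int) -> List[Fragment]:
    """Split a message into chunks of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)] or [b""]
    return [(i, len(chunks), chunk) for i, chunk in enumerate(chunks)]


def reassemble(fragments: Iterable[Fragment]) -> Optional[bytes]:
    """Rebuild the message, or return None if any fragment is missing."""
    by_index: Dict[int, bytes] = {}
    total: Optional[int] = None
    for index, count, chunk in fragments:
        by_index[index] = chunk
        total = count
    if total is None or set(by_index) != set(range(total)):
        return None
    return b"".join(by_index[i] for i in range(total))


message = bytes(range(50))
frags = fragment(message, max_payload=16)
```

Reassembly tolerates reordering (`reassemble(reversed(frags))` returns the message) but not loss: dropping any one of the four fragments yields `None`, and the naive scheme has no way to recover short of retransmission.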
|
||
### Channels | ||
[channels]: #channels |
Comment: So hotspots have no awareness of sessions? They merely see individual packets from devices that they forward on to their intended routers.
Are sessions mandatory for LongFi? Is there no session-less means of communication like UDP?
What about downlink communications from routers to devices? Are sessions bi-directional or are downlink sessions something different?
Comment: You need a form of a session somewhere to have any down-link. The association between connected device to hotspot target is less certain with session associated with the router, but it may be manageable. We won't know how manageable this is until this is deployed in the wild.
I believe you will need to establish a session from the device side, where they can query if a down-link packet is available or not.
Comment: From a practical standpoint, I don't think we can enforce sessions if a device/router pair decided not to; at least there's no way for us to prevent a device/router pair from using the hotspots to move packets despite a lack of session. And I think your UDP analogy is really good.
That being said, I think the LongFi spec describes a possible device/router protocol in addition to the hotspot routing protocol.
As a side-note, have we dropped LoFi/HiFi nomenclature, @JayKickliter?
Comment: Yes, at least for now. I didn't want to force the separation from the outset.
Comment: Using fountain codes, we will implicitly have sessions between routers and devices to fulfill the droplet aggregation process.
Comment: If we use fountain code, reconstruction will be on the router and not on the hotspot.
Comment: I'm curious what the reasoning was though? I had dreamed up schemes where aggregation/reassembly could happen router side
Comment: That's what I thought I was saying:
Comment: Or am I misreading something?
Comment: I am the one misreading something. My bad