Reverse AS-NS linking direction #2312
Comments
We want to be able to load-balance these and to do things concurrently, so stream-mapping a NS to one singular AS is definitely not the way to go.
Yes, that's what we agreed upon. The goal is to get rid of linking altogether.
I've updated the issue. Should've written this earlier in order to keep all the details in mind.
I already started on this locally a couple of days ago, including NS support. Let's do this together with this issue; otherwise we'll later need to deprecate yet another RPC, which is not worth it in this case, since it's such a trivial feature to add with …
Great.
We wouldn't tie the AS instance to application identifiers, right? We should be spreading traffic from applications with many devices over multiple AS instances.
I like the idea of AS-AS communication via pub/sub. For the future, I think we should document why we chose AS-AS instead of a dedicated component for applications to subscribe to.
We can use the in-memory variant by default for single-process deployments and the Getting Started guide, see …
What scenario is this? That an application schedules downlink duplicates via multiple ways, i.e. MQTT and webhooks? I don't think we should/want to do something against that.
True - probably
No, and now I see that the original text wasn't clear. It's about which instance starts which PubSub integrations (that we currently have with MQTT and NATS) in multi AS environments. We cannot have multiple instances subscribe to the same external PubSub service, since then downlink queue operations would be duplicated:
NATS does have subscription groups, so for NATS this could be solved (I don't think …
I see what you mean. Yes, that is something to account for. We need some sort of sharding. That touches, however, on clustering, which we don't discuss here. What else needs to be discussed here to move forward with this?
For now all of the open questions have been addressed (to the extent of this repository). I will remove the …
I've pushed my current progress in #3190. I think I covered most (if not all) of the AS implementation, but I have some questions regarding the API:
Background
Can be skipped if you are familiar with https://github.com/TheThingsIndustries/lorawan-stack/issues/941
In the current implementation of the communication between the Network Server and the Application Server, the links are established by the Application Server, which links itself to the Network Server. Over this link, uplinks are sent as a stream of `ttnpb.ApplicationUp`.

The underlying issue with this approach is that Application Servers are inherently stateful due to linking. This introduces challenges with regard to load balancing between multiple Application Servers: distributing links over multiple instances and maintaining/migrating links when instances go down or a network partition occurs are just a few of the issues introduced by the linking state.
Proposed solution
In order to tackle this issue, following our offline discussions, we've decided that we would like to reverse the direction of the links. This means that instead of the Application Server being a client to the Network Server, the Network Server becomes a client of the Application Server. This simplifies the logic for both components.
Since the Application Server may have clients (MQTT/gRPC) connect to it while not being linked by a Network Server, a new PubSub service should be used as a broker between Application Server instances. When a new application subscription (connection) arrives at an Application Server, a PubSub subscription is created for the uplinks of said application. When an uplink is received by an Application Server instance, it is published to the topic of the application.
Implementation details:

- A `HandleUplink` RPC should be added to a new `NsAs` service that the Application Server implements.
- The Network Server uses `component.GetPeer`/`component.GetPeerConn` in order to retrieve the peer/connection to the AS, based on the application identifiers.
- The uplink is then published to the PubSub service, from where subscription-based frontends can pick it up.
- Use `gocloud.dev/pubsub` for inter-AS PubSub communication.

Required migrations
We need to migrate the application default payload formatters to a more general Application Server application store. The migration should be lazy: when we need the application payload formatter, we check whether we have it in the AS application store, and if it is missing we attempt to retrieve it from the old link store. Any writes/updates are done in the application store.
Open questions
- Should we drop external NS linking completely?
- How can AS PubSub integrations handle downlink operations? Internally we can ensure that uplinks are sent only once, using subscription groups, but I don't see how we can deduplicate downlink queue operations, since some services (MQTT) do not support subscription groups.
  Reworded: How can we decide which instance starts an external PubSub integration (NATS/MQTT), given that having each instance connect to every integration would result in duplicated downlink queue operations?