
Releases: l7mp/stunner

First stable release: STUNner goes GA!

31 Oct 12:19
bb50435

We are extremely proud to present STUNner v1.0.0, the first stable and generally available release of the STUNner Kubernetes media gateway for WebRTC brought to you by l7mp.io. This release marks the culmination of a two-year journey to make WebRTC services a first class citizen in the cloud native ecosystem.

There are no major or minor changes compared to RC5. This release is not without improvements though: we completely reorganized the documentation and introduced two new tutorials, one showing how to deploy STUNner with Elixir and another one for Janus. STUNner now comes with a total of 8 demos, encompassing a wide range of WebRTC applications from cloud-gaming and desktop streaming to video-conferencing. In addition, there are some minor janitorial fixes throughout the code base.

The journey so far

We started the project a little bit more than two years ago with one simple and ambitious goal: fit WebRTC services natively into the Kubernetes world. Back in those days, the state-of-the-art was to use Kubernetes nodes as mere VMs (the dreaded "host-networking hack"), expose media servers directly to the Internet, and rely on costly 3rd-party STUN/TURN providers for NAT traversal. This model renders essentially all the cloud native selling points ineffective, like elasticity, scalability and manageability, making WebRTC applications an unwelcome guest in the cloud native ecosystem.

Clearly, this had to change. For deploying conventional web apps into Kubernetes it is customary to include a dedicated gateway component with the sole responsibility of exposing the app to the Internet in a secure and scalable way. We wanted to exploit the same pattern for WebRTC, by creating a media gateway for ingesting WebRTC traffic into a Kubernetes cluster over a single IP and port. It just occurred to us that if we implement our gateway on top of the TURN protocol we can also eliminate the need for 3rd party NAT traversal, rendering self-hosting WebRTC services simpler and more economic and cutting down latency significantly. And that is how STUNner was born.

STUNner has come a long way during the two years since its first public release. From a standalone TURN server deployed from a static YAML manifest and customized via environment variables, it has become a fully fledged media gateway service managed by a dedicated Kubernetes operator that implements the official Kubernetes Gateway API. Today STUNner provides many unique features that no other open-source or commercial alternative does, like a fully managed dataplane lifecycle, limitless scale-up/scale-down with thoroughly tested graceful shutdown, and extensive documentation with several detailed tutorials on how to install the most popular WebRTC services into Kubernetes over STUNner.

With this release we consider the project's main goal completed. We have worked closely with our users to help them deploy and troubleshoot STUNner with essentially any WebRTC service imaginable, from small-scale to large-scale, from open-source to proprietary, and from proof-of-concept to production stage. No matter which WebRTC application our customers throw at it, STUNner just handles the load reliably, efficiently, and mostly invisibly.

We are now confident that STUNner is ready for the general public, and it is time to roll the first GA release.

The future

The v1 release, of course, does not mean an end to our journey.

STUNner has always been an open source software available under a permissive license. The open source development model has proved uniquely fruitful to bring the initial STUNner releases out to early adopters, acquire a solid installed base, and get invaluable bug reports and fixes from our users. This will remain unchanged: STUNner v1 will always be published and maintained as an open-source software.

That being said, the main course of STUNner development will undergo a major change from v1 onward: the future of STUNner development is commercial. We have many exciting new features in preparation, slated to be released soon in the enterprise version of STUNner under an affordable monthly subscription model. Our flagship Linux/eBPF-based UDP/TURN acceleration engine provides millions of packets per second performance per CPU at ultra-low latency. And this is just the start: from Kubernetes operators for deploying and managing full WebRTC applications to Let's Encrypt integration and multi-cloud support, we have many premium features in preparation.

Acknowledgments

Finally, a round of acknowledgments is in order. First and foremost, we are thankful to our users who supported us from the start, providing us invaluable feedback and early validation of our efforts. Without you, we would have never got to this point. Second, we are grateful to the pion/webrtc project, and especially Sean DuBois: if we didn't have access to a full-featured and amazingly reusable Golang-based WebRTC toolkit from the outset, most probably we would still be struggling with the complexities of libwebrtc, desperately trying to release the first usable version of STUNner. Pion is amazing! Last but not least, we thank everyone contributing to our development efforts in any way: from PRs and bug fixes to documentation updates and bug reports, your help is truly appreciated.

Enjoy STUNner, join us at Discord, and don't forget to support us!

Towards v1: Fifth stable release candidate

20 Sep 13:36

We are proud to present STUNner v0.21.0, the fifth stable release candidate of the STUNner Kubernetes media gateway for WebRTC, brought to you by l7mp.io.

This release candidate marks the next step of the stabilization efforts that will allow us to release the stable v1 version. As such, there have been only minor changes throughout the code base, apart from the usual assortment of documentation fixes, module import updates, and infrastructure upgrades.

News

In the last release we implemented a minor change to the default load balancer settings, which ended up breaking STUNner for some of our users. This release adds a new annotation that allows the affected users to disable this setting. Another stability fix (in fact, a one-line commit) makes sure the STUNner gateway operator safely exits with an error when it fails to start up for some reason (e.g., due to a missing CRD or RBAC policy), instead of hanging forever. Failing pods early is a safe way to prevent runaway startup crash loops, letting Kubernetes retry the pod while enforcing its standard backoff algorithm.

Disabling session affinity

By default STUNner applies the sessionAffinity: ClientIP setting on the LB services it creates to expose Gateways. Normally this setting improves stability by ensuring that each TURN session is safely pinned to the same dataplane pod for its entire lifetime. Certain hosted Kubernetes platforms, however, seem to reject UDP LB services that have this setting on, breaking STUNner deployments on these systems.

To prevent STUNner from enforcing session affinity on the LB Service corresponding to a Gateway, you can now set the stunner.l7mp.io/disable-session-affinity: true annotation on the Gateway. Without the annotation, session affinity remains turned on.
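For example, a Gateway carrying the new annotation might look like the following sketch (the Gateway, namespace, and listener names are illustrative, not from this release's docs):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: udp-gateway
  namespace: stunner
  annotations:
    # opt out of sessionAffinity: ClientIP on the rendered LB Service
    stunner.l7mp.io/disable-session-affinity: "true"
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: TURN-UDP
```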

Commits

fix: Add annotation to unset LB Service session-affinity, fix #155
fix: Exit when the operator fails to start up
doc: Document the disable-session-affinity:true annotation
chore: Add buildinfo to the stunnerctl binary
chore: Bump pion/turn to v4 and update some deps
chore(rtd): Update rtd dependencies

Enjoy STUNner and don't forget to support us!

Final stabilization: Fourth stable release candidate

30 Aug 13:51

We are proud to present STUNner v0.20.0, the fourth stable release candidate of the STUNner Kubernetes media gateway for WebRTC, brought to you by l7mp.io.

Originally, we did not intend to release a fourth release candidate (RC). In fact, after RC1, we hadn't planned on releasing any additional RCs. However, a few minor bug fixes and feature requests have made it necessary to revise our plans once again. As a result, we have decided to roll out a fourth, and, if all goes well, final release candidate before v1.

News

There are no major new features in this release. Minor changes include a couple of new Gateway annotations to choose the NodePort and to control whether the health-check port is exposed on external load-balancer Services (see the rewritten documentation). Additionally, three significant architectural changes have been made that may impact our users: as of this release, we have set the default TURN session stickiness to ClientIP, disabled the operator finalizer, and added full support for STUN deployments (previously, only TURN was supported). This completes the set of NAT traversal protocols supported by STUNner, offering the full WebRTC suite.

Session stickiness

TURN is a layer-7 protocol that implements its own handshakes for creating and managing allocations, requesting permission to reach peer IPs, and more. This is quite different from most UDP-based protocols, like DNS, which typically use a single packet exchange. DNS is simpler to support in UDP load balancers, as it does not require subsequent packets from the same client to be routed to the same endpoint/pod. This can be managed without creating per-client session state in the load balancer (e.g., using round-robin or a simple hash). Unfortunately, many load balancer implementations still default to DNS-like stateless UDP session routing, which conflicts with the UDP/TURN requirement that packets from the same client IP and port be routed to the same backend pod. When this fails, you get the dreaded "no allocation found" errors, indicating that the TURN packets of a session were routed by the load balancer to a new pod where no TURN allocation exists for the session.

Enabling sticky sessions for the load balancer service can improve the robustness of TURN connection routing, especially during scale-out/scale-in events. Therefore, in this release, STUNner sets the load balancer Service's session affinity to ClientIP.

We have received overwhelmingly positive feedback from our users regarding this change. However, it is still not entirely clear how UDP session stickiness is implemented across different Kubernetes providers' load balancer controllers. Please file an issue if this setting causes any problems in your deployment.
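On the Service that STUNner renders for a Gateway, the change amounts to roughly the following fragment (a sketch; the Service name, listener name, and port are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: udp-gateway
  namespace: stunner
spec:
  type: LoadBalancer
  sessionAffinity: ClientIP   # set by STUNner from this release
  ports:
    - name: udp-listener
      port: 3478
      protocol: UDP
```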

Disabling the operator finalizer

In the last release, we introduced a session finalizer in the STUNner gateway operator. This feature ensured that all Kubernetes resources automatically created by the operator were properly removed and resource statuses were invalidated when the operator terminated. Unfortunately, this conflicted with the high-availability requirements of some of our customers. Specifically, during operator restarts (e.g., due to a node failure), there was a transient phase where the old operator instance had already removed the dataplanes and external load balancer services and the new instance had not yet finished creating the replacements, causing clients' live TURN sessions to break.

To address this issue, we have disabled the finalizer by default in this release. You can still enable it by using the command-line flag --enable-finalizer=true on the operator; otherwise, it remains off by default. This change means that when the operator terminates, Kubernetes resources it automatically created may be left in an undefined state. To avoid potential issues, make sure you manually remove all Gateway API resources (Gateways, UDPRoutes, etc.) when uninstalling STUNner to clean up your cluster.
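If you rely on finalization, the flag can be passed back to the operator container. A sketch of the relevant operator Deployment fragment (the container name and image tag are illustrative, only the flag itself comes from this release):

```yaml
# Fragment of the operator Deployment pod template
spec:
  containers:
    - name: stunner-gateway-operator
      image: l7mp/stunner-gateway-operator:latest
      args:
        - --enable-finalizer=true   # finalization is off by default
```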

STUN support

Several users have indicated that they expected a project named "STUNner" to actually support the STUN protocol the name derives from. Until now, our focus has been on supporting TURN in STUNner and, although the implementation has always included a complete STUN server, we had not explicitly supported STUN. With this release, STUN is now a first-class citizen in STUNner. For more details, see the related blog post.

Commits

feature: Set LB service session affinity to "clientIP" (#51)
feature: Allow Gateways to select the target port, fixes #50
feature: Allow stunnerd pod labels/annotations to be set from the DP
feature: Configure the exposition of the health-check port, fix #49
feature(turncat): Set SNI on TURN/TLS client connections
fix: Fix indexing error in upsertDeployment, #48
fix: Fix segfault when setting the affinity from the Dataplane
fix: Prevent concurrent WS socket write errors in the CDS client
fix: Remove STUNner annotations from Service unless defined in Gateway
fix: Set spec.externalTrafficPolicy only for external Services, fix #150
fix(turncat): Report error in turncat when invoked with empty peer IP
feat: Disable operator finalization

Enjoy STUNner and don't forget to support us!

The final missing pieces: Third stable release candidate

28 May 17:15

We are proud to present STUNner v0.19.0, the third stable release candidate of the STUNner Kubernetes media gateway for WebRTC from l7mp.io.

Originally we did not plan to release a third RC before v1; however, a couple of compelling bugs and feature requests required us to revisit this plan. Now that the v1 release milestones in the main STUNner repo and the STUNner gateway operator repo have all been completed, we decided to roll a third release candidate. We hope this version will become the final v1 release in the coming weeks.

Changes occurred mostly on the side of the Kubernetes gateway operator, including the implementation of a new endpoint discovery controller to restore the graceful backend shutdown feature, further customization in the way dataplane pods are deployed, new Gateway annotations to fine-tune the way STUNner gateways are exposed to clients, and the implementation of a proper finalizer that makes sure the cluster is left in a well-defined state after the operator has been shut down. With this release, STUNner goes into a soft-freeze state again: only fixes, refactors, and documentation updates will be accepted in the master branch until we release v1.

News

A new controller for endpoint discovery

One of STUNner's main security features is filtering client connections based on the requested peer address: STUNner permits clients to reach only the pods that belong to one of the backend services in the target UDPRoute, and it blocks access to all other pod IPs. This feature relies on STUNner's endpoint discovery mechanism, which makes it possible for the operator to learn the pod IPs that belong to backend services. Until this release, the operator has been watching the legacy Kubernetes Endpoints API to learn IP addresses. Unfortunately, certain limitations of this legacy API have made backend graceful shutdown impossible to support. In particular, when a backend pod is terminated it is immediately removed from the Endpoints resource by Kubernetes, which triggers STUNner into rejecting TURN packets for the terminating backends from that point. This breaks all TURN connections to terminating backends, even though the backend may very well remain functional for a while to finish servicing active client connections (this process is called graceful shutdown).

In this release, STUNner's endpoint discovery mechanism has been rewritten over the newer EndpointSlice API. Since EndpointSlices keep the IP of terminating pods, this change has restored the graceful shutdown functionality.

Note that you need at least Kubernetes v1.21 to take advantage of the EndpointSlice API (STUNner falls back to the legacy Endpoints API when EndpointSlices are unavailable), and at least v1.22 to let EndpointSlices show terminating pods.

New dataplane customization features

With STUNner going into large-scale production, users have requested further customization features to fine-tune the way the stunnerd dataplane pods are deployed. With this release, the Dataplane CRD, used as a template by the operator to create stunnerd deployments, contains some additional settings:

  • ImagePullSecrets: list of Secret references for pulling the stunnerd image. This is useful when deploying stunnerd from a private container image repository.
  • TopologySpreadConstraints: this standard Deployment spec describes how the group of stunnerd pods ought to spread across topology domains.
  • ContainerSecurityContext: this field in the Dataplane spec allows setting container-level security attributes. Setting pod-level security attributes has always been supported, but now it is possible to customize security attributes at the level of each container in the stunnerd pods. This is useful for deploying sidecar containers alongside stunnerd.

For users deploying STUNner with a Helm upgrade: make sure to manually apply the new CRDs (kubectl apply -f deploy/manifests/static/stunner-crd.yaml), otherwise you won't have access to the new Dataplane spec fields.

Retaining clients' source IP

Normally, Kubernetes load balancers apply source IP address translation when ingesting packets into the cluster. This replaces clients' original IP address with a private IP address. For STUNner's intended use case, as an ingress media gateway exposing the cluster's media services over the TURN protocol, this does not matter. However, STUNner can also act as a STUN server, which requires clients' source IP to be retained at the load balancer.

Starting from the new release, this can be achieved by adding the annotation stunner.l7mp.io/external-traffic-policy: local to a Gateway, which will set the service.spec.externalTrafficPolicy field in the Service created by STUNner for the Gateway to Local. Note that this Kubernetes feature comes with fairly complex limitations: if a STUN or TURN request hits a Kubernetes node that is not running a stunnerd pod, then the request will silently fail. This is required for Kubernetes to retain the client IP, which otherwise would be lost when passing packets between nodes. Use this setting at your own risk.
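For example, a Gateway annotated to retain client source IPs might look like the following sketch (the Gateway, namespace, and listener names are illustrative; only the annotation itself comes from this release):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: stun-gateway
  namespace: stunner
  annotations:
    # sets service.spec.externalTrafficPolicy: Local on the rendered Service
    stunner.l7mp.io/external-traffic-policy: local
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: TURN-UDP
```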

Manually provisioning the dataplane

In some cases it may be useful to manually provision a dataplane for a Gateway. One specific example is when STUNner is used as a STUN server: replacing the stunnerd Deployment created by the operator to run the dataplane for a Gateway with a DaemonSet will make sure that at least one stunnerd pod runs on each node, which eliminates the above problem created by the service.spec.externalTrafficPolicy: Local setting.

In the new release, adding the annotation stunner.l7mp.io/disable-managed-dataplane: true to a Gateway will prevent STUNner from spawning a dataplane Deployment for the Gateway (the LB Service will still be created). This then allows one to manually create a stunnerd dataplane and connect it to the CDS server exposed by the operator to load fresh dataplane configuration. Remove the annotation to revert to the default mode and let STUNner manage the dataplane for the Gateway.

For instance, in order to run the dataplane of a Gateway in a DaemonSet:

  • dump the automatically created Deployment into a YAML file (this will serve as a template for the manually created DaemonSet),
  • apply the above annotation to make sure the operator removes the automatically created Deployment,
  • edit the template in the YAML file by rewriting the resource kind from apps/v1/Deployment to apps/v1/DaemonSet and removing the settings that no longer apply (like replicas), and
  • finally apply the modified YAML.

This will deploy stunnerd to all nodes of the cluster, making sure that STUN requests will always find a running STUN server no matter which Kubernetes node they hit. The cost, however, is that the dataplane DaemonSet will have to be manually adjusted every time you change the Gateway.
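The above workflow might look like the following sketch against a live cluster (the Gateway, Deployment, and namespace names are illustrative):

```shell
# Dump the operator-managed Deployment as a template for the DaemonSet
kubectl -n stunner get deployment udp-gateway -o yaml > dataplane.yaml

# Tell the operator to stop managing the dataplane for this Gateway
kubectl -n stunner annotate gateway udp-gateway \
    stunner.l7mp.io/disable-managed-dataplane=true

# Edit dataplane.yaml: change "kind: Deployment" to "kind: DaemonSet"
# and drop Deployment-only fields like spec.replicas; then apply it
kubectl apply -f dataplane.yaml
```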

Manual dataplane provisioning requires intimate knowledge of the STUNner internals; use this feature only if you know what you are doing.

Selecting the NodePort for a Gateway

By default, Kubernetes assigns a random external port from the NodePort range (30000-32767 by default) to each listener of a Gateway exposed in a NodePort Service. This requires all ports in the NodePort range to be opened on the external firewall, which may raise security concerns for hardened deployments.

In order to assign a specific NodePort to a particular listener, you can now add the annotation stunner.l7mp.io/nodeport: {"listener_name_1":nodeport_1,"listener_name_2":nodeport_2,...} to the Gateway, where each key-value pair is the name of a listener and the selected (numeric) NodePort. The annotation value itself must be a proper JSON map. Unknown listeners are silently ignored.
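For instance, pinning two hypothetical listeners to fixed NodePorts (the Gateway, namespace, listener names, and port numbers are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: udp-gateway
  namespace: stunner
  annotations:
    # value must be a valid JSON map of listener name -> NodePort
    stunner.l7mp.io/nodeport: '{"udp-listener":30501,"tcp-listener":30502}'
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: TURN-UDP
    - name: tcp-listener
      port: 3478
      protocol: TURN-TCP
```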

Note that STUNner makes no specific effort to reconcile conflicting NodePorts: whenever the selected NodePort is unavailable, Kubernetes will silently reject the Service, which can lead to hard-to-debug failures. Use this feature at your own risk.

Finalizer

So far, when the STUNner gateway operator was removed, all the automatically created Kubernetes resources (stunnerd Deployments, Services and ConfigMaps) kept running, with a status indicating a functional gateway deployment. From this release, the operator carefully removes all managed resources and invalidates gateway statuses on exit, which makes sure the cluster is left in a well-defined state.

Commits

chore(Helm): Operator now can be installed as a dependency chart.
feat: Add finalizer to leave cluster in well-defined state on shutdown
feat: Allow for disabling the managed dataplane for a Gateway
feat: Allow Gateways to request specific NodePorts, fix #137
feat: Implement EndpointSlice controller, fixes #26
feat: Set ExternalTrafficPolicy in LB Services, fixes #47
feat: Add new dataplane customization features, fixes #46
fix: Deepcopy K8s resources to be sent to the updater
refactor: Clean up metadata sharing between Gateways and Deployments, fixes #45
test: Refactor integration test cases

Enjoy STUNner and don't forget to support us!

Stabilization: Second stable release candidate

26 Feb 17:23

We are proud to present STUNner v0.18.0, the second stable release candidate of the STUNner Kubernetes media gateway for WebRTC from l7mp.io.

Major changes include the conversion of all STUNner components and CLI utilities to the managed dataplane mode and the new config discovery API, the removal of the ConfigMap that used to hold the running dataplane config, the removal of the legacy dataplane mode, plus the usual assortment of fixes and documentation updates. STUNner is now in soft-freeze state: only fixes, refactors, and documentation updates will be accepted in the master branch until we release v1.

News

Complete config discovery service support

In the ancient days of STUNner, the way the control plane configured the dataplane was somewhat complicated. The operator rendered the config into a ConfigMap, which was then mapped into the file-system of the stunnerd pods that actively watched the file for updates. The rest of the ancillary STUNner tools, including the auth-service or turncat, also used this ConfigMap as the ground truth. This config interface served us well for the first few releases, and it also had the useful side-effect that the dataplane config was always available at a convenient place, which greatly simplified debugging. The downside was that, due to certain Kubernetes limitations beyond our control, config updates were very slow to propagate to the dataplane, so much so that we had to package a separate config-watcher sidecar container next to stunnerd to speed up the process. This sidecar alone was larger than stunnerd itself. Even worse, the ConfigMap exposed the TURN authentication credentials in plain text, which raised rightful concerns for our security-savvy users.

To remedy this, we initiated the transition to a new HTTP-based config API, called the STUNner Config Discovery Service (CDS), during the v0.15.0 development cycle. This API allows fast access to STUNner dataplane configuration via the CDS server exposed by the gateway operator. The operator creates the stunnerd pods, bootstrapping each with its own address as the CDS server, and from this point stunnerd pods load their configuration via CDS autonomously. This is the main idea behind the managed dataplane mode that we made the default in the last release.

However, some components, like the STUNner authentication service, and CLI utilities like stunnerctl and turncat, still used the ConfigMap to learn the dataplane config. In this release all these tools have been rewritten to use CDS, eliminating the last remaining restriction originating in the old legacy dataplane mode.

Removal of the dataplane ConfigMaps

Since the ConfigMap that holds dataplane configs is no longer needed, it has been removed in this release. From now on, the "official" way to load/watch/debug dataplane configs is via the stunnerctl tool. For those of you who can't, or don't want to, download a separate utility just to interact with STUNner configs, we still maintain the old shell script, but it has been renamed to stunnerctl.sh. Certain workarounds involving manual port-forwarding and curl are also available for those who favor minimalistic solutions, see the stunnerctl manual for details.

Remote dataplane status reporting

The rewritten stunnerctl tool gained further useful features. The main purpose is still loading and watching running dataplane configs (stunnerctl config). However, from this release the tool can also query the status of the stunnerd pods for each Gateway (stunnerctl status), and it can also be used to quickly generate TURN credentials and full ICE server configurations for testing a STUNner deployment (stunnerctl auth). See the user manual for the details.
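A quick sketch of the three sub-commands mentioned above (the namespace and Gateway names are illustrative, and the exact flags may differ; consult the stunnerctl manual for the authoritative syntax):

```shell
# Dump the running dataplane config of a Gateway
stunnerctl -n stunner config udp-gateway

# Query the status of the stunnerd pods behind a Gateway
stunnerctl -n stunner status udp-gateway

# Generate a TURN credential / ICE server config for quick testing
stunnerctl -n stunner auth
```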

Full support for arm64 builds

From this release, we are officially providing build artifacts for the arm64 port of STUNner. Of course, the usual amd64 images are also available. If you are testing STUNner on your M1/M2 MacBook or running STUNner on any arm64 platform, you no longer have to build STUNner images yourself: our Helm charts should download and install the proper container images automatically.

Commits

Dataplane and utilities

doc: Add docs for CDS client
feat: Add turncat binaries as release assets (#115)
feat: Report deleted configs in stunnerctl
feat: Implement a status sub-command in stunnerctl
feat: Implement a basic auth-service client in stunnerctl
fix: Fix turncat build on Windows (#111)
fix: Restore credentials after config env substitution, fix #102
refactor: Rewrite config discovery in strict OpenAPI mode
refactor: Rewrite stunnerctl and turncat to use the CDS API, fix #81
test: Run CDS client-server tests on a random port

Gateway operator

feat: Define a label for the CDS Service for discoveribility
fix: Fix CDS server address parsing
fix: Generate only a single CDS config update per render round
fix: Make sure CDS server is reachable with API server port-forward
fix: Move CDS service label defs to STUNner
fix: Remove dataplane ConfigMap, fixes #43
fix: Update service-port for existing Services on Gateway reconcile
test: Implement config watchers using CDS in integration tests

Authentication service

refactor: Rewrite using the CDS API
chore: Upgrade to STUNner API v1

Enjoy STUNner and don't forget to support us!

Stabilization: First stable release candidate

05 Jan 18:39

We are proud to present STUNner v0.17.0, the new release of the STUNner Kubernetes media gateway for WebRTC from l7mp.io. Major changes include the stabilization of the managed dataplane mode, which also becomes the default from this release, upgrading the Gateway API and STUNner config API to v1, the rewriting of the STUNner dataplane API over OpenAPI, fixing the handling of clusters in Prometheus metrics reporting, port-range filtering support, and the usual assortment of fixes and documentation updates.

This release marks the first candidate for the upcoming stable release. STUNner now enters into a soft-freeze state: only fixes, refactors, and documentation updates will be applied in the master branch until we release v1.

News

Upgrade to Gateway API and STUNner API v1

The Kubernetes Gateway API used by STUNner to define the TURN services exposed to clients has gone GA recently. In preparation for the upcoming v1 release, STUNner's internal dataplane API, the wire protocol used by the gateway operator to configure the dataplane, has also been upgraded to v1.

This release will therefore again require small changes in your STUNner configs to track the moving API targets.

  • Gateway and GatewayClass API resources have to be updated from v1beta1 to v1. The old v1alpha2 and v1beta1 versions are still accepted, but silently upgraded to v1 by Kubernetes.
  • UDPRoutes remain at v1alpha2, but the backend port has been made mandatory upstream. To remove the confusion around the handling of ports, we have forked the official UDPRoute API to provide STUNner's own UDPRoute API that does not require backend ports but is otherwise fully compatible with the official API. To use it, rename the API group in your UDPRoute resources from gateway.networking.k8s.io/v1alpha2 to stunner.l7mp.io/v1. The standard v1alpha2 UDPRoutes are still accepted, but don't forget that the backend port is now mandatory.
  • STUNner's GatewayConfig, Dataplane, and StaticService APIs have moved from v1alpha1 to v1. The old v1alpha1 resources are still accepted but they will be silently upgraded to v1. Since there were some minor incompatible changes between v1alpha1 and v1 it is recommended to double-check the upgrade, especially if you are using non-standard metrics reporting and health-checking settings.
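For instance, a UDPRoute moved over to STUNner's forked API might look like this sketch (the route, Gateway, and backend names are illustrative; note that no backend port is needed):

```yaml
apiVersion: stunner.l7mp.io/v1
kind: UDPRoute
metadata:
  name: media-route
  namespace: stunner
spec:
  parentRefs:
    - name: udp-gateway
  rules:
    - backendRefs:
        - name: media-server
          namespace: media
```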

Managed mode: default

In the last release we introduced the managed dataplane mode to simplify the provisioning of STUNner dataplane pods. In the managed mode the gateway operator automatically maintains a separate dataplane for each Gateway (i.e., a separate stunnerd Deployment), plus the usual LoadBalancer service to expose it to clients.

With this release managed mode has become the default. Legacy mode is still supported, but it is officially deprecated and will be removed in v1. Please make sure you properly upgrade STUNner, otherwise you may end up with a dysfunctional deployment.

Commits

chore: Introduce a rate-limited logger
chore: Introduce STUNner dataplane config API v1
chore: Refactor Service label/annotation setting
chore: Set default dataplane mode to "managed"
feat: Fork the UDPRoute API from Gateway API v1
feature: Per-cluster port range filtering
feature: Implement port range filtering on peer connections
fix: GatewayConfigs now print the Dataplane on kubectl-get
fix: Handle existing load balancer class (#41, #104)
fix: No longer set spec.externalIPs on LB Services
fix: Nonzero replica-count in Dataplane.spec no longer prevents autoscaling
fix: Open health-check service-port for LB Services only
fix: Properly remove managed resources on the invalidation pipeline
fix: Rewrite public address/port search, fix l7mp/stunner-auth-service#3
fix: Unbreak legacy config file watcher
refactor: Complete CDS server rewrite
refactor: Re-implement CDS API over OpenAPI
refactor: Steamline command line argument parsing
refactor: Upgrade STUNner Gateway API to v1
refactor: Upgrade to Gateway API v1

Enjoy STUNner and don't forget to support us!

Management: Managed STUNner dataplane

03 Oct 16:34

We are proud to present STUNner v0.16.0, the next major release of the STUNner Kubernetes media gateway for WebRTC from l7mp.io.

News

STUNner v0.16.0 is a major feature release and marks an important step towards STUNner reaching v1.0 and becoming generally available for production use.

Gateway API v0.8.0 support

STUNner uses the Gateway API, the upcoming Kubernetes API slated to replace the venerable Ingress API, to let users customize the way WebRTC media enters the cluster. The Gateway API is progressing towards general availability at breakneck speed, and STUNner needs to keep pace. In this release STUNner has been updated to the latest Gateway API version, v0.8.0. This has the bitter consequence, however, that the API versions in some of the Gateway API resources have to be updated. We have used this opportunity to also streamline some of the unclear terminology around STUNner, especially the inconsistent use of protocol names. So far, a plain transport protocol name, say, UDP or TCP, has meant "TURN over the said transport" (UDP or TCP). To differentiate between "plain UDP" and "TURN over UDP" listeners, the latter has been renamed "TURN-UDP", and similarly for TCP, TLS and DTLS.

The particular rules are as follows:

  • The Gateway and GatewayClass API resources have to be updated from v1alpha2 to v1beta1; UDPRoutes remain at v1alpha2 and GatewayConfigs remain at v1alpha1. The old API versions are still accepted, but Kubernetes will silently rewrite API versions in the background. Our understanding is that this may cause problems in certain cases (e.g., with ArgoCD), so we recommend that users bump the API version on the corresponding resources (GatewayClass and Gateway) during the upgrade. This is as simple as changing the first line of your GatewayClass and Gateway YAMLs from apiVersion: gateway.networking.k8s.io/v1alpha2 to apiVersion: gateway.networking.k8s.io/v1beta1.
  • The protocol field in Gateway listener specifications has to be updated by adding the TURN- prefix, so UDP becomes TURN-UDP, TCP becomes TURN-TCP, and so on. Again, the old protocol names are still accepted for compatibility, but support will be removed in the next release.
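For instance, a minimal Gateway manifest upgraded along both dimensions (API version and protocol name) might look like the below sketch; the resource names are illustrative:

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1   # was: v1alpha2
kind: Gateway
metadata:
  name: udp-gateway          # illustrative name
  namespace: stunner
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: TURN-UDP     # was: UDP
```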

Managed dataplane

In the early days of STUNner we made the decision to leave dataplane provisioning to the user. That is, you had to helm-install the control plane, i.e., the stunner-gateway-operator, and the dataplane(s), i.e., the stunnerd pods that actually process traffic, separately. If STUNner was to run in multiple namespaces, you had to provision a separate dataplane for each namespace. As STUNner matured, this operational model became increasingly cumbersome and error-prone.

This release introduces the managed dataplane mode, a completely new way to operate STUNner. In this mode the operator automatically provisions a separate dataplane for each Gateway, i.e., a separate stunnerd Deployment, plus the usual LoadBalancer service to expose the gateway to clients. This substantially simplifies the installation and configuration of common STUNner use cases and makes the operation of complex setups much easier. Since the managed mode is still experimental, we decided to ship v0.16.0 with the old legacy dataplane mode as the default. This means that you do not have to make any changes at this point, but in the next release managed mode is expected to become the default. We ask you to experiment with this feature and help us stabilize the managed mode by filing bug reports.

STUNner dataplane API

One of the recurring criticisms related to STUNner has always been the slow reaction to control plane updates. This was the consequence of the way the control plane (the gateway operator) and the dataplane (the stunnerd daemons) interact: the dataplane configuration is rendered by the operator into a Kubernetes ConfigMap, which is then mapped into the file system of the stunnerd pods as a regular configuration file that stunnerd watches for changes. Unfortunately, it may take Kubernetes up to a minute to push the new config file into the pod's filesystem, which is too slow for certain use cases.

The new release comes with an experimental implementation of the STUNner Configuration Discovery Service (CDS), which lets stunnerd autodiscover its own configuration via a dedicated WebSocket connection. This makes control plane updates quasi real-time. CDS is available only in the managed mode and, correspondingly, it is switched off by default. However, installing the gateway operator with the managed mode enables the CDS service automatically, so if you are affected by slow control plane updates then you will see immediate improvements once you switch to the bleeding edge.

Standalone TURN server

We have received a number of complaints that STUNner is difficult to deploy as a public TURN server (this is called the "headless model"). Supporting this mode would make it possible to operate, and dynamically scale, a fleet of public TURN servers fully in Kubernetes.

One reason for this is that previously it was very difficult to provision a public IP for STUNner pods. With the new release this has become much simpler thanks to the managed dataplane mode: just set hostNetwork: true in the Dataplane CR serving as a template for provisioning the stunnerd pods and your dataplane should be instantly re-deployed over a public IP. The other reason was a crucial assumption in the way STUNner filters clients' access to peers: namely, STUNner would allow access only to peers located within the same Kubernetes cluster. This made it difficult to forward to peers outside the cluster, as is the case when STUNner is used as a standalone TURN server. In this release we introduce the StaticService API, which provides a completely Kubernetes-friendly way to control the IP ranges into which peer connections are accepted.
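As a sketch, a StaticService admitting peer connections into a set of external IP ranges might look like the below; the resource name and address ranges are illustrative, and the exact API version and field names should be checked against the StaticService documentation:

```yaml
apiVersion: stunner.l7mp.io/v1alpha1
kind: StaticService
metadata:
  name: external-peers       # illustrative name
  namespace: stunner
spec:
  prefixes:                  # IP ranges into which peer connections are accepted
    - "203.0.113.0/24"
    - "198.51.100.0/24"
```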

Cross-namespace route binding

The initial releases of STUNner included a number of simplifications that made implementing it much easier. One of these simplifications was the assumption that a Gateway and a UDPRoute must exist in the same Kubernetes namespace to be able to attach. With STUNner gradually maturing towards general availability, this limitation has become more and more cumbersome. In this release, we implemented the full route attachment machinery from the Kubernetes Gateway API, which unlocks a number of exciting new persona-based deployment models.
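With the full route attachment machinery in place, a Gateway can admit routes from other namespaces via the standard Gateway API allowedRoutes mechanism. The below is a sketch with illustrative resource names:

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: udp-gateway
  namespace: stunner
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: TURN-UDP
      allowedRoutes:
        namespaces:
          from: All          # admit UDPRoutes from any namespace
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: media-route
  namespace: media-plane     # a different namespace than the Gateway's
spec:
  parentRefs:
    - name: udp-gateway
      namespace: stunner     # cross-namespace attachment
  rules:
    - backendRefs:
        - name: media-server
```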

Mediasoup tutorial

Mediasoup is a popular WebRTC media server that enables developers to build group chats, one-to-many broadcasts, and real-time streaming applications. This release comes with a new tutorial for setting up STUNner with mediasoup.

Minor updates

Apart from the major updates, this release also comes with the usual assortment of documentation updates, tests and CI/CD improvements all around the place.

feature: Implement and integrate a Config Discovery Service (CDS)
feature: Implement a StaticService CRD
feature: Implement route attachment, fixes #30
feature: Managed dataplane (#35)
feature: Use Gateway Addresses as rendered Service's external IPs (#33)
fix: Handle concurrent writes in the CDS server
fix: Make sure service-port names of LB Services are unique, fixes #19
fix: Make sure STUNner listens on all interfaces
fix: Merge LB Service metadata instead of rewriting it
fix: Protocol name disambiguation, fixes #28
fix: Report correct resource names in reconcilers
fix: Set default Namespace in ParentRef from UDPRoute.Namespace
fix: Streamline TURN URI parsing
chore: Upgrade to sigs.k8s.io/gateway-api v0.8.0
doc: Document that cross-namespace route bindings are now supported
doc: Document the StaticService CRD
doc: Warn that UDPRoutes ignore the service-port in backend Services
doc: Add mediasoup demo #103

Enjoy STUNner and don't forget to support us!

Performance: Per-allocation CPU load-balancing for UDP

09 May 15:15
Compare
Choose a tag to compare

We are proud to present STUNner v0.15.0, the next major release of the STUNner Kubernetes media gateway for WebRTC from l7mp.io.

News

STUNner v0.15.0 is a major feature release and marks an important step towards STUNner reaching v1.0 and becoming generally available for production use.

The most important changes include:

  • Performance: So far, STUNner TURN/UDP listeners have been limited to a single CPU. With this release this bottleneck has been eliminated, allowing STUNner to run multiple parallel readloops per TURN/UDP listener. This makes it possible to scale the TURN server to any practical number of CPUs and brings massive performance improvement for TURN/UDP workloads.
  • Authentication: By popular request, STUNner can now read and reconcile TURN authentication credentials from a Kubernetes Secret. This makes it easier to control access to sensitive authentication information. STUNner now also comes with a REST API server for generating ephemeral TURN authentication credentials, implemented, following cloud-native best practices, as a microservice. WebRTC application servers can leverage the authentication service to obtain time-windowed user authentication credentials and full STUNner-ified ICE server configurations with a single HTTP GET request. Removing a source of permanent confusion, the plaintext authentication mode is now also available under the alias static, and the alias ephemeral is introduced for what has so far been called longterm. The old names remain available but their use is discouraged, and they will be deprecated in a future release.
  • Graceful shutdown: With this release, STUNner becomes a better Kubernetes citizen and fully supports graceful shutdown with Kubernetes-compatible liveness and readiness checks. This makes it possible to seamlessly scale a STUNner deployment down (scaling up has been available since the first release): on being shut down, STUNner pods will fail the readiness check so that Kubernetes stops routing new allocation requests to these terminating pods, but the built-in TURN servers will remain alive until having finished processing all active allocations. This prevents the disconnection of active client connections on terminating pods, making STUNner scale-up/scale-down completely seamless.
  • Custom cloud support: STUNner will now automatically expose health-check ports and enable mixed protocol LoadBalancers, both prerequisites of deploying it to Digital Ocean or AWS/EKS smoothly. This change should remove much manual configuration burden for the users of these popular platforms. GCP/GKE and other cloud providers' hosted Kubernetes platforms, which do not require health-checks for UDP LoadBalancer services, continue to work as always.
  • Documentation: STUNner docs are now available at ReadTheDocs!

Apart from the major updates, this release also comes with the usual assortment of documentation updates, tests and CI/CD improvements all around the place.

Enjoy STUNner and don't forget to support us!

Breaking changes

This release should bring no breaking changes. However, some Kubernetes annotations have been promoted to labels and this may cause issues in certain setups. We made several rounds of testing to make sure the upgrade goes as smoothly as possible but, as usual, upgrade carefully and don't forget to file a bug report if anything goes wrong.

Further changes/improvements

chore(CI/CD): Bump Go version to 1.19
chore: Strip symbols from the binary built (#17)
chore: Transition to pion/turn/v2.1.0 and pion/transport/v2
chore: Upgrade to Gateway API v0.6.2
doc: Add Prometheus and Grafana integration to MONITORING (#63)
feature: Add a "app:stunner" label to svcs and configmaps we create
feature: Add a config file watcher to the public API
feature: Automatically expose health-check ports on LBs, fixes #22
feature: Generate TURN URIs from running Stunner config
feature: Implement stunner.SetLogLevel
feature: Introduce auth type aliases, fixes #7
feature: Introduce the more descriptive authentication type aliases
feature: Multi-threaded UDP listeners
feature: Set "related-gateway" annotation of dataplane ConfigMaps
feature: Support mixed protocol load balancer (#25)
feature: Take auth credentials from a Secret, closes #18
fix: Bootstrap stunnerd with minimal config in watch mode
fix: Config file validation no longer sorts listeners and clusters
fix: Enable health-checking by default
fix: Fix segfault when calling Status on a listener w/o TURN server
fix: LB service watchers now filter on the label "app:stunner"
fix: Properly close listeners
fix: Remove segfault in StunnerConfig.DeepEqual
fix: Stop watching for config updates after a graceful shutdown
hack: Don't fail readiness checks when there is no config
refactor: Add public API for generating/checking TURN credentials
refactor: Export default config settings
refactor: Reorganize the config-watcher API
refactor: Use "owned-by" label to mark our own resources instead of the annotation with the same name

Security and Observability: Expose TLS/DTLS settings via the Gateway API + Monitoring with Prometheus and Grafana

23 Jan 15:47
Compare
Choose a tag to compare

We are proud to present STUNner v0.13.0. STUNner is the Kubernetes media gateway for WebRTC from l7mp.io.

News

We are happy to announce that we have completed two milestones with this release: the milestones v0.12: Security: Expose TLS/DTLS settings via the Gateway API and v0.13: Observability: Prometheus + Grafana dashboard are now both released in a single package. Thus, there is no v0.12; we jump straight to v0.13!

The most important changes include full support for TURN over TLS/DTLS to improve security, a Prometheus metric exporter to gain real-time visibility into media traffic, and new tutorials describing how to use Jitsi and LiveKit with STUNner. As a major usability upgrade, STUNner can now reconcile most control-plane updates without having to restart the underlying TURN server and disconnect active sessions. Apart from the milestones, this release also sports the usual assortment of documentation updates, tests and CI/CD improvements all around the place.

Enjoy STUNner and don't forget to support us if you like it!

Breaking changes

This is a massive release and there are inevitably some intrusive changes that may break your WebRTC application. Upgrade at your own risk.

  • Automatically created LB services now use the same name as the Gateway being exposed. This improves consistency with the rest of the Gateway API implementations.
  • STUNner listeners are now named as <Gateway-namespace>/<Gateway-name>/<listener-name>.
  • All listeners of a Gateway are now exposed in a single LB Service (i.e., over a single external IP). Multi-protocol LBs are still not supported; support is slated to arrive in the next release.

Major changes/features

  • Add TLS/DTLS support to the control plane and the dataplane.
  • Track only a single node to obtain an external IP for NodePort fallback.
  • Add manual public IP setting to Gateways.
  • Expose health-check settings in the GatewayConfig.
  • Fallback to LB Service Status.Hostname when no Status.IP is available.
  • Disambiguate listener and cluster names.
  • Protocol names (UDP/TCP/...) now stringify to upper case.
  • Mask sensitive info (usernames, passwords and TLS certs) in the logging output.
  • Configurable telemetry collection.
  • Maintain a separate TURN server per listener.
  • Cluster.Protocol support for a future TCPRoute implementation.
  • Implement liveness and readiness check and full stunnerd lifecycle.
  • Implement a dry-run mode to suppress side-effects.
  • Support coturn use-auth-secret TURN authentication mode.
  • Add a simple benchmark script.
  • Handle FQDNs in TURN URIs.
  • New tutorials (Jitsi and LiveKit).

Control plane: Kubernetes gateway operator and dataplane reconciliation

21 Oct 16:19
Compare
Choose a tag to compare

We are proud to present STUNner v0.11.0, the Kubernetes media gateway for WebRTC.

News in this release

This is the first release that showcases the complete user story of STUNner: a dataplane exposing a standards-compliant STUN/TURN service to clients, along with a control plane (a Kubernetes operator) that lets you configure STUNner in a high-level declarative style (the Kubernetes Gateway API), in the same YAML-engineering style you use to interact with any other Kubernetes workload.

Main improvements:

  • A Kubernetes gateway operator, an open-source implementation of the Kubernetes Gateway API using STUNner as the data plane.
  • Full documentation, with a comprehensive getting started guide, user guides, tutorials, and manuals.
  • Using STUNner in multi-cluster/multi-cloud Kubernetes deployments! STUNner is fully compliant with the official Kubernetes Multi-Cluster Services API, which lets you deploy your media servers to multiple geographically diverse sites and STUNner ensures that media traffic flows between clusters smoothly.
  • A Node.js helper library to simplify generating ICE configurations and TURN credentials for STUNner.
  • New tutorials for firing up a cloud-gaming or a desktop streaming application in 5 minutes, all thanks to Kubernetes and STUNner. This brings the total number of our tutorials to six!
  • Lots of bug fixes, usability improvements, and doc updates all around the place.