metal-stack is open source software that provides an API for provisioning and managing physical servers in the data center. To categorize this product, we use the terms Metal-as-a-Service (MaaS) or bare metal cloud.
From a user's perspective, metal-stack does not feel any different from working with a conventional cloud provider. Users manage their resources (machines, networks, IP addresses, etc.) by themselves, which effectively turns your data center into an elastic cloud infrastructure.
The major difference from other cloud providers is that compute power and data reside in your own data center.
Before we started with our mission to implement the metal-stack, we decided on a couple of key characteristics and constraints that we think are unique in the domain (otherwise we would definitely have chosen an existing solution).
We hope that the following properties appeal to you as well.
Running on-premise gives you data sovereignty and usually a better price/performance ratio than hyperscalers, especially as your environment grows. Another benefit of running on-premise is easier connectivity to existing company networks.
Provisioning bare metal machines should not feel much different from provisioning virtual machines. metal-stack is capable of provisioning servers in less than a minute. The underlying network topology is based on BGP and allows announcing new routes to your host machines in a matter of seconds.
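As an illustration of such a BGP-based setup, here is a minimal FRR configuration sketch for a host peering with two leaf switches via BGP unnumbered. The interface names (`lan0`, `lan1`) and the private ASN are assumptions for the example, not values from a concrete metal-stack deployment:

```
# /etc/frr/frr.conf (sketch; interface names and ASN are hypothetical)
router bgp 4200000001
 neighbor lan0 interface remote-as external   # BGP unnumbered towards leaf 1
 neighbor lan1 interface remote-as external   # BGP unnumbered towards leaf 2
 address-family ipv4 unicast
  redistribute connected                      # announce locally connected routes upstream
 exit-address-family
```

With unnumbered peering over the link-local addresses of the uplink interfaces, a new host can establish sessions and announce its routes without any per-host IP configuration on the fabric side.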
Part of the metal-stack runs on dedicated switches in your data center. This makes it possible to automate the server inventory, permanently reconcile the network configuration, and manage machine lifecycles automatically. Manual configuration is neither required nor wanted.
Our networking approach was designed to meet the highest security standards. Traffic leaving a tenant's private network must pass a dedicated tenant firewall before connections to other networks can be established. API authentication and authorization are done with the help of OIDC.
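As a sketch of how such egress rules can look, assuming the metal-stack firewall-controller is deployed, a ClusterwideNetworkPolicy resource on the tenant firewall could allow HTTPS traffic to leave the private network; the policy name and CIDR below are example values:

```yaml
apiVersion: metal-stack.io/v1
kind: ClusterwideNetworkPolicy
metadata:
  name: allow-egress-https   # example name
  namespace: firewall        # policies are picked up from this namespace
spec:
  egress:
    - ports:
        - protocol: TCP
          port: 443
      to:
        - cidr: 0.0.0.0/0    # allow HTTPS egress to any destination
```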
The development of metal-stack is strictly API-driven and offers self-service to end users. This approach delivers the highest possible degree of automation, maintainability, and performance.
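For example, allocating a machine through the self-service API can be done with the metalctl CLI. A sketch; all identifiers (project, partition, image, size, network) are placeholders for values from your own installation:

```bash
# Sketch: allocate a bare metal machine via the metal-api using metalctl.
# Replace the placeholders with identifiers from your own installation.
metalctl machine create \
  --name example-machine \
  --hostname example-machine \
  --project <project-uid> \
  --partition <partition-id> \
  --image ubuntu-24.04 \
  --size c1-xlarge-x86 \
  --networks <private-network-id>
```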
Not only does metal-stack run smoothly on Kubernetes (K8s); the major intent of metal-stack has always been to build a scalable machine infrastructure for Kubernetes as a Service (KaaS). In partnership with the open-source project Gardener, we can provision Kubernetes clusters on metal-stack at scale.
From Gardener's perspective, metal-stack is just another cloud provider. The time savings compared to provisioning machines and Kubernetes by hand are significant. We want to be able to compete with the offerings of public cloud providers, especially regarding speed and usability.
Of course, you can also use metal-stack for machine provisioning alone and put something else on top of your metal infrastructure.
metal-stack is open source and free of constraints regarding vendors and third-party products. The stack is built entirely on open source products. We have an active community working on metal-stack that can assist you in delivering all the reasonable features you are going to need.
Bare metal has several advantages over virtual environments and overcomes several of the drawbacks of virtual machines. We have also listed the drawbacks of the bare metal approach below. Bear in mind, though, that it is still possible to virtualize on bare metal environments once you have your stack up and running.

Drawbacks of virtualized environments:
- Spectre and Meltdown can only be mitigated with a "cluster per tenant" approach
- Missing isolation of multi-tenant change impacts
- Licensing restrictions
- Noisy-neighbors
Advantages of bare metal:

- Guaranteed and fastest possible performance (especially disk I/O)
- Reduced stack depth (Host / VM / Application vs. Host / Application)
- Reduced attack surface
- Lower costs, higher performance
Drawbacks of bare metal:

- No VM live-migrations
- Bigger hardware configurations possible (hypervisors have restrictions, e.g. it is not possible to assign all CPUs to a single VM)
- Hardware defects have a direct impact (this should be considered in the design) and cannot be mitigated by live-migration as in virtual environments
- Capacity planning is more difficult (no resource overbooking possible)