Is it necessary to support both gVisor and MicroVM simultaneously? #387
Replies: 18 comments 6 replies
-
We have made many performance optimizations to Cloud Hypervisor, and we do not expect to fall behind gVisor on any of the metrics mentioned above. Moreover, many of our capabilities are closely tied to the customized MicroVM hypervisor, so I don't think the points you raised are inherently tied to whether the runtime is gVisor.
-
Thank you for your prompt reply!
-
Our built-in guest OS and guest rootfs are heavily trimmed. In addition, because of how our snapshot-based startup works, all sandbox memory is mmap'ed from snapshots on the host with copy-on-write (CoW). The vast majority of the guest kernel's memory is read-only, so the kernel code segments of all sandboxes built from the same template on the same host are shared globally. I therefore believe that, as far as the in-sandbox kernel is concerned, we will not be worse than gVisor.
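The copy-on-write sharing described in this reply can be illustrated with a private file mapping. The following is only a minimal sketch, not CubeSandbox's implementation: a temporary file stands in for a snapshot, and two `mmap.ACCESS_COPY` mappings stand in for two sandboxes. The function name `cow_demo` is hypothetical.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def cow_demo() -> tuple[bytes, bytes, bytes]:
    """Map one stand-in 'snapshot' file privately twice and write through one mapping.

    Returns the first byte seen by mapping A, by mapping B, and by the file itself.
    """
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"K" * PAGE)  # one page of stand-in snapshot content
        path = f.name
    try:
        with open(path, "rb") as fa, open(path, "rb") as fb:
            # ACCESS_COPY creates a private copy-on-write mapping:
            # writes change this mapping's memory only, never the file.
            a = mmap.mmap(fa.fileno(), PAGE, access=mmap.ACCESS_COPY)
            b = mmap.mmap(fb.fileno(), PAGE, access=mmap.ACCESS_COPY)
            try:
                a[0:1] = b"X"  # first write triggers a private copy of the page
                seen_a, seen_b = a[0:1], b[0:1]
            finally:
                a.close()
                b.close()
        with open(path, "rb") as f:
            on_disk = f.read(1)
        return seen_a, seen_b, on_disk
    finally:
        os.unlink(path)
```

Only the mapping that wrote sees the change; the other mapping and the file keep the original byte, so pages that are never written can stay shared between mappings.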
-
I ran a simple benchmark using the e2b-code-interpreter workload. The results show that gVisor consumes far less memory than CubeSandbox: with 100 instances running, its used memory is only 18% of CubeSandbox's. Could you help analyze the root causes? Thank you!
Test Environment
Versions
We measured the incremental used memory with the
-
Our single-machine one-click deployment installs many components, including nginx, redis, mysql, etc. These control-plane components also consume additional memory each time a sandbox is created. In a real deployment, they should run on separate machines. A more accurate way to view sandbox memory consumption is to check /proc/<sandbox pid>/numa_maps, as described in issue #272.
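As a rough illustration of reading that file, the sketch below sums the per-node page counts in `numa_maps`-style text. It assumes the standard Linux field layout (`anon=`, `dirty=`, `N<node>=`, `kernelpagesize_kB=`) and a NUMA-enabled kernel; the function name `summarize_numa_maps` is hypothetical, and file paths containing spaces are not handled.

```python
import re

NODE_RE = re.compile(r"^N(\d+)=(\d+)$")


def summarize_numa_maps(text: str) -> dict[str, int]:
    """Sum resident, anonymous and dirty memory (in kB) over numa_maps-style lines."""
    totals = {"resident_kB": 0, "anon_kB": 0, "dirty_kB": 0}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        attrs = fields[2:]  # fields[0] is the address, fields[1] the policy
        kv = dict(f.split("=", 1) for f in attrs if "=" in f)
        page_kb = int(kv.get("kernelpagesize_kB", 4))  # assume 4 kB if absent
        # N<node>=<pages> entries give the pages resident on each NUMA node.
        pages = sum(int(m.group(2)) for f in attrs if (m := NODE_RE.match(f)))
        totals["resident_kB"] += pages * page_kb
        totals["anon_kB"] += int(kv.get("anon", 0)) * page_kb
        totals["dirty_kB"] += int(kv.get("dirty", 0)) * page_kb
    return totals
```

It could be fed the contents of `/proc/<sandbox pid>/numa_maps` read as text; a large `anon_kB` relative to `resident_kB` would suggest that many pages have been privately copied rather than shared.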
-
We measure memory consumption by tracking the incremental memory usage as the number of instances increases. We added comparative tests for a worker-node-only scenario. The results show that the control-plane overhead is negligible for deployments with up to 100 instances. Overall, there remains a notable gap compared with gVisor.
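The incremental measurement described here can be sketched as follows. The exact tool used in the test is not preserved in this thread, so the definition of "used" as `MemTotal - MemAvailable` from `/proc/meminfo` is an assumption, and the function names are hypothetical.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value}; sizes are in kB."""
    info: dict[str, int] = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])
    return info


def used_kb(info: dict[str, int]) -> int:
    # Assumption: "used" means MemTotal - MemAvailable.
    return info["MemTotal"] - info["MemAvailable"]


def incremental_per_instance_kb(before: str, after: str, instances: int) -> float:
    """Average extra used memory per instance between two meminfo snapshots."""
    if instances <= 0:
        raise ValueError("instances must be positive")
    delta = used_kb(parse_meminfo(after)) - used_kb(parse_meminfo(before))
    return delta / instances
```

Taking one snapshot before starting the instances and one after, on an otherwise idle host, keeps unrelated memory activity from being attributed to the sandboxes.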
-
Could you check where the additional memory is mainly going? In our internal PVM tests, 100 sandboxes consume about 2G of memory, and in a previous external user test, a single machine running 5,000 sandboxes consumed 154G of memory. Let's see whether the difference is related to the kernel version or the environment. From a technical perspective, the kernel of each sandbox is globally shared, so it does not add memory overhead per sandbox.
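For reference, the two figures quoted in this reply work out to the following per-sandbox averages. This is plain arithmetic on the numbers above, assuming "G" means GiB; the helper name is hypothetical.

```python
MIB_PER_GIB = 1024


def per_sandbox_mib(total_gib: float, sandboxes: int) -> float:
    """Average memory per sandbox in MiB, given a total in GiB."""
    return total_gib * MIB_PER_GIB / sandboxes


internal_mib = per_sandbox_mib(2, 100)      # about 20.5 MiB per sandbox
external_mib = per_sandbox_mib(154, 5000)   # about 31.5 MiB per sandbox
```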
-
When creating a template using the sandbox-code image, the following parameters were applied:
cubemastercli tpl create-from-image \
--image cube-sandbox-cn.tencentcloudcr.com/cube-sandbox/sandbox-code:latest \
--writable-layer-size 1G
The
After rebuilding the template with the additional parameters below:
cubemastercli tpl create-from-image \
--image cube-sandbox-cn.tencentcloudcr.com/cube-sandbox/sandbox-code:latest \
--writable-layer-size 1G \
--expose-port 49999 \
--expose-port 49983 \
--probe 49999
The updated test results are as follows:
The memory footprint is now significantly optimized. As you mentioned, it even consumes less memory than gVisor. There is one remaining question: why do the
-
The probe determines when the memory snapshot is taken. Without a probe, the snapshot is taken immediately after kernel startup. With a probe, the snapshot is taken only after the corresponding port becomes reachable over the network. Without a probe, the application therefore still has to go through its initialization in each sandbox, which generates additional memory pages through copy-on-write (CoW).
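The probe behaviour described here amounts to "wait until a port accepts connections, then snapshot". The sketch below illustrates that idea only; it is not CubeSandbox's probe implementation, and `wait_for_port`, `snapshot_when_ready` and the `take_snapshot` callback are hypothetical names.

```python
import socket
import time
from typing import Callable


def wait_for_port(host: str, port: int, timeout_s: float = 30.0,
                  interval_s: float = 0.1) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            time.sleep(interval_s)  # not listening yet; retry shortly
    return False


def snapshot_when_ready(host: str, port: int, take_snapshot: Callable[[], None],
                        timeout_s: float = 30.0) -> bool:
    """Call take_snapshot() only after the application port is reachable."""
    if not wait_for_port(host, port, timeout_s):
        return False
    take_snapshot()  # hypothetical hook: the app is initialized at this point
    return True
```

Snapshotting after the port is reachable means the pages touched during application startup are already part of the shared snapshot, rather than being dirtied again in every restored sandbox.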
-
Thanks a lot for the detailed explanation of the earlier copy-on-write (CoW) memory issue. As shown in the table below, I further compared the I/O performance of several virtualization runtimes. The results indicate that CubeSandbox delivers noticeably lower performance on network I/O and mixed network & storage I/O than gVisor and kata-fc. Could you help analyze the root causes?
Test Methodology
The tests were conducted on two physical machines with identical configurations, with only a single sandbox running on each, so performance degradation caused by over-provisioning can be ruled out.
Test Results
-
Could you please paste the offload feature settings of the network interfaces inside the sandbox and on the host (ethtool -k <dev>)? You are probably hitting a bug: from a technical perspective, the performance of our I/O virtualization implementation should not be worse than FC's.
-
@jamesxia20181001 Thanks for your report. Please provide outputs of the following command to help us diagnose the issue.
-
Thanks for your prompt reply. Below is the collected information; please help analyze it, thank you!
CubeSandbox MicroVM guest OS network information
ethtool -i eth0
ethtool -k eth0
CubeSandbox host OS kernel and network information
The network is configured with VLAN and bonding: bond1 aggregates ens6f0 and ens6f1, and the VLAN interface is bond1.2.
uname -a
ethtool -i bond1.2
ethtool -i bond1
ethtool -i ens6f0
ethtool -i ens6f1
ethtool -k bond1.2
ethtool -k bond1
ethtool -k ens6f0
ethtool -k ens6f1
-
The network is deployed with VLAN and Bonding: Bond1 bundles ens6f0 and ens6f1, with its VLAN sub-interface bond1.2 serving as the primary network adapter for CubeSandbox. Per installation logs and network port details, GRO has been successfully disabled on bond1.2, confirmed by zero relevant error entries in both installation logs and network-agent logs:
However, GRO remains enabled on all underlying physical/bonded ports relied upon by bond1.2:
After manually turning off GRO on all of these ports, we ran another round of benchmark tests. The dataset marked CubeSandbox GRO Disabled represents metrics collected with GRO fully disabled on all ports, and it shows a notable network performance improvement over the original state. Even with this optimization applied, CubeSandbox still delivers markedly lower network throughput than gVisor and Kata-FC. Could you please help investigate the root cause? Thanks.
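A check like the one described above (GRO off on bond1.2 but still on for the underlying ports) can be automated by parsing `ethtool -k` output. The sketch below assumes the usual `feature-name: on|off [fixed]` line format; the function names are hypothetical, and the per-interface text is expected to be collected separately, for example with `ethtool -k ens6f0`.

```python
def parse_features(text: str) -> dict[str, bool]:
    """Parse `ethtool -k <dev>` output into {feature name: enabled}."""
    features: dict[str, bool] = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        tokens = value.split()
        # Skip the "Features for <dev>:" header and anything without on/off.
        if sep and tokens and tokens[0] in ("on", "off"):
            features[name] = tokens[0] == "on"
    return features


def interfaces_with_gro(outputs: dict[str, str]) -> list[str]:
    """Return the interfaces whose generic-receive-offload is still on."""
    return [
        dev for dev, text in outputs.items()
        if parse_features(text).get("generic-receive-offload", False)
    ]
```

Running it over the VLAN interface, the bond and both physical ports gives a single list of the devices where GRO still needs to be turned off.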
-
To eliminate performance bias introduced by heterogeneous hardware environments, CubeSandbox was deployed on the same bare-metal server hosting gVisor and Kata-fc, and upgraded to the latest v0.3.0 release for validation.
Environment Details
OS Information
Component Version Information
Network Information
gVisor and Kata-fc are integrated into the Kubernetes cluster with out-of-the-box default configurations. Flannel is used as the default CNI plugin. The uplink consists of VLAN subinterfaces created on two physical NICs aggregated via bond, and GRO is enabled by default. Prior to CubeSandbox benchmarking, the Kubernetes cluster was shut down, and GRO was manually disabled on all network interfaces before the CubeSandbox performance measurements were run.
Test Results
The aggregated benchmark metrics are listed in the table below. Test data indicates notable performance improvements for the upgraded CubeSandbox; nevertheless, it still falls behind gVisor and Kata-fc on certain workloads such as the
-
Enabling TSO on the network interface inside the sandbox should help considerably in the iperf upload case. Additionally, we have fixed some checksum-related issues, so GRO can now be enabled on the host network interface, which should also help considerably in the iperf download case.
-
We tried enabling TSO and GRO, but neither worked.
1. TSO cannot be enabled within the sandbox; relevant details are provided below:
2. We attempted to enable GRO, yet the download speed remains capped at 10 MB/s. Partial packet capture data for both the GRO-enabled and GRO-disabled states is listed as follows.
Enabled GRO
Disabled GRO
-
So when do you plan to merge this feature, and which release will include it?
-
CubeSandbox adopts MicroVM for sandbox virtualization and isolation. Recently, Google open-sourced Agent-Substrate, which leverages gVisor. Combined with idle suspend (persisting memory snapshots to object storage) and runtime resume (cross-node restoration supported), this solution claims to deliver 30x+ oversubscription by scheduling a large number of stateful workloads onto a small pool of shared physical pods. Google further built the distributed agent runtime Agent Executor (AX) on top of Agent-Substrate.
According to the benchmark report agent-infra/runtime-benchmark, gVisor achieves a significantly higher deployment density compared with MicroVM (represented by Kata Containers).
Memory Overhead per Sandbox Pod
CPU Overhead per Idle Sandbox Pod (millicores)
Node Packing Density (n2-standard-8, 32 GiB)
Agents are evolving toward an architecture of stateless harness + session-based sandbox. The agent harness becomes stateless, standardized and lightweight, and is only responsible for routing, scheduling, session management and access control. Since this component requires very few system calls and simple dependencies, gVisor is fully capable of supporting its operation.
For tool sandboxes running common lightweight workloads such as simple scripts, basic commands and general SDKs, gVisor provides better cost advantages over MicroVM. MicroVM is only required as a safety fallback for high-risk untrusted code, complex toolchains, workloads relying on full Linux system calls/kernel features, or applications running on heterogeneous operating systems.
If the above assessment holds true, supporting gVisor and MicroVM in the long term and enabling them to share the same compute node pool would be a reasonable architecture decision.