diff --git a/docs/blog/posts/nvidia-dgx-spark.md b/docs/blog/posts/nvidia-dgx-spark.md new file mode 100644 index 000000000..209625133 --- /dev/null +++ b/docs/blog/posts/nvidia-dgx-spark.md @@ -0,0 +1,132 @@ +--- +title: "Orchestrating workloads on NVIDIA DGX Spark" +date: 2025-11-14 +description: "TBA" +slug: nvidia-dgx-spark +image: https://dstack.ai/static-assets/static-assets/images/nvidia-dgx-spark.png +# categories: +# - Benchmarks +--- + +# Orchestrating workloads on NVIDIA DGX Spark + +With support from [Graphsignal :material-arrow-top-right-thin:{ .external }](https://x.com/GraphsignalAI/status/1986565583593197885){:target="_blank" }, our team gained access to the new [NVIDIA DGX Spark :material-arrow-top-right-thin:{ .external }](https://www.nvidia.com/en-us/products/workstations/dgx-spark/){:target="_blank"} and used it to validate how `dstack` operates on this hardware. This post walks through how to set it up with `dstack` and use it alongside existing on-prem clusters or GPU cloud environments to run workloads. + + + + + +If DGX Spark is new to you, here is a quick breakdown of the key specs. + +* Built on the NVIDIA GB10 Grace Blackwell Superchip with Arm CPUs. +* Capable of up to 1 petaflop of AI compute at FP4 precision, roughly comparable to RTX 5070 performance. +* Features 128GB of unified CPU and GPU memory enabled by the Grace Blackwell architecture. +* Ships with NVIDIA DGX OS (a tuned Ubuntu build) and NVIDIA Container Toolkit. + +These characteristics make DGX Spark a fitting extension for local development and smaller-scale model training or inference, including workloads up to the GPT-OSS 120B range. + +## Creating an SSH fleet + +Because DGX Spark supports SSH and containers, integrating it with dstack is straightforward. Start by configuring an [SSH fleet](../../docs/concepts/fleets.md#ssh-fleets). The file needs the hosts and access credentials. + +
+ +```yaml +type: fleet +name: spark + +ssh_config: + user: devops + identity_file: ~/.ssh/id_rsa + hosts: + - spark-e3a4 +``` + +
+ +The `user` must have `sudo` privileges. + +Apply the configuration: + +
+ +```shell +$ dstack apply -f fleet.dstack.yml + +Provisioning... +---> 100% + + FLEET INSTANCE GPU PRICE STATUS CREATED + spark 0 GB10:1 $0 idle 3 mins ago +``` + +
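Before moving on to dev environments, a minimal task is a quick way to confirm that containers on the new instance can see the GPU. The sketch below is an assumption on our part (the task name is arbitrary, and it relies on the default `dstack` image exposing the NVIDIA utilities):

```yaml
type: task
name: check-gpu

# Print GPU details to verify driver and container toolkit access
commands:
  - nvidia-smi

resources:
  gpu: GB10

fleets: [spark]
```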
+ +Once active, the system detects hardware and marks the instance as `idle`. From here, you can run +[dev environments](../../docs/concepts/dev-environments.md), [tasks](../../docs/concepts/tasks.md), +and [services](../../docs/concepts/services.md) on the DGX Spark fleet, the same way you would with other on-prem or cloud GPU backends. + +## Running a dev environment + +Example configuration: + +
+ +```yaml +type: dev-environment +name: cursor + +image: lmsysorg/sglang:spark + +ide: cursor + +resources: + gpu: GB10 + +volumes: + - /root/.cache/huggingface:/root/.cache/huggingface + +fleets: [spark] +``` + +
+ +We use an [instance volume](../../docs/concepts/volumes.md#instance-volumes) to keep model downloads cached across runs. The `lmsysorg/sglang:spark` image is tuned for inference on DGX Spark. Any Arm-compatible image with proper driver support will work if customization is needed. + +Run the environment: + +

```shell
$ dstack apply -f .dstack.yml

 BACKEND       GPU     INSTANCE TYPE  PRICE
 ssh (remote)  GB10:1  instance       $0     idle

Submit the run cursor? [y/n]: y

 # NAME    BACKEND       GPU     PRICE  STATUS   SUBMITTED
 1 cursor  ssh (remote)  GB10:1  $0     running  12:24

Launching `cursor`...
---> 100%

To open in VS Code Desktop, use this link:
  vscode://vscode-remote/ssh-remote+cursor/workflow
```

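Beyond dev environments, the same fleet can back a [service](../../docs/concepts/services.md). Here is a rough sketch; the model choice and the SGLang launch flags are assumptions we did not validate on this hardware:

```yaml
type: service
name: sglang-serve

image: lmsysorg/sglang:spark

# Hypothetical model; anything that fits in the 128GB unified memory could work
commands:
  - python3 -m sglang.launch_server --model-path openai/gpt-oss-20b --host 0.0.0.0 --port 8000
port: 8000

resources:
  gpu: GB10

volumes:
  - /root/.cache/huggingface:/root/.cache/huggingface

fleets: [spark]
```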

Workloads behave exactly like they do on other supported compute targets. You can use DGX Spark for fine-tuning, interactive development, or model serving without changing your workflows.

!!! info "Acknowledgement"
    Thanks to the [Graphsignal :material-arrow-top-right-thin:{ .external }](https://graphsignal.com/){:target="_blank"} team for access to DGX Spark and for supporting testing and validation. Graphsignal provides inference observability tooling used to profile CUDA workloads during both training and inference.

## What's next?

1. Read the [NVIDIA DGX Spark in-depth review :material-arrow-top-right-thin:{ .external }](https://lmsys.org/blog/2025-10-13-nvidia-dgx-spark/){:target="_blank"} by the SGLang team.
2. Check out [dev environments](../../docs/concepts/dev-environments.md),
   [tasks](../../docs/concepts/tasks.md), [services](../../docs/concepts/services.md),
   and [fleets](../../docs/concepts/fleets.md).
3. Follow the [Quickstart](../../docs/quickstart.md).
4. Join [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd){:target="_blank"}.