```yaml
type: fleet
- name: my-fleet
+ # The name is optional, if not specified, generated randomly
+ name: ah-fleet-distrib
+ # Size of the cluster
nodes: 2
+ # Ensure instances are interconnected
placement: cluster
- backends: [aws]
+ # Use either spot or on-demand instances
+ spot_policy: auto
resources:
- gpu: 24GB
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPUs
+ count: 1..
```
@@ -41,14 +49,17 @@ are both acceptable).
To create a fleet from on-prem servers, specify their hosts along with the user, port, and SSH key for connection via SSH.
-
+
```yaml
type: fleet
- name: my-fleet
+ # The name is optional, if not specified, generated randomly
+ name: my-on-prem-fleet
+ # Ensure instances are interconnected
placement: cluster
+ # The user, private SSH key, and hostnames of the on-prem servers
ssh_config:
user: ubuntu
identity_file: ~/.ssh/id_rsa
@@ -65,21 +76,22 @@ are both acceptable).
Set `placement` to `cluster` if the nodes are interconnected (e.g. if you'd like to use them for multi-node tasks).
In that case, by default, `dstack` will automatically detect the private network.
- You can specify the [`network`](../reference/dstack.yml/fleet.md#network) parameter manually.
+ You can specify the [`network`](reference/dstack.yml/fleet.md#network) parameter manually.
!!! info "Reference"
- See the [.dstack.yml reference](reference/dstack.yml/fleet.md)
- for all supported configuration options and examples.
+ See [.dstack.yml](reference/dstack.yml/fleet.md) for all the options supported by
+ fleets, along with multiple examples.
-## Creating and updating fleets
+## Create or update a fleet
To create or update the fleet, simply call the [`dstack apply`](reference/cli/index.md#dstack-apply) command:
```shell
-$ dstack apply -f examples/fleets/cluster.dstack.yml
-Fleet my-fleet does not exist yet. Create the fleet? [y/n]: y
+$ dstack apply -f examples/fine-tuning/alignment-handbook/fleet-distributed.dstack.yml
+Fleet ah-fleet-distrib does not exist yet. Create the fleet? [y/n]: y
+
FLEET INSTANCE BACKEND RESOURCES PRICE STATUS CREATED
-my-fleet           0         pending  now
+ah-fleet-distrib   0         pending  now
1 pending now
@@ -87,22 +99,26 @@ Fleet my-fleet does not exist yet. Create the fleet? [y/n]: y
-Once the status of instances change to `idle`, they can be used by `dstack run`.
+Once the status of instances changes to `idle`, they can be used by dev environments, tasks, and services.
## Creation policy
-> By default, `dstack run` tries to reuse `idle` instances from existing fleets.
-If no `idle` instances meet the requirements, `dstack run` creates a new fleet automatically.
-To avoid creating new fleet, specify pass `--reuse` to `dstack run`.
+By default, when running dev environments, tasks, and services, `dstack apply` tries to reuse `idle`
+instances from existing fleets.
+If no `idle` instances meet the requirements, it creates a new fleet automatically.
+To avoid creating a new fleet, pass `--reuse` to `dstack apply` (or set [
+`creation_policy`](reference/dstack.yml/dev-environment.md#creation_policy) to `reuse` in the configuration).
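+
+For example, to make sure a run only reuses an existing fleet, the command might look like the following (the configuration file name is illustrative):
+
+```shell
+$ dstack apply -f .dstack.yml --reuse
+```
+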
## Termination policy
-> If you want a fleet to be automatically deleted after a certain idle time, you can set the
-you can set the [`termination_idle_time`](reference/dstack.yml/fleet.md#termination_idle_time) property.
+> If you want a fleet to be automatically deleted after a certain idle time, you can set the
+> [`termination_idle_time`](reference/dstack.yml/fleet.md#termination_idle_time) property.
+
+[//]: # (Add Idle time example to the reference page)
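+
+Below is a minimal sketch of a fleet configuration with an idle timeout (the fleet name and the `30min` value are illustrative):
+
+```yaml
+type: fleet
+# The name is optional, if not specified, generated randomly
+name: my-fleet
+
+nodes: 2
+
+# Delete the fleet after 30 minutes of inactivity
+termination_idle_time: 30min
+```
+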
-## Managing fleets
+## Manage fleets
-### Listing fleets
+### List fleets
The [`dstack fleet`](reference/cli/index.md#dstack-fleet) command lists fleet instances and their status:
@@ -117,7 +133,7 @@ $ dstack fleet
-### Deleting fleets
+### Delete fleets
When a fleet isn't used by a run, you can delete it via `dstack delete`:
@@ -133,4 +149,14 @@ Fleet my-gcp-fleet deleted
You can pass either the path to the configuration file or the fleet name directly.
-To terminate and delete specific instances from a fleet, pass `-i INSTANCE_NUM`.
\ No newline at end of file
+To terminate and delete specific instances from a fleet, pass `-i INSTANCE_NUM`.
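+
+For example, to remove only the first instance of a fleet, an invocation might look like this (the fleet name is illustrative):
+
+```shell
+$ dstack delete my-fleet -i 0
+```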
+
+## What's next?
+
+1. Read about [dev environments](dev-environments.md), [tasks](tasks.md), and
+ [services](services.md)
+2. Join the community via [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
+
+!!! info "Reference"
+ See [.dstack.yml](reference/dstack.yml/fleet.md) for all the options supported by
+ fleets, along with multiple examples.
\ No newline at end of file
diff --git a/docs/docs/guides/dstack-sky.md b/docs/docs/guides/dstack-sky.md
new file mode 100644
index 000000000..345b6276d
--- /dev/null
+++ b/docs/docs/guides/dstack-sky.md
@@ -0,0 +1,44 @@
+# dstack Sky
+
+If you don't want to host the `dstack` server or would like to access GPUs from the `dstack` marketplace,
+sign up with [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}.
+
+## Set up the CLI
+
+If you've signed up, open your project settings, and copy the `dstack config` command to point the CLI to the project.
+
+{ width=800 }
+
+Then, install the CLI on your machine and use the copied command.
+
+
+
+```shell
+$ pip install dstack
+$ dstack config --url https://sky.dstack.ai \
+ --project peterschmidt85 \
+ --token bbae0f28-d3dd-4820-bf61-8f4bb40815da
+
+Configuration is updated at ~/.dstack/config.yml
+```
+
+
+
+## Configure clouds
+
+By default, [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}
+uses the GPU from its marketplace, which requires a credit card to be attached in your account
+settings.
+
+To use your own cloud accounts, click the settings icon of the corresponding backend and specify credentials:
+
+{ width=800 }
+
+For more details on how to configure your own cloud accounts, check
+the [server/config.yml reference](../reference/server/config.yml.md).
+
+## What's next?
+
+1. Follow [quickstart](../quickstart.md)
+2. Browse [examples :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/tree/master/examples)
+3. Join the community via [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
\ No newline at end of file
diff --git a/docs/docs/index.md b/docs/docs/index.md
index 49fdb0ddb..44299a711 100644
--- a/docs/docs/index.md
+++ b/docs/docs/index.md
@@ -1,13 +1,12 @@
# What is dstack?
-`dstack` is an open-source container orchestration engine for AI.
-It accelerates the development, training, and deployment of AI models, and simplifies the management of clusters.
+`dstack` is a lightweight alternative to Kubernetes, designed specifically for managing the development, training, and
+deployment of AI models at any scale.
-#### Cloud and on-prem
+`dstack` is easy to use with any cloud provider (AWS, GCP, Azure, OCI, Lambda, TensorDock, Vast.ai, RunPod, etc.) or
+with on-prem clusters.
-`dstack` is easy to use with any cloud or on-prem servers.
-Supported cloud providers include AWS, GCP, Azure, OCI, Lambda, TensorDock, Vast.ai, RunPod, and CUDO.
-For using `dstack` with on-prem servers, see [fleets](fleets.md#__tabbed_1_2).
+If you already use Kubernetes, `dstack` can be used with it.
#### Accelerators
@@ -15,35 +14,31 @@ For using `dstack` with on-prem servers, see [fleets](fleets.md#__tabbed_1_2).
## How does it work?
-> Before using `dstack`, [install](installation/index.md) the server and configure
-backends for each cloud account (or Kubernetes cluster) that you intend to use.
+> Before using `dstack`, [install](installation/index.md) the server and configure backends.
-#### 1. Define run configurations
+#### 1. Define configurations
-`dstack` supports three types of run configurations:
+`dstack` supports the following configurations:
* [Dev environments](dev-environments.md) — for interactive development using a desktop IDE
-* [Tasks](tasks.md) — for any kind of batch jobs or web applications (supports distributed jobs)
-* [Services](services.md)— for production-grade deployment (supports auto-scaling and authorization)
-
-Each type of run configuration allows you to specify commands for execution, required compute resources, retry policies, auto-scaling rules, authorization settings, and more.
+* [Tasks](tasks.md) — for scheduling jobs (incl. distributed jobs) or running web apps
+* [Services](services.md) — for deployment of models and web apps (with auto-scaling and authorization)
+* [Fleets](fleets.md) — for managing cloud and on-prem clusters
+* [Volumes](concepts/volumes.md) — for managing persistent volumes
+* [Gateways](concepts/gateways.md) — for configuring ingress traffic and public endpoints
Configurations can be defined as YAML files within your repo.
-#### 2. Run configurations
-
-Run any defined configuration either via `dstack` CLI or API.
-
-`dstack` automatically handles provisioning, interruptions, port-forwarding, auto-scaling, network, volumes,
-run failures, out-of-capacity errors, and more.
+#### 2. Apply configurations
-#### 3. Manage fleets
+Apply the configuration either via the `dstack apply` CLI command or through a programmatic API.
-Use [fleets](fleets.md) to provision and manage clusters and instances, both in the cloud and on-prem.
+`dstack` automatically manages provisioning, job queuing, auto-scaling, networking, volumes, run failures,
+out-of-capacity errors, port-forwarding, and more — across clouds and on-prem clusters.
## Where do I start?
1. Proceed to [installation](installation/index.md)
2. See [quickstart](quickstart.md)
-3. Browse [examples :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/tree/master/examples){:target="_blank"}
+3. Browse [examples](/docs/examples)
4. Join [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd){:target="_blank"}
\ No newline at end of file
diff --git a/docs/docs/installation/index.md b/docs/docs/installation/index.md
index 46725cad5..8d984d441 100644
--- a/docs/docs/installation/index.md
+++ b/docs/docs/installation/index.md
@@ -13,7 +13,7 @@ Follow the steps below to set up the server.
### 1. Configure backends
-> If you want the `dstack` server to run containers or manage clusters in your cloud accounts (or use Kubernetes),
+If you want the `dstack` server to run containers or manage clusters in your cloud accounts (or use Kubernetes),
create the [~/.dstack/server/config.yml](../reference/server/config.yml.md) file and configure backends.
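+
+As a minimal sketch, a `config.yml` with a single AWS backend using default credentials might look like the following (see the [server/config.yml reference](../reference/server/config.yml.md) for the full schema):
+
+```yaml
+projects:
+- name: main
+  backends:
+  - type: aws
+    creds:
+      type: default
+```
+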
### 2. Start the server
@@ -55,16 +55,16 @@ Once the `~/.dstack/server/config.yml` file is configured, proceed to start the
> For more details on how to deploy `dstack` using Docker, check its [Docker repo](https://hub.docker.com/r/dstackai/dstack).
-> By default, the `dstack` server stores its state in `~/.dstack/server/data` using SQLite.
-> To use a database, set the [`DSTACK_DATABASE_URL`](../reference/cli/index.md#environment-variables) environment variable.
+By default, the `dstack` server stores its state in `~/.dstack/server/data` using SQLite.
+To use a database, set the [`DSTACK_DATABASE_URL`](../reference/cli/index.md#environment-variables) environment variable.
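+
+For example, the server could be started against an external PostgreSQL database like this (the connection string is a placeholder; adjust it to your setup):
+
+```shell
+$ DSTACK_DATABASE_URL=postgresql+asyncpg://user:password@db-host:5432/dstack dstack server
+```
+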
-The server can be set up anywhere: on your laptop, a dedicated server, or in the cloud.
-Once the `dstack` server is up, you can use the CLI or API.
+The `dstack` server can run anywhere: on your laptop, a dedicated server, or in the cloud. Once it's up, you
+can use either the CLI or the API.
### 3. Set up the CLI
To point the CLI to the `dstack` server, configure it
-with the server address, user token and project name:
+with the server address, user token, and project name:
@@ -81,55 +81,18 @@ Configuration is updated at ~/.dstack/config.yml
This configuration is stored in `~/.dstack/config.yml`.
-### 4. Add on-prem servers
+### 4. Create on-prem fleets
-!!! info "Fleets"
- If you want the `dstack` server to run containers on your on-prem servers,
- use [fleets](../fleets.md#__tabbed_1_2).
-
-## dstack Sky
-
-If you don't want to host the `dstack` server yourself or would like to access GPU from the `dstack` marketplace, sign up with
-[dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}.
-
-### Set up the CLI
-
-If you've signed up,
-open your project settings, and copy the `dstack config` command to point the CLI to the project.
-
-{ width=800 }
-
-Then, install the CLI on your machine and use the copied command.
-
-
-
-```shell
-$ pip install dstack
-$ dstack config --url https://sky.dstack.ai \
- --project peterschmidt85 \
- --token bbae0f28-d3dd-4820-bf61-8f4bb40815da
-
-Configuration is updated at ~/.dstack/config.yml
-```
-
-
-
-### Configure clouds
-
-By default, [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"}
-uses the GPU from its marketplace, which requires a credit card to be attached in your account
-settings.
-
-To use your own cloud accounts, click the settings icon of the corresponding backend and specify credentials:
-
-{ width=800 }
-
-[//]: # (The `dstack server` command automatically updates `~/.dstack/config.yml`)
-[//]: # (with the `main` project.)
+If you want the `dstack` server to run containers on your on-prem servers,
+use [fleets](../fleets.md#__tabbed_1_2).
## What's next?
1. Check the [server/config.yml reference](../reference/server/config.yml.md) on how to configure backends
2. Follow [quickstart](../quickstart.md)
3. Browse [examples :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/tree/master/examples)
-4. Join the community via [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
\ No newline at end of file
+4. Join the community via [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
+
+!!! info "dstack Sky"
+ If you don't want to host the `dstack` server or would like to access GPUs from the `dstack` marketplace,
+ check [dstack Sky](../guides/dstack-sky.md).
\ No newline at end of file
diff --git a/docs/docs/quickstart.md b/docs/docs/quickstart.md
index 5f4700810..81e3fb9ce 100644
--- a/docs/docs/quickstart.md
+++ b/docs/docs/quickstart.md
@@ -1,7 +1,6 @@
# Quickstart
-> Before using `dstack`, [install](installation/index.md) the server and configure
-backends.
+> Before using `dstack`, [install](installation/index.md) the server.
## Initialize a repo
@@ -18,118 +17,220 @@ $ dstack init
Your folder can be a regular local folder or a Git repo.
-## Define a configuration
-
-Define what you want to run as a YAML file. The filename must end with `.dstack.yml` (e.g., `.dstack.yml`
-or `train.dstack.yml` are both acceptable).
+## Run a configuration
=== "Dev environment"
- Dev environments allow you to quickly provision a machine with a pre-configured environment, resources, IDE, code, etc.
+ A dev environment lets you provision a remote machine with your code, dependencies, and resources, and access it
+ with your desktop IDE.
+
+ ##### Define a configuration
+
+ Create the following configuration file inside the repo:
```yaml
type: dev-environment
-
- # Use either `python` or `image` to configure environment
+ # The name is optional, if not specified, generated randomly
+ name: vscode
+
python: "3.11"
- # image: ghcr.io/huggingface/text-generation-inference:latest
+ # Uncomment to use a custom Docker image
+ #image: dstackai/base:py3.10-0.4-cuda-12.1
ide: vscode
-
- # (Optional) Configure `gpu`, `memory`, `disk`, etc
- resources:
- gpu: 24GB
+
+ # Use either spot or on-demand instances
+ spot_policy: auto
+
+ # Uncomment to request resources
+ #resources:
+ # gpu: 24GB
```
+ ##### Run the configuration
+
+ Run the configuration via [`dstack apply`](reference/cli/index.md#dstack-apply):
+
+
+
+ ```shell
+ $ dstack apply -f .dstack.yml
+
+ # BACKEND REGION RESOURCES SPOT PRICE
+ 1 gcp us-west4 2xCPU, 8GB, 100GB (disk) yes $0.010052
+ 2 azure westeurope 2xCPU, 8GB, 100GB (disk) yes $0.0132
+ 3 gcp europe-central2 2xCPU, 8GB, 100GB (disk) yes $0.013248
+
+ Submit the run vscode? [y/n]: y
+
+ Launching `vscode`...
+ ---> 100%
+
+ To open in VS Code Desktop, use this link:
+ vscode://vscode-remote/ssh-remote+vscode/workflow
+ ```
+
+
+
+ Open the link to access the dev environment using your desktop IDE.
+
=== "Task"
- Tasks make it very easy to run any scripts, be it for training, data processing, or web apps. They allow you to pre-configure the environment, resources, code, etc.
+ A task allows you to schedule a job or run a web app. It lets you configure
+ dependencies, resources, ports, the number of nodes (if you want to run the task on a cluster), etc.
-
+ ##### Define a configuration
+
+ Create the following configuration file inside the repo:
+
+
```yaml
type: task
-
+ # The name is optional, if not specified, generated randomly
+ name: streamlit
+
python: "3.11"
- env:
- - HF_HUB_ENABLE_HF_TRANSFER=1
+ # Uncomment to use a custom Docker image
+ #image: dstackai/base:py3.10-0.4-cuda-12.1
+
+ # Commands of the task
commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-
- # (Optional) Configure `gpu`, `memory`, `disk`, etc
- resources:
- gpu: 24GB
+ - pip install streamlit
+ - streamlit hello
+ # Ports to forward
+ ports:
+ - 8501
+
+ # Use either spot or on-demand instances
+ spot_policy: auto
+
+ # Uncomment to request resources
+ #resources:
+ # gpu: 24GB
```
- Ensure `requirements.txt` and `train.py` are in your folder. You can take them from [`examples`](https://github.com/dstackai/dstack/tree/master/examples/fine-tuning/qlora).
+ ##### Run the configuration
+
+ Run the configuration via [`dstack apply`](reference/cli/index.md#dstack-apply):
+
+
+
+ ```shell
+ $ dstack apply -f streamlit.dstack.yml
+
+ # BACKEND REGION RESOURCES SPOT PRICE
+ 1 gcp us-west4 2xCPU, 8GB, 100GB (disk) yes $0.010052
+ 2 azure westeurope 2xCPU, 8GB, 100GB (disk) yes $0.0132
+ 3 gcp europe-central2 2xCPU, 8GB, 100GB (disk) yes $0.013248
+
+ Submit the run streamlit? [y/n]: y
+
+ Continue? [y/n]: y
+
+ Provisioning `streamlit`...
+ ---> 100%
+
+ Welcome to Streamlit. Check out our demo in your browser.
+
+ Local URL: http://localhost:8501
+ ```
+
+
+
+ `dstack apply` automatically forwards the remote ports to `localhost` for convenient access.
=== "Service"
- Services make it easy to deploy models and apps cost-effectively as public endpoints, allowing you to use any frameworks.
+ A service allows you to deploy a web app or a model as a scalable endpoint. It lets you configure
+ dependencies, resources, authorization, auto-scaling rules, etc.
+
+ ??? info "Prerequisites"
+ If you're using the open-source server, you must set up a [gateway](concepts/gateways.md) before you can run a service.
-
+ If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},
+ the gateway is already set up for you.
+
+ ##### Define a configuration
+
+ Create the following configuration file inside the repo:
+
+
```yaml
type: service
-
- image: ghcr.io/huggingface/text-generation-inference:latest
- env:
- - HUGGING_FACE_HUB_TOKEN # required to run gated models
- - MODEL_ID=mistralai/Mistral-7B-Instruct-v0.1
+ # The name is optional, if not specified, generated randomly
+ name: streamlit-service
+
+ python: "3.11"
+ # Uncomment to use a custom Docker image
+ #image: dstackai/base:py3.10-0.4-cuda-12.1
+
+ # Commands of the service
commands:
- - text-generation-launcher --port 8000 --trust-remote-code
- port: 8000
+ - pip install streamlit
+ - streamlit hello
+ # Port of the service
+ port: 8501
- # (Optional) Configure `gpu`, `memory`, `disk`, etc
- resources:
- gpu: 24GB
+ # Comment out to enable authorization
+ auth: false
+
+ # Use either spot or on-demand instances
+ spot_policy: auto
+
+ # Uncomment to request resources
+ #resources:
+ # gpu: 24GB
```
-## Run configuration
+ ##### Run the configuration
-Run a configuration using the [`dstack run`](reference/cli/index.md#dstack-run) command, followed by the working directory path (e.g., `.`),
-and the path to the configuration file.
+ Run the configuration via [`dstack apply`](reference/cli/index.md#dstack-apply):
-
+
-```shell
-$ dstack run . -f train.dstack.yml
-
- BACKEND REGION RESOURCES SPOT PRICE
- tensordock unitedkingdom 10xCPU, 80GB, 1xA100 (80GB) no $1.595
- azure westus3 24xCPU, 220GB, 1xA100 (80GB) no $3.673
- azure westus2 24xCPU, 220GB, 1xA100 (80GB) no $3.673
-
-Continue? [y/n]: y
-
-Provisioning...
----> 100%
+ ```shell
+ $ dstack apply -f streamlit.dstack.yml
+
+ # BACKEND REGION RESOURCES SPOT PRICE
+ 1 gcp us-west4 2xCPU, 8GB, 100GB (disk) yes $0.010052
+ 2 azure westeurope 2xCPU, 8GB, 100GB (disk) yes $0.0132
+ 3 gcp europe-central2 2xCPU, 8GB, 100GB (disk) yes $0.013248
+
+ Submit the run streamlit? [y/n]: y
+
+ Continue? [y/n]: y
+
+ Provisioning `streamlit`...
+ ---> 100%
-Epoch 0: 100% 1719/1719 [00:18<00:00, 92.32it/s, loss=0.0981, acc=0.969]
-Epoch 1: 100% 1719/1719 [00:18<00:00, 92.32it/s, loss=0.0981, acc=0.969]
-Epoch 2: 100% 1719/1719 [00:18<00:00, 92.32it/s, loss=0.0981, acc=0.969]
-```
+ Welcome to Streamlit. Check out our demo in your browser.
-
+ Local URL: https://streamlit-service.example.com
+ ```
+
+
-The `dstack run` command automatically uploads your code, including any local uncommitted changes.
+ Once the service is up, its endpoint is accessible at `https://<run name>.<gateway domain>`.
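+
+ For example, since `auth` is disabled in this configuration, you could check the endpoint with `curl` (the domain is taken from the sample output above):
+
+ ```shell
+ $ curl https://streamlit-service.example.com
+ ```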
-!!! info "Fleets"
- By default, `dstack run` reuses `idle` instances from one of the existing [fleets](fleets.md).
- If no `idle` instances meet the requirements, it creates a new fleet using one of the configured backends.
+> `dstack apply` automatically uploads the code from the current repo, including your local uncommitted changes.
## What's next?
1. Read about [dev environments](dev-environments.md), [tasks](tasks.md),
[services](services.md), and [fleets](fleets.md)
2. Browse [examples :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/tree/master/examples){:target="_blank"}
-3. Join the community via [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
\ No newline at end of file
+3. Join the community via [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
+
+!!! info "Examples"
+ To see how dev environments, tasks, services, and fleets can be used for
+ training and deploying AI models, check out the [examples](examples/index.md).
\ No newline at end of file
diff --git a/docs/docs/reference/dstack.yml/dev-environment.md b/docs/docs/reference/dstack.yml/dev-environment.md
index 2a39c0788..c516099c2 100644
--- a/docs/docs/reference/dstack.yml/dev-environment.md
+++ b/docs/docs/reference/dstack.yml/dev-environment.md
@@ -2,9 +2,9 @@
The `dev-environment` configuration type allows running [dev environments](../../dev-environments.md).
-> Configuration files must have a name ending with `.dstack.yml` (e.g., `.dstack.yml` or `serve.dstack.yml` are both acceptable)
-> and can be located in the project's root directory or any nested folder.
-> Any configuration can be run via [`dstack run`](../cli/index.md#dstack-run).
+> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
+> (e.g. `.dstack.yml` or `dev.dstack.yml` are both acceptable).
+> Any configuration can be run via [`dstack apply`](../cli/index.md#dstack-apply).
## Examples
@@ -18,25 +18,46 @@ The `python` property determines which default Docker image is used.
```yaml
type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: vscode
-python: "3.11"
+# If `image` is not specified, dstack uses its base image
+python: "3.10"
ide: vscode
```
-!!! info "nvcc"
+??? info "nvcc"
Note that the default Docker image doesn't bundle `nvcc`, which is required for building custom CUDA kernels.
To install it, use `conda install cuda`.
+
+ ```yaml
+ type: dev-environment
+ # The name is optional, if not specified, generated randomly
+ name: vscode
+
+ python: "3.10"
+
+ ide: vscode
+
+ # Run this command on start
+ init:
+ - conda install cuda
+ ```
+
### Docker image
```yaml
type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: vscode
+# Any custom Docker image
image: ghcr.io/huggingface/text-generation-inference:latest
ide: vscode
@@ -50,8 +71,12 @@ ide: vscode
```yaml
type: dev-environment
-
+ # The name is optional, if not specified, generated randomly
+ name: vscode
+
+ # Any private Docker image
image: ghcr.io/huggingface/text-generation-inference:latest
+ # Credentials of the private Docker registry
registry_auth:
username: peterschmidt85
password: ghp_e49HcZ9oYwBzUbcSk2080gXZOU2hiT9AeSR5
@@ -68,19 +93,19 @@ range (e.g. `24GB..`, or `24GB..80GB`, or `..80GB`).
```yaml
type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: vscode
ide: vscode
resources:
# 200GB or more RAM
memory: 200GB..
-
# 4 GPUs from 40GB to 80GB
gpu: 40GB..80GB:4
-
- # Shared memory
+ # Shared memory (required for multi-GPU workloads)
shm_size: 16GB
-
+ # Disk size
disk: 500GB
```
@@ -96,6 +121,8 @@ and their quantity. Examples: `A100` (one A100), `A10G,A100` (either A10G or A10
```yaml
type: dev-environment
+ # The name is optional, if not specified, generated randomly
+ name: vscode
ide: vscode
@@ -115,7 +142,10 @@ and their quantity. Examples: `A100` (one A100), `A10G,A100` (either A10G or A10
```yaml
type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: vscode
+# Environment variables
env:
- HUGGING_FACE_HUB_TOKEN
- HF_HUB_ENABLE_HF_TRANSFER=1
@@ -125,12 +155,12 @@ ide: vscode
-If you don't assign a value to an environment variable (see `HUGGING_FACE_HUB_TOKEN` above),
+> If you don't assign a value to an environment variable (see `HUGGING_FACE_HUB_TOKEN` above),
`dstack` will require the value to be passed via the CLI or set in the current process.
For instance, you can define environment variables in a `.env` file and utilize tools like `direnv`.
-#### Default environment variables
+#### System environment variables
The following environment variables are available in any run and are passed by `dstack` by default:
@@ -148,9 +178,12 @@ You can choose whether to use spot instances, on-demand instances, or any availa
```yaml
type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: vscode
ide: vscode
+# Use either spot or on-demand instances
spot_policy: auto
```
@@ -166,9 +199,12 @@ By default, `dstack` provisions instances in all configured backends. However, y
```yaml
type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: vscode
ide: vscode
+# Use only listed backends
backends: [aws, gcp]
```
@@ -182,9 +218,12 @@ By default, `dstack` uses all configured regions. However, you can specify the l
```yaml
type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: vscode
ide: vscode
+# Use only listed regions
regions: [eu-west-1, eu-west-2]
```
@@ -199,9 +238,12 @@ To attach a volume, simply specify its name using the `volumes` property and spe
```yaml
type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: vscode
ide: vscode
+# Map the name of the volume to any path
volumes:
- name: my-new-volume
path: /volume_data
@@ -212,7 +254,7 @@ volumes:
Once you run this configuration, the contents of the volume will be attached to `/volume_data` inside the dev
environment, and its contents will persist across runs.
-!!! info "Limitations"
+??? info "Limitations"
When you're running a dev environment, task, or service with `dstack`, it automatically mounts the project folder contents
to `/workflow` (and sets that as the current working directory). Right now, `dstack` doesn't allow you to
attach volumes to `/workflow` or any of its subdirectories.
diff --git a/docs/docs/reference/dstack.yml/fleet.md b/docs/docs/reference/dstack.yml/fleet.md
index e763ccded..ccccd4c21 100644
--- a/docs/docs/reference/dstack.yml/fleet.md
+++ b/docs/docs/reference/dstack.yml/fleet.md
@@ -2,35 +2,52 @@
The `fleet` configuration type allows creating and updating fleets.
-> Configuration files must have a name ending with `.dstack.yml` (e.g., `.dstack.yml` or `fleet.dstack.yml` are both acceptable)
-> and can be located in the project's root directory or any nested folder.
-> Any configuration can be applied via [`dstack apply`](../cli/index.md#dstack-apply).
+> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
+> (e.g. `.dstack.yml` or `fleet.dstack.yml` are both acceptable).
+> Any configuration can be run via [`dstack apply`](../cli/index.md#dstack-apply).
## Examples
-### Creating a cloud fleet { #create-cloud-fleet }
+### Cloud fleet { #cloud-fleet }
-
+
```yaml
type: fleet
-name: my-gcp-fleet
+# The name is optional, if not specified, generated randomly
+name: my-fleet
+
+# The number of instances
nodes: 4
+# Ensure the instances are interconnected
placement: cluster
-backends: [gcp]
+
+# Use either spot or on-demand instances
+spot_policy: auto
+
resources:
- gpu: 1
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPU
+ count: 1..
```
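+
+To restrict provisioning to specific clouds, you can also set the `backends` property (a minimal sketch based on the example above):
+
+```yaml
+type: fleet
+# The name is optional, if not specified, generated randomly
+name: my-fleet
+
+nodes: 2
+
+# Use only listed backends
+backends: [aws, gcp]
+
+resources:
+  gpu: 24GB
+```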
-### Creating an on-prem fleet { #create-ssh-fleet }
+### On-prem fleet { #on-prem-fleet }
-
+
```yaml
type: fleet
-name: my-ssh-fleet
+# The name is optional, if not specified, generated randomly
+name: my-on-prem-fleet
+
+# Ensure instances are interconnected
+placement: cluster
+
+# The user, private SSH key, and hostnames of the on-prem servers
ssh_config:
user: ubuntu
identity_file: ~/.ssh/id_rsa
@@ -43,6 +60,8 @@ ssh_config:
[//]: # (TODO: a cluster, individual user and identity file, etc)
+[//]: # (TODO: other examples, for all properties like in dev-environment/task/service)
+
## Root reference
#SCHEMA# dstack._internal.core.models.fleets.FleetConfiguration
@@ -57,7 +76,6 @@ ssh_config:
overrides:
show_root_heading: false
-
## `ssh.hosts[n]`
#SCHEMA# dstack._internal.core.models.fleets.SSHHostParams
diff --git a/docs/docs/reference/dstack.yml/gateway.md b/docs/docs/reference/dstack.yml/gateway.md
index 66fc3c21f..c9b2cd4b9 100644
--- a/docs/docs/reference/dstack.yml/gateway.md
+++ b/docs/docs/reference/dstack.yml/gateway.md
@@ -2,25 +2,32 @@
The `gateway` configuration type allows creating and updating [gateways](../../services.md).
-> Configuration files must have a name ending with `.dstack.yml` (e.g., `.dstack.yml` or `gateway.dstack.yml` are both acceptable)
-> and can be located in the project's root directory or any nested folder.
-> Any configuration can be applied via [`dstack apply`](../cli/index.md#dstack-apply).
+> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
+> (e.g. `.dstack.yml` or `gateway.dstack.yml` are both acceptable).
+> Any configuration can be run via [`dstack apply`](../cli/index.md#dstack-apply).
## Examples
+### Creating a new gateway { #new-gateway }
+
```yaml
type: gateway
+# The name of the gateway
name: example-gateway
+# Gateways are bound to a specific backend and region
backend: aws
region: eu-west-1
+
+# This domain will be used to access the endpoint
domain: example.com
```
+[//]: # (TODO: other examples, e.g. private gateways)
## Root reference
diff --git a/docs/docs/reference/dstack.yml/service.md b/docs/docs/reference/dstack.yml/service.md
index 9ca184d57..99c4101e5 100644
--- a/docs/docs/reference/dstack.yml/service.md
+++ b/docs/docs/reference/dstack.yml/service.md
@@ -2,9 +2,9 @@
The `service` configuration type allows running [services](../../services.md).
-> Configuration files must have a name ending with `.dstack.yml` (e.g., `.dstack.yml` or `serve.dstack.yml` are both acceptable)
-> and can be located in the project's root directory or any nested folder.
-> Any configuration can be run via [`dstack run . -f PATH`](../cli/index.md#dstack-run).
+> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
+> (e.g. `.dstack.yml` or `serve.dstack.yml` are both acceptable).
+> Any configuration can be run via [`dstack apply`](../cli/index.md#dstack-apply).
## Examples
@@ -14,16 +14,20 @@ If you don't specify `image`, `dstack` uses the default Docker image pre-configu
`python`, `pip`, `conda` (Miniforge), and essential CUDA drivers.
The `python` property determines which default Docker image is used.
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: http-server-service
-python: "3.11"
+# If `image` is not specified, dstack uses its base image
+python: "3.10"
+# Commands of the service
commands:
- python3 -m http.server
-
+# The port of the service
port: 8000
```
@@ -31,20 +35,24 @@ port: 8000
!!! info "nvcc"
Note that the default Docker image doesn't bundle `nvcc`, which is required for building custom CUDA kernels.
- To install it, use `conda install cuda`.
+ To install it, use `conda install cuda` as the first command.
### Docker image
-
+
```yaml
type: service
+ # The name is optional, if not specified, generated randomly
+ name: http-server-service
+
+ # Any custom Docker image
+ image: dstackai/base:py3.10-0.4-cuda-12.1
- image: dstackai/base:py3.11-0.4-cuda-12.1
-
+ # Commands of the service
commands:
- python3 -m http.server
-
+ # The port of the service
port: 8000
```
@@ -56,47 +64,55 @@ port: 8000
```yaml
type: service
+ # The name is optional, if not specified, generated randomly
+ name: http-server-service
- image: dstackai/base:py3.11-0.4-cuda-12.1
-
- commands:
- - python3 -m http.server
+ # Any private Docker image
+ image: dstackai/base:py3.10-0.4-cuda-12.1
+ # Credentials of the private registry
registry_auth:
username: peterschmidt85
password: ghp_e49HcZ9oYwBzUbcSk2080gXZOU2hiT9AeSR5
-
+
+ # Commands of the service
+ commands:
+ - python3 -m http.server
+ # The port of the service
port: 8000
```
-### OpenAI-compatible interface { #model-mapping }
+### Model gateway { #model-mapping }
By default, if you run a service, its endpoint is accessible at `https://<run name>.<gateway domain>`.
If you run a model, you can optionally configure the mapping to make it accessible via the
OpenAI-compatible interface.
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: llama31-service
-python: "3.11"
+python: "3.10"
-env:
- - MODEL=NousResearch/Llama-2-7b-chat-hf
+# Commands of the service
commands:
- - pip install vllm
- - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000
+ - pip install vllm==0.5.3.post1
+ - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
+# Expose the port of the service
port: 8000
resources:
+ # Change to what is required
gpu: 24GB
-# Enable the OpenAI-compatible endpoint
+# Comment out if you don't want to access the model via https://gateway.<gateway domain>
model:
- format: openai
type: chat
- name: NousResearch/Llama-2-7b-chat-hf
+ name: meta-llama/Meta-Llama-3.1-8B-Instruct
+ format: openai
```
@@ -149,32 +165,32 @@ and `openai` (if you are using Text Generation Inference or vLLM with OpenAI-com
By default, `dstack` runs a single replica of the service.
You can configure the number of replicas as well as the auto-scaling rules.
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: llama31-service
-python: "3.11"
+python: "3.10"
-env:
- - MODEL=NousResearch/Llama-2-7b-chat-hf
+# Commands of the service
commands:
- - pip install vllm
- - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000
+ - pip install vllm==0.5.3.post1
+ - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
+# Expose the port of the service
port: 8000
resources:
+ # Change to what is required
gpu: 24GB
-# Enable the OpenAI-compatible endpoint
-model:
- format: openai
- type: chat
- name: NousResearch/Llama-2-7b-chat-hf
-
+# Minimum and maximum number of replicas
replicas: 1..4
scaling:
+ # Requests per second
metric: rps
+ # Target metric value
target: 10
```
@@ -192,31 +208,31 @@ Setting the minimum number of replicas to `0` allows the service to scale down t
If you specify memory size, you can either specify an explicit size (e.g. `24GB`) or a
range (e.g. `24GB..`, or `24GB..80GB`, or `..80GB`).
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: http-server-service
+
+python: "3.10"
-python: "3.11"
+# Commands of the service
commands:
- pip install vllm
- python -m vllm.entrypoints.openai.api_server
--model mistralai/Mixtral-8X7B-Instruct-v0.1
--host 0.0.0.0
- --tensor-parallel-size 2 # Match the number of GPUs
+ --tensor-parallel-size $DSTACK_GPUS_NUM
+# Expose the port of the service
port: 8000
resources:
# 2 GPUs of 80GB
gpu: 80GB:2
+ # Minimum disk size
disk: 200GB
-
-# Enable the OpenAI-compatible endpoint
-model:
- type: chat
- name: TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ
- format: openai
```
@@ -235,41 +251,51 @@ and their quantity. Examples: `A100` (one A100), `A10G,A100` (either A10G or A10
By default, the service endpoint requires the `Authorization` header with `"Bearer <dstack token>"`.
Authorization can be disabled by setting `auth` to `false`.
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: http-server-service
-python: "3.11"
+# Disable authorization
+auth: false
+
+python: "3.10"
+# Commands of the service
commands:
- python3 -m http.server
-
+# The port of the service
port: 8000
-
-auth: false
```
### Environment variables
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: llama-2-7b-service
-python: "3.11"
+python: "3.10"
+# Environment variables
env:
- HUGGING_FACE_HUB_TOKEN
- MODEL=NousResearch/Llama-2-7b-chat-hf
+# Commands of the service
commands:
- pip install vllm
- python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000
+# The port of the service
port: 8000
resources:
+ # Required GPU vRAM
gpu: 24GB
```
@@ -280,7 +306,7 @@ If you don't assign a value to an environment variable (see `HUGGING_FACE_HUB_TO
For instance, you can define environment variables in a `.env` file and utilize tools like `direnv`.
-#### Default environment variables
+#### System environment variables
The following environment variables are available in any run and are passed by `dstack` by default:
@@ -294,16 +320,19 @@ The following environment variables are available in any run and are passed by `
You can choose whether to use spot instances, on-demand instances, or any available type.
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: http-server-service
commands:
- python3 -m http.server
-
+# The port of the service
port: 8000
+# Use either spot or on-demand instances
spot_policy: auto
```
@@ -315,16 +344,20 @@ The `spot_policy` accepts `spot`, `on-demand`, and `auto`. The default for servi
By default, `dstack` provisions instances in all configured backends. However, you can specify the list of backends:
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: http-server-service
+# Commands of the service
commands:
- python3 -m http.server
-
+# The port of the service
port: 8000
+# Use only listed backends
backends: [aws, gcp]
```
@@ -334,16 +367,20 @@ backends: [aws, gcp]
By default, `dstack` uses all configured regions. However, you can specify the list of regions:
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: http-server-service
+# Commands of the service
commands:
- python3 -m http.server
-
+# The port of the service
port: 8000
+# Use only listed regions
regions: [eu-west-1, eu-west-2]
```
@@ -354,16 +391,20 @@ regions: [eu-west-1, eu-west-2]
Volumes allow you to persist data between runs.
To attach a volume, simply specify its name using the `volumes` property and specify where to mount its contents:
-
+
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: http-server-service
+# Commands of the service
commands:
- python3 -m http.server
-
+# The port of the service
port: 8000
+# Map the name of the volume to any path
volumes:
- name: my-new-volume
path: /volume_data
diff --git a/docs/docs/reference/dstack.yml/task.md b/docs/docs/reference/dstack.yml/task.md
index 0e0653e5f..330bfff08 100644
--- a/docs/docs/reference/dstack.yml/task.md
+++ b/docs/docs/reference/dstack.yml/task.md
@@ -2,9 +2,9 @@
The `task` configuration type allows running [tasks](../../tasks.md).
-> Configuration files must have a name ending with `.dstack.yml` (e.g., `.dstack.yml` or `serve.dstack.yml` are both acceptable)
-> and can be located in the project's root directory or any nested folder.
-> Any configuration can be run via [`dstack run`](../cli/index.md#dstack-run).
+> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
+> (e.g. `.dstack.yml` or `train.dstack.yml` are both acceptable).
+> Any configuration can be run via [`dstack apply`](../cli/index.md#dstack-apply).
## Examples
@@ -18,9 +18,13 @@ The `python` property determines which default Docker image is used.
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: train
-python: "3.11"
+# If `image` is not specified, dstack uses its base image
+python: "3.10"
+# Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- python fine-tuning/qlora/train.py
@@ -28,10 +32,25 @@ commands:
-!!! info "nvcc"
+??? info "nvcc"
Note that the default Docker image doesn't bundle `nvcc`, which is required for building custom CUDA kernels.
To install it, use `conda install cuda`.
+
+ ```yaml
+ type: task
+ # The name is optional, if not specified, generated randomly
+ name: train
+
+ python: "3.10"
+
+ # Before other commands, install `nvcc` (via `conda install cuda`)
+ commands:
+ - conda install cuda
+ - pip install -r fine-tuning/qlora/requirements.txt
+ - python fine-tuning/qlora/train.py
+ ```
+
### Ports { #_ports }
A task can configure ports. In this case, if the task is running an application on a port, `dstack run`
@@ -41,14 +60,17 @@ will securely allow you to access this port from your local machine through port
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: train
-python: "3.11"
+python: "3.10"
+# Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- tensorboard --logdir results/runs &
- python fine-tuning/qlora/train.py
-
+# Expose the port to access TensorBoard
ports:
- 6000
```
@@ -65,9 +87,13 @@ When running it, `dstack run` forwards `6000` port to `localhost:6000`, enabling
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: train
-image: dstackai/base:py3.11-0.4-cuda-12.1
+# Any custom Docker image
+image: dstackai/base:py3.10-0.4-cuda-12.1
+# Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- python fine-tuning/qlora/train.py
@@ -80,12 +106,17 @@ commands:
```yaml
type: task
+ # The name is optional, if not specified, generated randomly
+ name: train
- image: dstackai/base:py3.11-0.4-cuda-12.1
+ # Any private Docker image
+ image: dstackai/base:py3.10-0.4-cuda-12.1
+ # Credentials of the private Docker registry
registry_auth:
username: peterschmidt85
password: ghp_e49HcZ9oYwBzUbcSk2080gXZOU2hiT9AeSR5
+ # Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- python fine-tuning/qlora/train.py
@@ -100,7 +131,10 @@ range (e.g. `24GB..`, or `24GB..80GB`, or `..80GB`).
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: train
+# Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- python fine-tuning/qlora/train.py
@@ -108,13 +142,11 @@ commands:
resources:
# 200GB or more RAM
memory: 200GB..
-
# 4 GPUs from 40GB to 80GB
gpu: 40GB..80GB:4
-
- # Shared memory
+ # Shared memory (required for multi-GPU workloads)
shm_size: 16GB
-
+ # Disk size
disk: 500GB
```
@@ -130,9 +162,12 @@ and their quantity. Examples: `A100` (one A100), `A10G,A100` (either A10G or A10
```yaml
type: task
+ # The name is optional, if not specified, generated randomly
+ name: train
- python: "3.11"
+ python: "3.10"
+ # Commands of the task
commands:
- pip install torch~=2.3.0 torch_xla[tpu]~=2.3.0 torchvision -f https://storage.googleapis.com/libtpu-releases/index.html
- git clone --recursive https://github.com/pytorch/xla.git
@@ -155,12 +190,14 @@ and their quantity. Examples: `A100` (one A100), `A10G,A100` (either A10G or A10
```yaml
type: task
-python: "3.11"
+python: "3.10"
+# Environment variables
env:
- HUGGING_FACE_HUB_TOKEN
- HF_HUB_ENABLE_HF_TRANSFER=1
+# Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- python fine-tuning/qlora/train.py
@@ -168,12 +205,12 @@ commands:
-If you don't assign a value to an environment variable (see `HUGGING_FACE_HUB_TOKEN` above),
+> If you don't assign a value to an environment variable (see `HUGGING_FACE_HUB_TOKEN` above),
`dstack` will require the value to be passed via the CLI or set in the current process.
For instance, you can define environment variables in a `.env` file and utilize tools like `direnv`.
-##### Default environment variables
+##### System environment variables
The following environment variables are available in any run and are passed by `dstack` by default:
@@ -186,7 +223,7 @@ The following environment variables are available in any run and are passed by `
| `DSTACK_NODE_RANK` | The rank of the node |
| `DSTACK_MASTER_NODE_IP` | The internal IP address of the master node |
-### Distributed tasks { #_nodes }
+### Distributed tasks
By default, the task runs on a single node. However, you can run it on a cluster of nodes.
@@ -194,13 +231,15 @@ By default, the task runs on a single node. However, you can run it on a cluster
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: train-distrib
# The size of the cluster
nodes: 2
-python: "3.11"
-env:
- - HF_HUB_ENABLE_HF_TRANSFER=1
+python: "3.10"
+
+# Commands of the task
commands:
- pip install -r requirements.txt
- torchrun
@@ -220,41 +259,13 @@ resources:
If you run the task, `dstack` first provisions the master node and then runs the other nodes of the cluster.
All nodes are provisioned in the same region.
-`dstack` is easy to use with `accelerate`, `torchrun`, and other distributed frameworks. All you need to do
+> `dstack` is easy to use with `accelerate`, `torchrun`, and other distributed frameworks. All you need to do
is pass the corresponding environment variables such as `DSTACK_GPUS_PER_NODE`, `DSTACK_NODE_RANK`, `DSTACK_NODES_NUM`,
-`DSTACK_MASTER_NODE_IP`, and `DSTACK_GPUS_NUM` (see [System environment variables](#default-environment-variables)).
+`DSTACK_MASTER_NODE_IP`, and `DSTACK_GPUS_NUM` (see [System environment variables](#system-environment-variables)).
??? info "Backends"
- Running on multiple nodes is supported only with `aws`, `gcp`, `azure`, `oci`, and instances added via
- [`dstack pool add-ssh`](../../fleets.md#__tabbed_1_2).
-
-### Arguments
-
-You can parameterize tasks with user arguments using `${{ run.args }}` in the configuration.
-
-
-
-```yaml
-type: task
-
-python: "3.11"
-
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py ${{ run.args }}
-```
-
-
-
-Now, you can pass your arguments to the `dstack run` command:
-
-
-
-```shell
-$ dstack run . -f train.dstack.yml --train_batch_size=1 --num_train_epochs=100
-```
-
-
+ Running on multiple nodes is supported only with the `aws`, `gcp`, `azure`, `oci` backends, or
+ [on-prem fleets](../../fleets.md#__tabbed_1_2).
### Web applications
@@ -264,13 +275,16 @@ Here's an example of using `ports` to run web apps with `tasks`.
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: streamlit-hello
-python: "3.11"
+python: "3.10"
+# Commands of the task
commands:
- pip3 install streamlit
- streamlit hello
-
+# Expose the port to access the web app
ports:
- 8501
@@ -286,11 +300,15 @@ You can choose whether to use spot instances, on-demand instances, or any availa
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: train
+# Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- python fine-tuning/qlora/train.py
+# Use either spot or on-demand instances
spot_policy: auto
```
@@ -298,6 +316,34 @@ spot_policy: auto
The `spot_policy` accepts `spot`, `on-demand`, and `auto`. The default for tasks is `auto`.
+### Queueing tasks { #queueing-tasks }
+
+By default, if `dstack apply` cannot find capacity, the task fails.
+
+To queue the task and wait for capacity, specify the [`retry`](#retry)
+property:
+
+
+
+```yaml
+type: task
+# The name is optional, if not specified, generated randomly
+name: train
+
+# Commands of the task
+commands:
+ - pip install -r fine-tuning/qlora/requirements.txt
+ - python fine-tuning/qlora/train.py
+
+retry:
+ # Retry on no-capacity errors
+ on_events: [no-capacity]
+ # Retry within 1 day
+ duration: 1d
+```
+
+
+
### Backends
By default, `dstack` provisions instances in all configured backends. However, you can specify the list of backends:
@@ -306,11 +352,15 @@ By default, `dstack` provisions instances in all configured backends. However, y
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: train
+# Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- python fine-tuning/qlora/train.py
+# Use only listed backends
backends: [aws, gcp]
```
@@ -324,11 +374,15 @@ By default, `dstack` uses all configured regions. However, you can specify the l
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: train
+# Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- python fine-tuning/qlora/train.py
+# Use only listed regions
regions: [eu-west-1, eu-west-2]
```
@@ -343,13 +397,17 @@ To attach a volume, simply specify its name using the `volumes` property and spe
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: train
-python: "3.11"
+python: "3.10"
+# Commands of the task
commands:
- pip install -r fine-tuning/qlora/requirements.txt
- python fine-tuning/qlora/train.py
+# Map the name of the volume to any path
volumes:
- name: my-new-volume
path: /volume_data
@@ -375,6 +433,15 @@ The `task` configuration type supports many other options. See below.
type:
required: true
+## `retry`
+
+#SCHEMA# dstack._internal.core.models.profiles.ProfileRetry
+ overrides:
+ show_root_heading: false
+ type:
+ required: true
+ item_id_prefix: retry-
+
## `resources`
#SCHEMA# dstack._internal.core.models.resources.ResourcesSpecSchema
diff --git a/docs/docs/reference/dstack.yml/volume.md b/docs/docs/reference/dstack.yml/volume.md
index 03351fb6e..26e75b8a8 100644
--- a/docs/docs/reference/dstack.yml/volume.md
+++ b/docs/docs/reference/dstack.yml/volume.md
@@ -2,35 +2,45 @@
The `volume` configuration type allows creating, registering, and updating volumes.
-> Configuration files must have a name ending with `.dstack.yml` (e.g., `.dstack.yml` or `vol.dstack.yml` are both acceptable)
-> and can be located in the project's root directory or any nested folder.
-> Any configuration can be applied via [`dstack apply`](../cli/index.md#dstack-apply).
+> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
+> (e.g. `.dstack.yml` or `fleet.dstack.yml` are both acceptable).
+> Any configuration can be run via [`dstack apply`](../cli/index.md#dstack-apply).
## Examples
-### Creating a new volume { #create-volume }
+### Creating a new volume { #new-volume }
```yaml
type: volume
-name: my-aws-volume
+# The name of the volume
+name: my-new-volume
+
+# Volumes are bound to a specific backend and region
backend: aws
region: eu-central-1
+
+# The size of the volume
size: 100GB
```
-### Registering an existing volume { #register-volume }
+### Registering an existing volume { #existing-volume }
-
+
```yaml
type: volume
-name: my-external-volume
+# The name of the volume
+name: my-existing-volume
+
+# Volumes are bound to a specific backend and region
backend: aws
region: eu-central-1
+
+# The ID of the volume in AWS
volume_id: vol1235
```
diff --git a/docs/docs/services.md b/docs/docs/services.md
index e755cd2c4..0c0120a4e 100644
--- a/docs/docs/services.md
+++ b/docs/docs/services.md
@@ -1,8 +1,10 @@
# Services
-Services make it easy to deploy models and web applications as public,
-secure, and scalable endpoints. They are provisioned behind a [gateway](concepts/gateways.md) that
-automatically provides an HTTPS domain, handles authentication, distributes load, and performs auto-scaling.
+A service allows you to deploy a web app or a model as a scalable endpoint. It lets you configure
+dependencies, resources, authorization, auto-scaling rules, etc.
+
+Services are provisioned behind a [gateway](concepts/gateways.md) which provides an HTTPS endpoint mapped to your domain,
+handles authentication, distributes load, and performs auto-scaling.
??? info "Gateways"
If you're using the open-source server, you must set up a [gateway](concepts/gateways.md) before you can run a service.
@@ -10,32 +12,43 @@ automatically provides an HTTPS domain, handles authentication, distributes load
If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},
the gateway is already set up for you.
-## Configuration
+## Define a configuration
-First, create a YAML file in your project folder. Its name must end with `.dstack.yml` (e.g. `.dstack.yml` or `serve.dstack.yml`
+First, create a YAML file in your project folder. Its name must end with `.dstack.yml` (e.g. `.dstack.yml` or
+`serve.dstack.yml`
are both acceptable).
```yaml
type: service
+# The name is optional, if not specified, generated randomly
+name: llama31-service
+
+# If `image` is not specified, dstack uses its default image
+python: "3.10"
-python: "3.11"
+# Required environment variables
env:
- - MODEL=NousResearch/Llama-2-7b-chat-hf
+ - HUGGING_FACE_HUB_TOKEN
commands:
- - pip install vllm
- - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000
+ - pip install vllm==0.5.3.post1
+ - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
+# Expose the vllm server port
port: 8000
+# Use either spot or on-demand instances
+spot_policy: auto
+
resources:
- gpu: 80GB
+ # Change to what is required
+ gpu: 24GB
-# (Optional) Enable the OpenAI-compatible endpoint
+# Comment out if you don't want to access the model via https://gateway.<gateway domain>
model:
- format: openai
type: chat
- name: NousResearch/Llama-2-7b-chat-hf
+ name: meta-llama/Meta-Llama-3.1-8B-Instruct
+ format: openai
```
@@ -49,25 +62,26 @@ If you don't specify your Docker image, `dstack` uses the [base](https://hub.doc
In this case, `dstack` auto-scales it based on the load.
!!! info "Reference"
- See the [.dstack.yml reference](reference/dstack.yml/service.md)
- for all supported configuration options and examples.
+ See [.dstack.yml](reference/dstack.yml/service.md) for all the options supported by
+ services, along with multiple examples.
-## Running
+## Run a service
-To run a configuration, use the [`dstack run`](reference/cli/index.md#dstack-run) command followed by the working directory path,
-configuration file path, and any other options.
+To run a configuration, use the [`dstack apply`](reference/cli/index.md#dstack-apply) command.
```shell
+$ HUGGING_FACE_HUB_TOKEN=...
+
-$ dstack run . -f serve.dstack.yml
+$ dstack apply -f serve.dstack.yml
- BACKEND REGION RESOURCES SPOT PRICE
- tensordock unitedkingdom 10xCPU, 80GB, 1xA100 (80GB) no $1.595
- azure westus3 24xCPU, 220GB, 1xA100 (80GB) no $3.673
- azure westus2 24xCPU, 220GB, 1xA100 (80GB) no $3.673
+ # BACKEND REGION RESOURCES SPOT PRICE
+ 1 runpod CA-MTL-1 18xCPU, 100GB, A5000:24GB:2 yes $0.22
+ 2 runpod EU-SE-1 18xCPU, 100GB, A5000:24GB:2 yes $0.22
+ 3 gcp us-west4 27xCPU, 150GB, A5000:24GB:3 yes $0.33
-Continue? [y/n]: y
+Submit the run llama31-service? [y/n]: y
Provisioning...
---> 100%
@@ -77,31 +91,14 @@ Service is published at https://yellow-cat-1.example.com
-When deploying the service, `dstack run` mounts the current folder's contents.
-
-[//]: # (TODO: Fleets and idle duration)
-
-??? info ".gitignore"
- If there are large files or folders you'd like to avoid uploading,
- you can list them in `.gitignore`.
-
-??? info "Fleets"
- By default, `dstack run` reuses `idle` instances from one of the existing [fleets](fleets.md).
- If no `idle` instances meet the requirements, it creates a new fleet using one of the configured backends.
-
- To have the fleet deleted after a certain idle time automatically, set
- [`termination_idle_time`](reference/dstack.yml/fleet.md#termination_idle_time).
- By default, it's set to `5min`.
-
-!!! info "Reference"
- See the [CLI reference](reference/cli/index.md#dstack-run) for more details
- on how `dstack run` works.
+`dstack apply` automatically uploads the code from the current repo, including your local uncommitted changes.
+To avoid uploading large files, ensure they are listed in `.gitignore`.
-## Service endpoint
+## Access the endpoint
Once the service is up, its endpoint is accessible at `https://<run name>.<gateway domain>`.
-By default, the service endpoint requires the `Authorization` header with `Bearer `.
+By default, the service endpoint requires the `Authorization` header with `Bearer <dstack token>`.
@@ -110,7 +107,7 @@ $ curl https://yellow-cat-1.example.com/v1/chat/completions \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <dstack token>' \
-d '{
- "model": "NousResearch/Llama-2-7b-chat-hf",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"messages": [
{
"role": "user",
@@ -122,27 +119,50 @@ $ curl https://yellow-cat-1.example.com/v1/chat/completions \
-Authorization can be disabled by setting `auth` to `false` in the service configuration file.
+Authorization can be disabled by setting [`auth`](reference/dstack.yml/service.md#authorization) to `false` in the
+service configuration file.
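For example, a minimal sketch of a public (no-auth) service configuration might look like this; the name, model, port, and resources are illustrative and should be adjusted to your case:

```yaml
type: service
name: llama31-service-public

python: "3.10"

env:
  - HUGGING_FACE_HUB_TOKEN
commands:
  - pip install vllm==0.5.3.post1
  - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
port: 8000

# Disable the `Authorization` header requirement
auth: false

resources:
  gpu: 24GB
```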
+
+### Gateway endpoint
-### Model endpoint
+In case the service has the [model mapping](reference/dstack.yml/service.md#model-mapping) configured, you will also be
+able to access the model at `https://gateway.<gateway domain>` via the OpenAI-compatible interface.
-In case the service has the [model mapping](reference/dstack.yml/service.md#model-mapping) configured, you will also be able
-to access the model at `https://gateway.` via the OpenAI-compatible interface.
+## Manage runs
-## Managing runs
+### List runs
-### Listing runs
+The [`dstack ps`](reference/cli/index.md#dstack-ps) command lists all running jobs and their statuses.
+Use `--watch` (or `-w`) to monitor the live status of runs.
-The [`dstack ps`](reference/cli/index.md#dstack-ps) command lists all running runs and their status.
+### Stop a run
-### Stopping runs
+Once the run exceeds the [`max_duration`](reference/dstack.yml/task.md#max_duration), or when you use [`dstack stop`](reference/cli/index.md#dstack-stop),
+the service is stopped. Use `--abort` or `-x` to stop the run abruptly.
-When you use [`dstack stop`](reference/cli/index.md#dstack-stop), the service and its cloud resources are deleted.
+[//]: # (TODO: Mention `dstack logs` and `dstack logs -d`)
+
+## Manage fleets
+
+By default, `dstack apply` reuses `idle` instances from one of the existing [fleets](fleets.md),
+or creates a new fleet using one of the configured backends.
+
+!!! info "Idle duration"
+ To ensure the created fleets are deleted automatically, set
+ [`termination_idle_time`](reference/dstack.yml/fleet.md#termination_idle_time).
+ By default, it's set to `5min`.
+
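As a sketch, a single-node fleet that is torn down after one hour of inactivity could be defined like this (the name and resources are illustrative):

```yaml
type: fleet
name: my-idle-fleet

# Need one instance only
nodes: 1

# Use either spot or on-demand instances
spot_policy: auto
# Delete the instance if not used for one hour
termination_idle_time: 1h

resources:
  gpu: 24GB
```
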
+!!! info "Creation policy"
+    To ensure `dstack apply` always reuses an existing fleet and doesn't create a new one,
+    pass `--reuse` to `dstack apply` (or set [`creation_policy`](reference/dstack.yml/service.md#creation_policy) to `reuse` in the service configuration).
+ The default policy is `reuse_or_create`.
## What's next?
-1. Check the [Text Generation Inference :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/blob/master/examples/deployment/tgi/README.md){:target="_blank"} and [vLLM :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/blob/master/examples/deployment/vllm/README.md){:target="_blank"} examples
-2. Check the [`.dstack.yml` reference](reference/dstack.yml/service.md) for more details and examples
-3. See [gateways](concepts/gateways.md) on how to set up a gateway
-4. Browse [examples :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/tree/master/examples){:target="_blank"}
-5. See [fleets](fleets.md) on how to manage fleets
\ No newline at end of file
+1. Check the [TGI :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/blob/master/examples/deployment/tgi/README.md){:target="_blank"} and [vLLM :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/blob/master/examples/deployment/vllm/README.md){:target="_blank"} examples
+2. See [gateways](concepts/gateways.md) on how to set up a gateway
+3. Browse [examples](/docs/examples)
+4. See [fleets](fleets.md) on how to manage fleets
+
+!!! info "Reference"
+ See [.dstack.yml](reference/dstack.yml/service.md) for all the options supported by
+ services, along with multiple examples.
\ No newline at end of file
diff --git a/docs/docs/tasks.md b/docs/docs/tasks.md
index acae1b7bc..330834093 100644
--- a/docs/docs/tasks.md
+++ b/docs/docs/tasks.md
@@ -1,34 +1,42 @@
# Tasks
-Tasks allow for convenient scheduling of various batch jobs, such as training, fine-tuning, or
-data processing. They can also be used to run web applications
-when features offered by [services](services.md) are not needed, such as for debugging.
+A task allows you to schedule a job or run a web app. It lets you configure dependencies, resources, ports, and more.
+Tasks can be distributed and run on clusters.
-You can run tasks on a single machine or on a cluster of nodes.
+Tasks are ideal for training and fine-tuning jobs. They can also be used instead of services if you want to run a web
+app but don't need a public endpoint.
-## Configuration
+## Define a configuration
First, create a YAML file in your project folder. Its name must end with `.dstack.yml` (e.g. `.dstack.yml` or `train.dstack.yml`
are both acceptable).
-
+[//]: # (TODO: Make tabs - single machine & distributed tasks & web app)
+
+
```yaml
type: task
+# The name is optional, if not specified, generated randomly
+name: axolotl-train
+
-python: "3.11"
+# Using the official Axolotl's Docker image
+image: winglian/axolotl-cloud:main-20240429-py3.11-cu121-2.2.1
+# Required environment variables
env:
- - HF_HUB_ENABLE_HF_TRANSFER=1
+ - HUGGING_FACE_HUB_TOKEN
+ - WANDB_API_KEY
+# Commands of the task
commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - tensorboard --logdir results/runs &
- - python fine-tuning/qlora/train.py
-ports:
- - 6000
+ - accelerate launch -m axolotl.cli.train examples/fine-tuning/axolotl/config.yaml
-# (Optional) Configure `gpu`, `memory`, `disk`, etc
resources:
- gpu: 80GB
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # Two or more GPU
+ count: 2..
```
@@ -36,84 +44,91 @@ resources:
If you don't specify your Docker image, `dstack` uses the [base](https://hub.docker.com/r/dstackai/base/tags) image
(pre-configured with Python, Conda, and essential CUDA drivers).
-
!!! info "Distributed tasks"
By default, tasks run on a single instance. However, you can specify
- the [number of nodes](reference/dstack.yml/task.md#_nodes).
- In this case, `dstack` provisions a cluster of instances.
+ the [number of nodes](reference/dstack.yml/task.md#distributed-tasks).
+  In this case, the task will run on a cluster of instances (see the sketch below).
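
Here's a minimal sketch of a distributed task; `train.py` is a hypothetical script, and the resources are illustrative:

```yaml
type: task
name: train-distrib

python: "3.10"

commands:
  # Hypothetical training script; each node runs the same commands
  - python train.py

# Run the task on a cluster of two interconnected instances
nodes: 2

resources:
  gpu: 24GB
```
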
!!! info "Reference"
- See the [.dstack.yml reference](reference/dstack.yml/task.md)
- for all supported configuration options and examples.
+ See [.dstack.yml](reference/dstack.yml/task.md) for all the options supported by
+ tasks, along with multiple examples.
-## Running
+## Run a configuration
-To run a configuration, use the [`dstack run`](reference/cli/index.md#dstack-run) command followed by the working directory path,
-configuration file path, and other options.
+To run a configuration, use the [`dstack apply`](reference/cli/index.md#dstack-apply) command.
```shell
-$ dstack run . -f train.dstack.yml
+$ HUGGING_FACE_HUB_TOKEN=...
+$ WANDB_API_KEY=...
- BACKEND REGION RESOURCES SPOT PRICE
- tensordock unitedkingdom 10xCPU, 80GB, 1xA100 (80GB) no $1.595
- azure westus3 24xCPU, 220GB, 1xA100 (80GB) no $3.673
- azure westus2 24xCPU, 220GB, 1xA100 (80GB) no $3.673
-
-Continue? [y/n]: y
+$ dstack apply -f examples/fine-tuning/axolotl/train.dstack.yml
-Provisioning...
----> 100%
+ # BACKEND REGION RESOURCES SPOT PRICE
+ 1 runpod CA-MTL-1 18xCPU, 100GB, A5000:24GB:2 yes $0.22
+ 2 runpod EU-SE-1 18xCPU, 100GB, A5000:24GB:2 yes $0.22
+ 3 gcp us-west4 27xCPU, 150GB, A5000:24GB:3 yes $0.33
-TensorBoard 2.13.0 at http://localhost:6006/ (Press CTRL+C to quit)
+Submit the run axolotl-train? [y/n]: y
-Epoch 0: 100% 1719/1719 [00:18<00:00, 92.32it/s, loss=0.0981, acc=0.969]
-Epoch 1: 100% 1719/1719 [00:18<00:00, 92.32it/s, loss=0.0981, acc=0.969]
-Epoch 2: 100% 1719/1719 [00:18<00:00, 92.32it/s, loss=0.0981, acc=0.969]
+Launching `axolotl-train`...
+---> 100%
+
+{'loss': 1.4967, 'grad_norm': 1.2734375, 'learning_rate': 1.0000000000000002e-06, 'epoch': 0.0}
+ 0% 1/24680 [00:13<95:34:17, 13.94s/it]
+ 6% 73/1300 [00:48<13:57, 1.47it/s]
```
-If the task specifies `ports`, `dstack run` automatically forwards them to your local machine for
-convenient and secure access.
+`dstack apply` automatically uploads the code from the current repo, including your local uncommitted changes.
+To avoid uploading large files, ensure they are listed in `.gitignore`.
-When running the task, `dstack run` mounts the current folder's contents.
+!!! info "Ports"
+    If the task specifies [`ports`](reference/dstack.yml/task.md#_ports), `dstack apply` automatically forwards them to your
+    local machine for convenient and secure access (see the sketch below).
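
For instance, a task exposing TensorBoard could be sketched as follows; `train.py` is a hypothetical script, and 6006 is just an example port:

```yaml
type: task
name: train-with-tensorboard

python: "3.10"

commands:
  - pip install tensorboard
  - tensorboard --logdir results/runs &
  - python train.py  # hypothetical training script

# Forwarded to your local machine when the run starts
ports:
  - 6006
```
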
-[//]: # (TODO: Fleets and idle duration)
+!!! info "Queueing tasks"
+ By default, if `dstack apply` cannot find capacity, the task fails.
+ To queue the task and wait for capacity, specify the [`retry`](reference/dstack.yml/task.md#queueing-tasks)
+ property in the task configuration.
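
A possible sketch is shown below; the exact shape of `retry` should be checked against the reference, as the fields here are assumptions, and `train.py` is a hypothetical script:

```yaml
type: task
name: train-queued

python: "3.10"

commands:
  - python train.py  # hypothetical training script

# Assumed schema: queue the run and retry while there is no capacity
retry:
  on_events: [no-capacity]
  duration: 1d

resources:
  gpu: 24GB
```
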
-??? info ".gitignore"
- If there are large files or folders you'd like to avoid uploading,
- you can list them in `.gitignore`.
+## Manage runs
-??? info "Fleets"
- By default, `dstack run` reuses `idle` instances from one of the existing [fleets](fleets.md).
- If no `idle` instances meet the requirements, it creates a new fleet using one of the configured backends.
+### List runs
- To have the fleet deleted after a certain idle time automatically, set
- [`termination_idle_time`](../reference/dstack.yml/fleet.md#termination_idle_time).
- By default, it's set to `5min`.
+The [`dstack ps`](reference/cli/index.md#dstack-ps) command lists all running jobs and their statuses.
+Use `--watch` (or `-w`) to monitor the live status of runs.
-!!! info "Reference"
- See the [CLI reference](reference/cli/index.md#dstack-run) for more details
- on how `dstack run` works.
+### Stop a run
-## Managing runs
+Once the run exceeds the [`max_duration`](reference/dstack.yml/task.md#max_duration), or when you use [`dstack stop`](reference/cli/index.md#dstack-stop),
+the task is stopped. Use `--abort` or `-x` to stop the run abruptly.
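
For example, to have a run stopped automatically after six hours, a task could set `max_duration` (a sketch; the duration value and `train.py` script are illustrative):

```yaml
type: task
name: train-limited

python: "3.10"

commands:
  - python train.py  # hypothetical training script

# Stop the run automatically after six hours
max_duration: 6h
```
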
-### Listing runs
+[//]: # (TODO: Mention `dstack logs` and `dstack logs -d`)
-The [`dstack ps`](reference/cli/index.md#dstack-ps) command lists all running runs and their status.
+## Manage fleets
-### Stopping runs
+By default, `dstack apply` reuses `idle` instances from one of the existing [fleets](fleets.md),
+or creates a new fleet using one of the configured backends.
-Once you use [`dstack stop`](reference/cli/index.md#dstack-stop) (or when the run exceeds the
-`max_duration`), the instances return to the [fleet](fleets.md).
+!!! info "Idle duration"
+ To ensure the created fleets are deleted automatically, set
+ [`termination_idle_time`](reference/dstack.yml/fleet.md#termination_idle_time).
+ By default, it's set to `5min`.
-[//]: # (TODO: Mention `dstack logs` and `dstack logs -d`)
+!!! info "Creation policy"
+ To ensure `dstack apply` always reuses an existing fleet and doesn't create a new one,
+ pass `--reuse` to `dstack apply` (or set [`creation_policy`](reference/dstack.yml/task.md#creation_policy) to `reuse` in the task configuration).
+ The default policy is `reuse_or_create`.
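
As a sketch, forcing reuse from the configuration side could look like this (the name, resources, and `train.py` script are illustrative):

```yaml
type: task
name: train-reuse

python: "3.10"

commands:
  - python train.py  # hypothetical training script

# Only reuse `idle` instances from existing fleets; don't provision new ones
creation_policy: reuse

resources:
  gpu: 24GB
```
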
## What's next?
-1. Check the [QLoRA :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/blob/master/examples/fine-tuning/qlora/README.md){:target="_blank"} example
-2. Check the [`.dstack.yml` reference](../reference/dstack.yml/task.md) for more details and examples
-3. Browse [all examples :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/tree/master/examples){:target="_blank"}
-4. See [fleets](fleets.md) on how to manage fleets
\ No newline at end of file
+1. Check the [Axolotl](/docs/examples/fine-tuning/axolotl) example
+2. Browse [all examples](/docs/examples)
+3. See [fleets](fleets.md) on how to manage fleets
+
+!!! info "Reference"
+ See [.dstack.yml](reference/dstack.yml/task.md) for all the options supported by
+ tasks, along with multiple examples.
diff --git a/docs/overrides/home.html b/docs/overrides/home.html
index d65ddd63c..89083a996 100644
--- a/docs/overrides/home.html
+++ b/docs/overrides/home.html
@@ -112,8 +112,8 @@
AI container orchestration engine for everyone
- dstack is an open-source orchestration engine that simplifies developing, training, and deploying AI
- models, as well as managing clusters on any cloud or data center.
+ dstack is a lightweight alternative to Kubernetes for AI. It simplifies container orchestration for
+ AI on any cloud or on-premises, accelerating the development, training, and deployment of models.
@@ -228,10 +228,11 @@ Dev environments
Tasks
- Tasks allow for convenient scheduling of various batch jobs, such as training, fine-tuning, or
- data processing, as well as running web applications.
- You can run tasks on a single machine or on a cluster of nodes.
+ A task allows you to schedule a job or run a web app. It lets you configure dependencies,
+ resources, ports, and more. Tasks can be distributed and run on clusters.
+ Tasks are ideal for training and fine-tuning jobs or running apps
+ for development purposes.
Tasks
Services
- Services make it very easy to deploy any kind of model as public,
- secure, and scalable endpoints.
+ A service allows you to deploy a web app or a model as a scalable endpoint. It lets you configure
+ dependencies, resources, authorization, auto-scaling rules, etc.
@@ -343,9 +344,9 @@
-
-
+
Axolotl
diff --git a/examples/.dstack.yml b/examples/.dstack.yml
index 143c04aec..9a09e8641 100644
--- a/examples/.dstack.yml
+++ b/examples/.dstack.yml
@@ -1,10 +1,16 @@
type: dev-environment
+# The name is optional, if not specified, generated randomly
name: vscode
-# This configuration launches a blank dev environment
-
python: "3.11"
+# Uncomment to use a custom Docker image
+#image: dstackai/base:py3.10-0.4-cuda-12.1
ide: vscode
+# Use either spot or on-demand instances
spot_policy: auto
+
+# Uncomment to request resources
+#resources:
+# gpu: 24GB
\ No newline at end of file
diff --git a/examples/README.md b/examples/README.md
index 7bffa9914..23accbb3f 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -10,7 +10,9 @@ cd dstack
dstack init
```
-Now you are ready to run examples! Select any example from the left-hand sidebar.
+Now you are ready to run examples!
+
+> Browse the examples using the menu on the left.
## Source code
diff --git a/examples/fine-tuning/alignment-handbook/.dstack.yml b/examples/fine-tuning/alignment-handbook/.dstack.yml
index 5d121d1ef..fc97d6b96 100644
--- a/examples/fine-tuning/alignment-handbook/.dstack.yml
+++ b/examples/fine-tuning/alignment-handbook/.dstack.yml
@@ -1,4 +1,5 @@
type: dev-environment
+# The name is optional, if not specified, generated randomly
name: ah-vscode
# If `image` is not specified, dstack uses its default image
@@ -25,5 +26,8 @@ ide: vscode
spot_policy: auto
resources:
- # Minimum 24GB, one or more GPU
- gpu: 24GB..:1..
\ No newline at end of file
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPU
+ count: 1..
\ No newline at end of file
diff --git a/examples/fine-tuning/alignment-handbook/config.yaml b/examples/fine-tuning/alignment-handbook/config.yaml
index fee2964b4..330c43c10 100644
--- a/examples/fine-tuning/alignment-handbook/config.yaml
+++ b/examples/fine-tuning/alignment-handbook/config.yaml
@@ -40,7 +40,7 @@ gradient_accumulation_steps: 2
gradient_checkpointing: true
gradient_checkpointing_kwargs:
use_reentrant: false
-hub_model_id: chansung/coding_llamaduo_60k_v0.2
+hub_model_id: peterschmidt85/coding_llamaduo_60k_v0.2
hub_strategy: every_save
learning_rate: 2.0e-04
log_level: info
diff --git a/examples/fine-tuning/alignment-handbook/fleet-distrib.dstack.yml b/examples/fine-tuning/alignment-handbook/fleet-distrib.dstack.yml
index 0fa773f16..10c049017 100644
--- a/examples/fine-tuning/alignment-handbook/fleet-distrib.dstack.yml
+++ b/examples/fine-tuning/alignment-handbook/fleet-distrib.dstack.yml
@@ -2,16 +2,19 @@ type: fleet
# The name is optional, if not specified, generated randomly
name: ah-fleet-distrib
+# Number of instances in fleet
+nodes: 2
+# Ensure instances are interconnected
+placement: cluster
+
# Use either spot or on-demand instances
spot_policy: auto
-# Terminate the instance if not used for one hour
+# Terminate instances if not used for one hour
termination_idle_time: 1h
resources:
- # Change to what is required
- gpu: 24GB
-
-# Specify a number of instances
-nodes: 2
-# Ensure instances are interconnected
-placement: cluster
\ No newline at end of file
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPU
+ count: 1..
\ No newline at end of file
diff --git a/examples/fine-tuning/alignment-handbook/fleet.dstack.yml b/examples/fine-tuning/alignment-handbook/fleet.dstack.yml
index 2388cc745..d8ae8872d 100644
--- a/examples/fine-tuning/alignment-handbook/fleet.dstack.yml
+++ b/examples/fine-tuning/alignment-handbook/fleet.dstack.yml
@@ -2,14 +2,17 @@ type: fleet
# The name is optional, if not specified, generated randomly
name: ah-fleet
+# Number of instances in fleet
+nodes: 1
+
# Use either spot or on-demand instances
spot_policy: auto
# Terminate the instance if not used for one hour
termination_idle_time: 1h
resources:
- # Change to what is required
- gpu: 24GB
-
-# Need one instance only
-nodes: 1
\ No newline at end of file
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPU
+ count: 1..
\ No newline at end of file
diff --git a/examples/fine-tuning/alignment-handbook/train-distrib.dstack.yml b/examples/fine-tuning/alignment-handbook/train-distrib.dstack.yml
index cf09c0290..b33902a5d 100644
--- a/examples/fine-tuning/alignment-handbook/train-distrib.dstack.yml
+++ b/examples/fine-tuning/alignment-handbook/train-distrib.dstack.yml
@@ -25,17 +25,20 @@ commands:
--machine_rank=$DSTACK_NODE_RANK
--num_processes=$DSTACK_GPUS_NUM
--num_machines=$DSTACK_NODES_NUM
- scripts/run_sft.py
+ scripts/run_sft.py
../examples/fine-tuning/alignment-handbook/config.yaml
# Expose 6006 to access TensorBoard
ports:
- 6006
-# The number of interconnected instances required
+# Number of instances in cluster
nodes: 2
resources:
- # Required resources
- gpu: 24GB
- # Shared memory size for inter-process communication
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPU
+ count: 1..
+ # Shared memory (for multi-gpu)
shm_size: 24GB
\ No newline at end of file
diff --git a/examples/fine-tuning/alignment-handbook/train.dstack.yml b/examples/fine-tuning/alignment-handbook/train.dstack.yml
index fc57a2adc..a52a3b08f 100644
--- a/examples/fine-tuning/alignment-handbook/train.dstack.yml
+++ b/examples/fine-tuning/alignment-handbook/train.dstack.yml
@@ -28,5 +28,8 @@ commands:
# - 6006
resources:
- # Required resources
- gpu: 24GB
\ No newline at end of file
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPU
+ count: 1..
\ No newline at end of file
diff --git a/examples/fine-tuning/axolotl/.dstack.yml b/examples/fine-tuning/axolotl/.dstack.yml
index f419d6ac8..4b9096cfa 100644
--- a/examples/fine-tuning/axolotl/.dstack.yml
+++ b/examples/fine-tuning/axolotl/.dstack.yml
@@ -1,4 +1,5 @@
type: dev-environment
+# The name is optional, if not specified, generated randomly
name: axolotl-vscode
# Using the official Axolotl's Docker image
@@ -15,5 +16,8 @@ ide: vscode
spot_policy: auto
resources:
- # Two or more 24GB GPUs (required by FSDP)
- gpu: 24GB:2..
\ No newline at end of file
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # Two or more GPU
+ count: 2..
\ No newline at end of file
diff --git a/examples/fine-tuning/axolotl/README.md b/examples/fine-tuning/axolotl/README.md
index 85dc7d77a..c7ffd762b 100644
--- a/examples/fine-tuning/axolotl/README.md
+++ b/examples/fine-tuning/axolotl/README.md
@@ -47,13 +47,13 @@ env:
# Commands of the task
commands:
- accelerate launch -m axolotl.cli.train examples/fine-tuning/axolotl/config.yaml
-# Expose 6006 to access TensorBoard
-ports:
- - 6006
resources:
- # Two or more 24GB GPUs (required by FSDP)
- gpu: 24GB:2..
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # Two or more GPU
+ count: 2..
```
The task uses Axolotl's Docker image, where Axolotl is already pre-installed.
@@ -67,9 +67,6 @@ WANDB_API_KEY=...
dstack apply -f examples/fine-tuning/axolotl/train.dstack.yml
```
-If you list `tensorbord` via `report_to` in [`examples/fine-tuning/axolotl/config.yaml`](https://github.com/dstackai/dstack/blob/master/examples/fine-tuning/axolotl/config.yaml),
-you'll be able to access experiment metrics via `http://localhost:6006` (while the task is running).
-
## Fleets
> By default, `dstack run` reuses `idle` instances from one of the existing [fleets](https://dstack.ai/docs/fleets).
diff --git a/examples/fine-tuning/axolotl/config.yaml b/examples/fine-tuning/axolotl/config.yaml
index 5087bc434..7f3c08745 100644
--- a/examples/fine-tuning/axolotl/config.yaml
+++ b/examples/fine-tuning/axolotl/config.yaml
@@ -79,4 +79,4 @@ fsdp_config:
special_tokens:
pad_token: <|end_of_text|>
-hub_model_id: chansung/axolotl_llama3_8b_fsdp_qlora
\ No newline at end of file
+hub_model_id: peterschmidt85/axolotl_llama3_8b_fsdp_qlora
\ No newline at end of file
diff --git a/examples/fine-tuning/axolotl/fleet.dstack.yml b/examples/fine-tuning/axolotl/fleet.dstack.yml
index b3aefe6a9..0a10d67e1 100644
--- a/examples/fine-tuning/axolotl/fleet.dstack.yml
+++ b/examples/fine-tuning/axolotl/fleet.dstack.yml
@@ -2,14 +2,16 @@ type: fleet
# The name is optional, if not specified, generated randomly
name: axolotl-fleet
+# Number of instances in fleet
+nodes: 1
+
# Use either spot or on-demand instances
spot_policy: auto
# Terminate the instance if not used for one hour
termination_idle_time: 1h
resources:
- # Two or more 24GB GPUs (required by FSDP)
- gpu: 24GB:2..
-
-# Need one instance only
-nodes: 1
\ No newline at end of file
+  gpu:
+    # 24GB or more vRAM
+    memory: 24GB..
+    # Two or more GPU (required by FSDP)
+    count: 2..
\ No newline at end of file
diff --git a/examples/fine-tuning/axolotl/train.dstack.yaml b/examples/fine-tuning/axolotl/train.dstack.yaml
index 9accbe5fc..b81c5fc8c 100644
--- a/examples/fine-tuning/axolotl/train.dstack.yaml
+++ b/examples/fine-tuning/axolotl/train.dstack.yaml
@@ -12,10 +12,10 @@ env:
# Commands of the task
commands:
- accelerate launch -m axolotl.cli.train examples/fine-tuning/axolotl/config.yaml
-# Uncomment to access TensorBoard
-#ports:
-# - 6006
resources:
- # Two or more 24GB GPUs (required by FSDP)
- gpu: 24GB:2..
\ No newline at end of file
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # Two or more GPU (required by FSDP)
+ count: 2..
\ No newline at end of file
diff --git a/examples/fine-tuning/qlora/train.dstack.yml b/examples/fine-tuning/qlora/train.dstack.yml
index 029c472de..a51bb1ff0 100644
--- a/examples/fine-tuning/qlora/train.dstack.yml
+++ b/examples/fine-tuning/qlora/train.dstack.yml
@@ -1,5 +1,4 @@
type: task
-# This task fine-tunes Llama 2 with QLoRA. Learn more at https://dstack.ai/examples/qlora/
python: "3.11"
diff --git a/examples/fine-tuning/trl/.dstack.yml b/examples/fine-tuning/trl/.dstack.yml
new file mode 100644
index 000000000..13685d624
--- /dev/null
+++ b/examples/fine-tuning/trl/.dstack.yml
@@ -0,0 +1,35 @@
+type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: trl-vscode
+
+# If `image` is not specified, dstack uses its default image
+python: "3.10"
+
+# Required environment variables
+env:
+ - HUGGING_FACE_HUB_TOKEN
+ - ACCELERATE_LOG_LEVEL=info
+ - WANDB_API_KEY
+# Uncomment if you want the dependencies to be pre-installed
+#init:
+# - conda install cuda
+# - pip install flash-attn --no-build-isolation
+# - pip install "transformers>=4.43.2"
+# - pip install bitsandbytes
+# - pip install peft
+# - pip install wandb
+# - git clone https://github.com/huggingface/trl
+# - cd trl
+# - pip install .
+
+ide: vscode
+
+# Use either spot or on-demand instances
+spot_policy: auto
+
+resources:
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPU
+ count: 1..
\ No newline at end of file
diff --git a/examples/fine-tuning/trl/train-distrib.dstack.yml b/examples/fine-tuning/trl/train-distrib.dstack.yml
new file mode 100644
index 000000000..f17d42997
--- /dev/null
+++ b/examples/fine-tuning/trl/train-distrib.dstack.yml
@@ -0,0 +1,62 @@
+type: task
+# The name is optional, if not specified, generated randomly
+name: trl-train-distrib
+
+python: "3.10"
+
+# Required environment variables
+env:
+ - HUGGING_FACE_HUB_TOKEN
+ - ACCELERATE_LOG_LEVEL=info
+ - WANDB_API_KEY
+# Commands of the task
+commands:
+ - conda install cuda
+ - pip install "transformers>=4.43.2"
+ - pip install bitsandbytes
+ - pip install flash-attn --no-build-isolation
+ - pip install peft
+ - pip install wandb
+ - git clone https://github.com/huggingface/trl
+ - cd trl
+ - pip install .
+ - accelerate launch
+ --config_file=examples/accelerate_configs/fsdp_qlora.yaml
+ --main_process_ip=$DSTACK_MASTER_NODE_IP
+ --main_process_port=8008
+ --machine_rank=$DSTACK_NODE_RANK
+ --num_processes=$DSTACK_GPUS_NUM
+ --num_machines=$DSTACK_NODES_NUM
+ examples/scripts/sft.py
+ --model_name meta-llama/Meta-Llama-3.1-8B
+ --dataset_name OpenAssistant/oasst_top1_2023-08-25
+ --dataset_text_field="text"
+ --per_device_train_batch_size 1
+ --per_device_eval_batch_size 1
+ --gradient_accumulation_steps 4
+ --learning_rate 2e-4
+ --report_to wandb
+ --bf16
+ --max_seq_length 1024
+ --lora_r 16 --lora_alpha 32
+ --lora_target_modules q_proj k_proj v_proj o_proj
+ --load_in_4bit
+ --use_peft
+ --attn_implementation "flash_attention_2"
+ --logging_steps=10
+ --output_dir models/llama31
+ --hub_model_id peterschmidt85/FineLlama-3.1-8B
+ --torch_dtype bfloat16
+ --use_bnb_nested_quant
+
+# Size of the cluster
+nodes: 2
+
+resources:
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPU
+ count: 1..
+ # Shared memory (for multi-gpu)
+ shm_size: 24GB
\ No newline at end of file
diff --git a/examples/fine-tuning/trl/train.dstack.yml b/examples/fine-tuning/trl/train.dstack.yml
new file mode 100644
index 000000000..f965654ac
--- /dev/null
+++ b/examples/fine-tuning/trl/train.dstack.yml
@@ -0,0 +1,51 @@
+type: task
+# The name is optional, if not specified, generated randomly
+name: trl-train
+
+python: "3.10"
+
+# Required environment variables
+env:
+ - HUGGING_FACE_HUB_TOKEN
+ - ACCELERATE_LOG_LEVEL=info
+ - WANDB_API_KEY
+# Commands of the task
+commands:
+ - conda install cuda
+ - pip install "transformers>=4.43.2"
+ - pip install bitsandbytes
+ - pip install flash-attn --no-build-isolation
+ - pip install peft
+ - pip install wandb
+ - git clone https://github.com/huggingface/trl
+ - cd trl
+ - pip install .
+ - accelerate launch
+ --config_file=examples/accelerate_configs/multi_gpu.yaml
+ --num_processes $DSTACK_GPUS_PER_NODE
+ examples/scripts/sft.py
+ --model_name meta-llama/Meta-Llama-3.1-8B
+ --dataset_name OpenAssistant/oasst_top1_2023-08-25
+ --dataset_text_field="text"
+ --per_device_train_batch_size 1
+ --per_device_eval_batch_size 1
+ --gradient_accumulation_steps 4
+ --learning_rate 2e-4
+ --report_to wandb
+ --bf16
+ --max_seq_length 1024
+ --lora_r 16 --lora_alpha 32
+ --lora_target_modules q_proj k_proj v_proj o_proj
+ --load_in_4bit
+ --use_peft
+ --attn_implementation "flash_attention_2"
+ --logging_steps=10
+ --output_dir models/llama31
+ --hub_model_id peterschmidt85/FineLlama-3.1-8B
+
+resources:
+ gpu:
+ # 24GB or more vRAM
+ memory: 24GB..
+ # One or more GPU
+ count: 1..
\ No newline at end of file
diff --git a/examples/llms/llama31/.dstack.yml b/examples/llms/llama31/.dstack.yml
new file mode 100644
index 000000000..b9782c82a
--- /dev/null
+++ b/examples/llms/llama31/.dstack.yml
@@ -0,0 +1,20 @@
+type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: llama31-vscode
+
+# If `image` is not specified, dstack uses its default image
+python: "3.10"
+
+# Required environment variables
+env:
+ - HUGGING_FACE_HUB_TOKEN
+ide: vscode
+
+# Use either spot or on-demand instances
+spot_policy: auto
+# Uncomment to ensure it doesn't create a new fleet
+#creation_policy: reuse
+
+resources:
+ # Required resources
+ gpu: 24GB
diff --git a/examples/llms/llama31/fleet.dstack.yml b/examples/llms/llama31/fleet.dstack.yml
new file mode 100644
index 000000000..51136e5cd
--- /dev/null
+++ b/examples/llms/llama31/fleet.dstack.yml
@@ -0,0 +1,15 @@
+type: fleet
+# The name is optional, if not specified, generated randomly
+name: llama31-fleet
+
+# Need one instance only
+nodes: 1
+
+# Use either spot or on-demand instances
+spot_policy: auto
+# Terminate the instance if not used for one hour
+termination_idle_time: 1h
+
+resources:
+ # Required resources
+ gpu: 24GB
diff --git a/examples/llms/llama31/service.dstack.yml b/examples/llms/llama31/service.dstack.yml
new file mode 100644
index 000000000..400d02379
--- /dev/null
+++ b/examples/llms/llama31/service.dstack.yml
@@ -0,0 +1,30 @@
+type: service
+# The name is optional, if not specified, generated randomly
+name: llama31-service
+
+# If `image` is not specified, dstack uses its base image
+python: "3.10"
+
+# Required environment variables
+env:
+ - HUGGING_FACE_HUB_TOKEN
+commands:
+  - pip install vllm==0.5.3.post1
+ - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
+# Expose the vllm server port
+port: 8000
+
+# Use either spot or on-demand instances
+spot_policy: auto
+# Uncomment to ensure it doesn't create a new fleet
+#creation_policy: reuse
+
+resources:
+ # Change to what is required
+ gpu: 24GB
+
+# Comment if you don't want to access the model via https://gateway.<gateway domain>
+model:
+ type: chat
+ name: meta-llama/Meta-Llama-3.1-8B-Instruct
+ format: openai
\ No newline at end of file
diff --git a/examples/llms/llama31/task.dstack.yml b/examples/llms/llama31/task.dstack.yml
new file mode 100644
index 000000000..1a8516927
--- /dev/null
+++ b/examples/llms/llama31/task.dstack.yml
@@ -0,0 +1,24 @@
+type: task
+name: llama31-task
+
+# If `image` is not specified, dstack uses its default image
+python: "3.10"
+
+# Required environment variables
+env:
+ - HUGGING_FACE_HUB_TOKEN
+commands:
+ - pip install vllm==0.5.3.post1
+ - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
+# Expose the vllm server port
+ports:
+ - 8000
+
+# Use either spot or on-demand instances
+spot_policy: auto
+# Uncomment to ensure it doesn't create a new fleet
+#creation_policy: reuse
+
+resources:
+ # Required resources
+ gpu: 24GB
\ No newline at end of file
diff --git a/mkdocs.yml b/mkdocs.yml
index f07d87d0e..a75e8240d 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -206,6 +206,7 @@ nav:
- Volumes: docs/concepts/volumes.md
- Guides:
- Protips: docs/guides/protips.md
+ - dstack Sky: docs/guides/dstack-sky.md
- Examples: docs/examples
- Reference:
- server/config.yml: docs/reference/server/config.yml.md