From dc120a870188b46a6212b47f7d924224558234a2 Mon Sep 17 00:00:00 2001 From: KevKibe Date: Tue, 20 Feb 2024 17:37:35 +0300 Subject: [PATCH 1/4] restructure of 'How-to-add-a backend' contribution guide --- contributing/BACKENDS.md | 185 +++++++++++++++++++++++++++------------ 1 file changed, 127 insertions(+), 58 deletions(-) diff --git a/contributing/BACKENDS.md b/contributing/BACKENDS.md index 416110e67..27a0a079a 100644 --- a/contributing/BACKENDS.md +++ b/contributing/BACKENDS.md @@ -1,96 +1,165 @@ -# How to add a new backend +# How to add a Backend to dstack.ai +## Introduction -## Overview +Welcome to the Integration Guide for adding a backend by integrating new cloud providers into gpuhunt and extending the capabilities of dstack.
+This document is designed to assist developers and contributors in integrating additional cloud computing resources into dstack. + + +## Overview of Steps 1. Add cloud provider to `gpuhunt` - 1. Add `src/gpuhunt/providers/.py` - 2. Define class attribute `NAME` and implement -2. Add Backend, Compute, and configuration models in `dstack` +2. Integrating a Cloud Provider into dstackai/dstack -## dstackai/gpuhunt +## Adding a cloud provider to dstackai/gpuhunt +To integrate a new cloud provider into `gpuhunt`, follow these steps: -Clone and open https://github.com/dstackai/gpuhunt. Create `Provider` class -in `src/gpuhunt/providers/.py`. +1. **Clone the Repository**: Start by cloning the `gpuhunt` repository from GitHub: +```bash +https://github.com/dstackai/gpuhunt.git +``` + 2. **Create the Provider Class**: Navigate to the `providers` directory and create a new Python file for your provider: +- Path: `src/gpuhunt/providers/.py` +- Replace `` with the name of your cloud provider. -Your class must inherit `AbstractProvider`, have `NAME` class variable, and implement `get` method. Use -optional `query_filter` to speed up the query. Use `balance_resources` if your backend provides fine-grained control on -resources like RAM and CPU to prevent under-optimal configurations (i.e., A100 80GB with 1 GB of RAM). +3. **Implement the Provider Class**: Your class should meet the following criteria: -`get` method is called during catalog generation for `offline` providers and every query for `online` providers. +- **Inherit from `AbstractProvider`**: Ensure your class extends the `AbstractProvider` base class. + ```python + from gpuhunt.providers import AbstractProvider -> There are two types of providers in `gpuhunt`: ->1. `offline` — providers that take a lot of time to get all offers. A catalog is precomputed and stored as csv file ->2. `online` — providers that take a few seconds to get all offers. 
A catalog is computed in a real-time as needed + class Provider(AbstractProvider): + ``` -If your provider is `offline`, also add data quality tests to `src/integrity_tests/test_.py` to verify -generated csv files before publication. +- **Define the `NAME` Class Variable**: This should be a unique identifier for your provider. -## dstackai/dstack + ```python + NAME = 'your_provider_name' + ``` -Clone and open https://github.com/dstackai/dstack. Follow `CONTRIBUTING.md` to setup your environment. +- **Implement the `get` Method**: This method is responsible for fetching the available GPU resources from your cloud provider. Implement it according to the `AbstractProvider` interface. -Add your dependencies to `setup.py` in a separate `` section. Also, update `all` section. + ```python + def get(self, query_filter=None): + # Implementation here + ``` +- **Utilize `query_filter`**: (Optional) Use this parameter to speed up the query process by filtering results early on. -Add a new enum entry `BackendType.` at `src/dstack/_internal/core/models/backends/base.py`. +- **Apply `balance_resources`**: If your backend offers detailed control over resources (like RAM and CPU), use this method to prevent configurations that are not optimal, such as pairing a high-end GPU with insufficient RAM. -Create `src/dstack/_internal/core/backends/` directory: + ```python + def balance_resources(self, configurations): + # Implementation to balance or filter configurations + ``` -- Implement `YourProviderBackend` in `__init__.py`, inherit it from `BaseBackend`. - - Define the `TYPE` class variable. -- Implement `Compute` in `compute.py`, and inherit it from `Compute`. - - Implement `get_offers`. It will be called every time the user wants to provision something. Add availability - information if possible. - - Implement `run_job`. Here you create a compute resource and run `dstack-shim` or `dstack-runner`. - - Implement `terminate_instance`. 
This method should not raise an error, if there is no such instance. -- Implement `Config` in `config.py`, inherit it from `BackendConfig` and `StoredConfig`. - This config is accepted by `Backend` class. +4. **Understand Provider Types**: +- `gpuhunt` distinguishes between two types of providers: + 1. **`offline`**: These providers take a significant amount of time to retrieve all offers. A catalog is precomputed and stored as a CSV file. + 2. **`online`**: These providers can fetch all offers within a few seconds. A catalog is computed in real-time as needed. -> There are two types of compute in `dstask`: ->1. `dockerized: False` — the backend runs `dstack-shim`. Later, `dstack-shim` will create a job container - with `dstack-runner` in it. This is common for VM. ->2. `dockerized: True` — the backend runs `dstack-runner` inside a docker container. -> Note, that the Compute class interface is subject to changes with the coming pools feature release. +5. **Data Quality Tests for Offline Providers**: +- If your provider is classified as `offline`, you should add data quality tests to ensure the integrity of the precomputed CSV files. These tests are located in: + ``` + src/integrity_tests/test_.py + ``` +- Replace `` with the name of your cloud provider. These tests verify the generated CSV files before publication to ensure accuracy and reliability. -Create configuration models in `src/dstack/_internal/core/models/backends/.py`. `ConfigInfo` -contains everything except for the credentials. You may have multiple models for credentials (i.e., default -credentials & explicit credentials). Create a model with creds: `ConfigInfoWithCreds`. Create a model with -all fields being optional: `ConfigInfoWithCredsPartial`. Create a model representing UI elements for -configurator: `ConfigValues`. -Import all created models to `src/dstack/_internal/core/models/backends/__init__.py`. 
+## Integrating a Cloud Provider into dstackai/dstack -Implement `Configurator` -in `src/dstack/_internal/server/services/backends/configurators/.py` +Integrating a new cloud provider into `dstack` involves several key steps, from setting up your development environment to implementing specific backend configurations. Here’s how to proceed: -Add `Config` in `src/dstack/_internal/server/services/config.py`. This model represents the YAML -configuration. +### Setup and Initial Configuration -Add safe import for your backend in `src/dstack/_internal/server/services/backends/__init__.py`. Update expected -backends in tests in `src/tests/_internal/server/routers/test_backends.py`. +1. **Clone the `dstack` Repository**: Begin by cloning the `dstack` repository from GitHub: -## Appendix +```bash +git clone https://github.com/dstackai/dstack.git +``` + +2. **Follow Setup Instructions**: Consult the `CONTRIBUTING.md` document within the repository for instructions on setting up your development environment. + +### Modifying `setup.py` + +1. **Add Dependencies**: Incorporate any dependencies required by your cloud provider into `setup.py`. Create a separate section named `` for these dependencies and ensure to update the `all` section to include them. + +### Extending Backend Models + +1. **Add Backend Type**: Insert a new enumeration entry for your backend in `src/dstack/_internal/core/models/backends/base.py`: + +```python + = '' +``` +2. **Create Provider Directory**: Establish a new directory at `src/dstack/_internal/core/backends/ `to house your provider’s backend and compute implementations. + + +3. **Backend Implementation:** +In `__init__.py`, implement `YourProviderBackend`, inheriting from `BaseBackend`. Define the `TYPE` class variable to associate your backend with the newly added enum entry. + +4. **Compute Implementation:** +In `compute.py`, develop `Compute`, inheriting from `Compute`.
+You'll need to implement methods like + - `get_offers` It will be called every time the user wants to provision something. Add availability information if possible. + - `run_job` Here you create a compute resource and run `dstack-shim` or `dstack-runner`. + - `terminate_instance` This method should not raise an error, if there is no such instance. + +5. **Configuration Implementation**: +- Implement the `Config` class in `config.py`, inheriting from both `BackendConfig` and `StoredConfig`. This configuration is accepted by the `Backend` class. + + +### Configuration Models + 1. **Create Configuration Models:** -### Adding VM compute backend +You may have multiple models for credentials (i.e., default credentials & explicit credentials). + In `src/dstack/_internal/core/models/backends/.py`, create models for your provider's configuration: +- `ConfigInfo:` create a model with all configuration details except credentials. +- `ConfigInfoWithCreds`: create a model with credentials. +- `ConfigInfoWithCredsPartial`: create a model with all fields optional. +- `ConfigValues:` create a model representing UI elements for configurator. -`dstack` expects the following features from your backend: +2. **Import Models:** +Ensure all new models are imported into `src/dstack/_internal/core/models/backends/__init__.py`. + +### Finalizing Integration +1. **Implement Configurator:** +Develop `Configurator` in `src/dstack/_internal/server/services/backends/configurators/.py`. + +2. **Add YAML Configuration Model:** +Insert `Config` in `src/dstack/_internal/server/services/config.py` to represent the provider’s configuration in YAML. + +3. **Ensure Safe Import:** +Add a safe import for your backend in `src/dstack/_internal/server/services/backends/__init__.py` and update expected backends in tests within `src/tests/_internal/server/routers/test_backends.py.` + + +Note: There are two types of compute in dstack: + +- `dockerized: False` — the backend runs `dstack-shim`. 
This setup is common for VMs. +- `dockerized: True`— the backend directly runs `dstack-runner` inside a docker container. + +The Compute class interface may undergo changes with the upcoming pools feature release, so keep an eye out for updates. + + +## Appendix +### Adding VM Compute Backend +dstack expects VM backends to have: - Ubuntu 22.04 LTS - Nvidia Drivers 535 - Docker with Nvidia runtime - OpenSSH server - External IP & 1 port for SSH (any) -- cloud-init script (preferable) +- cloud-init script (preferred) - API for creating and terminating instances -To accelerate provisioning — we prebuild VM images with necessary dependencies. You can find configurations -in `packer/`. - -### Adding Docker-only compute backend +To speed up provisioning, we prebuild VM images with necessary dependencies, available in `packer/`. -`dstack` expects the following features from your backend: +### Adding Docker-only Compute Backend +For Docker-only backends, dstack requires: - Docker with Nvidia runtime - External IP & 1 port for SSH (any) - Container entrypoint override (~2KB) -- API for creating and terminating containers \ No newline at end of file +- API for creating and terminating containers + + + From 400e4887d628790602e7743e3f5ede2975a1697c Mon Sep 17 00:00:00 2001 From: KevKibe Date: Thu, 22 Feb 2024 14:37:29 +0300 Subject: [PATCH 2/4] updated with requested changes --- contributing/BACKENDS.md | 49 +++++++++++++++++++--------------------- 1 file changed, 23 insertions(+), 26 deletions(-) diff --git a/contributing/BACKENDS.md b/contributing/BACKENDS.md index 27a0a079a..9b8b475ba 100644 --- a/contributing/BACKENDS.md +++ b/contributing/BACKENDS.md @@ -18,8 +18,8 @@ To integrate a new cloud provider into `gpuhunt`, follow these steps: https://github.com/dstackai/gpuhunt.git ``` 2. 
**Create the Provider Class**: Navigate to the `providers` directory and create a new Python file for your provider: -- Path: `src/gpuhunt/providers/.py` -- Replace `` with the name of your cloud provider. +- Path: `src/gpuhunt/providers/.py` +- Replace `` with the name of your cloud provider. 3. **Implement the Provider Class**: Your class should meet the following criteria: @@ -33,23 +33,18 @@ https://github.com/dstackai/gpuhunt.git - **Define the `NAME` Class Variable**: This should be a unique identifier for your provider. ```python - NAME = 'your_provider_name' + NAME = '_name' ``` -- **Implement the `get` Method**: This method is responsible for fetching the available GPU resources from your cloud provider. Implement it according to the `AbstractProvider` interface. +- **Implement the `get` Method**: This method is responsible for fetching information about the available GPU resources from your cloud provider. Implement it according to the `AbstractProvider` interface. ```python - def get(self, query_filter=None): + def get(self, query_filter: Optional[QueryFilter] = None, balance_resources: bool = True) -> List[RawCatalogItem]: # Implementation here ``` - **Utilize `query_filter`**: (Optional) Use this parameter to speed up the query process by filtering results early on. -- **Apply `balance_resources`**: If your backend offers detailed control over resources (like RAM and CPU), use this method to prevent configurations that are not optimal, such as pairing a high-end GPU with insufficient RAM. - - ```python - def balance_resources(self, configurations): - # Implementation to balance or filter configurations - ``` +- **Use `balance_resources`**: If your backend offers detailed control over resources (like RAM and CPU), use this parameter to prevent configurations that are not optimal, such as pairing a high-end GPU with insufficient RAM (e.g., A100 80GB with 1 GB of RAM). 4. 
**Understand Provider Types**: - `gpuhunt` distinguishes between two types of providers: @@ -60,9 +55,9 @@ https://github.com/dstackai/gpuhunt.git 5. **Data Quality Tests for Offline Providers**: - If your provider is classified as `offline`, you should add data quality tests to ensure the integrity of the precomputed CSV files. These tests are located in: ``` - src/integrity_tests/test_.py + src/integrity_tests/test_.py ``` -- Replace `` with the name of your cloud provider. These tests verify the generated CSV files before publication to ensure accuracy and reliability. +- Replace `` with the name of your cloud provider. These tests verify the generated CSV files before publication to ensure accuracy and reliability. ## Integrating a Cloud Provider into dstackai/dstack @@ -81,7 +76,7 @@ git clone https://github.com/dstackai/dstack.git ### Modifying `setup.py` -1. **Add Dependencies**: Incorporate any dependencies required by your cloud provider into `setup.py`. Create a separate section named `` for these dependencies and ensure to update the `all` section to include them. +1. **Add Dependencies**: Incorporate any dependencies required by your cloud provider into `setup.py`. Create a separate section named `` for these dependencies and ensure to update the `all` section to include them. ### Extending Backend Models @@ -90,18 +85,18 @@ git clone https://github.com/dstackai/dstack.git ```python = '' ``` -2. **Create Provider Directory**: Establish a new directory at `src/dstack/_internal/core/backends/ `to house your provider’s backend and compute implementations. +2. **Create Provider Directory**: Establish a new directory at `src/dstack/_internal/core/backends/ `to house your provider’s backend and compute implementations. 3. **Backend Implementation:** -In `__init__.py`, implement `YourProviderBackend`, inheriting from `BaseBackend`. Define the `TYPE` class variable to associate your backend with the newly added enum entry. 
+In `__init__.py`, implement `Backend`, inheriting from `BaseBackend`. Define the `TYPE` class variable to associate your backend with the newly added enum entry. 4. **Compute Implementation:** In `compute.py`, develop `Compute`, inheriting from `Compute`.
You'll need to implement methods like - - `get_offers` It will be called every time the user wants to provision something. Add availability information if possible. - - `run_job` Here you create a compute resource and run `dstack-shim` or `dstack-runner`. - - `terminate_instance` This method should not raise an error, if there is no such instance. + - `get_offers` It will be called every time the user wants to provision something. Add availability information if possible. + - `run_job` Here you create a compute resource and run `dstack-shim` or `dstack-runner`. + - `terminate_instance` This method should not raise an error, if there is no such instance. 5. **Configuration Implementation**: - Implement the `Config` class in `config.py`, inheriting from both `BackendConfig` and `StoredConfig`. This configuration is accepted by the `Backend` class. @@ -111,7 +106,7 @@ You'll need to implement methods like 1. **Create Configuration Models:** You may have multiple models for credentials (i.e., default credentials & explicit credentials). - In `src/dstack/_internal/core/models/backends/.py`, create models for your provider's configuration: + In `src/dstack/_internal/core/models/backends/.py`, create models for your provider's configuration: - `ConfigInfo:` create a model with all configuration details except credentials. - `ConfigInfoWithCreds`: create a model with credentials. - `ConfigInfoWithCredsPartial`: create a model with all fields optional. @@ -122,7 +117,7 @@ Ensure all new models are imported into `src/dstack/_internal/core/models/backen ### Finalizing Integration 1. **Implement Configurator:** -Develop `Configurator` in `src/dstack/_internal/server/services/backends/configurators/.py`. +Develop `Configurator` in `src/dstack/_internal/server/services/backends/configurators/.py`. 2. **Add YAML Configuration Model:** Insert `Config` in `src/dstack/_internal/server/services/config.py` to represent the provider’s configuration in YAML. 
@@ -131,12 +126,7 @@ Insert `Config` in `src/dstack/_internal/server/services/config.py Add a safe import for your backend in `src/dstack/_internal/server/services/backends/__init__.py` and update expected backends in tests within `src/tests/_internal/server/routers/test_backends.py.` -Note: There are two types of compute in dstack: -- `dockerized: False` — the backend runs `dstack-shim`. This setup is common for VMs. -- `dockerized: True`— the backend directly runs `dstack-runner` inside a docker container. - -The Compute class interface may undergo changes with the upcoming pools feature release, so keep an eye out for updates. ## Appendix @@ -150,6 +140,7 @@ dstack expects VM backends to have: - External IP & 1 port for SSH (any) - cloud-init script (preferred) - API for creating and terminating instances +- Safe imports To speed up provisioning, we prebuild VM images with necessary dependencies, available in `packer/`. @@ -160,6 +151,12 @@ For Docker-only backends, dstack requires: - External IP & 1 port for SSH (any) - Container entrypoint override (~2KB) - API for creating and terminating containers +- Safe imports + +Note: There are two types of compute in dstack: +- `dockerized: False` — the backend runs `dstack-shim`. This setup is common for VMs. +- `dockerized: True`— the backend directly runs `dstack-runner` inside a docker container. +The Compute class interface may undergo changes with the upcoming pools feature release, so keep an eye out for updates. From 3fa7a906d0330073fb9583368ce87bf9ae8eb11a Mon Sep 17 00:00:00 2001 From: KevKibe Date: Thu, 22 Feb 2024 14:47:19 +0300 Subject: [PATCH 3/4] updated with requested changes --- contributing/BACKENDS.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/contributing/BACKENDS.md b/contributing/BACKENDS.md index 9b8b475ba..e4b0a5a2d 100644 --- a/contributing/BACKENDS.md +++ b/contributing/BACKENDS.md @@ -93,6 +93,7 @@ In `__init__.py`, implement `Backend`, inheriting from `BaseBacken 4. 
**Compute Implementation:** In `compute.py`, develop `Compute`, inheriting from `Compute`.
+ You'll need to implement methods like - `get_offers` It will be called every time the user wants to provision something. Add availability information if possible. - `run_job` Here you create a compute resource and run `dstack-shim` or `dstack-runner`. @@ -144,6 +145,8 @@ dstack expects VM backends to have: To speed up provisioning, we prebuild VM images with necessary dependencies, available in `packer/`. +Examples: Microsoft Azure, AWS, Google Cloud Platform + ### Adding Docker-only Compute Backend For Docker-only backends, dstack requires: @@ -153,6 +156,8 @@ For Docker-only backends, dstack requires: - API for creating and terminating containers - Safe imports +Examples: Amazon Elastic Container Service(ECS), Google Cloud Run + Note: There are two types of compute in dstack: - `dockerized: False` — the backend runs `dstack-shim`. This setup is common for VMs. From 5066d46bbc0613cbfda17db2138d60291beba8f5 Mon Sep 17 00:00:00 2001 From: KevKibe Date: Thu, 22 Feb 2024 16:19:40 +0300 Subject: [PATCH 4/4] updated with requested changes --- contributing/BACKENDS.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/contributing/BACKENDS.md b/contributing/BACKENDS.md index e4b0a5a2d..719d8473b 100644 --- a/contributing/BACKENDS.md +++ b/contributing/BACKENDS.md @@ -141,11 +141,10 @@ dstack expects VM backends to have: - External IP & 1 port for SSH (any) - cloud-init script (preferred) - API for creating and terminating instances -- Safe imports To speed up provisioning, we prebuild VM images with necessary dependencies, available in `packer/`. 
-Examples: Microsoft Azure, AWS, Google Cloud Platform +Examples: `aws`, `azure`, `gcp`, etc. ### Adding Docker-only Compute Backend For Docker-only backends, dstack requires: @@ -154,9 +153,8 @@ For Docker-only backends, dstack requires: - External IP & 1 port for SSH (any) - Container entrypoint override (~2KB) - API for creating and terminating containers -- Safe imports -Examples: Amazon Elastic Container Service(ECS), Google Cloud Run +Examples: `kubernetes`, `vastai`, etc. Note: There are two types of compute in dstack: