diff --git a/docs/cli/Guides/deploy-app/deploy-app-example.md b/docs/cli/Guides/deploy-app/deploy-app-example.md index 762949db..edb21fd1 100644 --- a/docs/cli/Guides/deploy-app/deploy-app-example.md +++ b/docs/cli/Guides/deploy-app/deploy-app-example.md @@ -98,7 +98,7 @@ RUN chmod +x /usr/local/bin/usd_to_crypto.py COPY entrypoint.sh /usr/local/bin/entrypoint.sh RUN chmod +x /usr/local/bin/entrypoint.sh -# Set /sp as workdir (doesn't matter in this case; entrypoint.sh uses /sp/output as workdir) +# Set /sp as workdir (doesn't matter in this case -- entrypoint.sh uses /sp/output as workdir) WORKDIR /sp # Set entrypoint @@ -137,8 +137,8 @@ tar -czvf input.tar.gz ./input.txt ```shell ./spctl files upload ./input.tar.gz \ ---filename input.tar.gz \ ---output input.resource.json + --filename input.tar.gz \ + --output input.resource.json ``` ### 3. Deploy @@ -147,9 +147,9 @@ Place an order: ```shell ./spctl workflows create \ ---tee 7 \ ---solution ./usd_to_crypto.resource.json \ ---data ./input.resource.json + --tee 7 \ + --solution ./usd_to_crypto.resource.json \ + --data ./input.resource.json ``` Find the order ID in the output, for example: diff --git a/docs/cli/Guides/swarm-vllm.md b/docs/cli/Guides/swarm-vllm.md new file mode 100644 index 00000000..52f4a9ae --- /dev/null +++ b/docs/cli/Guides/swarm-vllm.md @@ -0,0 +1,183 @@ +--- +id: "swarm-vllm" +title: "vLLM on Super Swarm" +slug: "/guides/swarm-vllm" +sidebar_position: 20 +--- + +This guide provides step-by-step instructions for deploying MedGemma and Apertus on Super Swarm using vLLM. 
+ +## Prerequisites + +- [kubectl](https://kubernetes.io/docs/tasks/tools/) +- [helm](https://helm.sh/docs/intro/install/) +- A domain +- For [MedGemma](https://huggingface.co/google/medgemma-1.5-4b-it): a Hugging Face token from an account that has already accepted the model's terms + +Also, download and rename deployment scripts: + +- [`deploy_medgemma_official.sh`](/files/deploy_medgemma_official.sh) +- [`deploy_apertus_official.sh`](/files/deploy_apertus_official.sh) + +## 1. Sign in to Super Swarm + +In the Super Swarm dashboard, sign in using MetaMask: + + +
+ +## 2. Create a Kubernetes cluster + +2.1. Go to **Kubernetes** and press **Create Cluster**: + + +
+
+ +2.2. Add a GPU to the cluster, allocate resources, and press **Create Cluster**: + + +
+ +## 3. Download the cluster configuration file + + +
+ +## 4. Point `kubectl` to the configuration file + +Execute the following command: + +```shell +export KUBECONFIG=-kubeconfig.yaml +``` + +Replace `-kubeconfig.yaml` with the name of the downloaded configuration file. + +## 5. Update the scripts + +In both scripts (`deploy_medgemma_official.sh` and `deploy_apertus_official.sh`), find `BASE_DOMAIN="${BASE_DOMAIN:-monai-swarm.win}"` and replace `monai-swarm.win` with your domain. + +## 6. Set the API key + +Choose a secret key to protect your API endpoints. Execute the following command and type the key (characters won't be displayed): + +```shell +read -rs API_KEY && export API_KEY +``` + +## 7. Deploy the model + +### Apertus + +```shell +bash deploy_apertus_official.sh +``` + +The deployment usually takes 5-7 minutes. + +A working Apertus config is already set in the script: + +``` +dtype=bfloat16 +max-model-len=32768 +gpu-memory-utilization=0.55 +max-num-seqs=8 +max-num-batched-tokens=4096 +``` + +### MedGemma + +```shell +export HF_TOKEN=hf_xxx +bash deploy_medgemma_official.sh +``` + +Replace `hf_xxx` with your Hugging Face token. + +Alternatively, create a `.hf_token` file containing the token next to `deploy_medgemma_official.sh`; the script will read it automatically. + +A working MedGemma config is already set in the script: + +``` +dtype=bfloat16 +max-model-len=8192 +gpu-memory-utilization=0.40 +--mm-processor-cache-gb 1 +max-num-seqs=4 +max-num-batched-tokens=2048 +``` + +## 8. Check Kubernetes + +```shell +kubectl get pods -o wide +kubectl get svc +kubectl get ingress +``` + +Expected output: + +- Two pods in `1/1 Running` +- Two services +- Two ingresses + +## 9. Confirm DNS records + +Back in the Super Swarm dashboard, go to **Ingresses** and note the two hostnames listed there. + + +
+
+ +At your DNS provider, add two records for each hostname: a CNAME record pointing to it and a TXT record for domain verification. + +## 10. Publish the cluster + +In the Super Swarm dashboard, go to **Kubernetes** and publish the cluster. + + +
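Before sending test requests, it can help to confirm that the DNS records from step 9 are visible. The following is a minimal sketch, not part of the deployment scripts: it assumes `dig` is installed, uses `example.com` as a stand-in for your domain, and the function names `api_hosts` and `check_dns` are hypothetical. The `apertus-vllm` and `medgemma-vllm` prefixes mirror the `API_HOST` defaults in the two scripts.

```shell
#!/usr/bin/env bash
# Assumption: example.com is a placeholder for the domain you set in step 5.
BASE_DOMAIN="${BASE_DOMAIN:-example.com}"

# Print the two API hostnames that the deployment scripts use by default.
api_hosts() {
  printf '%s\n' "apertus-vllm.${BASE_DOMAIN}" "medgemma-vllm.${BASE_DOMAIN}"
}

# Print the CNAME target of each hostname (needs dig and network access).
check_dns() {
  local host target
  for host in $(api_hosts); do
    target="$(dig +short CNAME "${host}" || true)"
    echo "${host} -> ${target:-no CNAME record found yet}"
  done
}
```

Run `check_dns` once the records have had time to propagate; a missing target usually means the record is not visible from your resolver yet.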
+ +## 11. Send test requests + +In the test requests below, replace: + +- `` with your domain. +- `` with the key you set in [Step 6](/cli/guides/swarm-vllm#6-set-the-api-key). + +### Apertus + +```shell +curl https://apertus-vllm./v1/completions \ + -H 'Authorization: Bearer ' \ + -H 'Content-Type: application/json' \ + -d '{ + "model": "swiss-ai/Apertus-8B-2509", + "prompt": "Write a concise technical summary of Kubernetes GPU scheduling.", + "temperature": 0, + "max_tokens": 200 + }' +``` + +### MedGemma + +```shell +curl https://medgemma-vllm./v1/chat/completions \ + -H 'Authorization: Bearer ' \ + -H 'Content-Type: application/json' \ + -d '{ + "model": "google/medgemma-1.5-4b-it", + "messages": [ + { + "role": "user", + "content": [ + {"type": "text", "text": "Describe this image briefly."}, + {"type": "image_url", "image_url": {"url": "data:image/png;base64,PASTE_BASE64_HERE"}} + ] + } + ], + "temperature": 0, + "max_tokens": 120 + }' +``` \ No newline at end of file diff --git a/docs/cli/images/create-kubernetes-space.png b/docs/cli/images/create-kubernetes-space.png new file mode 100644 index 00000000..c98291d6 Binary files /dev/null and b/docs/cli/images/create-kubernetes-space.png differ diff --git a/docs/cli/images/ingresses.png b/docs/cli/images/ingresses.png new file mode 100644 index 00000000..86b2a44b Binary files /dev/null and b/docs/cli/images/ingresses.png differ diff --git a/docs/cli/images/kubernetes-create-cluster.png b/docs/cli/images/kubernetes-create-cluster.png new file mode 100644 index 00000000..312d03da Binary files /dev/null and b/docs/cli/images/kubernetes-create-cluster.png differ diff --git a/docs/cli/images/kubernetes-download-kubeconfig.png b/docs/cli/images/kubernetes-download-kubeconfig.png new file mode 100644 index 00000000..cf287367 Binary files /dev/null and b/docs/cli/images/kubernetes-download-kubeconfig.png differ diff --git a/docs/cli/images/kubernetes-publish-cluster.png b/docs/cli/images/kubernetes-publish-cluster.png 
new file mode 100644 index 00000000..bf0c2bfb Binary files /dev/null and b/docs/cli/images/kubernetes-publish-cluster.png differ diff --git a/docs/cli/images/swarm-log-in.png b/docs/cli/images/swarm-log-in.png new file mode 100644 index 00000000..e7abee2f Binary files /dev/null and b/docs/cli/images/swarm-log-in.png differ diff --git a/docs/marketplace/Guides/storage.md b/docs/marketplace/Guides/storage.md index 95915151..21ebb997 100644 --- a/docs/marketplace/Guides/storage.md +++ b/docs/marketplace/Guides/storage.md @@ -7,14 +7,12 @@ sidebar_position: 6 This guide provides step-by-step instructions on how to set up your personal Storj account. -The guide is intended for advanced Web3 users; feel free to skip it and continue using the default recommended option—**Super Protocol cloud**. Read about [types of storage](/marketplace/account/web3#storage). +The guide is intended for advanced users; feel free to skip it and continue using the default recommended option—**Super Protocol cloud**. Read about [types of storage](/marketplace/account#storage). - +

-Web2 users must first [log in as a Web3 user](/marketplace/guides/log-in) to be able to upload to a personal Storj account instead of the Super Protocol cloud. - ## Step 1. Register a Storj account If you don't already have a [Storj](https://www.storj.io/) account, register one. Both free Trial and Pro accounts are suitable. Note that with a Trial account, your files will become inaccessible once the trial period ends. @@ -36,7 +34,7 @@ As a result, you should have two pairs Access Key + Secret Key. ## Step 4. Set up your Super Protocol Web3 account -Open the [Marketplace web app](https://marketplace.superprotocol.com/). Log in as a Web3 user and open the **Account** window. +Open the [Marketplace web app](https://marketplace.superprotocol.com/), sign in, and open the **Account** window.
diff --git a/docs/marketplace/account.md b/docs/marketplace/account.md new file mode 100644 index 00000000..99f147c1 --- /dev/null +++ b/docs/marketplace/account.md @@ -0,0 +1,52 @@ +--- +id: "account" +title: "Enter Marketplace" +slug: "/account" +sidebar_position: 3 +--- + +Super Protocol supports two login methods: + +- Web2 requires an account on one of the supported platforms: + - Google + - Hugging Face + - GitHub + - Microsoft +- Web3 requires a software wallet installed as a browser extension: + - MetaMask + - Trust Wallet + +For instructions on how to set up software wallets and connect them to the Marketplace, read [How to Log In as a Web3 User](/marketplace/guides/log-in). + + +
+
+ +## Account window + +This window shows your user account settings. + + +
+
+ +**User ID**: your unique user ID. + +**Login**: the OAuth2 provider and your login email address. + +The **Get SPPI** button lets you get the tokens required to place orders. + +### Storage + +You can upload files to one of two decentralized storage options: + +- **Super Protocol cloud**: + - Recommended for most users. + - Does not require additional setup. + - Uses Super Protocol's Storj account and thus relies on Super Protocol as the storage provider. +- **Your Storj account**: + - Intended for advanced users. + - Requires creating and [setting up a Storj account](/marketplace/guides/storage). + - Gives you sole control over the uploaded content and storage account. + +Read [How to Set Up Storage](/marketplace/guides/storage) for step-by-step instructions. \ No newline at end of file diff --git a/docs/marketplace/account/index.md b/docs/marketplace/account/index.md deleted file mode 100644 index e2ae472f..00000000 --- a/docs/marketplace/account/index.md +++ /dev/null @@ -1,39 +0,0 @@ ---- -id: "account" -title: "Enter Marketplace" -slug: "/account" -sidebar_position: 3 ---- - -There are two types of accounts in Super Protocol: - -- Web3 User account -- Web2 User account - - -
-
- -## Web3 User account - -_Web3 User account_ provides access to all Marketplace capabilities, including: - -- Full decentralization and sole control of user's funds, models, and datasets. -- Ability to upload models and datasets to the Super Protocol cloud or a personal Storj account. -- Placement of orders using Marketplace offers or the user's own uploaded content. -- Registration of individual providers. -- Creation and monetization of model and dataset offers on the Marketplace. -- Ability to request additional SPPI tokens. - -Read [How to Log In as a Web3 User](/marketplace/guides/log-in) for step-by-step instructions. - -## Web2 User account - -_Web2 User account_ is a quick way to start with the Marketplace. It streamlines a few steps, but this comes at the expense of full decentralization, such as using OAuth2 authentication for login instead of the decentralized MetaMask. - -To log in as a Web2 user, you need an account on one of the supported platforms: - -- Google -- Hugging Face -- GitHub -- Microsoft diff --git a/docs/marketplace/account/web2.md b/docs/marketplace/account/web2.md deleted file mode 100644 index 5e2e2140..00000000 --- a/docs/marketplace/account/web2.md +++ /dev/null @@ -1,31 +0,0 @@ ---- -id: "web2" -title: "Web2 User Account" -slug: "/account/web2" -sidebar_position: 2 ---- - -This window shows the settings of your [Web2 User account](/marketplace/account#web2-user-account). - - -
-
- -**User ID**: your unique user ID. - -**Login**: the OAuth2 provider and your login email address. - -## Storage - -Super Protocol supports two options of decentralized storage to upload files: - -- **Super Protocol cloud**: - - Does not require additional setup. - - Uses Super Protocol's Storj account and thus relies on Super Protocol as the storage provider. - - Costs SPPI tokens for additional storage beyond the basic free package. -- **Your Storj account**: - - Available to Web3 users only. - - Requires creating and setting up a Storj account. - - Gives sole control over the uploaded content and storage account. - -To enable uploading to your personal Storj account, [log in as a Web3 user](/marketplace/guides/log-in). \ No newline at end of file diff --git a/docs/marketplace/account/web3.md b/docs/marketplace/account/web3.md deleted file mode 100644 index dd18bbb3..00000000 --- a/docs/marketplace/account/web3.md +++ /dev/null @@ -1,36 +0,0 @@ ---- -id: "web3" -title: "Web3 User Account" -slug: "/account/web3" -sidebar_position: 1 ---- - -This window allows you to manage your [Web3 User account](/marketplace/account#web3-user-account). - - -
-
- -**User ID**: your unique user ID is your EVM wallet address. - -**Login**: the Web3 login method and the EVM wallet address you are using. Currently, Super Protocol only supports MetaMask as a Web3 login method. - -**Get SPPI** and **Get BNB** buttons allow you to get tokens necessary to place orders: - -- SPPI tokens are required to pay and receive payments in Super Protocol. -- BNB tokens are required to pay for opBNB blockchain transactions. - -## Storage - -You have two options of decentralized storage to upload files: - -- **Super Protocol cloud**: - - Recommended for most users. - - Does not require additional setup. - - Uses Super Protocol's Storj account and thus relies on Super Protocol as the storage provider. -- **Your Storj account**: - - Intended for advanced users. - - Requires creating and setting up a Storj account. - - Gives sole control over the uploaded content and storage account. - -Read [How to Set Up Storage](/marketplace/guides/storage) for step-by-step instructions. 
\ No newline at end of file diff --git a/docs/marketplace/images/web2-account.png b/docs/marketplace/images/web2-account.png index 528db2d7..80cc8bee 100644 Binary files a/docs/marketplace/images/web2-account.png and b/docs/marketplace/images/web2-account.png differ diff --git a/docs/marketplace/images/web3-account.png b/docs/marketplace/images/web3-account.png deleted file mode 100644 index 46480abe..00000000 Binary files a/docs/marketplace/images/web3-account.png and /dev/null differ diff --git a/docusaurus.config.js b/docusaurus.config.js index db773ef3..e38429ec 100644 --- a/docusaurus.config.js +++ b/docusaurus.config.js @@ -46,6 +46,14 @@ const config = { from: "/cli/guides/quick-guide", to: "/cli/guides/deploy-app", }, + { + from: "/marketplace/account/web2", + to: "/marketplace/account#account-window" + }, + { + from: "/marketplace/account/web3", + to: "/marketplace/account#account-window" + }, ], }, ], diff --git a/static/files/deploy_apertus_official.sh b/static/files/deploy_apertus_official.sh new file mode 100755 index 00000000..1487a1c7 --- /dev/null +++ b/static/files/deploy_apertus_official.sh @@ -0,0 +1,154 @@ +#!/usr/bin/env bash +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +BASE_DOMAIN="${BASE_DOMAIN:-monai-swarm.win}" +API_HOST="${API_HOST:-apertus-vllm.${BASE_DOMAIN}}" +MODEL_NAME="${MODEL_NAME:-swiss-ai/Apertus-8B-2509}" +MODEL_ENTRY_NAME="${MODEL_ENTRY_NAME:-apertus}" +RELEASE_NAME="${RELEASE_NAME:-apertus-official}" +if [ -z "${API_KEY:-}" ]; then + echo "API_KEY must be set. Execute:" >&2 + echo "read -rs API_KEY && export API_KEY" >&2 + echo "And then type a desired key." 
>&2 + exit 1 +fi +IMAGE_REPOSITORY="${IMAGE_REPOSITORY:-vllm/vllm-openai}" +IMAGE_TAG="${IMAGE_TAG:-v0.18.0}" +GPU_MEMORY_UTILIZATION="${GPU_MEMORY_UTILIZATION:-0.55}" +MAX_MODEL_LEN="${MAX_MODEL_LEN:-32768}" +CPU_REQUEST="${CPU_REQUEST:-8}" +MEMORY_REQUEST="${MEMORY_REQUEST:-48Gi}" +GPU_COUNT="${GPU_COUNT:-1}" +PVC_STORAGE="${PVC_STORAGE:-80Gi}" +INGRESS_CLASS="${INGRESS_CLASS:-nginx}" + +need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing dependency: $1" >&2; exit 1; }; } +need kubectl +need helm + +NAMESPACE="${NAMESPACE:-$(kubectl config view --minify -o jsonpath='{..namespace}' 2>/dev/null || true)}" +if [ -z "${NAMESPACE}" ]; then + NAMESPACE="llm" +fi + +SECRET_NAME="${RELEASE_NAME}-auth" +SERVICE_NAME="${RELEASE_NAME}-${MODEL_ENTRY_NAME}-engine-service" +DEPLOY_LABEL_MODEL="${MODEL_ENTRY_NAME}" +INGRESS_NAME="${RELEASE_NAME}-api-ingress" + +echo "==> Runtime: vLLM (official helm chart)" +echo "==> Namespace: ${NAMESPACE}" +echo "==> Release: ${RELEASE_NAME}" +echo "==> API host: ${API_HOST}" +echo "==> Model: ${MODEL_NAME}" +echo "==> Model entry name: ${MODEL_ENTRY_NAME}" +echo "==> Image: ${IMAGE_REPOSITORY}:${IMAGE_TAG}" +echo "==> Max model length: ${MAX_MODEL_LEN}" +echo "==> GPU memory utilization: ${GPU_MEMORY_UTILIZATION}" +echo + +kubectl get ns "${NAMESPACE}" >/dev/null 2>&1 || kubectl create ns "${NAMESPACE}" + +helm repo add vllm https://vllm-project.github.io/production-stack >/dev/null 2>&1 || true +helm repo update >/dev/null 2>&1 + +cat < "${VALUES_FILE}" < Pods:" +kubectl -n "${NAMESPACE}" get pods -o wide +echo +echo "==> Services:" +kubectl -n "${NAMESPACE}" get svc -o wide +echo +echo "==> Ingress:" +kubectl -n "${NAMESPACE}" get ingress -o wide +echo +echo "==> Waiting for Apertus pod readiness..." 
+kubectl -n "${NAMESPACE}" wait --for=condition=ready pod \ + -l "model=${DEPLOY_LABEL_MODEL},helm-release-name=${RELEASE_NAME}" \ + --timeout=900s +echo +echo "==> Ready" +echo "Base URL: https://${API_HOST}/v1" +echo "Model: ${MODEL_NAME}" +echo "Example:" +echo " curl https://${API_HOST}/v1/models -H 'Authorization: Bearer ${API_KEY}'" diff --git a/static/files/deploy_medgemma_official.sh b/static/files/deploy_medgemma_official.sh new file mode 100755 index 00000000..7845a04e --- /dev/null +++ b/static/files/deploy_medgemma_official.sh @@ -0,0 +1,170 @@ +#!/usr/bin/env bash +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +BASE_DOMAIN="${BASE_DOMAIN:-monai-swarm.win}" +API_HOST="${API_HOST:-medgemma-vllm.${BASE_DOMAIN}}" +MODEL_NAME="${MODEL_NAME:-google/medgemma-1.5-4b-it}" +MODEL_ENTRY_NAME="${MODEL_ENTRY_NAME:-medgemma}" +RELEASE_NAME="${RELEASE_NAME:-medgemma-official}" +if [ -z "${API_KEY:-}" ]; then + echo "API_KEY must be set. Execute:" >&2 + echo "read -rs API_KEY && export API_KEY" >&2 + echo "And then type a desired key." >&2 + exit 1 +fi +IMAGE_REPOSITORY="${IMAGE_REPOSITORY:-vllm/vllm-openai}" +IMAGE_TAG="${IMAGE_TAG:-v0.18.0}" +GPU_MEMORY_UTILIZATION="${GPU_MEMORY_UTILIZATION:-0.40}" +MAX_MODEL_LEN="${MAX_MODEL_LEN:-8192}" +CPU_REQUEST="${CPU_REQUEST:-8}" +MEMORY_REQUEST="${MEMORY_REQUEST:-48Gi}" +GPU_COUNT="${GPU_COUNT:-1}" +PVC_STORAGE="${PVC_STORAGE:-80Gi}" +INGRESS_CLASS="${INGRESS_CLASS:-nginx}" + +if [ -z "${HF_TOKEN:-}" ] && [ -f "${SCRIPT_DIR}/.hf_token" ]; then + HF_TOKEN="$(cat "${SCRIPT_DIR}/.hf_token")" +fi + +if [ -z "${HF_TOKEN:-}" ]; then + echo "HF_TOKEN is required for ${MODEL_NAME}." >&2 + echo "Set HF_TOKEN in the environment or create ${SCRIPT_DIR}/.hf_token." 
>&2 + exit 1 +fi + +need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing dependency: $1" >&2; exit 1; }; } +need kubectl +need helm + +NAMESPACE="${NAMESPACE:-$(kubectl config view --minify -o jsonpath='{..namespace}' 2>/dev/null || true)}" +if [ -z "${NAMESPACE}" ]; then + NAMESPACE="llm" +fi + +SECRET_NAME="${RELEASE_NAME}-auth" +SERVICE_NAME="${RELEASE_NAME}-${MODEL_ENTRY_NAME}-engine-service" +DEPLOY_LABEL_MODEL="${MODEL_ENTRY_NAME}" +INGRESS_NAME="${RELEASE_NAME}-api-ingress" + +echo "==> Runtime: vLLM (official helm chart)" +echo "==> Namespace: ${NAMESPACE}" +echo "==> Release: ${RELEASE_NAME}" +echo "==> API host: ${API_HOST}" +echo "==> Model: ${MODEL_NAME}" +echo "==> Model entry name: ${MODEL_ENTRY_NAME}" +echo "==> Image: ${IMAGE_REPOSITORY}:${IMAGE_TAG}" +echo "==> Max model length: ${MAX_MODEL_LEN}" +echo "==> GPU memory utilization: ${GPU_MEMORY_UTILIZATION}" +echo + +kubectl get ns "${NAMESPACE}" >/dev/null 2>&1 || kubectl create ns "${NAMESPACE}" + +helm repo add vllm https://vllm-project.github.io/production-stack >/dev/null 2>&1 || true +helm repo update >/dev/null 2>&1 + +cat < "${VALUES_FILE}" < Pods:" +kubectl -n "${NAMESPACE}" get pods -o wide +echo +echo "==> Services:" +kubectl -n "${NAMESPACE}" get svc -o wide +echo +echo "==> Ingress:" +kubectl -n "${NAMESPACE}" get ingress -o wide +echo +echo "==> Waiting for MedGemma pod readiness..." +kubectl -n "${NAMESPACE}" wait --for=condition=ready pod \ + -l "model=${DEPLOY_LABEL_MODEL},helm-release-name=${RELEASE_NAME}" \ + --timeout=900s +echo +echo "==> Ready" +echo "Base URL: https://${API_HOST}/v1" +echo "Model: ${MODEL_NAME}" +echo "Example:" +echo " curl https://${API_HOST}/v1/models -H 'Authorization: Bearer ${API_KEY}'"