
Conversation

@future-xy
Member

This PR adds versioning for the documentation site. It also creates two snapshots (0.7 and 0.8), serving as the past and stable documentation versions.

- Add Docusaurus versioning infrastructure
- Create version snapshots for 0.8.0 (stable) and 0.7.0
- Set up Latest (dev) documentation track
- Add GitHub Actions workflows for automated doc syncing
- Add custom version banner components
- Update configuration and sidebar structure
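The versioning infrastructure above can be sketched roughly as follows. This is an illustrative, stdlib-only approximation of what Docusaurus's `docs:version` command does (freeze the current `docs/` tree into `versioned_docs/` and record the version name in `versions.json`); the function name `snapshot_docs` is hypothetical, not part of Docusaurus:

```python
import json
import shutil
from pathlib import Path

def snapshot_docs(site_dir: str, version: str) -> None:
    """Roughly what `npm run docusaurus docs:version <version>` does:
    freeze the current docs as a named, immutable snapshot."""
    site = Path(site_dir)
    # Copy the live docs into a frozen, versioned copy.
    src = site / "docs"
    dst = site / "versioned_docs" / f"version-{version}"
    shutil.copytree(src, dst)
    # Prepend the new version to versions.json (newest first).
    versions_file = site / "versions.json"
    versions = json.loads(versions_file.read_text()) if versions_file.exists() else []
    versions.insert(0, version)
    versions_file.write_text(json.dumps(versions, indent=2) + "\n")
```

Snapshotting 0.7.0 first and then 0.8.0 leaves 0.8.0 as the newest (stable) entry, while the live `docs/` tree continues to serve the Latest (dev) track.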
@future-xy requested review from Chivier and Copilot on November 6, 2025 at 19:05

Copilot AI left a comment


Pull Request Overview

This PR adds versioned documentation for ServerlessLLM versions 0.8.0 and 0.7.0. The changes include complete documentation sets for both versions covering getting started guides, deployment instructions, feature documentation, and store-specific guides.

Key Changes:

  • Added version tracking in versions.json for versions 0.8.0 and 0.7.0
  • Created versioned sidebar configurations for both versions
  • Added comprehensive documentation including quickstart guides, deployment guides, feature documentation, and ServerlessLLM Store documentation

Reviewed Changes

Copilot reviewed 65 out of 92 changed files in this pull request and generated 6 comments.

Summary per file:

| File | Description |
| --- | --- |
| `versions.json` | Defines available documentation versions |
| `versioned_sidebars/version-0.8.0-sidebars.json` | Sidebar navigation for v0.8.0 docs |
| `versioned_sidebars/version-0.7.0-sidebars.json` | Sidebar navigation for v0.7.0 docs |
| `versioned_docs/version-0.8.0/store/*.md` | Store documentation including quickstart, ROCm guide, and quantization |
| `versioned_docs/version-0.8.0/*.md` | Core documentation including intro, getting started, and features |
| `versioned_docs/version-0.8.0/images/*` | Image assets for documentation |
| `docs/store/category.json` | Category configuration for store documentation |
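For reference, a Docusaurus `versions.json` is simply an ordered JSON array of version names, newest first, so the file added in this PR presumably contains:

```json
["0.8.0", "0.7.0"]
```

Docusaurus reads this array to build the version dropdown and to resolve each version's docs from the matching `versioned_docs/version-<name>/` directory.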



Our api aims to be compatible with the `sharded_state` load format in vLLM. Thus, due to the model modifications about the model architecture done by vLLM, the model format for vLLM is **not** the same as we used in transformers. Thus, the `ServerlessLLM format` mentioned in the subsequent sections means the format integrated with vLLM, which is different from the `ServerlessLLM format` used in the previous sections.

Thus, for fist-time users, you have to load the model from other backends and then converted it to the ServerlessLLM format.

Copilot AI Nov 6, 2025


Corrected spelling of 'fist-time' to 'first-time'.


Our api aims to be compatible with the `sharded_state` load format in vLLM. Thus, due to the model modifications about the model architecture done by vLLM, the model format for vLLM is **not** the same as we used in transformers. Thus, the `ServerlessLLM format` mentioned in the subsequent sections means the format integrated with vLLM, which is different from the `ServerlessLLM format` used in the previous sections.

Thus, for fist-time users, you have to load the model from other backends and then converted it to the ServerlessLLM format.

Copilot AI Nov 6, 2025


Verb tense inconsistency: 'converted' should be 'convert' to match 'load'.

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

4. Clean up by "Ctrl+C" the server process.

Copilot AI Nov 6, 2025


Inconsistent formatting: Use backticks for the instruction. Should be: 'Clean up by pressing Ctrl+C on the server process.' or 'Clean up by stopping (Ctrl+C) the server process.'

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

4. Clean up by `Ctrl+C` the server process.

Copilot AI Nov 6, 2025


Inconsistent with Step 4 later in the document which uses proper formatting: 'Clean up by Ctrl+C the server process.' The later version correctly states 'Clean up by pressing Ctrl+C...'

model = AutoModelForCausalLM.from_pretrained('facebook/opt-1.3b', torch_dtype=torch.float16) -->

# Replace './models' with your local path.
save_lora(adapter, './models/facebook/opt-1.3b')

Copilot AI Nov 6, 2025


Missing explanation for where the 'adapter' variable comes from. The TODO comment above indicates this section is incomplete.

import clsx from 'clsx';
import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
import Link from '@docusaurus/Link';
import Translate from '@docusaurus/Translate';

Copilot AI Nov 6, 2025


Unused import: `Translate` is imported but never used in this file.

@future-xy merged commit 6453212 into main on Nov 6, 2025