[Inference Operator] Update sample CRD files for v3.1 features by zicanl-amazon · Pull Request #413 · aws/sagemaker-hyperpod-cli

zicanl-amazon · 2026-04-22T22:44:24Z

What's changing and why?

Replace 6 stale v1alpha1-era sample files with 20 curated v1 samples from the internal operator repo.

The existing samples did not reflect any of the new CRD features added in v3.1 (PR #412). The new samples demonstrate all customer-facing v3.1 features:

HuggingFace model source (TGI, vLLM, SGLang runtimes, prefetch on/off)
Kubernetes volume model source
ServiceAccount (IRSA) support
NVMe + S3 prefetch patterns (on/off, with/without fallback)
BYO TLS certificate
Health probes (liveness/readiness)
Multi-node inference
Node affinity scheduling
Intelligent routing with KV cache (L1, L1+L2)

Internal-only samples excluded (DPD, test fixtures, benchmarks).
All account IDs, personal bucket names, and internal hostnames are replaced with <PLACEHOLDER> values.

Before/After UX

Before:
6 outdated samples from the v1alpha1 era, with no coverage of huggingface, kubernetesVolume, serviceAccount, dataCapture, or health probes.

After:
20 samples covering all customer-facing v3.1 features with clear comments, prerequisites, and placeholder values for easy customization.

How was this change tested?

Sample files are static YAML documentation with no runtime behavior. The following were validated:

All 20 YAML files parse correctly
All samples use apiVersion: inference.sagemaker.aws.amazon.com/v1
All internal content sanitized (account IDs, bucket names, node IDs)
No DPD-related samples included
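
For reviewers who want to repeat part of these checks locally, the following is a minimal sketch and not the script used for this PR. It assumes the samples are `*.yaml` files under a directory you pass in (the path in the usage note is hypothetical). It uses only the standard library, so it checks the `apiVersion` line and looks for 12-digit numbers that could be unsanitized account IDs with regular expressions; it does not perform a full YAML parse. The function names `check_sample` and `check_directory` are illustrative.

```python
import re
from pathlib import Path

# Expected value, taken from the validation list in this PR description.
EXPECTED_API_VERSION = "inference.sagemaker.aws.amazon.com/v1"

# Matches only unindented (top-level) apiVersion lines.
API_VERSION_RE = re.compile(r"^apiVersion:[ \t]*(\S+)", re.MULTILINE)

# Assumption: an unsanitized AWS account ID appears as a standalone 12-digit number.
ACCOUNT_ID_RE = re.compile(r"(?<!\d)\d{12}(?!\d)")


def check_sample(text: str) -> list[str]:
    """Return a list of problems found in one sample file's text."""
    problems = []
    versions = API_VERSION_RE.findall(text)
    if not versions:
        problems.append("no top-level apiVersion found")
    for version in versions:
        if version.strip("'\"") != EXPECTED_API_VERSION:
            problems.append(f"unexpected apiVersion: {version}")
    if ACCOUNT_ID_RE.search(text):
        problems.append("possible unsanitized 12-digit account ID")
    return problems


def check_directory(samples_dir: str) -> dict[str, list[str]]:
    """Check every *.yaml file under samples_dir; return only files with problems."""
    results = {}
    for path in sorted(Path(samples_dir).rglob("*.yaml")):
        problems = check_sample(path.read_text(encoding="utf-8"))
        if problems:
            results[str(path)] = problems
    return results
```

Calling `check_directory("path/to/samples")` with the real samples directory returns an empty dict when every file passes. The 12-digit pattern can produce false positives (for example, long numeric IDs that are not account IDs), so treat any hit as a prompt for manual review.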

Are unit tests added?

N/A — sample files are documentation/examples, not executable code.

Are integration tests added?

N/A — sample files are documentation/examples, not executable code.

Reviewer Guidelines

‼️ Merge Requirements: PRs with failing integration tests cannot be merged without justification.

One of the following must be true:

All automated PR checks pass
Failed tests include local run results/screenshots proving they work
Changes are documentation-only

Replace 6 stale v1alpha1-era samples with 20 curated v1 samples covering:
- HuggingFace model source (TGI, vLLM, SGLang runtimes, prefetch on/off)
- Kubernetes volume model source
- ServiceAccount (IRSA) support
- NVMe + S3 prefetch patterns (on/off, with/without fallback)
- BYO TLS certificate
- Health probes (liveness/readiness)
- Multi-node inference
- Node affinity scheduling
- Intelligent routing with KV cache (L1, L1+L2)

Internal-only samples excluded (DPD, test fixtures, benchmarks).
All account IDs, personal bucket names, and internal hostnames sanitized.

Signed-off-by: Zican Li <zicanl@amazon.com>

zicanl-amazon requested a review from a team as a code owner April 22, 2026 22:44

zicanl-amazon requested a deployment to manual-approval April 22, 2026 22:44 — with GitHub Actions Waiting

zicanl-amazon wants to merge 1 commit into aws:main from zicanl-amazon:feature/update-inference-operator-samples
