Merged
5 changes: 5 additions & 0 deletions .gitattributes
@@ -0,0 +1,5 @@
* text=auto eol=lf

*.ps1 text eol=crlf
*.bat text eol=crlf
*.cmd text eol=crlf
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,3 +8,4 @@ All notable changes to this project will be documented here.
- Added sandbox backend registry and `nullstate sandbox` commands.
- Added model metrics artifact support.
- Added DevSecOps repository documentation and GitHub workflow templates.
- Documented AMD Developer Cloud / DigitalOcean primary compute path with Fireworks fallback.
1 change: 1 addition & 0 deletions README.md
@@ -123,6 +123,7 @@ Each run writes:
- [Threat model](docs/threat-model.md)
- [CI/CD](docs/ci-cd.md)
- [Runbook](docs/runbook.md)
- [AMD compute strategy](docs/compute-strategy.md)
- [Failure modes](docs/failure-modes.md)
- [Cost report](docs/cost-report.md)

3 changes: 3 additions & 0 deletions docs/case-study.md
@@ -15,6 +15,7 @@ Cloud and platform teams can ship IaC faster than security teams can manually va
- No production cloud targets by default.
- LocalStack Azure requires Docker and `LOCALSTACK_AUTH_TOKEN`.
- The demo must still work if Docker, Terraform, or the model endpoint is unavailable.
- AMD Developer Cloud / DigitalOcean GPU access may be delayed, so Fireworks-compatible managed inference is kept as a contingency.

## 4. Requirements

@@ -80,6 +81,8 @@ See [Runbook](runbook.md).

See [Cost Report](cost-report.md). V1 is designed to run locally, with AMD Developer Cloud used only for model-serving evidence.

See [AMD Compute Strategy](compute-strategy.md) for the primary DigitalOcean/AMD path and Fireworks fallback.

## 10. Results

- Offline CLI demo runs end to end.
91 changes: 91 additions & 0 deletions docs/compute-strategy.md
@@ -0,0 +1,91 @@
# AMD Compute Strategy

## Decision

Use the harder AMD Developer Cloud / DigitalOcean route as the primary path for the case study. Keep Fireworks AI as a contingency endpoint if GPU access is delayed.

## Why primary path is DigitalOcean/AMD Developer Cloud

- It produces stronger evidence for the hackathon: model serving, ROCm, vLLM/SGLang, GPU observability, and operational setup.
- It gives a better personal learning outcome because the work includes real DevOps, cloud access, security boundaries, and inference operations.
- It supports the project thesis: private or self-controlled model inference for sensitive IaC and security evidence.
- It creates better case-study material than calling a hosted API only.

## Why Fireworks stays as fallback

- It can unblock the demo if AMD Developer Cloud access is delayed.
- It keeps the red/blue agent loop working through an OpenAI-compatible endpoint.
- It is still relevant to the AMD ecosystem, but it should be positioned as managed inference rather than private local/owned serving.

## Execution plan

### Track A - DigitalOcean baseline without GPU

Set up the non-GPU platform first:

- project/repo secrets
- hardened control droplet or container host
- Docker and compose baseline
- LocalStack Azure sandbox
- nullstate CLI installation
- GitHub Actions environment configuration
- run artifact storage layout
- basic monitoring and logs

This work is useful even before the GPU is available.
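The run artifact storage layout can be sketched as a small helper. The directory and file names below follow the evidence checklist (`runs/<id>/report.md`, `runs/<id>/metrics.json`) but are an illustrative sketch, not the nullstate CLI's canonical layout:

```python
import json
from pathlib import Path

def init_run_dir(root: str, run_id: str) -> Path:
    """Create the per-run artifact directory with placeholder evidence files."""
    run_dir = Path(root) / run_id
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "report.md").write_text("# Run report\n")
    (run_dir / "metrics.json").write_text(json.dumps({"run_id": run_id}))
    return run_dir
```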

### Track B - AMD GPU inference

When MI300X access is available:

- provision AMD Developer Cloud / DigitalOcean GPU instance
- install ROCm stack or use provider image
- serve the model with vLLM or SGLang behind an OpenAI-compatible API
- expose only the required API path to the nullstate operator environment
- record model name, context length, ROCm version, GPU model, memory, throughput, and latency
- save vLLM `/metrics` snapshots and `amd-smi` or `rocm-smi` output into the case-study evidence folder
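The latency evidence above can be captured with a minimal probe against the OpenAI-compatible endpoint. This is a sketch using only the standard library; the base URL shape (`<base>/chat/completions`) follows the OpenAI convention, and throughput numbers should still come from the server's own `/metrics`:

```python
import json
import time
import urllib.request

def probe_endpoint(base_url: str, model: str, prompt: str, api_key: str = "") -> dict:
    """Send one chat completion and record wall-clock latency in seconds."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    latency = time.monotonic() - start
    # Token usage, if the server reports it, goes into the evidence record too.
    return {"model": model, "latency_s": round(latency, 3),
            "usage": payload.get("usage", {})}
```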

### Track C - Fireworks contingency

If GPU access blocks the submission:

- configure `NULLSTATE_LLM_BASE_URL` for the Fireworks-compatible endpoint
- run the same nullstate demo
- document this as the managed-inference fallback
- keep the DigitalOcean/AMD setup as the next milestone rather than hiding the blocker
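The fallback switch reduces to a small resolver that labels the run by its actual endpoint type, so the evidence never overclaims private inference. The env variable name comes from the runbook; the URL matching and the label strings are assumptions for illustration:

```python
import os

def resolve_endpoint(env=None) -> dict:
    """Pick the model endpoint and label it for the evidence folder."""
    env = env if env is not None else dict(os.environ)
    base_url = env.get("NULLSTATE_LLM_BASE_URL", "")
    if not base_url:
        return {"base_url": None, "endpoint_type": "local-mock"}
    if "fireworks" in base_url:
        return {"base_url": base_url, "endpoint_type": "fireworks-fallback"}
    # Anything else is treated as the private DigitalOcean/AMD GPU endpoint.
    return {"base_url": base_url, "endpoint_type": "digitalocean-amd-gpu"}
```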

## Demo positioning

Preferred story:

```text
nullstate runs local IaC security validation and can use a private AMD MI300X-hosted model endpoint for red/blue reasoning over security evidence.
```

Fallback story:

```text
nullstate is model-provider portable through OpenAI-compatible endpoints. The same CLI can run against managed inference while the private AMD GPU endpoint is being provisioned.
```

## Evidence checklist

- `runs/<id>/report.md`
- `runs/<id>/metrics.json`
- vLLM `/metrics` before/after snapshots
- `amd-smi` or `rocm-smi` output
- ROCm version
- model server launch command
- sanitized network diagram
- screenshots of GPU utilization and CLI run
- note whether the run used DigitalOcean/AMD GPU, local mock mode, or Fireworks fallback
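Collecting the GPU tool snapshots can be automated with a helper that shells out to whichever tool is present (`amd-smi`, then `rocm-smi`) and files the output under the run directory. This is a sketch, not part of the nullstate CLI:

```python
import shutil
import subprocess
from pathlib import Path

def snapshot_tool(run_dir, candidates=("amd-smi", "rocm-smi")):
    """Run the first available GPU tool and save its output as evidence."""
    for tool in candidates:
        if shutil.which(tool):
            out = subprocess.run([tool], capture_output=True, text=True)
            dest = Path(run_dir) / f"{tool}.txt"
            dest.write_text(out.stdout or out.stderr)
            return dest
    return None  # no GPU tooling on this host; note that in the run report
```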

## Risks

| Risk | Impact | Mitigation |
|---|---|---|
| AMD Developer Cloud access delayed | High | Build DO baseline and use Fireworks fallback |
| GPU image/ROCm mismatch | Medium | Prefer provider image; document exact versions |
| Model too large or slow | Medium | Start with one model for both red and blue roles |
| Endpoint exposed publicly | High | Restrict ingress, use token auth, document network boundary |
| Case study overclaims private inference | High | Label each run by its actual endpoint type |
1 change: 1 addition & 0 deletions docs/cost-report.md
@@ -12,6 +12,7 @@ V1 is designed to keep cloud spend near zero by default. Offline mode runs local
| Offline demo | 0 | no cloud or model endpoint |
| LocalStack Azure | depends on LocalStack access | requires auth token |
| AMD Developer Cloud | hackathon credits | used for MI300X model endpoint |
| Fireworks fallback | provider dependent | contingency if GPU access is delayed |
| GitHub Actions | low/free tier dependent | tests are lightweight |

## Controls
8 changes: 8 additions & 0 deletions docs/runbook.md
@@ -41,6 +41,14 @@ $env:NULLSTATE_LLM_API_KEY = "<optional>"

Then run without `--offline`.

## AMD Developer Cloud / DigitalOcean path

Use [AMD Compute Strategy](compute-strategy.md) as the deployment checklist. Build the non-GPU DigitalOcean baseline first, then attach the MI300X-backed model endpoint when access is available.

## Fireworks fallback

If AMD GPU access is delayed, point `NULLSTATE_LLM_BASE_URL` at the managed endpoint and keep the same nullstate run flow. Label the evidence as managed inference, not private GPU-hosted inference.
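Following the PowerShell convention used earlier in this runbook, a hypothetical fallback configuration looks like this (the base URL and key placeholder are illustrative; confirm the exact values against the provider's documentation):

```powershell
$env:NULLSTATE_LLM_BASE_URL = "https://api.fireworks.ai/inference/v1"
$env:NULLSTATE_LLM_API_KEY = "<fireworks key>"
```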

## Artifact review before publishing

Check: