[AMD][ROCM] dsv4-fp4-mi355x-atom: bump image, expand concurrency, simplify script by seungrokj · Pull Request #1311 · SemiAnalysisAI/InferenceX

seungrokj · 2026-05-11T04:18:59Z

Summary

Bump image to rocm/atom-dev:nightly_202605101539
Expand concurrency range from single-sequence (conc=1) to conc 1–256
Simplify dsv4_fp4_mi355x_atom.sh by removing WIP workarounds that are no longer needed
Add perf-changelog entry for dsv4-fp4-mi355x-atom

Test plan

Verify benchmark runs at expanded concurrency range
Verify perf-changelog entry is correctly formatted

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-05-11T04:19:07Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude · 2026-05-11T04:25:29Z

 start_gpu_monitor

 set -x
-
-BLOCK_SIZE=${BLOCK_SIZE:-16}
-export ATOM_DSV4_SPARSE_ATTN_CHUNK_TOKENS=${ATOM_DSV4_SPARSE_ATTN_CHUNK_TOKENS:-256}
-# --enforce-eager is required: ROCm/ATOM#650 (PR1 skeleton) has no CUDAGraph
-# support yet (deferred to a follow-up PR). max-num-seqs is sized to the
-# client concurrency with a floor at 4 — the ATOM default (512) makes the
-# KV/GDN-mamba allocator overshoot the GPU budget ("GDN mamba tensor
-# exceeds available KV budget"), and using 1 hangs warmup at 0% GPU. 4
-# is the minimum we've seen complete warmup successfully (also the PR's
-# offline repro value). The PR1 kv_cache[:1,...] hardcode in
-# deepseek_v4.py means any forward with batch>1 silently corrupts
-# non-slot-0 lanes; eval (gsm8k) at conc>1 is the canary.
-MAX_NUM_SEQS=$(( CONC < 4 ? 4 : CONC ))
-MAX_NUM_BATCHED_TOKENS=${MAX_NUM_BATCHED_TOKENS:-$MAX_MODEL_LEN_VALUE}
+export ATOM_DISABLE_MMAP=true
+export ATOM_USE_TRITON_MOE=1
 python3 -m atom.entrypoints.openai_server \
    --model $MODEL \
    --server-port $PORT \


🔴 The simplified script still references $BLOCK_SIZE, $MAX_NUM_SEQS, and $MAX_NUM_BATCHED_TOKENS in the python invocation (lines 45, 47, 48 of benchmarks/single_node/dsv4_fp4_mi355x_atom.sh), but this PR deleted the only assignments of those variables and nothing else in the harness defines them. With set -eo pipefail (no -u), they expand to empty strings, so the rendered argv becomes --block-size --enforce-eager --max-num-seqs --max-num-batched-tokens --trust-remote-code … and the server will fail to start on every job in the expanded conc=4–1024 sweep. Fix is to either restore BLOCK_SIZE=${BLOCK_SIZE:-16} / MAX_NUM_SEQS=$(( CONC < 4 ? 4 : CONC )) / MAX_NUM_BATCHED_TOKENS=${MAX_NUM_BATCHED_TOKENS:-$MAX_MODEL_LEN_VALUE} (mirroring the sister ATOM scripts), or drop the three flags from the python command line.

Extended reasoning...

What the bug is

This PR simplifies benchmarks/single_node/dsv4_fp4_mi355x_atom.sh by removing a large block of WIP overlay/setup code. As part of that simplification it also deleted three short variable assignments that were sitting just above set -x:

BLOCK_SIZE=${BLOCK_SIZE:-16}
MAX_NUM_SEQS=$(( CONC < 4 ? 4 : CONC ))
MAX_NUM_BATCHED_TOKENS=${MAX_NUM_BATCHED_TOKENS:-$MAX_MODEL_LEN_VALUE}

However, the python invocation at the bottom of the simplified script still passes those three variables through (lines 45, 47, 48 of the new file):

    --block-size $BLOCK_SIZE \
    --enforce-eager \
    --max-num-seqs $MAX_NUM_SEQS \
    --max-num-batched-tokens $MAX_NUM_BATCHED_TOKENS \

Why nothing rescues this

check_env_vars at the top only validates MODEL/TP/CONC/ISL/OSL/RANDOM_RANGE_RATIO/RESULT_FILENAME/EP_SIZE/DP_ATTENTION — none of the three deleted variables are listed there, so the check does not surface them.

A grep across benchmarks/, .github/configs/, and the workflow templates shows no other producer for BLOCK_SIZE / MAX_NUM_SEQS / MAX_NUM_BATCHED_TOKENS; the orchestrator never injects them into the environment.

The script uses set -eo pipefail but not set -u, so the undefined variables expand to empty strings silently.

Every sibling ATOM script that passes --block-size keeps the local default (e.g. benchmarks/single_node/dsr1_fp4_mi355x_atom.sh:50, dsr1_fp8_mi355x_atom.sh:50, gptoss_fp4_mi355x_atom.sh:51 all define BLOCK_SIZE=${BLOCK_SIZE:-16}), confirming this is required wiring that was accidentally lost in the simplification.
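
The silent empty expansion can be imitated outside the shell. The following is a minimal Python sketch, not the harness's code: `COMMAND_TEMPLATE`, `render_argv`, and `missing_vars` are hypothetical names, and the template only covers the flag lines quoted in this review. It substitutes unset names with an empty string (as bash does without set -u) and then word-splits the result.

```python
import re
import shlex

# Hypothetical stand-in for the flag lines of the launch command.
COMMAND_TEMPLATE = (
    "--block-size $BLOCK_SIZE --enforce-eager "
    "--max-num-seqs $MAX_NUM_SEQS "
    "--max-num-batched-tokens $MAX_NUM_BATCHED_TOKENS --trust-remote-code"
)

VAR_PATTERN = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def render_argv(template, env):
    """Imitate unquoted $VAR expansion without `set -u`: unset names become ''."""
    expanded = VAR_PATTERN.sub(lambda m: env.get(m.group(1), ""), template)
    # shlex.split collapses the leftover whitespace, like shell word splitting.
    return shlex.split(expanded)


def missing_vars(template, env):
    """List the variables the template references that are unset or empty."""
    return sorted({name for name in VAR_PATTERN.findall(template) if not env.get(name)})


if __name__ == "__main__":
    print(render_argv(COMMAND_TEMPLATE, {}))
    print(missing_vars(COMMAND_TEMPLATE, {}))
```

A check like `missing_vars` run before launch would surface the problem loudly, which is roughly what set -u would do in the script itself.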

Step-by-step proof

CI runs the dsv4-fp4-mi355x-atom config from the diff: { tp: 8, ep: 1, conc-start: 4, conc-end: 1024 }. The orchestrator exports MODEL, TP, CONC, ISL, OSL, RANDOM_RANGE_RATIO, RESULT_FILENAME, EP_SIZE, DP_ATTENTION — but not the three deleted vars.

The script runs check_env_vars … which passes (BLOCK_SIZE etc. are not in its list).

Reaching the python invocation, bash expands the argv. With the three vars unset and no set -u, the rendered argv after word-splitting is:
python3 -m atom.entrypoints.openai_server \
    --model deepseek-ai/DeepSeek-V4-Pro --server-port 8888 -tp 8 \
    --kv_cache_dtype fp8 [--max-model-len 10240] [--enable-expert-parallel?] \
    --block-size --enforce-eager \
    --max-num-seqs --max-num-batched-tokens --trust-remote-code

ATOM's argparse parses --block-size --enforce-eager: --block-size expects a value, but argparse does not consume a token that looks like another option (--enforce-eager) as that value. It reports "argument --block-size: expected one argument" and exits with status 2, so the server terminates before opening port 8888.

wait_for_server_ready either polls until timeout or detects the dead PID and aborts the job. Every (TP=8, conc=4..1024, ISL/OSL ∈ {1024/1024, 8192/1024}) cell of the new sweep hits exactly this path, so the entire sweep this PR is trying to enable cannot start.

The flag's type does not matter here: argparse rejects the missing value before any type conversion runs. The same error would then hit --max-num-seqs (followed by --max-num-batched-tokens) and --max-num-batched-tokens (followed by --trust-remote-code), so all three flags need values, or need to be removed, before the server can start.
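
The parsing failure can be reproduced with plain argparse. This is a sketch under assumptions: `build_parser` is a hypothetical stand-in for ATOM's real parser, with flag names taken from the script and types guessed. It shows that the rendered argv makes argparse exit with status 2, while the same argv with values parses cleanly.

```python
import argparse


def build_parser():
    # Hypothetical stand-in: flag names come from the script, types are assumed.
    parser = argparse.ArgumentParser(prog="openai_server")
    parser.add_argument("--block-size", type=int)
    parser.add_argument("--enforce-eager", action="store_true")
    parser.add_argument("--max-num-seqs", type=int)
    parser.add_argument("--max-num-batched-tokens", type=int)
    parser.add_argument("--trust-remote-code", action="store_true")
    return parser


def exit_status(argv):
    """Return 0 if argv parses, otherwise the status argparse exits with."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        return exc.code
    return 0


# argv as rendered when the three variables are unset.
BROKEN = ["--block-size", "--enforce-eager", "--max-num-seqs",
          "--max-num-batched-tokens", "--trust-remote-code"]
# argv with the defaults discussed in this review filled in.
FIXED = ["--block-size", "16", "--enforce-eager", "--max-num-seqs", "4",
         "--max-num-batched-tokens", "10240", "--trust-remote-code"]

if __name__ == "__main__":
    print(exit_status(BROKEN))  # argparse also writes its usage error to stderr
    print(exit_status(FIXED))
```

Whether the real server reaches argparse in exactly this way depends on ATOM's entrypoint, but any standard argparse setup treats a token starting with `--` as an option rather than a value.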

Addressing the refutations

The two refutations on bug_003 and bug_004 are simply duplicate-dedup callouts between the two reports (they describe the same defect at the same location). The synthesis agent has already merged them into this single report (merged_bug_003), so the dedup objection is satisfied. Neither refutation disputes the underlying defect.

Fix

Either:

Restore the three defaults before set -x (mirroring dsr1_fp4_mi355x_atom.sh):
BLOCK_SIZE=${BLOCK_SIZE:-16}
MAX_NUM_SEQS=$(( CONC < 4 ? 4 : CONC ))
MAX_NUM_BATCHED_TOKENS=${MAX_NUM_BATCHED_TOKENS:-10240}
(Note that MAX_MODEL_LEN_VALUE was also removed in this PR, so the third default needs a literal or a recomputed value.)

Or drop the three flags from the python invocation entirely (matching the qwen3.5-fp4-mi355x-atom sister script which does not use them) if the new rocm/atom-dev:nightly_202605101539 image provides workable defaults.
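
For readers who want to check the default logic of the first option without running the script, here is a small Python rendering of the same arithmetic. `resolve_server_settings` is a hypothetical helper, not part of the harness; the values 16, 4, and 10240 are the ones quoted in this review.

```python
def resolve_server_settings(env, conc, default_max_batched_tokens=10240):
    """Mirror the shell defaults of the first fix option.

    `${VAR:-default}` falls back when VAR is unset or empty, which
    `env.get(...) or default` reproduces for string values.
    """
    block_size = int(env.get("BLOCK_SIZE") or 16)
    # Same as $(( CONC < 4 ? 4 : CONC )): concurrency with a floor of 4.
    max_num_seqs = max(conc, 4)
    max_num_batched_tokens = int(
        env.get("MAX_NUM_BATCHED_TOKENS") or default_max_batched_tokens
    )
    return {
        "block_size": block_size,
        "max_num_seqs": max_num_seqs,
        "max_num_batched_tokens": max_num_batched_tokens,
    }


if __name__ == "__main__":
    for conc in (1, 4, 256):
        print(conc, resolve_server_settings({}, conc))
```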

claude · 2026-05-11T04:25:29Z

+    - dsv4-fp4-mi355x-atom
+  description:
+    - "Add DeepSeek-V4-Pro FP4 MI355X ATOM benchmark config; bump image to rocm/atom-dev:nightly_202605101539, expand concurrency range (conc 4–1024), and simplify runtime script"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/


🟡 The new perf-changelog entry for dsv4-fp4-mi355x-atom has pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/ with no PR number appended (perf-changelog.yaml:2344). It should be .../pull/1311 — every other entry in the file follows that pattern, and any downstream tooling that consumes pr-link will get a broken URL pointing to the PRs listing page instead of this PR.

Extended reasoning...

What the bug is

The new perf-changelog entry added by this PR ends with an incomplete pr-link value:

- config-keys:
    - dsv4-fp4-mi355x-atom
  description:
    - "Add DeepSeek-V4-Pro FP4 MI355X ATOM benchmark config; bump image to rocm/atom-dev:nightly_202605101539, expand concurrency range (conc 4–1024), and simplify runtime script"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/

The numeric PR ID is missing from the end of the URL.

Why existing patterns expect a number

Every other entry in perf-changelog.yaml follows the convention .../pull/<NUMBER>. For example, the entry immediately above (line 2338) is https://github.com/SemiAnalysisAI/InferenceX/pull/1308, and neighboring entries use /pull/1303, /pull/1304, /pull/1305, etc. The schema is uniform across the file, so consumers of this YAML will reasonably assume the value is a fully-formed URL pointing at a specific PR.

Impact

If a human clicks the link in a rendered changelog, they get the GitHub PR listing page for the repo rather than this PR. More importantly, any tooling that parses pr-link (changelog renderers, scripts that extract PR numbers via regex on the URL, dashboards that link to PRs) will either get a broken/empty PR ID or fall through to the listing page. The link silently points to the wrong place rather than failing loudly.

Step-by-step proof

Open perf-changelog.yaml at line 2344.

Observe the raw value: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/ (URL ends with a trailing slash and no digits).

Compare to line 2338 (the previous entry): pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1308.

Visit the URL in a browser: you land on https://github.com/SemiAnalysisAI/InferenceX/pulls (the listing), not on PR #1311.

Run a simple extractor like url.rsplit('/', 1)[-1] on the value: it yields an empty string instead of 1311.
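
The last step can be made concrete with a few lines of Python. `pr_number` is a hypothetical helper written for this illustration; it is not claimed to match any tooling in the repository.

```python
from typing import Optional


def pr_number(url: str) -> Optional[int]:
    """Return the PR number at the end of a pr-link URL, or None if absent."""
    tail = url.rsplit("/", 1)[-1]
    return int(tail) if tail.isdigit() else None


if __name__ == "__main__":
    print(pr_number("https://github.com/SemiAnalysisAI/InferenceX/pull/1308"))
    print(pr_number("https://github.com/SemiAnalysisAI/InferenceX/pull/"))
```

A consumer that skips the `isdigit` guard and calls `int(tail)` directly would raise ValueError on the incomplete entry instead of returning None.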

How to fix

Append 1311 (this PR's number) to the URL:

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1311
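
To keep incomplete links from slipping in again, a lint along these lines could run over the changelog. This is a sketch only: `invalid_pr_links` is a hypothetical function, the entries are plain dicts in the shape a YAML loader would presumably produce, and `example-config` is a made-up key.

```python
import re

# A complete link ends in one or more digits after /pull/.
PR_LINK = re.compile(r"https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+")


def invalid_pr_links(entries):
    """Return the config keys of entries whose pr-link lacks a PR number."""
    bad = []
    for entry in entries:
        if not PR_LINK.fullmatch(entry.get("pr-link", "")):
            bad.extend(entry.get("config-keys", []))
    return bad


if __name__ == "__main__":
    entries = [
        {"config-keys": ["example-config"],  # made-up key for illustration
         "pr-link": "https://github.com/SemiAnalysisAI/InferenceX/pull/1308"},
        {"config-keys": ["dsv4-fp4-mi355x-atom"],
         "pr-link": "https://github.com/SemiAnalysisAI/InferenceX/pull/"},
    ]
    print(invalid_pr_links(entries))
```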

…m server args Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-11T04:42:34Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25650584751
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25650584751

github-actions · 2026-05-11T04:48:21Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25650688728
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25650688728

functionstackx · 2026-05-11T04:48:55Z

@seungrokj the amd sglang team is doing conc 1 to 256, so it probably makes sense for ATOM to do at least that too? feel free to expand the range even more if u want but we should do at least conc 1 to 256

from @1am9trash 's PR https://github.com/SemiAnalysisAI/InferenceX/pull/1300/changes

functionstackx · 2026-05-11T05:13:58Z

hi @seungrokj even on conc=4, it is getting this error, can u take a look?

 File "/app/ATOM/atom/models/deepseek_v4.py", line 189, in v4_attention_with_output
    return self.forward_impl(x, positions)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 1500, in forward_impl
    compress_plans = attn_md.compress_plans
                     ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'AttentionMetaData' object has no attribute 'compress_plans'
Process ModelRunner2/8:
Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/app/ATOM/atom/model_engine/async_proc.py", line 113, in __init__
    self.busy_loop()
  File "/app/ATOM/atom/model_engine/async_proc.py", line 173, in busy_loop
    out = func(*args)
          ^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/model_engine/model_runner.py", line 2013, in capture_cudagraph
    model_output = self.model(
                   ^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 2499, in forward
    return self.model(input_ids, positions)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/utils/decorators.py", line 529, in __call__
    model_output = self.forward(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 2388, in forward
    def forward(
  File "/opt/venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 465, in __call__
    return super().__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 1181, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 936, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 455, in __call__
    raise e
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 442, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<eval_with_key>.124", line 505, in forward
    submod_1 = self.submod_1(getitem, s72, l_positions_, s80);  getitem = None
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 936, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 455, in __call__
    raise e
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 442, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<eval_with_key>.126", line 5, in forward
    v4_attention_with_output = torch.ops.aiter.v4_attention_with_output(result_2, l_positions_, 'layers.0.attn');  result_2 = l_positions_ = None
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_ops.py", line 1209, in __call__
    return self._op(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/utils/_device.py", line 109, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_ops.py", line 1209, in __call__
    return self._op(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 189, in v4_attention_with_output
    return self.forward_impl(x, positions)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 1500, in forward_impl
    compress_plans = attn_md.compress_plans
                     ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'AttentionMetaData' object has no attribute 'compress_plans'
[aiter] import [module_rope_2c_cached_positions_fwd] under /app/aiter-test/aiter/jit/module_rope_2c_cached_positions_fwd.so
Process ModelRunner1/8:
Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/app/ATOM/atom/model_engine/async_proc.py", line 113, in __init__
    self.busy_loop()
  File "/app/ATOM/atom/model_engine/async_proc.py", line 173, in busy_loop
    out = func(*args)
          ^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/model_engine/model_runner.py", line 2013, in capture_cudagraph
    model_output = self.model(
                   ^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 2499, in forward
    return self.model(input_ids, positions)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/utils/decorators.py", line 529, in __call__
    model_output = self.forward(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 2388, in forward
    def forward(
  File "/opt/venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 465, in __call__
    return super().__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 1181, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 936, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 455, in __call__
    raise e
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 442, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<eval_with_key>.124", line 505, in forward
    submod_1 = self.submod_1(getitem, s72, l_positions_, s80);  getitem = None
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 936, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 455, in __call__
    raise e
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 442, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<eval_with_key>.126", line 5, in forward
    v4_attention_with_output = torch.ops.aiter.v4_attention_with_output(result_2, l_positions_, 'layers.0.attn');  result_2 = l_positions_ = None
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_ops.py", line 1209, in __call__
    return self._op(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/utils/_device.py", line 109, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_ops.py", line 1209, in __call__
    return self._op(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 189, in v4_attention_with_output
    return self.forward_impl(x, positions)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 1500, in forward_impl
    compress_plans = attn_md.compress_plans
                     ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'AttentionMetaData' object has no attribute 'compress_plans'

Capturing bs=512, max_q_len=1:   0%|          | 0/11 [00:00<?, ?it/s]
Process ModelRunner0/8:
[aiter] import [module_rope_2c_cached_positions_fwd] under /app/aiter-test/aiter/jit/module_rope_2c_cached_positions_fwd.so
Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/app/ATOM/atom/model_engine/async_proc.py", line 113, in __init__
    self.busy_loop()
  File "/app/ATOM/atom/model_engine/async_proc.py", line 173, in busy_loop
    out = func(*args)
          ^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/model_engine/model_runner.py", line 2013, in capture_cudagraph
    model_output = self.model(
                   ^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 2499, in forward
    return self.model(input_ids, positions)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/utils/decorators.py", line 529, in __call__
    model_output = self.forward(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 2388, in forward
    def forward(
  File "/opt/venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 465, in __call__
    return super().__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 1181, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 936, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 455, in __call__
    raise e
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 442, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<eval_with_key>.124", line 505, in forward
    submod_1 = self.submod_1(getitem, s72, l_positions_, s80);  getitem = None
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 936, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 455, in __call__
    raise e
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 442, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<eval_with_key>.126", line 5, in forward
    v4_attention_with_output = torch.ops.aiter.v4_attention_with_output(result_2, l_positions_, 'layers.0.attn');  result_2 = l_positions_ = None
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_ops.py", line 1209, in __call__
    return self._op(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/utils/_device.py", line 109, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_ops.py", line 1209, in __call__
    return self._op(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 189, in v4_attention_with_output
    return self.forward_impl(x, positions)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 1500, in forward_impl
    compress_plans = attn_md.compress_plans
                     ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'AttentionMetaData' object has no attribute 'compress_plans'
Process ModelRunner7/8:
Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/app/ATOM/atom/model_engine/async_proc.py", line 113, in __init__
    self.busy_loop()
  File "/app/ATOM/atom/model_engine/async_proc.py", line 173, in busy_loop
    out = func(*args)
          ^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/model_engine/model_runner.py", line 2013, in capture_cudagraph
    model_output = self.model(
                   ^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 2499, in forward
    return self.model(input_ids, positions)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/utils/decorators.py", line 529, in __call__
    model_output = self.forward(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/ATOM/atom/models/deepseek_v4.py", line 2388, in forward
    def forward(
  File "/opt/venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 465, in __call__
    return super().__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 1181, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 936, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 455, in __call__
    raise e
  File "/opt/venv/lib/python3.12/site-packages/torch/fx/graph_module.py", line 442, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

github-actions · 2026-05-11T05:21:12Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25650857591
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25650857591


github-actions · 2026-05-11T05:55:42Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25651910046
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25651910046


github-actions · 2026-05-11T06:36:09Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25654329782
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25654329782


github-actions · 2026-05-11T08:00:19Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25655408841
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25655408841

github-actions · 2026-05-11T13:07:30Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25668243296
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25668243296

functionstackx · 2026-05-11T17:32:29Z

We also need to fix this chat template issue. Will get back with a patch ASAP: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25668243296/job/75346620974

@seungrokj what's the reason for the chat template for non-MTP? +viz @Oseltamivir


seungrokj · 2026-05-13T12:43:42Z

@functionstackx @cquil11
Can you please merge this?

e2e perf/accuracy https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25794276782

chunfangamd

Performance now beats SGLang!

github-actions · 2026-05-13T13:22:35Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25798735228
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25798735228

functionstackx · 2026-05-13T15:10:15Z

@seungrokj CI is failing, can you take a look?

github-actions · 2026-05-13T15:59:26Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25798735228
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25798735228

seungrokj · 2026-05-13T16:04:15Z

@functionstackx @cquil11

disk was full -> I removed some stuff
docker pull issue -> logged in to Docker Hub again

Should be okay now... it's running.

github-actions · 2026-05-13T16:58:55Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25798735228
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25798735228

functionstackx · 2026-05-13T17:38:44Z

> @functionstackx @cquil11
>
> disk was full -> I removed some stuff
>
> docker pull issue -> logged in to Docker Hub again
>
> Should be okay now... it's running.

It is still failing, can you take a look? And if it is one flaky node, can you drain it? https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25813782165/job/75837243947?pr=1311

github-actions · 2026-05-13T18:33:54Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25813782165
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25813782165

github-actions · 2026-05-14T00:33:11Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25813782165
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25813782165

seungrokj · 2026-05-14T00:34:07Z

@functionstackx @cquil11
Can you please merge this? (The previous failure was due to a Docker login issue.)
https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25813782165

github-actions · 2026-05-14T00:35:42Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25834577379
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25834577379

github-actions · 2026-05-14T00:37:01Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25834609567
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25834609567

seungrokj · 2026-05-14T00:41:54Z

Cancelled https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25834633831
as it's a duplicate run of https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25813782165

github-actions · 2026-05-14T00:42:15Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25834633831
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25834633831

functionstackx

lgtm

Conflicts resolved:
- .github/configs/amd-master.yaml (dsv4-fp4-mi355x-atom): took main's simplified single-range conc form from PR #1311 (we had the older discrete-point version)
- .github/configs/nvidia-master.yaml (kimik2.5-int4-b200-vllm): kept our bump-rationale comment alongside main's v0.20.2 image (both sides agreed on the image, only the comment was new on ours)
- .github/configs/nvidia-master.yaml (minimaxm2.5-fp8-{h100,h200}-vllm): took main's v0.20.2 image bumps (we still had v0.19.1)

Cleanup:
- Drop our .gitignore additions (the 'scripts/debug_*.sh' line) per review feedback -- match main
- Drop docs/AGENTIC_TEST_COVERAGE.md and docs/AGENTIC_TEST_RESULTS.md (agent-generated planning slop, not load-bearing)

The earlier rebase silently dropped trailing whitespace from two unrelated entries (PRs #1311, #1322). The 'no deletions in perf-changelog' policy treats whitespace changes as deletions and failed setup. Rebuild perf-changelog by checking out main's exact bytes and re-appending only the PR #1394 entry.
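
The policy described above, where a whitespace-only change to an existing entry counts as a deletion, amounts to requiring that the new file begin with the old file's exact bytes. The sketch below illustrates that idea only; `is_append_only` and `first_divergence` are hypothetical helper names, and the repository's actual check is not shown in this thread and may work differently.

```python
from typing import Optional


def is_append_only(old: bytes, new: bytes) -> bool:
    """Return True when `new` begins with the exact bytes of `old`.

    Any change to existing bytes, including dropped trailing whitespace,
    makes this False.
    """
    return new.startswith(old)


def first_divergence(old: bytes, new: bytes) -> Optional[int]:
    """Return the byte offset where `new` stops matching `old`, or None."""
    limit = min(len(old), len(new))
    for i in range(limit):
        if old[i] != new[i]:
            return i
    # No mismatch inside the common prefix: `new` either extends `old`
    # (fine) or is shorter than `old` (a truncation at `limit`).
    return None if len(new) >= len(old) else limit


# Hypothetical changelog bytes: the original entry ends with a trailing space.
old_log = b"- pr: 1311 \n"
rebased = b"- pr: 1311\n- pr: 1394\n"      # rebase stripped the trailing space
rebuilt = old_log + b"- pr: 1394\n"        # main's exact bytes plus the new entry

print(is_append_only(old_log, rebased))    # False
print(first_divergence(old_log, rebased))  # 10
print(is_append_only(old_log, rebuilt))    # True
```

Comparing raw bytes rather than parsed YAML is what makes whitespace significant here, which is consistent with the fix of checking out main's exact bytes and re-appending only the new entry.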

* $Update gptoss-fp4-b200-vllm vLLM image to v0.20.2

  Ref #1154

  Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

* fix(perf-changelog): restore trailing whitespace dropped by prior rebase

  The earlier rebase silently dropped trailing whitespace from two unrelated entries (PRs #1311, #1322). The 'no deletions in perf-changelog' policy treats whitespace changes as deletions and failed setup. Rebuild perf-changelog by checking out main's exact bytes and re-appending only the PR #1394 entry.

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: claude-fix-bot <claude-fix-bot@local>
Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>

dsv4-fp4-mi355x-atom: bump image, expand concurrency, simplify script

8825db0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

seungrokj requested a review from a team May 11, 2026 04:19

github-project-automation Bot added this to InferenceMAX Board May 11, 2026

seungrokj requested review from 1am9trash, billishyahao, chunfangamd and yctseng0211 as code owners May 11, 2026 04:19

Fix pr-link in perf-changelog for dsv4-fp4-mi355x-atom entry

9c8a639

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

seungrokj changed the title ~~dsv4-fp4-mi355x-atom: bump image, expand concurrency, simplify script~~ [AMD][ROCM] dsv4-fp4-mi355x-atom: bump image, expand concurrency, simplify script May 11, 2026

seungrokj added the AMD label May 11, 2026

claude Bot reviewed May 11, 2026

View reviewed changes

seungrokj and others added 2 commits May 11, 2026 13:27

dsv4-fp4-mi355x-atom: remove enforce-eager and per-request limits from server args

b320bc7

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dsv4-fp4-mi355x-atom: start sweep at conc=1

dfe6cdf

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

functionstackx added the full-sweep-enabled label May 11, 2026

functionstackx mentioned this pull request May 11, 2026

Clean up DSv4 ATOM AITER PR2998 overlay #1260

Closed

1 task

seungrokj and others added 2 commits May 11, 2026 13:41

dsv4-fp4-mi355x-atom: reduce conc-end from 1024 to 256

9873308

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dsv4-fp4-mi355x-atom: set conc-start to 4

6a72802

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dsv4-fp4-mi355x-atom: start sweep at conc=1

e2fc356

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dsv4-fp4-mi355x-atom: remove ATOM_USE_TRITON_MOE=1

45fde59

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dsv4-fp4-mi355x-atom: remove OMP_NUM_THREADS and max-model-len logic

e0d2336

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dsv4-fp4-mi355x-atom: remove --trust-remote-code flags

ea84133

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

seungrokj and others added 2 commits May 13, 2026 18:36

dsv4-fp4-mi355x-atom: update image and expand concurrency search space

815f3ed

- Update atom-dev image to nightly_202605130853
- Expand conc-end from 256 to 512 for isl=1024 and isl=8192

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
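
To make the effect of raising conc-end concrete, here is a minimal sketch of how a start/end range might expand into sweep points. The doubling step and the function name `expand_conc_range` are assumptions for illustration; the real config schema and step rule are not shown in this PR thread.

```python
from typing import List


def expand_conc_range(conc_start: int, conc_end: int) -> List[int]:
    """Expand a concurrency range into sweep points.

    Assumption: points double from conc_start up to and including conc_end.
    """
    if conc_start < 1 or conc_end < conc_start:
        raise ValueError("need 1 <= conc_start <= conc_end")
    points = []
    conc = conc_start
    while conc <= conc_end:
        points.append(conc)
        conc *= 2
    return points


before = expand_conc_range(1, 256)
after = expand_conc_range(1, 512)
print(before)  # [1, 2, 4, 8, 16, 32, 64, 128, 256]
print(after)   # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
```

Under this assumed doubling rule, moving conc-end from 256 to 512 adds a single sweep point per input sequence length.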

Merge branch 'main' into srok/atom_dsv4_fp4

92973d5

chunfangamd approved these changes May 13, 2026

View reviewed changes

Merge branch 'main' into srok/atom_dsv4_fp4

7b06567

Merge branch 'main' into srok/atom_dsv4_fp4

aa14e2f

seungrokj added 2 commits May 14, 2026 09:36

Update perf-changelog.yaml

efc5430

Update perf-changelog.yaml

afa858a

functionstackx approved these changes May 14, 2026

View reviewed changes

seungrokj merged commit b6faacd into main May 14, 2026
17 of 40 checks passed

seungrokj deleted the srok/atom_dsv4_fp4 branch May 14, 2026 00:51

github-project-automation Bot moved this to Done in InferenceMAX Board May 14, 2026
