Bug: AssertionError in extend mode - Shape mismatch between target_latents and x0 #374

## Bug Report: AssertionError in Extend Mode - Shape Mismatch

### Description
The pipeline crashes with an `AssertionError` when **extend mode** is used in the Upload tab with Text2Music Parameters. The `target_latents` and `x0` tensors differ in their last dimension (1292 vs. 1528), so the shape assertion in `text2music_diffusion_process` fails.

### Error Message
```
AssertionError: target_latents.shape=torch.Size([1, 8, 16, 1292]) x0.shape=torch.Size([1, 8, 16, 1528])
```

### Full Stack Trace
```
2026-01-23 17:37:35.863 | INFO     | acestep.pipeline_ace_step:text2music_diffusion_process:847 - cfg_type: apg, guidance_scale: 15, omega_scale: 10

Traceback (most recent call last):
  File "C:\Users\Jack\source\ACE-Step\venv\Lib\site-packages\gradio\queueing.py", line 766, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jack\source\ACE-Step\venv\Lib\site-packages\gradio\route_utils.py", line 355, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jack\source\ACE-Step\venv\Lib\site-packages\gradio\blocks.py", line 2152, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jack\source\ACE-Step\venv\Lib\site-packages\gradio\blocks.py", line 1629, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 fn, *processed_input, limiter=self.limiter
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 )
  File "C:\Users\Jack\source\ACE-Step\venv\Lib\site-packages\anyio\to_thread.py", line 63, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
           func, args, abandon_on_cancel=abandon_on_cancel, limiter=limiter
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
           )
  File "C:\Users\Jack\source\ACE-Step\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 2502, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "C:\Users\Jack\source\ACE-Step\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 986, in run
    result = context.run(func, *args)
  File "C:\Users\Jack\source\ACE-Step\venv\Lib\site-packages\gradio\utils.py", line 1036, in wrapper
    response = f(*args, **kwargs)
  File "C:\Users\Jack\source\ACE-Step\acestep\ui\components.py", line 777, in extend_process_func
    return text2music_process_func(
           format.value,
           ...
           )
  File "C:\Users\Jack\source\ACE-Step\acestep\pipeline_ace_step.py", line 1627, in __call__
    target_latents = self.text2music_diffusion_process(
                     duration=audio_duration,
                     ...
                     ref_latents=ref_latents,
                     )
  File "C:\Users\Jack\source\ACE-Step\acestep\cpu_offload.py", line 40, in wrapper
    return func(self, *args, **kwargs)
  File "C:\Users\Jack\source\ACE-Step\venv\Lib\site-packages\torch\utils\_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Jack\source\ACE-Step\acestep\pipeline_ace_step.py", line 1050, in text2music_diffusion_process
    target_latents.shape[-1] == x0.shape[-1]
AssertionError: target_latents.shape=torch.Size([1, 8, 16, 1292]) x0.shape=torch.Size([1, 8, 16, 1528])
```

### Location
- **File:** `acestep/pipeline_ace_step.py`
- **Line:** 1050 (in `text2music_diffusion_process` method)
- **Call chain:** `extend_process_func` → `text2music_process_func` (the pipeline's `__call__`) → `text2music_diffusion_process`

### Steps to Reproduce
1. Launch the ACE-Step GUI.
2. Generate audio in the Text2Music tab.
3. Go to the **Upload** tab under Text2Music.
4. Upload the generated audio.
5. Try to extend the audio (left or right).
6. The pipeline crashes with the `AssertionError` shown above.

### Root Cause
After the padding/trimming operations in extend mode, the concatenated `target_latents` tensor no longer has the same last-dimension length as `x0` (1292 vs. 1528 in the trace above, a gap of 236 frames). Likely contributors:
- Rounding errors in the `frame_length` calculations
- Trimming when the length exceeds `max_infer_fame_length` (240 seconds)
- Concatenation of tensors from different sources
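
To gauge which of these causes is plausible, it helps to convert the gap from frames to seconds. The sketch below uses the two shapes from the error message. The frame rate of `44100 / 512 / 8` latent frames per second is an assumption that this report does not state, so treat the resulting durations as rough estimates.

```python
# Last-dimension sizes copied from the AssertionError in this report.
target_frames = 1292
x0_frames = 1528

# ASSUMPTION (not stated in this report): latents run at
# 44100 / 512 / 8 frames per second of audio.
FRAMES_PER_SECOND = 44100 / 512 / 8

gap_frames = x0_frames - target_frames
gap_seconds = gap_frames / FRAMES_PER_SECOND

print(f"gap: {gap_frames} frames, about {gap_seconds:.1f} s")
print(f"target_latents: about {target_frames / FRAMES_PER_SECOND:.1f} s")
print(f"x0: about {x0_frames / FRAMES_PER_SECOND:.1f} s")
```

The gap is 236 frames, which is roughly 22 seconds under the assumed frame rate. That seems too large to come from rounding alone, so the trimming and concatenation paths are probably the more useful places to look first.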

### Expected Behavior
The pipeline should handle shape mismatches gracefully by padding or trimming to ensure tensor compatibility.

### Actual Behavior
The pipeline crashes with an `AssertionError`, so no audio is generated in extend mode.

### Environment
- **OS:** Windows 11
- **Python:** 3.11+
- **ACE-Step Version:** Latest (main branch)
- **Mode:** Extend (Upload tab)

### Severity
🔴 **CRITICAL** - Extend mode is completely broken without a fix.

### Proposed Solution
Add automatic shape alignment before the assertion:
```python
# Fix shape mismatch between target_latents and x0
if target_latents.shape[-1] != x0.shape[-1]:
    if target_latents.shape[-1] < x0.shape[-1]:
        # Pad with zeros if target_latents is shorter
        padding = x0.shape[-1] - target_latents.shape[-1]
        target_latents = torch.nn.functional.pad(
            target_latents, (0, padding), "constant", 0
        )
    else:
        # Trim if target_latents is longer
        target_latents = target_latents[..., :x0.shape[-1]]
```
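
The same alignment can be tried outside the pipeline. The following is a minimal standalone sketch, not the code from the PR: `align_last_dim` is a hypothetical helper name, and the tensors are dummies built with the shapes from the error message.

```python
import torch
import torch.nn.functional as F


def align_last_dim(target_latents: torch.Tensor, x0: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: pad or trim target_latents so that its
    last dimension has the same length as x0's last dimension."""
    diff = x0.shape[-1] - target_latents.shape[-1]
    if diff > 0:
        # Too short: right-pad the last axis with zeros.
        return F.pad(target_latents, (0, diff), mode="constant", value=0.0)
    if diff < 0:
        # Too long: drop the trailing frames.
        return target_latents[..., : x0.shape[-1]]
    return target_latents


# Dummy tensors with the shapes from the error message.
short = torch.ones(1, 8, 16, 1292)
x0 = torch.zeros(1, 8, 16, 1528)

aligned = align_last_dim(short, x0)
padded_tail_is_zero = bool((aligned[..., 1292:] == 0).all())
print(aligned.shape, padded_tail_is_zero)
```

This always pads or trims at the end of the last axis. Whether that is the right side for a left extension is not established by this report, and zero-padding only makes the shapes agree; it does not explain where the missing frames went.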

### Related PR
A fix for this issue has been submitted in PR #373.

