Conversation
Fix Metal col2im_1d: use 256 threads/group instead of 1 thread/group. Revert conv_transpose_1d bounded loop (8c70db8, e0e36f3) and im2col gridDim.y fix (b65bf45): not used by the project, reduce upstream diff. Rename CPU helpers ggml_load_f32/ggml_store_f32 to snake_load/snake_store
ace-qwen3: disables flash_attn_ext in prefill and batched decode, falls back to F32 manual attention. dit-vae: disables flash_attn_ext in TextEncoder, CondEncoder, Detokenizer and DiT. qwen3_attn_f32() fallback added in qwen3-enc.h, reused by qwen3-lm.h prefill/decode and dit-graph.h self/cross attention. DiT already had its own fallback: F16 accumulation drifts audibly over 24 layers x 8 iterative Euler steps on CPU
Drop manual CPU-side mmap dequant and gallocr in favor of standard ggml_get_rows with backend scheduler fallback. No functional change
… fork features (LoRA, cover mode, reference audio, VAE encoder)
…additions Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Resolve merge conflicts to resync with upstream changes" to "Resolve merge conflicts: resync with upstream while preserving fork features" on Mar 1, 2026.
PR #5 (ServeurpersoCom → audiohacking) was in a dirty merge state because the fork had diverged with its own additions (LoRA, cover/repaint mode, VAE encoder, reference audio). Blindly adopting upstream would have severed those features, so each of the 16 conflicts was resolved by merging both sides' intent.
Brought in from upstream
- `use_flash_attn: bool` added to `CondGGML`, `DetokGGML`, `Qwen3GGML`, `Qwen3LM`; `qwen3_attn_f32()` pure-F32 fallback added to `qwen3-enc.h`; threaded through all `qwen3_build_layer`/`qw3lm_build_attn` call chains
- `--no-fa` CLI flag: `ace-qwen3` and `dit-vae` both accept `--no-fa` to disable flash attention at runtime
- `qwen3_embed_lookup()` via `ggml_get_rows`, replacing the old CPU mmap dequant approach (`qwen3_cpu_embed_lookup`); `qw3lm_forward`/`qw3lm_forward_batch` switch to `ggml_backend_sched_alloc_graph` instead of `ggml_gallocr`; `galloc`, `gf_mmap`, `embed_mmap_data`, `embed_type` fields removed from `Qwen3LM` (`55e062ab`)
- `tests/debug-dit-cossim.sh` rewritten to build and test CUDA / Vulkan / CPU in sequence

Preserved from fork

- `DiTGGMLLayer` adapter tensors, `dit_ggml_linear_lora()`, `dit_ggml_load_lora()`, `lora_wctx`/`lora_scale` in `DiTGGML`, LoRA CLI args in `dit-vae`, `dit-lora.cpp` in CMakeLists
- `task_type`, `reference_audio`, `src_audio`, `audio_cover_strength`, `repainting_start`/`end` in `AceRequest`; `custom_tag`/`genre` for LoRA trigger words
- `VAEEncoderGGML` + `vae_encoder_load()` for reference audio timbre encoding
- `detok_ggml_build_codeword_table()` + `latent_frames_to_codes()` kept in `fsq-detok.h` (used by file-based cover mode)
- `.gitignore`: the `!tests/fixtures/` exception preserved so fixture JSON files remain tracked
- `audio_loader.cpp` and `src/dit-lora.cpp` remain in the build

Example: flash attention now toggleable per model