feat(rosetta): Architecture::FalconClassic variant + falcon family (closes #1587) by noahgift · Pull Request #1673 · paiml/aprender

noahgift · 2026-05-14T14:58:48Z

Summary

Closes #1587. Adds `Architecture::FalconClassic` variant + Falcon-specific tensor mapper + `contracts/model-families/falcon.yaml`.

Why FalconClassic needs its own variant

Falcon (TII; `FalconForCausalLM`) is distinct from both:

- FalconH1 (hybrid Transformer+SSM, uses `Architecture::FalconH1` mapper)
- BLOOM (different prefix, different position encoding)

Falcon-specific traits:

- HF prefix `transformer.h.N.*` (BLOOM: `h.N.*`; LLaMA: `model.layers.N.*`)
- RoPE position encoding (BLOOM uses ALiBi)
- Fused QKV with MQA (7B: 1 K/V head) or MGQA (40B/11B: 8 K/V groups)
- Two layernorm layouts: 7B has a single per-block layernorm; 40B has separate `ln_attn` + `ln_mlp` (parallel residual design)
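
To make the fused-QKV trait concrete, here is a minimal sketch of how the row count of a fused `query_key_value` weight follows from the head layout. The helper names are hypothetical, and the head counts in the comments (71 query heads for 7B, 128 query heads for 40B, head dimension 64) are assumptions taken from the public Falcon configs, not from this PR.

```rust
/// Rows of a fused QKV projection: every query head, plus one K head and
/// one V head per K/V group. Hypothetical helper, not part of aprender.
fn fused_qkv_rows(n_heads: usize, n_kv_heads: usize, head_dim: usize) -> usize {
    (n_heads + 2 * n_kv_heads) * head_dim
}

/// Number of query heads that share each K/V group.
/// Returns None when the heads do not divide evenly into groups.
fn queries_per_kv_group(n_heads: usize, n_kv_heads: usize) -> Option<usize> {
    if n_kv_heads == 0 || n_heads % n_kv_heads != 0 {
        None
    } else {
        Some(n_heads / n_kv_heads)
    }
}

// Assumed shapes (not stated in the PR):
//   MQA,  7B-like:  71 query heads, 1 K/V head,   head_dim 64
//   MGQA, 40B-like: 128 query heads, 8 K/V groups, head_dim 64
```

With MQA the fused tensor is only two heads taller than the query projection alone, which is why a splitter that assumes three equal Q/K/V slices would cut it in the wrong places.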

Engine changes

- `converter_types.rs::Architecture`: adds the `FalconClassic` variant
- `tensor_expectation.rs::map_name`: adds dispatch for `FalconClassic`
- `tensor_expectation.rs::is_llm`, `display_name`, `from_model_type`: updated for the new variant
- `tensor_expectation.rs::falcon_classic_map_name`: NEW (95 LOC); handles both the 7B (single `input_layernorm`) and 40B (`ln_attn` + `ln_mlp`) variants
- Coverage test `test_from_model_type_unknown_gh219` updated: it previously asserted that `from_model_type("falcon")` returns `None`; it now expects `Some(FalconClassic)`
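
The name translation can be pictured with the sketch below. It is not the code from `tensor_expectation.rs`: the function name `falcon_classic_map_name_sketch`, its `Option<String>` signature, and the rule table are assumptions for illustration, built only from the source and target name pairs listed in the commit message.

```rust
/// Hypothetical sketch of a Falcon tensor-name mapper.
/// Returns None for names that match no known pattern.
fn falcon_classic_map_name_sketch(hf_name: &str) -> Option<String> {
    // Top-level tensors outside the per-layer blocks.
    if hf_name == "transformer.word_embeddings.weight" {
        return Some("model.embed_tokens.weight".to_string());
    }
    if let Some(tail) = hf_name.strip_prefix("transformer.ln_f.") {
        return Some(format!("model.norm.{}", tail));
    }

    // Per-layer tensors have the form transformer.h.N.<suffix>.
    let rest = hf_name.strip_prefix("transformer.h.")?;
    let (layer, suffix) = rest.split_once('.')?;
    let layer: usize = layer.parse().ok()?;

    // (Falcon prefix, target prefix). `input_layernorm` is the 7B layout;
    // `ln_attn` and `ln_mlp` are the 40B layout.
    const RULES: [(&str, &str); 7] = [
        ("input_layernorm.", "input_layernorm."),
        ("ln_attn.", "input_layernorm."),
        ("ln_mlp.", "post_attention_layernorm."),
        ("self_attention.query_key_value.", "self_attn.qkv_proj."),
        ("self_attention.dense.", "self_attn.o_proj."),
        ("mlp.dense_h_to_4h.", "mlp.up_proj."),
        ("mlp.dense_4h_to_h.", "mlp.down_proj."),
    ];
    for (from, to) in RULES {
        if let Some(tail) = suffix.strip_prefix(from) {
            return Some(format!("model.layers.{}.{}{}", layer, to, tail));
        }
    }
    None
}
```

Because `input_layernorm` and `ln_attn` map to the same target, one function can serve both layouts without knowing the model size in advance.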

YAML

`contracts/model-families/falcon.yaml` covers 7B (MQA, RoPE θ=10000), 11B (MGQA, RoPE θ=500000), 40B (MGQA, RoPE θ=10000). All share 65024-token vocab.
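
As a rough illustration of what the contract records per size, the sketch below restates the values above as Rust data and checks the shared-vocab property. The schema of `falcon.yaml` is not shown in this PR, so the type and field names (`FalconSize`, `Attention`, `rope_theta`, `vocab_size`) are hypothetical, and the check is only in the spirit of the `vocab_consistency` test, not a copy of it.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Attention {
    Mqa,  // multi-query: one shared K/V head
    Mgqa, // multi-group query: several K/V groups
}

#[derive(Debug, Clone, Copy)]
struct FalconSize {
    name: &'static str,
    attention: Attention,
    rope_theta: f64,
    vocab_size: usize,
}

// Values as stated in the PR description.
const FALCON_SIZES: [FalconSize; 3] = [
    FalconSize { name: "7B", attention: Attention::Mqa, rope_theta: 10_000.0, vocab_size: 65_024 },
    FalconSize { name: "11B", attention: Attention::Mgqa, rope_theta: 500_000.0, vocab_size: 65_024 },
    FalconSize { name: "40B", attention: Attention::Mgqa, rope_theta: 10_000.0, vocab_size: 65_024 },
];

/// Looks up a size entry by its name.
fn find_size(name: &str) -> Option<FalconSize> {
    FALCON_SIZES.iter().copied().find(|s| s.name == name)
}

/// Returns the vocab size if every entry agrees on it, otherwise None.
fn shared_vocab(sizes: &[FalconSize]) -> Option<usize> {
    let first = sizes.first()?.vocab_size;
    sizes.iter().all(|s| s.vocab_size == first).then_some(first)
}
```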

Test plan

- `pv validate` clean
- FALSIFY-PARITY-002 `test_every_model_family_yaml_has_architecture` passes
- FALSIFY-MF-006 `no_duplicate_architecture_classes` passes
- FALSIFY-MF-011 `vocab_consistency` passes
- All 13764 aprender-core --lib tests pass
- CI: workspace-test

Out of scope

- Parallel attn+mlp residual runtime: `is_inference_verified()` returns false for FalconClassic; the engine has no parallel-residual code path
- MQA/MGQA-aware QKV splitter at the conversion layer
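
For readers unfamiliar with the term, the sketch below contrasts a parallel residual block with the usual sequential one, which is why a sequential-only runtime cannot run these weights faithfully. It is a toy under stated assumptions: the block formulas follow the publicly documented Falcon design as I understand it, and the function names and the stand-in sublayers (`identity`, `double`, `plus_one`) are hypothetical and are not aprender APIs.

```rust
/// Parallel block (40B-style layout): attention and MLP both read the
/// block input through their own norm, and both outputs are added to
/// the same residual: y = x + attn(ln_attn(x)) + mlp(ln_mlp(x)).
fn parallel_block(
    x: &[f32],
    ln_attn: impl Fn(&[f32]) -> Vec<f32>,
    ln_mlp: impl Fn(&[f32]) -> Vec<f32>,
    attn: impl Fn(&[f32]) -> Vec<f32>,
    mlp: impl Fn(&[f32]) -> Vec<f32>,
) -> Vec<f32> {
    let a = attn(&ln_attn(x));
    let m = mlp(&ln_mlp(x));
    x.iter().zip(&a).zip(&m).map(|((x, a), m)| x + a + m).collect()
}

/// Sequential block (LLaMA-style), for contrast: the MLP sees the
/// output of the attention residual: h = x + attn(ln1(x)); y = h + mlp(ln2(h)).
fn sequential_block(
    x: &[f32],
    ln1: impl Fn(&[f32]) -> Vec<f32>,
    ln2: impl Fn(&[f32]) -> Vec<f32>,
    attn: impl Fn(&[f32]) -> Vec<f32>,
    mlp: impl Fn(&[f32]) -> Vec<f32>,
) -> Vec<f32> {
    let a = attn(&ln1(x));
    let h: Vec<f32> = x.iter().zip(&a).map(|(x, a)| x + a).collect();
    let m = mlp(&ln2(&h));
    h.iter().zip(&m).map(|(h, m)| h + m).collect()
}

// Toy stand-ins for real sublayers, so the two blocks can be compared.
fn identity(v: &[f32]) -> Vec<f32> { v.to_vec() }
fn double(v: &[f32]) -> Vec<f32> { v.iter().map(|x| x * 2.0).collect() }
fn plus_one(v: &[f32]) -> Vec<f32> { v.iter().map(|x| x + 1.0).collect() }
```

Even with identical sublayers the two blocks give different outputs, so mapping `ln_mlp` onto `post_attention_layernorm` makes the tensors loadable but does not by itself make inference correct.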

🤖 Generated with Claude Code

…loses #1587)

Falcon classic (TII; FalconForCausalLM, 7B/40B/11B/RW variants) is distinct from FalconH1 (hybrid Transformer+SSM) and from BLOOM:

- HF prefix is `transformer.h.N.*` (BLOOM uses `h.N.*`; LLaMA uses `model.layers.N.*`)
- RoPE position encoding (not ALiBi like BLOOM)
- Fused QKV (`self_attention.query_key_value`) with MQA/MGQA layout
- Falcon-7B uses single per-block layernorm
- Falcon-40B uses separate `ln_attn` + `ln_mlp` (parallel attn+mlp residuals)

Adds `Architecture::FalconClassic` variant + `falcon_classic_map_name` (95 LOC) that translates both 7B and 40B layernorm variants:

  transformer.word_embeddings.weight → model.embed_tokens.weight
  transformer.h.N.input_layernorm.* → model.layers.N.input_layernorm.* (7B)
  transformer.h.N.ln_attn.* → model.layers.N.input_layernorm.* (40B)
  transformer.h.N.ln_mlp.* → model.layers.N.post_attention_layernorm.* (40B)
  transformer.h.N.self_attention.query_key_value.* → model.layers.N.self_attn.qkv_proj.* (fused)
  transformer.h.N.self_attention.dense.* → model.layers.N.self_attn.o_proj.*
  transformer.h.N.mlp.dense_h_to_4h.* → model.layers.N.mlp.up_proj.*
  transformer.h.N.mlp.dense_4h_to_h.* → model.layers.N.mlp.down_proj.*
  transformer.ln_f.* → model.norm.*

YAML at contracts/model-families/falcon.yaml covers 7B (MQA), 11B (MGQA, RoPE θ=500000), and 40B (MGQA, RoPE θ=10000) sizes.

Coverage test updated: `test_from_model_type_unknown_gh219` previously asserted `from_model_type("falcon") == None`. Updated to expect `Some(FalconClassic)` post-#1587.

Verified:

- pv validate clean
- FALSIFY-PARITY-002, FALSIFY-MF-006, FALSIFY-MF-011 pass
- All 13764 aprender-core --lib tests pass

Out of scope (separate tickets):

- Parallel attn+mlp residual runtime support
- MQA/MGQA-aware QKV splitter at conversion layer

noahgift enabled auto-merge (squash) May 14, 2026 14:58

noahgift merged commit 41a5914 into main May 14, 2026
11 checks passed

noahgift deleted the fix/1587-falcon-classic-variant branch May 14, 2026 15:21

noahgift mentioned this pull request May 15, 2026

feat(rosetta): Architecture::Bloom variant + BLOOM model-family contract (closes #1586) #1694

Open

noahgift merged 1 commit into main from fix/1587-falcon-classic-variant
