Int4 quantization produces all-zero embedding tensors (#232)

## Bug Report

**Source**: `tiny-model-ground-truth` parity checker (0/59 passing)
**Severity**: Critical — blocks ALL Int4 inference for LLaMA-style and Qwen architectures

## Description

`apr import --quantize int4` produces APR files in which the embedding tensor (`model.embed_tokens.weight`) is entirely zeros. Inference on these files then fails the `F-DATA-QUALITY-001` density check.

## Affected Models

| Model | Architecture | Vocab | Hidden | Elements | Zero % |
|-------|-------------|-------|--------|----------|--------|
| SmolLM-135M | LLaMA | 49152 | 576 | 28,311,552 | 100% |
| Qwen2-0.5B | Qwen/GQA | 151936 | 896 | 136,134,656 | 100% |
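
The Elements column is consistent with vocab × hidden for both models, which is why the shape metadata can be ruled out as the problem. A minimal sketch of that arithmetic (the `AFFECTED` table and `embedding_elements` helper are illustrative names, not part of `apr`):

```python
# Expected element count of an embedding tensor: vocab_size * hidden_size.
# Values are copied from the Affected Models table above.
AFFECTED = {
    "SmolLM-135M": (49152, 576),
    "Qwen2-0.5B": (151936, 896),
}


def embedding_elements(vocab: int, hidden: int) -> int:
    """Return the number of elements in a [vocab, hidden] embedding tensor."""
    return vocab * hidden


for name, (vocab, hidden) in AFFECTED.items():
    print(f"{name}: {embedding_elements(vocab, hidden):,} elements")
```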

## Error Output

```
[APR-LOAD] Embedding tensor 'model.embed_tokens.weight': dims=[49152, 576], expected [vocab=49152, hidden=576]
[APR-LOAD] Embedding dims=[49152, 576], using raw data (no transpose needed)
[APR-LOAD] WARNING: Token 0 embedding is all zeros - possible load failure
[APR-LOAD] Token 0 embedding sample: [0.0000, 0.0000, 0.0000, 0.0000, 0.0000]
[APR-LOAD] Embedding loaded: 28311552 elements (vocab=49152 x hidden=576)

error: Inference failed: Format error: [F-DATA-QUALITY-001] Tensor 'token_embedding': DENSITY FAILURE: 100.0% zeros (max 50%). Data likely loaded from wrong offset!
```

Note: unlike Int8, which reports 7,077,889 elements (see issue #231), the Int4 element count (28,311,552) is correct. The shape is right, but the data is all zeros.
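
For reference, the density rule that trips here can be approximated as follows. This is a sketch only: the 50% threshold is taken from the error message above, while the function names and the exact comparison (`<=` rather than `<`) are assumptions, not the actual `apr` implementation.

```python
from typing import Sequence

# "max 50%" as reported in the F-DATA-QUALITY-001 message above.
MAX_ZERO_FRACTION = 0.5


def zero_fraction(values: Sequence[float]) -> float:
    """Fraction of elements that are exactly zero."""
    if not values:
        raise ValueError("cannot compute density of an empty tensor")
    return sum(1 for v in values if v == 0.0) / len(values)


def passes_density_check(
    values: Sequence[float], max_zero_fraction: float = MAX_ZERO_FRACTION
) -> bool:
    """Hypothetical check: pass when at most max_zero_fraction of values are zero."""
    return zero_fraction(values) <= max_zero_fraction


# The token 0 sample from the log above is all zeros, so it fails the check.
token0_sample = [0.0, 0.0, 0.0, 0.0, 0.0]
print(zero_fraction(token0_sample), passes_density_check(token0_sample))
```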

## Root Cause Hypothesis

Int4 quantization either writes the embedding tensor data at the wrong offset or fails to write it at all. The metadata (shape, dims) is correct, but the data the loader reads back is all zeros. This may be a different manifestation of the same import pipeline bug as #231: Int8 gets the wrong element count plus corrupt data, while Int4 gets the right count plus zero data.
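
To illustrate why an unwritten or misplaced payload would produce exactly this symptom, here is a sketch of a generic symmetric 4-bit round trip. It assumes per-tensor scaling and two values packed per byte; this is not the APR on-disk format, and `quantize_int4` / `dequantize_int4` are hypothetical helpers. A correctly written payload round-trips to non-zero values, whereas a zero-filled payload of the same size dequantizes to all zeros regardless of the scale.

```python
from typing import List, Sequence, Tuple


def quantize_int4(values: Sequence[float]) -> Tuple[bytes, float]:
    """Symmetric per-tensor quantization to signed 4-bit integers in [-8, 7]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(v / scale))) for v in values]
    if len(q) % 2:
        q.append(0)  # pad to an even count so values pair up into bytes
    # Low nibble holds the even-indexed value, high nibble the odd-indexed one.
    packed = bytes(((hi & 0xF) << 4) | (lo & 0xF) for lo, hi in zip(q[0::2], q[1::2]))
    return packed, scale


def dequantize_int4(packed: bytes, scale: float, count: int) -> List[float]:
    """Unpack two signed 4-bit values per byte and rescale to floats."""
    out: List[float] = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            signed = nibble - 16 if nibble >= 8 else nibble
            out.append(signed * scale)
    return out[:count]


weights = [0.5, -0.25, 0.125, -1.0]
packed, scale = quantize_int4(weights)

# Correctly written payload: values survive the round trip as non-zero.
restored = dequantize_int4(packed, scale, len(weights))

# Payload never written (or read from a zero-filled region): all zeros.
missing = dequantize_int4(bytes(len(packed)), scale, len(weights))

print(restored)
print(missing)
```

If the real pipeline behaves similarly, comparing the raw bytes at the embedding tensor's recorded offset against a freshly quantized copy would help distinguish "never written" from "written elsewhere".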

## Reproduction

```bash
cd tiny-model-ground-truth
apr pull hf://HuggingFaceTB/SmolLM-135M
apr import hf://HuggingFaceTB/SmolLM-135M --quantize int4 -o models/smollm-135m-int4.apr
apr run models/smollm-135m-int4.apr -p "Hello" -n 32 --json
# → F-DATA-QUALITY-001 density failure
```

## Environment

- `apr` v0.2.16 (f39b7dfd)
- Oracle: transformers 5.1.0, torch 2.10.0, float32, CPU, greedy
- Platform: Linux x86_64

## Contract Reference

- `contracts/tensor-layout-v1.yaml` rule F-DATA-QUALITY-001
