Offline decode emits raw language tag tokens (`<en-US>`) in transcript text #40

## Bug

The offline transcription path emits raw special tokens like `<en-US>` in the output text:

```sh
parakeet-cli transcribe --model nemotron-3.5-asr-streaming-0.6b-q4_k.gguf --input test.wav --lang en
# → "The sun was setting slowly, casting long shadows across the empty field. <en-US>"
```

The trailing `<en-US>` is a language tag token from the model's tokenizer vocabulary that should be stripped.

## Root cause

`detokenize()` in `src/tokenizer.cpp` concatenates **all** token pieces for the given IDs, including special tokens. It does not filter tokens matching the `<...>` or `[...]` special-token pattern.
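
To make the `<...>` / `[...]` pattern concrete, here is a minimal sketch of such a check. The name `looks_like_special_token` is hypothetical; the real `is_special_token()` in `src/transcription.cpp` is not shown in this issue and may apply different rules.

```cpp
#include <cassert>
#include <string>

// Hypothetical illustration only: treats a piece as special when it is
// wrapped in <...> or [...] and has at least one character in between.
// The project's is_special_token() may differ.
bool looks_like_special_token(const std::string& piece) {
    if (piece.size() < 3) return false;  // a bare "<", "[" or "<>" is not a tag
    const char first = piece.front();
    const char last = piece.back();
    return (first == '<' && last == '>') || (first == '[' && last == ']');
}

// Quick self-check on the token from this report and an ordinary word piece.
void check_examples() {
    assert(looks_like_special_token("<en-US>"));
    assert(!looks_like_special_token("field."));
}
```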

The **streaming** path already handles this correctly — `src/streaming.cpp` filters to `non_special_` tokens before calling `detokenize()`:

```cpp
// streaming.cpp — correct
text_ = detokenize(ml_.config().tokenizer_pieces, non_special_);
```

But the **offline** path in `src/model.cpp` passes all decoded IDs (including special tokens) directly to `detokenize()`:

```cpp
// model.cpp decode_enc_out() — bug: no special-token filtering
return detokenize(loader.tokenizer_pieces(), ids);
```

This affects all four offline decode paths:
- `decode_enc_out()` (line ~67, used by `transcribe_16k`)
- `transcribe_16k_batch()` (line ~175, the TDT/RNNT batch path)
- `decode_enc_out_with_timestamps()` (line ~210, used by `transcribe_16k_with_timestamps`)
- `transcribe_16k_batch_with_timestamps()` (line ~310, the batch timestamped path)

## Fix

Filter special tokens before detokenizing in the offline path. The `is_special_token()` function already exists in `src/transcription.cpp` and could be reused (or moved to a shared header):

```cpp
// In model.cpp, before detokenize():
const auto& pieces = loader.tokenizer_pieces();
std::vector<int32_t> non_special_ids;
non_special_ids.reserve(ids.size());
for (int32_t id : ids) {
    // Skip IDs outside the vocabulary instead of indexing out of bounds.
    if (id < 0 || static_cast<size_t>(id) >= pieces.size()) continue;
    const std::string& piece = pieces[static_cast<size_t>(id)];
    if (!is_special_token(piece))  // reuse from transcription.cpp
        non_special_ids.push_back(id);
}
return detokenize(pieces, non_special_ids);
```
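
Since four call sites need the same filtering, the loop could be factored into one helper instead of being copied four times. The following is a sketch under assumptions, not the project's code: `drop_special_ids` and `demo_filter` are hypothetical names, the piece table is assumed to be a `std::vector<std::string>`, and the predicate is passed in so that the existing `is_special_token()` can be supplied without this sketch guessing its rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical shared helper (not part of parakeet.cpp): returns the IDs whose
// pieces are inside the vocabulary and are not flagged by `is_special`.
template <typename IsSpecial>
std::vector<int32_t> drop_special_ids(const std::vector<std::string>& pieces,
                                      const std::vector<int32_t>& ids,
                                      IsSpecial is_special) {
    std::vector<int32_t> kept;
    kept.reserve(ids.size());
    for (int32_t id : ids) {
        // Out-of-range IDs are dropped rather than indexed.
        if (id < 0 || static_cast<std::size_t>(id) >= pieces.size()) continue;
        if (!is_special(pieces[static_cast<std::size_t>(id)])) kept.push_back(id);
    }
    return kept;
}

// Toy vocabulary and toy predicate, standing in for the real tokenizer data.
inline std::vector<int32_t> demo_filter() {
    const std::vector<std::string> pieces = {"<en-US>", "The", " sun", "."};
    const std::vector<int32_t> ids = {1, 2, 3, 0, 99};  // 0 is the tag, 99 is invalid
    auto toy_is_special = [](const std::string& p) {
        return p.size() > 2 && p.front() == '<' && p.back() == '>';
    };
    std::vector<int32_t> kept = drop_special_ids(pieces, ids, toy_is_special);
    assert(kept.size() == 3);  // the tag and the invalid ID are gone
    return kept;
}
```

Each offline call site would then reduce to something like `detokenize(pieces, drop_special_ids(pieces, ids, is_special_token))`, assuming `is_special_token` is a non-overloaded function taking a `const std::string&` and returning `bool`.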

Alternatively, add special-token filtering directly inside `detokenize()` in `src/tokenizer.cpp`. That would also run on the streaming path, which already pre-filters, so the second pass there would be harmless but redundant.

## Environment

- parakeet.cpp built from master (f469a57)
- Model: `nemotron-3.5-asr-streaming-0.6b` (q4_k GGUF)
- Linux, Vulkan backend (AMD Radeon 880M)
- Reproduced with both `--lang en` and `--lang en-US`
