'Voice Conversion' paper candidate 2603.27001 #820

Please check whether this paper is about 'Voice Conversion' or not.
## article info.
- title: **PHONOS: PHOnetic Neutralization for Online Streaming Applications**
- summary: Speaker anonymization (SA) systems modify timbre while leaving regional or non-native accents intact, which is problematic because accents can narrow the anonymity set. To address this issue, we present PHONOS, a streaming module for real-time SA that neutralizes non-native accent to sound native-like. Our approach pre-generates golden speaker utterances that preserve source timbre and rhythm but replace foreign segmentals with native ones using silence-aware DTW alignment and zero-shot voice conversion. These utterances supervise a causal accent translator that maps non-native content tokens to native equivalents with at most 40 ms look-ahead, trained using joint cross-entropy and CTC losses. Our evaluations show an 81% reduction in non-native accent confidence, with listening-test ratings consistent with this shift, and reduced speaker linkability as accent-neutralized utterances move away from the original speaker in embedding space, while keeping latency under 241 ms on a single GPU.
- id: http://arxiv.org/abs/2603.27001v1
## judge
Write [vclab::confirmed] or [vclab::excluded] in a comment.
