Mixed/Corrupted Language Transcriptions #3688

@hugoaap-code

🐞 Bug Report: Mixed/Corrupted Language Transcriptions

🚀 Description

  • Although Portuguese is set as the main language, the transcription frequently contains random or "weird" characters and words from other languages that were not spoken (Hindi, Russian, English, etc.).
  • This corruption occurs consistently in nearly every conversation recorded using the L. Pendant with the Omi app.

✅ Expected Behavior

  • When a single language (e.g., Portuguese) is explicitly set, the transcription should exclusively use the characters and vocabulary of that language.
  • Alternatively, if multi-language transcription is the default, the user should have a clear option to select "Single Language Only" mode to prevent this corruption.
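A minimal sketch of how the first expectation could be checked on a finished transcript, assuming Portuguese output should contain only Latin-script letters. The function names and the `EXTRA_ALLOWED` set are hypothetical and not part of the Omi codebase. Note the limitation: this only catches foreign scripts (e.g. Devanagari, Cyrillic); stray English words use the same script and would need a separate language-identification step.

```python
import unicodedata

# Portuguese ordinal indicators count as letters but are not named "LATIN ...".
EXTRA_ALLOWED = {"ª", "º"}


def foreign_script_chars(text, allowed_script="LATIN"):
    """Return the letters in `text` whose Unicode name does not start with `allowed_script`."""
    return [
        ch
        for ch in text
        if ch.isalpha()
        and ch not in EXTRA_ALLOWED
        and not unicodedata.name(ch, "").startswith(allowed_script)
    ]


def is_single_script(text, allowed_script="LATIN"):
    """True when every letter in `text` belongs to the allowed script."""
    return not foreign_script_chars(text, allowed_script)
```

A check like this could flag a segment such as `"Olá привет"` for review or re-transcription, while leaving `"Não, obrigado."` untouched.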

🛑 Actual Behavior

  • The transcription output is often garbled with foreign characters or words, even when only the designated language is spoken.

💥 Impact

  • Unreadable Transcriptions: The mixed characters/words make the text difficult or impossible to read and use.
  • Data Integrity Loss: The transcribed record is corrupted and unreliable.

📝 Environment/Context (Provided by User)

  • Device: Samsung S25 Ultra (Android 16, One UI 8.0).
  • Setup: Omi app set to never sleep/always run in the background; phone not in battery saving mode; device started, connected, recording initiated, and screen turned off; full battery, not charging.

💡 Suggested Solution

  • Provide a user setting to choose between single-language and multi-language transcription, rather than defaulting to multi-language mode.
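One way the suggested setting could be modeled, as a rough sketch only. `LanguageMode`, `TranscriptionSettings`, `build_stt_options`, and the option keys (`language`, `detect_language`, `language_hint`) are all hypothetical names chosen for illustration; they do not correspond to the Omi app's actual settings or to any specific speech-to-text provider's API.

```python
from dataclasses import dataclass
from enum import Enum


class LanguageMode(Enum):
    SINGLE = "single"  # transcribe strictly in the primary language
    MULTI = "multi"    # allow automatic language switching


@dataclass(frozen=True)
class TranscriptionSettings:
    primary_language: str = "pt"
    mode: LanguageMode = LanguageMode.SINGLE


def build_stt_options(settings):
    """Translate the user setting into (hypothetical) speech-to-text request options."""
    if settings.mode is LanguageMode.SINGLE:
        # Pin the language and disable detection so other languages are never emitted.
        return {"language": settings.primary_language, "detect_language": False}
    # Multi-language mode: let the engine detect, using the primary language as a hint.
    return {
        "language": None,
        "detect_language": True,
        "language_hint": settings.primary_language,
    }
```

Defaulting `mode` to `SINGLE` here reflects the request in this issue; whether that is the right default for all users is a product decision.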

Metadata

Labels

  • maintainer (Lane: High-risk, cross-system changes)
  • p1 (Priority: Critical, score 22-29)
  • understand (Layer: Speech-to-text, language detection)

Projects

Status

Done
