fix: detect Tencent SILK (\x02 prefix) in audio magic bytes to avoid ffmpeg failure#8009
fix: detect Tencent SILK (\x02 prefix) in audio magic bytes to avoid ffmpeg failure#8009Tom8266 wants to merge 2 commits intoAstrBotDevs:masterfrom
Conversation
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- In
ensure_wav, the temp output path creation for SILK appears to duplicate logic likely already handled byconvert_audio_to_wavfor aNoneoutput_path; consider centralizing temp path creation to a single place to avoid divergence in behavior. - In
_get_audio_magic_type, the SILK detection logic could be made clearer and less error-prone by usingheader.startswith(b"#!SILK_V3")andheader.startswith(b"\x02#!SILK_V3")instead of explicit slice indices.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `ensure_wav`, the temp output path creation for SILK appears to duplicate logic likely already handled by `convert_audio_to_wav` for a `None` `output_path`; consider centralizing temp path creation to a single place to avoid divergence in behavior.
- In `_get_audio_magic_type`, the SILK detection logic could be made clearer and less error-prone by using `header.startswith(b"#!SILK_V3")` and `header.startswith(b"\x02#!SILK_V3")` instead of explicit slice indices.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request adds support for processing Tencent SILK audio files by updating the magic byte detection logic and integrating a conversion helper. The review suggests refactoring the temporary file path generation to reduce code duplication and improve consistency using pathlib, as well as adopting the more idiomatic startswith method with a tuple for checking file headers.
| if audio_type == "silk": | ||
| if output_path is None: | ||
| temp_dir = get_astrbot_temp_path() | ||
| os.makedirs(temp_dir, exist_ok=True) | ||
| output_path = os.path.join(temp_dir, f"media_audio_{uuid.uuid4().hex}.wav") | ||
| return await tencent_silk_to_wav(audio_path, output_path) | ||
|
|
||
| return await convert_audio_to_wav(audio_path, output_path) |
There was a problem hiding this comment.
The logic for generating a temporary output path is duplicated here and in convert_audio_format. Using pathlib.Path for consistency and refactoring the path generation to the top of the conversion logic makes the function cleaner and ensures output_path is available for both silk and other formats, adhering to the rule of avoiding code duplication. Furthermore, please ensure this new attachment handling functionality is covered by unit tests.
| if audio_type == "silk": | |
| if output_path is None: | |
| temp_dir = get_astrbot_temp_path() | |
| os.makedirs(temp_dir, exist_ok=True) | |
| output_path = os.path.join(temp_dir, f"media_audio_{uuid.uuid4().hex}.wav") | |
| return await tencent_silk_to_wav(audio_path, output_path) | |
| return await convert_audio_to_wav(audio_path, output_path) | |
| if output_path is None: | |
| temp_dir = Path(get_astrbot_temp_path()) | |
| temp_dir.mkdir(parents=True, exist_ok=True) | |
| output_path = str(temp_dir / f"media_audio_{uuid.uuid4().hex}.wav") | |
| if audio_type == "silk": | |
| return await tencent_silk_to_wav(audio_path, output_path) | |
| return await convert_audio_to_wav(audio_path, output_path) |
References
- When implementing similar functionality for different cases (e.g., direct vs. quoted attachments), refactor the logic into a shared helper function to avoid code duplication.
- New functionality, such as handling attachments, should be accompanied by corresponding unit tests.
| if header[:9] == b"#!SILK_V3": | ||
| return "silk" | ||
|
|
||
| # Tencent SILK: leading \x02 byte before #!SILK_V3 | ||
| if header[:1] == b"\x02" and header[1:10] == b"#!SILK_V3": | ||
| return "silk" |
There was a problem hiding this comment.
Using startswith with a tuple of prefixes is more idiomatic in Python (PEP 8) and less error-prone than manual slicing for magic byte detection, especially when dealing with prefixes of different lengths.
| if header[:9] == b"#!SILK_V3": | |
| return "silk" | |
| # Tencent SILK: leading \x02 byte before #!SILK_V3 | |
| if header[:1] == b"\x02" and header[1:10] == b"#!SILK_V3": | |
| return "silk" | |
| if header.startswith((b"#!SILK_V3", b"\x02#!SILK_V3")): | |
| return "silk" |
References
- Using startswith with a tuple of prefixes is more idiomatic in Python (PEP 8) for checking multiple possible prefixes. (link)
…ffmpeg failure QQ official bot sends voice in Tencent SILK format (leading \x02 byte before #!SILK_V3 magic). _get_audio_magic_type() had two off-by-one slice errors: 1. Standard SILK: header[:8] vs b'#!SILK_V3' (8 != 9 bytes) — never matched 2. Tencent SILK: not detected at all Fixes: - Standard SILK: header[:9] == b'#!SILK_V3' (correct 9-byte slice) - Tencent SILK: header[:1] == b"\x02" and header[1:10] == b'#!SILK_V3' - ensure_wav() routes detected silk to tencent_silk_to_wav() Before: QQ voice → ffmpeg → 'Invalid data found' After: QQ voice → magic detects silk → tencent_silk_to_wav → WAV OK
Replace manual slice comparisons with startswith() — cleaner, less error-prone, and immune to off-by-one slice errors. Suggested by: sourcery-ai
Root Cause
QQ official bot sends voice messages in Tencent SILK format (leading
\x02byte before#!SILK_V3magic)._get_audio_magic_type()had two off-by-one slice errors that prevented SILK detection:header[:8] == b"#!SILK_V3"—#!SILK_V3is 9 bytes, so this never matched\x02#!SILK_V3— not detected at allFix
tencent_silk_to_wavheader[:9](was[:8])header[1:10](new detection)ensure_wav()routessilktype →tencent_silk_to_wav()Relationship to #7832
#7832 adds silk/amr routing in
ensure_wav(), but does not fix the magic byte detection —_get_audio_magic_type()still never returns"silk". This PR completes the fix by correcting the two off-by-one errors so the detection actually works.Before / After
Testing
Fixed by: Tom8266 and deepseek-v4-pro