@yannaingtun

Description
Fix FTS5 tokenizer to prevent treating null character as token character
This PR fixes a security issue in the FTS5 unicode61 tokenizer, where a null character ('\0') could be treated as a token character when other characters of class "Cc" are configured as token characters. The same issue was fixed upstream in the main SQLite repository.
The fix ensures that the null character is explicitly marked as not being a token character by adding a single line that sets aAscii[0] = 0.
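To illustrate the idea, here is a minimal, self-contained sketch of a per-byte ASCII lookup table of the kind the description refers to. It is not the actual SQLite or TDLib code: the names `build_ascii_table`, `is_token_char`, and the `cc_is_token` flag are hypothetical, and the classification rules are simplified assumptions. Only the final assignment to index 0 mirrors the one-line fix described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char u8;

/* Hypothetical stand-in for the tokenizer's ASCII table:
 * table[c] != 0 means byte c is treated as a token character. */
static void build_ascii_table(u8 table[128], int cc_is_token) {
    int c;
    memset(table, 0, 128);
    for (c = 0; c < 128; c++) {
        int is_alnum = (c >= '0' && c <= '9') ||
                       (c >= 'A' && c <= 'Z') ||
                       (c >= 'a' && c <= 'z');
        /* Class "Cc" in the ASCII range: 0x00-0x1F and 0x7F. */
        int is_cc = (c < 0x20) || (c == 0x7F);
        table[c] = (u8)(is_alnum || (cc_is_token && is_cc));
    }
    /* The fix: 0x00 is never a token character, even when the
     * rest of class "Cc" has been configured as token characters. */
    table[0] = 0;
}

/* Look up a single byte; non-ASCII bytes are out of scope for this sketch. */
static int is_token_char(const u8 table[128], unsigned char c) {
    assert(table != NULL);
    return c < 128 && table[c] != 0;
}
```

With `cc_is_token` set, other control characters such as 0x01 are still reported as token characters, while `is_token_char(table, 0)` returns 0. Without the final assignment, the null byte would inherit the classification of the rest of class "Cc".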

References:

  • Similar fix in SQLite: sqlite/sqlite@d1d43ef
  • Commit message from original fix: "Prevent fts5 tokenizer unicode61 from considering '\0' to be a token characters, even if other characters of class "Cc" are."
  • Related vulnerability: CVE-2023-36811

@levlam
Contributor

levlam commented Mar 3, 2025

The character "\0" is never present in the strings for which TDLib uses FTS5, hence the upstream issue is irrelevant for TDLib.

@yannaingtun
Author

I understand that TDLib doesn't process strings with null characters in FTS5, making this issue irrelevant for your implementation. Thank you for reviewing and providing this clarification.

@yannaingtun yannaingtun closed this Mar 3, 2025
