Fix no-op chardet MINIMUM_THRESHOLD patch in dirtyPatches() (#6024) #6066
Merged
stamparm merged 1 commit on Jun 4, 2026
dirtyPatches() set MINIMUM_THRESHOLD as a module-level attribute on thirdparty.chardet.universaldetector, but the value is read as a class attribute (UniversalDetector.MINIMUM_THRESHOLD, used as self.MINIMUM_THRESHOLD in get_confidence()). The module-level assignment was therefore a no-op and the effective threshold stayed at the 0.20 default instead of the intended 0.90, so low-confidence charset guesses on binary/ambiguous response data were accepted rather than rejected. Assign the class attribute so the patch takes effect. Verified: with the fix an ambiguous 24-byte sample now resolves to encoding=None (was a spurious 0.45-confidence 'IBM855'/Russian guess). Regenerated the sha256sums.txt entry for the modified file; smoke test passes.
Fixes #6024.
The bug
dirtyPatches() in lib/core/patch.py sets MINIMUM_THRESHOLD as a module-level attribute on thirdparty.chardet.universaldetector.
But MINIMUM_THRESHOLD is a class attribute of UniversalDetector (thirdparty/chardet/universaldetector.py), read as self.MINIMUM_THRESHOLD. The module-level assignment sets an unused attribute, so it has been a no-op: the effective threshold has stayed at the default 0.20, not the intended 0.90.
The fix (1 line)
Assign the class attribute that is actually read (UniversalDetector.MINIMUM_THRESHOLD), so the patch does what its comment says.
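The difference between the two assignments can be reproduced in isolation. The sketch below is only an illustration: the class and module objects are stand-ins built for the example, not the vendored chardet code, and the `accepts()` method and its `>` comparison are assumptions about how the threshold is consumed.

```python
import types


class UniversalDetector:
    # Stand-in for the vendored class; 0.20 is the default named in this PR.
    MINIMUM_THRESHOLD = 0.20

    def accepts(self, confidence):
        # Hypothetical method: the threshold is looked up via the instance,
        # which resolves to the class attribute, never to a module global.
        return confidence > self.MINIMUM_THRESHOLD


# Stand-in for the universaldetector module object.
universaldetector = types.ModuleType("universaldetector")
universaldetector.UniversalDetector = UniversalDetector

# Module-level assignment: creates a new attribute that nothing reads.
universaldetector.MINIMUM_THRESHOLD = 0.90
before = UniversalDetector().accepts(0.45)  # class still holds 0.20

# Class-level assignment: changes the value that self.MINIMUM_THRESHOLD sees.
universaldetector.UniversalDetector.MINIMUM_THRESHOLD = 0.90
after = UniversalDetector().accepts(0.45)

print(before, after)
```

With the module-level assignment alone, a 0.45-confidence guess is still accepted; after the class-level assignment it is rejected.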
(Touches only lib/core/patch.py; thirdparty/ is left untouched. data/txt/sha256sums.txt is regenerated for the modified file; --smoke passes.)
Behavioural impact (kept minimal/safe)
chardet.detect() is used only as a last-resort charset fallback in lib/request/basic.py, i.e. when a response has no Content-Type charset and no <meta charset>. With the threshold actually applied:
- … ascii).
- Binary or ambiguous data now resolves to None instead of a ~0.2–0.7 guess, and that is neutral, because sqlmap then falls back to DEFAULT_PAGE_ENCODING = "iso-8859-1", which is byte-identical. Page-comparison verdicts are unchanged.
So the change suppresses exactly the low-confidence over-guessing the original comment was written to prevent, with no detection regression.
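The fallback behaviour described above can be sketched as follows. This is a simplified model, not sqlmap's code: `filter_guess` and `decode_page` are hypothetical helpers, and reading "byte-identical" as "iso-8859-1 maps every byte to exactly one code point, so the decode is lossless" is an assumption made for the example.

```python
DEFAULT_PAGE_ENCODING = "iso-8859-1"  # value quoted in the PR description


def filter_guess(encoding, confidence, threshold):
    """Hypothetical helper: discard a guess that does not clear the threshold."""
    return encoding if confidence > threshold else None


def decode_page(data, guessed):
    """Hypothetical helper: decode with the guess, else use the default."""
    return data.decode(guessed or DEFAULT_PAGE_ENCODING, errors="replace")


sample = bytes(range(256))  # every possible byte value

# The 0.45-confidence 'IBM855' guess mentioned in the commit message.
old_guess = filter_guess("IBM855", 0.45, 0.20)  # accepted at the old threshold
new_guess = filter_guess("IBM855", 0.45, 0.90)  # rejected at the intended one

# With no guess, decoding falls back to iso-8859-1, which cannot fail
# and round-trips back to the original bytes.
text = decode_page(sample, new_guess)
roundtrip = text.encode(DEFAULT_PAGE_ENCODING)
```

Because the fallback decode preserves every byte, rejecting a weak guess does not lose page content in this model.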
Open question for the maintainer
Since this line has effectively run at 0.20 for years, there are two valid resolutions and I'm happy to switch to whichever you prefer:
1. fix the assignment so the intended 0.90 takes effect, or
2. keep the 0.20 that has shipped in practice.
Defaulted to (1) since it matches the code's documented intent, but happy to change to (2).