Hey,
I noticed that you are considering only two states:
- The first tracks whether path normalization has been done before the safe check.
- The second concerns the safe check itself.

Both are shown here:
codeql/python/ql/lib/semmle/python/security/dataflow/PathInjectionQuery.qll
Lines 20 to 28 in c1c0a70
/** A state signifying that the file path has not been normalized. */
class NotNormalized extends NormalizationState {
  NotNormalized() { this = "NotNormalized" }
}

/** A state signifying that the file path has been normalized, but not checked. */
class NormalizedUnchecked extends NormalizationState {
  NormalizedUnchecked() { this = "NormalizedUnchecked" }
}
However, a third state is required: Unicode normalized. If a Unicode normalization is performed with a compatibility form (NFKC or NFKD), the query misses some cases, namely those where the Unicode normalization is not performed before the path normalization and the safe check. I drew a small chart to illustrate the point:

The chart shows that a Unicode compatibility normalization, if there is one, must happen before path normalization and the safe check. If it is placed between the first two steps or after the last one, the result is a vulnerable case that the query misses, because the Unicode normalization may reintroduce unexpected special characters such as `..` and `/`.
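To make this concrete, here is a minimal Python sketch of the ordering problem. It is not taken from the CodeQL repository; `BASE_DIR`, `is_within_base`, `vulnerable_resolve` and `safer_resolve` are hypothetical names I made up for illustration, and `posixpath` is used only so the behaviour is the same on every platform. The payload uses fullwidth characters (U+FF0E and U+FF0F), which NFKC maps to `.` and `/`.

```python
import posixpath
import unicodedata

# Hypothetical base directory, for illustration only.
BASE_DIR = "/srv/app/static"


def is_within_base(path: str) -> bool:
    """Safe check: the path must stay under BASE_DIR."""
    return path.startswith(BASE_DIR + "/")


def vulnerable_resolve(user_input: str) -> str:
    """Path normalization -> safe check -> Unicode normalization (too late)."""
    candidate = posixpath.normpath(posixpath.join(BASE_DIR, user_input))
    if not is_within_base(candidate):
        raise ValueError("path escapes base directory")
    # NFKC turns the fullwidth characters into real "." and "/" after the
    # check has already passed, so ".." segments reappear in the result.
    return unicodedata.normalize("NFKC", candidate)


def safer_resolve(user_input: str) -> str:
    """Unicode normalization -> path normalization -> safe check."""
    normalized = unicodedata.normalize("NFKC", user_input)
    candidate = posixpath.normpath(posixpath.join(BASE_DIR, normalized))
    if not is_within_base(candidate):
        raise ValueError("path escapes base directory")
    return candidate


# "../../../etc/passwd" written with FULLWIDTH FULL STOP and FULLWIDTH SOLIDUS.
payload = "\uff0e\uff0e\uff0f" * 3 + "etc\uff0fpasswd"

leaked = vulnerable_resolve(payload)
print(leaked)                      # still contains "../" segments
print(posixpath.normpath(leaked))  # what the file system would resolve

try:
    safer_resolve(payload)
except ValueError as exc:
    print("rejected:", exc)
```

With the vulnerable ordering the payload passes the safe check as a single harmless-looking path component and only becomes a traversal afterwards. With the Unicode normalization done first, the same payload is rejected by the check.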
Regards
@Sim4n6