Description
It appears com.google.refine.importing.ImportingUtilities.allocateFile(File, String)
might not detect some malicious file names. However, I am not familiar with this project so hopefully someone more familiar with it can confirm or deny whether these cases are actually problematic.
Sorry in case these are false alerts.
Not checking file being same as directory
allocateFile
does not check whether the resolved file name is equivalent to the directory. This can for example be the case when the file name
- is an empty string
- consists only of single periods or slashes:
.
,///
,././.
, ... - uses double periods:
ignored/..
All these file names pass the startsWith
sanitization check and only in case the directory already exists would be renamed in the loop. Otherwise they would be returned as they are and might cause issues.
Rename loop might allow directory traversal
When the result file name is already taken, allocateFile
tries to construct an alternative file name. However, if an adversary knows the directory structure they can create a file name which, when transformed, allows directory traversal.
For example, let's assume the dir
parameter argument is C:/User/Test/temp
. An adversary could then use the file name "./../../Test/temp"
(respectively "../../../User/Test/temp"
):
// ImportingUtilities.allocateFile(File, String) code
File file = new File(dir, name); // file = "C:/User/Test/temp/./../../Test/temp"
if (!file.toPath().normalize().startsWith(dir.toPath().normalize())) {
throw new IllegalArgumentException("Zip archives with files escaping their root directory are not allowed.");
}
int dot = name.indexOf('.');
String prefix = dot < 0 ? name : name.substring(0, dot); // prefix = ""
String suffix = dot < 0 ? "" : name.substring(dot); // suffix = "./../../Test/temp"
int index = 2;
while (file.exists()) { // File already exists
// file = "C:/User/Test/temp/-2./../../Test/temp"
file = new File(dir, prefix + "-" + index++ + suffix);
}
The resulting file would then be C:/User/Test/temp/-2./../../Test/temp
(normalized: C:/User/Test/Test/temp
) which is outside the intended directory C:/User/Test/temp
. Though the adversary is restricted to the parent directory names and cannot choose arbitrary directory names.