Download cache is not concurrency safe #1141
Labels: auto-locked, type: bug
The pip download cache has files added to it by cache_download. However, this is the body of that function:
def cache_download(target_file, temp_location, content_type):
    logger.notify('Storing download in cache at %s' % display_path(target_file))
    shutil.copyfile(temp_location, target_file)
    fp = open(target_file+'.content-type', 'w')
    fp.write(content_type)
    fp.close()
    os.unlink(temp_location)
There are two racy operations here:
1. target_file might be partially written if something interrupts the copy, e.g. the machine is powered off or Python is killed. That is somewhat tolerable, since the reading code looks for both the file itself and the content-type file.
2. The content-type file is also written unsafely, and without the mitigating aspect of the target_file race.
So you can have other processes observe:
because moving multiple files isn't atomic, and this creates the third race -
which will result in a pip process that tries to reuse that file reading an invalid content type in unpack_http_url and passing it to unpack_file.
There are various ways to solve this:
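One possible approach, sketched here as an illustration and not as the fix pip adopted, is to write each cache file under a temporary name in the same directory and then rename it into place, so other processes see either no file or a complete one. The names cache_download_atomic and _atomic_write are hypothetical, and the sketch assumes Python 3.3+ (for os.replace) and that the temporary file and its destination are on the same filesystem.

```python
import os
import shutil
import tempfile


def _atomic_write(path, write_fn):
    """Hypothetical helper: write to a temp file beside `path`, then rename.

    Readers never see a partially written `path`, because the rename
    swaps in the finished file in one step.
    """
    directory = os.path.dirname(path) or '.'
    # The temp file must live in the destination directory so the
    # rename stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix='.tmp-')
    try:
        with os.fdopen(fd, 'wb') as fp:
            write_fn(fp)
            fp.flush()
            os.fsync(fp.fileno())  # push the data to disk before publishing
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def cache_download_atomic(target_file, temp_location, content_type):
    """Hypothetical replacement for cache_download (logging omitted)."""
    def copy_body(fp):
        with open(temp_location, 'rb') as src:
            shutil.copyfileobj(src, fp)

    # Publish the content-type file first, so that once target_file is
    # visible its companion file is already complete.
    _atomic_write(target_file + '.content-type',
                  lambda fp: fp.write(content_type.encode('utf-8')))
    _atomic_write(target_file, copy_body)
    os.unlink(temp_location)
```

This narrows the window but does not make the pair of files a single atomic unit: two writers caching the same URL with different content types could still interleave, so a lock or a single combined file may be a better fit depending on requirements.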