Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use better temporary files mechanism in install_unpacked_wheel #7699

Closed
sbidoul opened this issue Feb 5, 2020 · 8 comments · Fixed by #7929
Closed

Use better temporary files mechanism in install_unpacked_wheel #7699

sbidoul opened this issue Feb 5, 2020 · 8 comments · Fixed by #7929
Labels
S: auto-locked type: refactor

Comments

@sbidoul
Copy link
Member

@sbidoul sbidoul commented Feb 5, 2020

install_unpacked_wheel creates files such as INSTALLER or RECORD with a .pip extension then renames them to the intended file in order to have an atomic file creation.

Example:

# Record details of all files installed
record = os.path.join(dest_info_dir, 'RECORD')
temp_record = os.path.join(dest_info_dir, 'RECORD.pip')
with open_for_csv(record, 'r') as record_in:
with open_for_csv(temp_record, 'w+') as record_out:
reader = csv.reader(record_in)
outrows = get_csv_rows_for_installed(
reader, installed=installed, changed=changed,
generated=generated, lib_dir=lib_dir,
)
writer = csv.writer(record_out)
# Sort to simplify testing.
for row in sorted_outrows(outrows):
writer.writerow(row)
shutil.move(temp_record, record)

The goal of this issue is to use a better and safer approach.

@triage-new-issues triage-new-issues bot added the S: needs triage label Feb 5, 2020
@sbidoul sbidoul added the type: refactor label Feb 5, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage label Feb 5, 2020
@sbidoul sbidoul added the good first issue label Feb 5, 2020
@rudolfovic
Copy link

@rudolfovic rudolfovic commented Feb 5, 2020

adjacent_tmp_file opens files as binary while the csvwriter insists on writing strings no matter what is passed to it. A possible solution would be to add a parameter to adjacent_tmp_file so that it passes it accordingly to NamedTemporaryFile.
NamedTemporaryFile uses mode='w+b' by default and adjacent_tmp_file does not override this.
This fix passes the tests.
Mode can either be added to adjacent_tmp_file as a named parameter so that it does not have to be specified in other cases (of which there are currently two).
From what I found it is in line with the pip conventions
e.g. def decode(input, fallback_encoding, errors='replace'): in webencodings/__init__.py.
Let me know what you think.

@uranusjr
Copy link
Member

@uranusjr uranusjr commented Feb 6, 2020

Would it be enough to keep csv.writer happy if we wrap the file handle in an io.TextIOWrapper?

@siAyush
Copy link

@siAyush siAyush commented Mar 19, 2020

Hey, @uranusjr can you please clear it more. About this "if we wrap the filehandle in an io.TextIOWrapper" .
Thanks

@uranusjr
Copy link
Member

@uranusjr uranusjr commented Mar 19, 2020

So the problem here is that csvwriter expects a str-based file object, but on Python 3 the caller (adjacent_tmp_file) writes file in binary mode. So I was wondering whether it would be possible to wrap the file object opened by adjacent_tmp_file with an io.TextIOWrapper on Python 3 to turn it into a text file-like object, and pass that to csvwriter instead. The wrapper would perform the necessariy encoding when csvwriter writes to it, to turn the written data into bytes for the NamedTemporaryFile file object.

@pradyunsg
Copy link
Member

@pradyunsg pradyunsg commented Mar 19, 2020

@uranusjr I suggest avoiding complicating this issue by discussing trying to do this in-memory (since it's a good first issue).

@siAyush
Copy link

@siAyush siAyush commented Mar 20, 2020

Thank you @uranusjr for simplifying it. Let me try it.

@sbidoul sbidoul removed the good first issue label Mar 27, 2020
@McSinyx
Copy link
Contributor

@McSinyx McSinyx commented Mar 29, 2020

@siAyush, may I pick this up?

@siAyush
Copy link

@siAyush siAyush commented Mar 29, 2020

Sure

@lock lock bot added the S: auto-locked label May 5, 2020
@lock lock bot locked as resolved and limited conversation to collaborators May 5, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
S: auto-locked type: refactor
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants