Exception on csv.read_csv() leaves opened handle on Windows #38812

Open
INRIX-Mark-Gershaft opened this issue Nov 20, 2023 · 0 comments

Describe the bug, including details regarding any error messages, version, and platform.

On Windows, I am using pyarrow to read a CSV file that contains a very long field (several million characters).
Since the length of the column is not known ahead of time, the code retries the read, doubling the buffer size (`block_size`) after each failure.
The problem is that when the exception is raised, the file does not seem to be closed properly: a handle is left open, which prevents the temporary file(s) from being removed.
Python code:

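    import pyarrow as pa
    from pyarrow import csv

    # csv_file_path, read_options, parse_options and convert_options
    # are defined earlier (not shown).
    while True:
        try:
            table = csv.read_csv(csv_file_path, read_options=read_options, parse_options=parse_options,
                                 convert_options=convert_options)
            break
        except pa.lib.ArrowInvalid:
            # Retry with a larger block until the long field fits.
            print(f'Doubling CSV block_size from {read_options.block_size}')
            read_options.block_size *= 2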

This issue is possibly related to #31796 since it seems to be about properly closing file handles on Windows.
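
A possible workaround, not verified against this bug, is to open the file in Python and pass the file object to `csv.read_csv()` instead of the path, so that a `with` block closes the handle before each retry. The sketch below is only an illustration under that assumption: `read_with_growing_block` and `read_csv_growing_block` are hypothetical helper names, and the `max_block_size` cap is an addition, there because `ArrowInvalid` can also be raised for reasons that a larger block will not fix.

```python
def read_with_growing_block(path, read_fn, retry_on, block_size=1 << 20,
                            max_block_size=1 << 30):
    """Call read_fn(file_obj, block_size), doubling block_size on retry_on.

    The file is opened here inside a `with` block, so the handle is closed
    before every retry and before any exception leaves this function.
    """
    while True:
        try:
            with open(path, "rb") as f:
                return read_fn(f, block_size)
        except retry_on:
            if block_size * 2 > max_block_size:
                raise  # give up instead of doubling forever
            block_size *= 2


def read_csv_growing_block(path, **kwargs):
    """Hypothetical pyarrow wrapper around the generic helper above.

    pyarrow is imported lazily so the helper itself has no hard dependency.
    Extra keyword arguments (parse_options, convert_options) are passed
    through to csv.read_csv().
    """
    import pyarrow as pa
    from pyarrow import csv

    def read_fn(f, block_size):
        read_options = csv.ReadOptions(block_size=block_size)
        return csv.read_csv(f, read_options=read_options, **kwargs)

    return read_with_growing_block(path, read_fn, pa.lib.ArrowInvalid)
```

With this, `table = read_csv_growing_block(csv_file_path, parse_options=parse_options, convert_options=convert_options)` would replace the loop above. Whether the temporary file can then be removed after a failed read on Windows still needs to be confirmed.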

Component(s)

C++, Python
