Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix file leaks in csv parsers (GH#45384) #45388

Closed
wants to merge 1 commit into from

Conversation

mrob95
Copy link
Contributor

@mrob95 mrob95 commented Jan 15, 2022

I originally tried to implement the tests using the @td.check_file_leaks decorator but couldn't get this to cause a failing test. e.g. the following test:

import pandas.util._test_decorators as td

@td.check_file_leaks
def test_file_leaks(datapath):
    test2csv = datapath("io", "parser", "data", "test2.csv")
    h = open(test2csv, "r")

Run with:

python -W default -m pytest pandas/tests/io/parser/ -k test_file_leaks

Passes with the following output on Ubuntu 20 - note that the warning indicates that there is definitely a leak, but the decorator hasn't caught it:

=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.9.9, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/ubuntu/pandas, configfile: pyproject.toml
plugins: hypothesis-6.35.0
collected 4651 items / 4650 deselected / 1 selected

pandas/tests/io/parser/test_parse_dates.py .

================================================================================================ warnings summary =================================================================================================
pandas/tests/io/parser/test_parse_dates.py::test_file_leaks
  /home/ubuntu/pandas/.venv/lib/python3.9/site-packages/_pytest/python.py:183: ResourceWarning: unclosed file <_io.TextIOWrapper name='/home/ubuntu/pandas/pandas/tests/io/parser/data/test2.csv' mode='r' encoding='UTF-8'>
    result = testfunction(**testargs)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
------------------------------------------------------------------------------ generated xml file: /home/ubuntu/pandas/test-data.xml ------------------------------------------------------------------------------
============================================================================================== slowest 30 durations ===============================================================================================

(3 durations < 0.005s hidden.  Use -vv to show these durations.)
================================================================================== 1 passed, 4650 deselected, 1 warning in 0.97s ==================================================================================

@pep8speaks
Copy link

pep8speaks commented Jan 15, 2022

Hello @mrob95! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-01-15 21:27:57 UTC

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks a lot more than a minimal change

pls be very targeted

@twoertwein
Copy link
Member

this looks a lot more than a minimal change

Mostly whitespace changes, can skip them on github:
image

@mrob95
Copy link
Contributor Author

mrob95 commented Jan 16, 2022

this looks a lot more than a minimal change

pls be very targeted

git made a bit of a meal of the diff but it is almost entirely indentation changes. Having one try/except around everything following _open_handles seems less error-prone than trying to wrap every fallible call individually.

#45389 looks like a cleaner fix anyway, so I'm happy for this to be closed.

@jreback jreback added the IO CSV read_csv, to_csv label Jan 16, 2022
@jreback
Copy link
Contributor

jreback commented Jan 16, 2022

thanks @mrob95 merged #45389

@jreback jreback closed this Jan 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: read_csv can leak file handles if validation fails
4 participants