Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REGR: to_stata tried to remove file before closing it #39202

Merged
merged 2 commits into from
Jan 18, 2021

Conversation

twoertwein
Copy link
Member

Spin-off from #39047:

If an error occurred during to_stata, the code tried to first remove the file and then close the file handle:

  1. removing the file failed on Windows (it is still opened)
  2. It created an empty file when closing the handle

@jreback jreback added the IO Stata read_stata, to_stata label Jan 16, 2021
@jreback
Copy link
Contributor

jreback commented Jan 16, 2021

this is a regression (i think so?). put a note on 1.2.1

@jreback jreback added this to the 1.2.1 milestone Jan 16, 2021
@twoertwein
Copy link
Member Author

yes, you are right! I extended a test to make sure that the file doesn't exists and added a whatsnew entry.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excellent ping on greenish

@twoertwein
Copy link
Member Author

twoertwein commented Jan 16, 2021

The failing test will not fail in 'real life': the test uses tm.ensure_clean() to ensure that the file will not exist once the test is done. Unfortunately, this means that we have an open file handle that to_stata cannot close. I will look into re-writing ensure_clean().

doc/source/whatsnew/v1.2.1.rst Outdated Show resolved Hide resolved
@jreback
Copy link
Contributor

jreback commented Jan 16, 2021

also some failures

@twoertwein twoertwein changed the title BUG: to_stata tried to remove file before closing it REGR: to_stata tried to remove file before closing it Jan 16, 2021
@twoertwein
Copy link
Member Author

I think to_stata is the only to_* method that tries to remove created files in case of an error. Is that something other methods should also try to do?

I'm not sure but I could imagine that __exit__ in IOHandles might be able to determine whether an exception from within pandas's io part was raised: if yes, try to delete the files. This would probably only cover local files (no fsspec strings) and only work for to_* methods that use with get_handles(...) (some methods have to use it without the context manager and explicitly call close - IOHandles wouldn't know whether an exception occurred.

@jreback
Copy link
Contributor

jreback commented Jan 16, 2021

we should remove it if it's corrupt - however this might be hard to tell

so ok leaving it

@twoertwein
Copy link
Member Author

@jreback green'ish

The new ensure_clean has the drawback (I see it as an advantage) that the code using it needs to make sure that no file handle references the temporary file. If an open file handle exists, the test on windows will fail. That's why I had to fix a few tests in this PR.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm comment, ping on green

pandas/_testing/contexts.py Outdated Show resolved Hide resolved
@twoertwein
Copy link
Member Author

twoertwein commented Jan 17, 2021

@jreback green'ish, failed tests seem to be unrelated

@simonjayhawkins
Copy link
Member

failed test seem to be unrelated

yep, known flaky test and pre-commit issue

@simonjayhawkins
Copy link
Member

one question before this is merged. How public-ish is tm.ensure_clean?

@jreback
Copy link
Contributor

jreback commented Jan 17, 2021

one question before this is merged. How public-ish is tm.ensure_clean?

its not

@jreback jreback merged commit 6ff2e7c into pandas-dev:master Jan 18, 2021
@jreback
Copy link
Contributor

jreback commented Jan 18, 2021

thanks @twoertwein

@jreback
Copy link
Contributor

jreback commented Jan 18, 2021

@meeseeksdev backport 1.2.x

@lumberbot-app

This comment has been minimized.

@jreback
Copy link
Contributor

jreback commented Jan 18, 2021

@twoertwein would you mind putting up the backport PR

simonjayhawkins pushed a commit to simonjayhawkins/pandas that referenced this pull request Jan 18, 2021
simonjayhawkins pushed a commit that referenced this pull request Jan 18, 2021
@twoertwein twoertwein deleted the stata branch January 19, 2021 15:54
nofarm3 pushed a commit to nofarm3/pandas that referenced this pull request Jan 21, 2021
Co-authored-by: Simon Hawkins <simonjayhawkins@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Stata read_stata, to_stata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants