Skip to content

Conversation

@satylogin
Copy link
Contributor

@satylogin satylogin commented Mar 9, 2023

Describe the changes

TextFileReader currently overwrites the engine as python when patching for pandas. I am not sure why is it needed since the default engine is "c" and "python" engine doesn't expects certain params like "float_precision" which results in it failing with the version of pandas that we have (1.1.5).

Tasks

  • Unit tests added that reproduce the issue or prove feature is working
  • Fix or feature added
  • Entry to release notes added
  • Pre-commit CI shows no errors
  • Unit tests passing
  • For documentation changes: The Read the Docs preview builds and looks as expected

@satylogin satylogin marked this pull request as ready for review March 9, 2023 15:46
@mrbean-bremen
Copy link
Member

Hm, that is strange. The patch was needed because with the "c" backend, filesystem calls cannot be patched. If I comment it out locally, the respective tests (test_read_csv and test_read_table) fail as I would have expected.
Right now I have no idea how that works in the CI tests (the tests trigger the use of filesystem calls that have to be patched).
So, I have to understand if the tests are not testing what they are supposed to, or if for some reason this indeed works in the CI, before I'm going to merge this (or not).

@mrbean-bremen
Copy link
Member

I'm still not quite sure what is going on, but in the CI tests the patched code is never called - it seems to go another path that uses the "python" engine by default. So in this case the change has no effect. If the code is called (as in my local tests), the tests fail with the change, so I don't see what the PR would achieve.
You can always choose to not use the patches (which is what the PR practically does), and while I still want to understand what goes on in the CI tests, I'm most probably not going to merge this.

@mrbean-bremen
Copy link
Member

Ok, I now understand what is going on, and your patch makes sense, after all - just a bit changed.
The behavior of pandas has changed, and from version 1.2 onwards the patch is no longer needed, as internally Python fs functions are used even with the C engine. From version 1.3 onwards the patched code is no longer used, so the patch makes no difference at all, but specifically for version 1.2 it indeed makes a difference.

Sorry for the confusion... You can adapt the PR accordingly, if you want (will add a couple of comments in a moment), or I can make the changes myself.


class TextFileReader(parsers.TextFileReader):
def __init__(self, *args, **kwargs):
kwargs["engine"] = "python"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole patching of pandas (all what is currently inside the if parsers is not None conditions) can be removed for pandas >= 1.2. You can do something like this for the new condition:

import pandas as pd
...

patch_pandas = (
    parsers is not None and [int(v) for v in pd.__version__.split(".")] < [1, 2, 0]
)

Also, please add an entry to the release notes - thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@mrbean-bremen
Copy link
Member

Oh, and I'm afraid that this will not help you with version 1.1.5 - this still needs the patch to work correctly. Though if you don't need that functionality, you can always switch off that patching, as mentioned before.

if parsers is not None:
# From pandas v 1.2 onwards the python fs functions are used even when the engine selected is "c".
# This means that we don't explicitly have to change the engine.
patch_pandas = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two other places above with use if parsers is not None - these also have to be adapted to use patch_pandas instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, let me update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Copy link
Member

@mrbean-bremen mrbean-bremen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@mrbean-bremen mrbean-bremen merged commit 2bfdc02 into pytest-dev:main Mar 15, 2023
@satylogin
Copy link
Contributor Author

Thanks for all your help !!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants