Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve messaging around sorted parquet columns for index #5265

Merged
merged 3 commits into from Aug 13, 2019

Conversation

martindurant
Copy link
Member

@martindurant martindurant commented Aug 13, 2019

  • Tests added / passed
  • Passes black dask / flake8 dask

Fixes #5260

"Multiple sorted columns found, cannot autodetect index",
"Multiple sorted columns found %s, cannot\n "
"autodetect index. Will continue without an index.\n"
"To pick an index column, use the index= kwargs; to \n"
Copy link
Member

@TomAugspurger TomAugspurger Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change "index=kwargs" to index keyword argument? Or at least make clear that it's a single argument, right?

pd.DataFrame({"cola": range(10), "colb": range(10)}), npartitions=2
)
df.to_parquet(tmpdir, engine=engine, write_index=False)
with pytest.warns(RuntimeWarning) as record:
Copy link
Member

@TomAugspurger TomAugspurger Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, you can use match="'cola', 'colb'" now I think, rather than the as record stuff.

I'm not sure whether it asserts that there is just one assertion. But why do we assert that?

Copy link
Member Author

@martindurant martindurant Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other warnings too for pyarrow, so I think it needs to be this way

Copy link
Member

@TomAugspurger TomAugspurger Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. I made a small test

import pytest
import warnings


def f(x):
    warnings.warn("test", FutureWarning)
    warnings.warn("user", UserWarning)


def test_foo():
    with pytest.warns(UserWarning, match="user"):
        f(0)

and things seem fine with the match syntax.

Copy link
Member Author

@martindurant martindurant Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean it, I wanted to capture the recorder object, to explicitly count the number of warnings, and make sure that supplying the kwarg makes that number go down, but doesn't go to zero.

Copy link
Member

@TomAugspurger TomAugspurger Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

@TomAugspurger TomAugspurger merged commit 266c314 into dask:master Aug 13, 2019
@TomAugspurger
Copy link
Member

@TomAugspurger TomAugspurger commented Aug 13, 2019

Thanks!

@martindurant martindurant deleted the multicol_warning branch Aug 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants