Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: add custom Exception for safe_sort #25569

Closed
wants to merge 12 commits into from

Conversation

simonjayhawkins
Copy link
Member

xref #25537 (comment)

custom Exception added primarily to allow more targeted testing, but also to distinguish sort failures and bad input to safe_sort

@codecov
Copy link

codecov bot commented Mar 6, 2019

Codecov Report

Merging #25569 into master will decrease coverage by 0.01%.
The diff coverage is 76.92%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25569      +/-   ##
==========================================
- Coverage   91.77%   91.76%   -0.02%     
==========================================
  Files         175      175              
  Lines       52607    52624      +17     
==========================================
+ Hits        48282    48288       +6     
- Misses       4325     4336      +11
Flag Coverage Δ
#multiple 90.32% <76.92%> (-0.01%) ⬇️
#single 41.9% <11.53%> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/core/sorting.py 98.32% <100%> (+0.02%) ⬆️
pandas/core/reshape/merge.py 94.23% <50%> (-0.26%) ⬇️
pandas/core/algorithms.py 94.52% <75%> (-0.26%) ⬇️
pandas/core/indexes/base.py 96.47% <75%> (-0.1%) ⬇️
pandas/io/gbq.py 75% <0%> (-12.5%) ⬇️
pandas/core/frame.py 96.79% <0%> (-0.12%) ⬇️
pandas/util/testing.py 90.46% <0%> (-0.11%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 96a128e...94d76fc. Read the comment docs.

@@ -157,6 +157,13 @@ class MergeError(ValueError):
"""


class SortError(TypeError):
"""
Error raised when problems arise during sorting due to problems
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to be more explicit about the types of "problems" that could arise? Is this just when sorting mixed types?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd : the rationale here was to a create an exception analogous to the custom MergeError. Exceptions should be considered part of the api and so this SortError is a subclass of TypeError to be backwards compatible with the exceptions currently raised by np.sort from safe_sort and therefore result in a non-breaking api change.

Rather than be explicit on the exceptions covered, I consider this to be a base class for any sort related errors that may need to be handled. (although i would probably not have chosen TypeError as the base class). I should probably subclass this again for the safe_sort errors, but that is not strictly necessary at the moment.

@WillAyd WillAyd added the Error Reporting Incorrect or improved errors from pandas label Mar 6, 2019
@jreback
Copy link
Contributor

jreback commented Mar 6, 2019

what is the reason for this change?

@simonjayhawkins
Copy link
Member Author

what is the reason for this change?

during the process of converting the pytest.raises tests to use a context manager, two tests have so far had the match parameter added to check that the correct error is raised, but these were only testing safe_sort and factorise helper functions directly.

There are still many tests that require the match parameter added and there are several instances in the code base where the calls to safe-sort are not wrapped in a try/except block.

This PR is not critical at the moment and can be put on hold until more tests have been updated and further work on testing error messages is completed.

@jreback
Copy link
Contributor

jreback commented Mar 6, 2019

@simonjayhawkins right not objecting to making testing easier. This exception is a good idea, though have to think of implications for a user level exception like this.

@simonjayhawkins
Copy link
Member Author

though have to think of implications for a user level exception like this.

i'm assuming that will need to wrap all uses of safe_sort and then raise an appropriate user facing Exception or RunTimeWarning (I believe that @reidy-p has already started doing this). so the SortError is not intended to be user facing.

@jreback
Copy link
Contributor

jreback commented Mar 6, 2019

i'm assuming that will need to wrap all uses of safe_sort and then raise an appropriate user facing Exception or RunTimeWarning (I believe that @reidy-p has already started doing this). so the SortError is not intended to be user facing.

yep exactly (though maybe we just make it an internal exception for this reason)

@simonjayhawkins
Copy link
Member Author

though maybe we just make it an internal exception for this reason

do you mean define the Exception class in pandas/core/sorting.py instead of pandas/errors/__init__.py?

@jreback
Copy link
Contributor

jreback commented Mar 10, 2019

actually I kind of like this custom exception now.

cc @TomAugspurger @jorisvandenbossche

@jorisvandenbossche
Copy link
Member

I am personally not a fan of adding custom user-level exceptions if there is no good reason for it (and here it only seems for testing?)

But is it the intent to expose it to users? The discussion above about it is not fully clear to me.

@jreback
Copy link
Contributor

jreback commented Mar 26, 2019

if you can merge master and update. I believe this would be ok if SortError is just an internal exception (e.g. define in pandas/core/sorting), mainly to disambiguate on testing; raising an appropriate error to users.

@jorisvandenbossche
Copy link
Member

Note that the current version of the PR does expose it publicly (from a quick look at the diff in pd.factorize, and also in pandas.errors)

@jreback
Copy link
Contributor

jreback commented Mar 26, 2019

@jorisvandenbossche

Note that the current version of the PR does expose it publicly (from a quick look at the diff in pd.factorize, and also in pandas.errors)

right my suggestion was to make it internal

@simonjayhawkins
Copy link
Member Author

started to internalize the custom error, still need to wrap other calls to safe_sort.

@simonjayhawkins
Copy link
Member Author

i'm not sure that at the moment there is much to gain from the added complexity here.

and also there is the helper function _is_unorderable_exception (that I only found after opening this PR) which serves the purpose of distinguishing sort failures and bad input to safe_sort

regexes should probably just be simplified to "not supported between instances|unorderable types" for testing purposes.

@jreback
Copy link
Contributor

jreback commented Mar 29, 2019

@simonjayhawkins agree with your comment. _is_unorderable_exception basically does this, so closing.

@jreback jreback closed this Mar 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants