-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: basic sub_missing and sub_zero #244
Conversation
|
||
return isinstance(x, (pl.Null, type(None))) or x is np.nan or x is nan | ||
return isinstance(x, (pl.Null, type(None))) or (isinstance(x, float) and isnan(x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that in python float("nan") is float("nan")
evaluates to false. Using math.isnan()
seems to work though for detecting either numpy or builtin nans.
@@ -64,6 +64,7 @@ def fmt( | |||
fns: Union[FormatFn, FormatFns], | |||
columns: SelectExpr = None, | |||
rows: Union[int, List[int], None] = None, | |||
is_substitution=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this flag mostly because was very fast to do, but am down for creating a dedicated sub()
function, if it seems better!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you have makes a lot of sense! By way of comparison, the R version has the analogous subst()
function but it's not exported.
From a quick test of from great_tables import GT, md, exibble
GT(exibble).sub_missing(missing_text=md("*MISSING*"))
|
Having |
Using None as the flag for using some default, such as an emdash, seems reasonable! I wonder if we could keep the surface area of Great Tables down, by avoiding automatic conversions of plaintext to other characters (e.g. In the future, some kind of enum might make it easy to separate things like raw E.g.
(But I'm really spitballing here! May be some better way to represent? Or could people use unicode?!) |
I’m proposing skipping that conversion altogether. The default emdash is enough for this purpose. You’re right in that users could just supply Unicode characters for virtually anything here! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This PR adds support for substitution functions. It address #182, by adding the following methods:
sub_missing
: substitutes any missing values. This includes both null and nan..is_null() | .is_nan()
, since distinguishes between the two..isna()
, which flags both kinds of missingness.sub_zero
Currently, it is just using the formatter machinery. As I understand, substitutions should always go after formatters. (e.g. you should be able to
.sub_zero()
right away in a chain, and expect laterfmt_*()
calls to not override that.Note these important pieces:
FormatterSkipElement
, which when returned from a format call, indicates that no change should be made.<br>
.is_na
to properly detect float("nan")is_substitution=
tofmt()
. This deviates from gt. Happy to change to be more similar.