Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "inline" option for unicode and console unit formats to give negative powers instead of fractions #14393

Merged
merged 4 commits into from
Feb 16, 2023

Conversation

olebole
Copy link
Member

@olebole olebole commented Feb 15, 2023

Description

This PR adds inline versions of the unicode and console format representations for units. My primary interest is unicode_inline, but since I modified the console format as well, console_inline came out for (almost) free -- it is up to discussion whether it is useful enough to keep.

It looks like this:

>>> from astropy.constants import G
>>> print(f'{G:unicode_inline}')
6.6743e-11 m³ kg⁻¹ s⁻²
>>> print(f'{G:console_inline}')
6.6743e-11 m^3 kg^-1 s^-2

The change is done in the same way as in #2622 (which added latex_inline), to keep the same structure.

Im am also unsure what the use real case for the existing non-inline versions is as they are broken when used for quantities (IMO they should at least be disabled for formatting units, or switched to the proposed inline variants then):

>>> print(f'{G:unicode}')
6.6743e-11   m³  
 ─────
 kg s²
>>> print(f'{G:console}')
6.6743e-11   m^3  
 ------
 kg s^2

@github-actions github-actions bot added the units label Feb 15, 2023
@github-actions
Copy link

github-actions bot commented Feb 15, 2023

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see "When to rebase and squash commits".
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • Is a milestone set? Milestone must be set but we cannot check for it on Actions; do not let the green checkmark fool you.
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

@pllim pllim added this to the v5.3 milestone Feb 15, 2023
@pllim
Copy link
Member

pllim commented Feb 15, 2023

Thanks! Is this related to #14386 ?

@pllim
Copy link
Member

pllim commented Feb 15, 2023

Would need a change log and maybe a what's new too.

@pllim
Copy link
Member

pllim commented Feb 15, 2023

Failures are definitely related, so those also need to be addressed.

@olebole
Copy link
Member Author

olebole commented Feb 15, 2023

It is not related, and also not ready yet. It seems a bit more complicated than initially thought, because Console._format_bases needs the existing string (which was not the case in Latex._format_bases).

Maybe a better solution would be to add an inline=False class attribute instead, which set to True in the inline variants. This is done here, and also applied to latex_inline (which reverts most of #2622).

@olebole olebole force-pushed the units_unicode-inline-format branch 6 times, most recently from dc6df64 to 72438dc Compare February 15, 2023 18:44
Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, all for this! If implemented via subclasses, though, I think it is best to just override relevant methods rather than use an attribute.

That said, I'm actually not sure it is most sensible to add subclasses -- to me, these are very much stylistic changes for which keyword arguments or configuration items are more appropriate -- a bit like np.printoptions.

We've gone already a bit in that direction for Quantity.to_string, which has a subfmt option. Should Unit.to_string() have something similar, which gets passed on to the format classes?

I worry in part because of seeing the test cases: why does 'generic' collect its negative units in parentheses and uses a solidus, while 'console' and 'unicode' use negative powers. That difference suggests there should be an option to choose, but do we really want an extra subclass for that too? Adding it as options to format.to_string() seems more logical...

astropy/units/format/console.py Outdated Show resolved Hide resolved
astropy/units/format/latex.py Outdated Show resolved Hide resolved
astropy/units/format/unicode_format.py Outdated Show resolved Hide resolved
@mhvk
Copy link
Contributor

mhvk commented Feb 15, 2023

@olebole - I made a separate issue about the formatting -- see #14398. My sense is that the easiest is for the format to default to inline -- which would likely be easier to implement if it is a keyword argument...

p.s. Sorry for making what looked like a simple PR complicated... But it would be nice to get it more or less right...

@olebole
Copy link
Member Author

olebole commented Feb 15, 2023

I am answering in the main thread since it may make the discussion easier.

Overwriting to_string() was my first thought. However, then the structure would be different to latex.py which I didn't like. And a lot of code was duplicated in the overwritten class. To me, this looked more complicated than before.

Having an argument in to_string() would be the better way; however this would change the API, and I wanted to make a small change first. And going from there to to_string() options is trivial if everything is in one class , while with merging to classes it is more difficult.

If one would add an argument to to_string():

    @classmethod
    def to_string(cls, unit, inline=False):
        […]

how would this be propagated to the user, f.e. in print(f'{c:unicode]')? And could this option set to True by default when formatting a quantity?

Another (more complicated) way could be to convert the formatters from classes to instances - this way one could have one unicode and one unicode_inline instance, and easily create others if needed (f.e. with central dots instead of spaces between the factors) -- however probably all use cases here could be better handled with subformats.

Can you point me to how quantities handle/specify subformats?

@mhvk
Copy link
Contributor

mhvk commented Feb 15, 2023

@olebole -- interesting! Just on the practical questions: arguments could be passed down from Unit.to_string() if it took arbitrary keyword arguments; see

def to_string(self, format=unit_format.Generic):
"""
Output the unit in the given format as a string.
Parameters
----------
format : `astropy.units.format.Base` instance or str
The name of a format or a formatter object. If not
provided, defaults to the generic format.
"""
f = unit_format.get_format(format)
return f.to_string(self)

I agree it would not help Quantity.__format__, though I feel that that one requires more thought anyway, of actually defining a useful extension to the standard formatting mini-language -- if we were going the route of keyword arguments, I'd just set the default to inline=True. But it would help with Quantity.to_string() -- as there again one could pass things down. Its implementation, including the currently rather kludgy subfmt is at

def to_string(self, unit=None, precision=None, format=None, subfmt=None):
"""
Generate a string representation of the quantity and its unit.
The behavior of this function can be altered via the
`numpy.set_printoptions` function and its various keywords. The
exception to this is the ``threshold`` keyword, which is controlled via
the ``[units.quantity]`` configuration item ``latex_array_threshold``.
This is treated separately because the numpy default of 1000 is too big
for most browsers to handle.
Parameters
----------
unit : unit-like, optional
Specifies the unit. If not provided,
the unit used to initialize the quantity will be used.
precision : number, optional
The level of decimal precision. If `None`, or not provided,
it will be determined from NumPy print options.
format : str, optional
The format of the result. If not provided, an unadorned
string is returned. Supported values are:
- 'latex': Return a LaTeX-formatted string
- 'latex_inline': Return a LaTeX-formatted string that uses
negative exponents instead of fractions
subfmt : str, optional
Subformat of the result. For the moment, only used for
``format='latex'`` and ``format='latex_inline'``. Supported
values are:
- 'inline': Use ``$ ... $`` as delimiters.
- 'display': Use ``$\\displaystyle ... $`` as delimiters.
Returns
-------
str
A string with the contents of this Quantity
"""
if unit is not None and unit != self.unit:
return self.to(unit).to_string(
unit=None, precision=precision, format=format, subfmt=subfmt
)
formats = {
None: None,
"latex": {
None: ("$", "$"),
"inline": ("$", "$"),
"display": (r"$\displaystyle ", r"$"),
},
}
formats["latex_inline"] = formats["latex"]
if format not in formats:
raise ValueError(f"Unknown format '{format}'")
elif format is None:
if precision is None:
# Use default formatting settings
return f"{self.value}{self._unitstr:s}"
else:
# np.array2string properly formats arrays as well as scalars
return (
np.array2string(self.value, precision=precision, floatmode="fixed")
+ self._unitstr
)
# else, for the moment we assume format="latex" or "latex_inline".
# Set the precision if set, otherwise use numpy default
pops = np.get_printoptions()
format_spec = f".{precision if precision is not None else pops['precision']}g"
def float_formatter(value):
return Latex.format_exponential_notation(value, format_spec=format_spec)
def complex_formatter(value):
return "({}{}i)".format(
Latex.format_exponential_notation(value.real, format_spec=format_spec),
Latex.format_exponential_notation(
value.imag, format_spec="+" + format_spec
),
)
# The view is needed for the scalar case - self.value might be float.
latex_value = np.array2string(
self.view(np.ndarray),
threshold=(
conf.latex_array_threshold
if conf.latex_array_threshold > -1
else pops["threshold"]
),
formatter={
"float_kind": float_formatter,
"complex_kind": complex_formatter,
},
max_line_width=np.inf,
separator=",~",
)
latex_value = latex_value.replace("...", r"\dots")
# Format unit
# [1:-1] strips the '$' on either side needed for math mode
if self.unit is None:
latex_unit = _UNIT_NOT_INITIALISED
elif format == "latex":
latex_unit = self.unit._repr_latex_()[1:-1] # note this is unicode
elif format == "latex_inline":
latex_unit = self.unit.to_string(format="latex_inline")[1:-1]
delimiter_left, delimiter_right = formats[format][subfmt]
return rf"{delimiter_left}{latex_value} \; {latex_unit}{delimiter_right}"

@mhvk
Copy link
Contributor

mhvk commented Feb 15, 2023

On the subclasses vs cls.inline -- in principle, subclassing should be fairly easy, since the non-inline version can use the inline version to typeset the nominator and denominator, and make those into a fraction. This is why I thought Console was more logically a subclass of ConsoleInline instead of vice versa. Anyway, I do still fear that this proliferation of subclasses is just going to hurt us, so let's think about the bigger picture first before considering implementation details...

@olebole
Copy link
Member Author

olebole commented Feb 15, 2023

For to_string() I would go the way to add an inline argument; this would probably already resolve many use cases (like having the unit prettyprinted in a table header on console or on rtd/sphinxdoc output). Maybe we should limit the goal of this PR to this. For later, I like the idea of a mini language.
Then, the open questions for this PR would be:

  • If this is forwarded in Unit.to_string() like this:
    def to_string(self, format=unit_format.Generic, subfmt=None):
        […]
        return f.to_string(self, subfmt=subfmt) 
    even for formatters that don't implement subfmts. Or we generically forward **kwargs.
  • Shall we keep latex_inline or make it "latex, subfmt=inline"? The latter would be more consistent, but how would we add the display/inline subformat?
  • How much can we break compatibility? Making inline the default may be incompatible to current uses. Replacing latex_inline by "latex, subfmt=inline" may break something.
  • Or we just define for the moment that the _ delimits units format and subformat, split them in Unit.to_string() and forward this:
    def to_string(self, format=unit_format.Generic):
        fmt = format.split('_', 1)
        f = unit_format.get_format(fmt[0])
        if len(fmt) == 1:       
            return f.to_string(self) 
        else:
            return f.to_string(self, subfmt=fmt[1]) 
    (and the query for the Base.registry should do the same)
    This would result in something that has the API identical to what we have, and it would be extendible. It would however promote the _ as delimiter between unit format and subformat, which currently is used only by latex_inline.

@mhvk
Copy link
Contributor

mhvk commented Feb 15, 2023

I'll have to think a little more, in particular about the implications for Quantity.to_string(), but what you write sounds like a super plan! I had independently wondered whether we could split on _ to keep backward compatibility.

@olebole
Copy link
Member Author

olebole commented Feb 16, 2023

I now again stepped back a bit and implemented just the basics; an "inline" option in console, unicode, latex, and used inline=True as default for console and unicode. Since latex multiline is used regularly (as formatting in IPython notebooks), I left the default there unchanged, and also left latex_inline for compatibility. The latter class has again trivial code (just overwriting the default), but this seems unavoidable for the moment.

I'd propose to take this as an acceptable intermediate step and to merge this. It only minimally changes the public API and postpones any "quantity format minilanguage", allowing to think here more about a more holistic approach. It also resolves somehoe #14398, because of the switch to inline formatting as default.
I am unsure how much we then would need the "multiline" options for console and unicode; maybe we wait here for user feedback, how much they are missed in the default formatting (I doubt that).

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! This works well, and I agree we might as well break it up into small steps. Only real comment is about a method I think is no longer used, plus a small suggestion.

astropy/units/format/latex.py Outdated Show resolved Hide resolved
astropy/units/format/latex.py Outdated Show resolved Hide resolved
With the "inline" option for the formats, positive/negative is no
longer a good description of the parts. While they are formed from
parts of the unit with positive and negative power if not inline,
their use is nominator and denominator in a fraction.
docs/whatsnew/5.3.rst Outdated Show resolved Hide resolved
@mhvk
Copy link
Contributor

mhvk commented Feb 16, 2023

pre-commit.ci autofix

@mhvk mhvk changed the title Add "unicode_inline" and "console_inline" unit formats with negative powers instead of fractions Add "inline" option for unicode and console unit formats to give negative powers instead of fractions Feb 16, 2023
@olebole
Copy link
Member Author

olebole commented Feb 16, 2023

As the files were touched here anyway, I deleted/replaced the single l/r variable with something speaking,,,

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for changing those too -- definitely clearer! Since tests now all passed, I'll merge. Nice!

@mhvk mhvk merged commit 24c5278 into astropy:main Feb 16, 2023
@olebole olebole deleted the units_unicode-inline-format branch February 17, 2023 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants