I/O module: Meaning of %.6 varies depending on what type it is applied to #18497

Closed
lydia-duncan opened this issue Sep 30, 2021 · 6 comments

Comments

@lydia-duncan
Member

lydia-duncan commented Sep 30, 2021

The meaning of the number after the decimal point in a format specifier varies depending on the type of the argument it is applied to. For integers, it means "insert a decimal point and pad with the specified number of zeroes", while for reals, we only print the decimals that were already provided. This is especially confusing when the format string itself is unchanged, such as when using the generic n specifier, which accepts any numeric type:

use IO.FormattedIO;

writef("%.6n\n", 35);    // prints `35.000000`
writef("%.6n\n", 2.13);  // prints `2.13`

Such differences are reflected when explicitly specifying the type:

use IO.FormattedIO;

writef("%.6i\n", 35);    // prints `35.000000`
writef("%.6r\n", 2.13);  // prints `2.13`

While each behavior may make sense for its own type, taken together they are confusing and could cause problems when a format string is copied and adapted elsewhere in a program.

Should we change this behavior? How?

@bradcray
Member

bradcray commented Oct 6, 2021

This is the kind of question that makes me want to ask "What do C and Python do?" (where I suspect that C has it easier because AFAIK, it doesn't support type-neutral format specifiers like Chapel's %n; I'm less familiar with what the situation is in Python).

@lydia-duncan
Member Author

From some brief experimentation, it looks like Python:

  • always pads to the specified amount when used with the explicit float specifier:
>>> x = 3.2637
>>> y = 2
>>> print(f'{x:.3f}')
3.264
>>> print(f'{y:.3f}')
2.000
>>> z = 4.1
>>> print(f'{z:.3f}')
4.100
  • doesn't allow a precision on integer values, even with the generic number specifier n (the specification says that for integers n behaves like d, the integer specifier)
>>> y = 2
>>> print(f'{y:.3d}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Precision not allowed in integer format specifier
>>> print(f'{y:.3n}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Precision not allowed in integer format specifier
  • with n and a float, treats .3 as the number of significant digits (not decimal places) and does not pad shorter numbers out to fill the space (this differs from the behavior of f)
>>> x = 3.2637
>>> z = 4.1
>>> print(f'{z:.3n}')
4.1
>>> print(f'{x:.3n}')
3.26
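The three observations above can be reproduced in one script. This is a minimal sketch: `try_format` is a helper name introduced here for illustration, and it assumes the default "C" locale, since `n` is locale-aware.

```python
def try_format(value, spec):
    """Return the formatted string, or the error text if Python rejects the spec."""
    try:
        return format(value, spec)
    except ValueError as err:
        return f"ValueError: {err}"

# Try each precision spec against a long float, a short float, and an int.
for value in (3.2637, 4.1, 2):
    for spec in (".3f", ".3n", ".3d"):
        print(f"{value!r:>7} {spec:>4} -> {try_format(value, spec)}")
```

The same `.3` produces padding with `f`, significant-digit rounding with `n` on a float, and an error with `n` or `d` on an int.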

@lydia-duncan lydia-duncan changed the title [Library Stabilization] Meaning of %.6 varies depending on what type it is applied to I/O module: Meaning of %.6 varies depending on what type it is applied to Mar 23, 2022
@lydia-duncan
Member Author

  • As Brad said above, C does not allow general type specifiers. Unlike Python (which rejects a precision on integers, see above) and Chapel, C pads integers with leading zeroes. Its behavior with floating-point numbers matches Python's.
#include <stdio.h>

int main() {
  printf("%.3i\n", 2); // prints `002`
  printf("%.3f\n", 3.2637); // prints `3.264`
  printf("%.3f\n", 4.1); // prints `4.100`
}
  • Rust doesn't seem to allow you to specify the type as part of the format string, but varies what it shows based on the type passed. Integers ignore precision specifiers; floating-point numbers use the precision as the number of digits after the decimal point (matching Python and C).
  • Julia just provides a wrapper for C's printf so far as I can tell
  • Swift also does not allow general type specifiers and provides the same interface as C
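As a side note to the comparison above: Python's older printf-style `%` operator follows C's conventions rather than those of its format-spec mini-language, so it can be used to check the C behavior without a compiler. A minimal sketch (the variable names are only for illustration):

```python
# printf-style formatting in Python mirrors C: an integer precision means
# "minimum number of digits", padded with leading zeroes.
int_padded = "%.3i" % 2
float_rounded = "%.3f" % 3.2637
float_padded = "%.3f" % 4.1

print(int_padded)     # 002
print(float_rounded)  # 3.264
print(float_padded)   # 4.100
```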

@lydia-duncan
Member Author

lydia-duncan commented May 12, 2023

Summary for the upcoming ad-hoc sub-team discussion:

1. What is the API being presented?

The meaning of the number after the decimal point in a format specifier varies depending on the type of the argument it is applied to. For integers, it means "insert a decimal point and pad with the specified number of zeroes", while for reals, we only print the decimals that were already provided. This is especially confusing when the format string itself is unchanged, such as when using the generic n specifier, which accepts any numeric type:

use IO.FormattedIO;

writef("%.6n\n", 35);    // prints `35.000000`
writef("%.6n\n", 2.13);  // prints `2.13`

Such differences are reflected when explicitly specifying the type:

use IO.FormattedIO;

writef("%.6i\n", 35);    // prints `35.000000`
writef("%.6r\n", 2.13);  // prints `2.13`

While each behavior may make sense for its own type, taken together they are confusing and could cause problems when a format string is copied and adapted elsewhere in a program.

How is it intended to be used?

For reals and complexes, the precision is used to truncate the number if it has more significant digits than would otherwise fit. Interestingly, the docs for integer format specifiers don't explicitly mention a precision as something that is allowed, but Chapel accepts it today and pads the number with a decimal point and trailing zeroes.

How is it being used in Arkouda and CHAMPS?

In Arkouda, there are 42 uses of this format modifier. All of them are with %r; there are no uses with %n, %i or %t.

Uses in Arkouda:

In CHAMPS, there are 300+ uses, all of which are with %r. There are no uses with %n, %t or %i.

2. What's the history of the feature, if it already exists?

It was added with readf/writef support in 2013 and has mostly been unchanged since then.

3. What's the precedent in other languages, if they support it?

(see above)

4. Are there known GitHub issues with the feature?

5. Are there features it is related to? What impact would changing or adding this feature have on those other features?

6. How do you propose to solve the problem?

  • Maintain the status quo (though we probably should document the behavior with ints)

    • Pros:
      • less effort
    • Cons:
      • Inconsistent across types, which could be confusing
      • Not consistent with other languages
  • Modify our behavior so that the int version matches C/Julia/Swift (pad with zeroes before the number)

    • Pros:
      • Consistent with C/Julia/Swift and IEEE standard
      • Integers printed with this will be easily recognizable as integers when being read back
    • Cons:
      • Behavior will be inconsistent across types, which could be confusing
  • Drop support for %.<num> with integers (arguably this could be categorized as a bug fix since we don't document it for ints).

    • Pros:
      • Matches Python
      • Arkouda and CHAMPS don't use it, so arguably it is not useful to users
      • No documentation update
      • We already have some types that don't support it, so there's some precedent within our own implementation.
    • Cons:
      • Doesn't match C/Julia/Swift
      • Using %.<num> with %n and %t will still vary depending on what is sent (except that some of those cases will now be an error instead)

I favor dropping support for %.<num> with integers. I'd prefer not to maintain the status quo, and would be fine with modifying it to match C/Julia/Swift.

@lydia-duncan
Member Author

In our ad-hoc sub-team meeting today, we decided to:

  • drop support for %.6i.
    • Jade: writing an integer with decimal points seems weird
  • adjust %.6r to match C/Julia/Swift (where %.6n for the same number will also adjust in the same way)
    • Jade: expect that when %.6 is provided, padding will occur.
    • Jeremiah: slight preference for leaving it alone but okay with changing it
  • allow %.6n when an integer is passed, but treat it like %r was used.
    • Jade: having it break when going from more specific to less specific seems counterintuitive
    • Lydia: writef("%.6r", 1) -> writef("%.6n", 1) seems like it should behave the same and with no %.6i, people probably won't expect any different behavior for %.6 and an integer.
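The decisions above can be summarized as a small model. This is only a sketch of the agreed-on semantics, not Chapel's implementation: `model_writef` is a hypothetical name, and raising an exception for %i is an assumption made here to represent "dropped support".

```python
def model_writef(conversion, precision, value):
    """Hypothetical model of the agreed-on precision handling (not Chapel code)."""
    if conversion == "i":
        # Assumption: "drop support for %.6i" is modeled as an exception.
        raise ValueError("precision is not supported with %i")
    if conversion in ("r", "n"):
        # %r and %n: always print exactly `precision` decimals, as C's %f does,
        # including when the argument is an integer.
        return f"{float(value):.{precision}f}"
    raise ValueError(f"conversion %{conversion} is not modeled here")

print(model_writef("r", 6, 2.13))  # 2.130000
print(model_writef("n", 6, 35))    # 35.000000
print(model_writef("r", 6, 1))     # 1.000000
```

Under this model, switching a format string from %.6r to %.6n never changes the output for the same argument, which is the property Jade and Lydia argued for.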

jeremiah-corrado added a commit that referenced this issue Aug 9, 2023
Makes a couple of changes to handling of precision arguments in
formatted IO per discussion here:
#18497 (comment)

- `%i` and `%u` both emit a warning when a precision argument is
provided. This is not treated as a deprecation because the spec (to my
knowledge) doesn't indicate that it is a supported feature
- `%r` and `%n` both include precision digits for integer values. E.g.,
`writef("%.5r", 1)` prints `1.00000`. This is a bug fix.

Note: The `compopts` files for a few formatted IO C-tests are also
updated to include `-lm`. This flag ensures that the C math library
gets linked with the program, as `floorf` is now
used in `qio_formatted.c`.

Testing:
- [x] local paratest
- [x] gasnet paratest

[ reviewed by @lydia-duncan ] - thanks!
@jeremiah-corrado
Contributor

Resolved by #22924
