Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fish converts an empty string to zero in numeric contexts #3346

Closed
krader1961 opened this issue Sep 1, 2016 · 5 comments
Closed

fish converts an empty string to zero in numeric contexts #3346

krader1961 opened this issue Sep 1, 2016 · 5 comments
Assignees
Labels
bug Something that's not working as intended release notes Something that is or should be mentioned in the release notes
Milestone

Comments

@krader1961
Copy link
Contributor

krader1961 commented Sep 1, 2016

Several recent issues involving the test and printf builtins brought to light that fish is inconsistent about how it treats an empty, null, string when converting it to a number. For example, test "" -ge 0 is considered true because the empty string is converted to a zero. Doing printf '%f\n' "" prints 0.000000. Yet the similar printf '%d\n' "" on macOS Sierra (as of git commit eeb42f5) reports the error ": Invalid argument". Whereas printf '%f\n' "" prints "0.00000". On the other hand the printf '%d\n' "" on Ubuntu 16.04 (with the same version of fish) prints "0".

While I am curious why Ubuntu and macOS Sierra handle the printf '%d\n' "" case differently it's orthogonal to the core point of this issue. Which is, fish should consistently treat an empty, null, string (i.e., "") as an invalid number. The current behavior is problematic. It will also be inconsistent if the change I believe @floam is working on to make the test "" -ge 0 scenario be an error rather than return a true (zero) status is merged.

See also issue #3334.

@krader1961 krader1961 added the bug Something that's not working as intended label Sep 1, 2016
@krader1961 krader1961 added this to the fish-future milestone Sep 1, 2016
@floam
Copy link
Member

floam commented Sep 7, 2016

While I am curious why Ubuntu and macOS Sierra handle the printf '%d\n' "" case differently it's orthogonal to the core point of this issue.

Actually, the OS X/Linux difference here and the cause of the builtin_test failure are likely quite related: checking strtol success.

Consider how we were checking for a successful conversion by wcstoll in the test builtin:

static bool parse_number(const wcstring &arg, long long *out, wcstring_list_t &errors) {
     const wchar_t *str = arg.c_str();
     wchar_t *endptr = NULL;
     *out = wcstoll(str, &endptr, 10);
     return endptr && *endptr == L'\0';
}

wcstoll returns 0 in the case of failure. Here's the manpage suggesting how to test that the string was not invalid:

... strtol() stores the address of the first invalid
character in *endptr. If there were no digits at all, however, strtol()
stores the original value of str in *endptr. (Thus, if *str is not \0
but **endptr is \0 on return, the entire string was valid.)

By only doing half of this by checking *endptr == '\0' but not str, we end up storing 0 into out and then return true for a failed conversion if there were no digits at all and the input was an empty string. The correct check is *str != L'\0' && *endptr == L'\0'.

Linux vs. OS X

In my changes I went further than using only the prescribed test from the manpage for builtin_test, and I also am checking errno because I wanted to give a more helpful error message to the user (please don't mind some aspects of the diff which are not relevant to this discussion and are in-progress):

static bool parse_number(const wcstring &arg, long long *out, wcstring_list_t &errors) {
+    using namespace test_expressions;
     const wchar_t *str = arg.c_str();
     wchar_t *endptr = NULL;
+    // from the manual:
+    // If endptr is not NULL, wcstrtoll() stores the address of the first invalid
+    // character in *endptr.  If there were no digits at all, strtol() stores
+    // the original value of str in *endptr.  (Thus, if *str is not L'\0' but
+    // **endptr is `\0' on return, the entire string was valid.)
+    
+    errno = 0;
     *out = wcstoll(str, &endptr, 10);
-    return endptr && *endptr == L'\0';
+
+    // XXX DONTCOMMITAARON use &errors instead before committing
+    if (errno == EINVAL) {
+        debug(0, L"Invalid digit: '%S'", endptr);
+        return false;
+    } else if (errno == ERANGE) {
+        debug(0, L"Out of range digit: '%S'", endptr);
+        return false;
+    }
+
+    return *str != L'\0' && *endptr == L'\0';

I recall that according to the manpage, the behavior of errno being set EINVAL where conversion fails after no valid digits might be a platform-specific feature, although I never investigated what that would mean practically:

The strtol(), strtoll(), strtoimax(), and strtoq() functions return the
result of the conversion, unless the value would underflow or overflow.
If no conversion could be performed, 0 is returned and the global vari-
able errno is set to EINVAL (the last feature is not portable across all
platforms)
. If an overflow or underflow occurs, errno is set to ERANGE
and the function return value is clamped according to the following ta-
ble.

My hunch would be that in printf we intended to always do what I did up above, and it worked on OS X, but unfortunately by relying on errno to verify the conversion (which will turn out to work differently for OS X versus Linux here for me to be correct), and then not correctly checking both str and endptr, a major divergence in behavior is possible and that's what is happening.

So I might not want to check for EINVAL like I do above in my WIP. If it's not portable, it will be better to report the "invalid digit" error outside this function when it returns false due to the improvement to the return expression. We will want to make sure not to rely on EINVAL in this context for anything but diagnostics, if it isn't portable, in fish.

@floam
Copy link
Member

floam commented Sep 7, 2016

Indeed, commenting out the errno check in verify_numeric causes OS X to revert to the reported broken Linux behavior:

void builtin_printf_state_t::verify_numeric(const wchar_t *s, const wchar_t *end, int errcode) {
    /*if (errcode != 0) {
        this->fatal_error(L"%ls: %s", s, strerror(errcode));
    } else*/ if (*end) {
        if (s == end)
            this->fatal_error(_(L"%ls: expected a numeric value"), s);
        else
            this->fatal_error(_(L"%ls: value not completely converted"), s);
    }
}
> printf \%d ''
0⏎                        

It needs to check *end and *s correctly in the else if.

@krader1961
Copy link
Contributor Author

I see that @floam added a change, commit f843eb3 and PR #3405, intended to at least partially address this issue. Is that true, @floam? Given that commit and your earlier comments on this issue would you like to take ownership of it?

@ridiculousfish
Copy link
Member

FYI I reverted f843eb3

@krader1961
Copy link
Contributor Author

Fixing issue #3570 caused me to notice that there are numerous places in the fish code that set errno to zero before performing a string to int conversion then checking the value of errno. That is fubar.

@krader1961 krader1961 self-assigned this Nov 24, 2016
@krader1961 krader1961 modified the milestones: fish 2.5.0, fish-future Nov 25, 2016
krader1961 added a commit that referenced this issue Nov 28, 2016
This augments the previous change for issue #3346 by adding an error
message when an invalid integer is seen. This change is likely to be
controversial so I'm not going to squash it into the previous change.
@faho faho added the release notes Something that is or should be mentioned in the release notes label Dec 2, 2016
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 18, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something that's not working as intended release notes Something that is or should be mentioned in the release notes
Projects
None yet
Development

No branches or pull requests

4 participants