Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add to standard library: parse floats #375

Closed
andrewrk opened this issue May 20, 2017 · 14 comments · Fixed by #1958
Closed

add to standard library: parse floats #375

andrewrk opened this issue May 20, 2017 · 14 comments · Fixed by #1958
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase. standard library This issue involves writing Zig code for the standard library.
Milestone

Comments

@andrewrk
Copy link
Member

Depends on #374

@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label May 20, 2017
@andrewrk andrewrk added this to the 0.2.0 milestone May 20, 2017
@andrewrk
Copy link
Member Author

andrewrk commented Jul 8, 2017

Depends on #405

@raulgrell
Copy link
Contributor

raulgrell commented Aug 6, 2017

I've got a naive implementation of this in my tick repo, lib/float.zig

@andrewrk
Copy link
Member Author

andrewrk commented Aug 6, 2017

We have a robust implementation of this in zig now; it just needs a more complete bigint implementation (#405). It's a port of errol: https://github.com/marcandrysco/Errol

You can see the compile errors we get when you try to print a floating point now:

const io = @import("std").io;

pub fn main() -> %void {
    %%io.stdout.printf("Hello, world!\n{}\n", f32(12.34));
}
/home/andy/dev/zig/build/lib/zig/std/fmt/errol/index.zig:253:27: error: float value 10000000000000000000.000000 cannot be implicitly casted to type 'u128'
    const l64 = u64(low % pow19);
                          ^

hmm actually that's not what I was expecting. I think actually the bigint implementation may no longer be blocking this issue.

@andrewrk
Copy link
Member Author

Printing floats is basically working but there are bugs. I'm going to do some fuzz testing and see where the C errol code and zig port get different answers, then step them both side by side and see what happens differently.

@ofelas
Copy link

ofelas commented Aug 20, 2017

A wee test to trigger an edge case "integer overflow", just swap the comments on // if (false) and the next line

const print = @import("std").io.stdout.printf;
test "f64 print" {
    %%print("\n");
    var iter: usize = 0;
    while (iter < 16) : (iter += 1) {
        const u: u64 = (9007199254740992 - 3) + iter; // (2^53 - 3) + iter
        const uf: f64 = f64(u);
        const ufu: u64 = u64(uf);
        // integer overflow for 9007199254740992 and 9007199254740993
        // other neighbouring values seem ok
        // if (false) {
        if ((u == 9007199254740992) or (u == 9007199254740993)) {
            %%print("zig 64-bit: u64={} -> f64=n/a -> u64={} workaround\n", u, ufu);
        } else {
            %%print("zig 64-bit: u64={} -> f64={} -> u64={}\n", u, uf, ufu);
        }
    }
}

It's near when the f64 no longer can represent the large integer value.

zig 64-bit: u64=9007199254740990 -> f64=9.007199e18 -> u64=9007199254740990
zig 64-bit: u64=9007199254740991 -> f64=9.007199e18 -> u64=9007199254740991
zig 64-bit: u64=9007199254740992 -> f64=n/a -> u64=9007199254740992 workaround
zig 64-bit: u64=9007199254740993 -> f64=n/a -> u64=9007199254740992 workaround
zig 64-bit: u64=9007199254740994 -> f64=9.007199e18 -> u64=9007199254740994
zig 64-bit: u64=9007199254740995 -> f64=9.007199e18 -> u64=9007199254740996
zig 64-bit: u64=9007199254740996 -> f64=9.007199e18 -> u64=9007199254740996

C   64-bit: u64=9007199254740989 -> f64=9007199254740989.000000 -> u64=9007199254740989
C   64-bit: u64=9007199254740990 -> f64=9007199254740990.000000 -> u64=9007199254740990
C   64-bit: u64=9007199254740991 -> f64=9007199254740991.000000 -> u64=9007199254740991
C   64-bit: u64=9007199254740992 -> f64=9007199254740992.000000 -> u64=9007199254740992
C   64-bit: u64=9007199254740993 -> f64=9007199254740992.000000 -> u64=9007199254740992
C   64-bit: u64=9007199254740994 -> f64=9007199254740994.000000 -> u64=9007199254740994

@andrewrk andrewrk added the standard library This issue involves writing Zig code for the standard library. label Aug 28, 2017
@tiehuis
Copy link
Member

tiehuis commented Sep 6, 2017

Brought up in irc, but we currently have some problems on some simple cases too. Tested on latest master.

const io = @import("std").io;
pub fn main() -> %void {
    var a: f64 = 0.5;
    var b: f64 = 0.45;
    var c: f64 = 0.045;
    %%io.stdout.printf("{}\n", a); // -> 0.0
    %%io.stdout.printf("{}\n", b); // -> 0.45
    %%io.stdout.printf("{}\n", c); // -> 4.5e-1
}
$ ./float-example
0.0
0.45
4.5e-1

@tiehuis
Copy link
Member

tiehuis commented Sep 10, 2017

A more catastrophic case for reference.

pub fn main() -> %void {
    %%@import("std").io.stdout.printf("{}\n", f64(0.0));
}
integer overflow
/home/me/local/lib/zig/std/special/zigrt.zig:16:35: 0x000000000020b745 in ??? (t)
        @import("std").debug.panic("{}", message_ptr[0..message_len]);
                                  ^
/home/me/local/lib/zig/std/fmt/errol/index.zig:67:13: 0x000000000021ef3b in ??? (t)
        exp -= 1;
            ^
/home/me/local/lib/zig/std/fmt/errol/index.zig:28:19: 0x000000000020b48b in ??? (t)
    return errol3u(value, buffer);
                  ^
/home/me/local/lib/zig/std/fmt/index.zig:248:33: 0x000000000021caa2 in ??? (t)
    const float_decimal = errol3(f64(value), buffer[0..]);
                                ^
/home/me/local/lib/zig/std/fmt/index.zig:183:31: 0x000000000021b547 in ??? (t)
            return formatFloat(value, context, output);

@andrewrk andrewrk modified the milestones: 0.1.0, 0.2.0 Oct 16, 2017
@andrewrk andrewrk added the bug Observed behavior contradicts documented or intended behavior label Oct 16, 2017
@andrewrk andrewrk modified the milestone: 0.2.0 Oct 19, 2017
@scurest
Copy link
Contributor

scurest commented Oct 23, 2017

I need to print floats, so I've been poking at this today and I have some improvements in this commit. Here's what I did

  • Errol doesn't handle NaNs, the infinities, the zeros, or negative numbers. Those need to be done at some higher level. I just put it in formatFloat.

  • The 0.5, 0.45, 0.045 examples Errol actually gets right and they're just formatted wrong in formatFloat.

    This is because the special case for exponent=0 is handled wrong. I just dropped the special case, which means 0.5 is now printed as 5.0e-1, which is less pretty but its easier to check that the code's correct.

  • The seven digits we emit aren't enough to uniquely specify even an f32 if that was the intention. I think we need 8 for f32s and 17 for f64s.

  • Is 20 bytes enough for the buffer size? errolN_proc in the Errol repo uses 32. I bumped it up since its basically free since the array is undefined anyway.

  • I think this line in our Errol impl, corresponding to this line in the Errol repo, is a typo. I changed it but didn't notice any difference.

  • I also changed fpprev/fpnext to do what C does, but I don't know if it makes any difference.

This has been working for me so far, but if I wanted to PR it, how do you think I should test it?

@andrewrk
Copy link
Member Author

Thanks for looking into this.

This has been working for me so far, but if I wanted to PR it, how do you think I should test it?

One idea: write some C code to generate random floats and print them with printf. Output the float values as well as the printed strings. Have Zig print the same float values and check if you get the same results.

@nwsharp
Copy link
Contributor

nwsharp commented Jul 24, 2018

I'll pick up parsing IEEE-754 floating-point numbers from strings.

@kristate
Copy link
Contributor

kristate commented Aug 2, 2018

Can we get a checklist of what needs to happen here?

@andrewrk andrewrk changed the title add to standard library: parse floats and print floats add to standard library: parse floats Aug 27, 2018
@andrewrk andrewrk removed the bug Observed behavior contradicts documented or intended behavior label Aug 27, 2018
@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Aug 27, 2018
@andrewrk
Copy link
Member Author

What needs to happen here is std.fmt.parseFloat:

/// `bytes` is a utf-8 encoded string
pub fn parseFloat(comptime T: type, bytes: []const u8) !T {
    // ...
}

@winksaville
Copy link
Contributor

FYI, I've implemented ParseNumber which does both integers and floats and wrapped parseFloat(comptime T: type, bytes: []const u8) around it. I'd be glad to refactor it into a pull request if desired. And, of course, any other comments welcome.

Some Examples:

test "ParseNumber.parseFloat" {
    assert((try parseFloat(f64, "-1.000_001e10")) == -1.000001e10);
    assert((try parseFloat(f32, "0.1")) == 0.1);

    // Couple of examples using ParseNumber directly
    assert((try ParseNumber(i32).parse("0x1234_ABCD")) == 0x1234ABCD);
    const parseF32 = ParseNumber(f32).parse;
    assert((try parseF32("-1.0")) == -1.0);
}

@daurnimator
Copy link
Contributor

FWIW LuaJIT has an implementation that might be worth copying: https://github.com/LuaJIT/LuaJIT/blob/b025b01c5b9d23f6218c7d72b7aafa3f1ab1e08a/src/lj_strscan.c

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase. standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants