Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

m68k FPU precision issue #62262

Closed
mirabilos mannequin opened this issue May 25, 2013 · 15 comments
Closed

m68k FPU precision issue #62262

mirabilos mannequin opened this issue May 25, 2013 · 15 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@mirabilos
Copy link
Mannequin

mirabilos mannequin commented May 25, 2013

BPO 18062
Nosy @mdickinson, @skrah, @mirabilos, @andreas-schwab
Superseder
  • bpo-20904: HAVE_PY_SET_53BIT_PRECISION for m68k
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2014-05-13.22:30:38.655>
    created_at = <Date 2013-05-25.21:00:51.320>
    labels = ['interpreter-core', 'type-bug']
    title = 'm68k FPU precision issue'
    updated_at = <Date 2014-05-13.22:30:38.653>
    user = 'https://github.com/mirabilos'

    bugs.python.org fields:

    activity = <Date 2014-05-13.22:30:38.653>
    actor = 'skrah'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-05-13.22:30:38.655>
    closer = 'skrah'
    components = ['Interpreter Core']
    creation = <Date 2013-05-25.21:00:51.320>
    creator = 'mirabilos'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 18062
    keywords = []
    message_count = 15.0
    messages = ['189999', '190007', '190009', '190027', '190029', '190030', '190068', '190069', '190081', '190086', '190087', '190091', '190103', '215467', '218487']
    nosy_count = 4.0
    nosy_names = ['mark.dickinson', 'skrah', 'mirabilos', 'schwab']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'resolved'
    status = 'closed'
    superseder = '20904'
    type = 'behavior'
    url = 'https://bugs.python.org/issue18062'
    versions = ['Python 3.3']

    @mirabilos
    Copy link
    Mannequin Author

    mirabilos mannequin commented May 25, 2013

    Hi,

    splitting off bpo-18061 by request of pitrou:

    FPU precision issue: MC68881 FPU, similar to Intel 80387, uses 80-bit precision internally. We do not want to change it to 64-bit because it’s not supported in all environments. Can probably be reproduced on i386 (with disabled SSE); fixing this in general would allow keeping the FPUCW on i387 unchanged too.

    See bpo-18061 for details.

    @mirabilos mirabilos mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label May 25, 2013
    @skrah
    Copy link
    Mannequin

    skrah mannequin commented May 25, 2013

    We do not want to change it to 64-bit because it’s not supported in all environments.

    Does this also apply to changing the precision temporarily? Because
    that is what happens for other platforms, see Include/pyport.h:

    HAVE_PY_SET_53BIT_PRECISION
    _PY_SET_53BIT_PRECISION_START
    _PY_SET_53BIT_PRECISION_END
    ...

    To me it sounds that you'd basically have to provide these macros
    for the platform.

    @mirabilos
    Copy link
    Mannequin Author

    mirabilos mannequin commented May 25, 2013

    > We do not want to change it to 64-bit because it’s not supported
    > in all environments.

    Does this also apply to changing the precision temporarily?

    Yes, that’s precisely what’s not working on the most widespread emulator, at the very least. (I have working code for that, in three(!) variants, but it just ignores those requests to change precision.)

    I think FPU is the one thing current m68k emulators do the worst… the responses I got from the mailing list back then said so, anyway.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented May 25, 2013

    I forgot to comment on this:

    fixing this in general would allow keeping the FPUCW on i387 unchanged too.

    Actually dtoa.c (which is used on i387) explicitly requires to change
    the control word.

    Looking at the test results, it seems to me that a couple of tests
    could be skipped if PY_NO_SHORT_FLOAT_REPR is defined, i.e.the
    failures are fully expected.

    Not sure about the lgamma etc. failures though, perhaps that's
    an emulator bug.

    @mirabilos
    Copy link
    Mannequin Author

    mirabilos mannequin commented May 25, 2013

    mirabilos dixit:

    mirabilos added the comment:

    Yes, that’s precisely what’s not working on the most widespread
    emulator, at the very least. (I have working code for that, in
    three(!) variants, but it just ignores those requests to change
    precision.)

    Here’s the Python testcase made into a standalone test:

    -----BEGIN cutting here may damage your screen surface-----

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #ifndef __MirBSD__
    #include <fpu_control.h>
    #endif
    
    void runtests(void);
    
    void
    runtests(void)
    {
    	volatile double x, y, z;
    
    	/* 1./(1-2**-53) -> 1+2**-52 (correct), 1.0 (double rounding) */
    	x = 0.99999999999999989; /* 1-2**-53 */
    	y = 1./x;
    	printf("test#1 %s: %.17E\n", (y == 1.) ? "fail" : "good", y);
    	/* 1e16+2.99999 -> 1e16+2. (correct), 1e16+4. (double rounding) */
    	x = 1e16;
    	y = 2.99999;
    	z = x + y;
    	printf("test#2 %s: %.17E\n", z == 1e16+4. ? "fail" : "good", z);
    }
    
    int
    main(void) {
    #ifdef _FPU_SETCW
    	fpu_control_t cwold, cwnew, cwgot;
    #endif
    
    	runtests();
    #if (defined(__i386__) || defined(__m68k__)) && defined(_FPU_SETCW)
    	_FPU_GETCW(cwold);
    #ifdef __i386__
    	cwnew = 0x127f;
    #else
    	cwnew = _FPU_RC_NEAREST | _FPU_DOUBLE;
    #endif
    	_FPU_SETCW(cwnew);
    	_FPU_GETCW(cwgot);
    	printf("changing FPU control word from %08X to %08X => %08X\n",
    	    cwold, cwnew, cwgot);
    	runtests();
    #endif
    
    	return (0);
    }
    -----END cutting here may damage your screen surface

    Here’s the result of running it on the latest ARAnyM, which
    did get MPFR-based FPU emulation bugfixes, but apparently
    still ignores any FPUCW changes (or, at least the ones relating
    to precision):

    root@ara3:~ # ./a.out
    test#1 fail: 1.00000000000000000E+00
    test#2 fail: 1.00000000000000040E+16
    changing FPU control word from 0000000 to 00000080 => 00000080
    test#1 fail: 1.00000000000000000E+00
    test#2 fail: 1.00000000000000040E+16

    bye,
    //mirabilos

    "Using Lynx is like wearing a really good pair of shades: cuts out
    the glare and harmful UV (ultra-vanity), and you feel so-o-o COOL."
    -- Henry Nelson, March 1999

    @mirabilos
    Copy link
    Mannequin Author

    mirabilos mannequin commented May 25, 2013

    Stefan Krah dixit:

    > fixing this in general would allow keeping the FPUCW on i387 unchanged too.

    Actually dtoa.c (which is used on i387) explicitly requires to change
    the control word.

    Right. By fixing the “older” code, i386 is not required to change
    the FPUCW any more and can use it too.

    The problem with changing the FPUCW on i387 is that it changes
    from 64/15 bit mantissa/exponent to 53/15 bit which is still
    not the 53/11 bit of IEEE double, so you *still* get double-
    rounding issues (with denormal numbers only, I guess) because
    the internal precision is still higher.

    And on m68k we simply cannot afford to change the FPUCW because
    that will cause runtime failures on some implementations.

    Looking at the test results, it seems to me that a couple of tests
    could be skipped if PY_NO_SHORT_FLOAT_REPR is defined, i.e.the
    failures are fully expected.

    Ah okay.

    Not sure about the lgamma etc. failures though, perhaps that's
    an emulator bug.

    If I can get self-contained test cases (preferably in C because
    that makes it easier for others to test), I can ask people who
    run real bare-metal Atari or Amiga to test them.

    Or you can ask on debian-68k@lists.debian.org for testers.

    I guess a python2.6 runnable testcase would be okay, even for
    those running an a bit older “base system”. (I admit mine is
    not really up to date either, just the build chroots, because
    of the post-release unfreeze upload peak and related transitions.)

    bye,
    //mirabilos

    “It is inappropriate to require that a time represented as
    seconds since the Epoch precisely represent the number of
    seconds between the referenced time and the Epoch.”
    -- IEEE Std 1003.1b-1993 (POSIX) Section B.2.2.2

    @mdickinson
    Copy link
    Member

    The problem with changing the FPUCW on i387 is that it changes
    from 64/15 bit mantissa/exponent to 53/15 bit which is still
    not the 53/11 bit of IEEE double, so you *still* get double-
    rounding issues (with denormal numbers only, I guess) because
    the internal precision is still higher.

    That's not a problem for dtoa.c, at least: dtoa.c avoids any use of subnormals in intermediate calculations. It's not really too much of a problem for Python in general, either. Windows typically operates in this mode.

    @mdickinson
    Copy link
    Member

    It's also an option not to use dtoa.c: Python still has fallback code that uses the OS double <-> char* conversions for the case where the configuration step can't figure out how to change the FPU control word. In that case compilation should still succeed, and the resulting Python would show:

    >>> import sys
    >>> sys.float_repr_style
    'legacy'

    If there are tests failing with the 'legacy' mode, that may just indicate buggy tests that haven't been properly marked as depending on the short float repr. (E.g., by decorating with "@unittest.skipUnless(getattr(sys, 'float_repr_style', '') == 'short'"), or poorly-designed tests that could be rewritten.

    @andreas-schwab
    Copy link
    Mannequin

    andreas-schwab mannequin commented May 26, 2013

    Thorsten Glaser <tg@mirbsd.de> writes:

    root@ara3:~ # ./a.out
    test#1 fail: 1.00000000000000000E+00
    test#2 fail: 1.00000000000000040E+16
    changing FPU control word from 0000000 to 00000080 => 00000080
    test#1 fail: 1.00000000000000000E+00
    test#2 fail: 1.00000000000000040E+16

    What is the exact sequence of fpu insns?

    Andreas.

    @mirabilos
    Copy link
    Mannequin Author

    mirabilos mannequin commented May 26, 2013

    Mark Dickinson dixit:

    If there are tests failing with the 'legacy' mode, that may just
    indicate buggy tests that haven't been properly marked as depending on
    the short float repr. (E.g., by decorating with

    I think that’s what we’re seeing here.

    Python 3.3.1 (default, May 10 2013, 02:52:57)
    [GCC 4.6.3] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.float_repr_style
    'legacy'

    Andreas Schwab dixit:

    What is the exact sequence of fpu insns?

    That’s all gcc-generated code and a system header…

    In main (after commenting out the second getcw and printf):

        jsr runtests
    

    #APP
    | 34 "x.c" 1
    fmove.l %fpcr, %d2
    | 0 "" 2
    #NO_APP
    move.l %d2,-4(%fp)
    move.l #128,-8(%fp)
    #APP
    | 40 "x.c" 1
    fmove.l -8(%fp), %fpcr
    | 0 "" 2
    #NO_APP
    jsr runtests

    And the subroutine:

    runtests:
    link.w %fp,#-24
    move.l %d3,-(%sp)
    move.l %d2,-(%sp)
    move.l #1072693247,-8(%fp)
    move.l #-1,-4(%fp)
    move.l -8(%fp),%d0
    move.l -4(%fp),%d1
    fmovecr #0x32,%fp0
    move.l %d1,-(%sp)
    move.l %d0,-(%sp)
    fmove.d (%sp)+,%fp1
    fdiv.x %fp1,%fp0
    fmove.d %fp0,-(%sp)
    move.l (%sp)+,%d0
    move.l (%sp)+,%d1
    move.l %d0,-16(%fp)
    move.l %d1,-12(%fp)
    move.l -16(%fp),%d2
    move.l -12(%fp),%d3
    move.l -16(%fp),%a0
    move.l -12(%fp),%a1
    move.l #1072693248,%d0
    clr.l %d1
    move.l %a1,-(%sp)
    move.l %a0,-(%sp)
    fmove.d (%sp)+,%fp0
    move.l %d1,-(%sp)
    move.l %d0,-(%sp)
    fmove.d (%sp)+,%fp1
    fcmp.x %fp1,%fp0
    fjne .L2
    move.l #.LC0,%d0
    jra .L3
    .L2:
    move.l #.LC1,%d0
    .L3:
    move.l #.LC2,%d1
    move.l %d3,-(%sp)
    move.l %d2,-(%sp)
    move.l %d0,-(%sp)
    move.l %d1,-(%sp)
    jsr printf
    lea (16,%sp),%sp
    move.l #1128383353,-8(%fp)
    move.l #937459712,-4(%fp)
    move.l #1074266106,-16(%fp)
    move.l #-1043161657,-12(%fp)
    move.l -8(%fp),%a0
    move.l -4(%fp),%a1
    move.l -16(%fp),%d0
    move.l -12(%fp),%d1
    move.l %a1,-(%sp)
    move.l %a0,-(%sp)
    fmove.d (%sp)+,%fp0
    move.l %d1,-(%sp)
    move.l %d0,-(%sp)
    fmove.d (%sp)+,%fp1
    fadd.x %fp1,%fp0
    fmove.d %fp0,-(%sp)
    move.l (%sp)+,%d0
    move.l (%sp)+,%d1
    move.l %d0,-24(%fp)
    move.l %d1,-20(%fp)
    move.l -24(%fp),%a0
    move.l -20(%fp),%a1
    move.l -24(%fp),%d0
    move.l -20(%fp),%d1
    move.l %d1,-(%sp)
    move.l %d0,-(%sp)
    fmove.d (%sp)+,%fp0
    fcmp.d #0x4341c37937e08002,%fp0
    fjne .L4
    move.l #.LC0,%d0
    jra .L5
    .L4:
    move.l #.LC1,%d0
    .L5:
    move.l #.LC3,%d1
    move.l %a1,-(%sp)
    move.l %a0,-(%sp)
    move.l %d0,-(%sp)
    move.l %d1,-(%sp)
    jsr printf
    lea (16,%sp),%sp
    move.l -32(%fp),%d2
    move.l -28(%fp),%d3
    unlk %fp
    rts

    bye,
    //mirabilos

    Yay for having to rewrite other people's Bash scripts because bash
    suddenly stopped supporting the bash extensions they make use of
    -- Tonnerre Lombard in #nosec

    @mirabilos
    Copy link
    Mannequin Author

    mirabilos mannequin commented May 26, 2013

    Laurent Vivier dixit:

    BTW, the result on a real CPU (68040) is :

    68881 even ;-)

    test#1 fail: 1.00000000000000000E+00
    test#2 fail: 1.00000000000000040E+16
    changing FPU control word from 0000000 to 00000080 => 00000080
    test#1 good: 1.00000000000000022E+00
    test#2 good: 1.00000000000000020E+16

    Thanks, that’s what I was guessing. I get similar results on i386.

    Now as additional data point, UAE/WinUAE/etc. would be interesting.

    Even so, I’d be very reluctant to add this code to Python to make
    it change the FPU mode, because Python is heavily used in Debian,
    and getting varying results between emulation and bare metal is
    something we’d like to not have…

    bye,
    //mirabilos

    "Using Lynx is like wearing a really good pair of shades: cuts out
    the glare and harmful UV (ultra-vanity), and you feel so-o-o COOL."
    -- Henry Nelson, March 1999

    @mirabilos
    Copy link
    Mannequin Author

    mirabilos mannequin commented May 26, 2013

    Laurent Vivier dixit:

    For the "etc" ;-) , in Qemu, I have:

    Hm, I thought qemu did not emulate an MMU?

    bye,
    //mirabilos

    [...] if maybe ext3fs wasn't a better pick, or jfs, or maybe reiserfs, oh but
    what about xfs, and if only i had waited until reiser4 was ready... in the be-
    ginning, there was ffs, and in the middle, there was ffs, and at the end, there
    was still ffs, and the sys admins knew it was good. :) -- Ted Unangst über *fs

    @mirabilos
    Copy link
    Mannequin Author

    mirabilos mannequin commented May 26, 2013

    Eero Tamminen dixit:

    > > Now as additional data point, UAE/WinUAE/etc. would be interesting.

    I built the test with fpu_control.h header from eglibc, using
    Sparemint GCC 2.9.5 (with 2010 binutils) and MiNTlib. When it's

    Nice ;)

    I.e. it seems that WinUAE FPU emulation is also lacking FPUCW change
    handling (for precision).

    (Hatari's WinUAE CPU core code was synched with upstream last year.)

    OK, thanks. I’d just say let’s say changing FPU precision is not
    part of the target we support. (Funnily enough, ColdFire according
    to the ’net has (unchangeable) 64-bit precision… maybe let’s just
    say precision on m68k in general is not defined.)

    bye,
    //mirabilos

    17:08⎜«Vutral» früher gabs keine packenden smartphones und so
    17:08⎜«Vutral» heute gibts frauen die sind facebooksüchtig
    17:10⎜«Vutral» aber auch traurig; früher warst du als nerd voll am arsch
    17:10⎜«Vutral» heute bist du als nerd der einzige der wirklich damit klarkommt

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Apr 3, 2014

    Can we somehow merge this issue with bpo-20904?

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented May 13, 2014

    This should be fixed now.

    @skrah skrah mannequin closed this as completed May 13, 2014
    @skrah skrah mannequin added the type-bug An unexpected behavior, bug, or error label May 13, 2014
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant