Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong characters in decimal number representation on s390x #10549

Closed
bencz opened this issue Sep 11, 2018 · 19 comments

Comments

@bencz
Copy link

@bencz bencz commented Sep 11, 2018

Steps to Reproduce

  1. Just build and execute the source
    main.zip

Current Behavior

00:00:45.0119377: 3.1415926535897932467959767074
00:00:10.9924616: 3.14159264358933
Seno: 0.08Ͽ釀823762829505493651
Coss: 0.99Ͽ釀917455304513339041
tang: 0.08Ͽ釀77181611388746733

Expected Behavior

00:00:19.3970898: 3,1415926535897932467959767090
00:00:02.2648858: 3,14159264358933
Seno: 0,0872673486823762829505493651
Coss: 0,9961946980917455304513339041
tang: 0,087600695777181611388746733

On which platforms did you notice this

[ ] macOS
[X] Linux
[ ] Windows

Version Used:

Mono JIT compiler version 5.19.0 (master/d7337904202 Tue Sep 11 10:14:02 EDT 2018)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
TLS: __thread
SIGSEGV: normal
Notifications: epoll
Architecture: s390x
Disabled: none
Misc: softdebug
Interpreter: yes
Suspend: preemptive
GC: sgen (concurrent by default)

@lewurm

This comment has been minimized.

Copy link
Member

@lewurm lewurm commented Sep 11, 2018

/cc @nealef

@lewurm lewurm added the target-s390x label Sep 11, 2018
@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 11, 2018

Any time I see Chinese characters appear it makes me think we have a UTF-16 issue with endianess. I will check. Do you know if this also occurs on the other big endian platform(s) /cc @NattyNarwhal

@bencz

This comment has been minimized.

Copy link
Author

@bencz bencz commented Sep 11, 2018

I tested on AIX, and this does not happen

@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 11, 2018

I will also have to wait until a new monolite is produced that fixes another endian problem. I believe the fix has been merged but I don't think the monolite has been built from it yet.

@NattyNarwhal

This comment has been minimized.

Copy link
Contributor

@NattyNarwhal NattyNarwhal commented Sep 12, 2018

I likewise can't triage until the fix is ready, but I can't reproduce on PASE with a build from July, FWIW:

sys720$ mono bencz.exe
00:00:36.8463190: 3.1415926535897932467959767130
00:00:43.3616680: 3.14159264358933
Seno: 0.0872673486823762829505493651
Coss: 0.9961946980917455304513339041
tang: 0.087600695777181611388746733
sys720$ mono --version
Mono JIT compiler version 5.17.0 (master/9f340f5 Fri Jul  6 00:00:25  2018)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:
        SIGSEGV:       normal
        Notification:  Thread + polling
        Architecture:  ppc
        Disabled:      shared_perfcounters
        Misc:          softdebug
        Interpreter:   yes
        Suspend:       preemptive
        GC:            sgen
sys720$ uname -srv
OS400 3 7
@NattyNarwhal

This comment has been minimized.

Copy link
Contributor

@NattyNarwhal NattyNarwhal commented Sep 13, 2018

I can't reproduce with the latest build on AIX either, now that the build on BE works - this might be s390x specific?:

aix$ MONO_PATH=mcs/class/lib/build runtime/mono-wrapper /home/calvin/bencz.exe
00:01:03.4127497: 3.1415926535897932467959767129
00:01:07.1237700: 3.14159264358933
Seno: 0.0872673486823762829505493651
Coss: 0.9961946980917455304513339041
tang: 0.087600695777181611388746733
aix$ mono/mini/mono --version
Mono JIT compiler version 5.19.0 (master/c2a94b3 Wed Sep 12 20:08:39  2018)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:
        SIGSEGV:       normal
        Notification:  Thread + polling
        Architecture:  ppc
        Disabled:      none
        Misc:          softdebug
        Interpreter:   yes
        Suspend:       preemptive
        GC:            sgen
@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 13, 2018

The simplest test case is:

using System;
using System.Text;

public class main{
        static void Main (string[] args)
        {
                decimal x = 0.928349234m;
		Console.WriteLine("{0}",x);
        }
}

When I run it on x86_64 with aot turned off I see this in the trace:

[0x7f722de00780: 0.09639 8] ENTER: System.Text.StringBuilder:AppendFormatHelper (System.IFormatProvider,string,System.ParamsArray)(this:0x7f722c406e80[System.Text.StringBuilder test.exe], [System.Globalization.CultureInfo:0x7f722c4049c8], [STRING:0x7f722dd74130:{0}], [], )
[0x7f722de00780: 0.09677 9] ENTER: System.Globalization.CultureInfo:GetFormat (System.Type)(this:0x7f722c4049c8[System.Globalization.CultureInfo test.exe], [TYPE:System.ICustomFormatter], )
[0x7f722de00780: 0.09681 9] LEAVE: System.Globalization.CultureInfo:GetFormat (System.Type)[OBJECT:(nil)]
[0x7f722de00780: 0.09697 9] ENTER: System.ParamsArray:get_Item (int)(value:0x7ffd4bf7a2f0, 0, )
[0x7f722de00780: 0.09704 9] LEAVE: System.ParamsArray:get_Item (int)[System.Decimal:0x7f722c4004d8]
[0x7f722de00780: 0.09749 9] ENTER: System.Decimal:ToString (string,System.IFormatProvider)(value:0x7f722c4004e8, (nil), [System.Globalization.CultureInfo:0x7f722c4049c8], )

For s390x I am seeing:

0x3ffb7377740 [  8] . . . . . . . . ENTER: System.Text.StringBuilder:AppendFormatHelper (System.IFormatProvider,string,System.ParamsArray) ip: 0x3ffb733ffbe sp: 0x3ffc057dbc8 - this[MONO_TYPE_CLASS]: 0x3ffb0400fc8, [CLASS/OBJ:0x3ffb0404090 [0x3ffc057db88] [System.Globalization.CultureInfo:0x3ffb0404090]], [CLASS/OBJ:0x3ffae75c130 [0x3ffc057db90] [STRING:0x3ffae75c130:{0}], ], [VALUETYPE:00,00,03,ff,b0,40,04,90,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,03,ff,b0,40,0f,38,],
0x3ffb7377740 [  9] . . . . . . . . . ENTER: System.Globalization.CultureInfo:GetFormat (System.Type) ip: 0x3ffb7340bb6 sp: 0x3ffc057d7e8 - this[MONO_TYPE_CLASS]: 0x3ffb0404090, [CLASS/OBJ:0x3ffae7790f0 [0x3ffc057d7a8] [System.RuntimeType:0x3ffae7790f0]],
0x3ffb7377740 [  9] . . . . . . . . . LEAVE: System.Globalization.CultureInfo:GetFormat (System.Type)[OBJECT:(nil)] ip: 0x3ffb732ce06
0x3ffb7377740 [  9] . . . . . . . . . ENTER: System.ParamsArray:get_Item (int) ip: 0x3ffb7341402 sp: 0x3ffc057d7e8 - this:[value:0x3ffc057dc68:0000000000000000], [INT4:0],
0x3ffb7377740 [  9] . . . . . . . . . LEAVE: System.ParamsArray:get_Item (int)[System.Decimal:0x3ffb0400490] ip: 0x3ffb7342a9e
0x3ffb7377740 [  9] . . . . . . . . . ENTER: System.Decimal:TryFormat (System.Span`1<char>,int&,System.ReadOnlySpan`1<char>,System.IFormatProvider) ip: 0x3ffb7341ed2 sp: 0x3ffc057d7e8 - this:[value:0x3ffb04004a0:0000000005d4c5c0], [MONO_TYPE_GENERICINST], [BYREF:0x3ffc057d9a0], [MONO_TYPE_GENERICINST], [CLASS/OBJ:0x3ffb0404090 [0x3ffc057d7c0] [System.Globalization.CultureInfo:0x3ffb0404090]],

s390x is doing TryFormat but x86_64 just does Format. I assume this is because we either have a different provider. It appears the Provider is obtained via:

System.IO.TextWriter:get_FormatProvider

I need to understand this Provider logic and why it is different for s390x.

@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 13, 2018

Actually, I think it's more to do with NumberFormatInfo but I still need to understand how this stuff fits together.

@NattyNarwhal

This comment has been minimized.

Copy link
Contributor

@NattyNarwhal NattyNarwhal commented Sep 13, 2018

@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 13, 2018

Thanks. It answers my question - AIX also uses System.Decimal:TryFormat. Therefore, there's something in that instruction path that s390x is doing wrong.

@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 13, 2018

Trace shows AIX uses same code path as s390x. I have verified that methods such as UInt32ToDecimalChars, FormatGeneral, and NumberToString are producing the correct stuff. Something is trashing that between the sb.TryCopyTo() and Buffer:Memmove().

@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 14, 2018

It appears to be an overwriting of the string with the ValueStringBuilder object when it's about to be used as a parameter to the TryCopyTo() method.

nealef added a commit to nealef/mono that referenced this issue Sep 14, 2018
@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 14, 2018

The s390x implementation of OUTARG_VT for the case where a structure is being passed on the stack was faulty. I have tested it on my simple test case as well as the test case provided. The output is in the correct format. However, the coss() value is not the same so I am not sure what else I may have broken:

00:00:44.5156635: 3.1415926535897932467959767072
00:00:10.6698318: 3.14159264358933
Seno: 0.0872673486823762829505493651
Coss: 11.415434445219119130984073813
tang: 0.087600695777181611388746733
@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 14, 2018

OUTARG_VT processing was incorrect - the structure wasn't being used properly on the stack such that things were being overwritten. However, I note the PPC implementation of mono_arch_emit_outarg_vt() differs from mine in a couple of key respects and I am not sure what conclusions to draw.

PPC -

                MonoInst *vtcopy = mono_compile_create_var (cfg, m_class_get_byval_arg (src->klass), OP_LOCAL);
                MonoInst *load;
                guint32 size;

                /* FIXME: alignment? */
                if (call->signature->pinvoke) {
                        size = mono_type_native_stack_size (m_class_get_byval_arg (src->klass), NULL);
                        vtcopy->backend.is_pinvoke = 1;
                } else {
                        size = mini_type_stack_size (m_class_get_byval_arg (src->klass), NULL);
                }
                if (size > 0)
                        g_assert (ovf_size > 0);

                EMIT_NEW_VARLOADA (cfg, load, vtcopy, vtcopy->inst_vtype);
                mini_emit_memcpy (cfg, load->dreg, 0, src->dreg, 0, size, TARGET_SIZEOF_VOID_P);

                if (ainfo->offset)
                        MONO_EMIT_NEW_STORE_MEMBASE (cfg, OP_STORE_MEMBASE_REG, ppc_r1, ainfo->offset, load->dreg);
                else
                        mono_call_inst_add_outarg_reg (cfg, call, load->dreg, ainfo->reg, FALSE);

s390x -

                ERROR_DECL (error);
                MonoMethodHeader *header;
                int srcReg;

                header = mono_method_get_header_checked (cfg->method, error);
                mono_error_assert_ok (error); /* FIXME don't swallow the error */
                if ((cfg->flags & MONO_CFG_HAS_ALLOCA) || header->num_clauses)
                        srcReg = s390_r11;
                else
                        srcReg = STK_BASE;

                if (ainfo->reg == STK_BASE) {
                        MONO_EMIT_NEW_STORE_MEMBASE (cfg, OP_STORE_MEMBASE_REG, srcReg, ainfo->offset, src->dreg);

                        if (cfg->compute_gc_maps) {
                                MonoInst *def;

                                EMIT_NEW_GC_PARAM_SLOT_LIVENESS_DEF (cfg, def, ainfo->offset, m_class_get_byval_arg (ins->klass));
                        }
                } else
                        mono_call_inst_add_outarg_reg (cfg, call, src->dreg, ainfo->reg, FALSE);

The key area of difference is the use of vtcopy and EMIT_NEW_VARLOADA.

The code generated for s390x looks like this:

 13e:   ec 3f 00 e4 00 d9       aghik   %r3,%r15,228 .  - R3 = Address of the structure
 144:   ec 2f 00 f4 00 d9       aghik   %r2,%r15,244 .  - R2 = Location of copy area
 14a:   a7 49 00 10             lghi    %r4,16 .        - R4 = Length of structure
 14e:   a7 49 00 10             lghi    %r4,16          - Not sure why R4 is loaded twice
 152:   c0 e8 00 00 03 ff       iihf    %r14,1023       - R14 = Address of memcpy (top half)
 158:   c0 e9 7e 95 4e c8       iilf    %r14,2123714248 - (bottom half)
 15e:   0d ee                   basr    %r14,%r14       - Perform copy of structure
 160:   a7 29 00 05             lghi    %r2,5           - R2 = 5 (1st parameter to check_outarg_vt)
 164:   ec 3f 00 f4 00 d9       aghik   %r3,%r15,244    - R3 = Address of copy of the structure (2nd parm)
 16a:   c0 e8 00 00 03 ff       iihf    %r14,1023       - R14 = Address of check_outarg_vt (top half)
 170:   c0 e9 7e 95 4e 68       iilf    %r14,2123714152 - (bottom half)
 176:   0d ee                   basr    %r14,%r14       - Call to method

So I think the logic is correct. I am not sure why I would need to use the vtcopy method of ppc which would generate a copy of the copy. mono_decompose_vtype_opts() in method-to-ir.c appears to take care of that for us. @vargaz Is my understanding correct?

@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 14, 2018

That create_var in mono_decompose_vtype_opts() is only performed if src_var is NULL. For my test case it is so we get a copy. For the original test case src_var isn't so we don't get a copy and as a result that Cos value is being overwritten by the result of the division. I am not sure how I can tell whether I need to use the vtcopy mechanism (which would fix this particular case but break my check_outarg_vt() test) and when not to (when my test case works but the original one doesn't).

@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 14, 2018

It appears I was missing some code to handle the return of a structure. That has partially fixed the problem.

--- a/mono/mini/mini-s390x.c
+++ b/mono/mini/mini-s390x.c
@@ -2543,6 +2543,17 @@ mono_arch_emit_call (MonoCompile *cfg, MonoCallInst *call)
            (i == sig->sentinelpos)) {
                emit_sig_cookie (cfg, call, cinfo);
        }
+
+       if (cinfo->struct_ret) {
+               MonoInst *vtarg;
+
+               MONO_INST_NEW (cfg, vtarg, OP_MOVE);
+               vtarg->sreg1 = call->vret_var->dreg;
+               vtarg->dreg = mono_alloc_preg (cfg);
+               MONO_ADD_INS (cfg->cbb, vtarg);
+
+               mono_call_inst_add_outarg_reg (cfg, call, vtarg->dreg, cinfo->struct_ret, FALSE);
+       }
 }
 
 /*========================= End of Function ========================*/

So the case where:

decimal x = 100m;
decimal y = 500m;
var z = x  / y;

which had been resulting in x = 100 y = 5 z = 5 now gives the expected 100 500 5. However, if I change it to:

decimal x = 100m;
decimal y = 500m;
var z = 1 / (x  / y);

I get x=100 y=5 z=0.2.

@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 17, 2018

I restored the vtcopy code that had been there and now both the simple and original test cases work.

@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 17, 2018

@luhenry luhenry closed this in cd70e9c Sep 18, 2018
@nealef

This comment has been minimized.

Copy link
Contributor

@nealef nealef commented Sep 18, 2018

Fix merged into master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.