Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors when compiling with -fsanitize=undefined #812

Open
saghul opened this issue May 25, 2016 · 74 comments
Open

Errors when compiling with -fsanitize=undefined #812

saghul opened this issue May 25, 2016 · 74 comments

Comments

@saghul
Copy link
Contributor

saghul commented May 25, 2016

Using Duktape 1.5.0 and GCC 6.1.0 on OSX:

duk_hthread_builtins.c:313:3: runtime error: left shift of 31 by 27 places cannot be represented in type 'int'
duk_hobject_props.c:829:14: runtime error: load of misaligned address 0x7fe62c013f8f for type 'duk_uint32_t', which requires 4 byte alignment
0x7fe62c013f8f: note: pointer points here
 ff ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff
             ^
duk_hobject_props.c:831:15: runtime error: store to misaligned address 0x7fe62c013f8f for type 'duk_uint32_t', which requires 4 byte alignment
0x7fe62c013f8f: note: pointer points here
 ff ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff
             ^
duk_hobject_props.c:1298:17: runtime error: load of misaligned address 0x7fe62c013fd3 for type 'duk_uint32_t', which requires 4 byte alignment
0x7fe62c013fd3: note: pointer points here
 ff  ff ff ff 17 00 00 00 14  00 00 00 ff ff ff ff 03  00 00 00 1a 00 00 00 ff  ff ff ff 0d 00 00 00
              ^
duk_hobject_props.c:1306:15: runtime error: store to misaligned address 0x7fe62c01400b for type 'duk_uint32_t', which requires 4 byte alignment
0x7fe62c01400b: note: pointer points here
 13  00 00 00 ff ff ff ff 04  00 00 00 07 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00
              ^
duk_hobject_props.c:1160:6: runtime error: load of misaligned address 0x7fe62c013fa7 for type 'duk_uint32_t', which requires 4 byte alignment
0x7fe62c013fa7: note: pointer points here
 0b 00 00 00 05  00 00 00 0a 00 00 00 1c  00 00 00 ff ff ff ff 0c  00 00 00 ff ff ff ff ff  ff ff ff
             ^
duk_hobject_props.c:775:4: runtime error: member access within misaligned address 0x7fe62bd14ec2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd14ec2: note: pointer points here
 00 00  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:3958:3: runtime error: store to misaligned address 0x7fe62bd14ec2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd14ec2: note: pointer points here
 00 00  01 00 07 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 07 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:3959:3: runtime error: member access within misaligned address 0x7fe62bd14ec2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd14ec2: note: pointer points here
 00 00  01 00 08 00 00 00 e6 7f  00 00 80 4d d1 2b e6 7f  00 00 07 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:3959:3: runtime error: member access within misaligned address 0x7fe62bd14ec2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd14ec2: note: pointer points here
 00 00  01 00 08 00 00 00 e6 7f  00 00 80 4d d1 2b e6 7f  00 00 07 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:2092:2: runtime error: member access within misaligned address 0x7fe62bd14ed2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd14ed2: note: pointer points here
 e6 7f  00 00 07 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:2092:2: runtime error: store to misaligned address 0x7fe62bd14ed2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd14ed2: note: pointer points here
 e6 7f  00 00 07 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_js_ops.c:582:39: runtime error: member access within misaligned address 0x7fe62bd15242 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd15242: note: pointer points here
 00 00  01 00 08 00 00 00 fe ff  ff ff 60 00 c0 2b e6 7f  00 00 07 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_js_ops.c:597:41: runtime error: member access within misaligned address 0x7fe62bd15242 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd15242: note: pointer points here
 00 00  01 00 08 00 00 00 fe ff  ff ff 60 00 c0 2b e6 7f  00 00 07 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:3001:4: runtime error: load of misaligned address 0x7fe62bd154e2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd154e2: note: pointer points here
 e6 7f  00 00 08 00 00 00 ff 7f  00 00 f0 09 c0 2b e6 7f  00 00 00 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:3001:4: runtime error: member access within misaligned address 0x7fe62bd154e2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd154e2: note: pointer points here
 e6 7f  00 00 08 00 00 00 ff 7f  00 00 f0 09 c0 2b e6 7f  00 00 00 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:2027:7: runtime error: member access within misaligned address 0x7fe62bd152d2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd152d2: note: pointer points here
 00 00  01 00 08 00 00 00 00 00  00 00 e0 0e c0 2b e6 7f  00 00 08 00 00 00 39 00  00 00 10 13 c0 2b
              ^
duk_api_stack.c:2971:2: runtime error: load of misaligned address 0x7fe62bd152d2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd152d2: note: pointer points here
 00 00  01 00 08 00 00 00 00 00  00 00 e0 0e c0 2b e6 7f  00 00 08 00 00 00 39 00  00 00 10 13 c0 2b
              ^
duk_api_stack.c:2972:2: runtime error: member access within misaligned address 0x7fe62bd152d2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd152d2: note: pointer points here
 00 00  01 00 08 00 00 00 00 00  00 00 e0 0e c0 2b e6 7f  00 00 08 00 00 00 39 00  00 00 10 13 c0 2b
              ^
duk_api_stack.c:2972:2: runtime error: member access within misaligned address 0x7fe62bd152d2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd152d2: note: pointer points here
 00 00  01 00 08 00 00 00 00 00  00 00 e0 0e c0 2b e6 7f  00 00 08 00 00 00 39 00  00 00 10 13 c0 2b
              ^
duk_js_compiler.c:744:3: runtime error: load of misaligned address 0x7fe62bd15472 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd15472: note: pointer points here
 00 00  01 00 08 00 00 00 fe ff  ff ff 60 00 c0 2b e6 7f  00 00 08 00 00 00 ff 7f  00 00 60 09 c0 2b
              ^
duk_js_compiler.c:746:3: runtime error: member access within misaligned address 0x7fe62bd15472 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd15472: note: pointer points here
 00 00  01 00 08 00 00 00 fe ff  ff ff 60 00 c0 2b e6 7f  00 00 08 00 00 00 ff 7f  00 00 60 09 c0 2b
              ^
duk_js_compiler.c:746:3: runtime error: member access within misaligned address 0x7fe62bd15472 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd15472: note: pointer points here
 00 00  01 00 08 00 00 00 fe ff  ff ff 60 00 c0 2b e6 7f  00 00 08 00 00 00 ff 7f  00 00 60 09 c0 2b
              ^
duk_heap_refcount.c:522:6: runtime error: member access within misaligned address 0x7fe62bd15472 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd15472: note: pointer points here
 00 00  01 00 08 00 00 00 fe ff  ff ff 60 00 c0 2b e6 7f  00 00 08 00 00 00 ff 7f  00 00 60 09 c0 2b
              ^
duk_heap_refcount.c:523:16: runtime error: member access within misaligned address 0x7fe62bd15472 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bd15472: note: pointer points here
 00 00  01 00 08 00 00 00 fe ff  ff ff 60 00 c0 2b e6 7f  00 00 08 00 00 00 ff 7f  00 00 60 09 c0 2b
              ^
duk_js_compiler.c:759:5: runtime error: member access within misaligned address 0x7fe62bc02f82 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bc02f82: note: pointer points here
 00 00  01 00 09 00 00 00 e6 7f  00 00 d0 2c c0 2b e6 7f  00 00 00 00 00 00 e6 7f  00 00 00 00 00 00
              ^
duk_js_ops.c:571:34: runtime error: member access within misaligned address 0x7fe62bc0d4e2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bc0d4e2: note: pointer points here
 00 00  01 00 08 00 00 00 e6 7f  00 00 a0 83 c0 2b e6 7f  00 00 08 00 00 00 e6 7f  00 00 e0 0e c0 2b
              ^
duk_js_ops.c:575:11: runtime error: member access within misaligned address 0x7fe62bc0ecd2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bc0ecd2: note: pointer points here
 e6 7f  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 40 00 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_heap_markandsweep.c:191:6: runtime error: member access within misaligned address 0x7fe62bc089d2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bc089d2: note: pointer points here
 00 00  01 00 09 00 00 00 ff 7f  00 00 90 86 c0 2b e6 7f  00 00 00 00 00 00 e6 7f  00 00 00 00 00 00
              ^
duk_heap_markandsweep.c:192:3: runtime error: member access within misaligned address 0x7fe62bc089d2 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bc089d2: note: pointer points here
 00 00  01 00 09 00 00 00 ff 7f  00 00 90 86 c0 2b e6 7f  00 00 00 00 00 00 e6 7f  00 00 00 00 00 00
              ^
duk_regexp_compiler.c:1040:2: runtime error: left shift of 31 by 27 places cannot be represented in type 'int'
duk_hobject_props.c:4657:3: runtime error: member access within misaligned address 0x7fe62bc3b792 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bc3b792: note: pointer points here
 00 00  01 00 07 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 07 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:4657:3: runtime error: store to misaligned address 0x7fe62bc3b792 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bc3b792: note: pointer points here
 00 00  01 00 07 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 07 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_hobject_props.c:1625:9: runtime error: member access within misaligned address 0x7fe62bf0cf72 for type 'struct duk_tval', which requires 8 byte alignment
0x7fe62bf0cf72: note: pointer points here
 00 00  01 00 07 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
              ^
duk_js_executor.c:352:11: runtime error: left shift of 327675 by 16 places cannot be represented in type 'int'
duk_hobject_props.c:4210:23: runtime error: store to misaligned address 0x7fe62c027269 for type 'duk_uint32_t', which requires 4 byte alignment
0x7fe62c027269: note: pointer points here
 61 00 00  00 01 00 00 00 be 00 00  00 cf 00 00 00 d2 00 00  00 d0 00 00 00 bf 00 00  00 cb 00 00 00
              ^
@svaarala
Copy link
Owner

svaarala commented May 25, 2016

The alignment warnings may be just normal operation: contrary to what the message says, x86/x64 doesn't (generally) require any alignment and that is intentionally taken advantage of to pack internal structures more tightly. You can force align-to-4 or align-to-8 by editing duk_config.h if that's an actual concern (#define DUK_USE_ALIGN_BY in the architecture autodetection section).

The shift issues are something else, and are most likely caused by a 5-bit signed constant being shifted left 27 places (e.g. 23 << 27) with the result then used as an unsigned value. The intermediate value, a 32-bit signed integer cannot technically hold the value without overflow, but there's no actual problem with the shift in practice because the result goes into an unsigned result field. This should be fixable by changing the constants to be unsigned, or just casting them to unsigned before the shift.

@svaarala svaarala added this to the v2.0.0 milestone May 25, 2016
@fatcerberus
Copy link
Contributor

Aren't there performance pitfalls for unaligned access though? I know x86/x64 will allow it but I was under the impression it degraded performance because the CPU has to do extra masking/fetches.

@fatcerberus
Copy link
Contributor

@saghul Aww, you stole my favorite number! :)

@svaarala
Copy link
Owner

Unaligned reads/writes may have almost zero impact to some impact, depending on the specific x86 processor in question (my understanding is that with newer processors the effect is negligible). Unaligned reads are used e.g. for the object property table to minimize memory usage, but you can force a specific alignment by setting DUK_USE_ALIGN_BY=4 or DUK_USE_ALIGN_BY=8. Maybe the default could be reversed so that one would enable unaligned reads/writes only for low memory targets.

@saghul
Copy link
Contributor Author

saghul commented May 25, 2016

Out of curiosity I set it to 8 and the failures got reduced to:

duk_hthread_builtins.c:313:3: runtime error: left shift of 31 by 27 places cannot be represented in type 'int'
duk_regexp_compiler.c:1040:2: runtime error: left shift of 31 by 27 places cannot be represented in type 'int'
duk_js_executor.c:352:11: runtime error: left shift of 327675 by 16 places cannot be represented in type 'int'

@svaarala
Copy link
Owner

@saghul Thanks - while I can't directly replicate those shift warnings (only GCC 6 seems to have a warning option for that) I should be able to fix the warnings. They should be totally benign though.

@saghul
Copy link
Contributor Author

saghul commented May 25, 2016

Clang also has it AFAIK, but not on the version that ships with OSX. There you can compile with -fsanitize=undefined-trap -fsanitize-undefined-trap-on-error, but the output is not that useful.

@svaarala
Copy link
Owner

Tested the microbenchmarks quickly with align-by-1 vs. align-by-8 (natural alignment up to IEEE doubles) and on this laptop (Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz) some tests are marginally faster while others are marginally slower - essentially there's no net difference. But the situation is most likely different for e.g. Atom.

As said above, maybe it'd be a useful change to default to natural alignment on all platforms unless the user explicitly modifies duk_config.h to do something different (e.g. tweaking for low memory platforms).

@svaarala
Copy link
Owner

Clang with -Wshift-sign-overflow catches the above issues, see #813.

@saghul
Copy link
Contributor Author

saghul commented May 25, 2016

Oh, I didn't know that one!

@svaarala svaarala removed this from the v2.0.0 milestone Dec 9, 2016
@tavianator
Copy link

contrary to what the message says, x86/x64 doesn't (generally) require any alignment

This is true (provided the AC flag is off, and no one runs with AC set, so...), but the C language itself imposes alignment restrictions that may be stronger than the underlying platform. In fact, just creating an unaligned pointer is undefined behaviour already. Compilers may optimize assuming that unaligned accesses don't happen, according to C's rules, not just x86's.

However, optimizers are also aware of the alignment requirements of the target, and may generate unaligned accesses at the machine code level. So it's possible to write code that obeys C's alignment rules while still generating efficient unaligned moves, for example by using memcpy() or byte-by-byte copies:

#include <stdlib.h>
#include <string.h>

char *unaligned_bad() {
  char *ptr = malloc(sizeof(int) + 1);
  int *iptr = (int *)(ptr + 1); // Already undefined behaviour
  *iptr = 42;                   // Also undefined
  return ptr;

  // Compiles with gcc -O3 to
  // subq $8, %rsp
  // movl $5, %edi
  // call malloc
  // movl $42, 1(%rax)
  // addq $8, %rsp
  // ret
}

char *unaligned_good() {
  char *ptr = malloc(sizeof(int) + 1);
  int value = 42;
  memcpy(ptr + 1, &value, sizeof(value));
  return ptr;

  // Also compiles with gcc -O3 to
  // subq $8, %rsp
  // movl $5, %edi
  // call malloc
  // movl $42, 1(%rax)
  // addq $8, %rsp
  // ret
}

@svaarala
Copy link
Owner

@tavianator Interesting, I wasn't aware of that. Could you provide a link to the alignment rules?

@svaarala
Copy link
Owner

I guess that also means that any compiler supporting packed structs will necessarily have well defined behavior for unaligned fields because otherwise struct packing wouldn't work. However, compilers not supporting struct packing would be free to do otherwise.

@svaarala
Copy link
Owner

Just to be clear, there are several duk_hobject layout options exactly to allow Duktape to work in environments where strict alignment is required (by the platform or the compiler). So if one is using a hypothetical x86 compiler that doesn't support unaligned pointers, one can just force alignment-to-8 for example. So not all layouts need to work with all compilers/targets - as long as at least one of them works.

But even so, it's interesting if there are no guarantees even on x86 and it's definitely worth mentioning somewhere. The AC flag bit is already mentioned in the code comments:

/* XXX: This is technically not guaranteed because it's possible to configure
 * an x86 to require aligned accesses with Alignment Check (AC) flag.
 */
#if !defined(DUK_USE_ALIGN_BY)
#define DUK_USE_ALIGN_BY 1
#endif

... but like @tavianator said, almost no-one forces alignment on x86 so this is not a practical issue. If it were, it can be overridden by giving -DDUK_USE_ALIGN_BY=8 to configure.py.

Quite apart from the portability issue, I'd like to reduce the layout options to 2 (or even just 1) simply to make the code nicer to maintain.

@tavianator
Copy link

@tavianator Interesting, I wasn't aware of that. Could you provide a link to the alignment rules?

The latest C11 draft says (§6.3.2.3 p7):

A pointer to an object type may be converted to a pointer to a different object type. If the
resulting pointer is not correctly aligned for the referenced type, the behavior is
undefined.

I guess that also means that any compiler supporting packed structs will necessarily have well defined behavior for unaligned fields because otherwise struct packing wouldn't work. However, compilers not supporting struct packing would be free to do otherwise.

Indeed, gcc for example carves out an exception for accesses to packed struct fields, but things like this are still undefined:

struct __attribute__((packed)) foo {
  char c;
  int i;
};
struct foo f;
int *p = &f.i; // Undefined, f.i is not properly aligned

(See some of the discussion at this bug report for example.)

@svaarala
Copy link
Owner

@tavianator What I find puzzling is what is "correctly aligned" - is this a reference to that target architecture alignment rules or something defined at the C standard level (e.g. all integer pointers must be sizeof(int) aligned)?

For example, for x86 the target architecture alignment is align-by-1 (at least with AC off) so wouldn't this mean there are never unaligned pointers on that specific target?

This matter for the warning because the intent for Duktape is to enable what amount to struct packing for the object property table when the target is, indeed, align-by-1.

@lieff
Copy link

lieff commented Oct 16, 2017

Sample above with *iptr = 42; not about alignment rule, but about aliasing. Can be resolved by

__attribute__((__may_alias__))

on most compilers. About alignment it's also true, it can crash on ARM for example. Can be resolved by

__attribute__((aligned (n)));

MSVC have different non-standard keywords.

@svaarala
Copy link
Owner

@lieff Just to be clear, Duktape respects default alignment rules unless DUK_USE_ALIGN_BY (specified for each architecture/compiler) allows lesser alignment. For example, x86 generally allows align-by-1, some 32-bit targets allow align-by-4 (but allow 8-byte values to be only aligned by 4) etc.

So this is just about a warning that gets issued with certain sanitize options.

There was a similar issue with MSVC warnings: when compiling as 32-bit code it would warn about some code constructs being unsafe with 64-bit targets. But the warning was actually incorrect because those constructs would never be enabled on 64-bit targets.

@lieff
Copy link

lieff commented Oct 16, 2017

Warnings is 100% correct, because compiler optimizations can make assumptions from it. Clang already started process of magic broken code on UB. gcc goes on that too.
Even while x86 do not have breaking instruction on unaligned access (it have to be honest, align check is eflags and os dependent), tricky optimization still can break UB code.

@svaarala
Copy link
Owner

svaarala commented Oct 16, 2017

If the warning is correct and leads to actual, broken code on x86, then the best step would be to enable DUK_USE_ALIGN_BY=8 for x86 and x64.

Or if possible:

  • If there is a way to know which compiler versions are affected, restrict the change to those versions only.
  • If this only affects some specific -std=xxx situation, restrict the change to that version (and above) only.

All the x86 compilers I've worked with have supported packed structures because they're so ubiquitous in embedded low memory programming. But I'm happy to change the default if that default is not safe. The goal is to use nice target specific tricks when they are available and avoid them when not, and duk_config.h tries to detect when / when not.

@fatcerberus
Copy link
Contributor

It's my understranding that alignment rules apply to the declared type of the pointer -- so an int* should be int-aligned, double should be double-aligned, etc. This is why it's legal to alias any pointer with a char*, because the alignment requirement then is 1 byte by definition.

@svaarala
Copy link
Owner

svaarala commented Oct 16, 2017

@fatcerberus Right, but there are targets where e.g. IEEE double alignment is align-by-4, and a lot of code happily works with that (and expects that). So that's why I was asking above whether the standard requirement is relative to the target architecture or some abstract, non-target-specific notion of required alignment.

The most portable way to align is obviously multiple of sizeof(type) but that's not always desirable, e.g. when one can save memory by packing data more tightly. It doesn't matter on desktops, but targets running with a few hundred kBs of total memory often benefit from packing - so it's available and duk_config.h tries to autodetect when it's safe to use.

@lieff
Copy link

lieff commented Oct 16, 2017

Standard requirement is non-target specific. It's just many compilers violates standard. But latest clang and gcc integrates some aggressive optimizations that rely on standard. Standard way to fix it is tricky, char * and memcpy for example. Compiler dependent __attribute and __declspec in macros produces more elegant code.

@svaarala
Copy link
Owner

Assuming it's too unsafe to try to default to anything than natural alignment in duk_config.h autodetection, it would also be viable to always default to natural alignment on all targets, and only use less alignment when explicitly requested in configure.py options.

Because that's really the goal of duk_config.h defaults: they should be the safe defaults best suited to the specific target/compiler in question, to allow tricks like packing when it's safe to do, and disable them when it's not (and erring on the side of caution). If there's no reasonable logic to detect those requirements (using architecture and compiler version or standards level gets a bit hairy, for example) then defaulting to the natural alignment would be most straightforward.

Low memory applications would then need to force alignment explicitly, but that's not generally a problem because they do a lot of tweaking with their compilers already for optimizing footprint, memory usage, etc.

@svaarala
Copy link
Owner

@lieff Thanks, I've learned something new (and disturbing) today ;)

This means I would never write an operating system in C because one couldn't do half the things one would want to ;)

@fatcerberus
Copy link
Contributor

Yeah, I was referring to what the C specification says (to my knowledge - a copy of the standard isn't exactly free :)

@svaarala
Copy link
Owner

@tavianator Thanks, excellent points! Couple of comments:

So this is just about a warning that gets issued with certain sanitize options.
There's not really such a thing as "just a warning" when it comes to -fsanitize=undefined; everything it reports is undefined behaviour that may be taken advantage of by optimizers etc.

I agree, my comment was based on the impression that the warning meant that the issue might be a problem in some other compilation context -- not the exact one being compiled with. For example, MSVC will issue 64-bit warnings (if enabled) when compiling against a 32-bit target, and those warnings would be valid if one compiled the exact same code as 64-bit. However, in Duktape's case, the code paths triggering 64-bit warnings in 32-bit mode never come into play in a 64-bit compilation because they'll be #ifdef'd out. So these particular warnings are not relevant at all, and I was (first) under the impression this was a similar case.

So similarly, if a compiler warns about undefined behavior but the compiler itself does provide deterministic behavior, the warning may be irrelevant if that code is #ifdef'd out on other compilers/platforms. For example, making an align-by-1 assumption for a compiler that does work with that alignment is fine, and a warning about "this might be a problem somewhere else" would be irrelevant. But apparently in this case the compiler would itself concretely generate broken code, at least in certain situations, and the warning is valid.

In general, the alignment requirements are fully at the discretion of the compiler.

That's true but there's also a semi-agreed notion of architecture alignment. To be able to link binaries from different compilers and compiler versions, dynamically link libraries, etc, there must be some agreement on what alignment rules are used for struct layout. So in practice x86/x64 targets have been quite deterministic in how they pack structures, and that they allow align-by-1 at least so far as generating correct code and not trapping for unaligned accesses.

Note that I'm not at all arguing that it wouldn't be a good idea to follow natural alignment for portability. But people use struct packing and lower alignment assumptions for specific reasons, usually to save memory. In practice Duktape needs to be able to work with that for low memory targets, as in a lot of cases such assumptions are valid for the specific target (like x86 for a long while) and do free up memory for other parts of the application.

I think the approach in master is the best for this case: assume nothing unportable w.r.t. alignment and document how you can use memory more efficiently for targets where it matters.

Not any char-sized pointer -- only char *, unsigned char *, and signed char * (and const/volatile-qualified versions) get this treatment. Something like struct foo { char bar; } doesn't get the aliasing exception for struct foo * even if sizeof(struct foo) == 1.

Good to know. I mainly meant that typedefs ultimately mapping to char types will get the same aliasing behavior as char pointers.

@tavianator
Copy link

This should be harmless: it is caused by code which checks whether a double is in integer range by casting it to an integer and back, and then comparing the doubles so the truncation (whose result doesn't matter if the result is out of range) is intended.

Casting an out-of-range floating-point value to and integer is explicitly undefined behaviour (as opposed to casting integers to narrower integers, which is merely implementation defined). See for example http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer.

@svaarala
Copy link
Owner

svaarala commented Oct 20, 2017

@tavianator Right, but as long as the result is any integer (and not FORMAT C:\) the warning is entirely harmless because the code doesn't rely on any specific value for out-of-range conversions. The link you provided just points out that the results vary -- which is fine here. Do you think there's a practical issue, i.e. would some realistic platform cause a trap for example?

The reason I ask is that avoiding the cast means peeking into the IEEE representation directly to make the determination which is slower, adding explicit range checks (slower), or relying on more platform bindings which cause portability headaches for people working on bare metal and other exotic platforms.

@svaarala
Copy link
Owner

@tavianator Never mind, re-reading the article:

Dillon Pariente was first to draw our attention to overflow in floating-point-to-integer conversions, which caused CPU exceptions on the target CPU for the code he was analyzing

So yeah, apparently it can be a real world issue.

So the question is:

  • Is there a way to do the fast, ideal check when it's safe (how to detect it?)
  • Or should the code just do a portable check which is awkward and slower

Humdidum.

@tavianator
Copy link

There may exist platforms where the natural instruction to convert a float to an int traps on range errors, I'm not sure. I worry more about the fact that compilers like to use the assumption that undefined behaviour doesn't occur to infer data flow facts. And even if today's compilers don't do a particular thing, tomorrow's compilers may begin to exploit new kinds of undefined behaviour. Here's something I can realistically see an optimizer doing:

double x = ...;
int n = (int)x;
if (n != x) {
  // This can only happen if d was out of range, so the compiler might
  // delete this entire block as unreachable!
  report_error("double was out of range!");
}

Whether or not compilers currently perform such an optimization is less important to me than whether they are allowed to. We might have had a similar discussion 15 years ago about integer overflow, but now compilers will definitely make similar optimizations (e.g. removing if (x + 1 < x) blocks) as they assume signed integer overflow doesn't happen.

@fatcerberus
Copy link
Contributor

I guess the only potential issue would be if the result happened to be one which passed the check even though it's actually out of range.

@svaarala
Copy link
Owner

svaarala commented Oct 20, 2017

Just to be sure, I'm not a fan of relying on undefined behavior or portability assumptions. But over time I've come to the conclusion that you really can't write really portable code and must make some practical assumptions. For example, assuming two's complement and ASCII character set are two assumptions Duktape makes because: (1) almost no-one encounters issues with them, (2) it's very costly to avoid them. For example, out of interest I tried to figure out what avoiding the ASCII assumption (and supporting e.g. EBCDIC) would cost; it's just too much effort for the trouble.

So I try to treat portability assumptions quite practically:

  • Is it trivial to fix with no downsides? If yes, fix.
  • Is anyone likely to run into it in practice (and the fix is not trivial or causes downsides)? If not, probably not worth fixing.
  • Is someone running into it? Needs to be fixed, and if there are too many downsides to removing the assumption, may need a config to option to allow most users to avoid the downsides.

Not sure where this issue lands.

@svaarala
Copy link
Owner

@fatcerberus

I guess the only potential issue would be if the result happened to be one which passed the check even though it's actually out of range.

That shouldn't be an issue. There are two cases:

  • The input is out of integer range. It gets cast to an integer. Whatever the result is, when that integer is casted back to a double, they won't compare equal (because the input was out of range).
  • The input is in range. This case makes no undefined behavior assumptions.

@svaarala
Copy link
Owner

I worry more about the fact that compilers like to use the assumption that undefined behaviour doesn't occur to infer data flow facts.

That's true and I also see this is a potential risk. Especially because the results may be highly unpredictable and difficult to diagnose so there's a large cost when anyone encounters any actual issues.

@svaarala
Copy link
Owner

svaarala commented Oct 20, 2017

@tavianator I'll try to see if there's a way to rework that check into a portable one with minimal downsides; it would then be an easy decision to make. In practical terms, I'd be happy if the code is awkward but compiles down to good assembly using gcc/clang/msvc like e.g. most memcpy() patterns do.

My previous attempt involved inspecting the IEEE double representation, i.e. looking at the exponent and masking the mantissa based on the exponent. This was slower and larger unfortunately, but maybe I can figure out another way. (There's a similar check for fastint code so maybe I'll be able to adapt that.)

Another easy fix would be an explicit range check - and that would be fine if gcc/clang/msvc had enough information to optimize it away (given that they know how they do the conversion). But I suspect they won't be able to infer this and optimize away the check.

@svaarala
Copy link
Owner

svaarala commented Oct 20, 2017

@tavianator Regarding this example:

double x = ...;
int n = (int)x;
if (n != x) {
  // This can only happen if d was out of range, so the compiler might
  // delete this entire block as unreachable!
  report_error("double was out of range!");
}

This particular case couldn't happen here, because the comparison is also used and relevant for in-range inputs (and comes out both true and false for them too). But I get your point, and it would be of course easy to write a compiler that specifically sabotaged the out-of-range case, and similar behavior may turn out unexpectedly when doing enough optimizations, combining optimizations, etc.

Even in this case, while the branch cannot be eliminated entirely, the taken branch of the if-clause could be compiled with knowledge of the range of the values, which incorporated the assumption that the undefined behavior case is not possible. And that might then cause something interesting depending on what's done in the branch.

@lieff
Copy link

lieff commented Oct 20, 2017

May be something like that will help?

	t = (duk_int32_t)(duk_int64_t) x;

Also I'm not sure if isnan() case should be handled.

@svaarala
Copy link
Owner

svaarala commented Oct 20, 2017

@lieff It wouldn't because the input may be any IEEE double (overflowing the 64-bit value) and Duktape doesn't generally assume availability of 64-bit types because they're not always available on real targets (and in some cases, are available but broken).

Re: NaNs, the simple cast approach (which is undefined behavior) deals with them automatically in the final comparison: regardless of what the integer-converted value is, when it is converted back to a double and compared against the input NaN, the result is always false. Same for infinities.

@lieff
Copy link

lieff commented Oct 20, 2017

Hmmm

	t = (duk_int32_t) fmin(fmax(x, INT_MIN), INT_MAX);

then? Still do not looks too complex.

@svaarala
Copy link
Owner

@lieff That would work as long as fmin() and fmax() had the property that NaN+"something" returns the non-NaN "something" (which should be true for modern compilers but not necessarily legacy ones). To be safe an explicit NaN check would make the behavior more portable to older platforms. This approach would be equivalent to an explicit range check I suggested above.

But these both have the downside of additional footprint and being slower. How much that matters depends on whether the code is a hot path. Anything happening in the bytecode executor, for example, gets hugely amplified so each cycle added/removed from opcode handling matters.

I'll see if I can adapt a similar check from fastint code to this purpose. It is faster than a double-integer-double conversion on most platforms I checked - but has the downside of a much larger footprint. It also relies on 64-bit integers now which are not always available, so I'll need to port the check to 32-bit integers which should be straightforward.

@lieff
Copy link

lieff commented Oct 20, 2017

Casting NaN to integer is already UB: pandas-dev/pandas#12303
So yes, I think isnan() check is needed too in any case.

@svaarala
Copy link
Owner

svaarala commented Oct 20, 2017

@lieff Right, but if fmin() and fmax() obey the rule that fmin(NaN, something) and fmax(NaN, something) returns the non-NaN argument, your code works because the result of the fmin() + fmax() will be one of the non-NaN limits. So a NaN-to-integer coercion wouldn't happen unless fmin() and fmax() didn't obey this policy.

@tavianator
Copy link

@lieff That would work as long as fmin() and fmax() had the property that NaN+"something" returns the non-NaN "something" (which should be true for modern compilers but not necessarily legacy ones).

Returning the non-NaN argument (if any) is the standard-specified behaviour of fmin()/fmax(), but maybe there are legacy platforms that violate (or predate) the standard? Anyway, they expand to terribly slow library calls, so if you go that route at least do something like this:

static double duk_fmin(double x, double y) {
  return x < y ? x : y;
}

which propagates the second argument on NaN, and then call it like duk_fmin(x, INT_MAX). This compiles to a single instruction on x86 at least.

Anyway, I'm currently trying to find out if there's a way to get gcc to generate just a single cvttsd2si without relying on undefined behaviour. I'll let you know if I find anything.

@svaarala
Copy link
Owner

@tavianator I think a plain range check using double comparisons would be better than fmin()+fmax(). It would automatically evaluate to false for NaNs. But even that check is unnecessary in practical terms, at least for the vast majority of platforms.

@lieff
Copy link

lieff commented Oct 20, 2017

What case better is compiler dependent. fmin/fmax can be preferred to remove branches. Using gcc7 -O2 -ffast-math it's compiled into:

#include <math.h>
#include <limits.h>
int test(int x) {
    return fmin(fmax(x, INT_MIN), INT_MAX);
}
test(int):
  pxor xmm0, xmm0
  cvtsi2sd xmm0, edi
  maxsd xmm0, QWORD PTR .LC0[rip]
  minsd xmm0, QWORD PTR .LC1[rip]
  cvttsd2si eax, xmm0
  ret

@svaarala
Copy link
Owner

@lieff I was just about to say the same, sometimes branches are more expensive than extra computation.

@svaarala
Copy link
Owner

svaarala commented Nov 7, 2017

@lieff Testing on gcc 7, with -O2 but with standard math (not -ffast-math) fmin() + fmax() make actual function calls. -ffast-math doesn't work with Duktape because it violates IEEE behavior (which are assumed). So the current solution in master is two comparisons. If the limits were symmetric (MIN_XXX = -MAX_XXX) fabs() + comparison would be better, but unfortunately the limits are not symmetric so that doesn't work directly.

@lieff
Copy link

lieff commented Nov 7, 2017

@svaarala Right, ffast-math is dangerous. One way to bypass it - use architecture-dependent intrinsic functions directly. It will be something like that:

#if __SSE2__
#include <xmmintrin.h>
double minsd(double a, double b)
{
    _mm_store_sd(&a, _mm_min_sd(_mm_set_sd(a), _mm_set_sd(b)));
    return a;
}

double maxsd(double a, double b)
{
    _mm_store_sd(&a, _mm_max_sd(_mm_set_sd(a), _mm_set_sd(b)));
    return a;
}
#elif __ARM_NEON__
// same for neon
#else
// general c code
#endif

@tavianator
Copy link

@lieff Or, like I wrote above, just do

static double duk_fmin(double x, double y) {
  return x < y ? x : y;
}

and GCC will optimize it to minsd.

@svaarala
Copy link
Owner

svaarala commented Nov 7, 2017

@tavianator That didn't happen in my case unfortunately. The result was explicit comparisons, at least with -O2: https://github.com/svaarala/duktape/blob/master/src-input/duk_util_double.c#L127-L134.

@svaarala
Copy link
Owner

svaarala commented Nov 7, 2017

@saghul Long time, but could you check what your test yields from master? My test run is now clean so I'll close this issue in a while unless something new is reported (and naturally a separate issue should be opened if something pops up later).

@lieff
Copy link

lieff commented Nov 7, 2017

In my case there no minsd too.

#include <math.h>
#include <limits.h>
static double duk_fmin(double x, double y) {
    return x < y ? x : y;
}
static double duk_fmax(double x, double y) {
    return (x > y ? x : y);
}
int test(int x) {
    return duk_fmin(duk_fmax(x, INT_MIN), INT_MAX);
}

Compiled gcc 7.2 into:

test(int):
  pxor xmm0, xmm0
  cvtsi2sd xmm0, edi
  ucomisd xmm0, QWORD PTR .LC0[rip]
  jbe .L7
  movsd xmm1, QWORD PTR .LC1[rip]
  mov eax, 2147483647
  ucomisd xmm1, xmm0
  cmova eax, edi
  ret
.L7:
  mov eax, -2147483648
  ret
.LC0:
  .long 0
  .long -1042284544
.LC1:
  .long 4290772992
  .long 1105199103

It removes one branch with cmova, but second branch remains.

#include <math.h>
#include <limits.h>
#include <xmmintrin.h>
double minsd(double a, double b)
{
    _mm_store_sd(&a, _mm_min_sd(_mm_set_sd(a), _mm_set_sd(b)));
    return a;
}

double maxsd(double a, double b)
{
    _mm_store_sd(&a, _mm_max_sd(_mm_set_sd(a), _mm_set_sd(b)));
    return a;
}
int test(int x) {
    return minsd(maxsd(x, INT_MIN), INT_MAX);
}

Compiled into:

test(int):
  pxor xmm0, xmm0
  cvtsi2sd xmm0, edi
  movq xmm0, xmm0
  maxsd xmm0, QWORD PTR .LC0[rip]
  movq xmm0, xmm0
  minsd xmm0, QWORD PTR .LC1[rip]
  cvttsd2si eax, xmm0
  ret
.LC0:
  .long 0
  .long -1042284544
  .long 0
  .long 0
.LC1:
  .long 4290772992
  .long 1105199103
  .long 0
  .long 0

And no branches.

@tavianator
Copy link

Oh wow, I'd call that a gcc bug. Looks like the explicit constants inhibit the optimization. This does what I expected at least:

$ cat foo.c
#include <math.h>
#include <limits.h>
static inline double duk_fmin(double x, double y) {
    return x < y ? x : y;
}
static inline double duk_fmax(double x, double y) {
    return x > y ? x : y;
}
int test(int x, int y, int z) {
    return duk_fmin(duk_fmax(x, y), z);
}
$ gcc -O2 -S foo.c
$ cat foo.s
        .file   "foo.c"
        .text
        .p2align 4,,15
        .globl  test
        .type   test, @function
test:
.LFB5:
        .cfi_startproc
        pxor    %xmm2, %xmm2
        pxor    %xmm0, %xmm0
        pxor    %xmm1, %xmm1
        cvtsi2sd        %esi, %xmm2
        cvtsi2sd        %edi, %xmm0
        cvtsi2sd        %edx, %xmm1
        maxsd   %xmm2, %xmm0
        minsd   %xmm1, %xmm0
        cvttsd2si       %xmm0, %eax
        ret
        .cfi_endproc
.LFE5:
        .size   test, .-test
        .ident  "GCC: (GNU) 7.2.0"
        .section        .note.GNU-stack,"",@progbits

@lieff
Copy link

lieff commented Nov 8, 2017

True, have optimizations in that case too.

kyllikki pushed a commit to netsurf-browser/netsurf that referenced this issue Dec 11, 2017
This reproduces upstream commit de7ae8a2ecc597e1c2024c15dbeae4d28c9f2a2c.

*   svaarala/duktape@de7ae8a

It was applied to master after the release of Duktape 2.2.0.  See also:

*   svaarala/duktape#1783
*   svaarala/duktape#812 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants