Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FastCopy may silently fail to do anything #12344

Closed
iximeow opened this issue Jan 10, 2019 · 4 comments

Comments

@iximeow
Copy link

@iximeow iximeow commented Jan 10, 2019

Steps to Reproduce

  1. Build and run
public class foo {
  public static void Main() {
    int[] startArray = new int[0x1fffffff + 3];
    int[] lastArray = new int[startArray.Length * 2];
    startArray[0] = 0xeedface;
    System.Array.Copy(startArray, 0, lastArray, 0, startArray.Length);
    System.Console.WriteLine("post-copy: " + lastArray[0]);
  }
}

Current Behavior

After "copy" this prints 0, rather than 250477262. In fact, the entire array is not copied and this program completes suspiciously quickly.

If startArray is adjusted to new int[0x1fffffff + 2]; the mono runtime will crash with a stack trace like

Stacktrace:

  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Array.FastCopy (System.Array,int,System.Array,int,int) <0xffffffff>
  at System.Array.Copy (System.Array,int,System.Array,int,int) <0x0008a>
  at foo.Main () <0x0006a>
  at (wrapper runtime-invoke) object.runtime_invoke_void (object,intptr,intptr,intptr) <0xffffffff>

Expected Behavior

startArray copies into lastArray (and no choice of array size leads to a crash)

On which platforms did you notice this

[ ] macOS
[x] Linux
[ ] Windows

Version Used:

4.2.1, 5.1.*

Stacktrace

This is just the report from mono

Stacktrace:

  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Array.FastCopy (System.Array,int,System.Array,int,int) <0xffffffff>
  at System.Array.Copy (System.Array,int,System.Array,int,int) <0x0008a>
  at foo.Main () <0x0006a>
  at (wrapper runtime-invoke) object.runtime_invoke_void (object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:

	mono() [0x49ff2f]
	mono() [0x4f390e]
	mono() [0x426c19]
	/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390) [0x7f915d39c390]
	mono() [0x62be0c]
	mono() [0x539962]
	[0x416731aa]

Debug info from gdb:

[New LWP 22014]
[New LWP 22015]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f915d39bf7b in __waitpid (pid=22016, stat_loc=0x7f915ded310c, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
29	../sysdeps/unix/sysv/linux/waitpid.c: No such file or directory.
  Id   Target Id         Frame 
* 1    Thread 0x7f915dead780 (LWP 22013) "mono" 0x00007f915d39bf7b in __waitpid (pid=22016, stat_loc=0x7f915ded310c, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
  2    Thread 0x7f915c7ff700 (LWP 22014) "mono" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  3    Thread 0x7f915a715700 (LWP 22015) "Finalizer" 0x00007f915d39a827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x956820) at ../sysdeps/unix/sysv/linux/futex-internal.h:205

Thread 3 (Thread 0x7f915a715700 (LWP 22015)):
#0  0x00007f915d39a827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x956820) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x956820, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007f915d39a8d4 in __new_sem_wait_slow (sem=0x956820, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007f915d39a97a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
#4  0x0000000000622e06 in mono_sem_wait ()
#5  0x00000000005a5742 in ?? ()
#6  0x0000000000587b2b in ?? ()
#7  0x0000000000629a8c in ?? ()
#8  0x00007f915d3926ba in start_thread (arg=0x7f915a715700) at pthread_create.c:333
#9  0x00007f915d0c841d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7f915c7ff700 (LWP 22014)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005fdcdb in ?? ()
#2  0x00007f915d3926ba in start_thread (arg=0x7f915c7ff700) at pthread_create.c:333
#3  0x00007f915d0c841d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7f915dead780 (LWP 22013)):
#0  0x00007f915d39bf7b in __waitpid (pid=22016, stat_loc=0x7f915ded310c, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
#1  0x00000000004a0006 in ?? ()
#2  0x00000000004f390e in ?? ()
#3  0x0000000000426c19 in ?? ()
#4  <signal handler called>
#5  0x000000000062be0c in ?? ()
#6  0x0000000000539962 in ?? ()
#7  0x00000000416731aa in ?? ()
#8  0x00007f90d3fff018 in ?? ()
#9  0x0000000000000000 in ?? ()

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================

Additional notes

I tried root causing this and am fairly confident this is a pair of overflow-related issues:

First, both element_size and length in element_size * length in System.Array.FastCopy are int, where the parameter on the mono_gc_memmove_atomic side is size_t. So in cases like this, the product overflows and passes 0xffffffff80000004 or something similar.

From there, I think the reason this doesn't segfault due to wildly copying until the end of memory is this size_t size ends up passed to mono_gc_memmove_aligned. From there it ends up in either MEMMOVE_WORDS_UPWARD or MEMMOVE_WORDS_DOWNWARD for the bulk of a copy in either direction, where the size_t size is reinterpreted as int and back to being a negative number. This results in the loop immediately exiting for copies in either direction, and I think explains why mono does not crash but does appear to finish the copy suspiciously quickly and actually copy nothing! 💃

I think the crash that manifests from new int[0x1fffffff + 2] is due to me running on a 64-bit system, meaning one int is copied in a path to handle the last few words for unaligned copies, actually copying from a bogus address, and causing the observed segfault.

@iximeow

This comment has been minimized.

Copy link
Author

@iximeow iximeow commented Jan 10, 2019

I was going to try also providing a patch/PR to fix this but am having issues building mono for other reasons - my suspicion is that changing MEMMOVE_WORDS_DOWNWARD and MEMMOVE_WORDS_UPWARD to size_t for iteration and bounds checks (or using memmove itself? I can't say I understand why not), while also changing element_size * length in FastCopy to size_t, should address this.

This issue manifests in strange ways in the rest of the stdlib and was initially found by a Dictionary appearing to forget all its members: Dictionary.Resize() for dictionaries sufficiently large (~90 million entries) will hit this in the referenced Array.Copy and drop all entries.

@lewurm lewurm added the area-Runtime label Jan 10, 2019
@marek-safar

This comment has been minimized.

Copy link
Member

@marek-safar marek-safar commented Jan 11, 2019

This looks quite nasty. I think @BrzVlad looked into this area recently. Vlad could you confirm the issue?

@BrzVlad

This comment has been minimized.

Copy link
Member

@BrzVlad BrzVlad commented Jan 14, 2019

On master this was already fixed by unrelated refactoring from 0d6d334

monojenkins added a commit to monojenkins/mono that referenced this issue Jan 14, 2019
monojenkins added a commit that referenced this issue Jan 15, 2019
[2018-12] [runtime] Fix integer overflow when copying array

Fixes #12344
monojenkins added a commit that referenced this issue Jan 15, 2019
[2018-10] [runtime] Fix integer overflow when copying array

Fixes #12344

Backport of #12407.

/cc @BrzVlad
@BrzVlad BrzVlad closed this Jan 15, 2019
@iximeow

This comment has been minimized.

Copy link
Author

@iximeow iximeow commented Jan 15, 2019

I was going to comment that that refactoring didn't touch the bug, but I see the followup commit fixing the overflow, and can confirm it's fixed on master - thanks!

Just a heads up, MEMMOVE_WORDS_UPWARD and MEMMOVE_WORDS_DOWNWARD still reinterpret the number of words to move as an int, so this bug can still creep up there. While my original test case does not trigger this, if someone were to try copying 0x80000000 or more words (18gb of data on a 64bit system), it looks like that would break as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.