Repeated deepCopy() on a recursive data structure eventually crashes #4340

Closed
mcclure opened this Issue Jun 15, 2016 · 4 comments

Projects

None yet

3 participants

@mcclure
mcclure commented Jun 15, 2016

I am using Nim self-built from git 07d7d35.

I run this program:

type Mirror = ref object
    tag:int
    other:array[0..1, Mirror]

var m1 = Mirror(tag:1)
var m2 = Mirror(tag:2)
var m3 = Mirror(tag:3)

m1.other[0] = m2; m1.other[1] = m3
m2.other[0] = m1; m2.other[1] = m3
m3.other[0] = m1; m3.other[1] = m2

for i in 1..30000:
    echo i
    var mx : Mirror; mx.deepCopy(m1)
    m1 = mx

This creates a data structure of three objects that all reference each other then deepcopies it over and over. On my machine this runs for 20,120 iterations, then crashes. The traceback looks like

Traceback (most recent call last)
test3.nim(15)            test3
deepcopy.nim(127)        genericDeepCopy
deepcopy.nim(109)        genericDeepCopyAux
deepcopy.nim(79)         genericDeepCopyAux
deepcopy.nim(21)         genericDeepCopyAux
deepcopy.nim(18)         genericDeepCopyAux
deepcopy.nim(83)         genericDeepCopyAux
deepcopy.nim(105)        genericDeepCopyAux
gc.nim(481)              newObj
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
Error: execution of an external program failed: 'c:\msys64\home\andi\work\h\nim-practice\test3.exe '

In my tests:

  • Larger and more complicated data structures crashed more quickly. The crash was almost always with an "attempt to read from nil" with a deep call stack of genericDeepCopyAux's. On a minority of tests I saw a stack overflow.
  • If I add GC_fullCollect() at the end of the for loop (inside the for loop), it never crashes.

Because my test in #4339 suggests that deepCopy does not alter the structure after the first iteration, and because of the GC_fullCollect thing, my guess is that this occurs when a garbage collection occurs in the middle of a deepcopy.

Expected behavior: If deepCopy on a recursive data structure is something you're not supposed to do this should be documented, but even if it's something you're not supposed to do it probably shouldn't segfault.

@mcclure
mcclure commented Jun 15, 2016

By the way, neither this nor #4339 are blocking me on anything. I discovered this by accident when I used deepCopy in a place I meant to use shallowCopy anyway.

@Araq Araq added the Showstopper label Jun 16, 2016
@Araq
Member
Araq commented Jun 16, 2016

DeepCopy is essential for Nim's multi threading though and NEEDS to work. :-)

@Araq Araq added a commit that closed this issue Jul 6, 2016
@Araq Araq fixes #4340 caa7f42
@Araq Araq closed this in caa7f42 Jul 6, 2016
@Varriount
Contributor

@Araq So why is it that the GC needs to be disabled during deepcopies?

@Araq
Member
Araq commented Jul 7, 2016

@Varriount Because deepCopy temporarily modifies the GC header. It uses a "forwarding" pointer to avoid a temporary hash table for the cycle detection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment