Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

occasional SIGSEGV when using threadpool and channels #5626

Closed
vegansk opened this issue Mar 29, 2017 · 14 comments
Closed

occasional SIGSEGV when using threadpool and channels #5626

vegansk opened this issue Mar 29, 2017 · 14 comments
Labels

Comments

@vegansk
Copy link
Contributor

vegansk commented Mar 29, 2017

import threadpool

var ch: Channel[int]
ch.open
var pch = ch.addr

proc run(f: proc(): int {.gcsafe.}): proc() =
  let r = spawn f()
  return proc() = await(r)

var working = false

proc handler(): int =
  while true:
    let (h, v) = pch[].tryRecv()
    if not h:
      discard cas(working.addr, true, false)
      break
  1

proc send(x: int) =
  ch.send(x)
  if cas(working.addr, false, true):
    discard run(handler)

for x in 0..1000000:
  send(x)

crashes with error:

Traceback (most recent call last)
threadpool.nim(330)      slave
tp.nim(8)                :tmpWrapper
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
@Araq
Copy link
Member

Araq commented Apr 22, 2018

I suppose spawn does not yet support function pointers. Workaround:

import threadpool

var ch: Channel[int]
ch.open
var pch = ch.addr

var working = false

proc handler(): int =
  while true:
    let (h, v) = pch[].tryRecv()
    if not h:
      discard cas(working.addr, true, false)
      break
  1


proc run(): proc() =
  let r = spawn handler()
  return proc() = await(r)

proc send(x: int) =
  ch.send(x)
  if cas(working.addr, false, true):
    discard run()

for x in 0..1000000:
  send(x)

@genotrance
Copy link
Contributor

No longer crashes with #head.

-------- SNIPPET --------
import threadpool

var ch: Channel[int]
ch.open
var pch = ch.addr

proc run(f: proc(): int {.gcsafe.}): proc() =
  let r = spawn f()
  return proc() = await(r)

var working = false

proc handler(): int =
  while true:
    let (h, v) = pch[].tryRecv()
    if not h:
      discard cas(working.addr, true, false)
      break
  1

proc send(x: int) =
  ch.send(x)
  if cas(working.addr, false, true):
    discard run(handler)

for x in 0..1000000:
  send(x)
-------------------------

-------- OUTPUT --------
Hint: used config file 'c:\users\gt\Desktop\DL\programming\nimdevel\config\nim.cfg' [Conf]
Hint: system [Processing]
Hint: temp [Processing]
Hint: threadpool [Processing]
Hint: cpuinfo [Processing]
Hint: strutils [Processing]
Hint: parseutils [Processing]
Hint: math [Processing]
Hint: bitops [Processing]
Hint: algorithm [Processing]
Hint: unicode [Processing]
Hint: os [Processing]
Hint: times [Processing]
Hint: winlean [Processing]
Hint: dynlib [Processing]
Hint: ospaths [Processing]
Hint: cpuload [Processing]
CC: temp
CC: stdlib_system
CC: stdlib_sharedlist
CC: stdlib_locks
CC: stdlib_threadpool
CC: stdlib_cpuinfo
CC: stdlib_strutils
CC: stdlib_parseutils
CC: stdlib_math
CC: stdlib_bitops
CC: stdlib_algorithm
CC: stdlib_unicode
CC: stdlib_os
CC: stdlib_times
CC: stdlib_winlean
CC: stdlib_dynlib
CC: stdlib_ospaths
CC: stdlib_cpuload
Hint:  [Link]
Hint: operation successful (27677 lines compiled; 2.859 sec total; 31.848MiB peakmem; Debug Build) [SuccessX]

Ran successfully, returned -1073741819
------------------------

-------- NIMTEMP --------
Hint: used config file 'c:\users\gt\Desktop\DL\programming\nimdevel\config\nim.cfg' [Conf]
Hint: system [Processing]
Hint: temp [Processing]
Hint: threadpool [Processing]
Hint: cpuinfo [Processing]
Hint: strutils [Processing]
Hint: parseutils [Processing]
Hint: math [Processing]
Hint: bitops [Processing]
Hint: algorithm [Processing]
Hint: unicode [Processing]
Hint: os [Processing]
Hint: times [Processing]
Hint: winlean [Processing]
Hint: dynlib [Processing]
Hint: ospaths [Processing]
Hint: cpuload [Processing]
CC: temp
CC: stdlib_system
CC: stdlib_sharedlist
CC: stdlib_locks
CC: stdlib_threadpool
CC: stdlib_cpuinfo
CC: stdlib_strutils
CC: stdlib_parseutils
CC: stdlib_math
CC: stdlib_bitops
CC: stdlib_algorithm
CC: stdlib_unicode
CC: stdlib_os
CC: stdlib_times
CC: stdlib_winlean
CC: stdlib_dynlib
CC: stdlib_ospaths
CC: stdlib_cpuload
Hint:  [Link]
Hint: operation successful (27677 lines compiled; 5.860 sec total; 31.711MiB peakmem; Debug Build) [SuccessX]

Ran successfully, returned -1073741819
-------------------------

-------- VERSION --------
Nim Compiler Version 0.18.1 [Windows: i386]
Compiled at 2018-07-02
Copyright (c) 2006-2018 by Andreas Rumpf

git hash: d1f983b37cf1b0aa03b1424c3cca9d4212558db9
active boot switches: -d:release
-------------------------

@vegansk vegansk closed this as completed Jul 3, 2018
Varriount pushed a commit that referenced this issue Jul 4, 2018
@skilchen
Copy link
Contributor

skilchen commented Jul 4, 2018

On my Ubuntu 16.04 nim c --threads:on -r tp.nim still produces mostly the same runtime crash as before (linenumber in threadpool changed) :

Traceback (most recent call last)
threadpool.nim(339)      slave
tp.nim(8)                :tmpWrapper
SIGSEGV: Illegal storage access. (Attempt to read from nil?)

If compiled with -d:release it doesn't crash.

But what should this program actually do?

  • Silently stop after some random number of tryRecv()calls?
  • Receive every x from the channel that was sent to it?

It effectively stops after a random number of tryRecv() calls, even with the workaround proposed by @Araq.

The test case from #8204 only tests if the program can be compiled. It doesn't run it. With a

discard """
action: run
"""

block at the top it would probably fail on the CI platforms.

@Araq
Copy link
Member

Araq commented Jul 4, 2018

threadpool simply doesn't work well and either needs to be rewritten or deprecated.

@genotrance
Copy link
Contributor

Please reopen this, it is broken on Linux.

@Araq Araq reopened this Jul 4, 2018
@genotrance
Copy link
Contributor

How come testament doesn't run this?

# ------------------------- threading tests -----------------------------------
proc threadTests(r: var TResults, cat: Category, options: string) =
template test(filename: untyped) =
testSpec r, makeTest(filename, options, cat, actionRun)
testSpec r, makeTest(filename, options & " -d:release", cat, actionRun)
testSpec r, makeTest(filename, options & " --tlsEmulation:on", cat, actionRun)
for t in os.walkFiles("tests/threads/t*.nim"):
test(t)

Per the code, all threading test cases are executed with actionRun.

@Araq
Copy link
Member

Araq commented Jul 4, 2018

Good catch, mayb it did run it and yet it was green.

@genotrance
Copy link
Contributor

Nope, it doesn't run - setting actionRun in makeTest() is completely ignored in testSpec().

if test.action != actionRunNoSpec:
expected = parseSpec(tname)
else:
specDefaults expected
expected.action = actionRunNoSpec

The default value of action ends up being actionCompile but can change to actionRun if output, exitCode or other such options are defined in the test. I see some test categories which could possibly not run the tests at all if there's nothing in the test that flips to actionRun. For example: the gc/cyclecollector.nim.

I think this code should be changed to:

  if test.action != actionRunNoSpec:
    expected = parseSpec(tname)
    if test.action == actionRun and expected.action == actionCompile:
      expected.action = actionRun
  else:
    specDefaults expected
    expected.action = actionRunNoSpec

Feedback appreciated.

@Araq
Copy link
Member

Araq commented Jul 5, 2018

Sounds good.

Araq pushed a commit that referenced this issue Jul 5, 2018
@FedericoCeratto
Copy link
Member

FedericoCeratto commented Mar 22, 2019

This happens with Nim 0.19.4 on Linux:

All receiver block after receiving 2 or 202, 3 and 303 and later are never received.

import os
import threadpool
import times

type EventChannel = Channel[int]

proc sender(c: ptr EventChannel, starting_from:int) {.thread.} =
  for i in 0..5:
    doAssert c[].trysend(i + starting_from) == true
    echo "sent " & $(i + starting_from)
    sleep 10

proc sender_broken(c: ptr EventChannel, starting_from: int) {.thread.} =
  var chan = c[]
  for i in 0..5:
    doAssert chan.trysend(i + starting_from) == true
    echo "sent " & $(i + starting_from)
    sleep 10

proc receiver(pc: ptr EventChannel, n:int) {.thread.} =
  var chan = pc[]
  chan.open(0)
  while true:
    echo $n & " blocks"
    let x = recv(chan)
    echo($n & " received from chan: " & $x)

var chan: EventChannel
chan.open(0)
spawn sender(addr chan, 0)
spawn sender_broken(addr chan, 100)
spawn sender(addr chan, 200)

spawn receiver(addr chan, 1)
sleep 10
spawn receiver(addr chan, 2)
sleep 10
spawn receiver(addr chan, 3)

sleep 500
echo "Finished listening"

close(chan)

Occasional SIGSEGV:

sent 0
sent 200
sent 100
1 blocks
1 received from chan: 0
1 blocks
1 received from chan: 100
1 blocks
sent 1
sent 201
SIGSEGV: Illegal storage access. (Attempt to read from nil?)

@Araq
Copy link
Member

Araq commented Mar 25, 2019

It's invalid to copy a channel like this var chan = pc[].

@narimiran
Copy link
Member

This happens with Nim 0.19.4 on Linux:
All receiver block after receiving 2 or 202, 3 and 303 and later are never received.

@FedericoCeratto, I've tried your code example with Nim devel v1.3.5 and i couldn't reproduce the SIGSEGV. Can you confirm?

narimiran added a commit to narimiran/Nim that referenced this issue Jan 20, 2021
@narimiran
Copy link
Member

I couldn't reproduce the SIGSEGV.

The missed keyword was occasional SIGSEGV — I was writing the test case for this, everything was passing at first, and then I noticed it fails sometimes, in ~1/4 of the runs.

@timotheecour timotheecour changed the title Crash when using threadpool and channels occasional SIGSEGV when using threadpool and channels Jan 29, 2021
@Araq
Copy link
Member

Araq commented Aug 27, 2023

Use Malebolgia instead and avoid channels with thread pools.

@Araq Araq closed this as completed Aug 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants