I am trying to parallelize my computation using this system. At its core I need to evaluate a single function with multiple sets of arguments, so it is embarrassingly parallel. I ran the calculation in serial and then reimplemented it with eager-future2, and the parallel version gives a different result; I saw a similar discrepancy with Pcall. I am using SBCL on Debian amd64. Unfortunately, it is a complicated collection of code, so I can't post a simple example, but I was wondering if anyone else has seen this and what debugging techniques might be useful here.
This sounds like a race condition in eager-future2. It's hard to debug without being able to run the code - are you able to share it?
Debugging these things generally involves trying to distill the minimum amount of code that reliably produces the problem from the code that you're starting with, and then thinking very carefully about the concurrent code (locks and semaphores). It takes a long time (at least for me).
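As a starting point for that distillation, here is a rough sketch of a comparison harness. It assumes SBCL with thread support and deliberately uses plain sb-thread calls rather than eager-future2, so that a mismatch that still shows up points at the application rather than the library. The name compare-serial-and-threaded is made up for illustration; FN stands in for your function.

```lisp
;; Sketch only: assumes SBCL built with thread support.
;; COMPARE-SERIAL-AND-THREADED is a hypothetical helper, not part of any library.
(defun compare-serial-and-threaded (fn arg-sets &key (test #'equalp))
  "Apply FN to each argument list in ARG-SETS serially, then again with one
thread per argument list. Return a list of (index args serial threaded)
entries for every position where the two results differ under TEST."
  (let* ((serial (mapcar (lambda (args) (apply fn args)) arg-sets))
         ;; One thread per argument set; each closure captures its own ARGS.
         (threads (mapcar (lambda (args)
                            (sb-thread:make-thread
                             (lambda () (apply fn args))))
                          arg-sets))
         ;; JOIN-THREAD blocks until the thread finishes and returns its value.
         (threaded (mapcar #'sb-thread:join-thread threads)))
    (loop for i from 0
          for args in arg-sets
          for s in serial
          for p in threaded
          unless (funcall test s p)
            collect (list i args s p))))
```

Calling it as (compare-serial-and-threaded #'your-function list-of-arg-lists) returns NIL when every result agrees. If it returns entries, shrinking the list of argument sets while the mismatch persists is one way to work toward a minimal case. Note that it spawns one thread per argument set, so keep the list small.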
The alternative is to sit down and prove the eager-future2 code correct. I haven't done a correctness proof like that in years though, and I wasn't very good at it back then either.
It's quite possible that the problem lies in the application. It is big and complex, and though I try to write in a thread-safe style, mistakes are always possible. So the bottom line is that I don't have a minimal amount of code. I suspect that I need to hack it down until I get a minimal set of code, and in that process I'll probably find thing(s) that are not thread-safe. I don't suppose there are any tools that help detect things like writing/reading global variables, etc., are there? I'm using SBCL, so it could be a tool specific to that implementation.
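One low-tech check for shared globals, in case it helps anyone reading later: in SBCL the global value of a special variable is shared by all threads, while a binding established with LET inside a thread is local to that thread. So rebinding a suspect variable inside each worker and seeing whether the discrepancy disappears is a cheap test. A minimal sketch, where *scratch*, accumulate, and run-isolated are all made-up names for illustration:

```lisp
;; Sketch only: assumes SBCL built with thread support.
;; *SCRATCH*, ACCUMULATE and RUN-ISOLATED are hypothetical names.
(defvar *scratch* 0
  "Stand-in for a global that the computation mutates.")

(defun accumulate (n)
  "Increment *SCRATCH* N times and return its final value."
  (dotimes (i n *scratch*)
    (incf *scratch*)))

(defun run-isolated (n thread-count)
  "Run ACCUMULATE in THREAD-COUNT threads, each with its own binding of
*SCRATCH*, and return the list of per-thread results."
  (let ((threads (loop repeat thread-count
                       collect (sb-thread:make-thread
                                (lambda ()
                                  ;; Thread-local binding: other threads and
                                  ;; the global value are unaffected.
                                  (let ((*scratch* 0))
                                    (accumulate n)))))))
    (mapcar #'sb-thread:join-thread threads)))
```

With the LET in place every thread returns N and the global value is left alone. Without the LET, the threads all increment the same global without synchronization, which is the kind of bug that changes results between serial and parallel runs.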
Thanks for the information; I use CCL occasionally and I did not know about watched-objects. I will look into it.