intermittent "wrong number of arguments" in parallel #3383

Closed
alanedelman opened this issue Jun 13, 2013 · 3 comments
Labels
bug Indicates an unexpected problem or unintended behavior parallelism Parallel or distributed computation priority This should be addressed urgently

@alanedelman
Contributor

I've been loving running a parallel histogram on up to 75 processors lately.
I'd say once out of every four or five runs, I get some havoc that seems to be
related to the handover of messages from remote processors.

I will show

  1. the code working
  2. the code breaking
  3. the code

1) the code working

julia> include("demodriver.jl")
Matrix Size:25
Number of Monte Carlo trials (in millions): 0.164
Trials/Processor: 4000
Number of Processors: 41

2) the code breaking

    From worker 33:  in anonymous at multi.jl:416
    From worker 34: exception on 34: ERROR: in anonymous: wrong number of arguments
    From worker 34:  in anonymous at multi.jl:1182
    From worker 34:  in anonymous at multi.jl:1142
    From worker 34:  in anonymous at multi.jl:416
    From worker 35: exception on 35: ERROR: in anonymous: wrong number of arguments
    From worker 35:  in anonymous at multi.jl:1182
    From worker 35:  in anonymous at multi.jl:1142
    From worker 35:  in anonymous at multi.jl:416
    From worker 37: exception on 37: ERROR: in anonymous: wrong number of arguments
    From worker 37:  in anonymous at multi.jl:1182
    From worker 37:  in anonymous at multi.jl:1142
    From worker 37:  in anonymous at multi.jl:416
    From worker 36: exception on 36: ERROR: in anonymous: wrong number of arguments
    From worker 36:  in anonymous at multi.jl:1182
    From worker 36:  in anonymous at multi.jl:1142
    From worker 36:  in anonymous at multi.jl:416
    From worker 39: exception on 39: ERROR: in anonymous: wrong number of arguments
    From worker 39:  in anonymous at multi.jl:1182
    From worker 39:  in anonymous at multi.jl:1142
    From worker 39:  in anonymous at multi.jl:416
    From worker 40: exception on 40: ERROR: in anonymous: wrong number of arguments
    From worker 40:  in anonymous at multi.jl:1182
    From worker 40:  in anonymous at multi.jl:1142
    From worker 40:  in anonymous at multi.jl:416
    From worker 41: exception on 41: ERROR: in anonymous: wrong number of arguments
    From worker 41:  in anonymous at multi.jl:1182
    From worker 41:  in anonymous at multi.jl:1142
    From worker 41:  in anonymous at multi.jl:416
    From worker 38: exception on 38: ERROR: in anonymous: wrong number of arguments
    From worker 38:  in anonymous at multi.jl:1182
    From worker 38:  in anonymous at multi.jl:1142
    From worker 38:  in anonymous at multi.jl:416
ERROR: no method +(Array{Int64,1},ErrorException)
 in mapreduce at reduce.jl:118
 in preduce at multi.jl:1245
 in include_from_node1 at loading.jl:82
at /home/edelman/julia/borodingorin/demodriver.jl:1329

3) the code

==== demodriver.jl ======

@everywhere include("demo.jl");

n=25
t=4000
x=[0:.005:0.75]

p=nprocs()
println("Matrix Size:",n)
println("Number of Monte Carlo trials (in millions): ", p*t/1e6)
println("Trials/Processor: ",t)
println("Number of Processors: ",p)

z=@parallel (+) for i=1:p 
    demo(n,t,x)
end 

x=x[2:end]-(x[2]-x[1])/2
y=z/(p*t)/(x[2]-x[1])

==== demo.jl================

function demo(n,t,x)
    v=zeros(t)
    for i=1:t
        a=randn(n,n)
        v[i]=(eigvals(a'*a))[3]
    end
    (x,y)=hist(v,x)
    return y
end

[pao: formatting]

@staticfloat
Sponsor Member

Can I assume this is run on julia.mit.edu? How recent is your Julia build?

@ViralBShah
Member

Yes, it is on julia.mit.edu.

@alanedelman Can you run this with the latest julia?

@JeffBezanson
Sponsor Member

I can still get this to happen intermittently.


4 participants