Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document that BitArray isn't threadsafe #33750

Open
ianshmean opened this issue Nov 2, 2019 · 9 comments · May be fixed by #33754
Open

Document that BitArray isn't threadsafe #33750

ianshmean opened this issue Nov 2, 2019 · 9 comments · May be fixed by #33754

Comments

@ianshmean
Copy link
Contributor

@ianshmean ianshmean commented Nov 2, 2019

I'm seeing some cases (in 1.3-rc4) where Threads.@threads is missing some of the iteration assignments. With this example, 1 or a few are missed most of the times I run this.

Wrapping vals[i] = false in a spinlock fixes this, but I was under the impression that a @threads approach like this, where each element in the vals array is accessed once, shouldn't need a spinlock to avoid race conditions.

n = 10
vals = trues(n)
threadsUsed = Int[]
Threads.@threads for i = 1:n
    vals[i] = false
    !in(Threads.threadid(), threadsUsed) && push!(threadsUsed, Threads.threadid())
    sleep(0.1)
end
if any(vals) #Check that all elements have been iterated (set to false)
    @show vals
    @show threadsUsed
    @show Threads.nthreads()
end
vals = Bool[0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
threadsUsed = [1, 2, 3, 5, 4]
Threads.nthreads() = 5
@Keno

This comment has been minimized.

Copy link
Member

@Keno Keno commented Nov 2, 2019

The problem is that vals is a BitArray. Concurrent accesses to BitArray are racy.

@ianshmean

This comment has been minimized.

Copy link
Contributor Author

@ianshmean ianshmean commented Nov 2, 2019

Ah ok. Indeed, this doesn't have the same problem:

n = 10
vals = ones(n)
threadsUsed = Int[]
Threads.@threads for i = 1:n
    vals[i] = 0
    !in(Threads.threadid(), threadsUsed) && push!(threadsUsed, Threads.threadid())
    sleep(0.1)
end
if any(vals.==1) #Check that all elements have been iterated
    @show vals
    @show threadsUsed
    @show Threads.nthreads()
end
@ianshmean ianshmean changed the title `Threads.@threads` randomly missing assignment within loop `Threads.@threads` randomly missing assignment to a BitArray Nov 2, 2019
@yuyichao yuyichao added the doc label Nov 2, 2019
@yuyichao yuyichao changed the title `Threads.@threads` randomly missing assignment to a BitArray Document that BitArray isn't threadsafe Nov 2, 2019
@yuyichao

This comment has been minimized.

Copy link
Contributor

@yuyichao yuyichao commented Nov 2, 2019

BitArray basically can't be threadsafe (and if anything not for this usage pattern). This should be documented.

@ianshmean

This comment has been minimized.

Copy link
Contributor Author

@ianshmean ianshmean commented Nov 2, 2019

I guess it may be common for people to use a BitArray to check that every iteration has been run, so making the issue clear in the docs sounds good

@ianshmean

This comment has been minimized.

Copy link
Contributor Author

@ianshmean ianshmean commented Nov 2, 2019

As @yuyichao pointed out in the other thread above, Array[Bool] is threadsafe, while BitArray isn't. That should probably be noted in the docs.

Are there any other "array" types that aren't threadsafe?

@yuyichao

This comment has been minimized.

Copy link
Contributor

@yuyichao yuyichao commented Nov 2, 2019

Well, you can create as many non-threadsafe arrays as you want = = ....

If it's just overwriting without resizing (so be careful with sparse array), as long as different elements are at distinct memory location it'll be fine. BitArray violates this so it's unsafe.

I don't think Base offers much other options (not including wrapper types like SubArray's) to begin with though ...

@ianshmean

This comment has been minimized.

Copy link
Contributor Author

@ianshmean ianshmean commented Nov 2, 2019

I guess I meant common base types specifically.

How about the Threads.@threads docs read something like:


A macro to parallelize a for-loop to run with multiple threads. This spawns nthreads() number of threads, splits the iteration space amongst them, and iterates in parallel. A barrier is placed at the end of the loop which waits for all the threads to finish execution, and the loop returns.

Note that only thread-safe types should be used. For example, the BitArray type is not threadsafe, but Array{Bool} is.

An example that sanity-checks assignment in each iteration:

n = 10
assigned = zeros(Bool, n)
Threads.@threads for i in 1:n
    assigned[i] = true
end
@assert all(assigned)
@yuyichao

This comment has been minimized.

Copy link
Contributor

@yuyichao yuyichao commented Nov 2, 2019

This should be in the document for the type not the threads macro. A generic mention of thread safe type in the threading section (still not the one for the macro) would be fine too if it's not already there.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

@StefanKarpinski StefanKarpinski commented Nov 4, 2019

Other kinds of arrays are also not threadsafe. For example, writes to float arrays may not be atomic and the value that ends up in the array may not be one that was written by any thread. I guess the additional danger with BitArrays is that even if threads never try to write to the same slots, there is still an issue with adjacent slots.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.