Benchmarking parallel implementation leads to segfaults #136

Closed
davnn opened this issue Jan 26, 2022 · 2 comments

@davnn
Contributor

davnn commented Jan 26, 2022

Hi!

I'm currently trying to benchmark our parallel implementation of knn at different numbers of samples and dimensions.

using BenchmarkTools: @benchmark
import NearestNeighbors as NN

const input_samples = Int.(exp2.(9:14))
const input_dims = Int.(exp2.(7:12))

function knn_parallel(tree, X, k, sort)
    # pre-allocate the result arrays (as in NearestNeighbors.jl)
    indices = eachindex(X)
    dists = [Vector{NN.get_T(eltype(X))}(undef, k) for _ in indices]
    idxs = [Vector{Int}(undef, k) for _ in indices]

    # get number of threads
    nThreads = Threads.nthreads()

    # partition the input array equally
    n_samples = length(X)
    divides_data = mod(n_samples, nThreads) == 0
    partition_size = divides_data ? n_samples ÷ nThreads : n_samples ÷ nThreads + 1
    partitions = Iterators.partition(indices, partition_size)

    # parallel computation over the equal array splits
    Threads.@threads for idx in collect(partitions)
        @inbounds idxs[idx], dists[idx] = NN.knn(tree, X[idx], k, sort)
    end
    idxs, dists
end

for dims in input_dims, samples in input_samples
    @info "Running benchmark with $dims dims and $samples samples"
    X = [NN.SizedVector{dims}(rand(Float32, dims)) for _ in 1:samples]
    cpu_bench = @benchmark knn_parallel(NN.BruteTree($X), $X, 10, true)
    display(cpu_bench)
end

However, the benchmark randomly leads to memory corruption and crashes. I have not yet been able to reproduce this with a sequential call to knn. Do you maybe have an idea how that could happen?
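
One thing that can be ruled out independently of NearestNeighbors.jl is the partitioning itself, since each thread writes to its own slice of `idxs` and `dists`. The following is a minimal sketch, not part of the original benchmark: `partition_ranges` and `covers_exactly` are hypothetical helper names, and the thread counts in the loop are arbitrary examples. It checks that the chunks produced by `Iterators.partition` are contiguous, non-overlapping, and cover every index for the sample sizes used above.

```julia
# Hypothetical helper (not part of NearestNeighbors.jl): split 1:n_samples into
# at most n_parts contiguous chunks, mirroring the partitioning in knn_parallel.
function partition_ranges(n_samples::Integer, n_parts::Integer)
    # Ceiling division; equivalent to the divides_data ternary in knn_parallel.
    partition_size = cld(n_samples, n_parts)
    return collect(Iterators.partition(1:n_samples, partition_size))
end

# True if the chunks, concatenated in order, are exactly 1:n_samples,
# i.e. no index is missing and no index is written by two chunks.
function covers_exactly(parts, n_samples)
    return reduce(vcat, parts) == collect(1:n_samples)
end

# Same sample sizes as the benchmark; the part counts are arbitrary examples.
for n_samples in Int.(exp2.(9:14)), n_parts in (1, 7, 12, 24)
    parts = partition_ranges(n_samples, n_parts)
    @assert length(parts) <= n_parts
    @assert covers_exactly(parts, n_samples)
end
```

If these assertions pass, overlapping writes between threads are unlikely to be the cause, and the problem has to be looked for elsewhere.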

@KristofferC
Owner

> Do you maybe have an idea how that could happen?

Could you try running with --check-bounds=yes to make sure that no out-of-bounds access is being hidden by an @inbounds? Also, see #125.
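
To confirm which bounds-checking mode a session was actually started with, the setting can be read back from inside Julia. This is a rough sketch under stated assumptions: `Base.JLOptions()` is an internal, undocumented struct, the mapping of its `check_bounds` field (0 for the default, 1 for yes, 2 for no) is assumed from current Julia releases, and `check_bounds_mode` is a hypothetical helper name.

```julia
# Start Julia with, for example (benchmark.jl is a placeholder file name):
#   julia --check-bounds=yes --threads=auto benchmark.jl

# Hypothetical helper: report the --check-bounds mode of the running session.
# Relies on the internal Base.JLOptions() struct; the 0/1/2 mapping is an
# assumption based on current Julia releases and may change.
function check_bounds_mode()
    flag = Base.JLOptions().check_bounds
    return flag == 1 ? "yes" : flag == 2 ? "no" : "auto"
end

@info "Session settings" check_bounds = check_bounds_mode() threads = Threads.nthreads()
```

With the mode reported as "yes", `@inbounds` annotations are ignored and an out-of-bounds access raises a `BoundsError` instead of silently corrupting memory.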

@davnn
Contributor Author

davnn commented Feb 1, 2022

Interestingly, there are no problems with --check-bounds=yes, i.e. there are no out-of-bounds errors. Can anyone reproduce this? Edit: My first guess is that the resource utilization is so high that my slightly overclocked processor cannot handle the load, which does not happen when bounds checks are enabled.

OS Name	Microsoft Windows 10 Pro
Version	10.0.19044 Build 19044
Processor	AMD Ryzen 9 3900X 12-Core Processor, 4200 Mhz, 12 Core(s), 24 Logical Processor(s)

davnn closed this as completed Mar 24, 2023