Skip to content

Releases: orxfun/orx-concurrent-vec

IntoIterator is required for PinnedVec

09 May 20:31
7f97686
Compare
Choose a tag to compare
Merge pull request #9 from orxfun/IntoIterator-is-required-for-PinnedVec

IntoIterator is required for PinnedVec

PinnedConcurrentCol migration

12 Apr 04:02
26abf58
Compare
Choose a tag to compare

ConcurrentVec now is only a wrapper around a ConcurrentBag with safe get and iter implementations.

  • ConcurrentBag implements Deref and DerefMut for PinnedConcurrentCol . And ConcurrentVec for ConcurrentBag .
  • Documentation is revised.
  • Benchmarks are repeated, performance has not changed (as expected).

fix/issue-with-heap-corruption-is-fixed

08 Apr 14:15
d787c65
Compare
Choose a tag to compare

The fix is applied in orx-concurrent-bag. This PR involves updating the dependency version. Further, relevant test coverage is increased.

Performance improvements by faster mem zeroing

07 Apr 19:02
6502fa7
Compare
Choose a tag to compare

The main difference of ConcurrentVec from more performant ConcurrentBag is due to the fact that ConcurrentVec zeroes out new positions upon each new allocation. This is a requirement to provide safe versions of the get and iter methods; in other words, to allow safe concurrent read and write opeations.

In the prior version, memory zeroing had been handled by the concurrent bag's fill strategy which requires iteration over the new allocated positions.

In the new version. a memory zeroing concurrent bag is wrapped which now directly uses PinnedVecs grow_to method to zero out the memory on allocation. The implementations often reduce to a memset call and hence provides performance improvements.

Benchmarks are repeated, expected improvements are observed and reported.

`ConcurrentVec` wraps and extends `ConcurrentBag`

28 Mar 21:35
0a9455a
Compare
Choose a tag to compare
  • To completely avoid code repetition, ConcurrentVec<T> now wraps a ConcurrentBag<Option<T>>.
  • The separation is made clear: ConcurrentBag is performance optimal for concurrent collection, ConcurrentVec is a an efficient alternative which provides safe reading during growth.
    • Read and write feature is demonstrated with a simple example.
  • On the other hand, concurrent vec api and concurrent bag api are made almost identical to allow for changing the concurrent collection type easily.
  • Benchmarks are added, run and reported.
  • Documentation is revised.
  • Tests are extended.

Changing underlying storage to any `PinnedVec`

22 Mar 04:26
e27b796
Compare
Choose a tag to compare
  • Instead of hard-coded SplitVec, the concurrent bag now wraps any PinnedVec.
  • capacity method is added.
  • Debug is implemented.
  • Benchmarks added and reported.
  • AtomicUsize is used for capacity rather than usize. This update does not lead to diminished performance.
  • Memory orderings of atomic operations are strengthened.
  • UnsafeCell is used for the pinned vector, rather than direct cast.
  • Tests extended.

Inside the push

In the prior version, growth of the bag required two operations:

  • setting length of the pinned vector to its capacity, and
  • growing the pinned vector to its next size.

The reason the first operation is required is due to the fact that the second operation can only succeed when vector is at its capacity; i.e., len == capacity.

Having to make two mutations is not good for heavy load situations. grow_to method, on the other hand, does not require the pinned vector to be at its length, and hence, we do not need to make the set_len call. Furthemore, it will be required for a possible extend implementation.

Also

  • Errors are organized.
  • Memory orderings are strengthened.

Benchmarks and unsafe primitives

15 Mar 04:53
3c25783
Compare
Choose a tag to compare
  • Benchmarks are documented and reported.
  • push_none is implemented to allow reserving positions, or pushing holes to the vector.
  • unsafe primitives take and put are implemented.

ConcurrentVec is defined with push, iter, get

11 Mar 21:05
Compare
Choose a tag to compare

orx-concurrent-vec

An efficient and convenient thread-safe grow-only read-and-write collection, ideal for collecting results concurrently.

  • convenient: the vector can be shared among threads simply as a shared reference, not even requiring Arc,
  • efficient: for collecting results concurrently:
    • rayon is significantly faster than ConcurrentVec when the elements are small and there is an extreme load (no work at all among push calls),
    • ConcurrentVec is significantly faster than rayon when elements are large or there there is some computation happening to evaluate the elements before the push calls,
    • you may see the details of the benchmarks at benches/grow.rs.

The bag preserves the order of elements with respect to the order the push method is called.

Examples

Safety guarantees to push to the bag with an immutable reference makes it easy to share the bag among threads.

Using std::sync::Arc

Following the common approach of using an Arc, we can share the vector among threads and collect results concurrently.

use orx_concurrent_vec::*;
use std::{sync::Arc, thread};

let (num_threads, num_items_per_thread) = (4, 8);

let convec = Arc::new(ConcurrentVec::new());
let mut thread_vec: Vec<thread::JoinHandle<()>> = Vec::new();

for i in 0..num_threads {
    let convec = convec.clone();
    thread_vec.push(thread::spawn(move || {
        for j in 0..num_items_per_thread {
            convec.push(i * 1000 + j); // concurrently collect results simply by calling `push`
        }
    }));
}

for handle in thread_vec {
    handle.join().unwrap();
}

let mut vec_from_convec: Vec<_> = convec.iter().copied().collect();
vec_from_convec.sort();
let mut expected: Vec<_> = (0..num_threads).flat_map(|i| (0..num_items_per_thread).map(move |j| i * 1000 + j)).collect();
expected.sort();
assert_eq!(vec_from_convec, expected);

Using std::thread::scope

An even more convenient approach would be to use thread scopes. This allows to use shared reference of the vec across threads, instead of Arc.

use orx_concurrent_vec::*;
use std::thread;

let (num_threads, num_items_per_thread) = (4, 8);

let convec = ConcurrentVec::new();
let convec_ref = &convec; // just take a reference
std::thread::scope(|s| {
    for i in 0..num_threads {
        s.spawn(move || {
            for j in 0..num_items_per_thread {
                convec_ref.push(i * 1000 + j); // concurrently collect results simply by calling `push`
            }
        });
    }
});

let mut vec_from_convec: Vec<_> = convec.iter().copied().collect();
vec_from_convec.sort();
let mut expected: Vec<_> = (0..num_threads).flat_map(|i| (0..num_items_per_thread).map(move |j| i * 1000 + j)).collect();
expected.sort();
assert_eq!(vec_from_convec, expected);

Safety

ConcurrentVec uses a SplitVec as the underlying storage.
SplitVec implements PinnedVec which guarantees that elements which are already pushed to the vector stay pinned to their memory locations.
This feature makes it safe to grow with a shared reference on a single thread, as implemented by ImpVec.

In order to achieve this feature in a concurrent program, ConcurrentVec pairs the SplitVec with an AtomicUsize.

  • AtomicUsize fixes the target memory location of each element to be pushed at the time the push method is called. Regardless of whether or not writing to memory completes before another element is pushed, every pushed element receives a unique position reserved for it.
  • SplitVec guarantees that already pushed elements are not moved around in memory and new elements are written to the reserved position.

The approach guarantees that

  • only one thread can write to the memory location of an element being pushed to the vec,
  • at any point in time, only one thread is responsible for the allocation of memory if the vec requires new memory,
  • no thread reads an element which is being written, reading is allowed only after the element is completely written,
  • hence, there exists no race condition.

This pair allows a lightweight and convenient concurrent bag which is ideal for collecting results concurrently.

Write-Only vs Read-Write

The concurrent vec is read-and-write & grow-only vec which is convenient and efficient for collecting elements.
While allowing growth by pushing elements from multiple threads, each thread can safely read already pushed elements.

See ConcurrentBag for a write-only variant which allows only writing during growth.
The advantage of the bag, on the other hand, is that it stores elements as T rather than Option<T>.