Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is Matcher threadsafe? #7

Closed
raethlo opened this issue Apr 18, 2016 · 3 comments
Closed

Is Matcher threadsafe? #7

raethlo opened this issue Apr 18, 2016 · 3 comments
Labels

Comments

@raethlo
Copy link

raethlo commented Apr 18, 2016

Hey,

I am running some processing in separate threads to speed things up, I'm wondering if Map and/or Matcher objects are threadsafe. I am sharing a Matcher object between threads but I am not quite sure if it is all right. If it's not safe I could pass in a map and instantiate separate matchers, but then again, I'd have to be sure Map is threadsafe.

Thanks

@smattheis
Copy link

smattheis commented Apr 18, 2016

Matcher and RoadMap are thread-safe in such a way that both are accessed read-only during map matching. However, Matcher is already running shortest path searches in parallel (data parallelism), i.e. for n candidates (sources) in t-1 and m candidates (targets) in t, it runs n SSMT shortest path searches in parallel (SSMT - single source multiple target); see line 246 and 295 in the Matcher class. The only non-thread-safe objects relevant during map matching are KState and MatcherKState, respectively.
To the best of my knowledge, 'more' parallelization of map matching of a single track requires non-trivial task parallelism, e.g., a parallel implementation of Dijkstra algorithm. However, processing multiple tracks in parallel is just data parallelism which is already implemented if you use the stand-alone server, see lines 252-261 in the Server class.
All in all, I don't see that much potential for trivial parallelization. You could check the speed-up of your application, if you set matcherNumThreads to one and to the number of virtual cores, respectively. If you use Barefoot as a library, run StaticScheduler.reset(1) and StaticScheduler.reset(p) where p is again the number of virtual cores before the map matching.
Finally, I would recommand to not use threads for further parallelization because it may conflict with Barefoot's internal work-stealing scheduler (planned to be extracted in a separate library in future), that provides (user-level) task parallelism with spawnand syncoperations as presented in Leiserson et al., Introduction to Algorithms, MIT Press, 3rd edition (see chapter 27).

@raethlo
Copy link
Author

raethlo commented Apr 18, 2016

Thanks for such a thorough answer. The parallelism in my case is indeed needed to load separate traces in parallel and then match them, with a code something like so in the implementation of Runnable#run() method:

while (loader.next()){
    List<List<MatcherSample>> samples = loader.loadCurrentBatch();

    for (List<MatcherSample> trace : samples) {
        MatcherKState candidate_state = mapTrace(trace);
        saveRoadSpeeds(candidate_state.samples(), candidate_state.sequence());
    }
}

thanks for the reference to the Server class, will check that code and see if it can improve the design I currently have.

Cheers!

@smattheis
Copy link

In that case (batch processing), I would just start as many threads as you have cores available and run that code.

If you're having a Spark cluster available, you could check out this to run it on the cluster: scalable map matching.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants