Parallelize build? #70
Comments
It wouldn't be too hard – you would need a mutex in a couple of places and also watch out for reallocations, but other than that it should be easy |
Cool thanks, I'll look into it (although my C++ is a little rusty :)) |
Has there been any further work on this? |
not afaik :( |
No work on it from my end. |
Bummer. I'd love to help out, but I don't think I'm familiar enough with annoy yet (or frankly competent enough with C++) to spearhead anything. If anyone starts working on this and wants a hand, I'm happy to help out. |
Actually, I take it back. It looks like a pretty simple change. If I understand it properly, it's line 496 of
That can get parallelized, but with a mutex around |
roughly, but it's a bit more complicated. then you obviously need to create pthreads and collect them afterwards... honestly I haven't done that in like 10 years, but iirc it's not too hard |
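For illustration, the "create threads, collect them afterwards, mutex around the shared write" pattern might look roughly like the sketch below. It uses `std::thread` and `std::mutex` rather than raw pthreads, and nothing in it is Annoy's actual code: `build_one_tree`, `build_trees_parallel`, and the `roots` vector are hypothetical stand-ins.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for building one tree; returns a fake root id.
// (Assumption: real tree building would go here.)
static int build_one_tree(int tree_idx) { return tree_idx * 10; }

// Builds n_trees trees on n_threads workers and returns the collected roots.
std::vector<int> build_trees_parallel(int n_trees, int n_threads) {
  assert(n_threads > 0);

  std::vector<int> roots;  // shared between all workers
  std::mutex roots_mutex;  // guards every access to roots

  auto worker = [&](int thread_idx) {
    // Each worker takes every n_threads-th tree.
    for (int t = thread_idx; t < n_trees; t += n_threads) {
      int root = build_one_tree(t);  // independent work, no lock held
      std::lock_guard<std::mutex> lock(roots_mutex);
      roots.push_back(root);         // shared write, lock held
    }
  };

  std::vector<std::thread> threads;
  for (int i = 0; i < n_threads; ++i) threads.emplace_back(worker, i);
  for (auto& th : threads) th.join();  // collect the threads afterwards
  return roots;
}
```

Calling `build_trees_parallel(8, 4)` would return eight root ids in a nondeterministic order, since the workers interleave. The lock only covers the `push_back`, so the expensive part of each iteration runs without contention.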
@erikbern Actually I was working on a branch to add this just this weekend. It doesn't quite work yet though (I can only get it to run the precision_test.cpp example, only a debug binary works and it crashes sometimes even then); should I submit a PR? I didn't touch anything towards the bottom of annoylib.h (near those lines that you mentioned), that might be the problem... |
sure, feel free to submit a PR, just make sure to put "WIP" in the subject or something. you definitely need a mutex around those lines, that's probably the issue :) |
@tjrileywisc what a coincidence! I also started working on this over the weekend. Mine also doesn't quite work... but I've pushed a copy to my fork: https://github.com/thomas4g/annoy/tree/parallelize_build I hope to keep working on it tonight, but if you're further along let me know if you'd like any help! Feel free to ping me here or shoot me an email at me@thomasshields.net |
@thomas4g |
@tjrileywisc ahh, I wanted to use that but got thrown off by the build not supporting my |
Just submitted a PR for my build_trees_threaded branch. |
guys, is there any news on the multicore index building? |
I thought I might have a go at this. The actual threading machinery is all straightforward but I'm having some trouble seeing exactly which bits of
Any help with this is appreciated. |
interesting – you're probably right that it should be fairly straightforward, but i can't think of other critical sections off the top of my head |
Unless I'm misunderstanding the code this may be more complicated than I first thought. It appears that there are many places within |
From what you say I think the main issue is that the underlying memory can be reallocated at any point in time, and that invalidates any pointers held by any other thread. But those reallocations are actually pretty rare so there should be some way to fix it. I haven't dealt with concurrency code in C++ since maybe 2007 so my knowledge is a bit rusty but couldn't you use a shared lock for this? Almost all access will be nonexclusive (so near-zero overhead), but the few times when you need to reallocate the underlying storage, you would have to acquire an exclusive lock. Does that make sense? |
I meant a shared mutex. This looks like the right concurrency primitive: https://en.cppreference.com/w/cpp/thread/shared_mutex So basically acquire a shared lock when writing individual vectors, acquire an exclusive lock when you have to resize the underlying data storage. But I'm mostly speculating, could be wrong :) |
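A minimal sketch of that idea, assuming C++17 for `std::shared_mutex`: writers of individual vectors hold a shared lock, and the rare resize holds an exclusive lock. The `NodeStore` class and the `fill_concurrently` helper are hypothetical and only illustrate the locking pattern; they are not how Annoy stores its nodes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical node store: one flat buffer of fixed-size float vectors.
class NodeStore {
 public:
  explicit NodeStore(std::size_t dim) : dim_(dim) {}

  std::size_t dim() const { return dim_; }

  // Writes item i. Different threads must write different i.
  void set(std::size_t i, const float* v) {
    const std::size_t needed = (i + 1) * dim_;
    for (;;) {
      {
        // Common case: shared lock, many writers proceed at once.
        std::shared_lock<std::shared_mutex> lock(mutex_);
        if (needed <= data_.size()) {
          std::copy(v, v + dim_, data_.begin() + i * dim_);
          return;
        }
      }
      // Rare case: exclusive lock while the buffer may be reallocated.
      std::unique_lock<std::shared_mutex> lock(mutex_);
      if (data_.size() < needed) {
        data_.resize(std::max(needed, data_.size() * 2));
      }
      // Loop back and retry the write under a shared lock.
    }
  }

  float get(std::size_t i, std::size_t d) const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    return data_[i * dim_ + d];
  }

 private:
  std::size_t dim_;
  std::vector<float> data_;
  mutable std::shared_mutex mutex_;
};

// Fills items 0..n_items-1 from n_threads workers; item i holds the value i.
void fill_concurrently(NodeStore& store, std::size_t n_items, int n_threads) {
  assert(n_threads > 0);
  std::vector<std::thread> threads;
  for (int t = 0; t < n_threads; ++t) {
    threads.emplace_back([&store, n_items, n_threads, t]() {
      for (std::size_t i = t; i < n_items; i += n_threads) {
        std::vector<float> v(store.dim(), static_cast<float>(i));
        store.set(i, v.data());
      }
    });
  }
  for (auto& th : threads) th.join();
}
```

The key point is that no thread keeps a raw pointer into the buffer across a lock release, so a reallocation under the exclusive lock cannot invalidate anything another thread is still using. Code that holds node pointers for longer stretches would need to keep the shared lock for that whole stretch, or switch to indices.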
I think what you are saying makes sense. Thanks for the tip on shared mutex - I hadn't seen that before. It was introduced in C++17 though, which may be its own problem. I'll take another look and see what I can figure out. |
@os-gabe Hey, Is it solved? Any insights on how to parallelize build? |
no, this would have to be implemented by someone |
@erikbern That's sad. I am trying to build an index for over 1M vectors and it is crashing even with on_disk_build. The process took up more than 30 GB memory and crashed. |
i'm not sure if parallelization would have helped, though. do you know what's causing it to crash? |
Yeah agreed that parallelization probably would not help. 1M vectors isn't that much unless they are extremely high dimensional vectors. Did you try with a smaller number of vectors to see where it starts failing? |
good point – annoy isn't meant for super high dimensionality, so if that's what you're facing then you should probably run dimensionality reduction outside of annoy first! |
Unfortunately I had to move on to other things and did not get parallel build working |
Conceptually, if the trees are independent, shouldn't we be able to build the trees in parallel across multiple cores? Is this something that could be implemented easily?
I haven't had a chance to dive deeply into the code/algorithm yet so I may be misunderstanding something.