Slow (?) indexing on Apple m1 #55
Comments
Hey! No, this very much isn't expected. Tried it out on M1 and it took <30s to index the Miyazaki example, which seems to be what you're doing here! Are you running the standard notebook example on the latest version? I can't reproduce the issue on weaker hardware than yours via VSCode Jupyter (M1 Air 8GB, Python 3.11, indexing time ~27s)🤔 |
Yes - 01-basic_indexing_and_search.ipynb. I am still seeing the same behavior after doing the following:
When I interrupt it shows
|
It's very odd that it'd hang on a Mac. I think we've had a few hundred people run it locally already and this has never occurred 🤔 It does reveal another issue, though -- @Anmol6 I've pinged you: we need a more reliable way of bypassing forking, as RAGatouille currently has no easy way of enforcing it in every case. @gojira I've tried a new env on 3.9 as well, and everything runs smoothly... I think once we've fixed the flawed MP-bypass implementation, it'll also solve your problem! |
It's been a little while since I've had things break in AI this way - I guess it is truly cutting edge ;)! |
😄 Multiprocessing should now be properly disabled when fewer than 2 GPUs are detected -- let me know if the update fixes your issue! |
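As a rough illustration of the rule described above (stay in a single process unless at least two GPUs are visible), here is a minimal sketch. It is not RAGatouille's or ColBERT's actual code: `should_use_multiprocessing` and `run_indexing` are hypothetical names, and the GPU count is passed in as a plain integer (in practice it might come from something like `torch.cuda.device_count()`).

```python
from typing import Callable, List


def should_use_multiprocessing(gpu_count: int) -> bool:
    """Hypothetical helper: only use worker processes when 2+ GPUs are visible."""
    return gpu_count >= 2


def run_indexing(index_shard: Callable[[int], str], gpu_count: int) -> List[str]:
    """Run `index_shard` in-process when multiprocessing is not worthwhile."""
    if not should_use_multiprocessing(gpu_count):
        # 0 or 1 GPU (e.g. a CPU-only Apple Silicon machine): no forking at all.
        return [index_shard(0)]
    # With 2+ GPUs a real implementation would launch one worker per device.
    # This sketch only loops over ranks so that it stays self-contained.
    return [index_shard(rank) for rank in range(gpu_count)]
```

On a machine with no CUDA devices, `run_indexing(fn, 0)` calls `fn` exactly once in the current process, which is the behavior the fix is aiming for.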
I installed the following and it still hangs. Now VS Code has problems interrupting the kernel - so I can't get a trace like I did last time
|
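When a run hangs and the kernel can no longer be interrupted, one standard-library option for still getting a trace is `faulthandler.dump_traceback_later`, which writes the stack of every thread after a timeout without needing an interrupt. A minimal sketch, assuming you can add a couple of lines before the indexing call; `enable_hang_dump`, `disable_hang_dump`, and the log path are hypothetical names, and the dump goes to a real file because notebook output streams may not expose a file descriptor.

```python
import faulthandler


def enable_hang_dump(log_path: str, timeout_s: float = 60.0):
    """Dump all thread stacks to `log_path` every `timeout_s` seconds.

    Returns the open file object; keep a reference to it until the dump is disabled.
    """
    log_file = open(log_path, "w")
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log_file)
    return log_file


def disable_hang_dump(log_file) -> None:
    """Cancel the pending dump and close the log file."""
    faulthandler.cancel_dump_traceback_later()
    log_file.close()
```

Calling `enable_hang_dump("hang_trace.log")` just before the indexing call, then reading the file from another terminal once the run stalls, should show where each thread is stuck.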
This is very strange, especially considering it seems to be dependent on something very specific, but utterly unclear what 🤔 ... Would you mind posting your pip freeze? I'll mark this issue as Help Wanted and it'd be great if someone managed to figure out exactly what's causing it. |
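Alongside a full `pip freeze`, a short environment summary can make reports easier to compare across machines. The sketch below uses only the standard library; `environment_report` is a hypothetical helper, and the package names in the usage note are guesses at what is likely relevant here, not a list taken from this thread.

```python
import platform
import sys
from importlib import metadata
from typing import Iterable


def environment_report(packages: Iterable[str]) -> str:
    """Summarize Python, platform, and the installed versions of `packages`."""
    lines = [
        f"python: {sys.version.split()[0]}",
        f"platform: {platform.platform()}",
        f"machine: {platform.machine()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)
```

For example, `print(environment_report(["ragatouille", "colbert-ai", "torch"]))` gives a few lines that can be pasted directly into a comment.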
Yes - here you go - I deleted it before
|
@gojira can you provide the exact code you're running via a script file/gist + how you're creating your environment? |
Hi guys, I think I have the same issue (M3 Pro, 64GB). I was running Python 3.10, then tried 3.9 with no change in the output (I didn't wait for the program to finish, but let it run for more than 15 minutes a couple of times). install pip freeze output: Let me know if I can provide more information! |
(Copy/pasting this message in a few related issues)

Hey guys! Thanks a lot for bearing with me as I juggle everything and try to diagnose this. It’s complicated to fix with relatively little time to dedicate to it, as it seems like the dependencies causing issues aren’t the same for everyone, with no clear platform pattern as of yet. Overall, the issues center around the usual suspects of

While because of this I can’t fix the issue with PLAID optimised indices just yet, I’m also noticing that most of the bug reports here are about relatively small collections (100s-to-low-1000s). To lower the barrier to entry as much as possible, #137 is introducing a second index format, which doesn’t actually build an index, but performs an exact search over all documents (as a stepping stone towards #110, which would use an HNSW index as an in-between compromise between PLAID optimisation and exact search).

The PR above (#137) is still a work in progress, as it needs CRUD support, tests, documentation, better precision routing (fp32/bfloat16) etc… (and potentially searching only a subset of document ids).

index(…
index_type="FULL_VECTORS",
)

Any feedback is appreciated, as always, and thanks again! |
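To make "exact search over all documents" concrete: with ColBERT-style token embeddings, a brute-force search can score every document with the MaxSim late-interaction sum and sort the results. The sketch below is a pure-Python illustration under that assumption, not the code from #137; `maxsim_score` and `exact_search` are hypothetical names, and real embeddings would be tensors rather than nested lists.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """For each query token, take its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def exact_search(
    query_tokens: Sequence[Vector],
    docs: Sequence[Sequence[Vector]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Score every document (no index, no pruning) and return the top-k (id, score)."""
    scored = [(doc_id, maxsim_score(query_tokens, doc)) for doc_id, doc in enumerate(docs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The cost grows linearly with the number of documents, which is why this kind of search suits the small collections mentioned above and why an optimised index still matters for larger ones.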
Hey Benjamin, thanks a bunch for your time! Just tried it and it went smoothly. I installed the feat/full_vectors_indexing branch, then tried with and without index_type="FULL_VECTORS"; both runs were a success! |
My code is getting stuck at
Even when running the most minimalist example like the one below

from ragatouille import RAGPretrainedModel

if __name__ == "__main__":
    RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

I'm using This is my env
|
Hi - really excited to try RAGatouille. On Apple mac with M1Max - it's taken over 12 hours to index. Is this expected?
PyTorch emitted some warnings about CUDA not being available, but otherwise it seems to be running without error.
Below is the output in Jupyter in VS Code - it's running