Small ATLAS #7
Comments
Hi @jamesoneill12, the 11B-parameter reader model corresponds to the …
While building the FAISS index with the recommended settings "--faiss_index_type ivfpq --faiss_code_size 16", my machine with 8 × 80 GB A100s runs out of CUDA memory (after converting about 3 million passages). How can I save memory during this step? At this stage, I think the T5 model is not even loaded.
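A rough back-of-the-envelope calculation helps explain why this step is memory-hungry: the out-of-memory error most likely comes from holding the raw float32 embeddings before they are compressed, not from the final IVFPQ index, which is tiny by comparison. The sketch below is illustrative only; the 768-dimensional embeddings (Contriever is BERT-base-sized), the ~8-byte per-id overhead, and the 30M-passage corpus size are assumptions, not Atlas's exact numbers.

```python
def flat_bytes_per_vector(dim=768, dtype_bytes=4):
    """Uncompressed (flat) storage: one float32 per dimension."""
    return dim * dtype_bytes

def ivfpq_bytes_per_vector(code_size=16, id_bytes=8):
    """PQ-compressed storage: code_size bytes plus a per-vector id entry."""
    return code_size + id_bytes

n_passages = 30_000_000  # hypothetical corpus size

flat_gb = n_passages * flat_bytes_per_vector() / 1e9
pq_gb = n_passages * ivfpq_bytes_per_vector() / 1e9
# For this hypothetical corpus: ~92 GB uncompressed vs under 1 GB for IVFPQ codes.
print(f"flat: {flat_gb:.1f} GB, ivfpq: {pq_gb:.1f} GB")
```

So the compressed index itself should fit comfortably; the pressure point is the uncompressed embedding matrix that exists while the index is being built, which is why spilling embeddings to CPU or disk during conversion is the usual workaround.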
@minhluan1590 What size of model are you using? I'm assuming it's xl or xxl, so you might need more than 8 GPUs to load all the embeddings. You could either use a smaller model or try …
Hi @mlomeli1, could you specify the minimum requirements to run Atlas? I have a 12 GB GPU; would that be sufficient for fine-tuning?
Well, that might be too small; the Atlas model requires a lot of GPU memory. I am converting the model from Sharded Data Parallel to Fully Sharded Data Parallel so it can run with less GPU memory per device. It is running now, but I am still unsure whether I can load the SDP-stored model and optimizer parameters and keep using them with FSDP.
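One common pattern for moving a checkpoint between parallelism schemes (a sketch, not Atlas's actual checkpointing code) is to consolidate the sharded parameters into a single full state_dict on CPU, save that, and then load it into a plain model before wrapping with FSDP. The tiny `make_reader` model below is a hypothetical stand-in for the Atlas reader; optimizer state is usually harder to carry across and often has to be re-initialized.

```python
import os
import tempfile
import torch
import torch.nn as nn

def make_reader():
    # Hypothetical tiny stand-in for the Atlas reader model.
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))

# 1) On the Sharded Data Parallel side: consolidate to a full state_dict
#    on CPU and save it once (typically done on rank 0).
model = make_reader()
full_state = {k: v.cpu() for k, v in model.state_dict().items()}
ckpt = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
torch.save({"model": full_state}, ckpt)

# 2) On the FSDP side: build the plain (unwrapped) model, load the full
#    state_dict first, and only then wrap with FSDP. The wrap itself is
#    commented out because it needs an initialized process group and GPUs.
fresh = make_reader()
fresh.load_state_dict(torch.load(ckpt, map_location="cpu")["model"])
# from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
# fresh = FSDP(fresh)
```

Loading before wrapping avoids having to reason about FSDP's sharded parameter layout at load time.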
@minhluan1590 Did you manage to run Atlas on a small GPU? Also, in an earlier comment you mentioned an 8 × 80 GB A100 machine; are you really using 640 GB of GPU memory in total?
It's the fine-tuning process that forces us to use so much memory. During this step, the full index still needs to be computed and then converted into a FAISS index. I am still trying to optimize the model's memory use and will share what I find when I'm done.
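One memory-frugal pattern for the index-building step (a sketch under assumptions, not the project's actual pipeline) is to embed passages in chunks and spill each chunk to disk as float16, so the full embedding matrix never sits in device memory at once; the FAISS training and conversion can then stream the chunks back. The `embed_chunk` function below is a hypothetical stand-in for the real Contriever encoder, and the 768-dimensional output is an assumption.

```python
import os
import numpy as np

def embed_chunk(passages):
    # Hypothetical stand-in for the Contriever encoder: returns one
    # 768-dim float16 embedding per passage.
    rng = np.random.default_rng(0)
    return rng.standard_normal((len(passages), 768)).astype(np.float16)

def write_embeddings_in_chunks(passages, out_dir, chunk_size=100_000):
    """Embed passages chunk by chunk, saving each chunk to disk so peak
    memory is bounded by one chunk rather than the whole corpus."""
    paths = []
    for i in range(0, len(passages), chunk_size):
        emb = embed_chunk(passages[i:i + chunk_size])
        path = os.path.join(out_dir, f"emb_{i:08d}.npy")
        np.save(path, emb)
        paths.append(path)
    return paths
```

Storing float16 instead of float32 halves the disk and transfer cost, and the chunk files can be fed to FAISS training one at a time.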
Hi @minhluan1590 and @prasad4fun, thanks for all the discussion. As I said, different model sizes have different memory requirements, so it would be good to know which model size you intend to use. As a reference, I've used the …
Are there any plans to release a smaller version of ATLAS?
Although 11B is relatively small when compared to the LLMs in the paper, it's still pretty large for ML practitioners with limited resources.
Thanks! James.