HQQ FSDP #17
@mobicham - by way of background, this PR is needed to allow HQQ to be used with FSDP for multi-GPU QLoRA training. We'll be releasing a blog post from Answer.AI soon showing folks how to use this functionality (we've also done it for bitsandbytes, BTW). Let us know if you have any questions.
Thank you very much for your contribution @KeremTurgutlu @jph00! Using multi-GPU to train quantized models is super valuable! The PR looks mostly fine, I just have a few things if you don't mind. Fortunately, these are easy fixes. In fact, I already included them here (via gist, since the PR is coming from your repo): https://gist.github.com/mobicham/af0b7676c587ff36c0607affc00795eb

Bugs
(B1): Error when loading quantized models via
@mobicham thanks a lot for the detailed response and suggestions! I incorporated the changes and also added new tests. FSDP training runs fine as before.

Fixing (S3): Scale/zero still treated as int
FSDP only requires parameters and buffers to be a float type, so tensors in
(see the dtype-check sketch after this comment)

(S5): Please provide an example
Training script can be found here: https://github.com/AnswerDotAI/fsdp_qlora/blob/scaling_experiments/train.py. We are still actively working on it, but it should be finalized by the time we have the blog post ready. I also tested weights before and after training and made sure only the desired params are updated.

FSDP model saving issue with HQQLinear
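As a hedged aside on (S3) above: since FSDP's flat-parameter sharding expects floating-point tensors, a quick way to catch any leftover integer-typed scale/zero tensors is a dtype sweep over the wrapped module. This is a minimal sketch using only standard PyTorch calls; the function name is hypothetical and it is not the test added in the PR.

```python
import torch

def assert_fsdp_float_dtypes(module: torch.nn.Module) -> None:
    # Flag any parameter or buffer (e.g. an int-typed scale or zero-point)
    # that is not a floating-point dtype, since FSDP flattening requires floats.
    offenders = [
        (name, tuple(t.shape), t.dtype)
        for name, t in list(module.named_parameters()) + list(module.named_buffers())
        if not torch.is_floating_point(t)
    ]
    if offenders:
        raise TypeError(f"Non-float tensors would break FSDP flattening: {offenders}")

# Usage (hypothetical): assert_fsdp_float_dtypes(quantized_model)
```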
Thanks for the update @KeremTurgutlu! I ran more tests and it looks fine to me, so I already merged it! Regarding saving FSDP models, LoRA weights should be saved separately anyway; that's how we do it as well: https://github.com/mobiusml/hqq/blob/master/hqq/core/peft.py#L316. We can create a separate issue for
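In the spirit of the peft.py code linked above, a minimal sketch of saving the LoRA weights separately is below. It simply filters on requires_grad; the function name and file path are assumptions for illustration, not the actual HQQ or fsdp_qlora saving path.

```python
import torch

def save_trainable_weights(model: torch.nn.Module, path: str) -> None:
    # After quantization only the LoRA adapters remain trainable, so saving the
    # requires_grad subset captures them without touching the frozen base weights.
    trainable = {
        name: param.detach().cpu()
        for name, param in model.named_parameters()
        if param.requires_grad
    }
    torch.save(trainable, path)

# Usage (hypothetical): save_trainable_weights(model, "lora_weights.pt")
```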
SGTM!
@mobicham FYI there's a draft post here that we'll be announcing tomorrow; it includes a discussion of the great results we're seeing with HQQ. Let us know if you see any issues with it: https://www.answer.ai/posts/2024-03-06-fsdp-qlora.html
@jph00 Thank you very much for sharing the draft! It is a very nice read! The only suggestion on our side would be to use the official website for Mobius Labs, https://www.mobiuslabs.com/; the one mentioned in the article (https://mobiusml.github.io/) is very old! Looking forward to seeing and sharing the final version!
@jph00 This reads really nicely. I hope that future significant models will be trained using less powerful GPUs in remote parts of the world, potentially helping to level the playing field in AI. Once this is live, would it be acceptable to link to and feature it on our blog, https://mobiusml.github.io/blog/, with a note expressing our delight at having contributed to this work?
Of course we'd be delighted!
@mobicham Our HQQ+FSDP changes have been merged into main, so you'll want to update the HQQ README link from the
@appoose FYI this is now launched: https://twitter.com/jeremyphoward/status/1765868543235805232
@jph00 It is now linked on our blog at https://mobiusml.github.io/blog/. Looking forward to future collaborations!
Related issue: #14
Note: 1-bit and 8-bit HQQLinear are failing in the `test_hqq_linear` test. Otherwise they work with `Quantizer.quantize` and `Quantizer.dequantize`.
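A rough sketch of the kind of round-trip check implied by the note above; the import path and the `Quantizer.quantize`/`Quantizer.dequantize` argument names are assumptions and may differ between HQQ versions.

```python
import torch
from hqq.core.quantize import Quantizer  # import path is an assumption

# Round-trip a random weight matrix through quantize/dequantize and report the
# mean absolute reconstruction error (the part that still works at 1 and 8 bits).
W = torch.randn(1024, 1024)
W_q, meta = Quantizer.quantize(W, nbits=8, group_size=64)  # argument names assumed
W_rec = Quantizer.dequantize(W_q, meta)
print("mean abs error:", (W - W_rec).abs().mean().item())
```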