Canine model and High VRAM usage #115
Hi, thanks for these benchmarks! And sorry for being slow to respond. You could debug this by checking how much memory the vanilla CANINE (https://huggingface.co/google/canine-s) takes for a forward pass vs. a forward pass of the WtP model (see e.g. here: https://github.com/bminixhofer/wtpsplit/?tab=readme-ov-file#advanced-usage). If there's a discrepancy there I'll investigate it. It's possible that CANINE just needs a lot of memory though; I am not super happy with that architecture and will upgrade the models to a different arch soon(ish).
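A minimal measurement sketch along those lines, assuming `torch`, `transformers`, and `wtpsplit` are installed and a CUDA device is available (the WtP calls follow the README linked above; nothing here is the project's own benchmarking code):

```python
from typing import Callable


def bytes_to_mib(n: int) -> float:
    """Convert a byte count to mebibytes."""
    return n / (1024 ** 2)


def peak_vram_mib(run: Callable[[], object]) -> float:
    """Execute `run()` and report the peak CUDA memory allocated, in MiB."""
    import torch  # imported here so the module loads without torch installed

    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    with torch.no_grad():
        run()
    return bytes_to_mib(torch.cuda.max_memory_allocated())


def compare(text: str) -> None:
    """Measure a vanilla CANINE forward pass vs. a WtP split on the same text."""
    from transformers import AutoTokenizer, CanineModel
    from wtpsplit import WtP

    tok = AutoTokenizer.from_pretrained("google/canine-s")
    canine = CanineModel.from_pretrained("google/canine-s").to("cuda").eval()
    inputs = tok(text, return_tensors="pt", truncation=True).to("cuda")
    print("CANINE:", peak_vram_mib(lambda: canine(**inputs)), "MiB")

    wtp = WtP("wtp-canine-s-12l-no-adapters")
    wtp.half().to("cuda")
    print("WtP:   ", peak_vram_mib(lambda: wtp.split(text)), "MiB")
```

Calling `compare()` on the same input text should show whether WtP's peak allocation is in line with the vanilla CANINE forward pass or meaningfully above it.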
Will do. Btw, if you need GPU compute to train the next model, I can provide you with an A100 80GB+. You can ping me on Twitter at qbitium.
Thanks! And that's very generous, deferring to @markus583 since he is doing the training, but we are using TPUs so there is probably no need.
Very generous indeed! Thanks, but the TPUs are very strong. I'd be very curious whether there is a discrepancy too.
@bminixhofer We are observing very high VRAM usage with the CANINE model. The wtp-canine-s-12l-no-adapters fp32 weights are only about 515 MB, so we naively expected batch=1 in fp16 mode to use ~257.5 MB for weights plus runtime/inference overhead. We didn't expect batch=1 VRAM usage to be 1.3 GB. The input is a text file of around 230 KB. Is this a bug or an architectural norm for the CANINE model? If it's the norm, is there anything we can do to reduce the memory footprint? Thanks.
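For what it's worth, a back-of-envelope sketch of why activations can dwarf the weight footprint on a character-level model (the hidden size and the one-tensor-per-sequence simplification are assumptions, not measurements):

```python
# Fixed cost: the weights themselves, halved when cast from fp32 to fp16.
fp32_checkpoint_mb = 515
fp16_weights_mb = fp32_checkpoint_mb / 2   # 257.5 MB of weights in fp16

# Variable cost: activations grow with input length. CANINE is character-level,
# so a 230 KB file is roughly 230k input positions. A single fp16 hidden-state
# tensor at hidden size 768 (canine-s, assumed) is already:
seq_chars = 230_000
hidden = 768
one_hidden_state_mb = seq_chars * hidden * 2 / 1e6   # ~353 MB

print(fp16_weights_mb, one_hidden_state_mb)
```

Each transformer layer materialises tensors of this shape (plus attention buffers), so peak usage well beyond the weight footprint is plausible for a long input; chunking the text into shorter segments bounds `seq_chars` and hence the activation cost.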