This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

[Question] Usage of bnb.nn.Embedding with existing classes from other libraries #19

Closed
LSinev opened this issue Nov 22, 2021 · 2 comments
Labels
enhancement New feature or request question Further information is requested

Comments

@LSinev

LSinev commented Nov 22, 2021

Replace embedding layer if necessary: torch.nn.Embedding(..) -> bnb.nn.Embedding(..)

Does this mean users have to create custom classes to replace (for example) huggingface transformers' GPT2DoubleHeadsModel?
Or is there something like bnb.optim.GlobalOptimManager that changes a provided model instance to use bitsandbytes embeddings instead of the torch ones?

@TimDettmers
Contributor

Currently, a replacement is required, since the StableEmbedding layer also adds a layer norm. This is critical for pretraining models. If you are fine-tuning, you do not need the StableEmbedding layer (at least for GLUE; I am not sure about fine-tuning GPT-2 or seq-to-seq models).

If you want to use 32-bit optimizers for the embedding, but without layer norm, you can add the following code right after the embedding layer is created in the GPT2DoubleHeadsModel:

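  # requires: from bitsandbytes.optim import GlobalOptimManager
  self.emb = torch.nn.Embedding(..)
  # keep 32-bit optimizer state for the embedding weight only
  GlobalOptimManager.get_instance().override_config(self.emb.weight, 'optim_bits', 32)
  GlobalOptimManager.get_instance().register_parameters(self.emb.weight)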

This will add further stability to the fine-tuning, especially for seq-to-seq or LM fine-tuning. I would recommend replacing the embedding with the StableEmbedding layer if you do pretraining from scratch.

@TimDettmers TimDettmers added enhancement New feature or request question Further information is requested labels Nov 23, 2021
@TimDettmers
Contributor

A standard Embedding layer has been added that is very easy to use in place of torch.nn.Embedding. The bnb.nn.Embedding class ensures that optimization happens in 32-bit for the embedding layer, even if the rest of the model is optimized with 8-bit optimizers. Thank you for this suggestion!
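
For an existing model from another library, one way to use bnb.nn.Embedding without writing a custom model class is to swap the layers on the instance. The following is only a minimal sketch, not something bitsandbytes provides: swap_children and use_bnb_embeddings are hypothetical helper names, and the sketch assumes the model is a torch.nn.Module whose embedding layers are plain torch.nn.Embedding instances and that the swap happens before the optimizer is created.

```python
def swap_children(module, should_swap, build_replacement):
    """Recursively replace matching child modules in place.

    Hypothetical helper: works on anything exposing named_children(),
    such as torch.nn.Module. Returns the number of replaced children.
    """
    replaced = 0
    # list(...) so the children are not mutated while being iterated
    for name, child in list(module.named_children()):
        if should_swap(child):
            setattr(module, name, build_replacement(child))
            replaced += 1
        else:
            replaced += swap_children(child, should_swap, build_replacement)
    return replaced


def use_bnb_embeddings(model):
    """Replace every plain torch.nn.Embedding in `model` with bnb.nn.Embedding."""
    # Imported here so the generic helper above has no hard dependencies.
    import torch
    import bitsandbytes as bnb

    def build(old):
        new = bnb.nn.Embedding(
            old.num_embeddings, old.embedding_dim, padding_idx=old.padding_idx
        )
        # Carry over the existing (e.g. pretrained) weights.
        new.load_state_dict(old.state_dict())
        return new.to(old.weight.device)

    # Exact type match, so subclasses that were already swapped are skipped.
    return swap_children(model, lambda m: type(m) is torch.nn.Embedding, build)
```

Calling use_bnb_embeddings(model) on a loaded model and then building an 8-bit optimizer over model.parameters() should leave the embedding weights with 32-bit optimizer state. Two caveats under these assumptions: the new layer owns a fresh weight parameter, so a model that ties its embedding to an output layer needs the tie restored afterwards (in transformers, model.tie_weights()), and the same state_dict copy would not work for StableEmbedding, which has extra layer norm parameters.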
