
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle) #31

Open
ScottishFold007 opened this issue Mar 8, 2022 · 4 comments

Comments

@ScottishFold007

ScottishFold007 commented Mar 8, 2022

I'm trying to refactor your model on top of the Hugging Face transformers library, but I keep hitting the same error no matter what I try, and I'm out of ideas.
[screenshot of the RuntimeError traceback]

```python
from typing import Optional

import torch
from transformers import GPT2LMHeadModel, PreTrainedModel

# TransformerMapper is the mapper class from the original CLIP_prefix_caption code,
# defined elsewhere in the notebook.


class ClipCaptionModel(PreTrainedModel):
    def __init__(self, config):
        super(ClipCaptionModel, self).__init__(config)
        self.prefix_length = config.prefix_length
        self.clip_length = config.clip_length
        self.prefix_size = config.prefix_size
        self.num_layers = config.num_layers
        self.mapping_type = config.mapping_type
        self.decoder = config.decoder
        self.gpt = GPT2LMHeadModel.from_pretrained('uer/gpt2-chinese-cluecorpussmall')
        self.gpt_embedding_size = self.gpt.transformer.wte.weight.shape[1]
        # Projects a CLIP image feature of size prefix_size to prefix_length GPT-2 embeddings.
        self.clip_project = TransformerMapper(self.prefix_size, self.gpt_embedding_size,
                                              self.prefix_length, self.clip_length,
                                              self.num_layers)  # (512, 768, 10, 8)
        print(self.prefix_size, self.gpt_embedding_size, self.prefix_length,
              self.clip_length, self.num_layers)

    def get_dummy_token(self, batch_size: int, device: torch.device) -> torch.Tensor:
        return torch.zeros(batch_size, self.prefix_length, dtype=torch.int64, device=device)

    def forward(self,
                tokens: torch.Tensor,
                prefix: torch.Tensor,
                mask: Optional[torch.Tensor] = None,
                labels: Optional[torch.Tensor] = None):
        embedding_text = self.gpt.transformer.wte(tokens)
        print(prefix.shape)
        prefix_projections = self.clip_project(prefix).view(-1, self.prefix_length, self.gpt_embedding_size)
        # Prepend the projected prefix to the caption token embeddings.
        embedding_cat = torch.cat((prefix_projections, embedding_text), dim=1)
        if labels is not None:
            dummy_token = self.get_dummy_token(tokens.shape[0], tokens.device)
            labels = torch.cat((dummy_token, tokens), dim=1)
        out = self.gpt(inputs_embeds=embedding_cat, labels=labels, attention_mask=mask)
        return out


class ClipCaptionPrefix(ClipCaptionModel):

    def parameters(self, recurse: bool = True):
        # Only the mapper is trained; GPT-2 stays frozen.
        return self.clip_project.parameters()

    def train(self, mode: bool = True):
        super(ClipCaptionPrefix, self).train(mode)
        self.gpt.eval()
        return self
```
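
For context, the code above expects a config object carrying those custom fields. A minimal sketch of what such a config might look like (this class and its default values are assumptions, loosely based on the `(512, 768, 10, 8)` comment above, not code from the notebook):

```python
from transformers import PretrainedConfig

# Hypothetical config class; field names mirror what ClipCaptionModel reads,
# defaults are guesses based on the printed values in the model code above.
class ClipCaptionConfig(PretrainedConfig):
    model_type = "clip_caption"  # made-up identifier

    def __init__(self, prefix_length=10, clip_length=10, prefix_size=512,
                 num_layers=8, mapping_type="transformer", decoder=None, **kwargs):
        super().__init__(**kwargs)
        self.prefix_length = prefix_length
        self.clip_length = clip_length
        self.prefix_size = prefix_size
        self.num_layers = num_layers
        self.mapping_type = mapping_type
        self.decoder = decoder
```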
Here is the notebook on Colab: https://colab.research.google.com/drive/1sEg9HbDwRPs9_SNVjjsPE_sk449P9Svc#scrollTo=3pP_n5oQrXPg&uniqifier=1
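
The cuBLAS allocation message often masks the real problem, because GPU errors are reported asynchronously. One way to surface the underlying shape or index error is to run a single batch on the CPU; this is only a sketch, assuming `model`, `tokens`, `prefix` and `mask` are the objects from the training loop in the notebook:

```python
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # if you stay on GPU, makes errors point at the failing op

# Reproduce one forward pass on CPU, where PyTorch prints the actual
# shape/index error (e.g. a mat1/mat2 mismatch in a Linear) instead of
# CUBLAS_STATUS_ALLOC_FAILED.
model_cpu = model.cpu()
out = model_cpu(tokens.cpu(), prefix.cpu(),
                mask.cpu() if mask is not None else None)
```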

@HalimSD

HalimSD commented Mar 15, 2022

I think you're feeding the linear layer a tensor whose feature dimension is 512 — that's the prefix you produced during preprocessing when you parsed the data, since CLIP's encode_image function returns 512-dimensional features — but you're trying to train the model with a prefix size of 640. The two dimensions have to match.
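
A quick way to confirm that diagnosis (a sketch using the official openai `clip` package; `ViT-B/32` is an assumption, substitute whichever variant produced your training features):

```python
import torch
import clip  # https://github.com/openai/CLIP

# Load on CPU so the weights stay float32; ViT-B/32 features are 512-dim,
# RN50x4 features are 640-dim.
clip_model, preprocess = clip.load("ViT-B/32", device="cpu")

with torch.no_grad():
    dummy = torch.zeros(1, 3, 224, 224)
    feat_dim = clip_model.encode_image(dummy).shape[-1]

print("CLIP image feature dim:", feat_dim)  # 512 here
# prefix_size passed to TransformerMapper must equal this number, and the
# saved dataset features must come from the same CLIP variant.
```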

@ScottishFold007
Author

> prefix size

Did you come to that conclusion from reading the Colab notebook above? I've already changed the prefix size to 512, but I still get this error. Do you have any good solution?
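
One check worth running here, since the notebook isn't publicly accessible, is to look at what actually reaches the mapper at train time; the names below (`train_dataloader`, the batch layout) are assumptions modelled on the original repo, not code from the thread:

```python
# Pull one batch and compare its feature width with the mapper's input size.
tokens, mask, prefix = next(iter(train_dataloader))  # assumed batch layout
print(prefix.shape)        # last dim should be 512 if the data was encoded with ViT-B/32
print(model.clip_project)  # the first Linear's in_features must equal prefix.shape[-1]
```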

@HalimSD

HalimSD commented Mar 15, 2022

I don’t have access to your notebook.
I came to that conclusion because I'm facing the exact same error and came here to open a similar issue.

> Do you have any good solution?

No

@MachineLearning11

> (quotes the original post and code above)

Have you solved this yet? I'm running into the same problem as you — it also fails at the linear layer.
