generating embeddings for large code #177

@SamihaShimmi

Hi,
I am trying to get NL-PL embeddings for Java code snippets using CodeBERT. Whenever the token count is larger than 1000, I cannot generate embeddings and get the error message "The expanded size of the tensor (865) must match the existing size (514) at non-singleton dimension 1. Target sizes: [1, 865]. Tensor sizes: [1, 514]". Is there any way to overcome this issue? I cannot find any token-length limit in the paper. Thanks in advance for the help.
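
For anyone hitting the same error: the 514 in the message looks consistent with the size of the RoBERTa-style position-embedding table that CodeBERT is built on, which in practice caps a single input at about 512 tokens including the two special tokens. One common workaround, sketched below and not taken from the CodeBERT authors, is to split the token ids into overlapping windows, embed each window separately, and average the per-window vectors. The function names `split_into_windows` and `mean_pool`, the window size of 510, and the stride of 255 are all hypothetical choices for illustration.

```python
from typing import List


def split_into_windows(token_ids: List[int],
                       max_tokens: int = 510,
                       stride: int = 255) -> List[List[int]]:
    """Split a long token-id sequence into overlapping windows.

    max_tokens=510 is an assumption: it leaves room for two special
    tokens inside an assumed 512-token model limit.
    """
    if max_tokens <= 0 or stride <= 0:
        raise ValueError("max_tokens and stride must be positive")
    if not token_ids:
        return []
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_tokens])
        # Stop once the current window reaches the end of the sequence.
        if start + max_tokens >= len(token_ids):
            break
        start += stride
    return windows


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Average equal-length vectors element-wise (one vector per window)."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Example: a snippet of 865 tokens, as in the error message above.
windows = split_into_windows(list(range(865)))
window_lengths = [len(w) for w in windows]
```

In a real pipeline the token ids would come from the tokenizer, each window would be wrapped in the start and end special tokens before being passed to the model, and `mean_pool` would be applied to the resulting per-window embeddings. Simply truncating the input to the model limit is the simpler alternative, at the cost of ignoring the tail of long snippets.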
