Generating embeddings for large code #177

Hi,
I am trying to get NL-PL embeddings for Java code snippets using CodeBERT. Whenever the token count is larger than 1000, I cannot generate embeddings and get the error message "The expanded size of the tensor (865) must match the existing size (514) at non-singleton dimension 1.  Target sizes: [1, 865].  Tensor sizes: [1, 514]". Is there any way to overcome this issue? I could not find any token size limitation in the paper. Thanks in advance for the help.
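
For reference, the 514 in the error message suggests the model has a fixed number of position slots, so an 865-token input cannot fit in a single pass. Below is a minimal sketch of one possible workaround, not a confirmed CodeBERT recipe: split the token ids into overlapping windows that fit the limit, embed each window separately, and average the resulting vectors. The helper names `split_into_windows` and `mean_pool`, the window budget of 510 (assuming 512 usable positions minus two special tokens), and the overlap of 128 are all assumptions chosen for illustration.

```python
from typing import List


def split_into_windows(token_ids: List[int],
                       max_len: int = 510,
                       overlap: int = 128) -> List[List[int]]:
    """Split a long token-id sequence into overlapping windows.

    max_len=510 is an assumed budget (512 positions minus two special
    tokens); adjust it to whatever limit your model actually has.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("need max_len > 0 and 0 <= overlap < max_len")
    step = max_len - overlap  # how far each new window advances
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # this window reached the end of the sequence
        start += step
    return windows


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Average per-window embedding vectors into one snippet-level vector."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical usage: 865 token ids, as in the error message above.
token_ids = list(range(865))
windows = split_into_windows(token_ids)
print([len(w) for w in windows])
```

Each window would then be wrapped with the model's special tokens and passed through the model on its own, with the per-window vectors combined by `mean_pool`. Averaging is only one choice; simply truncating the input to the first window is the simpler alternative when the start of the snippet carries most of the signal.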
