Reset parameters for the ESM-2 contact head on HF export #983

Merged
pstjohn merged 1 commit into NVIDIA-BioNeMo:main from pstjohn:pstjohn/init-esm2-contact-head
Jul 24, 2025

@pstjohn pstjohn commented Jul 16, 2025

Previously, the contact head's weight was never initialized on HF export, so it held whatever arbitrary values torch.empty() left in memory. If we want to use this layer in downstream tasks, we should at least randomly initialize these weights to reasonable values.
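For illustration only, here is a minimal sketch of the problem and one possible fix. It is not the code from this PR: the layer shape, the `reset_head_` helper, and the 0.02 standard deviation are all assumptions chosen for the example.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the random init reproducible for this example

# Assumption: the contact head is a single-output linear layer.
# The input size here is illustrative only.
in_features = 8
head = nn.Linear(in_features, 1)

# Mimic the pre-fix state: the parameter is backed by torch.empty(), so its
# contents are whatever happened to be in that memory (possibly inf or NaN).
head.weight = nn.Parameter(torch.empty(1, in_features))


def reset_head_(layer: nn.Linear, std: float = 0.02) -> None:
    """Hypothetical helper: small random weights and a zero bias, in place."""
    with torch.no_grad():
        nn.init.normal_(layer.weight, mean=0.0, std=std)
        if layer.bias is not None:
            nn.init.zeros_(layer.bias)


reset_head_(head)
print(head.weight)  # finite, small random values instead of arbitrary memory
```

Calling `head.reset_parameters()` would instead apply PyTorch's default `nn.Linear` initialization; which scheme is most suitable for downstream fine-tuning is a judgment call.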

Signed-off-by: Peter St. John <pstjohn@nvidia.com>

@jomitchellnv jomitchellnv left a comment

LGTM

Comment thread sub-packages/bionemo-esm2/tests/bionemo/esm2/model/test_convert.py
@pstjohn pstjohn added this pull request to the merge queue Jul 24, 2025
Merged via the queue into NVIDIA-BioNeMo:main with commit 95f7e62 Jul 24, 2025
16 checks passed
@pstjohn pstjohn deleted the pstjohn/init-esm2-contact-head branch July 24, 2025 22:27
edawson pushed a commit that referenced this pull request Jul 28, 2025

3 participants