Remove strict comparison of tensors against golden values in evo2 test #901

Merged: balvisio merged 1 commit into main from ba/BIONEMO-1902-remove-strict-comparison-in-evo2-test on Jun 4, 2025

Conversation

Collaborator

@balvisio balvisio commented May 30, 2025

Description

PyTorch 25.04 ships with CUDA 12.9:

Build cuda_12.9.r12.9/compiler.35813241_0

whereas PyTorch 25.01 ships with CUDA 12.8:

Cuda compilation tools, release 12.8, V12.8.61
Build cuda_12.8.r12.8/compiler.35404655_0

This appears to cause small numerical differences, so the strict comparison of tensor values against golden values fails.
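
To illustrate why an exact comparison is brittle here, the following is a minimal sketch in plain Python. It is not the code changed in this PR: the helper names `strictly_equal` and `close_enough`, the tolerance values, and the sample numbers are all hypothetical, and the PR itself removes the strict check rather than necessarily replacing it with a tolerance.

```python
import math


def strictly_equal(actual, golden):
    """Exact element-wise equality; any floating-point drift makes this fail."""
    return len(actual) == len(golden) and all(a == g for a, g in zip(actual, golden))


def close_enough(actual, golden, rel_tol=1e-3, abs_tol=1e-5):
    """Element-wise comparison within a tolerance (tolerance values are assumptions)."""
    return len(actual) == len(golden) and all(
        math.isclose(a, g, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, g in zip(actual, golden)
    )


# Made-up numbers standing in for golden values recorded under one CUDA
# version and outputs observed under another.
golden = [0.1234567, -2.5000000, 7.25]
observed = [0.1234571, -2.5000003, 7.25]

print(strictly_equal(observed, golden))  # False
print(close_enough(observed, golden))    # True
```

A tolerance-based check like this survives small kernel-level differences, but it says nothing about whether a downstream metric is preserved, which is a separate question.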

Type of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Refactor
  • [ ] Documentation update
  • [x] Other (please describe):
    Remove a strict tensor comparison that doesn't hold across CUDA versions.

CI Pipeline Configuration

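Configure CI behavior by applying the relevant labels:

  • SKIP_CI - Skip all continuous integration tests
  • INCLUDE_NOTEBOOKS_TESTS - Execute notebook validation tests in pytest
  • INCLUDE_SLOW_TESTS - Execute tests labelled as slow in pytest for extensive testing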

Note

By default, the notebooks validation tests are skipped unless explicitly enabled.

Authorizing CI Runs

We use copy-pr-bot to manage authorization of CI runs on NVIDIA's compute resources.

  • If a pull request is opened by a trusted user and contains only trusted changes, the pull request's code will
    automatically be copied to a pull-request/ prefixed branch in the source repository (e.g. pull-request/123)
  • If a pull request is opened by an untrusted user or contains untrusted changes, an NVIDIA org member must leave an
    /ok to test comment on the pull request to trigger CI. This will need to be done for each new commit.

Usage

TODO: Add code snippet

Pre-submit Checklist

  • [x] I have tested these changes locally
  • [x] I have updated the documentation accordingly
  • [x] I have added/updated tests as needed
  • [ ] All existing tests pass successfully

copy-pr-bot Bot commented May 30, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Collaborator

@trvachov trvachov left a comment

I think this is fine, but can we include the "add notebook golden value comparisons" change in this PR as well? Or has that already been merged?

Collaborator

@jstjohn jstjohn left a comment

Please see #905 which I think should also land. We need to check that changes work with a downstream task such as AUC on BRCA1. cc @dorotat-nv @trvachov.

@balvisio balvisio force-pushed the ba/BIONEMO-1902-remove-strict-comparison-in-evo2-test branch from 41475f9 to d70f94c Compare May 30, 2025 18:25
@balvisio
Collaborator Author

/ok to test d70f94c

codecov-commenter commented May 30, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.19%. Comparing base (f10e02c) to head (46fa3ab).
Report is 14 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #901      +/-   ##
==========================================
- Coverage   84.21%   84.19%   -0.02%     
==========================================
  Files         143      143              
  Lines        9044     9044              
==========================================
- Hits         7616     7615       -1     
- Misses       1428     1429       +1     

see 1 file with indirect coverage changes

@balvisio balvisio force-pushed the ba/BIONEMO-1902-remove-strict-comparison-in-evo2-test branch from 0385690 to 7e0158c Compare May 30, 2025 21:39
@balvisio
Collaborator Author

/ok to test 7e0158c

Collaborator Author

balvisio commented Jun 2, 2025

@trvachov and @dorotat-nv ping for review. Thank you!

Comment thread sub-packages/bionemo-evo2/tests/bionemo/evo2/test_evo2.py
@balvisio balvisio force-pushed the ba/BIONEMO-1902-remove-strict-comparison-in-evo2-test branch from 7e0158c to 6b01b17 Compare June 3, 2025 22:34
@balvisio balvisio enabled auto-merge June 3, 2025 22:34
@balvisio balvisio force-pushed the ba/BIONEMO-1902-remove-strict-comparison-in-evo2-test branch 3 times, most recently from afe572e to d413a5e Compare June 4, 2025 05:02
@balvisio balvisio added this pull request to the merge queue Jun 4, 2025
@balvisio balvisio removed this pull request from the merge queue due to a manual request Jun 4, 2025
Signed-off-by: Bruno Alvisio <balvisio@nvidia.com>
@balvisio balvisio force-pushed the ba/BIONEMO-1902-remove-strict-comparison-in-evo2-test branch from d413a5e to 46fa3ab Compare June 4, 2025 15:03
Collaborator Author

balvisio commented Jun 4, 2025

/ok to test 46fa3ab

@balvisio balvisio enabled auto-merge June 4, 2025 15:04
@balvisio balvisio added this pull request to the merge queue Jun 4, 2025
Merged via the queue into main with commit da80d3f Jun 4, 2025
10 checks passed
@balvisio balvisio deleted the ba/BIONEMO-1902-remove-strict-comparison-in-evo2-test branch June 4, 2025 17:42
camirr-nv pushed a commit that referenced this pull request Jun 26, 2025
#901)

### Description
<!-- Provide a detailed description of the changes in this PR -->
PyTorch 25.04 ships with CUDA 12.9:
```
Build cuda_12.9.r12.9/compiler.35813241_0
```

whereas PyTorch 25.01 ships with CUDA 12.8:
```
Cuda compilation tools, release 12.8, V12.8.61
Build cuda_12.8.r12.8/compiler.35404655_0
```

This appears to cause small numerical differences, so the strict
comparison of tensor values against golden values fails.

### Type of changes
<!-- Mark the relevant option with an [x] -->

- [ ]  Bug fix (non-breaking change which fixes an issue)
- [ ]  New feature (non-breaking change which adds functionality)
- [ ]  Refactor
- [ ]  Documentation update
- [x]  Other (please describe):
Remove a strict tensor comparison that doesn't hold across CUDA versions.

### CI Pipeline Configuration
Configure CI behavior by applying the relevant labels:

- [SKIP_CI](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/user-guide/contributing/contributing.md#skip_ci) - Skip all continuous integration tests
- [INCLUDE_NOTEBOOKS_TESTS](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/user-guide/contributing/contributing.md#include_notebooks_tests) - Execute notebook validation tests in pytest
- [INCLUDE_SLOW_TESTS](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/user-guide/contributing/contributing.md#include_slow_tests) - Execute tests labelled as slow in pytest for extensive testing

> [!NOTE]
> By default, the notebooks validation tests are skipped unless
explicitly enabled.

#### Authorizing CI Runs

We use
[copy-pr-bot](https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/#automation)
to manage authorization of CI
runs on NVIDIA's compute resources.

* If a pull request is opened by a trusted user and contains only
trusted changes, the pull request's code will
automatically be copied to a pull-request/ prefixed branch in the source
repository (e.g. pull-request/123)
* If a pull request is opened by an untrusted user or contains untrusted
changes, an NVIDIA org member must leave an
`/ok to test` comment on the pull request to trigger CI. This will need
to be done for each new commit.

### Usage
<!--- How does a user interact with the changed code -->
```python
TODO: Add code snippet
```

### Pre-submit Checklist
<!--- Ensure all items are completed before submitting -->

 - [x] I have tested these changes locally
 - [x] I have updated the documentation accordingly
 - [x] I have added/updated tests as needed
 - [ ] All existing tests pass successfully

Signed-off-by: Bruno Alvisio <balvisio@nvidia.com>
Signed-off-by: Ubuntu <camirr@nvidia.com>
5 participants