Add ExtendedTestBase shared base class for benchmark and functional tests #3689

Merged
lajagapp merged 7 commits into main from users/lajagapp/add-test-helper
Mar 10, 2026

Conversation

@lajagapp
Contributor

Summary

Extracts shared test infrastructure from BenchmarkBase into a new ExtendedTestBase base class, reducing code duplication between benchmark and functional tests.
This addresses the reviewer feedback on #3648 about sharing common logic between benchmark_base.py and functional_base.py.

Changes

New: utils/extended_test_base.py

  • Created the ExtendedTestBase class with the following shared infrastructure:
    execute_command(), get_gpu_architecture(), detect_gpu_count(), create_test_result(), calculate_statistics(), upload_results()
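
As a rough illustration of the kind of helper being shared, here is a minimal sketch of what execute_command() might look like. Only the method name and the closing `process.wait()` / `return process.returncode` lines appear in this PR; the parameters, the output streaming, and the standalone-function form are assumptions.

```python
import subprocess
import sys
from typing import List, Optional


def execute_command(cmd: List[str], cwd: Optional[str] = None) -> int:
    """Run a command, echo its output line by line, and return the exit code.

    Hypothetical stand-in for ExtendedTestBase.execute_command(); the real
    signature and logging behaviour are not shown in this PR.
    """
    process = subprocess.Popen(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
    )
    # Stream output as it arrives so long-running tests stay observable.
    for line in process.stdout:
        print(line.rstrip())
    process.wait()
    return process.returncode


if __name__ == "__main__":
    # sys.executable keeps the demo independent of what is on PATH.
    exit_code = execute_command([sys.executable, "-c", "print('hello')"])
    print(f"exit code: {exit_code}")
```

Returning the exit code rather than raising leaves each test free to decide whether a non-zero status is a failure.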

Updated: benchmark/scripts/benchmark_base.py

  • BenchmarkBase now inherits from ExtendedTestBase
  • Removed duplicated methods: execute_command, _detect_gpu_count, calculate_statistics, upload_results

Updated: benchmark/scripts/test_rccl_benchmark.py

  • self._detect_gpu_count() → self.detect_gpu_count() (now inherited from base)

Updated: utils/__init__.py, README.md, utils/README.md

  • Added ExtendedTestBase to exports and documentation

Inheritance Hierarchy

ExtendedTestBase (utils/extended_test_base.py)
├── BenchmarkBase (benchmark/scripts/benchmark_base.py)
│   ├── ROCfftBenchmark, RCCLBenchmark, ROCblasBenchmark, ...
└── FunctionalBase (functional/scripts/functional_base.py)  ← will inherit in follow-up PR
    ├── MIOpenDriverConv, RcclTestInfra, ...
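
The hierarchy above can be sketched roughly as follows. The class names and the `__init__` signature come from this PR; the method bodies, the `summarize` helper, and the constructor arguments are hypothetical and only show how subclasses pick up the shared methods.

```python
import statistics
from typing import Dict, List, Optional


class ExtendedTestBase:
    """Shared infrastructure (names from the PR; bodies are illustrative)."""

    def __init__(self, test_name: str, display_name: Optional[str] = None):
        self.test_name = test_name
        self.display_name = display_name or test_name

    def calculate_statistics(self, values: List[float]) -> Dict[str, float]:
        # Assumed result shape; the real method may report other fields.
        return {
            "min": min(values),
            "max": max(values),
            "mean": statistics.mean(values),
        }


class BenchmarkBase(ExtendedTestBase):
    """Benchmark-specific behaviour layered on the shared base."""

    def summarize(self, timings_ms: List[float]) -> str:  # hypothetical helper
        stats = self.calculate_statistics(timings_ms)  # inherited method
        return f"{self.display_name}: mean={stats['mean']:.1f} ms"


class RCCLBenchmark(BenchmarkBase):
    def __init__(self):
        super().__init__("rccl", "RCCL Benchmark")  # assumed names


if __name__ == "__main__":
    print(RCCLBenchmark().summarize([10.0, 20.0]))
```

Under this layout, FunctionalBase could later inherit from the same base without BenchmarkBase changing.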

Follow-up

  • Update FunctionalBase to inherit from ExtendedTestBase (after this PR merges)

…ests

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>
Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>
Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>
Contributor

@geomin12 geomin12 left a comment

letting other reviewers review

@lajagapp lajagapp requested a review from geomin12 March 4, 2026 18:06
Contributor

@geomin12 geomin12 left a comment

letting other folks review - if this isn't reviewed till fri, i will take a look

Contributor

@geomin12 geomin12 left a comment

lgtm, but can we post some logs making sure this works?

@lajagapp
Contributor Author

lajagapp commented Mar 5, 2026

lgtm, but can we post some logs making sure this works?

Triggered nightly CI benchmark runs with this change: https://github.com/ROCm/TheRock/actions/runs/22701813100/job/65820378187

@lajagapp lajagapp requested a review from geomin12 March 5, 2026 10:03
Contributor

@geomin12 geomin12 left a comment

letting other folks review - if this isn't reviewed till fri, i will take a look

@lajagapp lajagapp requested a review from geomin12 March 7, 2026 02:54
"""Base class providing shared infrastructure for benchmark and functional tests."""

def __init__(self, test_name: str, display_name: str = None):
"""Initialize common test infrastructure.
Contributor

Suggested change
- """Initialize common test infrastructure.
+ """Initialize common extended test infrastructure.

Contributor Author

Updated docstring as suggested.



class ExtendedTestBase:
"""Base class providing shared infrastructure for benchmark and functional tests."""
Contributor

Suggested change
- """Base class providing shared infrastructure for benchmark and functional tests."""
+ """Base class providing shared infrastructure for extended tests."""

Contributor Author

Updated docstring.

process.wait()
return process.returncode

def get_gpu_architecture(self) -> str:
Contributor

is this needed, seems identical to get_first_gpu_architecture

Contributor Author

No, it isn't needed. get_gpu_architecture was only a wrapper around get_first_gpu_architecture, so I've removed it.

"Ensure ROCm drivers are installed and GPU is accessible."
) from e

def detect_gpu_count(self) -> int:
Contributor

identical to get_visible_gpu_count, can be removed as it is part of the shared utils lib

Contributor Author

No, it isn't needed. detect_gpu_count was only a wrapper around get_visible_gpu_count that added logging, so I've removed it and updated the benchmark script to call get_visible_gpu_count directly.

) -> Dict[str, Any]:
"""Create a standardized test result dictionary.

Builds the base result structure used by both benchmark and functional tests.
Contributor

Suggested change
- Builds the base result structure used by both benchmark and functional tests.
+ Builds the base result structure used by extended tests.

Contributor Author

Updated docstring as suggested.

Comment on lines +180 to +183
if key == "pass":
    key = "passed"
elif key == "fail":
    key = "failed"
Contributor

can we just update the keys to be passed or failed? not sure why we do this conversion

Contributor Author

Yes, removed now. This conversion logic isn't really needed here.
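
One way to read this thread: if producers emit "passed" and "failed" directly, the consumer needs no translation step and can validate the keys instead. A minimal sketch, assuming those two strings are the whole status vocabulary; `summarize_statuses` and `VALID_STATUSES` are hypothetical names, not part of this PR.

```python
from typing import Dict, Iterable

# Assumed status vocabulary; the PR only confirms "passed" and "failed".
VALID_STATUSES = ("passed", "failed")


def summarize_statuses(statuses: Iterable[str]) -> Dict[str, int]:
    """Count results by status, rejecting unknown values instead of renaming them."""
    counts = {status: 0 for status in VALID_STATUSES}
    for status in statuses:
        if status not in counts:
            # Fail loudly so a stray "pass"/"fail" is fixed at the source.
            raise ValueError(f"unexpected status: {status!r}")
        counts[status] += 1
    return counts


if __name__ == "__main__":
    print(summarize_statuses(["passed", "failed", "passed"]))
```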

metadata.update(extra_metadata)

log.info(f"Uploading {test_type.title()} Test Results to API")
success = self.client.upload_results(
Contributor

i believe the benchmark tests are passing but we still get the 403 error. can we remove this or comment this out until it is fixed?

open a GH issue with error logs and comment this out with a TODO?

Contributor Author

Yes, this needs changes in LKG comparison and result handling. Can I create a separate PR for this, so that this PR, which is blocking other reviews, can be reviewed and merged quickly?

I've created a GitHub issue for this: #3850. I will create a new PR and add it for review.

Contributor Author

Created new PR to temporarily disable the API upload - #3853.

if success:
    log.info("Results uploaded successfully")
else:
    log.info("Results saved locally only (API upload disabled or failed)")
Contributor

can we add error log? this is a pretty vague message and if i got an error, i would want to see more than this

Contributor Author

I've updated it to log a generic error message for now. I'll handle detailed errors in the new PR that disables the API upload.
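
For reference, the more detailed error logging requested here could look something like the sketch below. It is not the code from this PR or the follow-up: `upload_with_logging` is a hypothetical wrapper, and `upload` stands in for the client's upload call, assumed to return True on success and possibly raise on transport errors.

```python
import logging
from typing import Any, Callable, Dict

log = logging.getLogger(__name__)


def upload_with_logging(
    upload: Callable[[Dict[str, Any]], bool], payload: Dict[str, Any]
) -> bool:
    """Call `upload` and log why it failed instead of a single vague message."""
    try:
        success = upload(payload)
    except Exception:
        # log.exception records the message at ERROR level plus the traceback.
        log.exception("Results upload raised an error; results saved locally only")
        return False
    if success:
        log.info("Results uploaded successfully")
    else:
        # A False return carries no details, so at least raise the severity.
        log.error("Results upload was rejected; results saved locally only")
    return success


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    upload_with_logging(lambda payload: False, {"test": "demo"})
```

Separating the "raised" and "returned False" paths keeps the traceback available when there is one, which addresses the concern about the message being too vague to debug.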

Comment thread tests/extended_tests/utils/README.md Outdated
## Usage

- ### From Benchmark Scripts
+ ### From Test Base Classes
Contributor

Suggested change
- ### From Test Base Classes
+ ### From Extended Test Base Classes

Contributor Author

Updated

lajagapp added 3 commits March 9, 2026 17:20
- Remove get_gpu_architecture & detect_gpu_count and use shared utils lib
- Update docstrings

Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>
Signed-off-by: Lenine Ajagappane <Lenine.Ajagappane@amd.com>
@lajagapp lajagapp requested a review from geomin12 March 9, 2026 17:56
Contributor

@geomin12 geomin12 left a comment

lot cleaner, thanks for updates

@lajagapp lajagapp merged commit c6bc478 into main Mar 10, 2026
108 of 115 checks passed
@lajagapp lajagapp deleted the users/lajagapp/add-test-helper branch March 10, 2026 01:09
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Mar 10, 2026

2 participants