Skip to content

Pytest discovery optimization #72

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Apr 1, 2025
Merged

Conversation

KRRT7
Copy link
Contributor

@KRRT7 KRRT7 commented Mar 25, 2025

User description

with these changes, the test discovery and test processing is almost as fast as multi processing process_test_files but with less overhead

this should supersede #58, this just implements what @alvin-r says there


PR Type

  • Enhancement

Description

  • Enhance parallel test discovery using multiprocessing

  • Add progress bar for discovery feedback

  • Refactor test processing with caching and deduplication

  • Update project version metadata


Changes walkthrough 📝

Relevant files
Enhancement
discover_unit_tests.py
Enhance parallel test discovery and caching logic               

codeflash/discovery/discover_unit_tests.py

  • Introduce module-level regex constants for errors and tests
  • Wrap discovery function in a progress bar for multiprocessing
  • Refactor processing using sets and add goto_cache
  • Optimize parameterized test name handling and deduplication
  • +125/-63
    version.py
    Update project version metadata                                                   

    codeflash/version.py

  • Update version string to 0.10.3.post7.dev0+86aa1cd6
  • Refresh version tuple with detailed parts

  • Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
  • @KRRT7 KRRT7 requested review from alvin-r and misrasaurabh1 and removed request for alvin-r March 25, 2025 23:11
    Copy link

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🎫 Ticket compliance analysis 🔶

    58 - Partially compliant

    Compliant requirements:

    • Enhanced parallel test discovery using multiprocessing
    • Progress bar feedback integration
    • Refactored test processing with caching and deduplication improvements
    • Version metadata update

    Non-compliant requirements:

    Requires further human verification:

    • Verify performance improvements on systems with fewer than 25 files
    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Type Consistency

    Ensure consistent handling of file path types in filtering logic using discover_only_these_tests to avoid mismatches between Path objects and string representations.

            if (
                discover_only_these_tests
                and test_obj.test_file not in discover_only_these_tests
            ):
                continue
            file_to_test_map[test_obj.test_file].append(test_obj)
        # Within these test files, find the project functions they are referring to and return their names/locations
        return process_test_files(file_to_test_map, cfg)
    
    
    def discover_tests_unittest(
        cfg: TestConfig, discover_only_these_tests: list[str] | None = None
    ) -> dict[str, list[FunctionCalledInTest]]:
        tests_root: Path = cfg.tests_root
        loader: unittest.TestLoader = unittest.TestLoader()
        tests: unittest.TestSuite = loader.discover(str(tests_root))
        file_to_test_map: defaultdict[str, list[TestsInFile]] = defaultdict(list)
    
        def get_test_details(_test: unittest.TestCase) -> TestsInFile | None:
            _test_function, _test_module, _test_suite_name = (
                _test._testMethodName,
                _test.__class__.__module__,
                _test.__class__.__qualname__,
            )
    
            _test_module_path = Path(_test_module.replace(".", os.sep)).with_suffix(".py")
            _test_module_path = tests_root / _test_module_path
            if not _test_module_path.exists() or (
                discover_only_these_tests
                and str(_test_module_path) not in discover_only_these_tests
            ):
    Caching Robustness

    Verify the caching mechanism in process_test_files, particularly the use of goto_cache and cache key construction, to ensure it handles definitions reliably.

    cache_key = (name.full_name, name.module_name)
    try:
        if cache_key in goto_cache:
            definition = goto_cache[cache_key]
        else:
            definition = name.goto(
                follow_imports=True, follow_builtin_imports=False
            )
            goto_cache[cache_key] = definition
    except Exception as e:
        logger.debug(str(e))
    Parameter Extraction

    Validate the splitting logic for parameterized test names (using PYTEST_PARAMETERIZED_TEST_NAME_REGEX) to handle cases with unexpected formats gracefully.

    function_name = PYTEST_PARAMETERIZED_TEST_NAME_REGEX.split(
        function.test_function
    )[0]
    parameters = PYTEST_PARAMETERIZED_TEST_NAME_REGEX.split(
        function.test_function
    )[1]

    Copy link

    PR Code Suggestions ✨

    No code suggestions found for the PR.

    misrasaurabh1
    misrasaurabh1 previously approved these changes Mar 26, 2025
    Copy link
    Contributor

    @misrasaurabh1 misrasaurabh1 left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    great work, thank you!

    @KRRT7 KRRT7 requested a review from misrasaurabh1 March 26, 2025 04:52
    @alvin-r alvin-r mentioned this pull request Mar 26, 2025
    Revert "implement suggestions"
    
    This reverts commit 8bd8068.
    
    first pass
    @KRRT7 KRRT7 force-pushed the pytest-discovery-optimization branch from 8bd8068 to ad9b306 Compare March 28, 2025 06:41
    @misrasaurabh1
    Copy link
    Contributor

    @KRRT7 you mentioned that there were some cases where you found the behavior is different. Can you make test cases for the failing cases and push here? Also, lets resolve and merge this quickly

    @KRRT7 KRRT7 requested a review from alvin-r April 1, 2025 18:02
    @KRRT7
    Copy link
    Contributor Author

    KRRT7 commented Apr 1, 2025

    @alvin-r we can merge this as is, this doesn't cause the issues I commented on elsewhere and such can be merged cleanly, this already has a major performance boost, there's still work to be done but for now it's fine.

    Copy link
    Contributor

    @alvin-r alvin-r left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    lets go

    @alvin-r alvin-r enabled auto-merge April 1, 2025 22:59
    @alvin-r alvin-r disabled auto-merge April 1, 2025 23:06
    @alvin-r alvin-r merged commit 01b7e94 into main Apr 1, 2025
    15 checks passed
    @KRRT7 KRRT7 deleted the pytest-discovery-optimization branch April 1, 2025 23:13
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    3 participants