@lufftw lufftw commented Dec 12, 2025

🛠️ Summary: Fix Judge Output Parsing and Complete Polymorphic Contract Migration

The changes improve test robustness and documentation clarity while preserving all existing solution algorithms.

Changes

Judge Output Normalization

The test framework parses solution outputs using ast.literal_eval(), which converts single-value outputs such as "1" into int. Several JUDGE_FUNC implementations assumed outputs were str or list only, causing failures in edge cases.

Fixes applied:

  • Add isinstance(actual, int) handling in affected JUDGE_FUNC implementations
  • Normalize single integers into single-element lists: [actual]
  • Improve input parsing to correctly preserve empty lines
  • No changes to any solution algorithms
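
The normalization described in this list can be sketched as follows. This is a minimal illustration rather than the repository's actual code: the helper name `normalize_actual` and the two-argument `judge` signature are assumptions, and only the `isinstance(actual, int)` check and the `[actual]` wrapping come from the description above.

```python
import ast


def normalize_actual(actual):
    """Coerce a judge argument into a list (hypothetical helper)."""
    if isinstance(actual, str):
        # Raw text that has not been parsed yet: read it as a Python literal.
        actual = ast.literal_eval(actual)
    if isinstance(actual, int):
        # A single-value output such as "1" arrives as int; wrap it.
        return [actual]
    return list(actual)


def judge(actual, expected):
    """Hypothetical JUDGE_FUNC-style comparison on normalized values."""
    return normalize_actual(actual) == normalize_actual(expected)
```

With this shape, a single-element result compares equal whether it reaches the judge as `1`, `"1"`, or `[1]`.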

Affected solutions:

  • 0021_merge_two_sorted_lists: Handle int input and fix empty line parsing
  • 0027_remove_element: Handle int input for k=0 single-line output
  • 0075_sort_colors: Handle int input for single-element arrays
  • 0088_merge_sorted_array: Handle int input and fix empty array parsing
  • 0283_move_zeroes: Handle int input for single-element arrays
  • 0876_middle_of_the_linked_list: Handle int input for single-node lists
  • 0905_sort_array_by_parity: Handle int input for single-element arrays
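
The empty-line fix mentioned for 0021_merge_two_sorted_lists can be illustrated with a small sketch. The input layout (one comma-separated list per line, with an empty line meaning an empty list) and the name `parse_list_lines` are assumptions made for illustration; the real test file format is not shown in this PR description.

```python
def parse_list_lines(raw):
    """Parse one list per line; an empty line is an empty list (assumed format)."""
    # Drop only the final line terminator so trailing empty lines survive.
    if raw.endswith("\n"):
        raw = raw[:-1]
    # split("\n") keeps empty lines, whereas split() would silently drop
    # them and shift every later list up by one position.
    return [
        [int(tok) for tok in line.split(",")] if line.strip() else []
        for line in raw.split("\n")
    ]
```

The design point is that an empty line carries meaning here, so whitespace-collapsing helpers such as `str.split()` with no arguments are unsuitable for this kind of input.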

Architecture and Contract Updates

Complete the transition from the deprecated wrapper-based solution pattern to the polymorphic solution contract.

Key updates:

  • Add docs/solution_contract.md as the canonical specification for:
    • Solution file structure (SOLUTIONS + get_solver())
    • SOLUTIONS metadata schema (required class and method)
    • Judge and validation contract (JUDGE_FUNC, COMPARE_MODE priority)
    • Static and generated test requirements
    • Metadata consistency checklist
  • Update solution templates:
    • template_solution.py: Polymorphic single-solution template
    • template_solution_multi.py: Multiple classes with shared method name
    • Remove deprecated wrapper template and --wrapper CLI option
  • Update scripts:
    • new_problem.bat / new_problem.sh: Remove --wrapper option
    • run_tests.bat: Fix argument forwarding using %*
  • Update README.md and README_zh-TW.md to reflect the new architecture
  • Mark Phase 4 as complete in ARCHITECTURE_MIGRATION.md
  • Remove temporary debug files (test_debug.py, test_output.py)
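
For readers unfamiliar with the pattern, a polymorphic solution file might look roughly like the sketch below. The description confirms only that `SOLUTIONS` entries require a class and a method and that solution files use `get_solver()`; the dictionary keys, the `get_solver` signature shown here, the `_CLASSES` registry, the class names, and the choice of problem are all assumptions. `docs/solution_contract.md` remains the authoritative specification.

```python
class SolutionTwoPointers:
    def removeElement(self, nums, val):
        # Overwrite kept values from the front, preserving their order.
        k = 0
        for x in nums:
            if x != val:
                nums[k] = x
                k += 1
        return k


class SolutionSwap:
    def removeElement(self, nums, val):
        # Replace each match with the last unprocessed element.
        i, n = 0, len(nums)
        while i < n:
            if nums[i] == val:
                nums[i] = nums[n - 1]
                n -= 1
            else:
                i += 1
        return n


# Assumed schema: each entry names a class and the shared method.
SOLUTIONS = {
    "default": {"class": "SolutionTwoPointers", "method": "removeElement"},
    "swap": {"class": "SolutionSwap", "method": "removeElement"},
}

# Hypothetical lookup table from class name to class object.
_CLASSES = {cls.__name__: cls for cls in (SolutionTwoPointers, SolutionSwap)}


def get_solver(name="default"):
    """Return the bound method for the chosen solution (assumed signature)."""
    info = SOLUTIONS[name]
    return getattr(_CLASSES[info["class"]](), info["method"])
```

Because every class exposes the same method name, a test runner can switch implementations by key without a per-solution wrapper function.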

Testing

  • Static tests: 33 / 33 passing
  • Generated tests: 59 / 59 passing
  • Combined tests: 99 / 99 passing
  • 7 problems skipped (no static tests)
  • No regressions observed

Root Cause

The compare module parses outputs with ast.literal_eval() before passing them to JUDGE_FUNC.

Numeric outputs are converted to int, which was not handled by several judge implementations.
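
The parsing behaviour behind this bug is easy to reproduce with the standard library alone. The snippet below only demonstrates `ast.literal_eval()`; the `parse_output` wrapper and its fall-back to the raw string are assumptions, not the compare module's actual code.

```python
import ast


def parse_output(text):
    """Parse output text as a Python literal, falling back to the raw string."""
    try:
        return ast.literal_eval(text.strip())
    except (ValueError, SyntaxError):
        # Not a valid literal (for example, plain words): keep the text.
        return text


print(type(parse_output("1")).__name__)          # int
print(type(parse_output("[1, 2, 3]")).__name__)  # list
print(type(parse_output("hello")).__name__)      # str
```

A judge that branches only on `str` and `list` therefore has no branch for the first case.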

Backward Compatibility

  • All solution algorithms remain unchanged
  • Existing solutions continue to work with updated judge logic
  • Wrapper-based templates and --wrapper option are removed (breaking change)
  • New solutions must follow the polymorphic SOLUTIONS + get_solver() contract

The test framework uses ast.literal_eval() to parse solution outputs,
which converts single numbers like "1" to integers. The judge functions
in 7 solutions were only checking for str or list types, causing test
failures when outputs were parsed as integers.

Fixed solutions:
- 0021_merge_two_sorted_lists: Handle int input + fix empty line parsing
- 0027_remove_element: Handle int input for k=0 single-line output case
- 0075_sort_colors: Handle int input for single-element arrays
- 0088_merge_sorted_array: Handle int input + fix empty array parsing
- 0283_move_zeroes: Handle int input for single-element arrays
- 0876_middle_of_the_linked_list: Handle int input for single-node lists
- 0905_sort_array_by_parity: Handle int input for single-element arrays

Changes:
- Added isinstance(actual, int) checks in all seven affected judge functions
- Convert single integers to single-element lists: [actual]
- Improved input parsing to preserve empty lines correctly
- No changes to solution algorithms (all were already correct)

Test Results:
- All 92 tests pass (7 skipped for problems without static tests)
- Static tests: 33/33 passing
- Generated tests: 59/59 passing
- Combined tests: 99/99 passing

Root Cause:
The compare.py module uses ast.literal_eval() to parse outputs before
passing to JUDGE_FUNC. Single numbers get parsed as int, not str,
which the judge functions didn't handle.

BREAKING CHANGE: Remove deprecated wrapper template and --wrapper option

- Create docs/solution_contract.md as canonical specification for:
  - Solution file structure (SOLUTIONS + get_solver() pattern)
  - SOLUTIONS metadata schema (required: class, method)
  - Judge/validation contract (JUDGE_FUNC, COMPARE_MODE priority)
  - Static/dynamic tests and generator requirements
  - Metadata layers and consistency checklist

- Update templates to polymorphic architecture:
  - template_solution.py: Add SOLUTIONS dict + get_solver()
  - template_solution_multi.py: Use multiple classes, same method name
  - Delete template_solution_wrapper.py (deprecated)

- Update scripts:
  - new_problem.bat/sh: Remove --wrapper option
  - run_tests.bat: Fix argument passing (use %* for all args)

- Update documentation:
  - README.md: Replace wrapper example with polymorphic pattern
  - README_zh-TW.md: Same updates in Chinese
  - ARCHITECTURE_MIGRATION.md: Mark Phase 4 complete

- Clean up temporary files:
  - Delete test_debug.py
  - Delete test_output.py

@lufftw lufftw left a comment

Changes are clear and well-structured. LGTM.

@lufftw lufftw merged commit 26c38a4 into main Dec 12, 2025
@lufftw lufftw deleted the fix/solutions-to-pass-tests branch December 12, 2025 07:14