FSA Correction, overall backend pipeline#12
Conversation
…ectly, elimianted duplicated logic between validationa and correction
There was a problem hiding this comment.
Pull request overview
This PR introduces a new FSA Correction Module that compares student FSAs against expected FSAs by leveraging existing validation functions. The module provides detailed feedback about structural errors, language equivalence, and optionally minimality checking.
Changes:
- Added new correction module (
correction.pyand__init__.py) with pipeline for FSA comparison - Removed unused Answer schema (
answer.py) and related imports from schemas module - Added comprehensive test suite for the correction module (
test_correction.py) - Updated schema documentation to remove Answer schema references
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
evaluation_function/correction/correction.py |
Implements 4-step correction pipeline: validation, minimality check, structural analysis, and language equivalence |
evaluation_function/correction/__init__.py |
Exports correction module functions and classes |
evaluation_function/correction/README.md |
Documents correction module architecture and usage |
evaluation_function/test/test_correction.py |
Comprehensive test suite for correction functionality |
evaluation_function/schemas/answer.py |
Removed unused Answer schema definitions |
evaluation_function/schemas/__init__.py |
Removed Answer schema exports |
evaluation_function/schemas/README.md |
Removed Answer schema documentation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if is_valid_fsa(fsa): | ||
| return False, [] |
There was a problem hiding this comment.
The logic is inverted. is_valid_fsa(fsa) returns an empty list when valid and a list of errors when invalid. An empty list is falsy in Python, so this check returns False when the FSA is valid, which is incorrect. Should be if is_valid_fsa(fsa): to check if there are any errors (truthy non-empty list).
| if len(minimized.states) < len(fsa.states): | ||
| return False, ValidationError( | ||
| message=f"FSA is not minimal: has {len(fsa.states)} states but can be reduced to {len(minimized.states)}", | ||
| code=ErrorCode.NOT_COMPLETE, |
There was a problem hiding this comment.
Wrong error code for minimality check. The error is about minimality, not completeness. Should use a more appropriate error code or consider adding a NOT_MINIMAL error code if one doesn't exist.
| code=ErrorCode.NOT_COMPLETE, | |
| code=ErrorCode.NOT_MINIMAL, |
| if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa): | ||
| return False, None, None |
There was a problem hiding this comment.
Same inverted logic issue as in trace_string. Since is_valid_fsa() returns an empty list for valid FSAs (falsy), this condition will be False when both FSAs are valid, which is incorrect. The logic should check for errors: if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa) should be checking if there ARE errors (truthy non-empty list).
| def test_quick_equivalence_check_different(self, dfa_accepts_a, dfa_accepts_a_or_b): | ||
| is_equiv, hint, hint_type = quick_equivalence_check(dfa_accepts_a, dfa_accepts_a_or_b) | ||
| assert is_equiv is False | ||
| assert hint is not None |
There was a problem hiding this comment.
This test will likely fail due to the bug in quick_equivalence_check line 350-351 where the validation logic is inverted. Both test FSAs are valid, so the function will incorrectly return (False, None, None) immediately, but the test expects a hint to be present.
| def test_trace_accepted(self, dfa_accepts_a): | ||
| accepted, trace = trace_string(dfa_accepts_a, "a") | ||
| assert accepted is True | ||
| assert trace == ["q0", "q1"] |
There was a problem hiding this comment.
This test will fail due to the bug in trace_string line 120 where the validation check is inverted. Since dfa_accepts_a is valid, is_valid_fsa will return an empty list (falsy), making the condition truthy, causing the function to incorrectly return (False, []).
| def test_trace_rejected(self, dfa_accepts_a): | ||
| accepted, trace = trace_string(dfa_accepts_a, "b") | ||
| assert accepted is False | ||
| assert trace == ["q0", "q2"] |
There was a problem hiding this comment.
This test will also fail due to the same inverted validation logic bug in trace_string at line 120.
| def test_fsa_accepts(self, dfa_accepts_a): | ||
| assert fsa_accepts(dfa_accepts_a, "a") is True | ||
| assert fsa_accepts(dfa_accepts_a, "b") is False | ||
| assert fsa_accepts(dfa_accepts_a, "aa") is False |
There was a problem hiding this comment.
While fsa_accepts correctly uses accepts_string from the validation module, these tests verify the function works correctly, which is good. However, it's worth noting that this function could be considered redundant since it's just a thin wrapper around accepts_string.
| def test_empty_string(self, dfa_accepts_a): | ||
| accepted, trace = trace_string(dfa_accepts_a, "") | ||
| assert accepted is False | ||
| assert trace == ["q0"] |
There was a problem hiding this comment.
This test will fail due to the inverted validation logic bug in trace_string at line 120.
| """ | ||
|
|
||
| import pytest | ||
| from evaluation_function.schemas import FSA, ValidationError, ErrorCode |
There was a problem hiding this comment.
Import of 'FSA' is not used.
| from evaluation_function.schemas import FSA, ValidationError, ErrorCode | |
| from evaluation_function.schemas import ValidationError, ErrorCode |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if len(minimized.states) < len(fsa.states): | ||
| return False, ValidationError( | ||
| message=f"FSA is not minimal: has {len(fsa.states)} states but can be reduced to {len(minimized.states)}", | ||
| code=ErrorCode.NOT_COMPLETE, |
There was a problem hiding this comment.
The ErrorCode.NOT_COMPLETE is semantically incorrect for minimality errors. This error code is intended for completeness validation (missing transitions), not for FSA minimality. Consider using a more appropriate error code or adding a NOT_MINIMAL code to the ErrorCode enum.
| code=ErrorCode.NOT_COMPLETE, | |
| code=ErrorCode.NOT_MINIMAL, |
| def analyze_fsa_correction( | ||
| student_fsa: FSA, | ||
| expected_fsa: FSA, | ||
| check_minimality: bool = False |
There was a problem hiding this comment.
The parameter name 'check_minimality' shadows the function 'check_minimality' defined at line 132. This creates ambiguity and can lead to confusion. Consider renaming the parameter to 'require_minimality' or 'verify_minimality'.
| Returns: | ||
| (are_equivalent, counterexample_hint, hint_type) | ||
| """ | ||
| if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa): |
There was a problem hiding this comment.
The condition should return False when validation errors exist (truthy list), but the logic is inverted. The function is_valid_fsa returns a list of errors (empty list when valid). The condition should be 'if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa):' but with a check like 'if len(is_valid_fsa(student_fsa)) > 0 or len(is_valid_fsa(expected_fsa)) > 0:' to correctly identify invalid FSAs.
| if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa): | |
| student_validation_errors = is_valid_fsa(student_fsa) | |
| expected_validation_errors = is_valid_fsa(expected_fsa) | |
| if len(student_validation_errors) > 0 or len(expected_validation_errors) > 0: |
see README.md in the correction folder