FSA Correction, overall backend pipeline by Ren-creater · Pull Request #12 · lambda-feedback/FSA-eval

Ren-creater · 2026-01-15T19:51:31Z

see README.md in the correction folder

Copilot

Pull request overview

This PR introduces a new FSA Correction Module that compares student FSAs against expected FSAs by leveraging existing validation functions. The module provides detailed feedback on structural errors, language equivalence, and, optionally, minimality.

Changes:

- Added new correction module (`correction.py` and `__init__.py`) with pipeline for FSA comparison
- Removed unused Answer schema (`answer.py`) and related imports from schemas module
- Added comprehensive test suite for the correction module (`test_correction.py`)
- Updated schema documentation to remove Answer schema references

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| `evaluation_function/correction/correction.py` | Implements 4-step correction pipeline: validation, minimality check, structural analysis, and language equivalence |
| `evaluation_function/correction/__init__.py` | Exports correction module functions and classes |
| `evaluation_function/correction/README.md` | Documents correction module architecture and usage |
| `evaluation_function/test/test_correction.py` | Comprehensive test suite for correction functionality |
| `evaluation_function/schemas/answer.py` | Removed unused Answer schema definitions |
| `evaluation_function/schemas/__init__.py` | Removed Answer schema exports |
| `evaluation_function/schemas/README.md` | Removed Answer schema documentation |
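
The language-equivalence step named above can be illustrated with a breadth-first search over pairs of states. This is only a sketch and not the PR's implementation: the dict-based DFA layout (`start`, `accept`, `delta`) and the name `find_counterexample` are hypothetical, and a missing transition is treated as an implicit dead state.

```python
from collections import deque


def find_counterexample(dfa_a, dfa_b, alphabet):
    """Return a shortest string accepted by exactly one DFA, or None if none exists."""
    start = (dfa_a["start"], dfa_b["start"])
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (qa, qb), word = queue.popleft()
        # The two machines disagree on `word` if exactly one state is accepting.
        if (qa in dfa_a["accept"]) != (qb in dfa_b["accept"]):
            return word
        for symbol in alphabet:
            # None stands for the implicit dead state (no transition defined).
            nxt = (
                dfa_a["delta"].get((qa, symbol)),
                dfa_b["delta"].get((qb, symbol)),
            )
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + symbol))
    return None


# Hypothetical fixtures loosely modelled on the test names in this PR.
accepts_a = {"start": "q0", "accept": {"q1"}, "delta": {("q0", "a"): "q1"}}
accepts_a_or_b = {
    "start": "q0",
    "accept": {"q1"},
    "delta": {("q0", "a"): "q1", ("q0", "b"): "q1"},
}
```

A returned string could serve as the kind of counterexample hint that `quick_equivalence_check` reports, although how the module actually derives its hints is not shown on this page.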

Copilot · 2026-01-15T19:54:31Z

+    if is_valid_fsa(fsa):
+        return False, []


The logic is inverted. `is_valid_fsa(fsa)` returns an empty list when valid and a list of errors when invalid. An empty list is falsy in Python, so this check returns False when the FSA is valid, which is incorrect. Should be `if is_valid_fsa(fsa):` to check if there are any errors (truthy non-empty list).
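
The comment turns on how Python treats a list in a boolean context. As a minimal sketch, assuming `is_valid_fsa` returns a list of errors as described (the stub `is_valid_fsa_stub` and the dict-based FSA are hypothetical stand-ins, not the module's code), a truthiness guard and an explicit length guard can be compared side by side:

```python
def is_valid_fsa_stub(fsa):
    """Hypothetical stand-in: return a list of error messages, empty when valid."""
    errors = []
    if fsa["start"] not in fsa["states"]:
        errors.append("start state is not in the state set")
    return errors


def guard_truthy(fsa):
    # True when the early-return branch `if is_valid_fsa(fsa):` would be taken.
    return bool(is_valid_fsa_stub(fsa))


def guard_len(fsa):
    # True when the branch `if len(is_valid_fsa(fsa)) > 0:` would be taken.
    return len(is_valid_fsa_stub(fsa)) > 0


valid_fsa = {"states": {"q0", "q1"}, "start": "q0"}
invalid_fsa = {"states": {"q0", "q1"}, "start": "q9"}
```

Under this assumption both guards take the early-return branch only for the invalid FSA, so the explicit form chiefly makes the intent easier to read.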

Copilot · 2026-01-15T19:54:32Z

+        if len(minimized.states) < len(fsa.states):
+            return False, ValidationError(
+                message=f"FSA is not minimal: has {len(fsa.states)} states but can be reduced to {len(minimized.states)}",
+                code=ErrorCode.NOT_COMPLETE,


Wrong error code for minimality check. The error is about minimality, not completeness. Should use a more appropriate error code or consider adding a NOT_MINIMAL error code if one doesn't exist.

Suggested change

-                code=ErrorCode.NOT_COMPLETE,
+                code=ErrorCode.NOT_MINIMAL,
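
To make the suggestion concrete, here is a small sketch of a dedicated minimality code. The real `ErrorCode` enum lives in `evaluation_function.schemas` and its members are not shown on this page, so `SketchErrorCode` and `minimality_error` are hypothetical names used purely for illustration.

```python
from enum import Enum


class SketchErrorCode(str, Enum):
    """Hypothetical enum; not the project's actual ErrorCode."""

    NOT_COMPLETE = "NOT_COMPLETE"  # missing transitions
    NOT_MINIMAL = "NOT_MINIMAL"    # state count can be reduced


def minimality_error(original_count, minimized_count):
    """Return an error payload when minimization removed states, else None."""
    if minimized_count < original_count:
        return {
            "code": SketchErrorCode.NOT_MINIMAL,
            "message": (
                f"FSA is not minimal: has {original_count} states "
                f"but can be reduced to {minimized_count}"
            ),
        }
    return None
```

Keeping the two codes separate lets a consumer distinguish "missing transitions" from "too many states" without parsing the message text.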

Copilot · 2026-01-15T19:54:32Z

+    if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa):
+        return False, None, None


Same inverted logic issue as in `trace_string`. Since `is_valid_fsa()` returns an empty list for valid FSAs (falsy), this condition will be False when both FSAs are valid, which is incorrect. The logic should check for errors: `if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa)` should be checking if there ARE errors (truthy non-empty list).

Copilot · 2026-01-15T19:54:32Z

+    def test_quick_equivalence_check_different(self, dfa_accepts_a, dfa_accepts_a_or_b):
+        is_equiv, hint, hint_type = quick_equivalence_check(dfa_accepts_a, dfa_accepts_a_or_b)
+        assert is_equiv is False
+        assert hint is not None


This test will likely fail due to the bug in `quick_equivalence_check` at lines 350-351, where the validation logic is inverted. Both test FSAs are valid, so the function will incorrectly return `(False, None, None)` immediately, but the test expects a hint to be present.

Copilot · 2026-01-15T19:54:33Z

+    def test_trace_accepted(self, dfa_accepts_a):
+        accepted, trace = trace_string(dfa_accepts_a, "a")
+        assert accepted is True
+        assert trace == ["q0", "q1"]


This test will fail due to the bug in `trace_string` at line 120, where the validation check is inverted. Since `dfa_accepts_a` is valid, `is_valid_fsa` will return an empty list (falsy), making the condition truthy, causing the function to incorrectly return `(False, [])`.

Copilot · 2026-01-15T19:54:33Z

+    def test_trace_rejected(self, dfa_accepts_a):
+        accepted, trace = trace_string(dfa_accepts_a, "b")
+        assert accepted is False
+        assert trace == ["q0", "q2"]


This test will also fail due to the same inverted validation logic bug in `trace_string` at line 120.

Copilot · 2026-01-15T19:54:33Z

+    def test_fsa_accepts(self, dfa_accepts_a):
+        assert fsa_accepts(dfa_accepts_a, "a") is True
+        assert fsa_accepts(dfa_accepts_a, "b") is False
+        assert fsa_accepts(dfa_accepts_a, "aa") is False


`fsa_accepts` correctly uses `accepts_string` from the validation module, and these tests verify that it works as expected, which is good. However, the function could be considered redundant, since it is just a thin wrapper around `accepts_string`.

Copilot · 2026-01-15T19:54:34Z

+    def test_empty_string(self, dfa_accepts_a):
+        accepted, trace = trace_string(dfa_accepts_a, "")
+        assert accepted is False
+        assert trace == ["q0"]


This test will fail due to the inverted validation logic bug in `trace_string` at line 120.
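
For reference, the behaviour these three trace tests expect can be sketched as follows. This is not the module's `trace_string`: the dict-based DFA and the shape of the `dfa_accepts_a` fixture (a dead state `q2` reached on `b`) are assumptions inferred from the asserted traces.

```python
def trace_string_sketch(dfa, word):
    """Return (accepted, visited_states) for `word` on a deterministic FSA."""
    state = dfa["start"]
    trace = [state]
    for symbol in word:
        state = dfa["delta"].get((state, symbol))
        if state is None:
            # No transition defined: reject and report the path so far.
            return False, trace
        trace.append(state)
    return state in dfa["accept"], trace


# Assumed fixture: accepts exactly "a"; q2 is a non-accepting sink.
dfa_accepts_a_sketch = {
    "start": "q0",
    "accept": {"q1"},
    "delta": {
        ("q0", "a"): "q1",
        ("q0", "b"): "q2",
        ("q1", "a"): "q2",
        ("q1", "b"): "q2",
        ("q2", "a"): "q2",
        ("q2", "b"): "q2",
    },
}
```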

Copilot · 2026-01-15T19:54:34Z

+"""
+
+import pytest
+from evaluation_function.schemas import FSA, ValidationError, ErrorCode


Import of 'FSA' is not used.

Suggested change

-from evaluation_function.schemas import FSA, ValidationError, ErrorCode
+from evaluation_function.schemas import ValidationError, ErrorCode

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Copilot · 2026-01-15T20:03:49Z

+        if len(minimized.states) < len(fsa.states):
+            return False, ValidationError(
+                message=f"FSA is not minimal: has {len(fsa.states)} states but can be reduced to {len(minimized.states)}",
+                code=ErrorCode.NOT_COMPLETE,


`ErrorCode.NOT_COMPLETE` is semantically incorrect for minimality errors. This error code is intended for completeness validation (missing transitions), not for FSA minimality. Consider using a more appropriate error code or adding a `NOT_MINIMAL` code to the `ErrorCode` enum.

Suggested change

-                code=ErrorCode.NOT_COMPLETE,
+                code=ErrorCode.NOT_MINIMAL,

Copilot · 2026-01-15T20:03:49Z

+def analyze_fsa_correction(
+    student_fsa: FSA,
+    expected_fsa: FSA,
+    check_minimality: bool = False


The parameter name `check_minimality` shadows the function `check_minimality` defined at line 132. This creates ambiguity and can lead to confusion. Consider renaming the parameter to `require_minimality` or `verify_minimality`.
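
The shadowing described here can be reproduced in a few lines. The sketch below uses hypothetical stand-ins (`analyze_shadowed`, `analyze_renamed`, and a stub `check_minimality` over a dict) rather than the module's real functions:

```python
def check_minimality(fsa):
    """Hypothetical helper: True when the stub FSA is flagged as minimal."""
    return fsa.get("minimal", False)


def analyze_shadowed(fsa, check_minimality=False):
    # Inside this body the name refers to the bool parameter, so the
    # module-level function of the same name is unreachable by that name.
    return callable(check_minimality)


def analyze_renamed(fsa, require_minimality=False):
    # With a distinct parameter name the helper remains callable.
    if require_minimality:
        return check_minimality(fsa)
    return True
```

With the renamed parameter, the flag and the helper can coexist in the same function body, which is what the rename suggestion aims for.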

Copilot · 2026-01-15T20:03:50Z

+    Returns:
+        (are_equivalent, counterexample_hint, hint_type)
+    """
+    if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa):


The function should return early when validation errors exist (a truthy, non-empty list), but the logic is inverted. `is_valid_fsa` returns a list of errors (an empty list when valid). Instead of relying on `if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa):`, use an explicit check such as `if len(is_valid_fsa(student_fsa)) > 0 or len(is_valid_fsa(expected_fsa)) > 0:` to correctly identify invalid FSAs.

Suggested change

-    if is_valid_fsa(student_fsa) or is_valid_fsa(expected_fsa):
+    student_validation_errors = is_valid_fsa(student_fsa)
+    expected_validation_errors = is_valid_fsa(expected_fsa)
+    if len(student_validation_errors) > 0 or len(expected_validation_errors) > 0:

Ren-creater added 13 commits January 15, 2026 09:41

initial correction implementation

c4efdc6

feat: moved correction.py under a new folder, added isomorphism corre…

99e0fac

…ction

fix: added correctly modified correction.py

161c5d5

feat: added init for correction

868c14b

feat: added readme for correction

a146fee

added testing (all passed) for correction

85c2ee1

fix: make sure correction fully leverge functions provided by schema …

6232cf7

…and validation

fix: make sure correction use model dump, make sure test use function…

8bcf001

…s from schema

fix: make trace_string and accepts use functions from validation corr…

19db49b

…ectly, elimianted duplicated logic between validationa and correction

feat: added answer, params, result schemas onto the whole pipeline

a1fa549

remove answer schema

67a77d5

fix: remove modules that depends on answer.py, streamlined correction

4cf5019

fix: updated readme

8bf7bbd

Copilot AI review requested due to automatic review settings January 15, 2026 19:51

Copilot started reviewing on behalf of Ren-creater January 15, 2026 19:52

Ren-creater requested a review from JohnnyWan1123 January 15, 2026 19:53

Copilot AI reviewed Jan 15, 2026

fix: removed unused functions

128e848

Ren-creater requested a review from Copilot January 15, 2026 20:00

Copilot started reviewing on behalf of Ren-creater January 15, 2026 20:01

Copilot AI reviewed Jan 15, 2026

Ren-creater added 4 commits January 15, 2026 20:10

fix: removed unsued import

7ee4996

fix: removed convenience functions for testing

621e19b

feat: integrated correction into evaluation

4364427

fix: take all hints, added NO_MINIMAL to schema

a691681

JohnnyWan1123 approved these changes Jan 16, 2026

JohnnyWan1123 merged commit b071a80 into main Jan 16, 2026

3 participants