Create CharsetDetector #148
Merged
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Branch-free dispatch via set.update vs set.add bound at construction. Enables character-set accumulation for charset-style detectors without a per-call isinstance check.
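A minimal sketch of the construction-time binding described above. `CharAccumulator`, `add_value`, and `_accumulate` are hypothetical names used only for illustration; only `expand_value` and `unique_set` appear in this PR's commit messages.

```python
class CharAccumulator:
    """Hypothetical stand-in for a tracker that collects observed values."""

    def __init__(self, expand_value=False):
        self.expand_value = expand_value
        self.unique_set = set()
        # Choose the accumulation strategy once. set.update iterates its
        # argument (one entry per character of a string); set.add stores
        # the whole value as a single entry.
        self._accumulate = (
            self.unique_set.update if expand_value else self.unique_set.add
        )

    def add_value(self, value):
        # No isinstance check and no flag test on the hot path.
        self._accumulate(value)


whole = CharAccumulator()
whole.add_value("abc")        # stored as one entry

chars = CharAccumulator(expand_value=True)
chars.add_value("abc")        # stored as three single-character entries
```

One consequence of this design, at least in the sketch: the bound method refers to a specific set object, so code that replaces `unique_set` wholesale would also need to rebind `_accumulate`.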
to_state/from_state preserve the flag so reloaded trackers continue to use the same accumulation strategy. Old snapshots without the key default to False (backward compatible).
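The round trip might look roughly like the following. The class and the state layout (a dict with `unique_set` and `expand_value` keys) are assumptions for illustration; the actual snapshot format is not shown in this PR description.

```python
class CharAccumulator:
    """Hypothetical tracker with a serialisable accumulation flag."""

    def __init__(self, expand_value=False):
        self.expand_value = expand_value
        self.unique_set = set()
        self._accumulate = (
            self.unique_set.update if expand_value else self.unique_set.add
        )

    def add_value(self, value):
        self._accumulate(value)

    def to_state(self):
        # Persist the flag next to the data so a reload keeps the strategy.
        return {
            "expand_value": self.expand_value,
            "unique_set": sorted(self.unique_set),
        }

    @classmethod
    def from_state(cls, state):
        # Snapshots written before the flag existed have no key: use False.
        tracker = cls(expand_value=state.get("expand_value", False))
        # Fill the existing set in place so the bound method stays valid.
        tracker.unique_set.update(state.get("unique_set", ()))
        return tracker
```

Restoring an old snapshot such as `{"unique_set": ["abc"]}` yields a tracker with whole-value semantics, while a snapshot carrying `"expand_value": True` keeps accumulating characters.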
Closure factory threads the flag down to each per-variable SingleStabilityTracker without touching MultiTracker or EventTracker. Explicit named parameter at the public boundary.
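As a rough illustration of the closure approach, the container below only ever calls a zero-argument factory and never sees the flag. The class bodies are simplified stand-ins, not the project's implementations, and `make_tracker_factory` and `observe` are hypothetical names.

```python
class SingleStabilityTracker:
    """Simplified stand-in for the per-variable tracker."""

    def __init__(self, expand_value=False):
        self.unique_set = set()
        self._accumulate = (
            self.unique_set.update if expand_value else self.unique_set.add
        )

    def add_value(self, value):
        self._accumulate(value)


def make_tracker_factory(expand_value):
    """Return a zero-argument factory that remembers expand_value."""
    def make_tracker():
        return SingleStabilityTracker(expand_value=expand_value)
    return make_tracker


class MultiTracker:
    """Simplified container: creates one tracker per variable on demand."""

    def __init__(self, tracker_factory):
        self._tracker_factory = tracker_factory
        self.trackers = {}

    def observe(self, name, value):
        if name not in self.trackers:
            self.trackers[name] = self._tracker_factory()
        self.trackers[name].add_value(value)


multi = MultiTracker(make_tracker_factory(expand_value=True))
multi.observe("user", "ab")
multi.observe("user", "bc")
```

Because the flag lives in the closure, the container's signature does not change when new per-tracker options are added.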
EventTracker.load previously called EventTracker.__init__ directly, bypassing subclass closures (e.g. EventStabilityTracker's expand_value factory). Newly-encountered variables after a reload silently used default semantics. Route subclasses through cls(**kwargs) so the factory rebuilds; base class path unchanged.
The cls-is-EventTracker branch is the legacy reconstruction path; subclasses go through cls(**kwargs) so closure-based factories (like EventStabilityTracker's expand_value) can be rebuilt. Subclasses must therefore accept event_data_kwargs and not require positional args from __init__.
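The two reconstruction paths can be sketched as follows. This is an assumption-laden simplification: the snapshot layout, the `tracker_factory` attribute, and the constructor signatures are hypothetical, and the factory returns a plain dict as a marker instead of a real tracker.

```python
class EventTracker:
    def __init__(self, tracker_factory=dict, event_data_kwargs=None):
        self.tracker_factory = tracker_factory
        self.event_data_kwargs = event_data_kwargs or {}
        self.state = {}

    @classmethod
    def load(cls, snapshot):
        kwargs = {"event_data_kwargs": snapshot.get("event_data_kwargs")}
        if cls is EventTracker:
            # Legacy path: build the base class directly.
            obj = cls.__new__(cls)
            EventTracker.__init__(obj, **kwargs)
        else:
            # Subclass path: run the subclass constructor so any closure
            # factory it builds exists again after the reload.
            obj = cls(**kwargs)
        obj.state = dict(snapshot.get("state", {}))
        return obj


class EventStabilityTracker(EventTracker):
    # Must accept event_data_kwargs and need no positional arguments.
    def __init__(self, event_data_kwargs=None, expand_value=True):
        def make_tracker():
            return {"expand_value": expand_value}
        super().__init__(
            tracker_factory=make_tracker,
            event_data_kwargs=event_data_kwargs,
        )


loaded = EventStabilityTracker.load({"state": {"seen": 3}})
base = EventTracker.load({})
```

Calling `EventTracker.__init__` on a subclass instance, as the earlier code did, would leave `tracker_factory` at the base default, which is the silent fallback described above.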
Wires expand_value=True for the main persistency so unique_set accumulates chars. auto_conf_persistency keeps default (whole-value stability) for variable selection. Also adds the missing _register_persistency call so the trained state participates in persist/load.
- Replace nested any(c in x for x in unique_set) loop with set(v) - unique_set
- Combine all unknown chars per variable into a single alert message
- Remove ignore_non_string_val config field and sys.exit guard (let upstream type errors surface naturally if a non-string slips through)
- Strip ignore_non_string_val from test configs and pipeline_config_default.yaml
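The set-difference check and the combined alert could be sketched like this. The function name and the alert wording are hypothetical; only the `set(v) - unique_set` expression comes from the commit message.

```python
def build_alert(variable, value, unique_set):
    """Return one alert string for all unknown characters, or None."""
    # A single set difference replaces the nested membership loop.
    unknown = set(value) - unique_set
    if not unknown:
        return None
    # Sort so the message is deterministic across runs.
    return f"Unknown characters in '{variable}': {sorted(unknown)}"


known = set("abc")
alert = build_alert("user", "ab!?", known)
# alert == "Unknown characters in 'user': ['!', '?']"
```

Note that `set(value)` raises `TypeError` for non-iterable input such as an integer, which matches the intent of letting type errors surface naturally.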
from_dict replaces the entire config object during auto-config rebuild, which previously wiped any persist setting set by an earlier config load. Save and restore old_persist, matching the pattern in NewValueDetector.set_configuration().
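The save-and-restore pattern, in a hedged sketch: `DetectorConfig`, its `threshold` field, and the `Detector` class are hypothetical, and the type of the `persist` setting is not stated in the PR, so it is left untyped here.

```python
from dataclasses import dataclass


@dataclass
class DetectorConfig:
    threshold: float = 0.5   # hypothetical field, for illustration only
    persist: object = None   # assumed to be set by an earlier config load

    @classmethod
    def from_dict(cls, data):
        return cls(**data)


class Detector:
    def __init__(self, config):
        self.config = config

    def set_configuration(self, data):
        # from_dict builds a brand-new object, so remember the setting
        # first and put it back afterwards.
        old_persist = self.config.persist
        self.config = DetectorConfig.from_dict(data)
        self.config.persist = old_persist


detector = Detector(DetectorConfig(persist="state.json"))
detector.set_configuration({"threshold": 0.9})
```

Without the two extra lines, `detector.config.persist` would fall back to the dataclass default after the rebuild.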
Type[SingleTracker] only accepts class objects, but closure factories (e.g. EventStabilityTracker.make_tracker) are equally valid producers. Widening to Callable[[], SingleTracker] removes the type: ignore at the EventStabilityTracker call site and documents the real contract.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
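To illustrate why the wider annotation fits, the sketch below passes both a class object and a closure to the same parameter. `new_tracker` and `make_factory` are hypothetical helpers, and `SingleTracker` is reduced to a stub.

```python
from typing import Callable


class SingleTracker:
    """Stub tracker; the constructor needs no required arguments."""

    def __init__(self, expand_value: bool = False) -> None:
        self.expand_value = expand_value


def make_factory(expand_value: bool) -> Callable[[], SingleTracker]:
    def make_tracker() -> SingleTracker:
        return SingleTracker(expand_value=expand_value)
    return make_tracker


def new_tracker(factory: Callable[[], SingleTracker]) -> SingleTracker:
    # Anything callable with no arguments that returns a SingleTracker
    # satisfies the annotation: a class object or a closure.
    return factory()


from_class = new_tracker(SingleTracker)
from_closure = new_tracker(make_factory(expand_value=True))
```

With a `Type[SingleTracker]` annotation, the second call would be flagged by a static type checker even though it works at runtime.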
Fixes #59