Open
Description
As a developer, I want to implement custom thresholds per recognizer in Presidio Analyzer so that different entities can have specific sensitivity levels for better accuracy and performance.
Acceptance Criteria
- Configuration: The system should allow setting custom thresholds for each recognizer and entity via a configuration file (e.g., YAML). The configuration should be easily adjustable without modifying the core codebase.
- Threshold Application: The system should apply the specified custom thresholds during the recognition process. Each recognizer should use its respective threshold for evaluating entities.
- Validation: Ensure that the custom thresholds are correctly loaded and applied by running unit tests. Validate that the recognition results vary according to the specified thresholds.
- Documentation: Update the documentation to include instructions on how to set and adjust custom thresholds for recognizers and entities. Provide examples of configuration files with custom thresholds.
- Error Handling: Implement error handling to manage cases where thresholds are not specified or incorrectly configured. Ensure that meaningful error messages are provided to guide users in correcting configuration issues.
- Backward Compatibility: The recognizer-level thresholding should be backward compatible, and work even if the user doesn't apply recognizer-level thresholding.
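
To make the criteria above concrete, here is a minimal sketch of how threshold validation and application could look. It is not Presidio's actual API: `Detection`, `ThresholdConfigError`, `validate_thresholds`, and `apply_thresholds` are hypothetical names introduced for illustration, and the config is assumed to have already been parsed (for example from YAML) into a nested mapping of recognizer name to entity type to threshold. Scores are assumed to lie in the range 0 to 1.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for a single recognizer result."""
    recognizer: str
    entity_type: str
    score: float


class ThresholdConfigError(ValueError):
    """Hypothetical error raised when the threshold config is malformed."""


def validate_thresholds(raw: Optional[dict]) -> Dict[str, Dict[str, float]]:
    """Validate a parsed config of the form {recognizer: {entity: threshold}}."""
    if raw is None:
        # No custom thresholds configured: behave exactly as before.
        return {}
    if not isinstance(raw, dict):
        raise ThresholdConfigError(
            "thresholds must be a mapping of recognizer name -> entity thresholds"
        )
    validated: Dict[str, Dict[str, float]] = {}
    for recognizer, entities in raw.items():
        if not isinstance(entities, dict):
            raise ThresholdConfigError(
                f"{recognizer!r}: expected a mapping of entity type -> threshold"
            )
        validated[recognizer] = {}
        for entity, value in entities.items():
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ThresholdConfigError(
                    f"{recognizer!r}/{entity!r}: threshold must be a number, got {value!r}"
                )
            if not 0.0 <= value <= 1.0:
                raise ThresholdConfigError(
                    f"{recognizer!r}/{entity!r}: threshold {value} is outside [0, 1]"
                )
            validated[recognizer][entity] = float(value)
    return validated


def apply_thresholds(
    detections: List[Detection],
    thresholds: Dict[str, Dict[str, float]],
    default_threshold: float = 0.0,
) -> List[Detection]:
    """Keep detections whose score meets their recognizer/entity threshold."""
    kept = []
    for detection in detections:
        # Fall back to the default when no custom threshold is configured.
        threshold = thresholds.get(detection.recognizer, {}).get(
            detection.entity_type, default_threshold
        )
        if detection.score >= threshold:
            kept.append(detection)
    return kept
```

With a config such as `{"CreditCardRecognizer": {"CREDIT_CARD": 0.8}}`, a credit card detection scored 0.5 would be dropped, while results from recognizers without an entry are filtered only by `default_threshold`. Passing `None` as the config leaves all results untouched, which is one way to satisfy the backward compatibility criterion.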