Unitxt 1.10.0

@elronbandel released this 03 Jun 18:22
10ee34c

Main changes

  • Added support for handling sensitive data. When data is loaded from a data source using a Loader, the user can specify the classification of the data (e.g. "public" or "proprietary"). Unitxt components such as metrics and inference engines then check whether they are allowed to process the data, based on their configuration. For example, an LLM-as-judge that sends data to remote services can be configured to send only "public" data to those services. This replaces the UNITXT_ALLOW_PASSING_DATA_TO_REMOTE_API option, which was a global flag that did not depend on the data and was therefore error-prone.
    See more details in https://unitxt.readthedocs.io/en/latest/docs/data_classification_policy.html
  • Added support for metric score prefixes. Each metric has a new optional string attribute, "score_prefix", that is prepended to the names of all scores it generates. This allows the same metric to be applied to different fields of a task while keeping the resulting scores distinguishable.
  • New Operators tutorial and Loaders documentation
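
The data classification check described in the first item above can be pictured with a small sketch. This is not the Unitxt implementation: the class `PolicyCheckedComponent`, its `allowed` attribute, and the rule that one shared label is enough to allow processing are all assumptions made for illustration, and the instance key `data_classification_policy` is borrowed from the name of the documentation page linked above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PolicyCheckedComponent:
    """Hypothetical stand-in for a metric or inference engine."""

    name: str
    # Classifications this component is configured to accept (assumed attribute).
    allowed: List[str] = field(default_factory=lambda: ["public"])

    def is_allowed(self, instance: Dict[str, Any]) -> bool:
        # Assumption: the loader tagged each instance with its classifications.
        labels = instance.get("data_classification_policy", [])
        # Assumption: one shared label is enough to permit processing.
        return any(label in self.allowed for label in labels)

    def process(self, instance: Dict[str, Any]) -> str:
        if not self.is_allowed(instance):
            labels = instance.get("data_classification_policy", [])
            raise ValueError(f"{self.name} may not process data classified as {labels}")
        return f"{self.name} processed: {instance['text']}"


# A judge that calls a remote service is restricted to public data.
remote_judge = PolicyCheckedComponent("remote_judge", allowed=["public"])
public_item = {"text": "hello", "data_classification_policy": ["public"]}
private_item = {"text": "secret", "data_classification_policy": ["proprietary"]}

print(remote_judge.process(public_item))
print(remote_judge.is_allowed(private_item))  # False
```

Unlike a single global flag, the decision here depends on the labels attached to each piece of data, so the same component can accept one dataset and refuse another.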

Backward

  • StreamInstanceOperator was renamed to InstanceOperator
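
Code that must run against Unitxt versions both before and after this rename could resolve the class under either name. The helper below is a generic sketch, not part of Unitxt: `import_first_available` is a hypothetical function, and the module path `unitxt.operator` in the usage comment is an assumption that should be checked against the installed version.

```python
import importlib
from typing import Any, Sequence


def import_first_available(module_name: str, candidate_names: Sequence[str]) -> Any:
    """Return the first attribute from candidate_names that the module defines."""
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {list(candidate_names)} found in {module_name}")


# Hypothetical usage, assuming the class lives in `unitxt.operator`:
# InstanceOperator = import_first_available(
#     "unitxt.operator", ["InstanceOperator", "StreamInstanceOperator"]
# )
```

Listing the new name first means the old name is used only as a fallback on older installations.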

New Features

  • Support for handling sensitive data sent to remote services by @pawelknes in #806, @yoavkatz in #868
  • Added new NER metric using fuzzywuzzy logic by @sarathsgvr in #808
  • Added loader from HF spaces by @pawelknes in #860
  • Add metric prefix in main by @yoavkatz in #878
  • Add MinimumOneExamplePerLabelRefiner to ensure that at least one example of each label appears in the training data, by @alonh in #867
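
The effect of the metric prefix listed above can be illustrated in plain Python. This sketch does not use the Unitxt API: `apply_score_prefix` and `exact_match` are hypothetical helpers, and only the attribute name "score_prefix" comes from the release notes.

```python
from typing import Dict


def exact_match(prediction: str, reference: str) -> Dict[str, float]:
    """Toy metric that always reports its result under the name 'accuracy'."""
    return {"accuracy": float(prediction == reference)}


def apply_score_prefix(scores: Dict[str, float], score_prefix: str = "") -> Dict[str, float]:
    """Prepend score_prefix to every score name (hypothetical helper)."""
    return {f"{score_prefix}{name}": value for name, value in scores.items()}


references = {"answer": "Paris", "rationale": "capital of France"}
predictions = {"answer": "Paris", "rationale": "largest city"}

# Apply the same metric to two fields; the prefix keeps the results apart.
scores: Dict[str, float] = {}
for field_name in ("answer", "rationale"):
    raw = exact_match(predictions[field_name], references[field_name])
    scores.update(apply_score_prefix(raw, score_prefix=f"{field_name}_"))

print(scores)  # {'answer_accuracy': 1.0, 'rationale_accuracy': 0.0}
```

Without the prefix, both results would be stored under the single key "accuracy" and the second would overwrite the first.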

Bug Fix

New Assets

Documentation

New Contributors

Full Changelog: 1.9.0...1.10.0