Accepting several predictions/annotations for the same record #1630

Open
frascuchon opened this issue Jun 27, 2022 · 2 comments · Fixed by #1658
Assignees
Labels
type: community request Indicates a feature requested by someone outside of the Argilla organization type: enhancement Indicates new feature requests type: popular request Indicates that several people outside of the Argilla organization are interested in this feature

Comments

@frascuchon
Member

frascuchon commented Jun 27, 2022

Introduction

Currently, a record's annotations/predictions can store information for only one annotator agent. The idea is to support several agents for both annotations and predictions. This change enables several feature enhancements, such as annotation agreement flows, weak label materialization, multi-pipeline monitoring, and more.

We could give more control over annotations/predictions by combining this feature with roles and dataset settings. By defining a set of annotators (or even expected predictor patterns), we can limit the agents that can annotate a dataset.

Design keys

The proposed design keeps the existing prediction/annotation fields and adds new predictions/annotations fields: dictionaries where each key identifies the annotation agent and each value holds the annotation information provided by the client.

predictions = { "agent-one": { "labels": ["A"], "score": ["0.3"] } }
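As a rough illustration of that structure, here is a minimal Python sketch that stores one entry per agent. The `AgentEntry` type and the `add_entry` helper are hypothetical names introduced only for this example, not part of any existing API, and the sketch assumes scores are numeric rather than the quoted strings shown above.

```python
from typing import Dict, List, Optional, TypedDict


class AgentEntry(TypedDict, total=False):
    """Hypothetical shape of the value stored for each agent."""
    labels: List[str]
    score: List[float]  # assumption: numeric scores


def add_entry(
    store: Dict[str, AgentEntry],
    agent: str,
    labels: List[str],
    score: Optional[List[float]] = None,
) -> Dict[str, AgentEntry]:
    """Insert or overwrite the entry for one agent and return the store."""
    entry: AgentEntry = {"labels": list(labels)}
    if score is not None:
        entry["score"] = list(score)
    store[agent] = entry
    return store


predictions: Dict[str, AgentEntry] = {}
add_entry(predictions, "agent-one", ["A"], [0.3])
add_entry(predictions, "agent-two", ["B"], [0.8])

# The agent names double as the value of a "predicted_by"-style field.
predicted_by = sorted(predictions)
```

Keying by agent means a second submission from the same agent overwrites its previous entry instead of adding a duplicate, which is one possible reading of the proposal.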

This new structure will be enabled for search, providing a mechanism for fine-tuning the searches based on specific annotators/predictors. We can replicate all computed fields per annotation entry, so we could do things like:
annotations.agentA.annotated_as: FALSE or predictions.agent_b.predicted_as: TRUE
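To make the per-agent query idea concrete, the following sketch flattens a record into dotted field names and evaluates the example filter in plain Python. It is only an in-memory approximation: `flatten_agent_fields` and `matches` are hypothetical helpers, and the sketch assumes `TRUE` and `FALSE` in the query are label values.

```python
from typing import Any, Dict, List


def flatten_agent_fields(record: Dict[str, Any]) -> Dict[str, List[str]]:
    """Replicate the computed *_as fields once per agent, using dotted keys."""
    flat: Dict[str, List[str]] = {}
    for agent, entry in record.get("predictions", {}).items():
        flat[f"predictions.{agent}.predicted_as"] = list(entry.get("labels", []))
    for agent, entry in record.get("annotations", {}).items():
        flat[f"annotations.{agent}.annotated_as"] = list(entry.get("labels", []))
    return flat


def matches(record: Dict[str, Any], field: str, label: str) -> bool:
    """Return True if the given per-agent field contains the label."""
    return label in flatten_agent_fields(record).get(field, [])


record = {
    "annotations": {"agentA": {"labels": ["FALSE"]}},
    "predictions": {"agent_b": {"labels": ["TRUE"], "score": [0.9]}},
}

# Equivalent of:
# annotations.agentA.annotated_as: FALSE or predictions.agent_b.predicted_as: TRUE
hit = (
    matches(record, "annotations.agentA.annotated_as", "FALSE")
    or matches(record, "predictions.agent_b.predicted_as", "TRUE")
)
```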

Backward compatibility

The new data model must accommodate the current record concepts and provide a backward-compatible path so that both modes can coexist.

Current fields such as predicted, predicted_as, and annotated_as could change behavior, since multiple values can now be assigned. The old behavior can be kept only when a single entry is provided.

Complete list of affected fields:

  • predicted: computed only when one single agent is defined. It will be deprecated and removed in future versions
  • predicted_as: computed only when one single agent is defined. It will be deprecated and removed in future versions
  • annotated_as: computed only when one single agent is defined. It will be deprecated and removed in future versions
  • predicted_by: showing all record agents
  • annotated_by: showing all record agents
  • scores: computed only when one single agent is defined (cc: @dvsrepo). It will be deprecated and removed in future versions
  • prediction: this field will be deprecated and removed in future versions
  • annotation: this field will be used as the "final/real annotation" (annotation agreement). It may get a better name in future versions.
  • explanation: (only for text classification) computed only when one single agent is defined. It will be deprecated and removed in future versions. The explanation must be defined at the prediction level.
  • token classification metrics: some metrics are defined for annotations and predictions. It may not make sense to build metrics for every agent, but these fields will be heavily affected by the new data model.
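The single-agent rule in the list above could be sketched as follows. This is an assumption-laden illustration, not the actual implementation: `legacy_fields` is a hypothetical helper, `None` is used here to mean "not computed", and the `predicted` field is left out because the issue does not say how it would be derived.

```python
from typing import Any, Dict


def legacy_fields(record: Dict[str, Any]) -> Dict[str, Any]:
    """Compute legacy fields, keeping old behavior only for a single agent."""
    predictions = record.get("predictions", {})
    annotations = record.get("annotations", {})

    fields: Dict[str, Any] = {
        # These two always list every agent present on the record.
        "predicted_by": sorted(predictions),
        "annotated_by": sorted(annotations),
        # These are computed only when exactly one agent is defined.
        "predicted_as": None,
        "annotated_as": None,
        "scores": None,
    }
    if len(predictions) == 1:
        (entry,) = predictions.values()
        fields["predicted_as"] = list(entry.get("labels", []))
        fields["scores"] = list(entry.get("score", []))
    if len(annotations) == 1:
        (entry,) = annotations.values()
        fields["annotated_as"] = list(entry.get("labels", []))
    return fields


single = legacy_fields(
    {"predictions": {"agent-one": {"labels": ["A"], "score": [0.3]}}}
)
multi = legacy_fields(
    {
        "predictions": {
            "agent-one": {"labels": ["A"], "score": [0.3]},
            "agent-two": {"labels": ["B"], "score": [0.8]},
        }
    }
)
```

With one agent, `single` carries the same values the old fields would have had. With two agents, `multi` only reports `predicted_by` and leaves the single-valued fields unset.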

References

See recognai/rubrix-roadmap#59

@frascuchon frascuchon added the type: enhancement Indicates new feature requests label Jun 27, 2022
@frascuchon frascuchon added Manual Labeling type: community request Indicates a feature requested by someone outside of the Argilla organization and removed type: enhancement Indicates new feature requests labels Jul 12, 2022
@frascuchon frascuchon transferred this issue from another repository Jul 21, 2022
@frascuchon frascuchon added this to the v0.18.0 milestone Sep 13, 2022
@frascuchon
Member Author

There are some tasks to finish before closing this issue:

  • Allow logging records with several annotations/predictions
  • Handle multiple annotations from the UI (view, select, remove, change, ...)
  • Adapt related filters (backend and UI)
  • Adapt the definition of prediction ok/ko when multiple values can be present.

@cceyda
Contributor

cceyda commented Mar 8, 2023

Would this also solve the issue in token classification where searching for a 'word' with 'annotated_as' returns not only the results where that 'word' is annotated with the 'selected tag', but all results that contain both that word and the tag (on a different word)?

@nataliaElv nataliaElv removed this from the 2023 Q1 milestone Jul 4, 2023
@nataliaElv nataliaElv added type: popular request Indicates that several people outside of the Argilla organization are interested in this feature and removed Labeling labels Nov 23, 2023
@nataliaElv nataliaElv added the type: enhancement Indicates new feature requests label Nov 23, 2023
4 participants