Add attributes to API data classes #150
@cmwilhelm per @yoganandc suggestion, I'm now mapping what is the This should better fit the design of the annotation store. All previous considerations regarding attributes being typed are still valid. Let me know what you think!
I was hoping for something much simpler...
can we just have a

    from typing import Any, Dict, Optional

    from pydantic import BaseModel

    class Annotation(BaseModel):
        id: Optional[str]
        type: Optional[str]
        attributes: Dict[str, Any] = {}

    class BoxGroup(Annotation):
        ...

    class SpanGroup(Annotation):
        ...

and then in to_mmda()

    pydantic_annotation.attributes = dict(mmda_annotation.metadata)
The tricky part in this is that, in The reason for such decision was that nor
👍 a little more logic, but still fits the overall structure I'm proposing, right?
@yoganandc the complexity here I think is @soldni responding to my direct request to have some sort of document schema for however a given model is using metadata. If model authors will be dumping principal inference data into them, we need a way to specify what the shape of that data is. Otherwise no one is going to be able to make sense of it later.
explicitly removing `id`, `text`, and `type` is no longer required because they are automatically ignored.
works for me then. @cmwilhelm maybe you have more thoughts since you're working on the client PR?
This LGTM but I had one small suggestion.
We'll also need to bump version in setup.py again because we've claimed 0.0.43 :)
ai2_internal/api.py

    if not hasattr(inherit_cls, "__annotations__"):
        continue
    if "attributes" in inherit_cls.__annotations__:
        return inherit_cls.__annotations__["attributes"]

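I think we could do this via:

    @classmethod
    def get_metadata_cls(cls) -> Type[Attributes]:
        # pydantic v1: __fields__ maps field names to ModelField objects,
        # whose declared type is exposed as the `type_` attribute.
        return cls.__fields__["attributes"].type_

rather than using the low-level Python API and traversing the hierarchy.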
done!
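For readers without pydantic at hand, the same idea (ask the class for the declared type of `attributes` instead of walking the hierarchy by hand) can be sketched with the standard library alone. This is only an illustration under stated assumptions: plain classes stand in for the pydantic models, `typing.get_type_hints` stands in for the pydantic field registry, and `SpanGroupAttributes` is a hypothetical name, not something from this PR.

```python
from typing import Type, get_type_hints


class Attributes:
    """Stand-in for the base attributes schema (assumption, not the PR's class)."""


class SpanGroupAttributes(Attributes):
    """Hypothetical model-specific schema."""


class Annotation:
    attributes: Attributes

    @classmethod
    def get_metadata_cls(cls) -> Type[Attributes]:
        # get_type_hints merges annotations across the MRO, so a subclass
        # that does not redeclare `attributes` inherits the parent's type.
        return get_type_hints(cls)["attributes"]


class SpanGroup(Annotation):
    attributes: SpanGroupAttributes  # narrows the declared type


class BoxGroup(Annotation):
    pass  # inherits the base declaration
```

Here `SpanGroup.get_metadata_cls()` resolves to the narrowed class, while `BoxGroup.get_metadata_cls()` falls back to the base `Attributes`, with no explicit loop over parent classes.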
# Conflicts: # setup.py
This PR adds metadata to API data classes.
API data classes and mmda types differ in a few significant aspects:
- `id` and `type` (and `text` for `SpanGroup`) are stored in `metadata` for mmda types; in the APIs, they are part of the top-level attributes.
- `metadata` can store arbitrary content in the mmda types; in the data API, all attributes that are not explicitly declared are dropped.
- `metadata` entries are mapped to an `attributes` field to match how data is stored in the Annotation Store
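
The differences listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: dataclasses stand in for both the mmda types and the pydantic API classes so that it runs with the standard library only, and `MmdaSpanGroup`, `ApiSpanGroup`, the `confidence` attribute, and the helper names are all assumptions made for the example.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional

TOP_LEVEL = ("id", "type", "text")  # promoted out of metadata in the API


@dataclass
class MmdaSpanGroup:
    """Hypothetical stand-in for an mmda type: everything lives in metadata."""
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Attributes:
    """Hypothetical declared schema; undeclared metadata keys are dropped."""
    confidence: Optional[float] = None


@dataclass
class ApiSpanGroup:
    id: Optional[str] = None
    type: Optional[str] = None
    text: Optional[str] = None
    attributes: Attributes = field(default_factory=Attributes)


def from_mmda(span_group: MmdaSpanGroup) -> ApiSpanGroup:
    declared = {f.name for f in fields(Attributes)}
    top = {k: span_group.metadata.get(k) for k in TOP_LEVEL}
    # Keep only metadata entries that the attributes schema declares.
    attrs = {k: v for k, v in span_group.metadata.items() if k in declared}
    return ApiSpanGroup(attributes=Attributes(**attrs), **top)


def to_mmda(api: ApiSpanGroup) -> MmdaSpanGroup:
    # Fold top-level fields and declared attributes back into metadata.
    metadata = {k: getattr(api, k) for k in TOP_LEVEL if getattr(api, k) is not None}
    for f in fields(api.attributes):
        value = getattr(api.attributes, f.name)
        if value is not None:
            metadata[f.name] = value
    return MmdaSpanGroup(metadata=metadata)
```

Round-tripping a span group whose metadata holds an undeclared key (say `scratch`) returns metadata without that key, which is the "dropped unless explicitly declared" behaviour described in the second bullet.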