-
Notifications
You must be signed in to change notification settings - Fork 225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: allow modalities of multimodal docs to be accessed #425
Conversation
8c82cfb
to
64c2025
Compare
Codecov Report
@@ Coverage Diff @@
## main #425 +/- ##
==========================================
+ Coverage 84.45% 86.63% +2.18%
==========================================
Files 134 134
Lines 6486 6497 +11
==========================================
+ Hits 5478 5629 +151
+ Misses 1008 868 -140
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
29139f8
to
621f372
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I miss tests to validate and see how Documents
are changed from the Document
and not the DataClass
.
Saying:
How do I set d.header.text
? or d.header
? If Document
has header
as subdocument
mm_attr_da = self.get_multi_modal_attribute(attr) | ||
return mm_attr_da if len(mm_attr_da) > 1 else mm_attr_da[0] | ||
else: | ||
raise AttributeError(f'{self.__class__.__name__} has no attribute {attr}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should refer to the class here but rather to the object. An Python object can have an attribute without it being define in the class ( we can add attribute on the fly after object instantiation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you referring to the error message? i don't think that the object itself has a __name__
, could you write a code change suggestion showing what you would do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small comment
Please make sure to bump minor version before the release, this is a significant feature and worth a minor not patch release @JoanFM |
@hanxiao we know, we will work on the blog post and documentation before though |
This allows Documents that come from a multimodal dataclass to expose their multimodal attributes like any other 'native' attribute (check last line in the code snippet):
This also extends to use inside of an Executor:
Advanced usage example
This also handles list-types, nested dataclasses, and list-types of nested data classes:
Click to expand: Advanced example
This should solve the issue of multimodal features not being easily usable inside of Jina.
Still TODO:
make sure thatcontent
is always the desired return valuesupport custom gettersgetters are only called when performngMMDoc(d)
, not when doingd.attr
. so not relevant here