[DEVX-828] Added image summarization in multimodal pipeline #31
resp = self.model.predict(img_inputs)

new_elements = []
for i, element in enumerate(resp.outputs):
    summary = ""
    if image_elements[i].text:
Are we sure that the index of each image in image_elements is the same as its index in resp.outputs?
image_data = meta.pop('image_base64', None)
id = meta.get('input_id', None)
QQ: why are we adding this new ID field, and will it be used?
Yes, this is to identify the corresponding summary that was generated for the images in the PDF!
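To make the ID-based matching concrete, here is a minimal sketch of pairing each image element with its summary by input ID instead of by list position. The `ImageElement` and `ModelOutput` classes and the `pair_by_input_id` helper are hypothetical stand-ins, not the types used in this PR or in the Clarifai SDK; the sketch only assumes that both sides carry the same `input_id`.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ImageElement:
    """Hypothetical stand-in for an image element extracted from the PDF."""
    input_id: str
    text: str = ""


@dataclass
class ModelOutput:
    """Hypothetical stand-in for one entry of the prediction outputs."""
    input_id: str
    summary: str


def pair_by_input_id(
    image_elements: List[ImageElement],
    outputs: List[ModelOutput],
) -> List[Tuple[ImageElement, str]]:
    """Pair each element with the summary whose input_id matches.

    Elements without a matching output get an empty summary, so the
    result does not depend on the order in which outputs are returned.
    """
    summary_by_id = {out.input_id: out.summary for out in outputs}
    return [
        (element, summary_by_id.get(element.input_id, ""))
        for element in image_elements
    ]


if __name__ == "__main__":
    elements = [ImageElement("img-1"), ImageElement("img-2")]
    # Outputs deliberately arrive in a different order than the inputs.
    outputs = [ModelOutput("img-2", "second"), ModelOutput("img-1", "first")]
    for element, summary in pair_by_input_id(elements, outputs):
        print(element.input_id, summary)
```

Looking summaries up by ID would also sidestep the question of whether the two lists share the same ordering.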
Great addition! Left some comments!
new_elements = []
for i, element in enumerate(resp.outputs):
    summary = ""
    if image_elements[i].text:
I believe image elements will not have text, so why this check here?
I observed that some image elements had text too; it can be seen in the output of the 9th cell in this notebook.
    """ Summarizes image elements. """

    def __init__(self,
                 pat,
PAT can be optional. If it is not passed, the Clarifai SDK itself will check the environment and return an error if it is not set there.
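A rough sketch of the fallback behaviour described here, for illustration only: `resolve_pat` is a hypothetical helper, and the `CLARIFAI_PAT` variable name is an assumption about what the SDK reads. In the PR itself it would likely be enough to default `pat` to `None` and pass it through, since the SDK is said to perform this check on its own.

```python
import os
from typing import Optional


def resolve_pat(pat: Optional[str] = None) -> str:
    """Return the explicit PAT if given, else fall back to the environment.

    Hypothetical helper; the CLARIFAI_PAT name is an assumption.
    """
    token = pat or os.environ.get("CLARIFAI_PAT")
    if not token:
        # Mirrors the "error if it is not set" behaviour described above.
        raise ValueError("No PAT passed and CLARIFAI_PAT is not set")
    return token
```

With a signature like this, callers who already export the token can omit the argument, while explicit values still take precedence.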
👍
https://clarifai.atlassian.net/browse/DEVX-828