-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: create DocumentArray index map lazily #3944
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3944 +/- ##
==========================================
+ Coverage 89.04% 89.08% +0.04%
==========================================
Files 180 180
Lines 12629 12642 +13
==========================================
+ Hits 11245 11262 +17
+ Misses 1384 1380 -4
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Latency summaryCurrent PR yields:
Breakdown
Backed by latency-tracking. Further commits will update this comment. |
jina/types/arrays/document.py
Outdated
self._id_to_index = None | ||
|
||
@property | ||
def id_to_index(self) -> Dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make it provate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's used in jina/types/document/graph.py so it can't be private (IDE used to show a warning)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets still make it private ignore graphDoc for now. graphDoc has been removed from our docs and will highly likely to be deprecated in 3.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! i also think we can safely ignore graph document at the moment
4a2918d
to
5bf94c9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a test in the types than make sure that a DocumentArray has _id_to_index
to None at the beginning?
added |
assert da._id_to_index is None | ||
|
||
# build index map | ||
assert len(da._index_map.keys()) == 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change the test to show that the _index_map
is not None
after accessing by id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Great job!
closes https://github.com/jina-ai/internal-tasks/issues/279
results:
on master:
on feature branch: