-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: cache data field #1878
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1878 +/- ##
==========================================
+ Coverage 84.32% 86.28% +1.96%
==========================================
Files 147 147
Lines 7016 7014 -2
==========================================
+ Hits 5916 6052 +136
+ Misses 1100 962 -138
Continue to review full report at Codecov.
|
Latency summaryCurrent PR yields:
Breakdown
Backed by latency-tracking. Further commits will update this comment. |
jina/executors/indexers/cache.py
Outdated
@@ -62,15 +62,12 @@ def __init__(self, index_filename: Optional[str] = None, field: Optional[str] = | |||
if self.field not in self.supported_fields: | |||
raise ValueError(f"Field '{self.field}' not in supported list of {self.supported_fields}") | |||
|
|||
def add(self, doc_id: str, *args, **kwargs): | |||
"""Add a document to the cache depending on `self.field`. | |||
def add(self, doc_id: str, data: str, *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should break the API of the inherited method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inherited method from the BaseKVIndexer
is:
def add(self, keys: Iterable[str], values: Iterable[bytes], *args, **kwargs):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we have the same interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed with @lusloher, adapting the interface should be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably I could help to update the docstrings when you finished this PR? 😄
def get_query_handler(self): | ||
def get_query_handler(self) -> 'ReadHandler': | ||
"""Get read file handler. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One-line Docstrings should be in one line. https://docs.jina.ai/chapters/docstring/docstring.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Yongxuanzhang nice catch! The pr is merged already, but I will have that in mind next time.
""" | ||
if self.with_serialization: | ||
self.exec_fn(req_doc.id, req_doc.SerializeToString(), **{DATA_FIELD: data}) | ||
self.exec_fn([req_doc.id], req_doc.SerializeToString(), [value]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codecov said this line is not covered by tests. Not sure if Codecov is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. For now, we don't have a cache using this code. But you are right it should be tested in the future.
No description provided.