fix: sharding parallel one #1657
Conversation
Codecov Report
```
@@            Coverage Diff             @@
##           master    #1657      +/-   ##
===========================================
- Coverage   84.95%   69.18%   -15.78%
===========================================
  Files         133      132        -1
  Lines        6948     6831      -117
===========================================
- Hits         5903     4726     -1177
- Misses       1045     2105     +1060
```
Latency summary
Current PR yields:
Breakdown
Backed by latency-tracking. Further commits will update this comment.
I think this can be expected; pea_id is only used to define separate workspaces, and I think in executors pea_id is checked against -1.
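The check against -1 described above can be sketched as follows. This is an illustrative sketch only, not the actual Jina implementation: the function name `get_workspace` and the layout are assumptions made for this example.

```python
import os

def get_workspace(root: str, pea_id: int) -> str:
    """Hypothetical helper: map a pea_id to its workspace path.

    pea_id == -1 is treated as "no sharding" and maps to the root
    workspace; any other id gets its own numbered subfolder, so each
    shard writes to a separate workspace.
    """
    if pea_id == -1:
        return root
    return os.path.join(root, str(pea_id))
```

Under this assumption, `get_workspace("ws", -1)` returns the root workspace itself, while shard 0 and shard 1 each get their own subfolder.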
jina/peapods/pods/__init__.py
Outdated
```diff
@@ -114,10 +114,6 @@ def _parse_args(self, args: Namespace) -> Dict[str, Optional[Union[List[Namespac
             self.is_tail_router = True
             peas_args['tail'] = _copy_to_tail_args(args)
             peas_args['peas'] = _set_peas_args(args, peas_args.get('head', None), peas_args.get('tail', None))
-        else:
-            self.is_head_router = False
```
I think this logic needs to remain there; are you sure it needs removal?
Thanks, I don't think it is used, but for now I keep it there because it is out of scope for this PR.
@florian-hoenicke it would be good to describe what the expected value is: would it be 1 or 0?
Does this
Yes. If we have 2 shards, the indexes are stored at
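The two-shard layout mentioned above can be demonstrated with a small sketch. The helper name `make_shard_workspaces` and the root/subfolder naming are assumptions for this example, not the PR's actual code.

```python
import os
import tempfile

def make_shard_workspaces(root: str, shards: int) -> list:
    """Create one workspace subfolder per shard under root.

    With 2 shards, this yields root/0 and root/1, so each shard's
    index is stored in its own folder.
    """
    paths = []
    for shard_id in range(shards):
        shard_dir = os.path.join(root, str(shard_id))
        os.makedirs(shard_dir, exist_ok=True)
        paths.append(shard_dir)
    return paths
```

For example, calling it with `shards=2` on an empty temporary directory leaves exactly the folders `0` and `1` behind.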
I think the correct way is to only consider the
Add a unittest to check that current_workspace does not include any reference to pea_id?
I don't update the pea_id anymore, but change the way the workspace path is created. The test validates that the index file is created in the right folder:
There exists another test validating that the workspace is set up correctly when we have more than one shard.
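The kind of test described above could look like the following sketch. Everything here is hypothetical: `index_file_path` stands in for the PR's workspace-path logic, and the file name `index.bin` is invented for the example.

```python
import os
import tempfile
import unittest

def index_file_path(workspace: str, pea_id: int, name: str = "index.bin") -> str:
    # Hypothetical stand-in for the PR's logic: pea_id == -1 keeps the
    # index in the workspace root; any other id gets a numbered subfolder.
    sub = workspace if pea_id == -1 else os.path.join(workspace, str(pea_id))
    return os.path.join(sub, name)

class TestShardWorkspace(unittest.TestCase):
    def test_index_file_in_shard_folder(self):
        # A sharded pea's index file lands in its own numbered folder.
        with tempfile.TemporaryDirectory() as ws:
            self.assertEqual(index_file_path(ws, 1),
                             os.path.join(ws, "1", "index.bin"))

    def test_single_pea_uses_root(self):
        # pea_id == -1 means no shard subfolder is added.
        with tempfile.TemporaryDirectory() as ws:
            self.assertEqual(index_file_path(ws, -1),
                             os.path.join(ws, "index.bin"))

if __name__ == "__main__":
    unittest.main()
```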
The following case leads to `pea_id == -1`: