refactor: prepare changes to have batching for every executor #2110

Merged
JoanFM merged 23 commits into master from test-get-content on Mar 8, 2021

@JoanFM JoanFM commented Mar 3, 2021

Changes introduced
The changes introduced include:

  • Changes to the way attributes are accessed in Document
  • Ability to extract the full set of attributes from DocumentSet (not only embeddings), so that it can be used for batching everywhere
  • Extraction of multiple contents from documents into separate batches (for example, extracting text and id from a DocumentSet yields one batch of text and a separate batch of id)
  • Better testing and flat results for batching
  • A first integration test showing how different executors should implement batching
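
The idea of pulling several attributes out of a set of documents as separate, aligned batches can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `Doc`, `extract_fields`, and `batched` are hypothetical names, and `Doc` is a plain dataclass standing in for Jina's Document.

```python
from dataclasses import dataclass
from typing import Any, Iterator, List, Sequence, Tuple


@dataclass
class Doc:
    """Hypothetical stand-in for a document; not Jina's Document class."""
    id: str
    text: str


def extract_fields(docs: Sequence[Doc], fields: Sequence[str]) -> List[List[Any]]:
    """Return one list per requested field, each aligned with `docs`."""
    return [[getattr(doc, name) for doc in docs] for name in fields]


def batched(
    columns: Sequence[Sequence[Any]], batch_size: int
) -> Iterator[Tuple[List[Any], ...]]:
    """Slice every column at the same boundaries so the batches stay aligned."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    total = len(columns[0]) if columns else 0
    for start in range(0, total, batch_size):
        yield tuple(list(col[start:start + batch_size]) for col in columns)


docs = [Doc(id=str(i), text=f"text {i}") for i in range(5)]

# One column of text and one column of id, in the order requested.
texts, ids = extract_fields(docs, ["text", "id"])

# Each batch is a (text_batch, id_batch) pair covering the same documents.
batches = list(batched([texts, ids], batch_size=2))
```

Because both columns are sliced at the same offsets, a function that receives a batch of text and a batch of id can rely on position `i` in each referring to the same document.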

@JoanFM JoanFM requested a review from a team as a code owner March 3, 2021 10:47
@jina-bot jina-bot added size/S area/testing This issue/PR affects testing labels Mar 3, 2021
codecov bot commented Mar 3, 2021

Codecov Report

Merging #2110 (ea2b21e) into master (caae3f6) will increase coverage by 0.01%.
The diff coverage is 96.29%.

@@            Coverage Diff             @@
##           master    #2110      +/-   ##
==========================================
+ Coverage   89.78%   89.80%   +0.01%     
==========================================
  Files         211      211              
  Lines       11153    11209      +56     
==========================================
+ Hits        10014    10066      +52     
- Misses       1139     1143       +4     
Flag     Coverage Δ
daemon   50.46% <19.75%> (-0.21%) ⬇️
jina     90.27% <96.29%> (+0.01%) ⬆️

Impacted Files                    Coverage Δ
jina/types/sets/document.py       95.27% <90.90%> (-1.22%) ⬇️
jina/executors/decorators.py      91.70% <97.14%> (+0.69%) ⬆️
jina/types/document/__init__.py   91.89% <100.00%> (+0.31%) ⬆️
jina/clients/base.py              80.83% <0.00%> (-2.50%) ⬇️
jina/peapods/zmq/__init__.py      79.87% <0.00%> (-2.14%) ⬇️
jina/flow/base.py                 92.58% <0.00%> (+2.30%) ⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update caae3f6...d1ab9ec.

github-actions bot commented Mar 3, 2021

Latency summary

Current PR yields:

  • 😶 index QPS at 1004, delta to last 3 avg.: +4%
  • 😶 query QPS at 14, delta to last 3 avg.: -2%

Breakdown

Version   Index QPS   Query QPS
current   1004        14
1.0.8     963         14
1.0.7     963         14

Backed by latency-tracking. Further commits will update this comment.

@jina-bot jina-bot added size/M area/core This issue/PR affects the core codebase component/type and removed size/S labels Mar 3, 2021
@JoanFM JoanFM changed the title from "test: test _extract content from docset" to "refactor: prepare changes to have batching for every executor" Mar 3, 2021
@JoanFM JoanFM force-pushed the test-get-content branch 2 times, most recently from 76e45cb to d8d05fc Compare March 3, 2021 17:21
@jina-bot jina-bot added size/L and removed size/M labels Mar 3, 2021
@jina-bot jina-bot added size/XL and removed size/L labels Mar 5, 2021
Resolved review threads: jina/executors/decorators.py (3), jina/types/sets/document.py (4)
JoanFM and others added 4 commits March 8, 2021 10:57
Co-authored-by: Nan Wang <nan.wang@jina.ai>
@JoanFM JoanFM requested a review from nan-wang March 8, 2021 10:08
@nan-wang nan-wang left a comment

minor comments

Co-authored-by: Nan Wang <nan.wang@jina.ai>
@nan-wang nan-wang left a comment

LGTM

@JoanFM JoanFM merged commit f0b6a44 into master Mar 8, 2021
@JoanFM JoanFM deleted the test-get-content branch March 8, 2021 11:39
Labels
area/core, area/testing, component/executor, component/type, executor/meta, size/XL

6 participants