fix: benchmark dump test #2307

Merged: 1 commit, Apr 14, 2021

@cristianmtr cristianmtr commented Apr 13, 2021

Fixes the dump request not being propagated to the control message that is sent.

Baseline:

### baseline - query time with 1 on 100000 docs ...	         Client@86294[S]:connected to the gateway at 0.0.0.0:55803!
### baseline - query time with 1 on 100000 docs takes 1 second (1.57s)
### baseline - indexing: 100000 docs ...	         Client@86294[S]:connected to the gateway at 0.0.0.0:37613!
### baseline - indexing: 100000 docs takes 9 seconds (9.75s)

Dump / Reload

### indexing 100000 docs takes 17 seconds (17.56s)
### dumping 100000 docs takes 3 seconds (3.27s)
### dump path size: 112.75556 MBs
### reloading 100000 takes 2 seconds (2.50s)

Indexing is slower for DBMSBinaryPb than for the baseline BinaryPb (17.56s vs. 9.75s for 100000 docs) because we need to pickle both the vector and the metadata before storing them in the underlying KV data structure that BinaryPb uses:

    def add(
        self, ids: List[str], vecs: List[np.array], metas: List[bytes], *args, **kwargs
    ):
        """Add to the DBMS Indexer, both vectors and metadata

        :param ids: the ids of the documents
        :param vecs: the vectors
        :param metas: the metadata, in binary format
        :param args: not used
        :param kwargs: not used
        """
        if not any(ids):
            return

        vecs_metas = [pickle.dumps((vec, meta)) for vec, meta in zip(vecs, metas)]
        self._add(ids, vecs_metas)
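
As a rough illustration of the extra work described above, the sketch below packs a vector and its metadata into one pickled blob, the way `add` does, and unpacks it again. The helper names `pack_entry` and `unpack_entry`, the 128-element vector, and the sample metadata bytes are assumptions made for illustration; they are not part of the Jina codebase. A stdlib `array.array` stands in for the NumPy vector so the sketch has no third-party dependency.

```python
import pickle
import random
from array import array
from typing import Tuple


def pack_entry(vec: array, meta: bytes) -> bytes:
    """Serialize a (vector, metadata) pair into one blob (hypothetical helper)."""
    return pickle.dumps((vec, meta))


def unpack_entry(blob: bytes) -> Tuple[array, bytes]:
    """Inverse of pack_entry (hypothetical helper)."""
    vec, meta = pickle.loads(blob)
    return vec, meta


# Assumed sample data: the real vector size and metadata are not stated in this PR.
vec = array("f", [random.random() for _ in range(128)])
meta = b"serialized document metadata"

blob = pack_entry(vec, meta)
restored_vec, restored_meta = unpack_entry(blob)

# Pickle adds framing and type information on top of the raw bytes,
# so the stored blob is larger than the vector plus the metadata.
raw_size = vec.itemsize * len(vec) + len(meta)
overhead = len(blob) - raw_size
print(f"raw payload: {raw_size} bytes, pickled: {len(blob)} bytes, overhead: {overhead} bytes")
```

Each document pays this serialization cost once at index time and once more whenever the entry is read back, which is consistent with the slower indexing reported above.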

@cristianmtr cristianmtr requested a review from a team as a code owner April 13, 2021 15:26
@jina-bot jina-bot added size/S, area/core, area/testing, component/type labels Apr 13, 2021
@cristianmtr cristianmtr force-pushed the fix-test-dump-reload-benchmark branch from dced7ad to 1a92d82 Compare April 13, 2021 15:29
codecov bot commented Apr 13, 2021

Codecov Report

Merging #2307 (1a92d82) into master (82d12e1) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #2307   +/-   ##
=======================================
  Coverage   90.83%   90.83%           
=======================================
  Files         222      222           
  Lines       11740    11740           
=======================================
  Hits        10664    10664           
  Misses       1076     1076           
| Flag | Coverage Δ |
|---|---|
| daemon | 51.22% <100.00%> (ø) |
| jina | 91.00% <100.00%> (ø) |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| jina/types/message/__init__.py | 88.20% <100.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 82d12e1...1a92d82.

@maximilianwerk maximilianwerk left a comment

LGTM

@github-actions

Latency summary

Current PR yields:

  • 😶 index QPS at 1127, delta to last 3 avg.: -3%
  • 😶 query QPS at 19, delta to last 3 avg.: -5%

Breakdown

| Version | Index QPS | Query QPS |
|---|---|---|
| current | 1127 | 19 |
| 1.1.5 | 1164 | 20 |
| 1.1.4 | 1172 | 20 |
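
The percentages in the summary can be approximated from the breakdown values. The sketch below assumes the delta is the relative difference between the current QPS and the mean of the earlier versions; the function name `delta_percent` is hypothetical, and only two earlier versions are listed here although the summary averages over the last three, so the results are approximate.

```python
from statistics import mean
from typing import Sequence


def delta_percent(current: float, previous: Sequence[float]) -> float:
    """Relative change of `current` versus the mean of `previous`, in percent."""
    baseline = mean(previous)
    return 100 * (current - baseline) / baseline


# Values copied from the breakdown (versions 1.1.5 and 1.1.4).
index_delta = delta_percent(1127, [1164, 1172])
query_delta = delta_percent(19, [20, 20])

print(f"index QPS delta: {index_delta:.1f}%")
print(f"query QPS delta: {query_delta:.1f}%")
```

Under these assumptions both deltas land close to the reported -3% and -5%.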

Backed by latency-tracking. Further commits will update this comment.

@cristianmtr cristianmtr merged commit 03b6f21 into master Apr 14, 2021
@cristianmtr cristianmtr deleted the fix-test-dump-reload-benchmark branch April 14, 2021 06:47