partial Update: elasticsearch/solr vs vespa #4154

Closed
zhuxiang1981 opened this issue Nov 16, 2017 · 8 comments

@zhuxiang1981

Our application needs to partially update one field (a rank field) of 200 million documents daily, but Solr and Elasticsearch are very slow at this.

@jobergum
Member

Vespa supports partial updates of existing indexed documents. Updates are fastest for numeric fields defined with 'attribute'. See http://docs.vespa.ai/documentation/reference/document-json-update-format.html for the update JSON syntax.
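As a minimal sketch of what such an update could look like, the Python snippet below builds one partial-update operation in Vespa's JSON update format, assigning a new value to a single field. The namespace `mynamespace`, document type `product`, document id `doc-1`, and field name `rank` are hypothetical placeholders, not names from this issue.

```python
import json


def make_partial_update(namespace: str, doctype: str, doc_id: str,
                        field: str, value: float) -> dict:
    """Build a partial-update operation that assigns one field.

    All names passed in are illustrative; substitute the ones
    from your own application.
    """
    return {
        # Document id scheme: id:<namespace>:<document-type>::<user-specified id>
        "update": f"id:{namespace}:{doctype}::{doc_id}",
        # Only the listed field is touched; other fields stay as they are.
        "fields": {field: {"assign": value}},
    }


op = make_partial_update("mynamespace", "product", "doc-1", "rank", 0.87)
print(json.dumps(op))
```

Only the fields named under `fields` are part of the operation, which is what keeps the update partial.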

@bratseth
Member

200M a day is about 2.3k per second. That should work fine for any kind of field, even on a single node.
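The per-second figure follows from spreading the daily volume evenly over 24 hours. A small sketch of that arithmetic, assuming a uniform update rate (real traffic may be burstier):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def required_rate(updates_per_day: int) -> float:
    """Average updates per second needed to finish within one day."""
    return updates_per_day / SECONDS_PER_DAY


rate = required_rate(200_000_000)
print(f"{rate:.0f} updates/sec")  # prints "2315 updates/sec"
```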

@zhuxiang1981
Author

Does Vespa's partial update reindex only the updated fields? Elasticsearch reindexes all the fields.

@jobergum
Member

Just the fields that you want to update. That is why we call it a partial update. Numeric fields like byte, int, float, etc. are faster than string fields. Agree with @bratseth, 200M updates a day should be no match for even a single-core machine.

@ddorian

ddorian commented Nov 18, 2017

@jobergum
Member

jobergum commented Jan 5, 2018

Any further questions on this topic, @zhuxiang1981? Thanks.

@vandit-thakkar

vandit-thakkar commented May 22, 2018

We recently saw 16k updates/sec succeed in one of our experiments on a 3-node cluster, although all were integer updates. That is good enough for now. We want to reach 100k updates/sec, which we plan to achieve by scaling horizontally. However, update throughput dropped sharply (to 4k/sec) when we ran a benchmark at the same time and hit the system with lots of queries. Any suggestions?
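For a rough sense of the cluster sizes these numbers imply, here is a back-of-the-envelope sketch. It assumes update throughput scales linearly with node count, which is an assumption made for illustration and not something measured in this thread.

```python
import math


def nodes_needed(target_rate: int, measured_rate: int, measured_nodes: int) -> int:
    """Estimate node count for a target rate, assuming linear scaling.

    Linear scaling is an assumption; real clusters may scale
    better or worse depending on workload and configuration.
    """
    return math.ceil(target_rate * measured_nodes / measured_rate)


# Updates only: 16k/sec measured on 3 nodes.
print(nodes_needed(100_000, 16_000, 3))  # prints 19

# Updates with concurrent query load: 4k/sec measured on 3 nodes.
print(nodes_needed(100_000, 4_000, 3))   # prints 75
```

The gap between the two estimates shows why the drop under query load matters more for capacity planning than the update-only figure.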

@baldersheim
Member

baldersheim commented May 23, 2018 via email


6 participants