partial Update: elasticsearch/solr vs vespa #4154
Vespa supports partial updates of existing indexed documents; updates are fastest for fields defined with 'attribute' and of a numeric type. See http://docs.vespa.ai/documentation/reference/document-json-update-format.html for the update JSON syntax. |
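For reference, a partial update in the linked JSON format looks roughly like this (a sketch; the `music` document type, the document id, and the `rank` field are hypothetical):

```json
{
    "update": "id:music:music::123",
    "fields": {
        "rank": {
            "assign": 10
        }
    }
}
```

Only the fields listed under `fields` are touched; the rest of the document is left as-is.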
200M a day is about 2k per second. That should work fine for any kind of field even on a single node. |
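The back-of-the-envelope arithmetic behind that number:

```python
# 200M updates spread evenly over a day works out to roughly 2.3k/sec
updates_per_day = 200_000_000
seconds_per_day = 24 * 60 * 60  # 86,400
avg_rate = updates_per_day / seconds_per_day
print(round(avg_rate))  # ~2315 updates/sec on average
```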
Does Vespa's partial update only reindex the updated fields? ES reindexes all the fields. |
Just the fields that you want to update; that is why we call it a partial update. Numeric fields like byte, int, float etc. are faster than string. Agree with @bratseth, 200M updates a day should be no match for even a single-core machine. |
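To take the fastest path described above, the field being updated should be a numeric attribute in the search definition; a minimal sketch (the `product` document type and `rank` field are made up for illustration):

```
search product {
    document product {
        field rank type int {
            indexing: attribute | summary
        }
    }
}
```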
@zhuxiang1981 Solr has in-place updates, but with some caveats (non-indexed etc.): https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-In-PlaceUpdates |
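For comparison, a Solr atomic-update payload posted to the collection's `/update` handler looks roughly like this (the `id` and `rank` values are hypothetical); per the linked guide, a true in-place update additionally requires the target field to be single-valued, non-indexed, non-stored, and docValues-enabled:

```json
[
    { "id": "doc123", "rank": { "set": 10 } }
]
```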
Any further questions on this topic, @zhuxiang1981? Thanks |
We recently saw that 16k updates/sec were successful in one of our experiments with a three-node cluster, although all were integer updates. That is good enough for now. We want to achieve 100k updates/sec, which we plan to reach by scaling horizontally. However, we found that update throughput dropped sharply (to 4k/sec) when we ran the feed benchmark while simultaneously hitting the system with lots of queries. Any suggestions? |
In order to tell whether your numbers make sense, I need to know the machine config you are using. Your search definition and services file would also be helpful. There are some tricks that can be applied to push it even further up in some cases.
Feed performance will go down during query load; how much depends on the number of threads on your machine. As it is a search engine, it is designed to favour queries over feed. It can be tuned, but that has not been done very often, so it must be experimented with in each case. I also do not remember how well documented it is.
|
Our application needs to partially update a field (a rank field) of 200 million documents daily, but Solr and ES are very slow at this.