This repository has been archived by the owner on Oct 13, 2021. It is now read-only.

Creating a persistent index only writes 3000-5000 documents per minute. Is this normal? #48

Closed
duomi opened this issue Apr 7, 2018 · 5 comments
Comments

duomi commented Apr 7, 2018

I'm using the Weibo search example, changed so that the data comes from my own database, but the write speed is very slow: only 3000-5000 documents can be written per minute. I'm already using goroutines, so I don't know where I went wrong. The code is shown below:

for i := 0; i < 100; i++ {
	go indexXwz(xwzs)
}

func indexXwz(xwzs <-chan Xwz) {
	for xwz := range xwzs {
		searcher.IndexDoc(xwz.Id, types.DocIndexData{
			Content: xwz.Name,
			Fields: XwzScoringFields{
				Timestamp: xwz.LatestDate,
				CountNum:  xwz.CountNum,
			},
		}, true)
	}
	searcher.Flush()
}
vcaesar (Member) commented Apr 7, 2018

First, searcher.Flush() only needs to be called once, after all documents have been indexed. Second, you can configure the shard numbers when initializing the engine:

searcher.Init(types.EngineOpts{
    // Using: using,
    StorageShards: storageShards,
    NumShards: numShards,
})

to control how many goroutines the engine uses internally.
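The snippet above, filled in with concrete values, might look like the sketch below; the shard counts are illustrative assumptions, not tuned recommendations:

```go
searcher.Init(types.EngineOpts{
	// Using: using,
	StorageShards: 8, // shards for the persistent storage layer
	NumShards:     8, // shards for the in-memory index
})
```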

@vcaesar vcaesar added this to the v0.20.0 milestone Apr 7, 2018
duomi (Author) commented Apr 8, 2018

@vcaesar Do you mean that my goroutines are not running? Could you explain more clearly?

vcaesar (Member) commented Apr 8, 2018

I mean that you can configure the number of goroutines the storage layer uses, to increase the indexing speed.

duomi (Author) commented Apr 8, 2018

I already use a loop to start 100 goroutines. Did I use them the wrong way?
What is the correct way? Can you show me, please? @vcaesar

karfield commented Apr 25, 2018

@Cliff2016 Use the engine's internal sharding instead of forking goroutines that each call IndexDoc; that is not "parallel processing". Calling an interface concurrently does not make the internal shards effective: at best you are just calling the interface frequently, and what's worse, you call flush when you are done. You need to think like the program. When an engine's indexing is genuinely slow, the cause is usually I/O, so don't flush unless you have to. The engine has an internal sharding mechanism; lean on that mechanism to improve efficiency.

@vcaesar vcaesar closed this as completed May 31, 2018
3 participants