
memory leak problem #677

Closed
vsop-479 opened this issue Oct 28, 2019 · 20 comments

@vsop-479
Contributor

vsop-479 commented Oct 28, 2019

Hi, I'm a Rust rookie. My program suffers from a memory leak: it gets killed by the kernel after running for a long time.
It gets data from Redis and adds it to tantivy:

fn main() {
    let mut index_writer = index.writer_with_num_threads(num_threads, buffer).unwrap();
    index_writer.set_merge_policy(merge_policy);
    loop {
        let datas = RedisService::get_data();
        if datas.is_empty() {
            // sleep
        }
        for data in datas {
            let mut doc = Document::new();
            let json_value = json::parse(&data).unwrap();
            // doc.add_text / doc.add_u64 for each field in json_value
            index_writer.add_document(doc);
        }
        index_writer.commit();
    }
}
@petr-tik
Contributor

Hey, since you say you are less experienced with rust, it would be good to understand a bit more about your problem.

How long is "a long time"?
Are you sure it's killed by the kernel? Specifically OOM or something else?
How much data is in your RedisService?

Also, I don't recommend committing inside the loop like that; it will slow down your indexing pipeline and lead to many small segments.

@fulmicoton
Collaborator

Your polling loop looks like it batches, but it does not.
I suspect you commit almost one doc at a time.

In that case you create a lot of tiny segments. Tantivy is a bit dumb and does not know how to wait for merging threads, so you end up with an ever-growing number of merging threads, and this is the source of your OOM.

Can you batch your commits and see if your problem is solved?
If you need something quick and dirty, just sleep 1s within your loop to prevent two commits from happening within a one-second interval.
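For illustration, a minimal sketch of that quick-and-dirty throttle. RedisService::get_data and build_document are hypothetical stand-ins for the Redis polling and JSON-to-Document conversion in the snippet above; only the timing logic is the point:

use std::time::{Duration, Instant};

fn index_loop(index_writer: &mut tantivy::IndexWriter) -> tantivy::Result<()> {
    let min_commit_interval = Duration::from_secs(1);
    let mut last_commit = Instant::now();
    loop {
        for data in RedisService::get_data() {
            // build_document: hypothetical JSON -> Document conversion
            let doc = build_document(&data);
            index_writer.add_document(doc);
        }
        // Commit at most once per interval, so a near-empty poll cycle
        // does not produce its own tiny segment.
        if last_commit.elapsed() >= min_commit_interval {
            index_writer.commit()?;
            last_commit = Instant::now();
        }
    }
}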

@vsop-479
Contributor Author

@petr-tik @fulmicoton
My program writes to 20 indices concurrently, distributed across 10 disks (2 indices per disk).

After the program has been running for several days, each index has about 250GB of data (about 5TB in total), and the process holds 180GB of memory in RES (SHR 300MB).

Yes, it gets killed by the oom-killer when the system's memory is exhausted.

@fulmicoton
Maybe you are right, because I found 270 merging threads in the program.
But I don't understand why my program commits almost one doc at a time, or how to batch commits.

Actually, I use a counter to control batching (if it actually does):
loop {
    for data in datas {
        index_writer.add_document(doc);
    }
    i = i + 1;
    if i == commit_interval {
        index_writer.commit();
        i = 0;
    }
}

@fulmicoton
Collaborator

fulmicoton commented Oct 29, 2019

@vsop-479 Sweet. It seems like you know how to monitor your program, so we should not have too much trouble solving your issue.

It sounds like your index is large. The larger your commits, the better your indexing throughput.
Don't be shy about making commits of 100K or 1M docs. The tricky part is that if a failure happens, you need a way to resume from the last successful commit. The most reliable way to do that is to add a payload to your commit. The payload then works as a marker on your indexing queue.

Also, if you can share what kind of data you are trying to index, please let us know! It is always awesome to hear about our users.
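A minimal sketch of the payload-as-marker idea, assuming tantivy's prepare_commit / set_payload and Index::load_metas APIs (exact names may differ across versions) and a hypothetical queue offset as the marker:

// Store the position of the last consumed queue item with each commit,
// so indexing can resume from there after a crash.
fn commit_with_marker(
    index_writer: &mut tantivy::IndexWriter,
    queue_offset: u64,
) -> tantivy::Result<()> {
    let mut prepared = index_writer.prepare_commit()?;
    prepared.set_payload(&queue_offset.to_string());
    prepared.commit()?;
    Ok(())
}

// On startup, read back the payload of the last successful commit.
fn last_indexed_offset(index: &tantivy::Index) -> tantivy::Result<Option<u64>> {
    let metas = index.load_metas()?;
    Ok(metas.payload.and_then(|p| p.parse::<u64>().ok()))
}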

@vsop-479
Contributor Author

@fulmicoton
Actually, the program commits 200k docs at a time for each index.
PS: the index writer's num_threads is 4 and overall_heap_size_in_bytes is 500_000_000.

Considering there are 20 indices, maybe the number of merging threads (270) is normal?
I will pay more attention to the merging threads.

The data I am trying to index is TCP flow data, which looks like:
{
    "status": "tmo",
    "dst_mac": "11:11:11:11:11:11",
    "sip": "1.1.1.1",
    "downlink_length": 4177,
    "down_payload": "485",
    "proto": "openvpn",
    "dtime": "2019-10-29 15:55:55.696",
    "client_os": "openbsd3",
    "up_payload": "504",
    "server_os": "linux",
    "summary": "165;3;1460;1380",
    "stime": "2019-10-29 15:37:35.585",
    "uplink_length": 882,
    "dport": 11111,
    "sport": 22222,
    "dip": "2.2.2.2",
    "src_mac": "22:22:22:22:22:22"
}

@fulmicoton
Collaborator

So you have 20 indices in the same process? each with their own index writer?

@vsop-479
Contributor Author

So you have 20 indices in the same process? each with their own index writer?

yes.

@fulmicoton
Collaborator

Ok... How many threads do you have on your CPU?
If IO is your bottleneck, or if you don't have 40 threads, chances are that lowering the number of threads per IndexWriter is a good idea.

Setting it to 1 per index writer, for instance, will ensure that the segments you create are larger for the same amount of memory.
This should reduce the number of merging threads.

If this does not solve your problem...

Right now, no mechanism bounds the number of merging threads. If indexing outpaces merging, which happens as the amount of data you have gets larger, you end up with more merging threads running at the same time.

As your segments become larger and larger, there are fewer and fewer good reasons to actually merge them. One thing that would be reasonable to do in your use case is to write your own MergePolicy to avoid merging very large segments. It will save you CPU and IO.
For instance, you could stop merging segments that are larger than 4GB, or something like that.

This is assuming that you do not have any deletes in your index.
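As a rough sketch of the first suggestion (one indexing thread per writer), reusing the 500 MB heap budget mentioned earlier in the thread:

// One indexing thread per IndexWriter: with many writers in one process,
// this keeps the total thread count down and yields larger segments for
// the same memory budget, which in turn means fewer merges.
fn open_writer(index: &tantivy::Index) -> tantivy::Result<tantivy::IndexWriter> {
    index.writer_with_num_threads(1, 500_000_000)
}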

@vsop-479
Contributor Author

vsop-479 commented Oct 29, 2019

My CPU has 24 processors.
IO is not the bottleneck until the program occupies too much memory.

I think you are right about the merge policy, because there are many big segments in the index path;
maybe merging these big segments causes the program to hold too much memory.
(image attached)

@fulmicoton
Collaborator

It's fairly fun and easy to write your own merge policy. Let us know if you have trouble doing it.

@vsop-479
Contributor Author

@fulmicoton
I am trying to write my own merge policy; my plan is to skip big segments, like Lucene does.
But I can't find the segment's size info in SegmentMeta; there is only doc/deleted-doc count info.

@fulmicoton
Collaborator

@vsop-479 Can you rely on the number of documents?

@vsop-479
Contributor Author

@fulmicoton
If there is no way to get the segment's size, I would use the number of documents to skip big segments.
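A minimal sketch of that approach, assuming tantivy's MergePolicy trait with compute_merge_candidates and delegating to the default LogMergePolicy for everything below a doc-count threshold (module paths and signatures may vary across versions):

use tantivy::merge_policy::{LogMergePolicy, MergeCandidate, MergePolicy};
use tantivy::SegmentMeta;

// Segments with at least `max_docs` documents are never considered for
// merging again; smaller segments are handled by the default policy.
#[derive(Debug)]
struct SkipBigSegments {
    max_docs: u32,
    inner: LogMergePolicy,
}

impl MergePolicy for SkipBigSegments {
    fn compute_merge_candidates(&self, segments: &[SegmentMeta]) -> Vec<MergeCandidate> {
        let small: Vec<SegmentMeta> = segments
            .iter()
            .filter(|meta| meta.num_docs() < self.max_docs)
            .cloned()
            .collect();
        self.inner.compute_merge_candidates(&small)
    }
}

// Usage (the threshold is a hypothetical value):
// index_writer.set_merge_policy(Box::new(SkipBigSegments {
//     max_docs: 10_000_000,
//     inner: LogMergePolicy::default(),
// }));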

@dearsxx0918

Hi, I'm a Rust rookie. My program suffers from a memory leak: it gets killed by the kernel after running for a long time.
It gets data from Redis and adds it to tantivy: […]

I'm also facing this issue; I haven't solved it yet.

@fulmicoton
Collaborator

@dearsxx0918 are you sure you have the same issue?

On #666, the problem you described was very different. If you share your code, we can maybe help.

@fulmicoton
Collaborator

@vsop-479 I am afraid there is no way to get the size of the segments in MB.

@dearsxx0918

@dearsxx0918 are you sure you have the same issue?

On #666, the problem you described was very different. If you share your code, we can maybe help.

I can't give you the source code, but I can give you a valgrind log(massif log).

@fulmicoton
Collaborator

I am not interested in the valgrind log.

  • Did you manage to drop your IndexWriter as suggested in the other thread?
  • Can you confirm you are not committing after every single doc?
  • Can you check the number of merging threads you have?

@vsop-479
Contributor Author

@fulmicoton
Thank you for helping me analyze the problem I ran into.

Maybe tantivy should control the size of the segments chosen for merging, and the number of threads on which merge tasks execute.

@dearsxx0918

I am not interested in the valgrind log.

  • Did you manage to drop your IndexWriter as suggested in the other thread?
  • Can you confirm you are not committing after every single doc?
  • Can you check the number of merging threads you have?

My current workaround is to reload the Index on every commit, and there is no memory issue now.
I use 2 threads to add & search.
