
memory leak problem #677

Closed
vsop-479 opened this issue Oct 28, 2019 · 20 comments

@vsop-479
Contributor

vsop-479 commented Oct 28, 2019

Hi, I'm a Rust rookie. My program suffers from a memory leak: it gets killed by the kernel after running for a long time.
It gets data from Redis and adds it to tantivy:

fn main() {
    let mut index_writer = index.writer_with_num_threads(num_threads, buffer).unwrap();
    index_writer.set_merge_policy(merge_policy);
    loop {
        let datas = RedisService::get_data();
        if datas.is_empty() {
            // sleep
        }
        for data in datas {
            let mut doc = Document::new();
            let json_value = json::parse(&data).unwrap();
            // doc.add_text / doc.add_u64 for each field in json_value
            index_writer.add_document(doc);
        }
        index_writer.commit();
    }
}
@petr-tik
Contributor

Hey, since you say you are less experienced with rust, it would be good to understand a bit more about your problem.

How long is "a long time"?
Are you sure it's killed by the kernel? Specifically OOM or something else?
How much data is in your RedisService?

Also, I don't recommend committing inside the loop like that; it will slow down your indexing pipeline and lead to many small segments.

@fulmicoton
Collaborator

Your polling loop looks like it batches, but it does not.
I suspect you commit almost one doc at a time.

In that case you create a lot of tiny segments. Tantivy is a bit dumb and does not know how to wait for merging threads, so you end up with an ever-growing number of merging threads, and this is the source of your OOM.

Can you batch your commits and see if your problem is solved?
If you need something quick and dirty, just sleep 1s within your loop to prevent two commits from happening within a one-second interval.
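For illustration, a minimal sketch of that quick-and-dirty throttle. RedisService::get_data and build_document are hypothetical stand-ins for the Redis polling and JSON-to-Document conversion in the snippet above; only the timing logic is the point:

use std::time::{Duration, Instant};

fn index_loop(index_writer: &mut tantivy::IndexWriter) -> tantivy::Result<()> {
    let min_commit_interval = Duration::from_secs(1);
    let mut last_commit = Instant::now();
    loop {
        for data in RedisService::get_data() {
            // build_document: hypothetical JSON -> Document conversion
            let doc = build_document(&data);
            index_writer.add_document(doc);
        }
        // Commit at most once per interval, so a near-empty poll cycle
        // does not produce its own tiny segment.
        if last_commit.elapsed() >= min_commit_interval {
            index_writer.commit()?;
            last_commit = Instant::now();
        }
    }
}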

@vsop-479
Contributor Author

@petr-tik @fulmicoton
My program writes to 20 indices concurrently, distributed across 10 disks (2 indices per disk).

After the program has been running for several days, each index has about 250GB of data (about 5TB in total), and the process holds 180GB of memory in RES (SHR 300MB).

Yes, it gets killed by the oom-killer when the system's memory is exhausted.

@fulmicoton
Maybe you are right, because I found 270 merging threads in the program.
But I don't understand why my program commits almost one doc at a time, or how to batch commits.

Actually, I use a counter to control batching (if it actually does):
loop {
    for data in datas {
        index_writer.add_document(doc);
    }
    i = i + 1;
    if i == commit_interval {
        index_writer.commit();
        i = 0;
    }
}

@fulmicoton
Collaborator

fulmicoton commented Oct 29, 2019

@vsop-479 Sweet. It seems like you know how to monitor your program, so we should not have too much trouble solving your issue.

It sounds like your index is large. The larger your commits, the better your indexing throughput.
Don't be shy about making commits of 100K or 1M docs. The tricky part is that if a failure happens, you need a way to resume from the last successful commit. The most reliable way to do that is to add a payload to your commit. The payload then works as a marker on your indexing queue.

Also, if you can share what kind of data you are trying to index, please let us know! It is always awesome to hear about our users.
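A minimal sketch of the payload-as-marker idea, assuming tantivy's prepare_commit / set_payload and Index::load_metas APIs (exact names may differ across versions) and a hypothetical queue offset as the marker:

// Store the position of the last consumed queue item with each commit,
// so indexing can resume from there after a crash.
fn commit_with_marker(
    index_writer: &mut tantivy::IndexWriter,
    queue_offset: u64,
) -> tantivy::Result<()> {
    let mut prepared = index_writer.prepare_commit()?;
    prepared.set_payload(&queue_offset.to_string());
    prepared.commit()?;
    Ok(())
}

// On startup, read back the payload of the last successful commit.
fn last_indexed_offset(index: &tantivy::Index) -> tantivy::Result<Option<u64>> {
    let metas = index.load_metas()?;
    Ok(metas.payload.and_then(|p| p.parse::<u64>().ok()))
}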

@vsop-479
Contributor Author

@fulmicoton
Actually, the program commits 200k docs at a time for each index.
PS: the index writer's num_threads is 4 and overall_heap_size_in_bytes is 500_000_000.

Considering there are 20 indices, maybe the number of merging threads (270) is normal?
I will pay more attention to the merging threads.

The data I am trying to index is TCP flow data, which looks like:
{
    "status": "tmo",
    "dst_mac": "11:11:11:11:11:11",
    "sip": "1.1.1.1",
    "downlink_length": 4177,
    "down_payload": "485",
    "proto": "openvpn",
    "dtime": "2019-10-29 15:55:55.696",
    "client_os": "openbsd3",
    "up_payload": "504",
    "server_os": "linux",
    "summary": "165;3;1460;1380",
    "stime": "2019-10-29 15:37:35.585",
    "uplink_length": 882,
    "dport": 11111,
    "sport": 22222,
    "dip": "2.2.2.2",
    "src_mac": "22:22:22:22:22:22"
}

@fulmicoton
Collaborator

So you have 20 indices in the same process? each with their own index writer?

@vsop-479
Contributor Author

So you have 20 indices in the same process? each with their own index writer?

yes.

@fulmicoton
Collaborator

Ok... How many threads do you have on your CPU?
If IO is your bottleneck, or if you don't have 40 threads, chances are that lowering the number of threads per IndexWriter is a good idea.

Setting it to 1 per index writer, for instance, will ensure that the segments you create are larger for the same amount of memory.
This should reduce the number of merging threads.

If this does not solve your problem...

Right now, no mechanism bounds the number of merging threads. If indexing outpaces merging, which happens as the amount of data you have gets larger, you end up with more merging threads running at the same time.

As your segments become larger and larger, there are fewer and fewer good reasons to actually merge them. One thing that would be reasonable to do in your use case is to write your own MergePolicy to avoid merging very large segments. It will save you CPU and IO.
For instance, you could stop merging segments that are larger than 4GB, or something like that.

This is assuming that you do not have any deletes in your index.
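As a rough sketch of the first suggestion (one indexing thread per writer), reusing the 500 MB heap budget mentioned earlier in the thread:

// One indexing thread per IndexWriter: with many writers in one process,
// this keeps the total thread count down and yields larger segments for
// the same memory budget, which in turn means fewer merges.
fn open_writer(index: &tantivy::Index) -> tantivy::Result<tantivy::IndexWriter> {
    index.writer_with_num_threads(1, 500_000_000)
}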

@vsop-479
Contributor Author

vsop-479 commented Oct 29, 2019

My CPU has 24 processors.
IO is not the bottleneck until the program occupies too much memory.

I think you are right about the merge policy, because there are many big segments in the index path;
maybe merging these big segments causes the program to hold too much memory.
(image attached)

@fulmicoton
Collaborator

It's fairly fun and easy to write your own merge policy. Let us know if you have trouble doing it.

@vsop-479
Contributor Author

@fulmicoton
I am trying to write my own merge policy; my plan is to skip big segments, like Lucene does.
But I can't find the segment's size info in SegmentMeta; there is only doc/deleted-doc count info.

@fulmicoton
Collaborator

@vsop-479 Can you rely on the number of documents?

@vsop-479
Contributor Author

@fulmicoton
If there is no way to get the segment's size, I would use the number of documents to skip big segments.
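A minimal sketch of that approach, assuming tantivy's MergePolicy trait with compute_merge_candidates and delegating to the default LogMergePolicy for everything below a doc-count threshold (module paths and signatures may vary across versions):

use tantivy::merge_policy::{LogMergePolicy, MergeCandidate, MergePolicy};
use tantivy::SegmentMeta;

// Segments with at least `max_docs` documents are never considered for
// merging again; smaller segments are handled by the default policy.
#[derive(Debug)]
struct SkipBigSegments {
    max_docs: u32,
    inner: LogMergePolicy,
}

impl MergePolicy for SkipBigSegments {
    fn compute_merge_candidates(&self, segments: &[SegmentMeta]) -> Vec<MergeCandidate> {
        let small: Vec<SegmentMeta> = segments
            .iter()
            .filter(|meta| meta.num_docs() < self.max_docs)
            .cloned()
            .collect();
        self.inner.compute_merge_candidates(&small)
    }
}

// Usage (the threshold is a hypothetical value):
// index_writer.set_merge_policy(Box::new(SkipBigSegments {
//     max_docs: 10_000_000,
//     inner: LogMergePolicy::default(),
// }));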

@dearsxx0918

Hi, I'm a Rust rookie. My program suffers from a memory leak: it gets killed by the kernel after running for a long time.
It gets data from Redis and adds it to tantivy: […]

I'm also facing this issue; I haven't solved it yet.

@fulmicoton
Collaborator

@dearsxx0918 are you sure you have the same issue?

On #666, the problem you described was very different. If you share your code, we can maybe help.

@fulmicoton
Collaborator

@vsop-479 I am afraid there is no way to get the size of the segments in MB.

@dearsxx0918

@dearsxx0918 are you sure you have the same issue?

On #666, the problem you described was very different. If you share your code, we can maybe help.

I can't give you the source code, but I can give you a valgrind log(massif log).

@fulmicoton
Collaborator

I am not interested in the valgrind log.

  • Did you manage to drop your IndexWriter as suggested in the other thread?
  • Can you confirm you are not committing after every single doc?
  • Can you check the number of merging threads you have?

@vsop-479
Contributor Author

@fulmicoton
Thank you for helping me analyze the problem I ran into.

Maybe tantivy should control the size of the segments chosen for merging, and the number of threads on which merge tasks execute.

@dearsxx0918

I am not interested in the valgrind log.

  • Did you manage to drop your IndexWriter as suggested in the other thread?
  • Can you confirm you are not committing after every single doc?
  • Can you check the number of merging threads you have?

My current workaround is to reload the Index on every commit, and there is no memory issue now.
I use 2 threads to add & search.
