Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

erase() crashes #53

Closed
AIIleG opened this issue Dec 21, 2023 · 10 comments
Closed

erase() crashes #53

AIIleG opened this issue Dec 21, 2023 · 10 comments

Comments

@AIIleG
Copy link

AIIleG commented Dec 21, 2023

Hi,

I get crashes with colony after it's running for a few hours on rhel 9. I get that with different compilers so the compiler shouldn't be the problem. Here's the trace:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff72af91e in free () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff72af91e in free () from /lib64/libc.so.6
#1  0x00000000004044d5 in nclient::~nclient (this=0x516f10) at netcheck.cxx:57
#2  0x00000000004044b9 in __gnu_cxx::new_allocator<nclient>::destroy<nclient> (this=0x7fffffffe1d0,
    __p=0x516f10)
    at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/ext/new_allocator.h:168
#3  0x000000000040448d in std::allocator_traits<std::allocator<nclient> >::destroy<nclient> (__a=...,
    __p=0x516f10)
    at /usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/bits/alloc_traits.h:535
#4  0x000000000040415d in plf::colony<nclient, std::allocator<nclient>, (plf::priority)1>::destroy_element (
    this=0x7fffffffe1d0, element=0x516f10) at ./plf_colony.h:1159
#5  0x0000000000403192 in plf::colony<nclient, std::allocator<nclient>, (plf::priority)1>::erase (
    this=0x7fffffffe1d0, it=...) at ./plf_colony.h:2411
#6  0x000000000040293a in main () at netcheck.cxx:120

The program in question is a network related tool but unfortunately I'm not at liberty to show the full source. However, here are the relevant parts:

struct nclient {
	string ip;
	unsigned port;
	time_t lastcheck;
};
...
plf::colony<struct nclient> nclients;
plf::colony<struct nclient>::iterator it;
struct nclient dummy;
...
for (it = nclients.begin(); it != nclients.end(); ++it) {
	if (it->lastcheck < cutoff)
		it = nclients.erase(it); // <== CRASHES HERE
}

the content gets added like this later on:

dummy.ip = row[0];
dummy.port = port;
dummy.lastcheck = now;
nclients.insert(dummy);

I ran the same program with a vector before I switched to colony which worked fine. The only difference I noticed was that accessing the struct's members with colony is via pointer (->).
This happens with 7.39 i checked out from github after you fixed the exceptions bug i reported.

Any help would of course be welcome ...

@mattreecebentley
Copy link
Owner

mattreecebentley commented Dec 21, 2023

Unfortunately impossible to diagnose without some dummy code which replicates the issue.
I could spend hours mocking up tests but without the exact match, hard to know if I'd hit the issue.
Any idea if it's a memory leak issue?

[EDIT: only thing I can see that you might be doing wrong there is that if your code erases the last element in the colony, it's going to iterate past .end() and cause a segmentation fault. To fix:

for (it = nclients.begin(); it != nclients.end(); ) {
	if (it->lastcheck < cutoff)
		it = nclients.erase(it);
        else ++it;
}

@AIIleG
Copy link
Author

AIIleG commented Dec 21, 2023

Thanks for the quick reply. I tested the loop issue you mentioned:

struct nclient {
	string ip;
	unsigned port;
	time_t lastcheck;
};

int main() {
	plf::colony<struct nclient> nclients;
	plf::colony<struct nclient>::iterator it;
	struct nclient dummy;

	dummy.ip = "1.2.3.4";
	dummy.port = 53;
	dummy.lastcheck = 381846;
	nclients.insert(dummy);
	dummy.ip = "1.2.5.4";
	dummy.port = 5333;
	dummy.lastcheck = 3846;
	nclients.insert(dummy);
	printf("count: %u\n", nclients.size());

	for (it = nclients.begin(); it != nclients.end(); ++it) {
		printf("%u\n", it->port);
		it = nclients.erase(it);
	}

	printf("count: %u\n", nclients.size());
	nclients.clear();
	printf("count: %u\n", nclients.size());
	exit(0);
}

Result:

count: 2
53
count: 1
count: 0

And no crash. After all, that kind of loop is the same as you have in the examples on your website. I guess I have to add more debug logging and investigate further. Thanks either way so far!

@mattreecebentley
Copy link
Owner

That's not the loop I wrote.
For what you've got there, not only will your loop skip the element after the one you've erased during the loop, it will crash in the manner I've described if you end up erasing the back element.
Closing until you've re-written your code with a working loop (I don't have any examples like yours in my code) and can prove that it's not the result of going past end().

@AIIleG
Copy link
Author

AIIleG commented Dec 21, 2023

The loop you wrote there: https://plflib.org/colony.htm#functions

 for (plf::colony<int>::iterator it = i_colony.begin(); it != i_colony.end(); ++it)
  {
    it = i_colony.erase(it);
  }

The purpose of my test was to see whether this causes a crash and it did not. Since that's a quick and simple test I did that first before trying the different loop you suggested in this very thread here previously.

@mattreecebentley
Copy link
Owner

Yeah, that loop explicitly states that it's trying to erase half the elements, in a use-case where I know that the number of elements is even, therefore it will reach the end condition.

@AIIleG
Copy link
Author

AIIleG commented Dec 22, 2023

Yes okay, my test above with 3 elements does crash. Might not hurt to make that more clear on the website.
Either way, taking the safe route you suggested above seems to prevent the crashes. Thanks!

@mattreecebentley
Copy link
Owner

For any std:: container, erase returns the element after the element you're erasing, so the situation will be the same for any standard container. It just might take longer to reach a crash state for a vector because non-back erasure is so much slower. Cheers-

@AIIleG
Copy link
Author

AIIleG commented Dec 22, 2023

Thanks for the insight. That might be why it didn't crash for me with a vector because I didn't catch the return. I did it like this:

vs = nclients.size();

for (i = 0; i < vs; ++i) {
	nclients.erase(nclients.begin() + i);
}

Running this with 3 elements doesn't crash unlike the colony (or other container) version with an iterator.

@mattreecebentley
Copy link
Owner

Ah, I see. BTW using indexes in that way will always be slower than using an iterator,
because it creates two operations (begin() +i, ++i) instead of one (++it).
Merry Christmas!

@AIIleG
Copy link
Author

AIIleG commented Dec 25, 2023

Thanks again for the info and Merry Christmas to you, too :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants