Skip to content

table_collection_copy() heap buffer overflow #179

@bhaller

Description

@bhaller

I am not sure this is a bug, but I think it might be. Either that, or I'm using the tskit APIs wrong. Or both.

The end result is that in table_collection_copy() it goes into the if (table_collection_is_indexed(self)) case, and the first malloc() call crashes. It is trying to copy 48588 bytes from self->indexes.edge_insertion_order, which is a malloced buffer that is only 47100 bytes long. This causes a segfault.

The 47100-byte buffer is allocated by an earlier call to table_collection_copy() when executing this code:

	table_collection_t immutable_tables;
	
	int ret = table_collection_alloc(&immutable_tables, 0);
	if (ret != 0) handle_error("table_collection_alloc", ret);
	
	ret = table_collection_load(&immutable_tables, p_file, 0);
	if (ret != 0) handle_error("table_collection_load", ret);
	
	// copy the immutable table collection to make a mutable collection
	ret = table_collection_alloc(&tables_, MSP_ALLOC_TABLES);
	if (ret != 0) handle_error("table_collection_alloc", ret);
	RecordTablePosition();
	
	ret = table_collection_copy(&immutable_tables, &tables_);
	if (ret < 0) handle_error("table_collection_copy", ret);

I guess immutable_tables has these indexes set up on it; I'm not sure why. Then we try to make a mutable copy of the table collection, as the comment says. (The point of this shuffle is that I guess immutable_tables has buffers that are straight from kastore and are not malloced, so the table collection can't be modified, so we need to copy it). That call appears to copy the indexes, which allocates the 47100-byte buffer. Then we go on to use the table collection to run the simulation forward, and we add some edges, but the index buffers don't get resized and they don't get discarded, so they end up being smaller than the edge table (and presumably invalid, too). Then eventually we try to copy that table collection, and it sees the index buffers, assumes they are of the expected size, and tries to copy out of them; but they are smaller than expected, so boom.

This is the first time I've encountered these index buffers, and I'm not sure what they're used for. As far as I know we're not using them in SLiM and don't need/want them. I see that there's a table_collection_drop_indexes() function I could call, and I'm thinking I probably ought to call it on immutable_tables before I copy it. So if that diagnosis is correct, maybe this is entirely our bug in SLiM. (Is there a way to just not even create the indexes in the first place, rather than dropping them later? Does this matter?)

However, it seems weird to me that SLiM can add edges to the indexed table collection without getting any kind of pushback. It seems like that should either (a) silently drop the indexes since they are no longer valid, or (b) raise an error condition like "tried to modify an indexed table collection". Letting itself get into a state where it is invalid, with such a common/obvious usage pattern, seems bad.

If checking for indexes every time an edge gets added would be too much of a speed hit, maybe it ought to (a) save the malloced size of the indexes when it sets them up, and then (b) check that that size still makes sense when it intends to use/copy them; if it no longer makes sense compared to the size of the edge table, raise an error or drop the indexes at that point...?

@petrelharp, you might be interested in this; remarkably, we appear to have had this crashing bug since SLiM 3.0, and it is not hard to trigger, but nobody has ever seen it (or reported it, at least) in the wild! I just stumbled on it while trying to debug an (apparently) unrelated issue – the test case I wrote for that other issue was segfaulting before it hit the issue I was trying to debug!

Metadata

Metadata

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions