Skip to content

The size of subnet_msg_cache calculation mistake cause memory usage increased beyond expectations. #1032

@jfb8856606

Description

@jfb8856606

Describe the bug
After enabling subnetcache, memory usage increased beyond expectations.

To reproduce
Steps to reproduce the behavior:

  1. Enable subnetcache in unbound.conf, module-config: "subnetcache iterator"
  2. Run unound
  3. Sent dns query to unbound with ecs.
  4. The memory usage of Unbound continues to grow even after reaching the set value of msg-buf-size.

Expected behavior
A clear and concise description of what you expected to happen.

System:

  • Unbound version: 1.10.1-1.19.1
  • OS: OpenCloudOS 8/RedHat 8
  • unbound -V output:
 Version 1.17.1

Configure line: LIBS=-ljemalloc CFLAGS=-g -fno-omit-frame-pointer -flto -O2 --enable-subnet --with-libevent --with-pthreads --with-ssl
Linked libs: libevent 2.1.8-stable (it uses epoll), OpenSSL 1.1.1k  FIPS 25 Mar 2021
Linked modules: dns64 subnetcache respip validator iterator

BSD licensed, see LICENSE in source package for details.
Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues

Additional information
subnet config:

module-config: "subnetcache iterator"
client-subnet-always-forward: yes
max-client-subnet-ipv6: 56
max-client-subnet-ipv4: 24
min-client-subnet-ipv6: 0
min-client-subnet-ipv4: 0
max-ecs-tree-size-ipv4: 256
max-ecs-tree-size-ipv6: 10

I debug this it, and two main issues have been identified:

  1. This function sizefunc has under-calculated the memory used by rrsets' key and data.

It appears that the main reason is that it copied the calculation process of msg_cache's msgreply_sizefunc. However, the actual data for msg_cache is stored in the rrset cache, whereas the actual records for subnet_msg_cache are stored in memory allocated by itself.

Therefore, it needs to be modified to call ub_rrset_sizefunc for each rrset to calculate the key and size of the record, and below is part of a patch.

        ......
	for (i = 0; i < elem->rrset_count; i++) {
		struct ub_packed_rrset_key *key = elem->rrsets[i];
		struct packed_rrset_data *data = key->entry.data;
		s += ub_rrset_sizefunc(key, data);
	} 
  1. In the subnet_msg_cache's function update_cache, even if an LRU entry is not inserted, the size of the slab table containing the LRU entr should still be updated. Otherwise, inserting new cache entries will not increase the size of the slabs, resulting in this function reclaim_space of lruhash not being able to correctly be called to delete cache entry that exceeds the limit.

Any below is part of a patch:
slabhash and lruash add a new function like this:

void
lruhash_update_space_used(struct lruhash* table, hashvalue_type hash,
        struct lruhash_entry* entry, void* data, void* cb_arg, size_t diff_size)
{
	struct lruhash_entry *reclaimlist = NULL;
	(void)hash;
	(void)entry;
	(void)data;

	fptr_ok(fptr_whitelist_hash_sizefunc(table->sizefunc));
	fptr_ok(fptr_whitelist_hash_delkeyfunc(table->delkeyfunc));
	fptr_ok(fptr_whitelist_hash_deldatafunc(table->deldatafunc));
	fptr_ok(fptr_whitelist_hash_compfunc(table->compfunc));
	fptr_ok(fptr_whitelist_hash_markdelfunc(table->markdelfunc));

	if(cb_arg == NULL) cb_arg = table->cb_arg;

	/* find bin */
	lock_quick_lock(&table->lock);

	table->space_used += diff_size;

	if(table->space_used > table->space_max)
		reclaim_space(table, &reclaimlist);

	lock_quick_unlock(&table->lock);

	/* finish reclaim if any (outside of critical region) */
	while(reclaimlist) {
		struct lruhash_entry* n = reclaimlist->overflow_next;
		void* d = reclaimlist->data;
		(*table->delkeyfunc)(reclaimlist->key, cb_arg);
		(*table->deldatafunc)(d, cb_arg);
		reclaimlist = n;
	}
}

And subnet_msg_cache's update_cache called it when need_to_insert is 0

        ......
	diff_size = tree->size_bytes;
	addrtree_insert(tree, (addrkey_t*)edns->subnet_addr,
		edns->subnet_source_mask, sq->max_scope, rep,
		rep->ttl, *qstate->env->now, only_match_scope_zero);
	diff_size = tree->size_bytes - diff_size;

	lock_rw_unlock(&lru_entry->lock);
	if (need_to_insert) {
		slabhash_insert(subnet_msg_cache, h, lru_entry, lru_entry->data,
			NULL);
	} else {
		 slabhash_update_space_used(subnet_msg_cache, h, lru_entry, lru_entry->data,
			NULL, diff_size);
	}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions