Skip to content

xnn_weights_cache_provider look_up doesn't work? #6257

Open
@mcr229

Description

@mcr229

I've recently been experimenting with XNNPACK's weight cache to reduce load time by caching packed weights and also reduce memory pressure for repeated weights across the same kernels.

I was experiementing with fully-connected operator and found that the weight cache was never being hit. I noticed that when using the apis to create the xnn_weights_cache_t we set the look up function to be xnn_internal_weights_cache_look_up:

cache_provider->look_up = (size_t(*)(void*, const struct xnn_weights_cache_look_up_key*))xnn_internal_weights_cache_look_up;

looking at this function, it looks like a placeholder function which would always return XNN_CACHE_NOT_FOUND:

XNNPACK/src/cache.c

Lines 491 to 496 in 85071b8

size_t xnn_internal_weights_cache_look_up(
struct xnn_internal_weights_cache* cache, const struct xnn_weights_cache_look_up_key* cache_key)
{
// The default implementation does not support this query.
return XNN_CACHE_NOT_FOUND;
}

Now when I'm using the weights cache to create a runtime_t with only a fully connected operator, in the flow of creating the fully-connected operator, we look up the cache to see if the weights have been packed before, using xnn_weights_cache_look_up:

if (use_weights_cache(fully_connected_op)) {
cache_offset = xnn_weights_cache_look_up(
fully_connected_op->weights_cache, &cache_key);
}

However this just uses the the placeholder function above, returning XNN_CACHE_NOT_FOUND:

XNNPACK/src/cache.c

Lines 530 to 534 in 85071b8

size_t xnn_weights_cache_look_up(
xnn_weights_cache_t cache, const struct xnn_weights_cache_look_up_key* cache_key)
{
return cache->look_up(cache->context, cache_key);
}

As a result, every look up would then fall to XNN_CACHE_NOT_FOUND, in which weights have to be repacked, and memory has to be allocated for the newly packed weights:

if (cache_offset == XNN_CACHE_NOT_FOUND) {
void* weights_ptr = xnn_get_pointer_to_write_weights(
fully_connected_op, aligned_total_weights_size, packed_weights_padding_byte);
if (weights_ptr == NULL) {
xnn_log_error(
"failed to allocate %zu bytes for %s operator packed weights",
packed_weights_size, xnn_operator_type_to_string(operator_type));
goto error;
}
xnn_log_debug("allocated %zu bytes for packed weights in %s operator",
aligned_total_weights_size, xnn_operator_type_to_string(operator_type));
if (flags & XNN_FLAG_TRANSPOSE_WEIGHTS) {
pack_gemm_gio_w(
/*groups=*/1, output_channels, input_channels,
nr, kr, sr,
output_channels,
kernel, bias, /*scale=*/NULL,
weights_ptr,
gemm_config->nr * extra_weights_bytes,
packing_params);

Am I looking at this incorrectly? Or is this a feature that is still a wip? Or is this a bug that is meant to be fixed in the future?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions