Skip to content

MultiArrayList for tagged unions #14496

@hryx

Description

@hryx

MultiArrayList is a convenient way to store a list of structs in a memory-friendly way by storing separate contiguous arrays for each field. This issue is a proposal to add a standard library API for storing tagged unions using the same technique.

There are some use cases for this in the Zig codebase. The pattern is implemented verbatim in src/Zir.zig, and to a lesser extent in std.zig.Ast.

Overview

Whereas MultiArrayList splits structs into N arrays (one for each field), tagged unions would be split into two arrays: one for active tags and one for bare unions containing payloads/data. If it is desirable and possible to keep the same API as MultiArrayList of structs, these can be accessed with .items(.tags) and .items(.data), otherwise via specialized methods .tags() and .data().

The tags array would be nicely compact in the common case, where a tag can often fit in a single byte. The payload array would mainly see benefits in ReleaseFast and ReleaseSmall builds, where the invisible safety tag for a bare union is not stored.

Example

const U = union(enum) {
    a,
    b: u8,
    c: f32,
    d: bool,
};

test U {
    const std = @import("std");
    const expect = std.testing.expect;
    const allocator = std.testing.allocator;

    // Even in optimized builds, the union tag may add padding.
    expect(@sizeOf(U) == 8);

    var list = std.MultiArrayList(U);
    try list.append(allocator, .a);
    try list.append(allocator, .{ .b = 12 });
    try list.append(allocator, .{ .c = 3.14 });
    try list.append(allocator, .{ .d = false });

    expect(list.items(.tags)[1] == .b);
    expect(list.items(.data)[2].c == 3.14);
    expect(list.get(3) == U{ .d = false });

    if (builtin.mode == .ReleaseFast or builtin.mode == .ReleaseSmall) {
        // Total size required per item is now 5.
        expect(@sizeOf(@TypeOf(list.items(.tags)[0])) == 1);
        expect(@sizeOf(@TypeOf(list.items(.data)[0])) == 4);
    }
}

Questions

  • Should the existing MultiArrayList type be extended to support both structs and tagged unions, or should the standard library define a new type? I suspect it should be possible to share the majority of existing MultiArrayList code and API.

Metadata

Metadata

Assignees

No one assigned

    Labels

    acceptedThis proposal is planned.proposalThis issue suggests modifications. If it also has the "accepted" label then it is planned.standard libraryThis issue involves writing Zig code for the standard library.

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions