builtin function @reify to create a type from a TypeInfo instance #383

Open
andrewrk opened this Issue May 30, 2017 · 28 comments

Member

andrewrk commented May 30, 2017

  • for arrays, pointers, error unions, nullables: T.child_type field.
  • for functions, T.return_type field
  • for functions, T.is_var_args and T.arg_types, which is of type [N]type
  • @fieldsOf(T) where T is a struct or enum. Returns anonymous struct { T: type, name: []const u8 }
  • accessing a field by comptime string name
  • implement @scurest proposal below
    • @typeInfo
    • @reify

cc @raulgrell

@andrewrk andrewrk added this to the 0.2.0 milestone May 30, 2017

ranma42 commented May 30, 2017

For functions, would it make sense to also expose the number of arguments and make them individually accessible? (cc @AndreaOrru )

Member

AndreaOrru commented May 30, 2017

Yes, that would be super useful to implement the IPC in my kernel, for example.

Member

andrewrk commented May 30, 2017

Added a third item for this above

Member

andrewrk commented Sep 21, 2017

cc @hasenj

<hasenj> for comptime is there a way to perform property access by string or something similar?
<hasenj> for example, say, @field(object, field_name)

hasenj commented Sep 21, 2017

@fieldsOf(T) where T is a struct or enum. Returns anonymous struct { T: type, name: []const u8 }

Is the type even needed? I think a list of names as strings should suffice, given that there are other builtin functions that can be used to reflect on the type. For example:

@field(some_struct, name); // expands to some_struct.<value of name>
// e.g.
@field(a, "foo"); // expands to `a.foo`

@typeOf(@field(a, "foo")); // gives type of a.foo

Member

tiehuis commented Sep 22, 2017

It seems like with just these features we could implement a structural based interface mechanism, similar to Go.

Here is a motivating example to implement a print trait of sorts, allowing any struct which implements a print field with the appropriate type to have it called. I don't think anything is too glaringly out of place here, but feel free to correct.

const std = @import("std");
const builtin = @import("builtin");
const TypeId = builtin.TypeId;

fn getField(comptime T: type, x: var, comptime name: []const u8) ->
    ?(fn (self: T) -> %void)
{
    for (@fieldsOf(x)) |field_name| {
        // Compare the string contents rather than the slices themselves.
        if (std.mem.eql(u8, field_name, name)) {
            const field = @field(x, name);
            const field_type = @typeOf(field);

            if (@typeId(field_type) != TypeId.Fn) {
                @panic("field is not a function");
            }

            if (field_type.is_var_args) {
                @panic("cannot handle varargs function");
            }

            // Would need to be a bit more in-depth to handle a &T self arg
            const expected_arg_types = []type { T };
            if (!std.mem.eql(type, field_type.arg_types, expected_arg_types)) {
                @panic("prototype does not match");
            }

            if (@typeId(field_type.return_type) != TypeId.Error
                or field_type.return_type.child_type != void)
            {
                @panic("return type does not match");
            }

            return field;
        }
    }

    null
}

// Expects a struct with a field method of type `print(self) -> %void`.
pub fn printTrait(x: var) -> %void {
    const T = @typeOf(x);
    const Id = @typeId(T);

    if (Id != TypeId.Struct) {
        @panic("expected a struct");
    }

    if (getField(T, x, "print")) |func| {
        return func(x);
    } else {
        @panic("no print field found!");
    }
}

Member

andrewrk commented Sep 22, 2017

With this example, if you were going to do printTrait(x) couldn't you instead do x.print() ?

Member

tiehuis commented Sep 28, 2017

You're right. This example would really only provide slightly more targeted error messages.

A better example would be a printDebug function which could recursively print fields of structs and enums, similar to println!("{:?}", x) in Rust. Another example as well that could be very useful would be for generic serialization code by inspection of field names.
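
A minimal sketch of such a printDebug, written against the later Zig 0.3.0-era builtins (@memberCount, @memberName, @field) rather than anything that existed when this comment was posted. The names printDebug, Point, and Line are hypothetical, and the output format is arbitrary:

```zig
const std = @import("std");
const builtin = @import("builtin");
const warn = std.debug.warn;

// Hypothetical helper: recursively prints the fields of a struct.
// Non-struct values fall through to the default "{}" formatter.
fn printDebug(x: var) void {
    const T = @typeOf(x);
    if (@typeId(T) == builtin.TypeId.Struct) {
        const type_name: []const u8 = @typeName(T);
        warn("{}(", type_name);
        comptime var i = 0;
        inline while (i < @memberCount(T)) : (i += 1) {
            const name: []const u8 = @memberName(T, i);
            warn("{}=", name);
            // Look the field up by its comptime-known name and recurse.
            printDebug(@field(x, @memberName(T, i)));
            warn(" ");
        }
        warn(")");
    } else {
        warn("{}", x);
    }
}

const Point = struct { x: i32, y: i32 };
const Line = struct { from: Point, to: Point };

pub fn main() void {
    const line = Line{
        .from = Point{ .x = 0, .y = 0 },
        .to = Point{ .x = 3, .y = 4 },
    };
    printDebug(line);
    warn("\n");
}
```

Only structs are descended into here; enums, unions, and pointers would each need their own branch.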

andrewrk added a commit that referenced this issue Nov 3, 2017

andrewrk added a commit that referenced this issue Nov 4, 2017

Contributor

scurest commented Nov 5, 2017

Since zig can store types as regular data, I was wondering if it would be better to, rather than have magic fields that make types seem like structs, expose the reflected data in actual structs. Example:

const NullableType = struct {
    child: type,
};

@reflect(?u32) ==> NullableType { .child = u32 }

const ArrayType = struct {
    child: type,
    len: u64,
};

@reflect([4]u32) ==> ArrayType { .child = u32, .len = 4 }

// Possible example for a struct

const StructType = struct {
    field_names: [][]const u8,
    field_types: []type,
    field_offsets: []u64,
};

const S = struct { x: i32, y: u8 };

@reflect(S) ==> StructType {
    .field_names = [][]const u8 { "x", "y" },
    .field_types = []type { i32, u8 },
    .field_offsets = []u64 { 0, 4 },
}

etc. You get the idea. You could also add a souped-up version of @typeId that returns an enum with variants like Nullable: NullableType, etc.

Some pros: possibly easier documentation (you can now look up what fields you can access just like for regular structs), fewer special cases in the field-lookup code.

There could also be a @deReflect (deflect? unreflect?) that turns one of these reflected structs into a type, so you could generate eg. a struct programmatically at compile time.

Of course, @reflect could also be implemented as a regular function in userland (I hit #586 when trying it) as long as some reflection mechanism exists.

Member

andrewrk commented Nov 6, 2017

I like this idea. I ran into an issue which is related which I will type up and link here.

Member

andrewrk commented Nov 6, 2017

I think #588 has to be solved before the idea @scurest outlined here can be implemented.

Member

andrewrk commented Nov 6, 2017

We could potentially even remove these functions:

  • @sizeOf
  • @alignOf
  • @memberCount
  • @minValue
  • @maxValue
  • @offsetOf
  • @typeId
  • @typeName
  • @IntType

Instead these could all be fields or member functions of the struct returned from @reflect, or replaced with @MakeType (un-reflect, deflect, whatever it's called).

Ilariel commented Nov 6, 2017

I like the idea of @reflect since it exists in another namespace as a builtin and can exist as part of the language instead of being a standard library hack.

The @MakeType name could be @reify given the meaning of the word: "to regard something abstract as if it were a concrete material thing"
However, whether it is fitting to use it given the related CS meaning is another question:
https://en.wikipedia.org/wiki/Reification_(computer_science)

With the "maketype" functionality we could write type-safe compile-time type generators that can give proper compiler errors and flexibly generate types. I think we need to be able to name the types too if we want to export them to C as visible structs. On the Zig side we can just use aliases to address the generated types if we want to.

andrewrk added a commit that referenced this issue Nov 7, 2017

add @memberType and @memberName builtin functions
see #383

there is a plan to unify most of the reflection into 2
builtin functions, as outlined in the above issue,
but this gives us needed features for now, and we can
iterate on the design in future commits

@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Nov 7, 2017

scurest added a commit to scurest/zig that referenced this issue Nov 8, 2017

add @memberType and @memberName builtin functions
see ziglang#383

there is a plan to unify most of the reflection into 2
builtin functions, as outlined in the above issue,
but this gives us needed features for now, and we can
iterate on the design in future commits

@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Feb 28, 2018

Contributor

tgschultz commented Mar 8, 2018

I had this same idea while looking over some older reflection code I'd written and threw together an example of what I imagine a TypeInfo struct (returned by @typeInfo) might look like. This is with no knowledge of compiler internals; it's just spitballing.

This example uses pointer reform syntax as described by #770

pub const TypeId = enum {
    Void,
    Type,
    NoReturn,
    Pointer,
    Bool,
    Integer,
    Float,
    Array,
    Slice,
    Struct,
    Union,
    Enum,
    ErrorSet,
    Promise,
    Function,
    Literal,
    Namespace,
    Block,
    Opaque,
};

pub const LiteralTypeId = enum {
    Null,
    Undefined,
    Integer,
    Float,
    String,
    CString,
};

pub const PointerTypeId = enum {
    Single,
    Block,
    NullTerminatedBlock,
};

pub const TypeInfo = struct {
    isNullable: bool,
    isErrorable: bool,
    Id: TypeId,
    Type: type,
    Name: []const u8,
    Size: usize,
    Alignment: u29,
    Detail: TypeInfoDetail,
};

pub const TypeInfoDetail = union(TypeId) {
    Void: void,
    Type: void,
    NoReturn: void,
    Pointer: PointerTypeInfoDetail,
    Bool: void,
    Integer: IntegerTypeInfoDetail,
    Float: FloatTypeInfoDetail,
    Array: ArrayTypeInfoDetail,
    Slice: ArrayTypeInfoDetail,
    Struct: StructTypeInfoDetail,
    Union: UnionTypeInfoDetail,
    Enum: EnumTypeInfoDetail,
    ErrorSet: EnumTypeInfoDetail,
    Promise: PromiseTypeInfoDetail,
    Function: FunctionTypeInfoDetail,
    Literal: LiteralTypeId,
    Namespace: void,
    Block: void,
    Opaque: void,
};

pub const PointerTypeInfoDetail = struct {
    Id: PointerTypeId,
    Child: *TypeInfo,
};

pub const IntegerTypeInfoDetail = struct {
    isSigned: bool,
    bits: u8,
    maxValue: usize,
    minValue: usize,
};

pub const ErrorTypeInfoDetail = struct {
    ParentSet: *TypeInfo,
    value: usize,
};

pub const FloatTypeInfoDetail = struct {
    bits: u8,
    maxValue: f64,
    minValue: f64,
    epsilon: f64,
    //potentially maxExp, hasSubnorm, etc?
};

pub const ArrayTypeInfoDetail = struct {
    isNullTerminated: bool,
    Child: *TypeInfo,
    length: usize,
};

pub const StructTypeInfoDetail = struct {
    isPacked: bool,
    memberNames: [][]const u8,
    memberOffsets: []const usize,
    Members: []const *TypeInfo,
};

pub const UnionTypeInfoDetail = struct {
    Tag: *TypeInfo,
    memberNames: [][]const u8,
    Members: []const *TypeInfo,
};

pub const EnumTypeInfoDetail = struct {
    isErrorSet: bool,
    Tag: *TypeInfo,
    memberNames: [][]const u8,
    MemberValues: []const usize,
};

pub const PromiseTypeInfoDetail = struct {
    //???
};

pub const FunctionTypeInfoDetail = struct {
    Return: *TypeInfo,
    Args: []const *TypeInfo,
};

@alexnask alexnask referenced this issue Apr 25, 2018

Merged

Metaprogramming - @typeInfo [DONE] #951

17 of 17 tasks complete

binary132 commented May 1, 2018

Hi, I'm not sure if this is the right place to comment, but it would be nice if this included namespaced fn's for the type (as opposed to only fns implemented as fields of the struct type.) I am not that familiar with Zig semantics, but MajorLag from the IRC channel suggested this might be useful for type validation at the call-site of comptime fns.

For example:

CallQuack(someDog) fails somewhere down in the call graph because Dog doesn't have a Quack implementation. But you could make a macro Validate (@trait?) that checks the member fns and namespaced fns of Dog against the methods and namespaced fns of some interface type Quacker. The syntax I'm imagining for that would be something a bit like one of either:

CallQuack(@trait(Duck, someDog)) or CallQuack(someDog) with @trait at the top of the call graph.

I do recognize this doesn't really solve the typeclass problem but it could be a convenient application of TypeInfo. Hopefully this is at least somewhat in the spirit of Zig.

@andrewrk andrewrk changed the title from more compile-time type reflection to builtin function @reify to create a type from a TypeInfo instance Jun 1, 2018

Member

andrewrk commented Jul 20, 2018

I want to apologize for miscommunicating here. I've had this proposal marked as accepted for a long time, and it led @alexnask to write code with that assumption, when I've actually been considering @reify and whether we will have it and how it will work.

I'd like to discuss with @thejoshwolfe and figure this proposal out before accepting it.

Member

andrewrk commented Aug 1, 2018

Here's an argument in favor of @reify - ability to select on multiple channels. Let's say I have Channel(KeyboardEvent) and Channel(CompilerEvent). I want to do something like this:

switch (Channel.select(keyboard_channel, compiler_channel)) {
    .KeyboardEvent => |ev| handleKeyboardEv(ev),
    .CompilerEvent => |ev| handleCompilerEv(ev),
}

This would require the ability to construct a union(enum) based on the type name of the child type of each Channel. This also depends on #683.

On the other hand, what if the child type of the Channels were the same name (but different namespace?) Channel(Keyboard.Event) and Channel(Compiler.Event). Now it's unclear what the names of the generated union(enum) should be. As an alternative, one could provide the type at the callsite:

const SelectType = union(enum) {
    KeyboardEvent: Keyboard.Event,
    CompilerEvent: Compiler.Event,
};
switch (Channel.select(SelectType, keyboard_channel, compiler_channel)) {
    .KeyboardEvent => |ev| handleKeyboardEv(ev),
    .CompilerEvent => |ev| handleCompilerEv(ev),
}

Now everything is crystal clear how it would work and @reify is not actually required.

ChengCat commented Sep 15, 2018

Hi, I am new to Zig, and I like this language :) Having worked with Coq, I really appreciate the "Software Should be Perfect" idea. I have the impression that not many people out there really care about robustness/correctness of software, which is sad.

@reify feels to me like a deliberate omission from a complete programming language. It is so simple and fits so naturally with the rest of the language. If you don't see what I mean, I suggest you look at the Terra language. Run-time types are compile-time values in both Zig and Terra. Types are truly first-class citizens in Terra: you can create, manipulate, inspect, or do anything you want with them. In Zig, @reify is the last missing building block.

The only reason to exclude @reify is to keep programmers from using the more powerful tools. I agree with this reasoning, and am also afraid that @reify might be abused, leading to too much unreadable code. But I feel that @reify should be a soft limit rather than a hard one: we should discourage people from using @reify, but leave the power for those who are building architecture or abstractions.

Meta-programming or expressive power is not directly expressed in Zig's goals. However, in good hands, they are helpful to build abstractions which allow client code's intent to be expressed more clearly (i.e. clarity). For example, try comparing the two Channel.select examples: which one is easier to read? This example is just a simple one, so let me suggest two big use cases of @reify:

  1. generate types from protobuf's .proto files at compile time;
  2. generate types in an ORM library.

With the power of @reify, library authors have full control of the types they return to the user, which would probably result in better library interfaces.

Member

thejoshwolfe commented Sep 15, 2018

Clarity is a complicated subject. Perhaps the reader wants to know the author's intent; perhaps the reader wants to know the reality of what the code actually does. Both of these are important, and reify obscures the latter in favor of the former.

Consider that you're reading code that calls a method named readUintUnsafe(), and grepping the source for that name indicates that that name is not defined anywhere. You try to determine the type that contains the method, and you discover the type is the result of a reify call. The method readUintUnsafe() is generated by comptime code; the method name is constructed with "read" ++ type_name ++ "Unsafe". The implementation of the method is nowhere to be found except by mentally stepping through metaprogramming code that is variable on the type being read and whether or not bounds checks are performed. And also, for the sake of the argument, imagine that the metaprogramming is also variable on the source of data being read from: buffer vs stream vs fd, etc.

This seems like a very reasonable application of reify, and it makes me dislike the feature. While the intent of readUintUnsafe seems relatively clear by reading the name, the reality of the method is too hard to discern. My counter proposal to reify is enabling source code generation as a build step. This is probably going to be a controversial stance, but consider that with source code generation, you have the intent clearly laid out in the input to the code generator and the reality clearly laid out in the output of the generator. Also consider that stepping through code in a debugger would be much more sane if you had a source file on disk somewhere that bore a resemblance to the machine code being debugged.

I don't have anything concrete to propose yet, but the common pitfalls in source code generation should be mitigatable with the right utility libraries. For example, knowing when to put parentheses around subexpressions depending on the operators around it -- that's a hard problem, but we already want to solve it for zig fmt.

It should be reasonable to offer an official utility library for source code generation, and that should address the use cases for reify.
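
A rough sketch of what such a generator could look like, assuming the Zig 0.3.0-era std.io.getStdOut API. The entry table, the emitted names (readU8Unsafe and friends), and the placeholder bodies are all hypothetical; the point is only that every generated name appears literally in the output, where grep can find it:

```zig
const std = @import("std");

// Hypothetical table describing the readers to emit.
const Entry = struct {
    suffix: []const u8,
    int_type: []const u8,
};

const entries = []Entry{
    Entry{ .suffix = "U8", .int_type = "u8" },
    Entry{ .suffix = "U16", .int_type = "u16" },
    Entry{ .suffix = "U32", .int_type = "u32" },
};

// Writes Zig source text to stdout, one function per table entry.
// The emitted bodies are placeholders, not a real implementation.
pub fn main() !void {
    var out = try std.io.getStdOut();
    for (entries) |entry| {
        try out.write("pub fn read");
        try out.write(entry.suffix);
        try out.write("Unsafe(bytes: []const u8) ");
        try out.write(entry.int_type);
        try out.write(" {\n    unreachable; // placeholder body\n}\n\n");
    }
}
```

A build step would redirect this output into a .zig file that the rest of the project imports like any hand-written source.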

Contributor

BarabasGitHub commented Sep 15, 2018

I agree with the source code generation. Seems like a much more readable and clear approach than any meta programming.

Contributor

scurest commented Sep 15, 2018

Consider that you're reading...

(Can you define a method with @reify? The Definition in a TypeInfo struct doesn't appear to contain the actual function that can be called. In any case...) This is exactly dual to the problem of finding some field in a struct which grepping indicates is not used anywhere because its name is constructed using comptime code and accessed using (say) @offsetOf.

IOW reflection inhibits "find uses" and reification inhibits "find definition".

Both of these could be addressed by source code generation. It seems artificial to do it for only one. Which doesn't mean it shouldn't be done only for reification, but the argument for why it should be done only for reification needs to break the symmetry somehow. That could be as simple as (for example) arguing that reflection is useful sufficiently often that tolerating its faults is acceptable and reification isn't, or arguing that "find definition" is more important than "find uses". But as it stands, the exact dual of the argument above is an argument for using code generation instead of reflection.

(For the record, I only originally mentioned reify because like ChengCat says it seems to fit so naturally into the picture. I was actually really only concerned about @typeInfo :x)

ChengCat commented Sep 16, 2018

@thejoshwolfe Your major concern seems to be that code generated by meta-programming is not as 'touchable' as other code present in textual form. This is true for meta-programming in most other languages, but in Zig, this problem can be addressed.

Since we can fully inspect the generated types at compile-time, it is totally viable to have a std.meta.printType to print out type information at compile time. This can include docstrings and even method implementation code associated with the type. Printing function code at compile time has been done in Terra. I strongly recommend that you have a full understanding of Terra before making any comptime-related design decisions. Terra is very similar to Zig on this front, and it's where Terra really shines.

Now, you may think it's still too much trouble to add a printType to see the generated code, and this is where a docgen tool can help. A docgen tool can see all the generated code, and we can, for example, decide that all module-level and struct-level types, whether generated or not, get their documentation and even source code printed in the docgen results. This should be good enough for many practical uses.

On the other hand, if I want to see how a library generates code behind the scene, I would rather read the one using meta-programming instead of source code generation. I would also prefer to maintain the meta-programming one.
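
A small sketch of the printType idea, assuming the Zig 0.3.0-era @memberCount, @memberName, and @memberType builtins. The function name and the Foo struct are hypothetical, and this version prints at run time from comptime-known data; a compile-time variant could route the same strings through @compileLog instead:

```zig
const std = @import("std");
const warn = std.debug.warn;

// Hypothetical printType: lists each member of a struct type together
// with the name of its type. Methods and doc comments are not covered.
fn printType(comptime T: type) void {
    const type_name: []const u8 = @typeName(T);
    warn("{}:\n", type_name);
    comptime var i = 0;
    inline while (i < @memberCount(T)) : (i += 1) {
        const member_name: []const u8 = @memberName(T, i);
        const member_type: []const u8 = @typeName(@memberType(T, i));
        warn("    {}: {}\n", member_name, member_type);
    }
}

const Foo = struct {
    a: u32,
    b: f32,
};

pub fn main() void {
    printType(Foo);
}
```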

Contributor

Sahnvour commented Sep 16, 2018

Find uses and find definition scenarios can be addressed by proper tooling. A service compiler could very well make use of what ChengCat describes and provide accurate point of definition, generated code and whatnot. Agreed, it does not cover all use cases.

However, source code generation as a build step feels like a workaround. That's the way it is used in C and C++ for this kind of problem, because these languages lack the ability to act on programs directly. And they may actually go in the opposite direction now, with the metaclasses proposal to get rid of code generation.
Sure, at least Zig source code generation could be done in pure Zig with no external preprocessor and macro language, but maybe it can do better without the constraints old languages have.

Member

thejoshwolfe commented Sep 20, 2018

A printType function is a fine idea, and we can have that. It'd be neat to see IDE support for that too. We have plans to have the zig compiler run as a server that can act as the backend of an IDE with commands like renaming a variable, reordering parameters, and fun stuff like that. printType would fit nicely into that system. I really like that idea.

But part of Zig's design philosophy, and I'm just now realizing that this isn't formally documented anywhere, is that we want to be friendly to unsophisticated static analysis, like a human using grep and a simple text editor. We definitely want power tools for zig development, but we don't want to create a dependency on them.

I don't think printType is an acceptable solution to my objections to reify.

kyle-github commented Sep 21, 2018

@thejoshwolfe sorry reading this late, much travel lately.

I am intrigued by your comment about code generation. Are you thinking something along the lines of macros that operate on ASTs or something different? Having such a thing that is an explicit pass that returns source code would be quite interesting.

Depending on where you are going with that, could this simply replace comptime?

I would love to hear more about what you are thinking. Perhaps in a different issue?

kyle-github commented Sep 21, 2018

@thejoshwolfe to the topic at hand, @reify...

Zig has forged a nice path of providing simple, orthogonal and powerful abstractions at a very low level. There are many features that could be misused. I think @reify is possibly one of those that can be misused, but may enable a whole new class of programs to be simple and clean.

Why is comptime OK but @reify is not? I am not trying to be snarky, I really want to understand what I am missing here because there is clearly some dividing line!

Contributor

Sahnvour commented Oct 8, 2018

I think Zig can make metaprogramming clear enough that it is not a burden to understand, so that it does not have to be replaced by code generation.

Reification can take another form that resembles more closely regular Zig code, and the language features are already mostly here.

Consider the toy (but useful) example of going back and forth from AOS to SOA. A library making use of this may want to allow the user to provide a struct (or, similarly, a list of types) containing the members that it will use and store internally as SOA.

A naïve implementation may look something like this:

const std = @import("std");
const warn = std.debug.warn;

fn AOSToSOA(comptime S: type) type {
    const memberCount = @memberCount(S);

    return struct {
        const Self = @This();

        a: [5]u32,
        b: [5]f32,

        fn retrieve(self: *const Self, index: usize) S {
            var s: S = undefined;

            comptime var i = 0;
            inline while (i < memberCount) : (i += 1) {
                @field(s, @memberName(S, i)) = @field(self, @memberName(Self, i))[index];
            }

            return s;
        }

        fn store(self: *Self, s: S, index: usize) void {
            comptime var i = 0;
            inline while (i < memberCount) : (i += 1) {
                @field(self, @memberName(Self, i))[index] = @field(s, @memberName(S, i));
            }
        }
    };
}

const Foo = struct {
    a: u32,
    b: f32,
};

pub fn main() void {
    const SOA = AOSToSOA(Foo);
    var soa: SOA = undefined;

    var foo = Foo{.a = 18, .b = 32.5478};
    soa.store(foo, 0);
    var bar = soa.retrieve(0);
    warn("{}\n", bar);
}

Obviously, this works because I manually hardcoded the correct members that mirror Foo's. But everything else that uses the members is generic because Zig already offers reflection and handy features such as @field.
The only thing that's missing is a way to programmatically add members (and functions) to a struct. By:

  • allowing comptime blocks inside struct definitions
  • having some mechanism to "emit" expressions (definitions ?) in the enclosing struct

this can be written in an almost native way. For example:

return struct {
    // ...
        comptime {
            var i = 0;
            inline while (i < memberCount) : (i += 1) {
                @memberName(S, i): []@memberType(S, i),
            }
        }
    // ...

The line @memberName(S, i): []@memberType(S, i), probably needs some love. Maybe it would need a keyword to indicate it's a member definition. Maybe it can be a compiler intrinsic instead. Maybe it only needs to specify that @memberName(S, i) is an identifier for the grammar to work, for example using an enhanced version of @"identifier".

The same logic could be applied to function definitions.

I'm no language designer so take everything with a grain of salt, but this is in my opinion an obvious way one would want to do metaprogramming, staying as close as possible to "normal" syntax.

Member

andrewrk commented Oct 8, 2018

Consider the toy (but useful) example of going back and forth from AOS to SOA.

I don't have a more complete response to your comment yet, but I just wanted to validate this use case. This is a compelling use case that I want to have a clear solution in Zig, and if the best way for this to happen is with @reify, that's a pretty compelling argument for @reify.
