Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a unified approach to inheritance and Go style struct embedding #2938

Closed
mikdusan opened this issue Jul 23, 2019 · 20 comments
Closed

a unified approach to inheritance and Go style struct embedding #2938

mikdusan opened this issue Jul 23, 2019 · 20 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@mikdusan
Copy link
Member

for reference see #130, #1214

I was inspired after reading http://pling.jondgoodwin.com/post/favor-composition-over-inheritance/ to write this proposal. Please see the following code example for how this could work for Zig.

  • collision of inherited items will be resolved statically
  • inheriting undefined functions from multiple paths is allowed as long as the signatures match exactly; and up to 1 definition is allowed
  • inheriting the same def from multiple paths is allowed because it can be resolved to the same originating definition
// interfaces are just structs without fields
const Dump = struct {
    // declare an undefined (unimplemented) method
    // invoking an undefined method is a compile error
    // self is specified with type `Dump` but is just a placeholder
    pub fn dump(self: @This()) anyerror!void = undefined;
};

// another interface
const QuantifySize = struct {
    pub fn quantifySize(self: @This()) Size;
    pub const Size = enum { small, medium, big };
};

// this is an implementation of QuantifySize interface
const Small = struct {
    // inherit QuantifySize
    pub inherit QuantifySize;

    // this particular impl is static and does not require any fields
    pub fn quantifySize(self: Small) Size {
        return Size.small;
    };
};

// this struct will be inherited
const Vechicle = struct {
    pub mass: f64,
    pub volume: f64,
    pub color: []const u8,

    fn init() {
        return Vehicle{
            .mass = 0,
            .volume = 0,
            .color = "blue",
        };
    }
};

const Spaceship = struct {
    // inherit all pub members: { dump }
    // field name not required: Dump has no fields
    // `*Spaceship` is implicit-castable to `*Dump`
    pub inherit Dump,

    // implement Dump interface
    // self is now typed
    pub fn dump(self: *Spaceship) anyerror!void {
        std.debug.warn("this ship's size is {}\n",  self.quantifySize());
    }

    // inherit all pub members: { QuantifySize, quantifySize, Size }
    // field name not required: Small has no fields
    pub inherit Small,

    // inherit all pub members except: { shunned }
    // `*Spaceship` will not implicit-cast to `*Vehicle` because we did NOT inherit all members
    pub inherit ship: Vehicle { mass, value },

    // another possible inherit but not public
    inherit engine: ImpulseDrive { thrust, engage },

    crew_max: u32,

    fn init() {
        return Spaceship {
            .ship = Vehicle.init(),
            .engine = ImpulseDrive.init(),
            .crew_max = 4,
        };
    }
};

pub fn main() !void {
    var roci = Spaceship.init();
    try call_dump(roci); // implicit-cast to `Dump`
    std.debug.warn("ship mass is {}\n", roci.mass);
    std.debug.warn("ship color is {}\n", roci.color); // compile error: `color` not inherited
    std.debug.warn("ship color is {}\n", roci.ship.color); // ok: explicit access as aggregate
    try roci.engage();
    try roci.thrust();
}

fn call_dump(obj: Dump) !void {
    try object.dump();
}
@ghost
Copy link

ghost commented Jul 23, 2019

Maybe usingnamespace would be used here?

const S1 = struct{
  x: u32;

  fn s1Func(self: S1) void{
   // do stuff with x
  }
}

const S2 = struct{
  usingnamespace S1;
  y : u32;
  
  fn s2Func() void{
    // do stuff with y
  }
}

But what about the division between namespace members and instance members. Embed both? Embed only instance members..?

s2instance.s1Func( ..expects S1..) // example of issue

Usually when you are dealing with inheritance in OOP languages, it's all about instances.

@mikdusan
Copy link
Member Author

Maybe usingnamespace would be used here?

Overloading that keyword which has the word "namespace" implies that it does not include instance members.

But what about the division between namespace members and instance members. Embed both?

I'm inclined that the default inherit is to bring in all (pub) fields, methods and defs.

But if differentiation between { fields, methods } and { defs } is to be made, we could allow defs syntax for inherit:

pub inherit QuantifySize; // inherit defs: { Size }

@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Jul 24, 2019
@andrewrk andrewrk added this to the 0.6.0 milestone Jul 24, 2019
@emekoi
Copy link
Contributor

emekoi commented Jul 27, 2019

this is just bikeshedding, but would rather have usingnamespace changed to using and have it import defs (including functions/methods) and then have an embed keyword which import fields. this way data and behavior are not so tightly coupled.

@kyle-github
Copy link

If you take the "be explicit" goal of Zig, perhaps you should just use methods in unions. Then you have a very explicit set of code that shows exactly what you mean.

const AUnion = union { typeA: AStruct, typeB: BStruct, ... fn foo(self: AUnion) anyerror!bool { .... } }
Comptime code can probably help here as well as tagged unions to sweeten up the coding a bit. The nice part about this is that AStruct and BStruct do not necessarily have anything to do with each other.

Inheritance fails in some fundamental ways. Consider points. A struct with x and y, right? What about polar coordinates? They are points. They have a similar interface, but their interior data does not look at all similar and their methods' implementations are quite different. Perfect use for unions if I want to be able to pass around a generic point.

DRY and similar ideas like normalization in relational databases sound good on the surface, but when you actually maintain a code base for a few years it gets a lot more nuanced and you start regretting being too tricky. Note how heavily performance-oriented code bases such as game code use things like ECS rather than "normal" OO decomposition. Code spends a lot more time being extended and maintained than it does being created. Designing Zig to be maintainable first is where the win is IMHO.

Sorry for the rant. We now return you to your regularly scheduled programming.

@Tetralux
Copy link
Contributor

Tetralux commented Jul 27, 2019

@kyle-github The issue with the union idea is how do you introduce a new type to that union when you didn't create the union? Whereas with composition, it doesn't matter what the thing is that you're embedding and where it came from.

const ExtendedDuration = struct {
    embed time.Duration;
    // ... new fields etc here //
}

// ... //

fn sleep(d: time.Duration) void { ... }
sleep(ExtendedDuration.init(...));

@kyle-github
Copy link

@Tetralux that is precisely what I am talking about. So much of C, C++ and Java language platforms and tooling are designed to help break your code into small separate pieces that everyone assumes that this is a good idea. It isn't IMHO.

In your example, you can just use time.Duration as a normal field in the struct and you can provide the sleep function taking an ExtendedDuration. Your implementation (which could call the underlying one). Easy to find. Explicit encapsulation. Hard to break.

What happens in your example when the owner of time.Duration adds a field that conflicts with a field you added? What happens in your example when the owner of time.Duration adds a method that conflicts with a method you added? This is the fragile base class problem.

I get that this sounds a lot like I am just putting off the issue, but it is a very real consideration for large scale projects. I do not see a reason for Zig to support and encourage patterns that are anti-patterns for long term software maintenance.

Sorry for the rant :-(

@kyle-github
Copy link

The astute reader will note that I may be flip-flopping on this issue. I remember commenting over a year ago on the issue of struct embedding and how that was used fairly often in C. That is still true, but only needed for C interop and now there is a whole set of attributes that can be applied on a struct that will be shared with C.

Since then I have been looking more info things like the fragile base class and specific classes of maintenance problems we've been running into that cause us (at work) to do stupid things like having library owners and downstream library users run the same (copied) set of API tests so that the library users can figure out when the API subtly changed and broke something downstream.

@Tetralux
Copy link
Contributor

Tetralux commented Jul 28, 2019

@kyle-github
I'd expect trying to embed a struct with fields that have the same name as fields in the struct you are embedding into, to be a compile error, since the whole point is that you can deref fields of the embedded struct on the enclosing struct.

Though, you could also have some sort of annotation you could apply to the embed time.Duration declaration which would cause the embedded fields to become shadowed by the ones in the struct instead, in cases where you explicitly know that is what you want.


You can also name the embedded struct: embed base: time.Duration so that you could refer to the embedded part explicitly.
Indeed, maybe in order for you to add the annotation for shadowing, you have to name the embedded struct.

Perhaps you should also be able to set default values on embedded fields.

const Duration = struct {
    seconds: usize,
    // ...
};
const ExtendedDuration = struct  {
    embed base: time.Duration;
    seconds = 60; // `f=V` not `f:T` or `f:T=v`
};

You could also require that you name the embedded struct and do base.seconds = 60 for explicitness.

@kyle-github
Copy link

This got long, sorry :-(

@Tetralux I think we are talking past each other here a bit. The fragile base class problem is well known, and well understood and the methods that are proposed here provide the ability to code the fragile base class problem in Zig.

The question is not whether or not you can embed, but whether you should. Based on my experience with this exact problem, to me this is an area where Zig has an opportunity to fall back to its goal of explicit, clear behavior and make it much more difficult to create the fragile base class situation.

More points:

  • OK, so now you have name spaced your fields by having the embedded struct just be a normal field. But you still have the same problem for methods, right? If you name space those, then you are back to pure composition without any inheritance.
  • Again, it is 2AM, the code broke on a commit that seems to be completely unrelated, and you have 1.5M lines of code to deal with that you have not looked at much for a year. What do you want: all the methods associated with the top level struct definitions, or a game of turning over rocks to find where the actual implementation is?
  • how are these magic methods documented? Where do I go to find out what I can call on a struct that embeds other structs like this? How is that documentation updated when someone adds a method to the embedded struct? This isn't explicit. It is not clear. There is no single place to look for change.
  • what are the performance implications of this? The last time I looked at Go's implementation (which was a couple of years ago), it was not all that pretty (trampolines with offsets and other fun stuff) and with Spectre and Meltdown making all indirect function calls much slower, the overhead is growing not shrinking. So if I call method M directly on struct A it works very quickly, but if I call it on struct B which embeds struct A then it is 4x slower? How can I tell when this is going to happen?
  • what is the use case you are trying to solve and are there more Zen-of-Zig ways to do it instead? Zig has powerful tools like comptime that can make a lot of this much less painful and clear. You do not write Rust like C and you shouldn't write Zig like C (or Go). It is its own thing with its own best practices.
  • is this something that should be in Zig 1.0? Even if you find a solid use case for this, shouldn't the language get stable first and then add these edge case issues? Build a solid foundation first, then put in the gingerbread.

Systems languages need to work for the long term. There are many code bases in C today that are 20+ years old and are regularly used: Linux, Apache, Gnome, X Windows, macOS kernel... These are the kinds of things that Zig is aiming at. Initial coding should not be painful, but it also isn't the place you spend most of your time.

I think it was Dennis Ritchie who said that debugging is much harder that writing code, so that if you spend all your brain power writing the code, by definition you cannot debug it. I am paraphrasing a bit. The key point is that clarity and simplicity matter when the code is going to live for decades. Tricks and little things that helped you save 5% of your lines of code when writing the project usually are things you regret later.

I have high hopes for Zig. It seems like it takes so many of the lessons that we have learned over the last five decades of systems programming and uses the best practices and careful, but not intrusive, means to enforce those.

I think we should probably take this offline. I cannot tell if anyone else cares or is just eating popcorn :-)

@daurnimator
Copy link
Collaborator

I think it was Dennis Ritchie who said that debugging is much harder that writing code, so that if you spend all your brain power writing the code, by definition you cannot debug it. I am paraphrasing a bit.

Kernighan's Law:

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

@Tetralux
Copy link
Contributor

@kyle-github
It'll take me a while to take all that in, but the part about the implementation.
The compiler knows statically where the embedded field is and just uses that as a comptime offset to access the fields. It's no different than if you dereference a field that literally in the struct.

@Tetralux
Copy link
Contributor

Also, the laws you are quoting sound like you should spend little effort thinking about how to write the code because you need to spend half your effort on debugging. Which I agree with - just to be clear.

@Tetralux
Copy link
Contributor

Further, all I'm looking for here is that I have some common functionality about several types and I want it to be easy to just say "use the same impl as this other struct."
Or do you have a better suggestion to make as to how Zig could fix or otherwise address the need for that @kyle-github?

@kyle-github
Copy link

Kernighan's Law:

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

Thanks @daurnimator! Apparently my memory for quotes is not that good before my morning caffeine :-)

@Tetralux, Where you say:

Further, all I'm looking for here is that I have some common functionality about several types and I want it to be easy to just say "use the same impl as this other struct."

Then, yes, there is a better (IMHO) way to do it. First, this is inheritance, not composition, that you are talking about. You are trying to inherit implementation. That is exactly what is now understood to be relatively fragile.

The way I would normally do this is something like:

const StructA = struct {
    ... some fields ...
    fn foo(self:StructA) niftyError!i32 { ... }
    ... more methods ...
}

const StructB = struct {
    ... some fields ...
    fieldA: StructA,
    ... some methods ...
    fn foo(self:StructB) niftyError!i32 { return self.fieldA.foo(self.fieldA); }
    ... other stuff ...
}

You make it explicit. If someone adds fields or methods to StructA, nothing happens to StructB users. This seems like a violation of DRY, but it really isn't. You are encapsulating StructA in StructB.

Could there be some syntactic sugar at some point for this? Perhaps, but languages are moving in the direction of making things a lot more explicit (see all the @OverRide stuff from Java).

Thanks for the comments and pushing me on this. It has helped me refine why I am not fond of this idea. The core of what I object to in the examples is that the exporting of the functions of StructA (in the example) is implicit. If there was some way to make them explicit so that you did not have to write the boilerplate above, but still had complete control over which functions you supported, then I could get behind the idea.

@mikdusan
Copy link
Member Author

mikdusan commented Jul 29, 2019

Firstly thank you everyone for input and replies. It is truly appreciated.

DELEGATION PROPOSAL v2.1

Summary changes from original proposal:

  • drop keyword inherit
  • add keyword delegate
  • add keyword implement

ITEMS TO BE ADDRESSED IN THIS PROPOSAL

  • can interfaces bring in other interfaces via implement and does keyword become unclear at that point?
  • public vs. private applicability
  • vtable details
  • how are interface references stored?

GLOSSARY

TERM DESCRIPTION
base delegation subject
derived @This()
field struct data member
struct method regular struct method with param[0] self
method const struct method
member field or method

PREAMBLE

This proposal was inspired by article Favor Composition Over Inheritance. Some shared goals are to reduce fragile inheritance issues, and be more explicit. With that said, while I intended to keep this proposal on point as much as possible, if there are ancillary changes to Zig I will include them and mark as such.

PROPOSAL

  1. delegation: add language support for promoted resolution of base fields/methods to derived namespace. Related keyword is delegate.

If base has any fields they are aggregated just as with regular struct aggregation. The key difference is that delegation fields/methods are also directly resolvable to derived namespace. The type and address of delegated members are delegated to the aggregate at comptime.

To combat the fragile nature of namespace collision, either full or explicit delegation may be performed. When full delegation is performed, all base fields/methods are delegated and is probably appropriate for only very simple cases where multiple bases are not involved. Explicit delegation is more verbose and locks down exactly what is delegated and is recommended for more complex composition cases. Also, a powerful consequence of full delegation is that implicit casting to base type is gained.

  1. interfaces: add language support for functional interfaces to be specified via structs. The structs are restricted in that they cannot have fields and methods must be undefined. Related keyword is implement.

Multiple interfaces with the same method names may be implemented in the same derived. However, the method signatures must exactly match.

A derived struct which implements an interface gains implicit casting to interface.

[ANCILLARY] #1717 (an accepted proposal) syntax has been incorporated into this proposal.

DELEGATION SYNTAX

delegation = 'delegate' name ':' base [member_list] ',' ;
delegation = 'delegate' base [member_list] ';' ;
name = identifier ;
base = identifier ;
member_list = '{' member [',' member]... [','] '}' ;
member = identifier ['=' default] ;

delegate is a new keyword for delegation.

name must be specified if base has fields or omitted if base has no fields. If base has fields, derived will aggregate base. A colon is required after name for consistency with regular aggregation syntax.

base is the source struct for member delegation.

member_list is optional and specifies which members are to be delegated, otherwise all members are delegated.

member may optionally provide a default value that is available to derived or descendants of derived.

const A = struct {
    one: u32,
};
const B = struct {
    padding: [8]u8 = undefined,
    delegate a: A,
};
fn foo() void {
    var b = B{ .one = 1 };
    _ = b.one; // 1
    _ = &b; // 0x1000
    _ = &b.a; // 0x1008
    _ = &b.one; // 0x1008
}

INTERFACE SYNTAX

implementation = 'implement' interface ';' ;
interface = identifier ;

implement is a new keyword for which declares this struct implements an interface.

interface must be a struct which does not contain fields, and contains undefined methods. It may contain other namespace defs but only undefined methods are candidates for implementation.

const Format = struct {
    const Error = error{FormatError};
    const format = fn(self: @This(), out: *Stream) Error!void;
};
const Data = struct {
    implement Format;
    one: u32,
    two: u32,
    const format = fn(self: @This(), out: *Stream) Error!void { ... };
};

IMPLICIT CASTING

A struct which delegates all members from base can implicitly cast to base.

const A = struct {
    one: u32,
    two: u32,
};
const B = struct {
    delegate a: A,
};
const C = struct {
    delegate a: A { .two },
};
const bar = fn(a: A) void { ... }
const foo = fn() void {
    var b = B{...};
    var c = C{...};
    bar(b); // ok
    bar(b.a); // ok
    bar(c); // error
    bar(c.a); // ok
}

EXAMPLE

const Format = struct {
    const Error = error{FormatError};
    const format = fn(self: @This(), out: *Stream) Error!void;
};
const Cursor = struct {
    implement Format;
    pos: Size = 0,
    const Error = error{SeekError};
    const seek = fn(self: *@This(), where: Where) Error!void;
    const Where = union(enum) {
        pos: Size,
        forward: Size,
        backward: Size,
        begin: {},
        end: {},
    };
    const Size = u64;
    const format = fn(self: @This(), out: *Stream) Error!void {
        out.print("position={}", self.pos) catch return error.FormatError;
    };
};
const Read = struct {
    implement Format;
    delegate cursor: Cursor, // error: Cursor.format collision
    delegate cursor: Cursor { .pos, seek },
    const Where = Cursor.Where;
    const Error = error{ReadError};
    const read = fn(self: *@This(), buffer: []u8) Error!void { ... };
    const format = fn(self: @This(), out: *Stream) Error!void {
        out.print("read@{*} { ", &self) catch return error.FormatError;
        try self.cursor.format(out);
        out.print(" }") catch return error.FormatError;
    };
};

With this example we see that both Cursor and Read make use of a format interface. At first it might seem disappointing that Read cannot fully delegate Cursor but this is actually a good thing. In object parlance, a "reader" is not a "cursor" but it has a cursor which works very well for aggregation and delegation. In fact, we can see that through delegation the following is now possible:

const foo = fn() !void {
    var read: Read = ...
    out.warn("read position before {}\n", read.pos);
    try read.seek(Read.Where{ .begin = {} });
    out.warn("read position after {}\n", read.pos);
    dump(read);
}
const dump = fn(subject: Format) !void {
    try out.print("DUMP:\n");
    try out.print("    ");
    try subject.format(out);
    try out.print("\n");
}

IMPLEMENTATION DETAILS

RESTRICTIONS ON METHOD RETURN TYPE

A method return type cannot be inferred. An error or value type must be explicit.

RESTRICTIONS ON UNDEFINED METHODS

Undefined methods may only exist in structs without fields. They are essentially interfaces.

NAMESPACE COLLISION

Namespace collision resulting from keyword delegate is a compile error.

Namespace collision resulting from keyword implement is a compile error.


POSSIBLE ORTHOGONAL PROPOSAL: FIELD DECL DOT-SYNTAX

old_syntax = name [':' type] ['=' default] ',' ;
new_syntax = '.' name [':' type] ['=' default] ';' ;
  1. struct member decls of fields vs. namespace defs is currently resolved by using a comma terminator for fields. I think it makes sense to switch to new_syntax to differentiate.

  2. delegation syntax terminates with either a comma or semicolon to stay consistent with old_syntax. However, this can be simplified to always terminate with semicolon if new_syntax is adopted.

const Foo = struct {
    .some_flag;
    .one: u32 = 1;
    delegate .truck: Truck;
    delegate Format;

    const init = fn(backing: *std.mem.Allocator) Foo {
        return Foo{
            .some_flag = true,
            .truck = Truck.init(),
        };
    };
};

@kyle-github
Copy link

@mikdusan While this is a lot nicer, I still have the same objections. By making explicit sharing/delegation optional and making the easy option one that allows the fragile base class case to exist more easily, it seems like the proposal actively encourages coders learning Zig to immediately use the bad habits they picked up in other languages.

Using one of @andrewrk's favorite expressions, if we look at the null hypothesis here, what happens if you do your examples in Zig of today? What do you lose? Is that loss important?

What is the problem that this solves? What needs struct inheritance and substitution but cannot be solved in a way that uses Zig today?

My perspective from several decades of coding is that time saved writing the code the first time is usually lost many times over in time you spend debugging and maintaining code because you used short cuts when writing the code.

@MasterQ32
Copy link
Contributor

I really like that approach on interfaces, both semantic-wise and syntax-wise.

But a few things:

on syntax:

with the new function syntax, this would still be a function type assignment:

const Format = struct {
    const format = fn(self: @This(), out: *Stream) Error!void; // format is type, not function
};

i like the approach of defining an interface function as "undefined":

const Format = struct {
    const format : fn(self: @This(), out: *Stream) Error!void = undefined; 
};

But: this is already valid Zig code which defines an undefined function pointer named format.
As we impose several restrictions on interfaces ("only undefined functions, nothing else"), i propose to introduce a new type interface:

const Format = interface {
    fn format(self: @This(), out: *Stream) Error!void;
};

pro:

  • structs are not interfaces and are not mixed together (as in C++)
  • you don't have to read the whole struct to recognize the type as an interface

cons:

  • introduces a new type and keyword

on implementation detail:

We need a way to pass interfaces to functions or store references to them. Imho it does not make sense to pass interfaces by-value, but only by-reference. Thus, an interface object could be implemented as a pointer tuple of a type-erased pointer to the structure as well as a pointer to a static v-table.

As struct that implement a certain interface must implement all of the members and no "real" inheritance exists, the type erasure should not be a problem, if all interface functions take the interface by-reference

@Tetralux
Copy link
Contributor

I'll note for future reference, to avoid the Rustian problem of things like
Peekable(Rewinder(Reader(T))), which is obviously messy, leads to code bloat, but also requires the order to be that which you gave.

You should be able to compose interfaces together:

const Stream = Reader + Rewinder + Peekable;

// or...
fn Stream(comptime T: type) type {
    return Reader(T) + Rewinder(T) + Peekable(T);
}

fn read(s: *Stream) !usize { ... }

Something like that.

@MasterQ32
Copy link
Contributor

I just read the proposal again and found another problem:

A struct which delegates all members from base can implicitly cast to base.

This would imply we accept slicing which is usually a problem and/or programming error.
I would recommend to allow implicit casting only on pointer types and not objects itself. If we actually want to slice to object, we can just access the derived object field and make it explicit.

@mikdusan
Copy link
Member Author

closed: I have some more ideas and they diverge enough from this proposal to warrant closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests

7 participants