Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What generated methods should structs provide? #2372

Closed
Tracked by #2360
leafpetersen opened this issue Jul 30, 2022 · 15 comments
Closed
Tracked by #2360

What generated methods should structs provide? #2372

leafpetersen opened this issue Jul 30, 2022 · 15 comments

Comments

@leafpetersen
Copy link
Member

leafpetersen commented Jul 30, 2022

The struct proposal (#2360) specifies that a number of methods are automatically generated for structs. The core set, which I think are relatively uncontroversial, are equality, hashCode, copyWith method, and a default constructor are generated. In addition, I proposed a debugToString method to support debug printing in a way that is not tied to object interpolation and hence is less likely to accidentally cause type information to be retained unnecessarily (this is a common source of code size bloat in large applications).

The debugToString specification is only loosely thought through, @rakudrama had suggestions around using a more structured format.

It may also be desirable to include other generated methods, e.g. some way of providing general coding and decoding members (cc @jakemac53 )

This issue is for discussing the general question of what generated members to provide, and with what types and semantics.

It may prove useful to file sub-issues to discuss specific proposals, which can be linked in here.

@leafpetersen
Copy link
Member Author

@lrhn
Copy link
Member

lrhn commented Jul 31, 2022

I wouldn't start adding toDebugString() (which is what I'd name it) unless we can also add it to other objects. If we can add it to Object, I'd be delighted, but just having it on struct objects seems too arbitrary.

(A more generally useful functionality would be a List<Object?> get fields which provides the value of all the fields of the object. That's still reflection, and not something I actually want.)

@mmcdon20
Copy link

Why not have toString() implement this behavior instead of adding an additional toDebugString()?

Other languages with a similar feature use this approach, for example:

Data classes in Kotlin:

data class Coordinate(val x: Int, val y: Int)
fun main() {
    println(Coordinate(5, 10)) // Coordinate(x=5, y=10)
}

Case classes in Scala:

case class Coordinate(x: Int, y: Int)
println(Coordinate(5, 10)) // Coordinate(5,10)

Structs in Swift:

struct Coordinate {
  let x: Int
  let y: Int
}
print(Coordinate(x: 5, y: 10)) // Coordinate(x: 5, y: 10)

Records in Java:

record Coordinate(int x, int y) {}

public class Example {
    public static void main(String args[]) {
        System.out.println(new Coordinate(5, 10)); // Coordinate[x=5, y=10]
    }
}

Records in C#:

using System;

record Coordinate(int X, int Y);

public class Program
{
    public static void Main()
    {
        Console.WriteLine(new Coordinate(5, 10)); // Coordinate { X = 5, Y = 10 }
    }
}

@rrousselGit
Copy link

About the generated copyWith, will it support struct.copyWith(property: null)?

@lrhn
Copy link
Member

lrhn commented Aug 1, 2022

@rrousselGit As described, it will support setting a property to null.

As with constructors, this method cannot be generated as a strict de-sugaring, since the ability to detect whether or not an argument has been passed is not available in Dart.

@rrousselGit
Copy link

Nice! I saw that line but wasn't sure about the implications

I suppose two remaining useful "functions" would be a fromJson/toJson.
But I suppose we're unlikely to get those?

@lrhn
Copy link
Member

lrhn commented Aug 1, 2022

I definitely do not expect toJson/fromJson because there is no general way to go from any value to or from JSON, and the fields of a struct can contain any value.

@mraleph
Copy link
Member

mraleph commented Aug 2, 2022

I think we should consider splitting this into its own feature, I suggest to look at this as an interface auto derivation feature which works for structs and classes alike, e.g.

// dart.core

/// If you implement this interface compiler would derive implementations of
/// its methods in the class. 
abstract class Equatable {
  int get hashCode;

  bool operator == (Object other);
}

/// If you implement this interface compiler would derive implementations of
/// its methods in the class. 
abstract class Encodable {
  Map<String, dynamic> toJson();
}

/// If you implement this interface compiler would derive implementations of
/// its methods in the class. 
abstract class Encodable {
  void encodeTo(Encoder e);
}

abstract class Encoder {
  void addProperty(String key, Object? value);
  void addEncodable(String key, Object? value);
}


class X implements Equatable, Encodable {
  final int x;
  final int y;
  final Z z;  // class Z implements Encodable

  // derived by the compiler
  bool operator == (Object other) {
    return other is X && this.x == other.x && this.y == other.y && this.z == other.z;
  }

  // derived by the compiler
  void encodeTo(Encoder e) {
    e.addProperty('x', x);
    e.addProperty('y', y);
    e.addEncodable('z', z);
  }
}

This is a bit similar to macros but bakes in the support in the language.

I think one of the open questions here is deep vs shallow equality for things like lists and other collections and how to control derivation of that.

@Levi-Lesches
Copy link

Levi-Lesches commented Aug 3, 2022

Wouldn't methods that iterate over fields of the class be better handled by macros? Instead of inheriting a magically generated method, you'd apply the macro yourself to show that the code is being generated. Then you could also choose between the default (ie, most popular or built-in) implementation or something else (a macro on pub.dev or your own). The logic for generating these methods would be more transparent, as anyone could inspect the macro's source code or the generated augmentation files for themselves.

@leafpetersen
Copy link
Member Author

Why not have toString() implement this behavior instead of adding an additional toDebugString()?

Because overriding toString to provide debugging information is a common cause of code bloat in large applications (the compiler often cannot prove that the information which is used to generate the toString output is never actually used, because if there are any string interpolations anywhere (for example), it's usually too hard to show that specific objects don't flow there).

@leafpetersen
Copy link
Member Author

I think we should consider splitting this into its own feature, I suggest to look at this as an interface auto derivation feature which works for structs and classes alike, e.g.

I like this feature, but I'm not sure it gets you where you want in terms of brevity. Compare:

data class Foo(int x, int y);

vs

class Foo(int x, int y) implements Equatable, Hashable, FunctionalUpdatable, DebugStringable, Encodable;

Starts to get a bit wordy.

@mraleph
Copy link
Member

mraleph commented Aug 3, 2022

I like this feature, but I'm not sure it gets you where you want in terms of brevity. Compare:

For data class specifically we just declare that it implicitly implements a number of these (e.g. at least Equatable, Hashable and few others like serialisation one --- btw you might not really need formatting one if you have generic enough Encodable because it can be implemented on top of it).

One interesting question here is what to do when you can't auto-implement an interface, e.g. Encodable requires all fields to be Encodable. We could do this:

data class Foo(int x, int y);
data class Bar(int x, Baz baz);

Foo foo;
foo.encodeTo(e);  // ok

Bar bar;
bar.encodeTo(e); // *helpful* compile time error:
// bar does not implement Encodable because type Baz (of field baz) does not implement Encodable.
// Instead of default compile time error
// bar has no method encodeTo.

We could also make an effort to provide similarly helpful noSuchMethod errors for the dynamic invocations and type casts (at least on the VM).

My main reason for suggesting this feature is because I think there is a natural expectation that it should be generalisable to normal classes.

Wouldn't methods that iterate over fields of the class be better handled by macros?

Well, it depends. There are pros and cons. Autoderivation of interface conformance is a batteries included approach. It does not cover all use-cases and does not address all corner cases - but it handles common path well. No need to shop for a macro on pub, etc. It's part of the language, it's easy to understand, it works the same way in any code-base. It has less overhead than macros and it does not depend on macros shipping.

@lrhn
Copy link
Member

lrhn commented Aug 3, 2022

An issue with using interfaces for automatically implemented class contracts that forces restrictions on the class, is that interfaces are inherited, and so are implementations.

If I do class Foo(int x, int y) implements Encodable {}, then class Bar(super.x, super.y, Object? something) extends Foo {}, what happens?

  • Nothing, the class already has an inherited implementation of Encodable, so all is well, but something is not part of the encoding.
  • You get a compile-time error because the default implementation of Encodable.encode does not handle all primary members, and it won't override the inherited one?
  • You get a compile-time error because the default implementation of Encodable.encode does not handle all primary members, and it can't encode Object??
  • Number four?

I'd prefer a more explicit opt-in/out for each class. Maybe:

class Foo(int x, int y; ==, toString) { ... }

or

class Foo(int x, int y) { 
  auto operator==, toString;
}

@lrhn
Copy link
Member

lrhn commented Aug 3, 2022

We have a very hard time getting agreement on what the canonical == implementation is.

For structs I'd be happy with:

/// Use some efficient hash for the runtime type.
int get hashCode => Object.hash($runtimetypeHash(this), this.foo, this.bar, ...);

bool _fieldEquality(ThisType other) => this.foo == other.foo && this.bar == other.bar && ...;
bool operator==(Object other) {
  // Efficient, non-overridable `(this.runtimeType == other.runtimeType);`
  return $sameRuntimeType(this, other) && _fieldEquality(safeCast<ThisType>(other));
}

That is, equality that would agree with canonicalization (if field values do).

(I'd be fine with making the $... functions public so everyone can use them, and have typeHash<T> as well. Does not have to agree with Type.hashCode, I prefer not having to reflect runtime types into Type objects before using them.)

For classes, a default implementation might be the same. If you need subtypes to inherit equality (always tricky), you can write it yourself.

(Maybe add this to Type:

 static bool sameRuntimeType(Object? a, Object? b);
 static int runtimeTypeHashCode(Object? a);
 static int typeHashCode<T>();

)

@munificent
Copy link
Member

Perhaps we could do what Slava suggests without needing special language support by:

  1. Define a couple of macros in the core library like @equatable, etc. Those macros add an appropriate implements clause to the applied class and then synthesize an implementation of the appropriate method.

  2. Declare that struct types implicitly apply @equatable, etc. to the type.

This way there isn't a second unrelated mechanism for synthesizing methods, but users also don't have to verbosely declare that all of their struct types implement these protocols.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants