request: distinct types #1595

Open
emekoi opened this issue Sep 27, 2018 · 84 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@emekoi
Contributor

emekoi commented Sep 27, 2018

would it be possible to add distinct types? for example, to make it an error to pass a GLuint representing a shader as the program for glAttachShader.

@ghost

ghost commented Sep 27, 2018

do you mean something like strong typedefs? https://arne-mertz.de/2016/11/stronger-types/

I'm strongly for this 😃 👍

@nodefish

From the article posted by @monouser7dig:

They do not change the runtime code, but they can prevent a lot of errors at compile time.

Sounds like a job for comptime.

@XVilka
Sponsor

XVilka commented Sep 27, 2018

Certainly a good thing to have.

@emekoi
Contributor Author

emekoi commented Sep 27, 2018

how would comptime provide this feature? what i mean is we could do something like const ShaderProgram = distinct u32; and it would be a compile-time error to pass a plain u32 as a ShaderProgram and vice versa.

@ghost

ghost commented Sep 27, 2018

The current workaround is (like in C) to use a struct with just one member and then always pass the struct instead of the wrapped value.
The big downside is that setting and getting the member is always boilerplate, which discourages the use of such a type-safe feature.

@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Sep 27, 2018
@andrewrk andrewrk added this to the 0.4.0 milestone Sep 27, 2018
@andrewrk
Member

Without yet commenting on the feature itself, if we were to do it, I would propose not changing any syntax, and instead adding a new builtin:

const ShaderProgram = @distinct(u32);

@nodefish

nodefish commented Sep 27, 2018

how would comptime provide this feature?

You're right, I conflated this with the "strong typedefs" described in the article posted above. They are distinct concepts after all, no pun intended.

@emekoi
Contributor Author

emekoi commented Sep 27, 2018

yeah, i think @distinct is better than distinct.

@andrewrk andrewrk modified the milestones: 0.4.0, 0.5.0 Sep 28, 2018
@raulgrell
Contributor

raulgrell commented Sep 28, 2018

I'm actually quite fond of this idea. Would there be any issues if we took it further and allowed functions to be declared inside?

// Pass a block like in @cImport()
const ShaderProgram = @distinct(u32, {
    pub fn bind() void { ... }
    pub fn unbind() void { ... }
});

or an alternative way with minimal changes to syntax that is consistent with enum semantics of 'underlying type'.

const ShaderProgram = struct(u32) {
    pub fn bind(sp: ShaderProgram) void { ... }
    pub fn unbind(sp: ShaderProgram) void { ... }
};

EDIT: Just a bit further - this could allow for explicit UFCS. The blocks below would be equivalent:

ShaderProgram(0).bind();
ShaderProgram(0).unbind();

var sp: = ShaderProgram(0);
sp.bind();
sp.unbind();

ShaderProgram.bind(0);
ShaderProgram.unbind(0);

@PavelVozenilek

Nim has such a feature and it is rather clumsy. A distinct type loses all associated operations (for example, a distinct array type lacked even element access by []). All these operations have to be added back, and there is a lot of special syntax to "clone" them from the original type. A nice idea was butchered by the implementation.

@emekoi
Contributor Author

emekoi commented Sep 28, 2018

@PavelVozenilek but that's because nim has operator overloading. in zig all the operators are known at compile time, so wouldn't the compiler be able to use the implementation of the distinct type's base type?
for example:

const ShaderProgram = @distinct(u32); // produces a Distinct struct

const Distinct = struct {
    const base_type = // ...
    value: base_type,
};

and when an operator is invoked on the type the compiler can basically insert an @intCast(ShaderProgram.base_type, value.value) or the equivalent.

@Ilariel

Ilariel commented Sep 28, 2018

I think distinct types are useful, but I don't think they fit in Zig. There should be only one obvious way to do things if possible and reasonable.

The problem with distinct types is that you most likely don't want all of the operators or methods of the underlying type.

For example:

var first : = ShaderProgram(1);
var second : = ShaderProgram(2);
//This should be an error with all math operators
var nonsense : ShaderProgram = first *insert any operator here* second; 

@thejoshwolfe
Sponsor Contributor

void glAttachShader(GLuint program, GLuint shader);

I'm not familiar with the gl api, but I assume that program and shader are effectively opaque types. Despite being integers, it would not make sense to do any arithmetic on them, right? They're more like fd's in posix.

Perhaps we can scope this down to enable specifying the in-memory representation of an otherwise opaque type. There are two features that we want at the same time:

  • A library provides and accepts objects of a type that the client isn't supposed to do anything else with. These objects function as handles.
  • The handle must have some concrete in-memory representation so that the client and library can communicate coherently.

The recommended way to do this is to make a type with @OpaqueType(), and then use single-item pointers to the type as the handle.

const Program = @OpaqueType();
const Shader = @OpaqueType();

pub fn glAttachShader(program: *Program, shader: *Shader) void {}

But this mandates that the in-memory representation of the handle is a pointer, which is equivalent to a usize. This is not always appropriate. Sometimes the handle type must be c_int instead, such as with posix fd's, and c_int and usize often have different sizes. You have to use the correct handle type, so a pointer to an opaque type is not appropriate for these handle types.

Proposal

A new builtin @OpaqueHandle(comptime T: type) type.

const H = @OpaqueHandle(T);
const G = @OpaqueHandle(T);

var t = somethingNormal();
var h = getH();
var h2 = getAnotherH();
var g = getG();
  • assert(H != T); - You get a different type than you passed in.
  • assert(G != H); - Similar to @OpaqueType(), each time you call it, you get a different type.
  • assert(@sizeOf(H) == @sizeOf(T) and @alignOf(H) == @alignOf(T)); - Same in-memory representation.
  • H is guaranteed to behave identically to T in the extern calling convention. This includes when it is part of a larger type, such as a field in an extern struct.
  • h = t; t = h; h = g; // all errors - The handle types don't implicitly cast to or from any other type.
  • if (h != h2) { h = h2; } - Handles can be copied and equality-compared.
  • h + 1, h + h2, h < h2 // all errors - Whether T supported arithmetic or not, the handle types do not support any kind of arithmetic.
  • t = @bitCast(T, h); - If you really need to get at the underlying representation, I think @bitCast() should be the way to do that. Or maybe we should add special builtins for this, idk.

This is an exciting idea. I think this fits nicely into the Zig philosophy of beating C at its own game - Zig is preferable to C even when interfacing with C libraries. If you translate your GL and Posix apis into Zig extern function declarations with opaque handle types, then interfacing with the api gets cleaner, clearer, less error prone, etc.

@tgschultz
Contributor

tgschultz commented Sep 29, 2018

One objection I can think of to handling these as opaque types is that @distinct(T) as envisioned originally would be useful for C-style flags, and @OpaqueHandle(T) wouldn't because you can't use & and | with them without verbose casting.

Consider the following constants from win32 api

pub const WS_GROUP = c_long(131072);
pub const WS_HSCROLL = c_long(1048576);
pub const WS_ICONIC = WS_MINIMIZE;
pub const WS_MAXIMIZE = c_long(16777216);
pub const WS_MAXIMIZEBOX = c_long(65536);
pub const WS_MINIMIZE = c_long(536870912);
pub const WS_MINIMIZEBOX = c_long(131072);
pub const WS_OVERLAPPED = c_long(0);
pub const WS_OVERLAPPEDWINDOW = (WS_OVERLAPPED | WS_CAPTION | WS_SYSMENU | WS_THICKFRAME | WS_MINIMIZEBOX | WS_MAXIMIZEBOX);
pub const WS_POPUP = c_long(-2147483648);
pub const WS_SIZEBOX = WS_THICKFRAME;
pub const WS_SYSMENU = c_long(524288);
pub const WS_TABSTOP = c_long(65536);
pub const WS_THICKFRAME = c_long(262144);
pub const WS_TILED = WS_OVERLAPPED;
pub const WS_VISIBLE = c_long(268435456);
pub const WS_VSCROLL = c_long(2097152);

and the following window creation code:

var winTest = CreateWindowExA(
    0,
    wcTest.lpszClassName,
    c"Zig Window Test",
    @intCast(c_ulong, WS_OVERLAPPED | WS_MINIMIZEBOX | WS_SYSMENU),
    CW_USEDEFAULT,
    CW_USEDEFAULT,
    800,
    600,
    null,
    null,
    hModule,
    null
) orelse exitErr("Couldn't create winTest");

It may be desirable to ensure that APIs like this use a @distinct() type instead of a normal int constant, so that you do not accidentally pass something like WS_S_ASYNC, which is completely unrelated, or a (perhaps mis-spelled) variable containing an unrelated integer.

With @OpaqueHandle(T), the user could not use the function properly without casting the handles in a very verbose manner. This could be abstracted away by the API, though, by providing a varargs fn that would do that for you. Just something to consider since this is the usecase that sprang immediately to mind when I read the original proposal.
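As a rough illustration of the "let the API do the casting" idea above, here is a minimal sketch in Zig 0.11+ syntax. It uses a non-exhaustive enum as the distinct flag type, which is a technique that only comes up later in this thread, not the proposed @distinct or @OpaqueHandle builtin. The names WindowStyle and combine are hypothetical, and a slice stands in for varargs; the constant values are the ones listed above.

```zig
const std = @import("std");

// Hypothetical distinct flag type backed by u32.
const WindowStyle = enum(u32) {
    _,

    // Hypothetical helper: ORs the raw bits together so callers never cast.
    pub fn combine(flags: []const WindowStyle) WindowStyle {
        var bits: u32 = 0;
        for (flags) |flag| {
            bits |= @intFromEnum(flag);
        }
        return @enumFromInt(bits);
    }
};

const WS_OVERLAPPED: WindowStyle = @enumFromInt(0);
const WS_MINIMIZEBOX: WindowStyle = @enumFromInt(131072);
const WS_SYSMENU: WindowStyle = @enumFromInt(524288);

test "flags combine without casts at the call site" {
    const style = WindowStyle.combine(&.{ WS_OVERLAPPED, WS_MINIMIZEBOX, WS_SYSMENU });
    try std.testing.expectEqual(@as(u32, 655360), @intFromEnum(style));
}
```

A plain u32 cannot be passed where WindowStyle is expected without an explicit @enumFromInt, which is the kind of accident the comment above wants to rule out.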

@BarabasGitHub
Contributor

BarabasGitHub commented Sep 29, 2018 via email

@raulgrell
Contributor

Another use case: const Vec2 = [2]f32;

@ghost

ghost commented Sep 29, 2018

In Go, the new type inherits the operations but not the methods of the old type. I think this is a good way to do it, as it provides the benefit without great complexity; the type system does not need to catch every possible error, just help us.
https://golang.org/ref/spec#Type_declarations

@thejoshwolfe
Sponsor Contributor

I agree that bitflags are a separate issue. I've partially typed up a proposal for bitflags in Zig including extern bitflags appropriate for interfacing with C. Those WS_GROUP etc constants as well as http://wiki.libsdl.org/SDL_WindowFlags could be represented in Zig as this new bitflags type, and that would also lead to beating C at its own game. The proposal ended up being pretty complicated, so I haven't posted it anywhere yet.

I think the usecase for a handle type is still valid separate from the flags case.

@Meai1

Meai1 commented Oct 18, 2018

Wouldn't it be great if you can say that a function can receive either type A or B in a type safe way?

pub fn foo(myparam : u32 or []const u8) {
}

I know what the critique against it is: "just make a parent struct". But that's not the point, this gets the job done so much faster without all the boilerplate of constantly writing structs and naming them and setting them up even though I don't actually need a struct.
I'm usually always against adding any kind of type shenanigans but this is actually something I use and need all the time in languages like Typescript.

@Hejsil
Sponsor Contributor

Hejsil commented Oct 18, 2018

@Meai1 I'm not sure how this solves the issue. We're talking about allowing two names A and B, to have the same underlying type (usize or something else) but disallow implicit casts between them.

I think what you're proposing fits with #1268.

@Meai1

Meai1 commented Oct 18, 2018

@Hejsil Because I think that what is described in this issue is just a tiny subset of the general problem/solution of "type refinement":
https://flow.org/en/docs/lang/refinements/

edit: I guess they call it 'subtyping' when it is used to define something, in my opinion they look identical: https://flow.org/en/docs/lang/subtypes/

@emekoi
Contributor Author

emekoi commented Oct 18, 2018

what the first article talks about are sum types, which can be achieved through union types. as for subtypes, i don't see how that relates to distinct types. what i meant by distinct types is that when B is a distinct A, they are the same type but A cannot implicitly cast to B and vice versa. this means calling a function with the signature fn foo(bar: A) void with an argument that is of type B is an error, despite the fact that types A and B are identical.

@andrewrk
Member

andrewrk commented Oct 18, 2018

I think @thejoshwolfe's proposal is promising. One modification though:

t = @bitCast(T, h); - If you really need to get at the underlying representation, I think @bitCast() should be the way to do that. Or maybe we should add special builtins for this, idk.

Following with the established pattern, opaque handles would have their own casting functions, unique to opaque handles. @fromOpaqueHandle(x) to get the value, @toOpaqueHandle(T, x) to get the opaque handle.

The strongest argument against this I can think of is that it is More Language 👎 . The counter-argument is that it Clearly Prevents Bugs 👍

Here's a question to consider: in a pure zig codebase, would there be a reason to use @OpaqueHandle?

@ghost

ghost commented Oct 18, 2018

Whether T supported arithmetic or not, the handle types do not support any kind of arithmetic.

I think this is a bad idea, because it's very verbose, so people will not use it (enough)
#1595 (comment)

Here's a question to consider: in a pure zig codebase, would there be a reason to use @OpaqueHandle?

everywhere you use an int/ float type

@Hejsil
Sponsor Contributor

Hejsil commented Nov 2, 2018

If we're not gonna support the operators for the type, then we are pretty close to being able to have this in userland:

const std = @import("std");
const debug = std.debug;

pub fn OpaqueHandle(comptime T: type, comptime hack_around_comptime_cache: comptime_int) type {
    return packed struct.{
        // We could store this variable as a @IntType(false, @sizeOf(T) * 8)
        // but we lose the exact size in bits this way. If we had @sizeOfBits,
        // this would work better.
        ____________: T,

        pub fn init(v: T) @This() {
            return @This().{.____________ = v};
        }

        pub fn cast(self: @This()) T {
            return self.____________;
        }
    };
}


test "OpaqueHandle" {
    // I know that the 0,1 things is not ideal, but really, you're not gonna have
    // 10 or more of these types, so it's probably fine.
    const A = OpaqueHandle(u64, 0);
    const B = OpaqueHandle(u64, 1);
    debug.assert(A != B);
    const a = A.init(10);
    const b = B.init(10);
    debug.assert(a.cast() == b.cast());
}

Here's a question to consider: in a pure zig codebase, would there be a reason to use @OpaqueHandle?

I'm pretty sure I'd never use this.

@andrewrk
Member

andrewrk commented Nov 2, 2018

comptime hack_around_comptime_cache: comptime_int this could be a type and then you pass @OpaqueType() rather than 0, 1, etc.
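A minimal sketch of that suggestion, written in current Zig syntax (0.11 or later) rather than the syntax used in the thread at the time. `opaque {}` stands in for @OpaqueType(), and the field name raw plus the tag types ProgramTag and ShaderTag are assumptions made for illustration. The tag is referenced inside the returned struct so that it stays part of the resulting type's identity.

```zig
const std = @import("std");

// Userland helper in the spirit of the code above: wraps `T` in a
// single-field struct. `Tag` exists only to make instantiations differ.
pub fn OpaqueHandle(comptime T: type, comptime Tag: type) type {
    return struct {
        raw: T,

        // Referencing `Tag` keeps it part of this type's identity.
        pub const tag = Tag;

        pub fn init(v: T) @This() {
            return .{ .raw = v };
        }

        pub fn cast(self: @This()) T {
            return self.raw;
        }
    };
}

// Hypothetical tag types; each `opaque {}` declaration is its own type.
const ProgramTag = opaque {};
const ShaderTag = opaque {};

const Program = OpaqueHandle(u32, ProgramTag);
const Shader = OpaqueHandle(u32, ShaderTag);

test "tag types replace the integer counter" {
    try std.testing.expect(Program != Shader);
    const p = Program.init(10);
    const s = Shader.init(10);
    try std.testing.expect(p.cast() == s.cast());
}
```

Compared with the 0, 1, ... counter, a named tag cannot collide by accident when two modules each declare their own handle types.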

@daurnimator
Collaborator

@alunbestor could you write a helper function for your tests? It should work at comptime: (untested):

const DistinctIDType = enum(IDType) {
    _,
    const self = @This();
    pub fn fromInt(a: IDType) self {
        return @intToEnum(self, a);
    }
    pub fn fromIntArray(a: anytype) [@TypeOf(a).len]DistinctIDType {
        var r: [@TypeOf(a).len]DistinctIDType = undefined;
        inline for (a) |x, i| {
            r[i] = @intToEnum(self, x);
        }
        return r;
    }
};

Which would then make your boilerplate at usage go away:

const distinct_id = DistinctIDType.fromInt(123);
const distinct_ids = DistinctIDType.fromIntArray(.{ 0, 1, 2, 3 });

@Vexu
Member

Vexu commented Jan 5, 2023

I once again spent nearly an hour looking for a bug that would have entirely been prevented by this proposal.

@wooster0
Contributor

wooster0 commented Jan 20, 2023

What's the problem with changing existing semantics and making distinct types the default?
The existing @as() would be used for coercion and has to be done explicitly.

const Hello = u32; // distinct type; not an alias

// takes Hello, not u32
fn x(hello: Hello) void {
    _ = hello;
}

pub fn main() void {
    x(5); // ok; @as(comptime_int, 5) coerces to both u32 and Hello (or maybe we can require an explicit `@as(Hello, 5)` for this too, even for comptime_int)
    x(@as(u32, 5)); // bad; type is not Hello
    x(@as(Hello, 5)); // ok; type is Hello
    const y: u32 = 5;
    x(y); // bad; type is not Hello
    x(@as(Hello, y)); // ok; type u32 coerced to Hello explicitly
}

This basically disallows type aliases.

And then what if instead of a "distinct" keyword or builtin, we add an "alias" one?
So distinct types would be the default like above and if you really need a type alias, do probably this:

alias Apple = u8;

So now the safer thing, distinct types, would be the default and the implicit and less safe thing, type aliases, would be something you have to reach for purposely using a keyword that will be seen far less than const.

And I do think using a keyword for this would be better rather than @alias(u8) or something because it makes type alias creation more limited. They'd basically only be created using exactly the syntax alias {type name} = {type};.

We could however also just not have type aliases in the language and only have distinct types. But we should probably have type aliases.

@zzyxyzz
Contributor

zzyxyzz commented Jan 20, 2023

@r00ster91
That would defeat the whole point of Zig's unified const assignment syntax, which always acts as an alias, regardless of the kind of object assigned.

And anyway, distinct types should not be encouraged as a default, IMO. They have their uses, but the gain in safety is offset by the boilerplate to convert back and forth between the new type and the underlying value, so their desirability in any particular case is far from obvious.

@cryptocode
Sponsor Contributor

Just anecdotal evidence, but I've also seen the class of bugs this prevents in the wild many many times (in C++)
The userland kludges to implement strong typedefs are less than inviting.

@zzyxyzz
Contributor

zzyxyzz commented Mar 29, 2023

Let's not forget about the good ol' wrapper type solution. Intuitively, it feels like it ought to be more cumbersome than a dedicated distinct type facility, but look at this:

// A
const MyFloat = struct { val: f32 };
const x = MyFloat { .val = 5 };
const v = x.val;

// B
const MyFloat = @Distinct(f32);
const x = @as(MyFloat, 5);
const v = @as(f32, x);

// C
const MyFloat = @OpaqueHandle(f32);
const x = @toOpaqueHandle(MyFloat, 5);
const v = @fromOpaqueHandle(x);

You could call option B slightly more elegant, but the advantage is paper-thin at best. And it turns negative if we want to do more than create and pass around values. For example, adding two MyFloats would look like this, respectively:

MyFloat { .val = x.val + y.val }
@as(MyFloat, @as(f32, x) + @as(f32, y))
@toOpaqueHandle(MyFloat, @fromOpaqueHandle(x) + @fromOpaqueHandle(y))

Structs are clearly superior in this case, at least if distinct types have black box semantics. If they inherit operators (and methods?) from the underlying type, adding two values of the same distinct type would be as simple as x + y.

However... It's not at all obvious that this semantics is actually desirable. Sometimes you want to inherit operators and sometimes you don't. And even if you do, you might only want to support some of them. Adding apples to apples is good, but multiplying or xoring them probably isn't. We could try to make the necessary operations selectable in some way (see @user00e00's comment for example), but this quickly leads into too-much-complexity-for-too-little-gain territory, IMO.

In addition, some limitations would remain even with inheritance. For example, multiplying a MyFloat by 2 would still require @as(MyFloat, 2) * x instead of 2 * x. Automatic coercion can't be allowed because that would erode the distinctness of distinct types pretty badly.


TL;DR: Some of the use-cases for this proposal have now been subsumed by non-exhaustive enums. For the rest, manual struct wrapping is a surprisingly viable and flexible solution. Proper distinct types, as discussed so far, seem to be either a) equivalent b) worse or c) too complicated. As things stand, I don't think Zig needs this functionality.

@cryptocode
Sponsor Contributor

cryptocode commented Mar 30, 2023

@zzyxyzz Struct wrapping doesn't actually solve the problem when accessing through .val
And people will use .val directly at use-sites.

const Meter = struct {val: u32};
const EntityID = struct {val: u64};

[...]

// compiles, but wrong parent after refactoring, should've used entity.parent = root.id()
entity.parent.val = found_parent;

// compiles, but timing-related end variable used by mistake 
// (imagine this is the midst of some complex function)
len = end - origin.val;

@distinct types would catch this class of bugs (which is not uncommon in the wild)

const Meter = @distinct(u32);
const EntityID = @distinct(u64);

[...]

// compile error
entity.parent = found_parent;

// compile error, must fix using end + cast expr to len's type
len = end - origin;

Granted, setting .val directly when re-initializing should raise red flags, but that's weak protection.
Expressions like the second example would be the more common source of bugs.

@zzyxyzz
Contributor

zzyxyzz commented Mar 30, 2023

@cryptocode,
Could you expand this example a little bit? I don't quite understand what it's supposed to do and what error is to be prevented here.

@cryptocode
Sponsor Contributor

cryptocode commented Mar 30, 2023

@zzyxyzz I'll try. Using the last example:

len = end - origin.val;

Imagine len is u32, and is going to be used to serialize some length in meters to a file. Thus we need to go from the world of Meter to the world of u32.

This compiles because 1) end happens to be u32 as well, but it's a completely unrelated variable - for instance for timing purposes, and because 2) origin.val is accessed directly (which makes sense/is too tempting in such expressions)

Since struct wrapping invites the use of direct access to .val in expressions, you no longer have distinct types where this is done.

With distinct types, this won't compile:

len = end - origin;

for two reasons: 1) len (to be serialized to file) is of a different type, and 2) end is a different type and in fact the wrong variable. Potentially a long debug session averted :)

And that's the class of bugs distinct types catch: by introducing more types you reduce the chance for these mixups. I've seen this in monetary related apps for instance; the current discussion thread contains some other examples.

The fix (where pos.x is also Meter):

len = @as(u32, pos.x - origin);

Also, I personally don't think these casts are noisy in practice, because you'll stay in the world of the distinct type most of the time.

I find making the point in such small examples hard, but hope it makes sense.

Clearly this must be balanced with the added complexity of the compiler etc, but I do think it'll catch some otherwise hard-to-find bugs.

@zzyxyzz
Contributor

zzyxyzz commented Mar 30, 2023

Thanks, this makes it a bit clearer.

Though I feel there's a bit of an apples-to-oranges comparison involved here. len = end - origin would fail to compile with structs too, while the equivalent of end - origin.val with distinct types (end - @as(u32, origin)) would fail to catch the double bug involved here as well. So what exactly is the difference?

Also, since you are assuming that arithmetic is inherited, the struct-based solution would probably add some helper methods to Meter, so that the actual solution should look something like this:

len = pos.x.minus(origin).val;

which is more ergonomic and less error-prone.
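A minimal sketch of what such a helper could look like. Only the method name minus and the field val come from the comment above; the values and the test are illustrative.

```zig
const std = @import("std");

const Meter = struct {
    val: u32,

    // Subtraction stays inside the Meter type, so an unrelated u32
    // (such as a timing variable) cannot be passed as `other`.
    pub fn minus(self: Meter, other: Meter) Meter {
        return .{ .val = self.val - other.val };
    }
};

test "unwrap once, at the boundary" {
    const pos_x = Meter{ .val = 12 };
    const origin = Meter{ .val = 5 };

    // The raw u32 is extracted only where it is needed for serialization.
    const len: u32 = pos_x.minus(origin).val;
    try std.testing.expectEqual(@as(u32, 7), len);

    // const end: u32 = 3;
    // _ = pos_x.minus(end); // compile error: expected type 'Meter'
}
```

Direct access to .val is still possible everywhere, so this narrows the window for the mix-up discussed here without closing it.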

@cryptocode
Sponsor Contributor

cryptocode commented Mar 30, 2023

Well len = end - origin would fail to compile for a different reason.

The compile error "fix" would be len = end - origin.val and now you're back to the hard-to-find bug.

Adding methods could help, but I wouldn't rely on people doing that given how verbose it gets in more complex expressions.

So what exactly is the difference?

But why would you write that as (end - @as(u32, origin))? That feels constructed to introduce the bug.

You could add the last cast because of the compile error, but at that point you're more likely to realize the real problem, right? Because you would think... "why doesn't this compile... end and origin are both meters", and then you realize that's not the case. It's not like it's a panacea of course.

@InKryption
Contributor

The compile error "fix" would be len = end - origin.val and now you're back to the hard-to-find bug.

Isn't that the same with len = @as(u32, pos.x - origin); though? If you cast to the base type mistakenly, it hasn't actually solved the issue of using the data type inappropriately.

@zzyxyzz
Contributor

zzyxyzz commented Mar 30, 2023

@cryptocode,

But why would you write that as (end - @as(u32, origin))? That feels constructed to introduce the bug.

That's sort of my point. Given len = end - origin, the compiler will error out in both cases, and report that origin of type Meter cannot be subtracted from u32. But now you are for some reason assuming that with struct wrappers, the programmer will incorrectly "fix" this by simply unwrapping the value, while with distinct types they will realize that the line is totally wrong and rewrite it correctly. Why?

@cryptocode
Sponsor Contributor

cryptocode commented Mar 30, 2023

@zzyxyzz Yeah I get what you're saying, but the point of the example, len = end - origin.val, is that it's relatively easy to end up with code like that to begin with. And that compiles with a bug. Maybe my assumption is wrong, but that's the thinking.

The original point was that with the struct approach you'll have a lot of instances of accessing the wrapped value directly, which is the same thing as not using distinct types at all, right?

len = end - origin.val
If you cast to the base type mistakenly

@InKryption Right, I should've written a "fix" instead of the "fix" as the discussion derailed. The point, as mentioned above, was really that with wrapping structs you can end up with such code in the first place (not just have it as a bad fix)

@presentfactory

presentfactory commented Apr 12, 2024

Has there been any progress on implementing something like this? I feel like something like this would be a fairly simple thing to add to the compiler, but it'd help prevent a lot of bugs as others have noted (though I'm not a compiler dev so maybe I am misunderstanding the complexity here). Just curious because it has been over 5 years now since the original proposal.

@leecannon
Contributor

@presentfactory one reason this has not really progressed is that it is already possible in status quo to represent distinct integer types.

Just combining enum backing type + non-exhaustive enum + @intFromEnum & @enumFromInt gets you basically the same behaviour as @distinct would have.

const Program = enum(u32) { _ };
const Shader = enum(u32) { _ };

pub fn glAttachShader(program: Program, shader: Shader) void {
    // use `@intFromEnum` to get the values
}
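A short usage sketch of this pattern, assuming Zig 0.11 or later (where @enumFromInt infers its result type). The function attachShader is a hypothetical stand-in for the real GL call and simply returns the raw IDs so they can be inspected.

```zig
const std = @import("std");

const Program = enum(u32) { _ };
const Shader = enum(u32) { _ };

// Hypothetical stand-in for glAttachShader: unwraps both handles.
fn attachShader(program: Program, shader: Shader) [2]u32 {
    return [2]u32{ @intFromEnum(program), @intFromEnum(shader) };
}

test "non-exhaustive enums as distinct integer handles" {
    const program: Program = @enumFromInt(1);
    const shader: Shader = @enumFromInt(2);

    const ids = attachShader(program, shader);
    try std.testing.expectEqual(@as(u32, 1), ids[0]);
    try std.testing.expectEqual(@as(u32, 2), ids[1]);

    // Handles can still be compared for equality.
    try std.testing.expect(program == program);

    // _ = attachShader(shader, program); // compile error: argument types swapped
    // _ = program + 1;                   // compile error: no arithmetic on enums
}
```

This gives handle-like behaviour (copy and equality only) for integer-backed types, which is why it does not extend to structs.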

@presentfactory

@leecannon I mean sure but that's kinda ugly and imo not a good solution to the issue more generally since it does not work for more complex types like structs.

Often I have something like a Vec4 which say represents a quaternion and not a position and I'd like to make a distinction there so I don't accidentally pass something intended as a rotation to a function expecting say a position or a color. This like others have said has caused me preventable bugs in the past, so a more general solution is needed.

@SuperAuguste
Sponsor Contributor

does not work for more complex types like structs

This is a non-problem for structs and unions, though. The solution for those types is to just make distinct structures for each different representation, which is what you should be doing regardless.

For example, both types below are distinct:

const Rgba = struct {r: f32, g: f32, b: f32, a: f32};
const Vec4 = struct {x: f32, y: f32, z: f32, w: f32};

If I have

pub fn myFunc(color: Rgba) void { ... }

calling myFunc(Vec4{ ... }) is not permissible.

@presentfactory

presentfactory commented Apr 12, 2024

@SuperAuguste It is a problem though, I don't want to have to re-type the struct redundantly every time like that, that's just WET and bad practice.

Also the point is that while they are all distinct types, a Color, Position and Quaternion are all a Vec4, meaning they can still use the base Vec4 functions for linear algebra operations. With the approach you propose you'd have to duplicate all the functions across all these structs, or pass them to non-member functions taking anytype, which is just bad.

There's simply no way around this, distinct types are needed and that's that. The assumption that all you need is aliasing on type assignment is incorrect and there needs to be a mechanism to control this behavior. It'd be like saying all you need in a programming language is references, obviously this is untrue, copies are needed sometimes.

@Beyley

Beyley commented Apr 12, 2024

@SuperAuguste It is a problem though, I don't want to have to re-type the struct redundantly every time like that, that's just WET and bad practice.

It's not re-typing the same struct though. RGBA should have its fields be r, g, b, a and Vec4 should have its fields be x, y, z, w. These are not only distinct in that they represent different data, they also should have different field names, and multiplying colours is not always the same as multiplying vectors. You can also use usingnamespace here with some comptime to dedupe the member functions as well
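One possible way to share member functions between otherwise distinct struct types is sketched below. Note that this is a different technique from the usingnamespace approach mentioned above: it uses a generic type function parameterised by a tag type. The names Vec4, Tag, Position and Color are hypothetical, and the fields are deliberately the same x, y, z, w for every instantiation, so it does not address the point about differing field names.

```zig
const std = @import("std");

// Hypothetical generic: each distinct `Tag` yields a separate
// 4-component type, all sharing one implementation of `add`.
fn Vec4(comptime Tag: type) type {
    return struct {
        x: f32,
        y: f32,
        z: f32,
        w: f32,

        const Self = @This();

        // Referencing `Tag` keeps it part of this type's identity.
        pub const tag = Tag;

        pub fn add(a: Self, b: Self) Self {
            return .{ .x = a.x + b.x, .y = a.y + b.y, .z = a.z + b.z, .w = a.w + b.w };
        }
    };
}

const Position = Vec4(opaque {});
const Color = Vec4(opaque {});

test "shared methods, distinct types" {
    try std.testing.expect(Position != Color);

    const p = Position{ .x = 1, .y = 2, .z = 3, .w = 1 };
    const q = Position{ .x = 3, .y = 2, .z = 1, .w = 0 };
    try std.testing.expectEqual(@as(f32, 4), p.add(q).x);

    // const c = Color{ .x = 1, .y = 0, .z = 0, .w = 1 };
    // _ = p.add(c); // compile error: expected Position, found Color
}
```

The trade-off is that every operation returns the same tagged type, so mixing a Position with a Color requires an explicit conversion written by hand.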

@presentfactory

@Beyley In this isolated case sure it has different member names but that's irrelevant, usually people implement it with a normal vector type because colors are fundamentally vectors. They fundamentally have the same primitive operations too because they are again, vectors. I do not understand why people are trying to poke holes in this, it'd be incredibly useful to have this feature and it does not matter if it does not cover every single conceivable use case.

Also yes I'm sure there's many ways to hack this like the enum method for integers but I do not want ugly hacky things to do what should be a trivial operation in the compiler. usingnamespace is not meant for this nor would anyone find that method intuitive or easy to understand, same with the enum method for integers.

@presentfactory

presentfactory commented Apr 13, 2024

Thinking on it some more, I do actually think the difficulty in "inheriting" behavior with distinct types like I propose is determining what, say, the return value of something is. I think though that as long as things are annotated it is still useful, and when you do not want this sort of behavior, some sort of function to make a total copy of the type instead would be good too (as really there's two different types of behavior here one might want). So something like:

const Position = @inherit(Vec3);
const Direction = @inherit(Vec3);

fn addPositionDirection(p: Position, d: Direction) Position {
  // Fine, p/d can call fn add(a: Vec3, b: Vec3) Vec3 as they can coerce to Vec3,
  // and the returned Vec3 can coerce back to a Position to return from this function
  return p.add(d);
}

var p: Position = ...;
var d: Direction = ...;

const v = p.add(d); // Fine, returns a Vec3
const p2: Position = p.add(d); // Fine, returns a Vec3 but coerces back to a Position
const p3 = addPositionDirection(p, d); // Fine
const p4 = addPositionDirection(d, p); // Error

And then for when this behavior is not desired (more useful for things like handles where you actually don't want them to be compatible with their base type):

const Handle = @clone(u32);

var h1: Handle = ...;
var h2: Handle = ...;

const h3 = h1 + h2; // Fine, the addition operator conceptually is a member of this type and is cloned with it, calling fn add(a: Handle, b: Handle) Handle essentially, resulting in another Handle
const h4 = h1 + 5; // Error, even though Handle is cloned from an integer it's not able to coerce like this

The issue with cloning things like this, however, is that you lose the ability to do any operations on the type with, say, normal integers. This is especially a problem with primitive types, as you cannot actually add new behavior to them (they aren't really structs you can add new methods to, unlike a user-defined type). To solve that there would probably need to be some sort of cast operator to allow for explicit casting between compatible clones of the type (rather than the implicit coercion of the inheritance-based method).

Something like this:

const h4 = h1 + @similarCast(5); // Casts 5 to a Handle to allow it to be added
const bar = bars[@similarCast(h1)]; // Casts the Handle to a usize to allow for indexing with it

With user defined types you could probably just do this via some sort of anytype "conversion constructor" I guess which gets cloned into each instance and allows for conversions between them:

const Vec = struct {
  x: f32, y: f32,

  const Self = @This();

  fn new(other: anytype) Self {
    return .{ .x = other.x, .y = other.y };
  }
};

const Position = @clone(Vec);
const Direction = @clone(Vec);

var p: Position = ...;
var d: Direction = ...;

// Does the same thing as what the inheriting sort of distinct types would, just a lot more verbosely, and again this only works for user defined types where this sort of anytype thing can be added
const p = Position.new(Vec.new(p) + Vec.new(d));

Overall it is a pretty tricky problem as there are multiple ways of making distinct types like this and multiple ways of solving the issues with each approach...but hopefully this bit of rambling is useful in figuring out what Zig should do. Might also be worth looking at some other languages that do this, I don't know of any myself but Nim seems to with its own method where it clones the type but without any of the methods/fields for some reason in favor of having to explicitly borrow them, and relying on explicit casts to go between similar types: https://nim-by-example.github.io/types/distinct/
