Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sum types are not instantiable using C interface #2074

Closed
dcz-self opened this issue Jan 8, 2024 · 15 comments · Fixed by #2086
Closed

Sum types are not instantiable using C interface #2074

dcz-self opened this issue Jan 8, 2024 · 15 comments · Fixed by #2086

Comments

@dcz-self
Copy link
Contributor

dcz-self commented Jan 8, 2024

The manual doesn't list them at all in https://futhark.readthedocs.io/en/stable/c-api.html#opaque-values , but for an argument (f: (#none | #some u16)) the header file contains some procedures:

"types":{"(#none | #some u16)":{"ctype":"struct futhark_opaque_d8e... *","kind":"opaque","ops":{"free":"futhark_free_opaque_d8e..","restore":"futhark_restore_opaque_d8e...","store":"futhark_store_opaque_d8e..."}}

This is not enough to create or read values of this type, though.

@dcz-self
Copy link
Contributor Author

dcz-self commented Jan 8, 2024

I propose that the C interface for the above type as an example looks something like this (words "futhark" and "opaque" skipped for brevity:

int new_mysum_none(struct futhark_context *ctx, struct mysum **out);
int new_mysum_some(struct futhark_context *ctx, struct mysum **out, struct mysum_variant_some *value);
/// Returns an error value when the type does not contain the variant
int futhark_project_mysum_none(struct futhark_context *ctx, const struct mysum *obj);
int futhark_project_mysum_some(struct futhark_context *ctx, const struct mysum *obj, struct mysum_variant_some **out);
/// Puts the ordinal number of the current variant in `out`
int getvariant_mysum(struct futhark_context *ctx, const struct mysum *obj, unsigned int *out);

The manifest would list in a predefined order the names and arguments of each variant - not sure where is appropriate.

@athas
Copy link
Member

athas commented Jan 8, 2024

This is clearly something that would be very useful, particularly for bridges that can then provide a completely ergonomic interface on top. The proposed interface matches the corresponding one for records. I'm not sure about the getvariant function. When is that useful?

@dcz-self
Copy link
Contributor Author

dcz-self commented Jan 8, 2024

I was thinking of using getvariant when there's N variants, to optimize from having to call N functions to calling getvariant and knowing immediately which code path to take.

@athas
Copy link
Member

athas commented Jan 8, 2024

The number wouldn't tell you which of the projection functions would work.

@dcz-self
Copy link
Contributor Author

dcz-self commented Jan 8, 2024

Why not? Each projection function is based on the variant, not the type you intend to get out, so the number describing which variant is there would automatically point to the valid function.

@athas
Copy link
Member

athas commented Jan 8, 2024

How does the number indicate which function works?

@dcz-self
Copy link
Contributor Author

dcz-self commented Jan 8, 2024

Suppose we have a type (#none | #some u16 | #foo []u8). It's described in the manifest as having 3 variants in the following order: variants: [none, some, foo].
When dealing with an instance of #none, futhark_project_mysum_none returns success when called on the instance, futhark_project_mysum_some and _foo both return failure. getvariant (maybe more accurately called whichvariant) returns success and 0 (the index in the variants manifest field) as the out value.
With #some 123, futhark_project_mysum_some returns success and 123 as the out value. whichvariant returns success and 1 as the out value.
With #foo [1,2,3], whichvariant returns success and 2 as the out value.

The whichvariant function returns just a simple discriminant, without any special logic behind it.

@athas
Copy link
Member

athas commented Jan 8, 2024

So the manifest needs to store the constructors as an ordered list, rather than as a dictionary.

@dcz-self
Copy link
Contributor Author

dcz-self commented Jan 8, 2024

Correct.

@athas
Copy link
Member

athas commented Jan 8, 2024

Oh, now I remember why this is a bit tricky to implement: we deduplicate constructor payloads. That does not make this facility impossible, but we have to be quite careful about preserving how the logical representation maps to the physical one.

@athas
Copy link
Member

athas commented Jan 8, 2024

See also #1960.

@athas
Copy link
Member

athas commented Jan 9, 2024

I don't think the variant function can possibly fail, so I'm inclining towards this type for simplicity:

int futhark_variant_opaque_sum(struct futhark_context *ctx, const struct futhark_opaque_sum *v);

@athas
Copy link
Member

athas commented Jan 10, 2024

While destruction (what is called "projection" above) is straightforward to implement, construction turns out to have a complicated wrinkle. The reason is that when we have a type such as

type sum [n][m] = #foo ([n]i32) | #bar ([m]i32)

then even when we construct just the #foo variant, we must also provide a size m. In the language itself, this is handled by type inference, but of course we don't have access to inference in the C API.

We can't just default unknown sizes to 0, because it is possible a value of this sum type may be passed to an entry point that requires e.g. sum [3] [4] - constraints are imposed on the types of all variants; not just the one a specific value happens to use.

This is ultimately due to our value representation, which also physically represents even those variants that are not in active use. This representation is a tradeoff, and cannot/should-not be changed.

We have some options.

  1. Start with a partial implementation that supports destruction, but not construction.
  2. We can also allow construction of sum types that do not contain arrays, because they don't have size parameters. If we really push it, we can also allow arrays with fixed sizes, but this will require some extra plumbing to be laid, as we currently don't track size information/constraints at the entry point type level.
  3. Require that the construction functions pass the necessary sizes. This is a bit awkward, because it's probably not going to be very clear to the user in which order the sizes must be provided. I'm also not sure how I would encode this information in the manifest.

@dcz-self
Copy link
Contributor Author

For what it's worth, I don't think this problem is avoidable, unless you hide possible members behind a pointer. Otherwise, space big enough for the largest member needs to be allocted, not just the instantiated one.
I think ultimately tracking size information is the most elegant idea: it's the one that results in amost complete and user-friendly interface (the API consumer doesn't have to track sizes). I think that covers a similar area to #2070

@athas
Copy link
Member

athas commented Jan 12, 2024

We have full control over the representation, so we can hide whatever we want. Currently sum types do not have any special representation, however.

I think I will add the API discussed above, as it is the simplest and most elegant thing. Destruction works and is completely reliable, but construction results in types with size 0 for arrays in the the unused variants. This will not result in memory-unsafety, but may lead to controlled errors when they are passed to entry points that are picky about those sizes. This is certainly not great, but it can be worked around by the programmer (by making the entry points less picky), and maybe we can make it better in the future, without changing the API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants