Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recursive enum testcase #26

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

tyt2y3
Copy link

@tyt2y3 tyt2y3 commented Jun 1, 2024

This used to work before "Merge branch 'ijackson-precise-bounds'" (2813ba0)

And fails to compile afterwards.

@tyt2y3 tyt2y3 mentioned this pull request Jun 1, 2024
@ijackson
Copy link
Contributor

ijackson commented Jul 8, 2024

Hrm. ISTM that this ought to work. Sorry for breaking it. I'll look into it.

(I have been away, so sorry for the delay.)

@ijackson
Copy link
Contributor

ijackson commented Jul 8, 2024

So. I wrote in #19

I think this is not a semver break: the bounds we now require are precisely correct, so the effect is to make code compile (and work) which would previously not compile.

That turns out to have been wrong. I hadn't considered the recursive case. My apologies.

With 0.6.0, there is a workaround:

    #[derive(Educe)]
    #[educe(Hash(bound()))]

I think needing that syntax is rather poor. I think it ought to realise that when there are no generics, there is no need for bounds. I'm not sure if doing that would be a breaking change.

It would definitely be a breaking change, and maybe even an inadviseable one, for the case where the struct has generic lifetimes. Consider

struct S<'r>(&'r ());
impl Clone for S<'static> { ... }

#[educe(Clone)]
struct T<'r>(S);

Where there aren't any generics I think it should be OK? But it would be nice to be sure.

@ijackson
Copy link
Contributor

ijackson commented Jul 8, 2024

I think I see two possible behaviours:

  1. Change the default bounds to be none, if the type has no generics. Recursive types with generics would have to explicitly state the bounds (eg via bound() if none are needed). This is the thing I suspect this is not a breaking change - but I could be wrong again.
  2. Change the default bounds to be none in every case (like in 0.4), and maybe provide a syntax for field-type-based ("precise") bounds.

I don't think the ad-hoc identifier-scanning approach taken in 0.5.11 and earlier is a great idea, even as a default behaviour. It's rather unpredictable. And each improvement to it is a breaking change.

@ijackson
Copy link
Contributor

ijackson commented Jul 9, 2024

Having slept on it, I think I prefer 2, which is basically going back to the default bounds semantics of educe 0.4.

It would be nice to provide an option for precise field-based bounds but I don't know what the syntax should be. bound(.*) is possibly too obscure, but it could be extended later to be bound(.field, .field2) to mean "generate a bound type-of-field: Trait, type-of-field2: Trait".

@magiclen, what do you think? I'm happy to implement either of the options I suggest above.

@tyt2y3
Copy link
Author

tyt2y3 commented Jul 22, 2024

Thank you @ijackson this is a difficult problem indeed. Sorry if it's been written somewhere, what's the original idea behind ijackson-precise-bounds? Would it make compilation faster?

I'd say if there is no obvious solution, may be we can have a special tag like:

#[derive(Educe)]
#[educe(Hash(resursive))]

that's basically an alias to the old behavior.

I hadn't considered the recursive case

It's Hyrum's Law haha!

@ijackson
Copy link
Contributor

Thank you @ijackson this is a difficult problem indeed. Sorry if it's been written somewhere, what's the original idea behind ijackson-precise-bounds? Would it make compilation faster?

I was trying to fix the scheme in 0.5 which basically tried to grep the field types to see which generics ought to get bounds. I think that approach is fundamentally wrong.

The "precise bounds" thing gets this exactly right, when the compiler can cope. We basically write where <field-type>: Trait for each field. The problem with recursive structs is this: struct Recursive(Box<Recursive>) generates a bound like impl Clone ... for Recursive where Box<Recursive>: Clone. When the compiler tries to figure out whether Recursive is Clone, it ends up trying to figure out whether Box<Recursive> is Clone and gets stuck in a loop. It can't see that it could just say "ok well if it is Clone then everything is fine and consistent".

I hadn't considered the recursive case

It's Hyrum's Law haha!

Well, I think in fact it's just me making a mistake. If someone had said "what about recursive structs" I would have realised this would be a breaking change, and probably also thought it was a poor default.

So, sorry.

I'd say if there is no obvious solution, may be we can have a special tag like:

#[derive(Educe)]
#[educe(Hash(resursive))]

that's basically an alias to the old behavior.

I'm not a huge fan of this syntax suggestion.

There's already a workaround: write #[educe(Hash(bound())] (or possibly with something more complex in bound()).

I think we should take a step back.

I think we have the following possible behaviours for bounds:

  1. No bounds (default in educe 0.4).
  2. Every generic parameter gets a bound (default for std's derives)
  3. Every field gets a bound for that field's type (default in educe after my "precise bounds" MR)
  4. Apply bounds explicitly provided by the user (all versions of educe support this, with a similar syntax; for obvious reasons this can't be the default)
  5. Some kind of ad-hoc name-based searching of field names (default in early educe 0.5.x)

Of these, I think 5 is incoherent and error-prone ought to not to be offered. For example it can be easily confused by type aliases and can't see into generic field types properly.

I think the main question is which of 1-3 ought to be the default.

Previously I thought that 3 ("precise" aka field-type-based) bounds was the best default. It works right in almost every situation - except recursive types (and maybe some other weird cases, I guess). But the error is very confusing and tickets we've seen suggest that users don't know that they could fix it by specifying the bounds precisely with the bound() feature.

I now think that 1 or 2 ought to be the default. 1 (no bounds by default) has the virtue of simplicity and a complete lack of magic ; and the error, if bounds are needed, is fairly easy to figure out. It is also nice because a main reason for using educe, rather than std, is precisely that we want a derived impl which isn't bounded by the generics. IME many of use sites of educe are for this reason, and certainly the vast majority very happy with the 0.4 "no bounds by default" behaviour. So I suggest going back to "no bounds" by default (option 1).

The next question, is what syntax to provide for specifying bounds.

The #[educe(Clone(bound(T: Clone)))] syntax for explicitly specified bounds (4 above) has been available for a long time and is nice. If "all generics" (2) and "precise field-based" (3) aren't provided with an explicit syntax, the same thing can be written out by the user by hand, although it can be repetitive.

Do we want to provide a syntax for "all bounds"? Do we want to provide a syntax for "make a bound from this field type" or "from all field types"?

#[educe(bound(*))] is a possible syntax for "make a bound for every generic". Possibly the field type based ("precise") bounds feature is too subtle and doesn't need to be provided.

@tyt2y3
Copy link
Author

tyt2y3 commented Aug 5, 2024

thank you for your detailed explanation! I agree with 1 or 2 being the default, and between 1&2, I think I'd pick the one being 1) simpler and 2) less surprising.

I think #[educe(bound(*))] looks good.

@ijackson
Copy link
Contributor

ijackson commented Aug 5, 2024

thank you for your detailed explanation! I agree with 1 or 2 being the default, and between 1&2, I think I'd pick the one being 1) simpler and 2) less surprising.

I think #[educe(bound(*))] looks good.

Thanks for the feedback. I'd be happy to make an MR along those lines.

But, I think it would make sense to complete (or reject) the refactoring I'm proposing in #31 first. That work will radically reduce the number of different places that needed to be edited. (The last time I touched this there were 18 almost-identical hunks...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants