Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ascent should omit trait bounds on the generated program's type declaration #23

Closed
regexident opened this issue Dec 14, 2023 · 5 comments · Fixed by #25
Closed

Ascent should omit trait bounds on the generated program's type declaration #23

regexident opened this issue Dec 14, 2023 · 5 comments · Fixed by #25

Comments

@regexident
Copy link
Contributor

For generic programs ascent currently requires something along the following:

ascent! {
    pub struct AscentProgram<T: Clone + Eq + Hash>;

    relation dummy(T);

    // ...
}

or

ascent! {
    pub struct AscentProgram<T> where T: Clone + Eq + Hash;

    relation dummy(T);

    // ...
}

which then generates code that looks something like this:

pub struct AscentProgram<T>
where
    T: Clone + Eq + Hash,
{
    pub dummy: Vec<(T,), Global>,

    // ...
}

impl<T> AscentProgram<T>
where
    T: Clone + Eq + Hash,
{
    pub fn run(&mut self){
        // ...
    }

    // ...
}

// ...

This unfortunately tends to lead to problems with rather nasty effects one one's code and is thus generally considered an anti-pattern.

As such the official Rust API guidelines has a chapter on the topic (it talks specifically about derived trait bounds, but the viral breaking change effect is not limited to derived traits):

Data structures do not duplicate derived trait bounds

Generic data structures should not use trait bounds that can be derived or do not otherwise add semantic value. […]

// Prefer this:
#[derive(Clone, Debug, PartialEq)]
struct Good<T> { /* ... */ }

// Over this:
#[derive(Clone, Debug, PartialEq)]
struct Bad<T: Clone + Debug + PartialEq> { /* ... */ }

Generally speaking, adding a trait bound to a data structure is a breaking change because every consumer of that structure will need to start satisfying the additional bound. […]

There is a grey area around other non-derivable trait bounds that are not strictly required by the structure definition, like Read or Write. They may communicate the intended behavior of the type better in its definition but also limits future extensibility. Including semantically useful trait bounds on data structures is still less problematic than including derivable traits as bounds. […]

In short the problem is that trait bounds attached to a type's declaration spread virally (i.e. they require your surrounding code to now also be bounded by these traits), while those only attached to a type's implementation generally don't.

As such the generated code should preferably look like this instead:

pub struct AscentProgram<T>
// notice the lack of a where clause here!
{
    pub dummy: Vec<(T,), Global>,

    // ...
}

impl<T> AscentProgram<T>
where
    T: Clone + Eq + Hash,
{
    pub fn run(&mut self){
        // ...
    }

    // ...
}

// ...

This change should have no affect on ascent itself, but make integrating it much more pleasant and less intrusive.

@s-arash
Copy link
Owner

s-arash commented Dec 15, 2023

Hmm, do you have a concrete use case where this is an issue?

I don't think the guideline applies here, for a number of reasons:

  • The bounds are specified explicitly by the user! How should Ascent go about figuring out what bounds are intended to be on the type definition, and what bounds are intended only for impl blocks?
  • In your specific example, the bounds do very much add semantic value, as the Ascent program would be useless without the required bounds.
  • Ascent programs should not be in a library's public API! They should be implementation detail. This means the point about introducing breaking changes does not apply.

Again, if you have a concrete and realistic use case where this becomes an issue, please share it here, so I can take a closer look.

@regexident
Copy link
Contributor Author

regexident commented Dec 15, 2023

Hmm, do you have a concrete use case where this is an issue?

Hiding the program as an implementation detail within another generic type, without leaking its trait bounds.

tl;dr: The trait bounds on a type's declaration deal with its structure. The trait bounds on a type's impls deal with its behavior. And having those two separated was very conscious decision in Rust: Just because parts of a type's behavior (even if the part is actually its whole behavior) require a certain trait you shouldn't be blocked from adding it to another type.

  • The bounds are specified explicitly by the user! How should Ascent go about figuring out what bounds are intended to be on the type definition, and what bounds are intended only for impl blocks?

Yes, that is a fair point, but more a limitation of ascent! { … }'s custom struct support, than an argument for adding trait bounds to the type declaration.

Preferably there should be a way for the macro body to declare where to add the trait bounds to, for sure. One possible approach might be to allow this:

ascent!{
    struct AscentProgram<T> where T: Clone + Eq + Hash;

    // ...
}

but also this:

ascent!{
    struct AscentProgram<T>;
    impl<T> AscentProgram<T> where T: Clone + Eq + Hash;
    // ...
}

If you think this struct … + impl … syntax is reasonable, then I'd be happy do refactor #25 accordingly.

Update: support for aforementioned syntax has now been implemented on the PR: #25 (comment)

  • In your specific example, the bounds do very much add semantic value, as the Ascent program would be useless without the required bounds.

The same can be said about HashMap (and HashSet, BTreeMap, BTreeMap, …), yet this is how it's defined in stdlib:

pub struct HashMap<K, V, S = RandomState> { /* private fields */ }

And only those methods that actually require K: Eq + Hash, S: BuildHasher are implemented in

impl<K, V, S> HashMap<K, V, S>
where
    K: Eq + Hash,
    S: BuildHasher,
{}

Everything else just is

impl<K, V, S> HashMap<K, V, S> {}

Type K does not need to be K: Eq + Hash in order for a HashMap<K, V> to obtain its .len(), so the latter is implemented in an impl without trait bounds. This allows for calling hashmap.len() in generic contexts where K: Eq + Hash might either be unknown or outright undesirable due to otherwise resulting in leaky abstractions.

Pretty much the only times where it is required and preferable to put trait bounds on a type's declaration itself is when those traits deal with allocation. As such Vec<T> is defined like this:

pub struct Vec<T, A = Global>
where
    A: Allocator,
{ /* private fields */ }

And BTreeMap<K, V>:

pub struct BTreeMap<K, V, A = Global>
where
    A: Allocator + Clone,
{ /* private fields */ }

It's only the state/allocation-relevant traits that are bound on the type's declaration. Everything else gets bound either on a per-impl, or even per-fn level:

impl<K, V, A> Clone for BTreeMap<K, V, A>
where
    K: Clone,
    V: Clone,
    A: Allocator + Clone,
{}

impl<K, V, A> BTreeMap<K, V, A>
where
    A: Allocator + Clone,
    pub fn get<Q>(&self, key: &Q) -> Option<&V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord + ?Sized,
    {}

    // ...
}
  • Ascent programs should not be in a library's public API! They should be implementation detail. This means the point about introducing breaking changes does not apply.

Yes. And that's exactly why the bound should be omitted on the type's declaration: Your outer container types should not have to leak their field's bounds.

Again, if you have a concrete and realistic use case where this becomes an issue, please share it here, so I can take a closer look.

Imagine a scenario where a generic type has multiple fields, one of which bing an ascent program:

pub struct Foo<T>
where
    T: Clone + Eq + Hash
{
    // ...
    program: AscentProgram<T>,
}

impl Foo<T>
where
    T: Clone + Eq + Hash
{
    pub fn do_the_ascent_thing(&mut self) { ... }

    pub fn do_something_else_entirely(&mut self) { ... }
    pub fn do_another_unrelated_thing(&mut self) { ... }

    // ...
}

The self.program is only used by a very small and specific subset of functionality of Foo<T> (here: self.do_the_ascent_thing()). And none of the other functions of Foo<T> require T: Clone + Eq + Hash.

However due to AscentProgram<T> forcing its own trait bounds upon its containing type we're forced to also constrain Foo<T> through its type declaration. As a result the type Foo<T> becomes unusable for any type T that isn't T: Clone + Eq + Hash. Even if we never intend to call foo.do_the_ascent_thing() on it. That's rather bad.

If however AscentProgram<T> omitted the bounds on its type declaration (as is the idiomatic thing to do in Rust) we could do this:

pub struct Foo<T> {
    // ...
    program: AscentProgram<T>,
}

impl Foo<T>
where
    T: Clone + Eq + Hash
{
    pub fn do_the_ascent_thing(&mut self) { ... }
}

impl Foo<T> {
    pub fn do_something_else_entirely(&mut self) { ... }
    pub fn do_another_unrelated_thing(&mut self) { ... }

    // ...
}

We're now free to both create Foo<T> with an arbitrary T, as well as call foo.do_something_else_entirely() on it, while also retaining support for the "ascent thing" via foo.do_the_ascent_thing() for any type T that is T: Clone + Eq + Hash.

Here is an in-depth discussion on the topic: https://stackoverflow.com/a/66369912/227536

@s-arash
Copy link
Owner

s-arash commented Dec 16, 2023

Update: support for aforementioned syntax has now been implemented on the PR: #25 (comment)

I definitely prefer this approach to shifting the user's defined generic bounds to the impl block.

You mentioned a scenario where you have a field of type AscentProgram<T> in another struct Foo<T>. Is that a real use-case you have, or is it more of a hypothetical? My sense is that one should use AscentPrgoram<T> in implementing some functionality, and put the results as fields on Foo<T>.

@regexident
Copy link
Contributor Author

You mentioned a scenario where you have a field of type AscentProgram<T> in another struct Foo<T>. Is that a real use-case you have, or is it more of a hypothetical? My sense is that one should use AscentPrgoram<T> in implementing some functionality, and put the results as fields on Foo<T>.

I was trying to wrap the Program<N, E> (where N and E correspond to node and edge types) into a Query<Tree<N>> for encapsulating the setup + run + query steps into a single run step and was set back by being forced to propagate the behavioral generic bounds to the outer, which as outlined above is rarely the case in Rust.

As another hypothetical scenario you may not be able to defer the creation of the program to the immediately before running, but might have distinct setup and run phases (and prefer not to introduce an intermediary type for holding the program's "ingredients" instead of itself).

But either way behavioral trait bounds should never be added to the type declaration as outlined above. They provide zero benefits, but lots of problems.

The struct Polonius<T: FactTypes> (from the tests) is different in that the T: FactTypes is structural, not merely behavioral as the members of Polonius access associated types on T: pub loan_live_at: Vec<(T::Loan, T::Point)>, hence requiring the trait bound on the type declaration itself. (Whether or not Polonius<…> should be generic over <T: FactTypes> or directly over <O: Atom, L: Atom, Po: Atom, V: Atom, Pa: Atom> instead, since FactTypes does not impose any other bounds between them, is debatable.)

@s-arash
Copy link
Owner

s-arash commented Dec 17, 2023

But either way behavioral trait bounds should never be added to the type declaration as outlined above.

Even assuming I agree with this completely (I'm not sure I do. Trait bounds on type definitions clearly signal the intended use to the user), the question is whether it's worth the added complexity to have different trait bounds on type declarations and impl blocks on Ascent. When you are defining a generic type directly, it takes 0 added complexity to not put trait bounds on the type definition.

Anyway, since you've already done the work, I don't have issues with it being added to Ascent, and thanks for the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants