Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Animatable trait for interpolation and blending #4482

Merged
merged 22 commits into from Feb 2, 2024

Conversation

james7132
Copy link
Member

@james7132 james7132 commented Apr 15, 2022

Objective

Allow animation of types other than translation, scale, and rotation on Transforms.

Solution

Add a base trait for all values that can be animated by the animation system. This provides the basic operations for sampling and blending animation values for more than just translation, rotation, and scale.

This implements part of bevyengine/rfcs#51, but is missing the implementations for Range<T> and Color. This also does not fully integrate with the existing AnimationPlayer yet, just setting up the trait.

@github-actions github-actions bot added the S-Needs-Triage This issue needs to be labelled label Apr 15, 2022
@james7132 james7132 added C-Enhancement A new feature A-Animation Make things move and change over time and removed S-Needs-Triage This issue needs to be labelled labels Apr 15, 2022
Co-authored-by: Kirillov Kirill <kirusfg@gmail.com>
@alice-i-cecile alice-i-cecile added the X-Controversial There is active debate or serious implications around merging this PR label Apr 22, 2022
@Weibye Weibye added the S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! label Aug 10, 2022
@james7132 james7132 removed the S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! label Nov 7, 2022
@james7132 james7132 requested review from irate-devil and kirusfg and removed request for irate-devil and kirusfg November 7, 2022 10:03
Co-authored-by: François <mockersf@gmail.com>
@alice-i-cecile
Copy link
Member

I've submitted several PRs to James's branch to address some of the reviewer comments: https://github.com/james7132/bevy/pulls

Assuming those are all accepted we need to:

  1. Decide whether animations or BlendInput should control whether or not blending is additive.
  2. Consider improving the layout of BlendInput<T>.

I think that the following work should be handled in follow-up PRs (this PR is very old by now):

  1. Benchmarking and optimizing interpolation performance (including the question about Vec3A).
  2. Adding handle/asset-based interpolation (the assets v2 rework broke existing code).
  3. Providing an end-user API (and examples) for this feature.

if input.additive {
translation += input.weight * Vec3A::from(input.value.translation);
scale += input.weight * Vec3A::from(input.value.scale);
rotation = rotation.slerp(input.value.rotation, input.weight);
Copy link
Contributor

@atlv24 atlv24 Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
rotation = rotation.slerp(input.value.rotation, input.weight);
// Important: additively animating quaternions means we first have to partially apply the input rotation.
// Quaternion multiplication order is important here. Additionally, do not use nlerp here. These rotations
// can be very different, so we must use slerp explicitly.
// Reference:
// "Additive blend" section in <https://animcoding.com/post/animation-tech-intro-part-3-blending>
// "Here’s how to apply an additive pose on top of the regular pose. It differs from the regular blend.
// If the weight is one, then we can multiply transforms. If the weight is non-one, then we need to first blend
// between identity and the additive transform, then apply this blended transform on the transform in the pose."
rotation = rotation * Quat::IDENTITY.slerp(input.value.rotation, input.weight);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the critical fix right? We should add a comment describing why this is done, ideally with a reference.

Copy link
Contributor

@atlv24 atlv24 Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is the critical fix. Everything else in my review is 'fluff'. added a comment

input.weight,
);
scale = Vec3A::interpolate(&scale, &Vec3A::from(input.value.scale), input.weight);
rotation = Quat::interpolate(&rotation, &input.value.rotation, input.weight);
Copy link
Contributor

@atlv24 atlv24 Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can be more explicit about the kind of interpolation

rotation = rotation.slerp(&input.value.rotation, input.weight);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nlerp is wanted here for commutativity on the blend stack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does that mean? What's supposed to be commutative? What's the blend stack? Blending operations are not commutative in general, you need to apply them in order. And even if they are, what do we gain by reordering?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the blend stack is the sequence of animations being blended. commutativity makes reasoning about the operations simpler in the general case, as you dont need to have logic to ensure they are always ordered the same way for the result to be consistent. consider two animations A and B, which can each start and stop at any moment. if you start A, then B, you will have A applied first then B applied on top. if you start B then A, they will be on the stack the other way around and will look different. This gets more complicated if you can cancel and restart, or pause animations. You start needing an animation priority system to determine which get applied first, and that can be confusing. Besides all that, its slower for not much gain quality-wise, as far as i understand. Slerp is not needed here, refer to the document linked in the comments above interpolate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you start A, then B, you will have A applied first then B applied on top.

Only if B is additive, because otherwise you should stop A first, you can't apply two non-additive animations to the same target. Also pause etc. is about animation transitions and creating the stack, not about blending (how to consume the stack). And if B is additive then I argue that the operations are not commutative. It's not a question of quality, it's a question of producing the correct result.

fn interpolate(a: &Self, b: &Self, t: f32) -> Self {
Self {
translation: Vec3::interpolate(&a.translation, &b.translation, t),
rotation: Quat::interpolate(&a.rotation, &b.rotation, t),
Copy link
Contributor

@atlv24 atlv24 Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can be more explicit about the kind of interpolation
this definitely needs to explicitly be slerp

Suggested change
rotation: Quat::interpolate(&a.rotation, &b.rotation, t),
rotation: a.rotation.slerp(&b.rotation, t),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because interpolate should be nlerp by default and transforms can be very different rotations and thus need slerp.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you mean that for rotations the interpolation should be fixed to always be slerp because others will not provide decent results in general, right? I agree that we should probably limit the number of different interpolations we support, explicitly. See my other long comment about the design.

}

impl Animatable for Quat {
/// Performs an nlerp, because it's cheaper and easier to combine with other animations,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These docs are dissonant. The code is not doing an nlerp.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be doing an nlerp

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your other comment says it should be doing a slerp() no? Because nlerp() for very different rotations might be way off.

Copy link
Contributor

@atlv24 atlv24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone over the usages of interpolate and slerp again after reading more. In general we want to use nlerp in most places, except for additive blending and transform interpolation. The rest we prefer nlerp.

@pcwalton
Copy link
Contributor

I don't have much to add to the review other than what @rodolphito said, looks good otherwise.

pub trait Animatable: Reflect + Sized + Send + Sync + 'static {
/// Interpolates between `a` and `b` with a interpolation factor of `time`.
///
/// The `time` parameter here may not be clamped to the range `[0.0, 1.0]`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this comment mean? How is an implementor supposed to understand interpolating with t=1.5 (which sounds like an extrapolation rather)? Are we sure this is even possible in the general case? Because this seems to strongly imply lerp here / a linear transform.

@atlv24
Copy link
Contributor

atlv24 commented Jan 31, 2024

There's a bigger problem with switching to nlerp i've discovered: Glam's nlerp is not commutative, nor can it easily be made commutative. This is because it is normalizing after interpolating, which makes sense for blending two quaternions, but it also means that its no longer a truly linear operation. To commutatively blend an arbitrary number of quaternions, one would need to defer the normalization til the very end. This is problematic however because the intermediary values are not true rotation quaternions because they are not unit length. We'd want to encode this in the type system somehow. I think it can be done with a PR to glam, but it might be contentious as it adds complexity and would need some extra implementations. There would need to be a DenormalizedQuaternion type that implements Into<Quaternion>, and have specialized lerp methods that take combinations of DenormalizedQuaternions and Quaternions to linearly interpolate them. Then at the end it would be silently normalized and converted into a Quaternion as soon as it gets used.

For this reason I am opting to leave slerp in for the moment, because this would need further discussion and upstream work in glam.

@atlv24
Copy link
Contributor

atlv24 commented Jan 31, 2024

I've realized there actually is a simpler way forward that can be implemented now without changes to glam.

@atlv24
Copy link
Contributor

atlv24 commented Jan 31, 2024

BlendInput says additive determines "Whether or not to additively blend this input into the final result" which is actually not what its doing. Additive BlendInputs are being applied in the order they are given in the blend stack, meaning if a non-additive BlendInput is present after an additive one, the additive one is not blended into the final result as claimed. This will look wrong and be unintuitive, compared to most animation solutions ive worked with

Copy link
Contributor

@djeedai djeedai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the long comment over the abstraction shape. We can discuss on Discord, but this was easier I think to make a comment here with code example.

use bevy_utils::FloatOrd;

/// An individual input for [`Animatable::blend`].
pub struct BlendInput<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've thought a bit more about this abstraction. I think it's useful to look at what our goals are, especially in terms of how things will be used, and how this is trying to achieve those goals.

As I recall, the base idea of BlendInput<T> is to "preprocess" all the various blend sources into a single common data structure per type (one for f32, one for Vec4, etc.) and then run a tight loop on blending all sources for that data type, which is then easier because you iterate only on BlendInput<T> and not a collection of heterogenous sources.

Now, this is clearly a step in the right direction, in the sense that we're thinking about the data usage pattern to optimize it and leverage e.g. SIMD. Where I think this abstraction falls short is in two ways:

  1. It iterates over the blend sources in the inner loop, as seen in the signature of Animatable::blend() which takes a set of sources to blend together. How many different sources do we expect will blend that way? Even for an advanced scenario with say 4 base animations blending together (walk, run, jump, and crouch), and let's say 4 more additive ones on top (look left/right, up/down, and whatever else), that makes 8 sources to blend into a final vertex value. Compare that with how many vertices we need to blend for a single 3D character in a modern game, which is in the order of thousands. And those will be iterated over in an outer loop. We've got the loops reversed from a performance point of view.

  2. The data structure is not optimal for packing and memory accesses. First the packing: depending on T, and let's take any VecN as example, the compiler will most likely put the VecN first, aligned on 4 bytes (or 16 bytes for SIMD), followed by weight (4B size/align), followed by additive (1B). This means we have 3 bytes of padding for every 12B (f32) to 24B (Vec4) of total size, which is a lot. The access pattern, which is also to load the value and weight into separate (SIMD) registers to perform some lerp() (which is the most common case; let's ignore rotations for a minute) is also not helped here because we can't load more than 1 source at once (SIMD register load only accepts a continuous block of memory). If, on the other hand, we were transforming this "array of struct" into a "struct of array", we could load more data sources at once, and blend multiple of them via SIMD. Better even, we can even pack all lerp() sources together (that is, including packing f32 with Vec3 and Vec4) since the SIMD instructions to lerp are the same and operate component-wise. This means a very tight loop on a long array of data without any branching etc. which is the absolute optimal for SIMD.

So the way I'd see a better abtraction for the blending step is:

  • explicitly support lerp / nlerp / slerp with separate optimized codepaths, because they require different SIMD instructions;
  • possibly add a fallback where we invoke something like the current Animatable::interpolate(), but only for "exotic" interpolations, and I'm not even sure we want that (I don't have an example);
  • pack all inputs and do lerp(lhs[], rhs[], weights[]) instead of lerp(lhs,rhs,weight)[].
pub enum Interpolation<T> {
    Linear,
    NormalizedLinear,
    SphericalLinear,
    // See how this immediately become messy due to T, maybe we can skip this entirely
    Custom(Box<dyn Fn(&T, &T, f32) -> T),
}

impl Interpolation<f32> {
    // Blend two arrays of values with given weights
    fn blend(&self, src: &mut [f32], rhs: &[f32], weights: &[f32], additive: bool) {
        match *self {
            Interpolation::<f32>::Linear => {
                // Branch _before_ we start the hot loop
                if additive {
                    // This loop can get partially unrolled, use SIMD, etc.
                    for i in 0..src.len() {
                        src[i] += rhs[i] * weights[i];
                    }
                } else {
                    // Same here
                    for i in 0..src.len() {
                        src[i] = (1.0 - weights[i]) * src[i] + rhs[i] * weights[i];
                    }
                }
            }
            // etc.
        }
    }
}

// Now, we need to know how to pack e.g. Vec3 and f32 and Vec4 together:
pub trait Animatable: [...] {
    type Unit;
    fn prepare(&self) -> &[Unit];
}

impl Animatable for Vec3 {
    type Unit = f32;
    fn prepare(&self) -> &[f32] { self.as_ref() }
}

// etc. for other types...

fn animation_system(anims: &[...]) {
    // Read and prepare values from component (here, Transform)
    let mut src = vec![];
    for tr in &transforms {
        // Assuming here the case where both translation and scale use Interpolation::Linear
        // to keep the example short:
        src.extend_from_slice(tr.translation.prepare());
        src.extend_from_slice(tr.scale.prepare());
    }

    // Blend all anims into src directly
    let mut rhs = vec![];
    for anim in &anims {
        // Grab one animation. Ideally we can even pre-process this in the anim format,
        // and completely remove this block.
        rhs.clear();
        for tr in &anim.transforms {
            rhs.extend_from_slice(tr.translation.prepare());
            rhs.extend_from_slice(tr.scale.prepare());
        }

        // Do the actual blend for every transform at once
        Interpolation::<f32>::blend(&mut src, &rhs, &anim.weights, false);
    }

    // Store back final result
    let mut offset = 0;
    for tr in &mut transforms {
        tr.translation.copy_from_slice(&src[offset..offset+3]);
        offset += 3;
        tr.scale.copy_from_slice(&src[offset..offset+3]);
        offset += 3;
    }
}

fn interpolate(a: &Self, b: &Self, t: f32) -> Self {
Self {
translation: Vec3::interpolate(&a.translation, &b.translation, t),
rotation: Quat::interpolate(&a.rotation, &b.rotation, t),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you mean that for rotations the interpolation should be fixed to always be slerp because others will not provide decent results in general, right? I agree that we should probably limit the number of different interpolations we support, explicitly. See my other long comment about the design.

}

impl Animatable for Quat {
/// Performs an nlerp, because it's cheaper and easier to combine with other animations,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your other comment says it should be doing a slerp() no? Because nlerp() for very different rotations might be way off.

@pcwalton
Copy link
Contributor

pcwalton commented Feb 2, 2024

@rodolphito I've been told that what this PR does actually matches what Unity does. In Unity override layers cancel out the results of previous additive layers.

Copy link
Contributor

@djeedai djeedai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed a lot on Discord. There's relative consensus with @pcwalton and @james7132 that this is good enough for now, so let's move forward.

Some resources:

@alice-i-cecile alice-i-cecile modified the milestones: 0.14, 0.13 Feb 2, 2024
Copy link
Member

@alice-i-cecile alice-i-cecile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm content with this for now. Like djee said, there's been a lot of discussion and we have broad consensus. Ongoing improvements will be much easier to do in follow-ups :)

@alice-i-cecile alice-i-cecile added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Feb 2, 2024
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Feb 2, 2024
Merged via the queue into bevyengine:main with commit 602515d Feb 2, 2024
23 checks passed
@james7132
Copy link
Member Author

For posterity, the discussion that led to this being merged can be found here: https://discord.com/channels/691052431525675048/774027865020039209/1202358388466384956

tjamaan pushed a commit to tjamaan/bevy that referenced this pull request Feb 6, 2024
# Objective
Allow animation of types other than translation, scale, and rotation on
`Transforms`.

## Solution
Add a base trait for all values that can be animated by the animation
system. This provides the basic operations for sampling and blending
animation values for more than just translation, rotation, and scale.

This implements part of bevyengine/rfcs#51, but is missing the
implementations for `Range<T>` and `Color`. This also does not fully
integrate with the existing `AnimationPlayer` yet, just setting up the
trait.

---------

Co-authored-by: Kirillov Kirill <kirusfg@gmail.com>
Co-authored-by: François <mockersf@gmail.com>
Co-authored-by: irate <JustTheCoolDude@gmail.com>
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
Co-authored-by: Alice Cecile <alice.i.cecil@gmail.com>
@james7132 james7132 deleted the animatable branch March 10, 2024 08:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Animation Make things move and change over time C-Enhancement A new feature S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet