Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy Objects and Lookups #160

Closed
mitsuhiko opened this issue Nov 28, 2022 · 16 comments
Closed

Lazy Objects and Lookups #160

mitsuhiko opened this issue Nov 28, 2022 · 16 comments

Comments

@mitsuhiko
Copy link
Owner

Isolated from #157:

In the latter case, I have imagined an API by which the engine tracks object path accesses at compile-time and then asks once (say via an Object trait) for a terminal value that isn't compound, so anything but a map or sequence. For example, for foo.bar[0].baz, the engine would ask the foo object (assuming it is one) for bar[0].baz. The object can then perform an efficient lookup without any clones and return the terminal baz. Similarly, if we have:

{% set bar = foo.bar[0] %}
{{ bar.baz }}

The engine would again ask foo for bar[0].baz. This of course requires some thought and implementation grease, but, aside from allowing full borrows of anything, seems like the most sensible approach.

@mitsuhiko
Copy link
Owner Author

I'm starting to favor the utilities in minijinja-stack-ref to accomplish this:

use minijinja::value::{StructObject, Value};
use minijinja::{context, Environment};
use minijinja_stack_ref::scope;

struct Config {
    version: &'static str,
}

struct State {
    config: Config,
}

impl StructObject for Config {
    fn get_field(&self, field: &str) -> Option<Value> {
        match field {
            "version" => Some(Value::from(self.version)),
            _ => None,
        }
    }
}

impl StructObject for State {
    fn get_field(&self, field: &str) -> Option<Value> {
        match field {
            // return a reference to the inner config through reborrowing
            "config" => Some(reborrow(self, |slf, scope| {
                scope.struct_object_ref(&slf.config)
            })),
            _ => None,
        }
    }
}

let mut env = Environment::new();
env.add_template(
    "info",
    "app version: {{ state.config.version }}"
)
.unwrap();

let state = State {
    config: Config {
        version: env!("CARGO_PKG_VERSION"),
    }
};

let rv = scope(|scope| {
    let tmpl = env.get_template("info").unwrap();
    tmpl.render(context! {
        state => scope.struct_object_ref(&state),
    }).unwrap()
});
println!("{}", rv);

@SergioBenitez
Copy link
Contributor

I'll give minijinja-stack-ref a shot and see how far it gets me!

@SergioBenitez
Copy link
Contributor

SergioBenitez commented Dec 17, 2022

Finally had a chance to try it out. Unfortunately it doesn't quite suffice.

Here's something analogous to what I'd like to make work:

struct CompoundValue {
    seq: Vec<CompoundValue>,
    map: std::collections::HashMap<String, CompoundValue>,
}

impl StructObject for CompoundValue {
    fn get_field(&self, name: &str) -> Option<Value> {
        match name {
            "seq" => Some(reborrow(self, |this, scope| {
                scope.seq_object_ref(CompoundValueSeq::ref_cast(&this.seq))
            })),
            "map" => Some(reborrow(self, |this, scope| {
                scope.struct_object_ref(CompoundValueMap::ref_cast(&this.map))
            })),
            _ => None
        }
    }
}

#[derive(RefCast)]
#[repr(transparent)]
struct CompoundValueSeq(Vec<CompoundValue>);

impl SeqObject for CompoundValueSeq {
    fn get_item(&self, idx: usize) -> Option<Value> {
        // Since the second argument must be `fn`, this fails.
        reborrow(self, move |this, scope| {
            this.0.get(idx).map(|v| scope.struct_object_ref(v))
        })

        // Maybe an API by which we can pass some ctxt?
        // reborrow(self, idx, |this, scope, idx| {
        //     this.0.get(idx).map(|v| scope.struct_object_ref(v))
        // })
    }

    fn item_count(&self) -> usize {
        self.0.len()
    }
}

#[derive(RefCast)]
#[repr(transparent)]
struct CompoundValueMap(std::collections::HashMap<String, CompoundValue>);

impl StructObject for CompoundValueMap {
    fn get_field(&self, name: &str) -> Option<Value> {
        // The same thing here.
        reborrow(self, |this, scope| {
            this.0.get(name).map(|v| scope.struct_object_ref(v))
        })

        // Same thing with the context, but note that Ctxt: Copy would not work.
        // reborrow(self, name, |this, scope, name| {
        //     this.0.get(name).map(|v| scope.struct_object_ref(v))
        // })
    }
}

In short, with reborrow we can only pass a fn, and this makes it impossible to access the idx or name from the get_item() or get_field() function scope, respectively, in the reborrowing scope. As a result, for my use-case, and I imagine for most "interesting" use-cases, this API doesn't allow efficient Object creation.

One potential fix is to allow a context to be passed in along with the reborrow call, as shown in comments in the snippet above. Note, however, that for get_field(), such a ctxt would need to be an &str, which may or may not be a problem depending on the internal details, which I haven't taken a close look at.

In all, I'm finding this API to be somewhat laborious to use and at the same time wondering if different design decisions for the internals of minijinja might make these kinds of "borrows" just work, without the need for something like minijinja-stack-ref. In any case, I'm happy to jump through hoops as long as doing so effects efficiency.

@mitsuhiko
Copy link
Owner Author

mitsuhiko commented Dec 18, 2022

Unfortunately I'm not sure if there is a way to make this work with FnOnce without risking that someone can extend the lifetime to 'static but I will see if I can fix that.

In all, I'm finding this API to be somewhat laborious to use and at the same time wondering if different design decisions for the internals of minijinja might make these kinds of "borrows" just work, without the need for something like minijinja-stack-ref. In any case, I'm happy to jump through hoops as long as doing so effects efficiency.

I'm not sure. I agree that the API is pretty stupid and also the entire reborrowing system only works if the value was placed in the context in the first place via the scope guard. About different APIs I'm also entirely unsure how this could be built. I don't actually think that the MiniJinja internals are the limitation here. Even if hypothetically Value were to have a lifetime, I can't imagine how that would be in any way helpful.

I think it needs some form of runtime checks for borrows. I will try to brainstorm on what can be done.

@SergioBenitez
Copy link
Contributor

SergioBenitez commented Dec 19, 2022

Even if hypothetically Value were to have a lifetime, I can't imagine how that would be in any way helpful.

One obvious thing here would be to store &dyn Object (or &dyn StructObject / &dyn SeqObject). That would eliminate the need for minijinja-stack-ref entirely, as far as I can tell. Coupled with built-in implementations for Vec<T: Object> and HashMap<K: Borrow<str>, V: Object>, and a Value::from_ref() : fn<T: Object>(&T) -> Value<'_>, the example above would not only work but also compress to the pithy:

struct CompoundValue {
    seq: Vec<CompoundValue>,
    map: HashMap<String, CompoundValue>,
}

impl StructObject for CompoundValue {
    fn get_field<'a>(&'a self, name: &str) -> Option<Value<'a>> {
        match name {
            "seq" => Value::from_ref(&this.seq),
            "map" => Value::from_ref(&this.map),
            _ => None
        }
    }
}

@mitsuhiko
Copy link
Owner Author

The issue is the Value<'a>. What do I do with the 'a lifetime? Every value would have to have its own lifetime and since I cannot guarantee stacked borrows it would be unclear to me how I can even utilize this in the engine.

@SergioBenitez
Copy link
Contributor

The issue is the Value<'a>. What do I do with the 'a lifetime? Every value would have to have its own lifetime and since I cannot guarantee stacked borrows it would be unclear to me how I can even utilize this in the engine.

I'm not sure I follow. I'm suggesting that there doesn't need to be any unsafe here.

As far as I can tell, Value in minijinja is doing a bit of double-duty as both a long-lived value (instances in which it's stored internally longer than the time needed to render a template) and a transient value (when it is used as the context of a template).

For the latter case, logically speaking, as long as the value is valid immediately before and during template rendering, everything is okay. Whether the value becomes invalid after shouldn't matter as rendering has completed. This is exactly what borrows in Rust capture, and the straightforward use of references in Value I suggested above would enable a borrowing API with dynamic Objects without unsafe.

I suggest that there's an issue with the internal design because conflating the two uses of Value above would make doing what I suggest pretty difficult, or at least cumbersome. Minimally, you'd need to store an Either<Box<dyn Object>, &dyn Object> and use Value<'static> where you need to eschew the lifetime.

@mitsuhiko
Copy link
Owner Author

Other than Value references in the LoadConst instruction which is an internal implementation detail, Value really behaves the same. It‘s used to provide a runtime object that represents a value. The issue is a value that is created during the execution of the template and thus has a shorter lifetime as the value it‘s created from. For instance if you do {{ foo([1, 2, bar]( }} you have just created a Value holding the list [1, 2, bar] with a lifetime lower than the template invocation. As such if that value is a SeqObject, what do you do with it now?

If you have an API such as get_field on it, it would not be able to borrow out of &self unless I can hold the parent object alive for the lifetime of the return value at the very least. That‘s not something that can be done with safe APIs in Rust. The closest we have to such an api is Ref::map which however requires that everything is held in a RefCell and due to the lifetime in the API would not permit that object to be held itself in the context or anywhere else (like on the engine stack).

@mitsuhiko
Copy link
Owner Author

There is a hypothetically different world where each object leaks out to the end of the execution out of principle, and then borrows would be possible without thinking about lifetimes, but that would most likely be an entirely different experience and I‘m not sure what the consequences of that are (other than potentially very high memory usage).

@SergioBenitez
Copy link
Contributor

Other than Value references in the LoadConst instruction which is an internal implementation detail, Value really behaves the same. It‘s used to provide a runtime object that represents a value. The issue is a value that is created during the execution of the template and thus has a shorter lifetime as the value it‘s created from. For instance if you do {{ foo([1, 2, bar]( }} you have just created a Value holding the list [1, 2, bar] with a lifetime lower than the template invocation. As such if that value is a SeqObject, what do you do with it now?

Then it sounds like the Either<Box<dyn T>, &dyn T> route is what's needed. Perhaps the unifying approach is to use Cow and/or Either for all owned variants in a Value, or isomorphically, create a ValueRef and ValueBuf and Value is either Owned(ValueBuf) or Borrowed(ValueRef). In any case, I would really like to avoid deep clones wherever possible - the performance gain is large - and it seems something like this is both the right and necessary approach.

@mitsuhiko
Copy link
Owner Author

I'm no longer planning on resolving this. The change is too invasive for the engine. While most of my lifetime concerns are resolved now, the Value object itself is just inherently 'static which makes any borrowing situation not work without some major hacks.

@SergioBenitez
Copy link
Contributor

Would you accept a PR that implements the needed changes, @mitsuhiko?

@mitsuhiko
Copy link
Owner Author

@SergioBenitez I would not object if you find a way!

@SergioBenitez
Copy link
Contributor

I imagine such a way exists! Though I do expect it to be rather invasive. Before I embark, I want to make sure that you're not opposed to the change, or the potential invasiveness of such a change, ideologically. Do you think that's the case?

@mitsuhiko
Copy link
Owner Author

@SergioBenitez I'm not opposed to potentially invasive changes for as long as it does not fundamentally restrict the language or user experience.

@SergioBenitez
Copy link
Contributor

Awesome! I'll have something for you to look at soon-ish.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants