Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix path resolution in nested scopes. #174

Closed
wants to merge 13 commits into from

Conversation

juliangoacher
Copy link

This PR provides a fix for dotted path resolution within nested context scopes.

I've seen the discussion at #100 but still think there are good reasons for having this functionality.

Motivations:

  • Variable names should be resolved in a consistent manner, there shouldn't be two different methods producing different results depending on whether the variable reference is judged to be a path or not.

  • Variable name resolution could and should follow the same rules as used for nested scopes in Javascript.

    Scopes can be nested in dust.js using Context.push(..). Nested scopes should be equivalent to the following JS:

    var person = { name: 'John' };
    var settee = { color: 'red' };
    (function() {
        var person = { name: 'Jack' };
        (function() {
            var person = { name: 'Harry' };
            console.log( person.name, settee.color );
        })();
        console.log( person.name, settee.color );
    })();
    

    There is no confusion at any point here about which data item any variable name references, despite some names appearing in three different scopes. Using dotted notation doesn't in any way complicate the situation.

  • We make heavy use of Context.push to create local template scopes without polluting the global scope, and often these scopes are nested 3 or 4 deep. It doesn't make sense that templates in a nested scope loose visibility of any variable that requires a dotted path reference, while keeping visibility of those variables that don't.

  • The work arounds - {#breaking}{#the}{#path}{#into}{#sections} or using a helper - are ugly and unwieldy.

This PR also addresses problems with incorrect 'this' references when invoking function values.

@rragan
Copy link
Contributor

rragan commented Oct 12, 2012

Agreed. I think they should just work. {user.name} not working but {#user}{name}{/user} working shows this does not provide any sandboxing of references to data; just an annoyance.

Earlier, made such changes myself to getPath as an experiment. Curious about your exact semantics. Do you look downard first like current getPath, then if fail look up the tree at each level for a full match to the path returning the first encountered?

The only performance downside is getPath that fails to find the item downward will take longer to look up the tree but I think this is an unusual case and typically reflects the user not realizing they have reset the context.

Note: I have a separate outstanding PR for a helper:
{@access key="path" /} that will do the same thing but as you point out helpers are ugly and cannot be used in passing a value via a param.

@innerfunction
Copy link

No, I try to reproduce the semantics of nested JS scopes; so with a path like 'user.address.city' I first find the closest enclosing scope with 'user' bound as a name then resolve the rest of the path on that var. If the path does not resolve then that's it - the value is undefined, no faffing about trying to find another match further down the stack.

I don't think there's any major performance downside to this - the maximum possible number of iterations is (path.length + stack.length) - and I can't imagine path or stack going above 10 items in any common use case.

@ovaillancourt
Copy link

+1 on this - anything we can do to help with this pull?

@kethinov
Copy link

+1 as well. We need this patch in a stable release of dust.

@jairodemorais
Copy link
Contributor

Hi @juliangoacher, first of all, thanks for you PR. I have only one thing to say:

I have apply this patch and run the tests and some of them are failing, you should run the tests before sending a PR. If you fix this little issue I would be happy to merge this PR.

the tests that failed are:

should test force a key
should return a specific array element using the current context
should test the array reference access with idx and current context
should test the array reference access with len and current context

You can run the tests executing:

Server side tests:

make jasmine
make test

Client side tests:

just open the html file located in /test/jasmine-test/client/specRunner.html

@juliangoacher
Copy link
Author

I've made a couple of changes to the code:

  • Small fix to Context.getPath() so that arrays are handled correctly.
  • Modified one of the unit tests, changing the expected value to match the new behaviour. This change brings that particular test's behaviour in line with other similar tests.

@rragan
Copy link
Contributor

rragan commented Jan 7, 2013

+1 for this. I can't count the number of times I've had to explain to users why they can access a variable but not a pathed reference. There should be no performance impact to existing code. Please, let's do this.

@vybs
Copy link
Contributor

vybs commented Jan 7, 2013

does this add backward incompatibility?

@juliangoacher
Copy link
Author

Not sure what the question is about backwards compatibility; but if referring to the unit test change - that unit test specifically tested for non-resolution of a value when using a dotted path reference, so this change had to affect that test.

@rragan
Copy link
Contributor

rragan commented Jan 7, 2013

The only backward incompatibility I can think of is an existing pathed reference that returns empty because the item is not in the current context might return a non-empty result if that item could be found in a higher context. This seems like an edge case and the only way to avoid it is a helper which then causes all sorts of other issues because a helper cannot be used in other places that a pathed reference can. If this breakage is really considered serious, then bumping the version number might be appropriate and calling it out.

@vybs
Copy link
Contributor

vybs commented Jan 7, 2013

Understand the need for this and than you for the PR as always, but it literally changes the core functionality.

We heavily use dust and it is really hard to tell if the edge case is edge case or not. Some more thought into this might help.
I am wondering if there is way to allow overriding certain methods and does make it a option to use version of getContext

@juliangoacher
Copy link
Author

I could add a flag to the global dust object to preserve backwards compatibility by toggling between this behaviour and the old behaviour. Question then is which should be the default - I'm sure quite a few would agree with me that the new behaviour should be the default from now on.

@rragan
Copy link
Contributor

rragan commented Jan 7, 2013

There seem to be only a few choices on this matter:

  • Leave it as is and suffer with {#a}{b.c}{/a} when referencing a.b.c in a higher context. People already suffer with this. It also complicates passing a.b.c. This can be hacked but if you need to pass a.b.c and r.s.t I think you are stuck.
  • Add a helper (see my @access helper) so you write {@access key="a.b.c" /}. Works to get value but fails if you need to do much of anything with it like pass it as a param
  • Add a new syntax that invokes the new behavior and still retains the use as a simple reference. {@a.b.c} or {%a.b.c} or {a..b.c}, etc. Invent your own choice here but compiler change needed. User has to pay attention that the reference is up-context but they already have to do this via {#a}{b.c}{/a}
  • Just make {a.b.c} work which is what users expect. You aren't protecting anything by preventing upward refs since {a}{xxx}{/a} can always manage to reach the data, albeit awkwardly.

I still like making {a.b.c} just work. I guess a global context flag to control behavior would be OK but I want the default to be the new way. I know that is what we will be setting if it is not the default.

@vybs
Copy link
Contributor

vybs commented Jan 7, 2013

IMO, any new additions should be controlled by a flag and teams can choose to turn this on carefully.

@rragan
Copy link
Contributor

rragan commented Jan 7, 2013

Then I plan to boldly turn it on and boldly go where the users expect. :)

@vybs
Copy link
Contributor

vybs commented Jan 7, 2013

I would suggest having a good review and sign off from a few folks and set up a ETA and make sure we have addressed adding enough tests before the final merge.

@juliangoacher
Copy link
Author

I've gone ahead and added a global flag for switching between the new and old behaviour. I don't know whether there are any guidelines or preferences on naming of flags or globals, but I'll leave that to the reviewers.

@rragan
Copy link
Contributor

rragan commented Jan 17, 2013

What is the status of this? Lots of folks have an interest.

@vybs
Copy link
Contributor

vybs commented Jan 18, 2013

@smfoote @jimmyhchan @jairodemorais @iamleppert ( would you all have some time to review this week( we should have some reviews back next week since this is pending

@rragan and others sorry for the delay.

@smfoote
Copy link
Contributor

smfoote commented Jan 21, 2013

This is not backwards compatible and shouldn't be merged as is. We are using the dot-notation to force context:

{
    bar: "baz",
    foo: {
        bar: "biz"
    }
}

{#foo}
  {.bar}
{/foo}

In this case, we want "biz" or nothing. This PR would give us "biz" if foo.bar exists, and "baz" if it doesn't.

I like the idea of making it easier to access different branches of the tree. However, I think the syntax provided by this implementation is very confusing.

Additionally, the examples above are informative but slightly inaccurate. Comparing Dust contexts to JavaScript execution contexts, though interesting, can be a bit misleading. JavaScript execution contexts don't have branches, so you can never reach up the tree to navigate down a different branch, and that is the crux of the problem being solved for here.

That it should be possible to easily navigate up the tree and down a different branch without using way too many {#section} tags, I agree. However, this and the current getPath are distinct use cases and should not be bundled together.

@juliangoacher
Copy link
Author

@smfoote: I don't understand your second last para. The PR attempts to find the closest enclosing scope with the first name in the path bound to a value, and in that sense is analogous to JS scopes. But I take your point that in the example given, .bar should force scope and the PR could probably be adapted to do that if that was the consensus feeling.

As for getPath being a distinct use case - well, this PR is trying to make the case that it shouldn't be, and I think many other dust users would agree with that.

@smfoote
Copy link
Contributor

smfoote commented Jan 21, 2013

@juliangoacher: The difference between Dust and JS, as I see it, is that JS execution contexts are linear

(function() {
  (function() {
    var sibling = { name: 'Larry' };
  })();
  (function() {
    var person = { name: 'Harry' };
    console.log( sibling.name, settee.color );
  })();
  console.log( person.name.first, settee.color );
})();

Here, sibling.name cannot be accessed in the given execution context, because sibling is defined in a different branch of the execution. When JavaScript resolves it's variables, it searches linearly, and never visits these other branches. However, what you are proposing for Dust would make such branch traversal possible:

The JSON:

{
  sibling: {
    name: 'Larry'
  },
  person: {
    name: 'Harry'
  }
}

and the Dust:

{#person}
  My name is {name}, and I have a sibling whose name is {sibling.name}.
{/person}

@juliangoacher
Copy link
Author

@smfoote: I'm talking about nested evaluation scopes, not separate execution contexts. The example you give isn't what the PR does. Try nesting the third function in the second. The lexical closure you create then matches the behaviour of the PR.

@carchrae
Copy link
Contributor

carchrae commented Jul 5, 2013

hi @juliangoacher - the issue of paths and functions is tricky.

#271 has ended up with the approach of not allowing functions on the path. the one exception is if the last element on the path is a function, which is returned and handled correctly.

in my own fork, i use simple functions in the path (ie, ones that just return a result) - i find them incredibly useful. however, if you have the patience/endurance to read all the comments on #271 you can see the discussion around it. i think @jimmyhchan had the best argument for removing the partial support of functions: (direct link to comment -> #271 (comment) )

so, in the ideal world, i think the path should support async functions in the path. i'm still thinking about how i'd implement that.

@juliangoacher
Copy link
Author

hi @carchrae - thanks for the pointers.

In my usage, I'm not too worried about resolving function values encountered along a path, but I did encounter problems when the path resolved to a method function on an object and Chunk.section|reference didn't invoke it correctly due to an incorrect 'this'. My PR addressed this as a side issue, the path resolution code being a natural place to insert a fix. One of the reasons I didn't actually invoke the method at that point (or anywhere along the path) is simply because the methods arguments aren't known and aren't available, so it was better to return a reference to the method function in a form that could be invoked correctly. Do you think there could still be an argument for including this functionality in your PR (as I guess it's the main candidate now for fixing the path issue)?

I see in the #271 thread that there is discussion about whether functions should even be included in the data context. One of dust.js's most important features - and the main reason I switched to is - is its support of async functions in the data context, so I found those comments a bit strange.

@carchrae
Copy link
Contributor

carchrae commented Jul 5, 2013

Do you think you could make a test case that shows the issue? That would
be the best thing. Or if there is already a test case in your PR, I could
try it out.

w.r.t. the comments about functions - it seemed strange to me too. I think
I made that clear in that thread. :)

On Fri, Jul 5, 2013 at 9:15 AM, Julian Goacher notifications@github.comwrote:

hi @carchrae https://github.com/carchrae - thanks for the pointers.

In my usage, I'm not too worried about resolving function values
encountered along a path, but I did encounter problems when the path
resolved to a method function on an object and Chunk.section|reference
didn't invoke it correctly due to an incorrect 'this'. My PR addressed this
as a side issue, the path resolution code being a natural place to insert a
fix. One of the reasons I didn't actually invoke the method at that point
(or anywhere along the path) is simply because the methods arguments aren't
known and aren't available, so it was better to return a reference to the
method function in a form that could be invoked correctly. Do you think
there could still be an argument for including this functionality in your
PR (as I guess it's the main candidate now for fixing the path issue)?

I see in the #271 #271 thread
that there is discussion about whether functions should even be included in
the data context. One of dust.js's most important features - and the main
reason I switched to is - is its support of async functions in the data
context, so I found those comments a bit strange.


Reply to this email directly or view it on GitHubhttps://github.com//pull/174#issuecomment-20526979
.

@juliangoacher
Copy link
Author

@carchrae I was testing it implicitly, I've just pushed an update with an explicit test: juliangoacher@32397df Disabling the function wrapper code in my PR causes the test to fail, "Expected 'Peter Jones' to equal 'undefined undefined'."

@carchrae
Copy link
Contributor

carchrae commented Jul 5, 2013

ah, gotcha. you want the function to refer to other parts of the context. interesting problem. and good test case to demonstrate why you want it.

{
        name:     "method invocation",
        source:   "{person.fullName}",
        context:  {
          person: {
            firstName: "Peter",
            lastName:  "Jones",
            fullName: function() { return this.firstName+' '+this.lastName; }
          }
        },
        expected: "Peter Jones",
        message:  "should test resolve correct 'this' when invoking method"
      },

i'm not a fan of wrapping the function in the getPath code - you'll be doing that each time you hit the reference.

i think you might also break a path that returns a non-simple function - ie one like this: function(chunk, context, bodies, params) - which might instead return a chunk. see the Chunks section on http://akdubya.github.io/dustjs/ - it might that one of the rules on a function is to not allow access via this -

   fullname : function(chunk, context) {
    var cur = false;  /* search local context only? */
    return chunk.write(
           context.getPath(cur,["person","firstName"])   /* boy that is ugly! */
           + '  ' 
           + context.getPath(cur,["person","lastName"])
    );
  }

also, not sure if you've seen this code; https://github.com/linkedin/dustjs/blob/master/lib/dust.js#L370 - there is an explict attempt to set 'this' to the current context (note, not the object, but the dust.Context type)

@carchrae
Copy link
Contributor

carchrae commented Jul 5, 2013

also, thanks for reminding me why i hate 'this' in javascript so much. :)

@juliangoacher
Copy link
Author

@carchrae the minor overhead of creating the wrapper each time is better than code not working as expected. I know there is code elsewhere in the source that attempts to resolve 'this', but (obviously) it isn't working correctly. Not sure why you think it would break a non-simple function - I've been using my changes with async functions like the one you show. And I don't like arbitrary rules like "don't use this here' - 'this' is a fact of life!

@carchrae
Copy link
Contributor

carchrae commented Jul 5, 2013

yes, you're right - the async functions should be ok, you're just changing
'this' - however if anyone else is writing wrapped methods and presume that
'this' is set to Context.this then you may surprise them by setting it to
you object.this. i really don't like using 'this' in callbacks for that
reason - it is never clear what 'this' refers to when the function is
invoked.

you might well contest the fact that someone set this to the current
Context.this as well - i don't actually understand why that change was made

  • it seems wrong to me, since context is passed explicitly as an argument.

@juliangoacher
Copy link
Author

@carchrae My reasoning on this is that if the path 'a.b' resolves to a function, and the function contains 'this' references, then it's natural to suppose that 'b' is a method of 'a' and the 'this' refs refer to 'a'.

@carchrae
Copy link
Contributor

carchrae commented Jul 5, 2013

makes sense - although it is not the same as the dust template 'this' aka {.} - which is the current context.

i'm pretty sure the suggested way is to use ctx.get("firstName") instead of this inside functions - and since ctx==this, you could just write this.get() or get(). but that won't work in the above testcase - it would work if the template was {#person} {fullname} {/person} - instead you'd have to use the getPath ugliness.

another alternative is use the function constructor pattern: http://stackoverflow.com/questions/1114024/constructors-in-javascript-objects

var Person = function(first,last) {
  var self = this;
  self.first = first;
  self.last = last;
  self.fullname = function(){ return self.first  + self.last; } ;
}
var person = new Person('bob','dobbs');

@juliangoacher
Copy link
Author

@carchrae I don't think suggestions like using ctx.get inside functions are design choices so much as workarounds the limitation of not having a correctly resolve 'this'. Bottom line here is that in any other JS code, 'this' in a function method is a ref to the object the function is bound to, and there's no reason why it can't be the same in dust. Fixing this just means that object methods work as expected. For example, if I have a date object in the context then with the fix I can do something like {date.getYear} in a template - that won't work currently. Saying 'this === ctx' is just arbitrary, restrictive and obsolete because as you pointed out, ctx is passed as an argument anyway.

@carchrae
Copy link
Contributor

carchrae commented Jul 6, 2013

Yeah, I pretty much agree. Only thing I did not like so much is wrapping function. It would be nicer to set this at the function apply, wonder why that is being set to context..

Julian Goacher notifications@github.com wrote:

I don't think suggestions like using ctx.get inside functions are design choices so much as workarounds the limitation of not having a correctly resolve 'this'. Bottom line here is that in any other JS code, 'this' in a function method is a ref to the object the function is bound to, and there's no reason why it can't be the same in dust. Fixing this just means that object methods work as expected. For example, if I have a date object in the context then with the fix I can do something like {date.getYear} in a template - that won't work currently. Saying 'this === ctx' is just arbitrary, restrictive and obsolete because as you pointed out, ctx is passed as an argument anyway.


Reply to this email directly or view it on GitHub.

@juliangoacher
Copy link
Author

@carchrae The wrapper function is a necessary evil (although I don't think it's bad - it's one of the beauties of js that a wrapper like that can be easily written, and it's a common solution to this type of problem). I certainly think that getPath is the correct place for the wrapper function. The 'this' problem only really occurs when evaluating paths, and putting it there ensures that any code using getPath will be able to resolve function results correctly. If your concern is performance related - i.e. wrapping a function each time it is resolved - then there are ways to optimize, although I think such concerns are probably overstated.

@rragan
Copy link
Contributor

rragan commented Jul 8, 2013

I see your point about making this be the parent object in the event of a path reference. The code in the reference method has evolved slightly in the LinkedIn era. The original code was:

elem = elem(this, context, null, {auto: auto, filters: filters});
and now is
elem = elem.apply(context.current(), [this, context, null, {auto: auto, filters: filters}]);

Neither do what you want wrt "this".

The reference method does have logic to support functions that return Chunk and those that don't. This is part of the requirements.

akdubya does state:
"Dust does not care how your reference objects are built. You may, for example, push prototyped objects onto the stack. The system leaves the this keyword intact when calling handler functions on your objects."

Any chance we can divorce this from the getPath issue? I guess a reason not to would be that this would be another incompatible change that could be swept up as part of the "pathScope". The change would be to an internal interface that is not documented exactly. However, LI has no stomach for any incompatible changes due to the large number of internal use projects. Users always seem to find a way to depend on undocumented behaviors. it is OK to migrate, though, with options such as were added with #271.

@juliangoacher
Copy link
Author

@rragan I'd seen the original code before. I'm inclined to think that akdubya was looking for the most convenient 'this' candidate to hand, rather than necessarily the most correct one. Using plain 'this' in the first case was never going to be useful as it would reference a chunk object. Using context.current() was a stab at a better solution but is never going to return the correct reference when a path is being resolved. For these reasons I'd doubt that anyone uses 'this' in a path resolved function... but I guess you never know.

I really think there's an argument for including this fix as part of the getPath functionality because it's a problem that particularly occurs when a method function's value is being evaluated in a template. And having options for enabling/disabling the new behaviour (also included in this pr) are all the more reason to include a 'this' fix, as anyone likely to be affected won't enable the new behaviour.

Anyway, I see now that the situation has been complicated with the addition of a third pr addressing path functionality - I actually think #296 's treatment of paths is more correct than #271 as it follows the behaviour of an object prototype chain in js (same as this pr). Do we get a vote on which one goes through?

@rragan
Copy link
Contributor

rragan commented Jul 11, 2013

General feeling seems to be that it is OK to make #271 incorporate the wrapper function to ensure "this" is proper. Can we get that change incorporate and a test or two to verify it. Then maybe we can move forward with this set of changes -- "famous last words". Thanks for the work, folks.

@vybs
Copy link
Contributor

vybs commented Jul 11, 2013

concur with @rragan. Atleast at LI we never encourage use of lambda's in JSON for various reasons, security been one of them, not executing random JS code.

carchrae added a commit to carchrae/dustjs that referenced this pull request Jul 11, 2013
@carchrae
Copy link
Contributor

ok - i've done it and added a test case.

i'd be interested to see the performance difference. should be small, but
there is an extra var and a single typeof check at the end

On Thu, Jul 11, 2013 at 9:41 AM, Veena Basavaraj
notifications@github.comwrote:

concur with @rragan https://github.com/rragan. Atleast at LI we never
encourage use of lambda's in JSON for various reasons, security been one of
them, not executing random JS code.


Reply to this email directly or view it on GitHubhttps://github.com//pull/174#issuecomment-20825159
.

@juliangoacher
Copy link
Author

@rragan @vybs Ok, I realise I can't probably change the decision to go with #271 as this stage, but I didn't realise until @rragan pointed it out that #271 attempts to resolve the full path at all points on the stack until it finds a match, and I'm very strongly of the opinion that this is a bad choice of behaviour, which will produce surprising and unpredictable results for some subset of users. Because at my company we already make extensive use of the modified path functionality, and the #271 style change will just cause problems for us, it looks like I'm going to have to continue maintaining my current project fork as a separate project...

@carchrae
Copy link
Contributor

actually this is wrong. it used to do that, now it fails if it finds a
partial match

@juliangoacher
Copy link
Author

@carchrae Ok, so just to be clear - if I have a stack like the following on the context:

[
    { a: { b: 5 }},
    { a: { c: 0 }}
]

then:

  • the path {a.c} will evaluate to '0';
  • but the path {a.b} will evaluate to an empty string (i.e. in the template output)?

If this is the case then that's good and I don't have an objection.

@carchrae
Copy link
Contributor

that does not look like a valid context. but it would be fantastic if you can just express it as a test case, then i'll add it.

@carchrae
Copy link
Contributor

er, sorry, you did say stack.

@carchrae
Copy link
Contributor

and the answer, is yes, a.b will not be found, because of previous partial match on a.c

@juliangoacher
Copy link
Author

@carchrae Yes, it is an 'idealised' view of the stack (although perhaps I showed it upside down).

@carchrae
Copy link
Contributor

@vybs
Copy link
Contributor

vybs commented Jul 17, 2013

closed, since it is addressed in #305

@vybs vybs closed this Jul 17, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet