Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check if a field exists in 'super' #311

Closed
ant31 opened this issue May 26, 2017 · 60 comments
Closed

check if a field exists in 'super' #311

ant31 opened this issue May 26, 2017 · 60 comments
Assignees

Comments

@ant31
Copy link

ant31 commented May 26, 2017

Hi,

I'm trying to do the following

local addMetadata(key, value) =
 {metadata+: {[key]: value};

{} + addMetadata("app", "name")

It fails because super doesn't have the 'metadata' field. I would like to know if there is a way to test that condition ?
eg:

if std.objectHas(super, 'metadata')
@sparkprime
Copy link
Contributor

#305 asks for the same thing.

My preferred approach is to put metadata: { } in the super object, but if that is not possible, we might have to revive the standalone super, which I removed because the implementation was very complex.

@jbeda
Copy link
Contributor

jbeda commented May 26, 2017

As we build out ksonnet, we want to be able to "mix in" things that the super object may not know about apriori.

An alternative would be to have something like {} + initMetadata() + addMetadata(...) but that is pretty verbose.

@ant31
Copy link
Author

ant31 commented May 26, 2017

Is this limited to a new notation or function would reduce complexity ?

eg:
metadata|: {}

or
std.superHas('metadata')

@sparkprime
Copy link
Contributor

So in general +: can be used to extend an array, string, or even to add to a number. So a |: syntax sugar would have to default the super field to the correct "zero" value for the RHS of the operator.

What I used to do is have a template for my "base" objects that came from the library. So you do:

ksonnet.Base { } + ksonnet.mixin1 + ksonnet.mixin2 instead of { } + ksonnet.mixin1 + ksonnet.mixin2

That allows you to add stuff to Base as you extend the library. Can you do that?

@jbeda
Copy link
Contributor

jbeda commented May 26, 2017

@sparkprime this is much of what we are doing now. The end result is that there is a lot of ugly cruft left over in the final rendered output (hence prune). As things get deeply nested this gets pretty big. We also may find ourselves in a situation where we have:

lib1.Base + lib2.mixin1 + lib2.mixin2

lib2 may be adding a field that lib1 doesn't know about. We'd then have to do something like:

lib1.Base + lib2.Base + lib2.mixin1 + lib2.mixin2

But that is pretty order dependent and a bit dangerous.

This triggers a thought: Another reason to allow detection on super would be to assert the thing you are clobbering is empty already. That would prevent users from putting a base on top of something that already has data.

@hausdorff
Copy link

In addition to what Joe and Antoine are saying, our goal is to use mixins with Kubernetes API objects that may not have been created using our templates, and therefore, may not have our zero values. This use case is very, very important to us, and it would be very helpful to have language level support.

@sparkprime
Copy link
Contributor

In the old world where standalone super existed, you could achieve this in a library by doing something like

local extend_if_has(s, f, v) = if std.objectHas(s, f) then s[f] + v else v;
arbitrary_object + { f: extend_if_has(super, "f", ...) }

which would be equivalent to arbitrary_object + { f |: ... }

@sparkprime
Copy link
Contributor

What about if super[f], rather than yielding an error if f is not in super, instead returned some special value that triggers an error if used in any way other than on the LHS of a +, in which case the semantics are that the result of the + operator is the RHS of the +.

The net effect of all of that is that { } + { f+: { } } will yield { f: { } }

This should be backwards compatible because it's only introducing new semantics in a case that used to yield an error.

@sparkprime
Copy link
Contributor

My intuition here is that in an OO language, if you do class A extends B { public void f() { } }, there was no requirement for there to be an empty function f() in B. It either defines or overrides. So, fundamentally, what you're asking for is natural.

@sparkprime
Copy link
Contributor

An alternative way to achieve this is to make +: a first class primitive, i.e. it is not just a desugaring to super[f] + ..., as it embeds the error tolerance.

@hausdorff
Copy link

Unless I'm missing something, the option super[f] seems to me to impose a not inconsiderable risk to users who might accidentally shoot themselves in the foot, while its benefits seem to be only that it seems pretty quick to implement. If that's correct, then I would say that's my least preferred option.

That seems to leave the options of making +: a part of the runtime, or having a standalone super. From a semantics perspective these seem to be about equivalently sensible, so my suggestion would be the one that imposes the least amount of risk to the runtime, which I tend to think of as being correlated with complexity of implementation.

What do you think?

If that sounds more or less inline with your thinking, it would be great to have a breakdown of the work you would expect to ship the feature, so that we can prioritize it later next week. (BTW, we will also be discussing the changes to import I have proposed in #307.)

@sparkprime
Copy link
Contributor

What kind of foot shooting do you anticipate with a more permissive super[f] ?

@hausdorff
Copy link

Speaking for myself, I think it introduces corner cases that would not be obvious to me as a new user. The failure mode as it exists now is not super subtle: it becomes clear immediately what is and is not allowed (e.g., you now know you can't check if super has a field, so your mixins must build on fields that exist).

In the case of a more LHS-amenable super[f], the failure mode is more complex, and it's harder to reason about what is not allowed, which (I believe) makes users more likely to accidentally do something not that's not allowed.

@sparkprime
Copy link
Contributor

To 100% verify what you'd say, we'd have to find some case where super[f] + e actually should yield an error, and yielding e instead masks a real problem. The most obvious example is a mistake computing f (or a typo in the case of super.f). I can't think of an example but since super[f] has uses beyond the desugaring of +: I'm convinced there probably is one.

Two things come to mind:

  1. super[f] having more "powers" than self[f] is weird
  2. we don't need things like local s = super[f]; s + 3 to work, i.e. this is overkill.

The difference between "more permissive +:" and "standalone super" is that the former allows you to write your mixins more easily. I can imagine you defensively coding every field using reflection, and that's going to hurt readability and maybe performance too.

Having +: (and friends) be a first class operator with extra powers means you can't use a mixin to prepend to an array, or insert into a set, which might be useful.

I think what we need is a new operator, and I can think of two possibilities (syntax TBD) and I haven't thought about how to define the semantics in the spec yet (object-inherit rule):

  1. super[f] ? e to be parsed as a single AST with f and e as subexpressions. super.f ? e is also allowed, as a simple syntax sugar of super["f"] ? e. The semantics of super[f] + e are that if f exists in super, then super[f], otherwise e. Then the various forms of +: can all desugar to either one of those. The tricky part here is f+: e needs to desugar to f: (super[f] ? zero(e)) + e, where zero is a function that produces the appropriate zero value according to the type of its argument.

  2. super[f] ?+ e where the difference in semantics is it yields super[f] + e if super[f] exists, otherwise it is e. In this case the desugaring does not need a zero() function, but if you want to apply an arbitrary transformation to the value from the superobject, you have to add on an appropriate zero, e.g. to prepend a 1, do [1] + (super[f] ?+ [])

Some test cases:

{} + {f+: 3} == {f: 3}
{f: 1} + {f+: 3} == {f: 4}
{f+: 3} == {f: 3} (must be equivalent to {} + {f+: 3} because of identity of {}
{f: 1} + { } + {f+: 3} == {f: 4} (

@sparkprime
Copy link
Contributor

Another option:

  1. Something like super ? f, which returns true or false depending whether the field from expression f`` exists in the super.

@hausdorff
Copy link

hausdorff commented May 29, 2017

Assuming I understand your proposal, it seems to be similar to C#'s null coalescing operator, but instead of coalescing when the value is null, you're coalescing when !std.objectHas(super, f). This sort of thing has worked out really well for C#, so I think your proposal is reasonable.

I do wonder two things:

  1. Is it worth keeping the overload-or-set syntax explicit? It seems to me that users might in some cases want to fail if some property f is not in super.
  2. What is the zero value of a complicated object? I haven't thought about it a lot, but since the spec doesn't seem to indicate there are static, structural types, it seems like it could just be {}, which should retain the existing semantics of +:.

@sparkprime
Copy link
Contributor

Yeah everything we've discussed has been a twist on that kind of defaulting operator, except specifically for super.

There is Java's @OverRide annotation which might work for the case where you want to assert the existence of something in the super class. That works for these more complex nested things and in the basic case of A + { /* override */ f: 3 } where you don't want to add on to super.f but just replace it (e.g. A = { f: error "please override this" }).

zero is just z where z + v == v, so since + is only defined on a few types, it is { } [ ] "" or 0.

@hausdorff
Copy link

Ok, so does it make sense at this point to put together a proposal for how to proceed? I'm not sure generally what the process is.

@sparkprime
Copy link
Contributor

sparkprime commented May 31, 2017

I think we're doing the process right now :)

I think super ? f is the best bet as it's the smallest construct from which the others can be derived. It's easy to add it to the spec because it can desugar to std.objectHas(super, f). Actually implementing it is more challenging though. I'll look into that now.

@hausdorff
Copy link

Right, sorry, what I'm saying is that it would be helpful for me is if we had a breakdown of stuff to do so that I could start to push to put that in our planning cycle. Unless you want to do the implementation, of course. :)

@sparkprime
Copy link
Contributor

sparkprime commented May 31, 2017

I'd like to sort out all remaining design questions before we start implementing, as I'm not worried about the implementation and can do it all myself if needed. However I'd rather not have to backtrack the design after it's gotten adoption. Typically, the design part is the hard part :)

  1. Is the question mark appropriate syntax? Is there any syntax we'd like to support in future that has a ? (Doesn't appear to be the case in Python and Javascript, disregarding ?: which we already have via if then else).
  2. Do we need more symmetry with super.f and super[e], e.g. super?[e] and super?.f or ?super.f or super.f? or something?
  3. Do we want symmetry with self? I.e. do we want e ? e and in that case what is the precedence of ?
  4. Should we have an equivalent of std.objectFields() for super, like super?? or something? Therefore should we deprecate std.objectFields and have e?? the equivalent?

If the design ends up being bigger, that's fine as we can stage the implementation. So you'll still be able to take advantage of this quickly.

Here's a breakdown of implementation tasks with dependencies:

Code:

  1. ast.h: Add ASTType enum + struct for each new AST (or we can add boolean flags and extra whitespace & comments to existing ones)
  2. (deps: 1) Changes to lexer and parser to generate those ASTs
  3. (deps: 2) Change static analysis to understand them (they're the same as existing constructs)
  4. (deps: 3) Change desugarer (they're the same as existing constructs).
  5. (deps: 4) Modify VM to execute them. New ASTs require new FRAME enums to put on the stack. There is an existing function findObject that can be used to implement the actual semantics.
  6. (deps: 2) Change reformatter to print out the new constructs, and add rules to existing passes to do the right thing regarding reindentation, etc.
  7. (deps: 5) Modify desugaring of +: +:: +::: to use the new constructs.
  8. Add end-to-end tests for these constructs.

I propose I do 1, 2, 3, 4, 5 as (5) in particular involves ugly code and a delicate garbage collector.

Docs & Releasing:

  1. Update spec to add the new construct (we shouldn't have to change the core language, just add desugaring rules).
  2. Update spec for the change in +: desugaring.
  3. Update tutorial
  4. Cut the release
  5. Update brew and pypy

@hausdorff
Copy link

hausdorff commented May 31, 2017

I don't think that I understand what you mean in questions (2) and (3). I'm not sure if you're proposing new syntax/semantics, or what your proposing symmetry between.

I can talk a little bit about point (1) though.

A lot of the proposed syntax is similar to various implementations of null-coalescing operators (you might say they're null-coalesque), but semantically different (i.e., we coalesce when !std.objectHas(super, f) rather than when super.f == null. So I'd like to propose that we either (1) change the syntax to be less similar, or (2) change the semantics to be more similar. Aside from avoiding misleading users with syntactically similar, but semantically different operators, if we were ever to decide to adopt null-coalescing operators, we will have to discuss again how much syntax we'd like to borrow from others, how confusing it will be to have null-coalescing be a different operator while we still have ?, and so on.

Just to be completely explicit, let me describe the semantics of the C# implementation of null-coalescing operators (in the case of the latter operator, there is a similar implementation in JavaScript). super?.f and super?[f] will return null if super == null. And super[f] ?? e would return e if super[f] == null.

Given this, it seems like we have a few options:

  1. Keep the syntax the way it is and don't change the semantics.
  2. Use some other syntax. Antoine's (|:, which looks kind of like an or) seems sensible to me. If we ever adopt null-coalescing operators, this might be weird to have both.
  3. Keep the ? operator, but coalesce not only when !std.objectHas(super, f), but also when super.f == null. This introduces subtle semantic differences between this and other implementations, though, as it is subtly conflating "not in the object" with null, which might lead users to expect that they can get null if they try to do super.f for some fsuper.

I'm less concerned about other operators wanting to use the question mark. The obvious one is Javascript's optional-value ?: (i.e., not the ternary operator), and C#'s nullable syntax [struct name]?.

@sparkprime
Copy link
Contributor

The closest thing we have to null coalescing is if x == null then y else x, and we could conceivably add a syntax sugar to shorten that to x ?? y. You can't do x || y in Jsonnet unless they're actually bools, so that method is not available. Other languages that don't use their logical or operator seem have an explicit operator, e.g. ?? (C#) and ?: (https://en.wikipedia.org/wiki/Elvis_operator). Since we already adopted the C# @ string literals, it would make sense to add ?? if we added anything at all. Is e ? e too close to e ?? e -- yeah probably. Or at least we're not so squeezed yet that we need to risk it.

As for avoiding two different kinds of coalescing operators -- one for OO and one for expressions, that's an another argument for the proposal that just returns a bool.

We have the keyword 'in' that is not used except in comprehensions. I believe we could use that as a binary operator without introducing parsing headaches. This actually aligns with Python, which is nice.

"f" in super
"f" in self
"f" in e

It would make sense to also define it for arrays: 3 in [1, 2, 3] and maybe strings although "a" in std.stringChars(str) would also suffice.

The use within comprehensions is currently limited to arrays, whereas in Python it allows you to iterate over dict fields. Maybe we should extend comprehensions too.

In order for it to solve the original problem, it would have to return true if the field is hidden. I think that makes sense -- these reflective constructs are more useful during computation, whereas hidden / non-hidden is more about manifestation.

@sparkprime
Copy link
Contributor

This also answers my questions (2) and (3), because Python has already decided:

  1. No we don't need a special syntax for "f" in e when the field is a literal that also happens to be an identifier
  2. We do need to allow it for self as well

@hausdorff
Copy link

Ok, so the proposal at this point seems to be something along the lines of using if f in super then f + [...] else [...]?

It is verbose, but acceptable. It seems like this removes the need for std.objectHas? Also, personally, I kind of like that searching an array is an explicit function call right now. Seems like 3 in [...] makes it easy to write poorly performing code. It's not a show-stopper, though.

@sparkprime
Copy link
Contributor

Yes but also changing +: to test it as well, which should cover almost all cases

Sent from my Motorola Nexus 6 using FastHub

@hausdorff
Copy link

Sweet. So is the next step to write the spec, write the code, or something else?

@ant31 ant31 changed the title [help] check if a field exists in 'super' check if a field exists in 'super' Jun 1, 2017
@hausdorff
Copy link

Also, just for planning purposes, what are you thinking about timelines? (Obviously we are hoping to have everything somewhat sooner, but I also understand you are busy. :) )

@sparkprime
Copy link
Contributor

Some progress already:

$ jsonnet -e '"f" in {f: 3}'
true
$ jsonnet -e '"f" in {g: 3}'
false

@sparkprime
Copy link
Contributor

Yeah, also give it a go and see if you can write the code you're expecting to be able to write

@hausdorff
Copy link

Excellent. It seems to work for at least a core handful of scenarios we care about. I'll start writing actual things against it and let you know if I hit problems.

@sparkprime
Copy link
Contributor

Interestingly, it breaks the invariant that { } + e = e, e.g., where e is { f+: {} } the latter is an error whereas the former yields { f: { } }. I don't think this is necessarily bad, and the invariant could be alternatively defined as "If e executes without error, then { } + e also executes without error and yields the same thing".

@sparkprime
Copy link
Contributor

Actually, { } + e = e never held in the original version of the language, with standalone super, e.g., e as { f: super }. I do think it held in the version with just super[e] though. It's not a big deal, really.

@sparkprime
Copy link
Contributor

Ok, thinking about this has led me to realize that the new implementation does not match the new spec. I haven't written the new spec, but planned to use the fact that the spec's core language allows standalone super to simply desugar e in super into std.objectHasEx(super, e, true). However the spec also says (object-index) that an unbound super in an object is substituted with { }. In the case of super[e] that always yields an error because { } is empty. However with the new desugaring, std.objectHasEx(super, e, true) is accessible and that will return false.

Basically, the question is "Should e in super return false or yield an error when there is no super object?" If it returns false, as the spec says, then { } + e is equivalent to e and, in particular, you can manifest a mixin without actually mixing it into anything. If it yields an error, then (object-index) has to be changed to substitute unbound super to an error instead or to { }. In that case, { } + e is no-longer equivalent to e and you have to add your mixin to at least { } in order to be able to manifest it.

@hausdorff
Copy link

Alright, I'm going to tentatively say that this works for our use case. I'll post back with some examples of an end-to-end scenario after we finish adapting the codegen to use this feature.

From our end, what do you need from us? If you want code review I'm still happy to do that. (My one lingering concern was whether this would work for nested types, which is why I asked whether the zero value of a nested object type should really be {}, but this seems to work now so I'm satisfied.)

@sparkprime
Copy link
Contributor

I've been grappling with the semantic crisis I described in my last 3 comments, and I'm leaning towards e in super returning false for base objects (rather than an error). My reasoning is that we started with requiring a "full shape" on the left hand side of the +, e.g. { f: { g: { } } } + { f+: { g+: { h: 3 } } } and we reduced it to just { } + { f+: { g+: { h: 3 } } } but why not go the full hog and require only { f+: { g+: { h: 3 } } }

Appreciate your thoughts on it!

@hausdorff
Copy link

It is not clear to me that {} should be the zero value of a nested object, so an expression like {} + {f+: {g+: {h: 3}}} still sort of strikes me as being kind of odd.

I suppose my main question is: why not support both? We could keep :+ as it is in master (i.e., the "non-permissive" super), and then introduce a different operator (:|, say) that lazily populate the super s.t. an expression like { f|: { g|: { h: 3 } } } works without having to have a + with a left-hand side. Then, of course, something like {} + { f|: { g+: { h: 3 } } } would fail, which is nice because it gives you more fine-grained control over how and when your mixins get applied.

If that is not to your liking, it seems like you can maintain the equational reasoning if you have some way of explicitly specifying the default value of super, which could allow you to maintain your equational semantics.

@sparkprime
Copy link
Contributor

It's definitely a tricky one. { } being the zero value of a nested object is basically what people have been asking for. In master, the level of nesting is 0. @ant31 asked for 1 level of nesting. I anticipated people wanting to go deeper than that and proposed nesting of n - 1, thinking that was the limit, but then realized that we can go all the way to n and doing so preserves an intuitive property.

As for |: having two ways to do it that are quite similar is not so good. Especially since we already have 6 'colon' operators and |: would make it 9. It's also kinda like having two super -- a strict one and one that assumes everything descends from { }. They're really similar and I'm not sure how to explain to beginners how they should decide which to use when writing a library.

While I bought your argument that adding default super behavior to super.f + e would be surprising, I can't think of a reason why forcing people to do { } + mixin in order to print a mixin helps with programmer productivity.

@sparkprime
Copy link
Contributor

Feel free to review #318 btw. I still need to update the spec and other docs. The semantic crisis is resolved one way or another by changing a single line of code, so in that sense it is not a big deal :)

@hausdorff
Copy link

I'll try to review tomorrow, but I have a wedding to get to later in the day so it might have to be the day after.

@hausdorff
Copy link

Ok, so, not that you need my approval, but I think the proposal of an "n-permissive" super is acceptable, I buy your argument for why it's slightly superior to the "n-1-permissive" super, and I buy your argument about not bloating the language with another operator.

So, I want to be careful not to position this as a total roadblock, but I think it's worth stepping back and considering two goal-level questions:

  1. Do you see value in having an explicit way of erroring out if super does not exist?
  2. Do you see it as a problem that the proposed semantics of +: do not have a way to specify different values for the case that a field does and doesn't exist (e.g., if f in super then x else y)?

If your the answer to both of these is "no" then the proposal is probably mostly complete.

But, for whatever it's worth, I do happen to see a lot of value in (1), and if you agree, and an operator is off the table, then we should discuss how we might approach this by, e.g., surfacing the in syntax outside of the core language.

I also do think that (2) is a problem. If you agree then that discussion can probably be similar to (1), e.g., it can also be solved with explicit in syntax surfaced to the user.

@sparkprime
Copy link
Contributor

sparkprime commented Jun 7, 2017

If I paraphrase your first question:

  1. Is there value in allowing the programmer to write a mixin that is an explicit "delta" and cannot be used by itself. By analogy, some date/time libraries have the notion of a time delta and you must add it on to 1970-01-01 in order to get an absolute time. Also, it's kinda like the notion of an abstract class, but for a mixin.

You can can always use just : and do the expansion yourself, e.g. get the behavior of current master by explicitly doing { f: super.f + ... }. I suppose you could choose how deep you want to do that and thus choose a value between 0 and n.

Another approach is to add an assert that requires "f" in super and use the +: sugar elsewhere. Maybe that's the equivalent of doing f: error "You must override f" in the base class (which is how we typically do "abstract" classes). Maybe it's better to require a specific kind of super (whatever you're looking for) rather than just requiring there to be some super (which even an empty object would satisfy).

For question (2), you can always write that explicitly instead of using +:, which is probably a good idea for readability.

@sparkprime
Copy link
Contributor

sparkprime commented Jun 7, 2017

Oh it looks like we're miscommunicating: e in super is exposed outside of the core library, as is e in e.

@hausdorff
Copy link

Ah, sorry, doing e in super didn't seem to work after building that PR. I mistook that for the proposed spec, but I also could just be doing something wrong.

In any event, great, I buy your argument about the "fully permissive" mixin sugar. That seems like the way to go, given what you've just said.

@sparkprime
Copy link
Contributor

There's some examples at the bottom of test_suite/object.jsonnet

@sparkprime
Copy link
Contributor

Small update: I was out of office for a few days, now I'm working on the spec (figuring out how to left align things)

@hausdorff
Copy link

Sweet. Need anything from us?

@sparkprime
Copy link
Contributor

I'm good, let me know if I can help with #307 :)

@hausdorff
Copy link

I'm hoping to get some time to work on it later this week.

BTW, we're planning on releasing the ksonnet.beta.2 a little after this patch is published into brew, so if there is anything we can do to help expedite, just let us know. (I don't want to intruide if you're too busy.)

@sparkprime
Copy link
Contributor

I think it is all done now. Do you want me to cut a release with this change or wait for #307?

@sparkprime
Copy link
Contributor

BTW I will remove the addition of std.zero from the PR since it is no-longer used in the desugaring. We can always add it back again if it has a use beyond what we discussed here.

@hausdorff
Copy link

@sparkprime Speaking only for myself, I would prefer we cut the release now, since the import syntax doesn't block our release.

@hausdorff
Copy link

That said, I am committed to doing it, just when things calm down a bit...

@sparkprime
Copy link
Contributor

No probs, let's do that then

@sparkprime sparkprime self-assigned this Jun 21, 2017
@sparkprime
Copy link
Contributor

Fixed by #318

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants