Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should equals over null or undefined by specified, and if so, how? #151

Closed
RubenVerborgh opened this issue Jan 28, 2019 · 24 comments
Closed

Comments

@RubenVerborgh
Copy link
Member

RubenVerborgh commented Jan 28, 2019

In #142, we were discussing whether equals should be defined over null and undefined. Let's express opinions with arguments here. Please be clear and brief.

Please vote with on this message:

  • 👍 define in the spec, return false
  • 🎉 define in the spec, throw error
  • 👎 leave open in the spec
  • 👀 either 👍 or 🎉, but not 👎
@RubenVerborgh
Copy link
Member Author

RubenVerborgh commented Jan 28, 2019

I vote 👍 to define equals to return false when the argument is null or undefined.

My reasons for returning false are:

  • First of all, this is JavaScript, so there is no notion of design-time checks. So my starting assumption is that null or undefined can and hence will occur.
  • To me, the equals method answers the question "is this thing equal to this other thing?" A term is not equal to undefined, null, 1234353, or "foo", so should return false in all of those cases.
  • Throwing an error for undefined and null would provide inconsistent behavior. This would (presumably) be some kind of type error, indicating that the argument has the wrong type. However, only throwing this argument for undefined and null guards us only against an extremely small number of such cases. It does not guard us against 1234353, "foo", {}, new Error(), new Date(), NaN, Infinity, or whatever else that is of the "wrong type". So let's be consistent and just return false for everything that is not equal.
  • It makes for a more general equals contract that can be used in other objects.
  • It follows practice from other JavaScript methods such as isNaN, isFinite, isPrototypeOf.
  • It follows practice from other programming languages such as Java.

I additionally want to remark that a runtime check of !!other && should not cause any overhead on modern JavaScript engines. The fact that such a guard is present, might even (depending on the engine and version) make things run faster, because it eliminates an exception path. In contrast, a throw in a function could lead to an elevated calling cost because of the need to set up exception handling (although that might be able to be optimized away). As you can see: performance arguments either way, so if we decide to strongly rely on them (which I hope we don't), we'd need a benchmark.

@bergos
Copy link
Member

bergos commented Jan 28, 2019

I'm for allowing optional and nullable values (also in WebIDL).

If JS would not have null/undefined, we would need to define an interface like NoTerm extending from Term, if we want pass around a term which is optional. But we have null/undefined in JS and that's the actual usage for it.

Also losing errors is not an argument. If the term is a return value of a function which claims to not return null/undefined, the function has to take care that it throws an error if the return value can't be terminated. If the return value is optional, but the code which is calling the function requires a term, the code must take care of it. If we define .equals() to throws errors, that would mean we shift the error handling for the cases where term is not optional to the .equals method. It also means permitted optional values need special code paths before calling .equals().

@elf-pavlik
Copy link
Member

I think we should either agree to define it in specific way or not include it in the spec at all #150 . Having it loosely defined seems to me as worst case scenario since implementer can't assume how the method with behave and will need to code more defensively. I think one might actually prefer in that case to rely on custom function checking equality which follows a specific contract.

@RubenVerborgh

This comment has been minimized.

@RubenVerborgh
Copy link
Member Author

More votes? @blake-regalia? Others?

@ericprud
Copy link

ericprud commented Feb 1, 2019

I prefer strict APIs over a permissive APIs because it trivially catches a large fraction of the likely errors. The effort in debugging an invocation of a permissive API typically involves a debugger and lots of rewinding from the first noticed problem, while catching the same error in a strict API typically happens so quickly and immediately we don't notice it. The occasions where one would sensibly want to generalize the equals to include null and undefined are rare and should result in a visible manifestation in the code, e.g.

if (previousPoster && equals(previousPoster, currentPoster)) { /* mark as self-reply */ }

With the permissive API, that can be shortened to:

if (equals(previousPoster, currentPoster)) { /* mark as self-reply */ }

but I believe that hides the intention of the code.

@blake-regalia
Copy link
Contributor

blake-regalia commented Feb 1, 2019

@RubenVerborgh I think you are giving an unfair representation to the issue. First of all, the suggestion was never to require implementors to explicitly throw an error, the argument was simply for allowing implementors to not check for falsy values which would automatically raise an exception for null/undefinned. The phrasing and arguments you've made make it seem we are asking implementors to write a throw statement -- this is not what I am suggesting. However, I'd imagine that if an implementor wanted to throw a TypeError that would be their perogative and would impose zero interop challenges / breaking changes.


It follows practice from other JavaScript methods such as isNaN, isFinite, isPrototypeOf.

I'm glad you brought this up. Actually, after a bit of digging I found perhaps the strongest precedent:

>> buf.equals(otherBuffer) from node.js throws a TypeError. <<


Throwing an error for undefined and null would provide inconsistent behavior. [...] However, only throwing this argument for undefined and null guards us only against an extremely small number of such cases. It does not guard us against 1234353, "foo", {}, new Error(), new Date(), NaN, Infinity, [...]. So let's be consistent and just return false for everything that is not equal.

Saying that this only guards against an extremely small number of cases is totally incorrect as null/undefined is 1000000x more likely than new Error() or other silly things. As you've said yourself:

it's prudent to guard for undefined or null (which are not fake instances but rather null pointer mistakes)

'Null pointer mistakes' and variables that were not initialized are exactly what implementors want to prevent by raising an exception. Please let us do so.

.

Furthermore, you cannot argue for consistent behavior for all 'universal' cases and then surrender to not checking for 'function' === typeof this.subject.equals && 'function' === typeof this.predicate.equals && .... I debunked this with plain objects. If your argument is to be consistent then you need to own all of it. I'm tired of these arguments where it's having your cake and eating it too.


It follows practice from other programming languages such as Java.

......

First of all, this is JavaScript


Consider this

Assume that Term#equals allows null/undefined and some function yields terms (and possibly nulls) in no particular order.

// fill an array with terms and maybe some nulls
let a_terms = [
	...functionThatYieldsTermsAndNulls(),
];

// test every pairwise combination
let nl_terms = a_terms.length;
for(let i_a=0; i_a<nl_terms; i_a++) {
	for(let i_b=i_a+1; i_b<nl_terms; i_b++) {
		// found a duplicate
		if(a_terms[i_a].equals(a_terms[i_b])) {
			console.log('found duplicate');
		}
	}
}

If functionThatYieldsTermsAndNulls happens to yield a null value as the last item, nothing breaks. If it yields ONE MORE term after a null value, the whole thing blows up in the developer's face.

The point is that some of want to protect against null/undefined rather than silently ignoring it.

@RubenVerborgh
Copy link
Member Author

I think you are giving an unfair representation to the issue.

This was definitely not my intention. I raised an open question regarding false/throw/unspec'ed.

The phrasing and arguments you've made make it seem we are asking implementors to write a throw statement -- this is not what I am suggesting.

You are suggesting unspec'ed. Unspec'ed means concretely that implementors have the freedom between returning 1) false (which is what I want) or 2) throwing (which is what I am against). I am not aware of a third proposed option. Hence, my arguments are for the first option and against the second option, leading to my conclusion that it should be a) spec'ed b) as the first.

I'm tired of these arguments where it's having your cake and eating it too.

None of us are making broken arguments on purpose. As I said, happy to be proven wrong.

To your concrete point

you cannot argue for consistent behavior for all 'universal' cases and then surrender to not checking for 'function' === typeof other.subject.equals && 'function' === typeof other.predicate.equals && ...

Yes I can: other.subject.equals is never called, so why would it need to be checked? Also, you forgot the case where accessing other.subject or other.subject.equals throws an error. We can guard for all of that, but I don't think we should, and performance will agree with me. So this is why I am sticking to reasonably behaved objects (= no error thrown on property access) and am not calling anything that is not a function.

@blake-regalia
Copy link
Contributor

Yes I can: other.subject.equals is never called

Sorry I made a typo; should be this.subject.equals, which would need to be checked in order to return false for all cases -- otherwise it would throw an error. My point was that you are arguing for consistency and that returning false works for new Date() yet it does not work for: quad.equals(factory.quad({termType:'NamedNode', value:'http://ex.org/value'}, ...)). Okay if you want to be consistent but how do you address these cases which are more plausible than new Date()?

@RubenVerborgh
Copy link
Member Author

RubenVerborgh commented Feb 1, 2019

Okay if you want to be consistent but how do you address these cases which are more plausible than new Date()?

I'm not sure why malformed Term objects would be more plausible; I don't have things that would generate them. But yes, internal object consistency is assumed. Otherwise, we even could/should start doubting the existence of this.subject at that point (the constructor assigned it, but maybe someone changed it, and what if accessing it throws an error).

@elf-pavlik
Copy link
Member

@RubenVerborgh: I additionally want to remark that a runtime check of !!other && should not cause any overhead on modern JavaScript engines. The fact that such a guard is present, might even (depending on the engine and version) make things run faster, because it eliminates an exception path.

@ericprud: I prefer strict APIs over a permissive APIs because it trivially catches a large fraction of the likely errors. [...] The occasions where one would sensibly want to generalize the equals to include null and undefined are rare and should result in a visible manifestation in the code

It seems that those two come as mutually exclusive requirements which I see as motivation for #150 This way as libraries which want to rely only on what data-model-spec defines, would need to use something like if (equalTerm(previousPoster, currentPoster)). While applications which want to use member method, may need to ensure that specific factory which provides custom extension creates all the terms.

Mandating one of those mutually exclusive requirements in the spec, would prevent compliant factories to choose to implement it the other way. Taking the 👎 leave open in the spec would allow implementations to choose either way, but then it would put developers in the similar situation as not defining it at all, since they would need to either rely on a equality function which has clear contract matching their expectations, or ensure that all terms get produced by a specific factory. In that case not including it the spec feels a cleaner option.

Given current vote counts on 👍 define in the spec, return false. What people voting this way can recommend for implementations which need different behavior but still want to comply with the spec?

@RubenVerborgh
Copy link
Member Author

What people voting this way can recommend for implementations which need different behavior but still want to comply with the spec?

For instance, check for null before calling equals, or write a derived member method on Term that does this.

@bergos
Copy link
Member

bergos commented Feb 2, 2019

@blake-regalia About the equals method of Node Buffer:

If Node.js defines it different to ECMA, I would rather follow the examples from ECMA as our spec is about JS.

Also I see it even as an argument for my NoTerm argument as a Buffer can be empty. So every API we had a look at supports somehow empty/null.

@blake-regalia
Copy link
Contributor

If Node.js defines it different to ECMA, I would rather follow the examples from ECMA as our spec is about JS.

I would happily agree if there were any examples from ECMA on objects (i.e., non-frozen values that inherit from Object) but they are only for primitives (e.g., numbers and strings). For example, if only there was a Date#equals this whole debate would be settled but I have not been able to find any examples from ECMA on non-primitives.

Also I see it even as an argument for my NoTerm argument as a Buffer can be empty. So every API we had a look at supports somehow empty/null.

Empty and null are completely different. The NoTerm does not support null as it would be a defined object, right?

@RubenVerborgh

  1. false (which is what I want) or 2) throwing (which is what I am against). I am not aware of a third proposed option.
  1. don't check for falsy values; no throw statement. This is the third option, calling with null/undefined would raise an exception, the same way it might for any other method that expects a typed parameter, say, sink.import(null)

Anyways, it seems like we are split on this issue and nobody will be happy either way. As it stands, I cannot support null/undefined but there doesn't seem to be any more constructive discussion to be had.

@bergos
Copy link
Member

bergos commented Feb 2, 2019

You get the empty Buffer for free, without changing anything in the code. That's not the case for Term, that's why I would use null.

What I don't understand, if you want such strict typing for RDFJS, what about the rest of your code? How can null be the error case and not empty? Shouldn't you throw also an error in your code, instead if having variables with null values? Shouldn't it be like this:

A function returns a Term, but not found is not an error case, so it could be null:

const term = functionReturnsOptionalTerm()

if (other.equals(term)) {
  // you never reach that point with term === null
}

Another function always returns a Term. Not found is an error case, so it also throws an error:

try {
  const term = functionReturnsAlwaysTerm()

  // you never reach that point with term === null...
  if (other.equals(term)) {
  }
} catch(err) {
  // ...cause you will end up here
}

I'm still curious in which cases accepting null by .equals() is a problem.

@blake-regalia
Copy link
Contributor

blake-regalia commented Feb 2, 2019

Thanks @bergos 🙂 , the point is not to try/catch, nobody is suggesting to write try/catch blocks for .equals. The point is to prevent unexpected null/undefined variables from being silently ignored in prod.

A function returns a Term

const term = functionReturnsOptionalTerm()

// term may be null/undefined and i, the dev, *know* that is part of
// the return range for functionReturnsOptionalTerm; so i test for
// falsy before using it as a term anywhere
if (term && other.equals(term)) {
}

Another function always returns a Term.

const term = functionReturnsAlwaysTerm();

// no need to test for falsy term since it is *assertable* that it
// should *never* return null/undefined. if an exception is raised here,
// then there is a *bug* with functionReturnsAlwaysTerm
if (other.equals(term)) {
}

@bergos
Copy link
Member

bergos commented Feb 3, 2019

Thanks for explaining @blake-regalia. So in the end it's about priorities. For me it's more important to have null/empty feature than faulty code handling.

Also what we are working on is a spec and not a law. You are free to work outside the spec. Maybe you can do it even like this:

// spec interface
const factory = require('my-factory')

// spec + throw error on null interface
const factory = require('my-factory')({ throwNull: true })

Well documented in the readme, so everyone knows this is a custom feature.

@elf-pavlik elf-pavlik added this to the data-model-spec milestone Feb 7, 2019
@elf-pavlik
Copy link
Member

elf-pavlik commented Feb 8, 2019

https://gitter.im/rdfjs/Representation-Task-Force?at=5c5e0715adf6cb0b2c87a986

@timbl 16:47
For me the equals sunction in the library is sameTerm() which obvious suggests that it only expects to be passewd a term. I’d marginally be in favor of ending up with an exception if I pass it undefined, as experience tells me that is going to be the way a lot of wrrors are cuaght, and like @ericprud i prefer it the exception as early as possible. but much more important is consistency between different libraies

I see he voted with 👀 but it does have slight preference on 🎉

@elf-pavlik
Copy link
Member

Also what we are working on is a spec and not a law. You are free to work outside the spec.

// spec + throw error on null interface
const factory = require('my-factory')({ throwNull: true })

Instance created with this factory will not conform to the spec any more since when null or undefined given it will throw and not return false as spec mandates.

@bergos
Copy link
Member

bergos commented Feb 11, 2019

Instance created with this factory will not conform to the spec any more since when null or undefined given it will throw and not return false as spec mandates.

I'm aware of it. I'm not saying this should be used by a library, but if you want, you can do it in your app. Also you should be aware that you risk compatibility with other libraries.

@bergos
Copy link
Member

bergos commented Feb 11, 2019

Based on 8 👍 , 3 🎉 and 3 👀 votes, we will allow null and undefined values and return false for these cases.

How should we write it in the spec? Some details we have to clearify:

  • In Rigorously define equals method on Term and Quad #142 null and undefined were explicit mentioned. Should we just write falsy?
  • In the reference code we do a simple !!other. How do we get the description matching to the code?
  • How should the WebIDL look like? Do we defined other optional and nullable?

I prefer define it optional and nullable in WebIDL. Maybe explicit mention the falsy check is a good idea.

@RubenVerborgh
Copy link
Member Author

That is possible; reason I explicitly wrote null/undefined is because those are the two values for which the remainder does not hold. The remainder does hold for false and 0, for instance.

  • In the reference code we do a simple !!other. How do we get the description matching to the code?

We do not; the reference code has an optimization in there, which is not necessary to be spec'ed.

@vhf
Copy link
Member

vhf commented Feb 11, 2019

I suggest reapplying the patch I removed from #142 :

diff --git a/interface-spec.html b/interface-spec.html
index f0f7b01..5f4cc36 100644
--- a/interface-spec.html
+++ b/interface-spec.html
@@ -129,7 +129,7 @@
     interface Term {
       attribute string termType;
       attribute string value;
-      boolean equals(Term other);
+      boolean equals(optional Term? other);
     };
     </pre>
 
@@ -168,7 +168,7 @@
     interface NamedNode : Term {
       attribute string termType;
       attribute string value;
-      boolean equals(Term other);
+      boolean equals(optional Term? other);
     };
     </pre>
 
@@ -193,7 +193,7 @@
     interface BlankNode : Term {
       attribute string termType;
       attribute string value;
-      boolean equals(Term other);
+      boolean equals(optional Term? other);
     };
     </pre>
 
@@ -222,7 +222,7 @@
       attribute string value;
       attribute string language;
       attribute NamedNode datatype;
-      boolean equals(Term other);
+      boolean equals(optional Term? other);
     };
     </pre>
 
@@ -263,7 +263,7 @@
     interface Variable : Term {
       attribute string termType;
       attribute string value;
-      boolean equals(Term other);
+      boolean equals(optional Term? other);
     };
     </pre>
 
@@ -289,7 +289,7 @@
     interface DefaultGraph : Term {
       attribute string termType;
       attribute string value;
-      boolean equals(Term other);
+      boolean equals(optional Term? other);
     };
     </pre>
 
@@ -333,7 +333,7 @@
       attribute Term predicate;
       attribute Term object;
       attribute Term graph;
-      boolean equals(Quad other);
+      boolean equals(optional Quad? other);
     };
     </pre>
 

If we reopen #142 I'll do it

@bergos
Copy link
Member

bergos commented Feb 11, 2019

We do not; the reference code has an optimization in there, which is not necessary to be spec'ed.

That's ok for me, I just want to be sure we all agree on it and there is no need to change the reference code to make explicit comparisons against null and undefined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants