Non uniform null behaviour #830
In fact I need to identify when an expression has an undefined result...
I'll try to
I'll post here if this approach succeeds...
I also believe the general conversion logic that converts
The current conversion makes
One case where null could mean
Thanks guys, these are good arguments. Ok let's change the behavior of
Thinking about the changes that we need to do:
Are there any more behaviors that need to be changed regarding
We need an implementation for parse.set and parse.eval
let expression = "(calcium_mgdL < 7.2) or (calcium_mmolL > 3.5)"
// TRUE because null is evaluated to zero (and we expect FALSE).
parse.set("calcium_mgdL", null); // We want something FALSE or ignored.
// TRUE
parse.set("calcium_mmolL", 4.1);
// Expected behavior:
// The left side of the expression before OR should evaluate to FALSE.
// The right side of the expression after OR is evaluated to TRUE.
parse.eval(expression) === true // (Even though the first part will evaluate 0 < 7.2)
The current behavior when using undefined instead of null fails the whole expression, which is something we do not want.
The workaround is to write the expression so that it handles null/zero as boolean false.
let workaroundExpression = "(calcium_mgdL and (calcium_mgdL < 7.2)) or (calcium_mmolL and (calcium_mmolL > 3.5))"
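The same guard pattern can be tried out in plain JavaScript, where `null < 7.2` is likewise `true` because `null` is coerced to `0`. This is only a sketch using native operators, not the expression parser discussed above; `hasAbnormalCalcium` is a made-up name for illustration.

```javascript
// Hypothetical helper mirroring the workaround expression with native operators.
// A missing (null) reading is falsy, so its comparison is skipped.
function hasAbnormalCalcium(calcium_mgdL, calcium_mmolL) {
  return Boolean(
    (calcium_mgdL && calcium_mgdL < 7.2) ||
    (calcium_mmolL && calcium_mmolL > 3.5)
  );
}

// Without the guard, null is coerced to 0 and the comparison is true.
console.log(null < 7.2);                     // true
console.log(hasAbnormalCalcium(null, 4.1));  // true  (right side only)
console.log(hasAbnormalCalcium(null, 2.4));  // false (left side ignored)
```

Note that the guard also treats a genuine reading of 0 as missing, which is the same trade-off the workaround expression makes.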
Having recently been burned by null and the native Math.min, I have become convinced that null should not be coerced to anything but itself. The native Math.min and Math.max do the following:
var x; // variable to be clipped
This "feature" of native Math wasted hours of my time, since I was relying on the assumption that operations with null return null. That is indeed the case with "+" and "-". My brain simply applied induction to arrive at the assumption that max and min would behave in a similar civilized manner. They do not.
I use min and max as mathematical operators (like +), not as statistical operators.
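To make the difference concrete, here is a small sketch of what null-propagating min and max could look like next to the native ones. The names `minOrNull`, `maxOrNull`, and `clip` are hypothetical and not part of any library; the check `v == null` also treats `undefined` as missing, which is an assumption of this sketch.

```javascript
// Native behaviour: null is coerced to 0.
console.log(Math.min(null, 5));   // 0
console.log(Math.max(null, -5));  // 0

// Hypothetical null-propagating variants: any missing argument yields null.
function minOrNull(...values) {
  return values.some(v => v == null) ? null : Math.min(...values);
}
function maxOrNull(...values) {
  return values.some(v => v == null) ? null : Math.max(...values);
}

// Clip x to the range [lo, hi]; a null x stays null instead of becoming 0.
function clip(x, lo, hi) {
  return maxOrNull(lo, minOrNull(x, hi));
}

console.log(clip(12, 0, 10));    // 10
console.log(clip(null, 0, 10));  // null
```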
For the statistical use of min and max with collections of things, I would even prefer to do the following at length since the intent reads clearly:
Math.min(...lotsOfNumbers.map(n => n == null ? 0 : n)); // spread is needed: Math.min takes numbers as arguments, not an array
If we need a statistical min and max that assume 0, I suggest we introduce new functions to handle the expected statistical behavior. For example, we might have minStat and maxStat for null coercion to 0.
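As a rough illustration of that suggestion, the two statistical variants might look like the following. This is only a sketch: `minStat` and `maxStat` are the names proposed above, but the array-based signature is an assumption.

```javascript
// Hypothetical statistical variants: null (or undefined) counts as 0.
function minStat(values) {
  return Math.min(...values.map(v => (v == null ? 0 : v)));
}

function maxStat(values) {
  return Math.max(...values.map(v => (v == null ? 0 : v)));
}

console.log(minStat([3, null, 7]));    // 0
console.log(maxStat([-3, null, -7]));  // 0
console.log(minStat([3, 5, 7]));       // 3
```

Keeping the coercion in separately named functions leaves the plain operators free to treat null as null.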
In this case, I think the good solution is what the MS Excel does. It just skips the
If you have a rare use case, when you have to treat
I think we all agree that the behavior of
To summarize the thoughts so far:
Any help with implementing the new behavior will be welcome. Anyone interested in picking it up?
On the other hand
Can the behaviour be tied to the
hm, I'm not sure about the behavior of operators
I find it a strong argument that when some value is
As for the behavior of statistics functions like
I think that the most common use case for a function like
If you think about it, the way I see it, each function treats
About the addition and multiplication,
This is how MS Excel and similar spreadsheet programs work. I don't know about Matlab and other software, but it seems quite logical to me.
About the implementations, I think it would be best to handle both cases with one function, and a global config option or an optional parameter. Multiple implementations with different names would be messy.
Either way, in my opinion, the default behaviour should be the one that ignores
But here are my 2 cents (I am gonna use the
I think that implementing the default
If the default behaviour of
Of course I can always check before running a
Instead, if the
I think that if the developer wants to ignore null values, he should explicitly agree on that, either by using a separate function (eg.
I think this behavior should apply to every function,
Summarizing, here is what I would expect to see (substitute nan-versions of functions with options if you want):
I also would like to add that Matlab, Python (numpy) and R, 3 of the most used programming languages for data science, all deal with
@sfescape ok that's indeed also my preferred solution, passing along with the function itself (not something globally configurable)
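One way to picture an option passed along with the function itself is sketched below in plain JavaScript. Everything here is hypothetical: the standalone `mean` function, the `ignoreNull` option name, and the error messages are illustrative assumptions, not the library's API.

```javascript
// Hypothetical mean with a per-call option; nothing is configured globally.
function mean(values, { ignoreNull = false } = {}) {
  // Only drop nulls when the caller explicitly asks for it.
  const items = ignoreNull ? values.filter(v => v !== null) : values;

  if (items.some(v => v === null)) {
    throw new TypeError("Cannot calculate mean, unexpected null value");
  }
  if (items.length === 0) {
    throw new Error("Cannot calculate mean of an empty array");
  }
  return items.reduce((sum, v) => sum + v, 0) / items.length;
}

console.log(mean([1, 2, 3]));                              // 2
console.log(mean([1, 2, null, 3], { ignoreNull: true }));  // 2
// mean([1, 2, null, 3]) throws a TypeError
```

With this shape, the strict behaviour is the default and ignoring null is an explicit choice at each call site.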
@honestserpent great that you want to contribute! I think this topic (#830) is not the best one to start with, because it requires changing code in quite a few different places and levels of the project. An easy one to start with is, for example, #964. Or you could go deep into the maths and try to improve the performance of determinant, see #908. There are plenty of issues that you can pick up; is there anything of particular interest to you?
added a commit
Jan 23, 2018
math.mean([1, 2, null, 3]) // TypeError: Cannot calculate mean, unexpected type of argument (type: null, value: null)

// Strategy 1: convert null to 0
math.mean(math.number([1, 2, null, 3])) // 1.5

// Strategy 2: filter null values
math.mean(math.filter([1, 2, null, 3], x => x !== null)) // 2