adding ability to control jexl expression behavior when missing values #772

Merged
merged 8 commits into from Dec 9, 2016

4 participants
Contributor

lbergelson commented Dec 9, 2016

Adds the ability to control how JEXL expressions are handled when an expression refers to variables that are not present in the context.

When a value is missing, a JEXL expression can now be made to match, to not match, or to throw an exception.

Also refactored JEXLMap to make its methods less stateful.

lbergelson added some commits Dec 8, 2016

@lbergelson lbergelson refactoring jexl fd04b41
@lbergelson lbergelson add ability to treat missing values differently 108bbe7
@lbergelson lbergelson adding overloads to match to allow control of missing value behavior
d88c5c6
@lbergelson lbergelson making enum method package private
15b77d2

lbergelson requested a review from droazen Dec 9, 2016

Contributor

lbergelson commented Dec 9, 2016

@droazen Could you review this? It's for the GATK release.

Coverage Status

Coverage increased (+0.03%) to 70.417% when pulling 15b77d2 on lb_jexl_update into e69aff0 on master.

@droazen

Review complete, back to @lbergelson, merge after addressing comments.

@@ -223,6 +224,23 @@ public JexlVCMatchExp(String name, Expression exp) {
}
}
+ public enum JexlMissingValueTreatment {
@droazen

droazen Dec 9, 2016

Contributor

Add javadoc for this public enum, including the getMissingValue() method.

@droazen

droazen Dec 9, 2016

Contributor

Also it would be more logically placed in JEXLMap or its own file rather than embedded in VariantContextUtils, I think

@lbergelson

lbergelson Dec 9, 2016

Contributor

That was my initial thought, but JEXLMap is package private so it's not a good place to expose anything to a public API

@lbergelson

lbergelson Dec 9, 2016

Contributor

I can make it its own class, but it's only ever used by the match method in VariantContextUtils, so I'm not sure that's an inappropriate place for it.

@lbergelson

lbergelson Dec 9, 2016

Contributor

Documented. Do you want it in its own file, or should it stay where it is?

@droazen

droazen Dec 9, 2016

Contributor

I vote for a separate file to increase visibility, but ultimately up to you.

-import java.util.HashMap;
-import java.util.Map;
-import java.util.Set;
+import java.util.*;
/**
* This is an implementation of a Map of {@link JexlVCMatchExp} to true or false values.
* It lazily initializes each value as requested to save as much processing time as possible.
*/
class JEXLMap implements Map<JexlVCMatchExp, Boolean> {
@droazen

droazen Dec 9, 2016

Contributor

This class JEXLMap appears to have no direct unit testing at all. Can you add a JEXLMapUnitTest that at least covers the three different missing value behaviors?

@lbergelson

lbergelson Dec 9, 2016

Contributor

It's tested in VariantJEXLContextUnitTest, where I added tests for the 3 behaviors. They test the wrapper match methods instead of calling JEXLMap directly. I can duplicate them with direct calls to JEXLMap if you want?

@droazen

droazen Dec 9, 2016

Contributor

Yes, I'd like to see direct tests that mimic the current indirect ones.

+ this.howToTreatMissingValues = howToTreatMissingValues;
+ }
+
+ public JEXLMap(final Collection<JexlVCMatchExp> jexlCollection, final VariantContext vc, final Genotype g) {
@droazen

droazen Dec 9, 2016

Contributor

Add javadoc for all of these constructors, and document what the default is for handling of missing values when it's not explicitly specified. (Yeah, I know the class is package-private, but it still needs docs to try to guard against future clobberage)

@lbergelson

lbergelson Dec 9, 2016

Contributor

done

}
public JEXLMap(final Collection<JexlVCMatchExp> jexlCollection, final VariantContext vc) {
- this(jexlCollection, vc, null);
+ this(jexlCollection, vc, null, VariantContextUtils.JexlMissingValueTreatment.NO_MATCH);
@droazen

droazen Dec 9, 2016

Contributor

Declare JexlMissingValueTreatment.NO_MATCH as a named constant DEFAULT_MISSING_VALUE_TREATMENT at the top of the class, with a comment explaining what the default behavior does.

@lbergelson

lbergelson Dec 9, 2016

Contributor

done

@lbergelson

lbergelson Dec 9, 2016

Contributor

How I wish Java let you have parameter defaults...
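Since Java has no default parameter values, the usual workaround is the one requested in this thread: a named constant plus a constructor overload that delegates to the full constructor. The following is a minimal sketch of that pattern, not the merged code. `MissingValueTreatment` and `LazyExpressionMap` are hypothetical stand-ins for the enum and for JEXLMap, and expressions are plain strings here.

```java
import java.util.Collection;

// Hypothetical stand-in for the enum discussed in this PR.
enum MissingValueTreatment { MATCH, NO_MATCH, THROW }

// Hypothetical, simplified stand-in for JEXLMap.
class LazyExpressionMap {
    /** Default when the caller does not choose: an expression with a missing value does not match. */
    static final MissingValueTreatment DEFAULT_MISSING_VALUE_TREATMENT = MissingValueTreatment.NO_MATCH;

    private final Collection<String> expressions;
    private final MissingValueTreatment howToTreatMissingValues;

    /** Full constructor: the caller picks the missing-value behavior explicitly. */
    LazyExpressionMap(final Collection<String> expressions,
                      final MissingValueTreatment howToTreatMissingValues) {
        this.expressions = expressions;
        this.howToTreatMissingValues = howToTreatMissingValues;
    }

    /** Convenience overload: delegates with {@link #DEFAULT_MISSING_VALUE_TREATMENT}. */
    LazyExpressionMap(final Collection<String> expressions) {
        this(expressions, DEFAULT_MISSING_VALUE_TREATMENT);
    }

    MissingValueTreatment getMissingValueTreatment() {
        return howToTreatMissingValues;
    }

    int size() {
        return expressions.size();
    }
}
```

Keeping the default in one named constant means the overloads and the documentation cannot drift apart.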

- if (jexl.containsKey(o) && jexl.get(o) != null) {
- return jexl.get(o);
+ if (jexl.containsKey(key) && jexl.get(key) != null) {
+ return jexl.get(key);
@droazen

droazen Dec 9, 2016

Contributor

Would be good to save the value returned by the first jexl.get(key) call so that you don't call it twice in a row here.

@lbergelson

lbergelson Dec 9, 2016

Contributor

done
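A sketch of the single-lookup form suggested above, under simplifying assumptions: `LazyResults` is a hypothetical class with string keys and a `Predicate` standing in for expression evaluation. A non-null result from `Map.get` already implies the key is present, so one lookup is enough for the cached case.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical lazy cache: a null value means "not evaluated yet".
class LazyResults {
    private final Map<String, Boolean> results = new HashMap<>();
    private final Predicate<String> evaluator;
    int evaluations = 0; // counts how often the evaluator actually runs

    LazyResults(final Predicate<String> evaluator, final String... keys) {
        this.evaluator = evaluator;
        for (final String key : keys) {
            results.put(key, null);
        }
    }

    Boolean get(final String key) {
        // Look the key up once and reuse the result.
        final Boolean cached = results.get(key);
        if (cached != null) {
            return cached;
        }
        // Not evaluated yet: compute, store, and return.
        evaluations++;
        final boolean value = evaluator.test(key);
        results.put(key, value);
        return value;
    }
}
```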

@@ -115,11 +118,13 @@ public void putAll(Map<? extends JexlVCMatchExp, ? extends Boolean> map) {
* Initializes all keys with null values indicating that they have not yet been evaluated.
* The actual value will be computed only when the key is requested via {@link #get(Object)} or {@link #values()}.
*/
- private void initialize(Collection<JexlVCMatchExp> jexlCollection) {
- jexl = new HashMap<>();
+ private static Map<JexlVCMatchExp,Boolean> initialize(final Collection<JexlVCMatchExp> jexlCollection) {
@droazen

droazen Dec 9, 2016

Contributor

Add a @return to the javadoc for this method.

@lbergelson

lbergelson Dec 9, 2016

Contributor

done
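The change in this hunk, from a void method that assigns a field to a static method that returns the map, can be sketched as follows. This is an illustration with string keys in a hypothetical `LazyMapInit` class, not the code in the PR.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

class LazyMapInit {
    /**
     * Builds the backing map without touching any instance state.
     *
     * @param expressions the keys to register
     * @return a new map in which every key is mapped to null, meaning "not yet evaluated"
     */
    static Map<String, Boolean> initialize(final Collection<String> expressions) {
        final Map<String, Boolean> map = new HashMap<>(expressions.size());
        for (final String expression : expressions) {
            map.put(expression, null);
        }
        return map;
    }
}
```

Because the method is static and returns its result, a constructor can assign it to a final field, which the void version did not allow.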

@@ -131,19 +136,17 @@ private void initialize(Collection<JexlVCMatchExp> jexlCollection) {
* when the Jexl expression in {@code exp} fails to evaluate the JexlContext
* constructed with the input VC or genotype.
*/
- private void evaluateExpression(final JexlVCMatchExp exp) {
+ private boolean evaluateExpression(final JexlVCMatchExp exp) {
@droazen

droazen Dec 9, 2016

Contributor

Add @return to the method javadoc

@lbergelson

lbergelson Dec 9, 2016

Contributor

done

} catch (final JexlException.Variable e) {
- // if exception happens because variable is undefined (i.e. field in expression is not present), evaluate to FALSE
- jexl.put(exp,false);
+ return howToTreatMissingValues.getMissingValue();
@droazen

droazen Dec 9, 2016

Contributor

This critical try/catch definitely needs a couple lines of explanatory comments for posterity. In particular you need to document what would cause a JexlException.Variable exception to be thrown, and you need to document the fact that howToTreatMissingValues.getMissingValue() can itself throw.

@lbergelson

lbergelson Dec 9, 2016

Contributor

added explanation, changed getMissingValue -> getMissingValueOrExplode
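To make the two points from this thread concrete, here is a self-contained sketch of the catch-and-delegate pattern. It does not use Commons JEXL: `UndefinedVariableException` is a stand-in for `JexlException.Variable`, the expression is a toy "variable > threshold" comparison, and `MissingPolicy` with its constant names is hypothetical.

```java
import java.util.Map;

// Stand-in for JexlException.Variable so the sketch compiles without Commons JEXL.
class UndefinedVariableException extends RuntimeException {
    UndefinedVariableException(final String name) {
        super("undefined variable: " + name);
    }
}

// Hypothetical policy enum; constant names are illustrative only.
enum MissingPolicy {
    MATCH, NO_MATCH, THROW;

    /** @return the result to use for a missing value; THROW raises instead of returning */
    boolean getMissingValueOrExplode() {
        switch (this) {
            case MATCH:    return true;
            case NO_MATCH: return false;
            default:       throw new IllegalArgumentException("expression references a missing value");
        }
    }
}

class ThresholdExpression {
    /** Evaluates "variable > threshold" against a context of named values. */
    static boolean evaluate(final String variable, final double threshold,
                            final Map<String, Double> context, final MissingPolicy policy) {
        try {
            return lookup(variable, context) > threshold;
        } catch (final UndefinedVariableException e) {
            // Reached only when the expression names a variable that is absent from the context.
            // The policy call may itself throw (for THROW), so this catch block is not
            // guaranteed to return a value.
            return policy.getMissingValueOrExplode();
        }
    }

    private static double lookup(final String variable, final Map<String, Double> context) {
        final Double value = context.get(variable);
        if (value == null) {
            throw new UndefinedVariableException(variable);
        }
        return value;
    }
}
```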

@@ -154,13 +157,13 @@ private void evaluateExpression(final JexlVCMatchExp exp) {
* Create the internal JexlContext, only when required.
* This code is where new JEXL context variables should get added.
*/
- private void createContext() {
+ private JexlContext createContext() {
@droazen

droazen Dec 9, 2016

Contributor

Add @return to method javadoc

@lbergelson

lbergelson Dec 9, 2016

Contributor

done

lbergelson was assigned by droazen Dec 9, 2016

Contributor

lbergelson commented Dec 9, 2016

@droazen are the enum options clear? Should I rename them? Would TRUE, FALSE, THROW be clearer?

Contributor

lbergelson commented Dec 9, 2016

or maybe MATCH, MISMATCH, THROW?

Contributor

lbergelson commented Dec 9, 2016

I could also get rid of the method on the enum and just put a switch in the code that actually uses it. Might that be better?
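For comparison, the switch-at-the-call-site alternative floated here might look like the sketch below. The enum carries no behavior, and a hypothetical helper `MissingValueResolver.resultForMissingValue` does the dispatch. The constant names follow the MATCH, MISMATCH, THROW proposal from this thread.

```java
// Plain enum with no method: the behavior lives where the enum is consumed.
enum MissingValueChoice { MATCH, MISMATCH, THROW }

class MissingValueResolver {
    /**
     * @return true for MATCH, false for MISMATCH
     * @throws IllegalArgumentException for THROW
     */
    static boolean resultForMissingValue(final MissingValueChoice choice) {
        switch (choice) {
            case MATCH:
                return true;
            case MISMATCH:
                return false;
            case THROW:
                throw new IllegalArgumentException("expression references a missing value");
            default:
                // Guards against a constant added later without updating this switch.
                throw new AssertionError("unhandled choice: " + choice);
        }
    }
}
```

The trade-off is that every switch has to be revisited when a constant is added, whereas a method on the enum keeps each constant next to its behavior.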

Contributor

vdauwera commented Dec 9, 2016

FWIW I like MATCH, MISMATCH, THROW more than TRUE, FALSE, THROW

Contributor

lbergelson commented Dec 9, 2016

@vdauwera I do too, changed to that.

Contributor

lbergelson commented Dec 9, 2016

@droazen Changes made, back to you with a few questions

@lbergelson lbergelson responding to comments
2d2d78e
Contributor

droazen commented Dec 9, 2016 edited

@lbergelson Maybe TREAT_AS_MATCH, TREAT_AS_MISMATCH, and THROW would be clearest. "TRUE, FALSE, THROW" would be significantly worse/less clear.

Method on the enum is fine I think, as long as it's clearly indicated that it can blow up.

Contributor

droazen commented Dec 9, 2016 edited

@lbergelson Questions answered, back to you, merge after addressing.

Coverage Status

Coverage increased (+0.03%) to 70.419% when pulling 2d2d78e on lb_jexl_update into e69aff0 on master.

lbergelson added some commits Dec 9, 2016

@lbergelson lbergelson updated tests 85736cb
@lbergelson lbergelson moving JexlMissingValueTreatment to top level
4988b28
@lbergelson lbergelson fixing typo
8509ba2

Coverage Status

Coverage increased (+0.02%) to 70.409% when pulling 8509ba2 on lb_jexl_update into e69aff0 on master.


@lbergelson lbergelson merged commit 5a2d7a7 into master Dec 9, 2016

3 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed
coverage/coveralls Coverage increased (+0.02%) to 70.409%

lbergelson deleted the lb_jexl_update branch Dec 9, 2016
