Join GitHub today
GitHub is home to over 20 million developers working together to host and review code, manage projects, and build software together.
adding ability to control jexl expression behavior when missing values #772
Conversation
lbergelson
added some commits
Dec 8, 2016
lbergelson
requested a review
from droazen
Dec 9, 2016
|
@droazen Could you review this, it's for the gatk release |
coveralls
commented
Dec 9, 2016
droazen
requested changes
Dec 9, 2016
Review complete, back to @lbergelson, merge after addressing comments.
| @@ -223,6 +224,23 @@ public JexlVCMatchExp(String name, Expression exp) { | ||
| } | ||
| } | ||
| + public enum JexlMissingValueTreatment { |
droazen
Dec 9, 2016
Contributor
Add javadoc for this public enum, including the getMissingValue() method.
droazen
Dec 9, 2016
Contributor
Also it would be more logically placed in JEXLMap or its own file rather than embedded in VariantContextUtils, I think
lbergelson
Dec 9, 2016
Contributor
That was my initial thought, but JEXLMap is package private so it's not a good place to expose anything to a public API
lbergelson
Dec 9, 2016
Contributor
I can make it it's own class, but it's only ever used by the match method in VariantContextUtils, so I'm not sure it's an inapproprate place for it.
lbergelson
Dec 9, 2016
Contributor
documented, Do you want it in it's own file or to stay where it is?
droazen
Dec 9, 2016
Contributor
I vote for a separate file to increase visibility, but ultimately up to you.
| -import java.util.HashMap; | ||
| -import java.util.Map; | ||
| -import java.util.Set; | ||
| +import java.util.*; | ||
| /** | ||
| * This is an implementation of a Map of {@link JexlVCMatchExp} to true or false values. | ||
| * It lazily initializes each value as requested to save as much processing time as possible. | ||
| */ | ||
| class JEXLMap implements Map<JexlVCMatchExp, Boolean> { |
droazen
Dec 9, 2016
Contributor
This class JEXLMap appears to have no direct unit testing at all. Can you add a JEXLMapUnitTest that at least covers the three different missing value behaviors?
lbergelson
Dec 9, 2016
Contributor
It's tested directly in VariantJEXLContextUnitTest, where I added tests for the 3 behaviors. They test the wrapper match methods instead of testing it directly, I can duplicate them with direct calls to JEXLMap if you want?
droazen
Dec 9, 2016
Contributor
Yes, I'd like to see direct tests that mimic the current indirect ones.
| + this.howToTreatMissingValues = howToTreatMissingValues; | ||
| + } | ||
| + | ||
| + public JEXLMap(final Collection<JexlVCMatchExp> jexlCollection, final VariantContext vc, final Genotype g) { |
droazen
Dec 9, 2016
Contributor
Add javadoc for all of these constructors, and document what the default is for handling of missing values when it's not explicitly specified. (Yeah, I know the class is package-private, but it still needs docs to try to guard against future clobberage)
| } | ||
| public JEXLMap(final Collection<JexlVCMatchExp> jexlCollection, final VariantContext vc) { | ||
| - this(jexlCollection, vc, null); | ||
| + this(jexlCollection, vc, null, VariantContextUtils.JexlMissingValueTreatment.NO_MATCH); |
droazen
Dec 9, 2016
Contributor
Declare JexlMissingValueTreatment.NO_MATCH as a named constant DEFAULT_MISSING_VALUE_TREATMENT at the top of the class, with a comment explaining what the default behavior does.
| - if (jexl.containsKey(o) && jexl.get(o) != null) { | ||
| - return jexl.get(o); | ||
| + if (jexl.containsKey(key) && jexl.get(key) != null) { | ||
| + return jexl.get(key); |
droazen
Dec 9, 2016
Contributor
Would be good to save the value returned by the first jexl.get(key) call so that you don't call it twice in a row here.
| @@ -115,11 +118,13 @@ public void putAll(Map<? extends JexlVCMatchExp, ? extends Boolean> map) { | ||
| * Initializes all keys with null values indicating that they have not yet been evaluated. | ||
| * The actual value will be computed only when the key is requested via {@link #get(Object)} or {@link #values()}. | ||
| */ | ||
| - private void initialize(Collection<JexlVCMatchExp> jexlCollection) { | ||
| - jexl = new HashMap<>(); | ||
| + private static Map<JexlVCMatchExp,Boolean> initialize(final Collection<JexlVCMatchExp> jexlCollection) { |
| @@ -131,19 +136,17 @@ private void initialize(Collection<JexlVCMatchExp> jexlCollection) { | ||
| * when the Jexl expression in {@code exp} fails to evaluate the JexlContext | ||
| * constructed with the input VC or genotype. | ||
| */ | ||
| - private void evaluateExpression(final JexlVCMatchExp exp) { | ||
| + private boolean evaluateExpression(final JexlVCMatchExp exp) { |
| } catch (final JexlException.Variable e) { | ||
| - // if exception happens because variable is undefined (i.e. field in expression is not present), evaluate to FALSE | ||
| - jexl.put(exp,false); | ||
| + return howToTreatMissingValues.getMissingValue(); |
droazen
Dec 9, 2016
Contributor
This critical try/catch definitely needs a couple lines of explanatory comments for posterity. In particular you need to document what would cause a JexlException.Variable exception to be thrown, and you need to document the fact that howToTreatMissingValues.getMissingValue() can itself throw.
lbergelson
Dec 9, 2016
Contributor
added explanation, changed getMissingValue -> getMissingValueOrExplode
| @@ -154,13 +157,13 @@ private void evaluateExpression(final JexlVCMatchExp exp) { | ||
| * Create the internal JexlContext, only when required. | ||
| * This code is where new JEXL context variables should get added. | ||
| */ | ||
| - private void createContext() { | ||
| + private JexlContext createContext() { |
lbergelson
was assigned
by droazen
Dec 9, 2016
|
@droazen are the enum options clear? Should I rename them? Would TRUE, FALSE, THROW be clearer? |
|
or maybe MATCH, MISMATCH, THROW? |
|
I could also get rid of the method on the enum and just put a switch in in the code that actually uses it, might be better? |
|
FWIW I like MATCH, MISMATCH, THROW more than TRUE, FALSE, THROW |
|
@vdauwera I do too, changed to that. |
|
@droazen Changes made, back to you with a few questions |
|
@lbergelson Maybe Method on the enum is fine I think, as long as it's clearly indicated that it can blow up. |
|
@lbergelson Questions answered, back to you, merge after addressing. |
lbergelson commentedDec 9, 2016
adding the ability to control how jexl expressions are handled when the expression contains literals that are not present in the context
it is now possible to have a jexl expression either match, not match, or throw an exception if there is a missing value
did some refactoring of JexlMap to make methods less stateful