Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate reports bogus type mismatch error when checking XPath date arguments #405

Open
tiritea opened this issue Feb 13, 2019 · 11 comments
Labels
Milestone

Comments

@tiritea
Copy link
Contributor

tiritea commented Feb 13, 2019

Software versions

javaRosa as it exists in Validate as it exists in XSLForm Online v1.3.1

Problem description

Validate appears to be performing a static evaluation of XPath expressions as part of its validation of XPath date/time/dateTime function arguments. But because it performs a static evaluation, it does not pickup valid runtime values, and as a consequence can throw a bogus type-mismatch error when it sees an invalid dateTime string (even though in actual runtime this string would contain a valid date string).

Steps to reproduce the problem

see https://forum.opendatakit.org/t/validate-issue-for-decimal-date-time/17710/6

specifically:

The simplest testcase I was able to come up with exhibiting the behavior is the above, namely

date:	year, default=2019
result = decimal-date-time(concat(${year},"-01-01"))

which is valid and runs fine in Collect, but causes a form type-mismatch error with Validate. I suspect this may only apply to date/time functions type-checking, but until we can isolate it in the code I cant be 100% certain this wont pop up in other non-date related cases.

Expected behavior

Validate should either not attempt to (statically) evaluate XPath expression in order to guess their type, for the purpose of type checking function arguments, or must fully evaluate all XPath expressions (infeasible?), including any defaults.

Other information

workaround requires addition of an explicit (and unnecessary) coalesce() call to effectively initialize datetime function arguments to something that will result in a valid datetime string, just for the purpose of validation.

@ggalmazor ggalmazor added the bug label Feb 13, 2019
@ggalmazor ggalmazor added this to the v2.14.0 milestone Feb 13, 2019
@tiritea
Copy link
Contributor Author

tiritea commented Feb 13, 2019

Please note that I havent started to try and figure out where in javaRosa/Validate it is doing this supposed static type-checking evaluation. So although it is readily reproducible for dateTime function argument checking, I dont know whether this 'static' analysis might also have failure scenarios wrt to other data types. I would want to be sure it doesnt before closing out this bug, but till we can isolate the actual code involved its probably not possible to say categorically it doesnt.

@lognaturel
Copy link
Member

I think what's going on here is that Validate steps through the form as if it were a form-filling client. It never gets a year value so it attempts to evaluate decimal-date-time("-01-01") and that leads to the crash.

Related to #481.

@tiritea
Copy link
Contributor Author

tiritea commented Sep 17, 2019

my suspicion is that Validate may just be evaluating each binding XPath expression in situ, without actually resolving any (?) instance XML element references to their actual value (substitutes '' instead?). Hence doesn't pickup any defaults.

@lognaturel
Copy link
Member

Can you please share your full form? Did you try setting a default of e.g. 2019 for year and still get that behavior?

@tiritea
Copy link
Contributor Author

tiritea commented Sep 17, 2019

try this:

validate.xlsx

BTW XLSForm Online used to throw a type mismatch error, just tried it now and its barfing with "java not found" on it (!) ha. I assume 'cause its now crashing on it...?

@lognaturel
Copy link
Member

Form conversion with XLSForm Online v1.6.1 at http://opendatakit.org/xlsform/ succeeds. It also validates from Validate 1.13.2 directly. I can reproduce the type mismatch if I omit the default, however.

I now see that you had listed a default in your original post but I'm wondering whether somehow you might not have it in the form?

I can't really think of any action to take, here. Perhaps there should be better documentation about the default requirement in a case like this but I don't know where that could go.

@tiritea
Copy link
Contributor Author

tiritea commented Sep 26, 2019

Let me try to reproduce.

@lognaturel
Copy link
Member

lognaturel commented Sep 26, 2019

Looking back at the forum thread this came from, it looks like there was no default specified.

For better or worse, Collect doesn't show an exception on load when the argument is "-01-01" because type exceptions on first form computation are swallowed. However, an exception will be shown if the year value is changed to another invalid value like -1, "cheese", etc, or changed to a valid year and then cleared. That exception can be dismissed by the enumerator.

Ideally, form designers would include appropriate constraints so that the runtime exceptions would never be needed but I think they're useful as fallbacks.

In Validate, any type of runtime exception hit while traversing the form make validation fail. We could silence type exceptions because it's true that they are likely to fail in the absence of defaults. I argue that it is helpful to users to get a hint that something isn't quite right and that they should provide defaults and constraints. Defaulting to the current year and constraining the year between 0 and the current year (or year + 5 or whatever is appropriate for the use case) would make for a better form design.

No matter what, the current exception messages are cryptic. They should at the very least be explicit about what argument value caused the problem. I've proposed wording at #494.

@tiritea
Copy link
Contributor Author

tiritea commented Sep 26, 2019

The issue was that even with a default, for ${year], Validate was throwing an error. That appears to have been fixed recently along the way somehwere (?)...

But the root issue - as I see it - remains, that Validate will flat out reject a valid form containing

decimal-date-time(concat(${year},"-01-01"))

Even if, say, ${year} is a required number question [unless you work-around the bug by giving ${year} a default, or replacing it with a coalesce(${year},"2019"), etc....].

I do see the benefit in performing a static analysis of all calculations - when you attempt to load a form - as a means to pickup errors. But fundamentally I dont believe you can categorically state in the general case that a calculation is invalid by performing a static analysis of the calcuation; elements in an XPath expressions are inherently dynamically type cast when you actually run them. So specific elements that may, for example, appear to be null initially (and hence may cause the expression not to evaluate correctly) may in fact be perfectly OK when you actually run the form normally.

I dont think type check mismatches are a particularly common user error [correct me if I'm wrong] so would it be sufficient for Validate to instead flag these as warnings instead of fatal errors? (since Validate cant actually be certain they are in fact actual errors, eg above example).

Note, the fact there is now a non-intrusive workaround - ie being able to set the default - is certianly a lot better state of affairs than having to rewrite calculations using coalesce(), which was very intrusive. Which is to say, if you push me on the issue, I'll probably back down... :-)

@tiritea
Copy link
Contributor Author

tiritea commented Sep 26, 2019

Ideally, form designers would include appropriate constraints so that the runtime exceptions would never be needed

As it happens, making the calculation only relevant if ${year} != '' has no effect - Validate still barfs. Basically, you have to fake out Validate with a default or coalesce workaround. Again, this goes to the root of the problem IMO: static analysis of dynamically interpreted expressions.

@danbjoseph
Copy link
Member

This seems to related to solving the issue raised in this thread.

I have simplified a form showing the issue using the example of collecting an employee identifier, using it to pull the date of hire from an external file, and then calculating the days since hire. type_mismatch.xlsx

Running the form through XLSForm Online v2.x throws an error.

XPath evaluation: type mismatch   
The value "mycsv" can't be converted to a date.

When 'mycsv' in the XLSForm is the reference within a pulldata to a filename, and the result of the pull data could very well be a properly formatted string such as "2010-08-02" or whatnot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants