ERROR [error.array.value_out_of_min_max_range] evaluation is not correct #529

katecrombie · 2022-07-21T18:24:22Z

When validating a pds4.folder of image type arrays, several errors were generated during the evaluation of the min/max array values. Some were correct, others were not.

ERROR [error.array.value_out_of_min_max_range] array 1, location (349, 345): Value is greater than the maximum value in the label (max=0.0014194, got=0.0014194006).

This is technically correct, and we can fix it by increasing the significant digits in the data product.

ERROR [error.array.value_out_of_min_max_range] array 1, location (386, 185): Value is greater than the maximum value in the label (max=0.0702686, got=0.07026859).

This is not correct.

My assumption is that validate is looking at the number of significant digits, and is giving incorrect results when the number of digits is different.

I can attach test case data products if needed.

Engineering Details

Tightly coupled with #514 and #434

see #574 for additional info

jordanpadams · 2022-07-21T19:14:27Z

@katecrombie example data products would be great.

can you also provide the version of Validate you are running?

validate -V

jordanpadams · 2022-07-21T19:15:20Z

@katecrombie also, if you could create an issue using the Bug template in the future, that would be great so it gets assigned to me right away! otherwise this can slip through the cracks :-)

katecrombie · 2022-07-21T19:22:14Z

Using Validate 2.2.0

katecrombie · 2022-07-21T19:22:45Z

roger that on the bug template (oops)

jordanpadams · 2022-07-22T15:13:10Z

@katecrombie do you have an example product of where this is failing?

katecrombie · 2022-07-22T16:26:58Z

erospdart_lblexample2.zip

report.txt is the validation report for these files. A few pass. a few fail.

jordanpadams · 2022-08-25T17:42:45Z

@katecrombie do you have the XML labels for these products? the ZIP you sent only include the LBLs.

katecrombie · 2022-08-25T17:47:28Z

erospdart_lblexample_20220429.zip

This zip should have the .lbl, .fit and .xml files. The report.txt file is a validation report that shows the failures.

jordanpadams · 2022-10-31T06:14:55Z

@katecrombie those 2 zips do not appear to contain any XML labels? Just the lbl and .fit files.

katecrombie · 2022-10-31T16:50:50Z

D'oh.... Third time is the charm.
erospdart_lblexample_20220429_2.zip

Fix 1 ----- Original compare method had some confusions between precision and scale. We should be using scale for handling doubles with many significant digits. > The precision is the number of digits in the unscaled value. For instance, for the number 123.45, the precision returned is 5. > If zero or positive, the scale is the number of digits to the right of the decimal point. If negative, the unscaled value of the number is multiplied by ten to the power of the negation of the scale. For example, a scale of -3 means the unscaled value is multiplied by 1000. Additionally, we need to be able to handle scale appropriately. Previous functionality would decrease the scale to the least between the min/max value and the actual value. This obviously doesn't make sense because when we do this, comparisons like this will be invalid: actual value: 0.0014194006 scaled actual value: 0.0014194 max value from label: 0.0014194 as you can see, if we compare the max value from the label to the scaled actual value, they are equal. but in reality they are not. this should raise an error. Fix 2 ----- that being said, we can't always trust the full scaled values that are calculated because doubles can introduce false precision. if we just change the algorithm to use whichever is the largest scale (aka number of decimal places) we were taking values that sometimes introduced false precision. thanks for our regression test in #435, we were able to see that the solution is not as simple as increasing the scale. so after some reading, it looks like using the BigDecimal(String) initializer as much as possible, and avoiding calculations/conversions to double in memory is the best way to avoid false precision. there are still some possible issues with false precision in the code, but this should resolve most of them. resolves #529 References: * https://blogs.oracle.com/javamagazine/post/four-common-pitfalls-of-the-bigdecimal-class-and-how-to-avoid-them * https://stackoverflow.com/questions/35435691/bigdecimal-precision-and-scale

al-niessner · 2022-12-22T19:15:50Z

@jordanpadams @nutjob4life @tloubrieu-jpl

Is this an error that should be looked at or a missing file?

  FAIL: file:/tmp/test/m0157417198f3_2p_iof_gbl.xml
      ERROR  [error.label.context_ref_not_found]   line 84: 'Context product not found: urn:nasa:pds:context:instrument:near.msi
        8 product validation(s) completed

jordanpadams · 2022-12-23T18:36:00Z

@katecrombie FYI, we are fairly certain max=0.0014194, got=0.0014194006 is a spurious error. 0.0014194006 is just an issue with the significant digits and how doubles are stored in memory.

rchenatjpl · 2023-02-03T22:19:35Z

@jordanpadams Here's an extreme example I created:
ERROR [error.array.value_out_of_min_max_range] array 1, location (400, 246): Value is less than the minimum value in the label (min=5.0E-5, got=0.005).
If that's unclear, multiply both numbers by 10000, and validate is saying 5 is greater than 500.

The two attached .FITs files differ by 1 bit. At the cell in question,
m5under.fit: 0x3ba3d70a
m5over.fit: 0x3ba3d70b
which in decimal are 0.00499999988824 and 0.0050000003539. So yes, it's a rounding problem, but note that the value in both labels has exactly 1 significant digit (brackets removed):
minimum 0.00005 /minimum

I think the bug is important. If the data values are like 123.45678, then this error won't fire much. Buf if the values are like 0.000012345678, this error will fire often. And some datasets will be in the latter domain.
Archive.zip

al-niessner · 2023-02-06T20:51:47Z

@jordanpadams @rchenatjpl

Not surprised this is a problem when min is represented by scientific notation. The original code, which was tweaked, compares using strings on the assumption they are in the form 0.00005 not 5.0E-5. The most probable reason for this is floating point representation is not perfect so 0.00005 can turn into 0.0000501 or 0.0000499 or something else nearby. String comparisons were then used to prevent this representation error but it makes processing forms other than direct decimal more difficult.

No matter the approach, there will be problems because the continuous number line cannot be accurately represented for any number in a computer. There are a couple ov choices:

convert things to doubles and live with the representation errors that introduced.
convert things to doubles then forever fit the code for corner cases.
convert all numbers to direct decimal formats (not scientific format)
1. using Double to do the conversion introduces same problems as 1 and/or 2
2. do string manipulation to move decimal point around padding with 0 as needed

Any of those can be done, but they are fraught with peril and I need to which peril is the desired peril.

rchenatjpl · 2023-02-07T08:01:41Z

golish.zip

I hope the attached file goes through. It's 700+ MB. It has a bundle that I've pared down to only those data files for which validate incorrectly flagged bad values. Below is validate's output.

`
PDS Validate Tool Report

Configuration:
Version 3.1.0-SNAPSHOT
Date 2023-02-07T06:47:27Z

Parameters:
Targets [file:/Users/rchen/Desktop/pds_review/]
Rule Type pds4.bundle
Severity Level WARNING
Recurse Directories true
File Filters Used [*.xml, *.XML]
Data Content Validation on
Product Level Validation on
Allow Unlabeled Files false
Max Errors 100000
Registered Contexts File /Users/rchen/PDS4tools/validate/resources/registered_context_products.json

Product Level Validation Results

PASS: file:/Users/rchen/Desktop/pds_review/bundle_nearmsi_gbl.xml
1 product validation(s) completed

PASS: file:/Users/rchen/Desktop/pds_review/data/collection_data.xml
2 product validation(s) completed

FAIL: file:/Users/rchen/Desktop/pds_review/data/m0154651923f6_2p_cif_gbl.xml
Begin Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0154651923f6_2p_cif_gbl.fit
ERROR [error.array.value_out_of_min_max_range] array 1, location (400, 246): Value is less than the minimum value in the label (min=0.007827, got=0.007827956).
End Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0154651923f6_2p_cif_gbl.fit
3 product validation(s) completed

FAIL: file:/Users/rchen/Desktop/pds_review/data/m0156835546f4_2p_iof_gbl.xml
Begin Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0156835546f4_2p_iof_gbl.fit
ERROR [error.array.value_out_of_min_max_range] array 1, location (50, 365): Value is less than the minimum value in the label (min=0.009938, got=0.009938589).
End Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0156835546f4_2p_iof_gbl.fit
4 product validation(s) completed

FAIL: file:/Users/rchen/Desktop/pds_review/data/m0154651921f4_2p_cif_gbl.xml
Begin Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0154651921f4_2p_cif_gbl.fit
ERROR [error.array.value_out_of_min_max_range] array 1, location (165, 299): Value is less than the minimum value in the label (min=1.18E-4, got=1.18906144E-4).
End Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0154651921f4_2p_cif_gbl.fit
5 product validation(s) completed

FAIL: file:/Users/rchen/Desktop/pds_review/data/m0154651915f1_2p_cif_gbl.xml
Begin Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0154651915f1_2p_cif_gbl.fit
ERROR [error.array.value_out_of_min_max_range] array 1, location (322, 51): Value is less than the minimum value in the label (min=0.008009, got=0.00800981).
End Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0154651915f1_2p_cif_gbl.fit
6 product validation(s) completed

FAIL: file:/Users/rchen/Desktop/pds_review/data/m0125249036f2_2p_iof_gbl.xml
Begin Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0125249036f2_2p_iof_gbl.fit
ERROR [error.array.value_out_of_min_max_range] array 1, location (242, 287): Value is greater than the maximum value in the label (max=0.087356, got=0.08735539).
End Content Validation: file:/Users/rchen/Desktop/pds_review/data/m0125249036f2_2p_iof_gbl.fit
7 product validation(s) completed

PDS4 Bundle Level Validation Results

FAIL: file:/Users/rchen/Desktop/pds_review/data/.DS_Store
ERROR [error.file.name_has_invalid_characters] File name uses invalid character
WARNING [warning.file.not_referenced_in_label] File is not referenced by any label
1 integrity check(s) completed

PASS: file:/Users/rchen/Desktop/pds_review/data/collection_data.xml
2 integrity check(s) completed

FAIL: file:/Users/rchen/Desktop/pds_review/.DS_Store
ERROR [error.file.name_has_invalid_characters] File name uses invalid character
WARNING [warning.file.not_referenced_in_label] File is not referenced by any label
3 integrity check(s) completed

PASS: file:/Users/rchen/Desktop/pds_review/bundle_nearmsi_gbl.xml
4 integrity check(s) completed

PASS: file:/Users/rchen/Desktop/pds_review/data/m0154651923f6_2p_cif_gbl.xml
5 integrity check(s) completed

PASS: file:/Users/rchen/Desktop/pds_review/data/m0154651921f4_2p_cif_gbl.xml
6 integrity check(s) completed

PASS: file:/Users/rchen/Desktop/pds_review/data/m0156835546f4_2p_iof_gbl.xml
7 integrity check(s) completed

PASS: file:/Users/rchen/Desktop/pds_review/data/m0154651915f1_2p_cif_gbl.xml
8 integrity check(s) completed

PASS: file:/Users/rchen/Desktop/pds_review/data/m0125249036f2_2p_iof_gbl.xml
9 integrity check(s) completed

Summary:

7 error(s)
2 warning(s)

Product Validation Summary:
2 product(s) passed
5 product(s) failed
0 product(s) skipped

Referential Integrity Check Summary:
7 check(s) passed
2 check(s) failed
0 check(s) skipped

Message Types:
5 error.array.value_out_of_min_max_range
2 error.file.name_has_invalid_characters
2 warning.file.not_referenced_in_label

End of Report
Completed execution in 6256 ms

`

al-niessner · 2023-02-07T16:34:41Z

@rchenatjpl

Cannot reproduce your errors. Please check your validate that it has code updates from 529 in them and/or that golish.zip is correct. I may not have understood your checks, but I get no errors:

validate /tmp/golish/bundle_nearmsi_gbl.xml

PDS Validate Tool Report

Configuration:
   Version                       3.2.0-SNAPSHOT
   Date                          2023-02-07T16:29:24Z

Parameters:
   Targets                       [file:/tmp/golish/bundle_nearmsi_gbl.xml]
   Severity Level                WARNING
   Recurse Directories           true
   File Filters Used             [*.xml, *.XML]
   Data Content Validation       on
   Product Level Validation      on
   Max Errors                    100000
   Registered Contexts File      /home/niessner/Projects/PDS/validate/src/main/resources/util/registered_context_products.json



Product Level Validation Results

  PASS: file:/tmp/golish/bundle_nearmsi_gbl.xml
        1 product validation(s) completed
Feb 07, 2023 8:29:27 AM gov.nasa.pds.tools.validate.task.ValidationTask execute
INFO: Validation complete for location 'file:/tmp/golish/bundle_nearmsi_gbl.xml'

Summary:

  0 error(s)
  0 warning(s)

  Product Validation Summary:
    1          product(s) passed
    0          product(s) failed
    0          product(s) skipped

  Referential Integrity Check Summary:
    0          check(s) passed
    0          check(s) failed
    0          check(s) skipped


End of Report
Completed execution in 4279 ms

rchenatjpl · 2023-02-07T18:40:17Z

ok, 3.1.1 did validate both my last zips correctly.

al-niessner · 2023-02-07T19:46:21Z

@rchenatjpl

You are okay with closing this ticket again then?

rchenatjpl · 2023-02-07T19:54:47Z

yes, we can close this. I should have checked the latest version. OTOH, if this wasn't specifically addressed in a build, it might pop up again.

miguelp1986 · 2023-03-09T21:43:08Z

Testrail: https://cae-testrail.jpl.nasa.gov/testrail/index.php?/cases/view/1273422

jordanpadams added the needs:triage label Jul 21, 2022

jordanpadams added the bug Something isn't working label Jul 21, 2022

jordanpadams self-assigned this Jul 21, 2022

jordanpadams mentioned this issue Jul 22, 2022

Validate does not catch NaNs in Binary Tables #514

Closed

jordanpadams added B13.0 s.medium and removed needs:triage labels Aug 7, 2022

jordanpadams mentioned this issue Dec 8, 2022

validate incorrectly throws error.array.value_out_of_min_max_range #574

Closed

jordanpadams mentioned this issue Dec 13, 2022

Update compare method to use BigDecimal.scale #577

Merged

jordanpadams added sprint-backlog B13.1 p.must-have labels Dec 16, 2022

jordanpadams closed this as completed in #577 Dec 20, 2022

al-niessner mentioned this issue Dec 21, 2022

B13.1 Fix Must-Have Priority Bugs #578

Closed

al-niessner mentioned this issue Dec 22, 2022

issue 529: do decimal test then string based on significant digits #582

Merged

rchenatjpl reopened this Feb 3, 2023

jordanpadams assigned al-niessner and unassigned jordanpadams Feb 6, 2023

rchenatjpl closed this as completed Feb 7, 2023

jordanpadams removed the sprint-backlog label Feb 23, 2023

miguelp1986 added the i&t.done label Mar 9, 2023

jordanpadams mentioned this issue Mar 14, 2023

Missing validation of valid_maximum and valid_minimum from Special_Constants #611

Closed

jordanpadams mentioned this issue Apr 12, 2023

Scaled value min/max error does not display scaled value #434

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ERROR [error.array.value_out_of_min_max_range] evaluation is not correct #529

ERROR [error.array.value_out_of_min_max_range] evaluation is not correct #529

katecrombie commented Jul 21, 2022 •

edited by jordanpadams

jordanpadams commented Jul 21, 2022

jordanpadams commented Jul 21, 2022

katecrombie commented Jul 21, 2022

katecrombie commented Jul 21, 2022

jordanpadams commented Jul 22, 2022

katecrombie commented Jul 22, 2022

jordanpadams commented Aug 25, 2022

katecrombie commented Aug 25, 2022

jordanpadams commented Oct 31, 2022

katecrombie commented Oct 31, 2022

al-niessner commented Dec 22, 2022 •

edited

jordanpadams commented Dec 23, 2022

rchenatjpl commented Feb 3, 2023 •

edited

al-niessner commented Feb 6, 2023

rchenatjpl commented Feb 7, 2023

al-niessner commented Feb 7, 2023

rchenatjpl commented Feb 7, 2023

al-niessner commented Feb 7, 2023

rchenatjpl commented Feb 7, 2023

miguelp1986 commented Mar 9, 2023

ERROR [error.array.value_out_of_min_max_range] evaluation is not correct #529

ERROR [error.array.value_out_of_min_max_range] evaluation is not correct #529

Comments

katecrombie commented Jul 21, 2022 • edited by jordanpadams

Engineering Details

jordanpadams commented Jul 21, 2022

jordanpadams commented Jul 21, 2022

katecrombie commented Jul 21, 2022

katecrombie commented Jul 21, 2022

jordanpadams commented Jul 22, 2022

katecrombie commented Jul 22, 2022

jordanpadams commented Aug 25, 2022

katecrombie commented Aug 25, 2022

jordanpadams commented Oct 31, 2022

katecrombie commented Oct 31, 2022

al-niessner commented Dec 22, 2022 • edited

jordanpadams commented Dec 23, 2022

rchenatjpl commented Feb 3, 2023 • edited

al-niessner commented Feb 6, 2023

rchenatjpl commented Feb 7, 2023

al-niessner commented Feb 7, 2023

rchenatjpl commented Feb 7, 2023

al-niessner commented Feb 7, 2023

rchenatjpl commented Feb 7, 2023

miguelp1986 commented Mar 9, 2023

katecrombie commented Jul 21, 2022 •

edited by jordanpadams

al-niessner commented Dec 22, 2022 •

edited

rchenatjpl commented Feb 3, 2023 •

edited