Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate does not flag <CR> within lid_reference #401

Closed
msbentley opened this issue Sep 13, 2021 · 1 comment · Fixed by #404
Closed

validate does not flag <CR> within lid_reference #401

msbentley opened this issue Sep 13, 2021 · 1 comment · Fixed by #404
Assignees
Labels
bug Something isn't working s.medium

Comments

@msbentley
Copy link

msbentley commented Sep 13, 2021

🐛 Describe the bug

I'm not 100% sure if this is a validate bug, or that something in the IM needs to be better constrained, but I'm assuming for now the former. Products declaring [lid_reference](https://pds.nasa.gov/datastandards/documents/dd/current/PDS4_PDS_DD_1G00.html#d5e44434) (e.g. in Internal_Reference) currently allow according to validate, e.g. this is valid:

<Internal_Reference>
    <lid_reference>urn:esa:psa:context:instrument_host:spacecraft.tgo
    </lid_reference>
    <reference_type>is_instrument_host</reference_type>
</Internal_Reference>

However, oXygen reports:

Value 'urn:esa:psa:context:investigation:mission.em16
                 ' is not facet-valid with respect to pattern 'urn(:[\p{Ll}\p{Nd}\-._]+){3,5}' for type 'lid_reference'.

📜 To Reproduce

Steps to reproduce the behavior:

  1. Validate a label with the above internal reference

🕵️ Expected behavior

I would not expect that carriage return would be allowed within an internal reference, and would expect a validation error.

📚 Version of Software Used

validate 2.0.7

🖥 System Info

  • OS: ubuntu
  • Version: 20.04

🦄 Related requirements

⚙️ Engineering Details

@msbentley msbentley added bug Something isn't working needs:triage labels Sep 13, 2021
@qchaupds
Copy link
Contributor

qchaupds commented Sep 14, 2021

First inspection of the code reveals something interesting. The call to node.getTextContent() at line 258 does contains the CR but when the trim() function is applied, the CR is removed. The bug is identified and can indeed be fixed.

{pds-dev3.jpl.nasa.gov}/home/qchau/sandbox/validate 108 % vi ./src/main/java/gov/nasa/pds/tools/util/LabelUtil.java

241           NodeList nodeList = (NodeList) xPathFactory.newXPath().evaluate(searchPathName,source,XPathConstants.NODESET);
242           LOG.debug("getIdentifiersCommon:context,nodeList.getLength() {},{}",context,nodeList.getLength());
243           LOG.debug("getIdentifiersCommon:context,searchPathName,nodeList.getLength() {},{},{}",context,searchPathName,nodeList.getLength());
244           for (int i = 0; i < nodeList.getLength(); ++i) {
245               NodeList childList = ((Element) nodeList.item(i)).getChildNodes();
246               String singleIdentifier = null;
247               String singleVersion = null;
248               String singleLidvidValue = null;
249               for (int j = 0; j < childList.getLength(); ++j) {
250                   Node node = childList.item(j);
251                   LOG.debug("node.getTextContent().trim() {}",node.getTextContent().trim());
252                   //LOG.debug("node.getTextContent().trim() {}",node.getTextContent().trim());
253                   // Because the tagsList is an array, loop through to check for each tag
254 
255                   for (int kk = 0; kk < tagsList.length; kk++) {
256 
257                       if (node.getNodeName().equals(tagsList[kk]) || node.getNodeName().equals(VERSION_ID_TAG)) {
258                           if (node.getNodeName().equals(tagsList[kk]))
259                               singleIdentifier = node.getTextContent().trim();
260                           if (node.getNodeName().equals(VERSION_ID_TAG))
261                              singleVersion = node.getTextContent().trim();
262                           LOG.debug("getIdentifiersCommon:kk,CONTAINS_CR,TEXTCONTENT {},{},[{}]",kk,node.getTextContent().contains("\n"),node.getTextContent());
263                       }
264                       LOG.debug("getIdentifiersCommon:kk,singleIdentifier,singleVersion {},[{}],[{}]",kk,singleIdentifier,singleVersion);

Running validate on a bundle that does reference checks, we can see that the printout out of line 262 shows CONTAINS_CR as true

{pds-dev3.jpl.nasa.gov}/home/qchau/sandbox/validate 129 % grep getIdentifiersCommon:kk,CONTAINS_CR,TEXTCONTENT t1
[main] DEBUG gov.nasa.pds.tools.util.LabelUtil - getIdentifiersCommon:kk,CONTAINS_CR,TEXTCONTENT 1,true,[urn:nasa:pds:kaguya_grs_spectra:document:kgrs_calibrated_spectra ZZZ BBB  
{pds-dev3.jpl.nasa.gov}/home/qchau/sandbox/validate 130 % 

qchaupds pushed a commit that referenced this issue Sep 16, 2021
…l integrity checks.

1. Add new test resources : src/test/resources/github401
2. Add logic to check for extraneous carriage returns for references : src/main/java/gov/nasa/pds/tools/util/LabelUtil.java
3. Add new regression test : src/test/resources/features/validate.feature

Refs:

#401 validate does not flag <CR> within lid_reference
jordanpadams added a commit that referenced this issue Sep 24, 2021
jordanpadams added a commit that referenced this issue Sep 24, 2021
* Check all documents specified in Product_Document tags for correct mime types.

1. Add new test resources :src/test/resources/github367
2. Add needed dependency for group com.sun.activation, which is the implementation jar :pom.xml
3. Add default mime type resource file :src/main/resources/validate_default_mime_types.txt
4. Add new class to validate JPEG and PNG image file content :src/main/java/gov/nasa/pds/tools/util/ImageUtil.java
5. Add new class to check if the mime type of a given file matches its file extension :src/main/java/gov/nasa/pds/tools/util/DocumentsChecker.java
6. Add new problem types :src/main/java/gov/nasa/pds/tools/validate/ProblemType.java
7. Add check for all file_name tags in Product_Document tags :src/main/java/gov/nasa/pds/tools/validate/rule/pds4/FileReferenceValidationRule.java
8. Add new regression test :src/test/resources/features/validate.feature

Refs:

#367 As a user, I want to validate all files referenced by a Product_Document

* Using alternative method to get the parent

* Fix sonatype-lift complaints: unused variables

* Check all references for extraneous carriage return in the referential integrity checks.

1. Add new test resources : src/test/resources/github401
2. Add logic to check for extraneous carriage returns for references : src/main/java/gov/nasa/pds/tools/util/LabelUtil.java
3. Add new regression test : src/test/resources/features/validate.feature

Refs:

#401 validate does not flag <CR> within lid_reference

* Add missing feature file in previous commit

* Make function getIdentifiersCommon() synchronized to avoid threads error

Co-authored-by: Qui T Chau <qchau@pds-dev3.jpl.nasa.gov>
Co-authored-by: Jordan Padams <33492486+jordanpadams@users.noreply.github.com>
jordanpadams added a commit that referenced this issue Sep 24, 2021
* Check all documents specified in Product_Document tags for correct mime types.

1. Add new test resources :src/test/resources/github367
2. Add needed dependency for group com.sun.activation, which is the implementation jar :pom.xml
3. Add default mime type resource file :src/main/resources/validate_default_mime_types.txt
4. Add new class to validate JPEG and PNG image file content :src/main/java/gov/nasa/pds/tools/util/ImageUtil.java
5. Add new class to check if the mime type of a given file matches its file extension :src/main/java/gov/nasa/pds/tools/util/DocumentsChecker.java
6. Add new problem types :src/main/java/gov/nasa/pds/tools/validate/ProblemType.java
7. Add check for all file_name tags in Product_Document tags :src/main/java/gov/nasa/pds/tools/validate/rule/pds4/FileReferenceValidationRule.java
8. Add new regression test :src/test/resources/features/validate.feature

Refs:

#367 As a user, I want to validate all files referenced by a Product_Document

* Using alternative method to get the parent

* Fix sonatype-lift complaints: unused variables

* Check all references for extraneous carriage return in the referential integrity checks.

1. Add new test resources : src/test/resources/github401
2. Add logic to check for extraneous carriage returns for references : src/main/java/gov/nasa/pds/tools/util/LabelUtil.java
3. Add new regression test : src/test/resources/features/validate.feature

Refs:

#401 validate does not flag <CR> within lid_reference

* Add missing feature file in previous commit

* Make function getIdentifiersCommon() synchronized to avoid threads error

* Check .tab file for records of same length and correct ERROR printing to debug for DocumentsChecker

1. Add new test resources : src/test/resources/github390
2. Correct ERROR printing to debug : src/main/java/gov/nasa/pds/tools/util/DocumentsChecker.java
3. Add check for .tab file for records of same length : src/main/java/gov/nasa/pds/tools/validate/rule/pds4/TableDataContentValidationRule.java
4. Add new regression test : src/test/resources/features/validate.feature

Refs:

#390 validate does not flag *.tab files with variable length records

Co-authored-by: Qui T Chau <qchau@pds-dev3.jpl.nasa.gov>
Co-authored-by: Jordan Padams <33492486+jordanpadams@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working s.medium
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants