This repository has been archived by the owner on Mar 26, 2024. It is now read-only.

comment or strings which contain '\u' breaks unit testing #42

Closed
mjj4791 opened this issue Apr 16, 2020 · 3 comments

Comments

@mjj4791

mjj4791 commented Apr 16, 2020

The CCL Maven plugin crashes and fails the build because it misinterprets valid character sequences in code comments, constants, and string variables.

It treats text that merely resembles a Unicode escape as an actual escape. It seems the ccl-testing plugin tries to interpret any string that starts with \u as a Unicode character rather than as the literal text in the source code.

If a code comment or string value contains a character sequence that looks like a Unicode escape (i.e. '\u'), the plugin fails the build because it cannot retrieve, parse, and save the program's source code.
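The three failure modes reported in Examples 1 to 3 below are consistent with a decoder that treats every backslash-u as the start of a four-digit escape. The following is a minimal sketch of such a decoder, not the plugin's actual code: `naive_unescape` and `try_decode` are hypothetical names, and the `ValueError` text only imitates the wording of the reported errors.

```python
import re

# Exactly four hexadecimal digits, as a \uXXXX escape requires.
HEX4 = re.compile(r"[0-9A-Fa-f]{4}")


def naive_unescape(text: str) -> str:
    """Hypothetical decoder: treats every backslash-u as a Unicode escape."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\\u", i):
            # Take the next four characters, whatever they are.
            digits = text[i + 2 : i + 6]
            if not HEX4.fullmatch(digits):
                # Imitates the wording of the errors in Examples 2 and 3.
                raise ValueError(f'For input string: "{digits}"')
            out.append(chr(int(digits, 16)))
            i += 6
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def try_decode(line: str) -> str:
    """Return the repr of the decoded line, or the error message."""
    try:
        return repr(naive_unescape(line))
    except ValueError as err:
        return f"error: {err}"


for line in ("; \\u0000 this is", "; \\u 0000 this is", "; \\u this is"):
    print(try_decode(line))
```

Under this assumption, the first line silently turns into text containing character 0, while the other two fail on the four characters that follow '\u' (" 000" and " thi"), matching the inputs quoted in the error messages.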

Example 1

The code comment contains the text string '\u0000' (this also fails with other Unicode escapes, e.g. '\u0001').

Error:

[ERROR] Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: Failed to write xml program listing for UnicodeTest: Failed to properly parse XML document. Error on line 1 of document : An invalid XML character (Unicode: 0x0) was found in the CDATA section. Nested exception: An invalid XML character (Unicode: 0x0) was found in the CDATA section. -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal
com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: Failed to write xml program listing for UnicodeTest

Code:

/**
    Function comment

    @param  var some string param
    @returns TRUE or FALSE for some reason
 */
subroutine (dummyFunctionFortestingUnicodeIssue( var=vc)=i4)
    ; this comment contains text, which resembles a Unicode character, but is just text
    ; \u0000 this is a string 6 characters long, which just happens to resemble the Unicode character 0x0 encoding....
    null
end ; subroutine
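The "invalid XML character (Unicode: 0x0)" message fits the XML 1.0 rule that character 0 may not appear anywhere in a document, including inside a CDATA section. A small sketch of that rule follows; the helper names `is_xml10_char` and `invalid_xml_chars` are illustrative and are not part of the plugin.

```python
def is_xml10_char(ch: str) -> bool:
    """True if ch is allowed by the XML 1.0 'Char' production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def invalid_xml_chars(text: str) -> list:
    """List (index, code point) for every character XML 1.0 forbids."""
    return [(i, hex(ord(ch))) for i, ch in enumerate(text) if not is_xml10_char(ch)]


literal = "; \\u0000 this is a string"  # six ordinary characters: \ u 0 0 0 0
decoded = "; \x00 this is a string"     # the same text after an unwanted decode

print(invalid_xml_chars(literal))
print(invalid_xml_chars(decoded))
```

The literal six-character text is legal XML content. Only the decoded form, which contains character 0, would be rejected when the program listing is written.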

Example 2

The code comment contains the text '\u' followed by a space and then four digits.

Error:

[ERROR] Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: For input string: " 000" -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: For input string: " 000

Code:

/**
    Function comment

    @param  var some string param
    @returns TRUE or FALSE for some reason
 */
subroutine (dummyFunctionFortestingUnicodeIssue( var=vc)=i4)
    ; this comment contains text, which resembles a Unicode character, but is just text
    ; \u 0000 this is a string 2 characters long one space and then 4 digits...
    null
end ; subroutine

Example 3

The code comment contains only the text '\u' (no digits after it).
The plugin appears to take whatever follows the '\u' and tries to parse it as the character code.

Error:

[ERROR] Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: For input string: " thi" -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: For input string: " thi"

Code:

/**
    Function comment

    @param  var some string param
    @returns TRUE or FALSE for some reason
 */
subroutine (dummyFunctionFortestingUnicodeIssue( var=vc)=i4)
    ; this comment contains text, which resembles the start of a Unicode character, but is just text
    ; \u this is a string 2 characters long, which just happens to resemble the start of a Unicode character....
    null
end ; subroutine

Example 4

The block comment contains the text string '\u0000' (not a Unicode character):

Error:

[ERROR] Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: Failed to write xml program listing for UnicodeTest: Failed to properly parse XML document. Error on line 1 of document : An invalid XML character (Unicode: 0x0) was found in the CDATA section. Nested exception: An invalid XML character (Unicode: 0x0) was found in the CDATA section. -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: Failed to write xml program listing for UnicodeTest

Code:

/**
    Function comment

    \u0000 this is a string 6 characters long, which just happens to resemble the Unicode character 0x0 encoding....

    @param  var some string param
    @returns TRUE or FALSE for some reason
 */
subroutine (dummyFunctionFortestingUnicodeIssue( var=vc)=i4)
    null
end ; subroutine

Example 5

If a program contains a string variable or constant whose value resembles a Unicode escape, the build fails as well.

Error:

[ERROR] Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: Failed to write xml program listing for UnicodeTest: Failed to properly parse XML document. Error on line 1 of document : An invalid XML character (Unicode: 0x0) was found in the CDATA section. Nested exception: An invalid XML character (Unicode: 0x0) was found in the CDATA section. -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test (default-test) on project UnicodeTest: Execution default-test of goal com.cerner.ccl.testing:ccl-maven-plugin:3.2:test failed: Failed to write xml program listing for UnicodeTest

Code:

/**
    Function comment

    @param  var some string param
    @returns TRUE or FALSE for some reason
 */
subroutine (dummyFunctionFortestingUnicodeIssue( var=vc)=i4)
    declare myString = vc with protect, noconstant("abc")

    ; this is a string 6 characters long, which just happens to resemble the Unicode character 0x0 encoding....
    set myString = "\u0000"
end ; subroutine

@feckertson
Contributor

  1. The latest released version is 3.3
  2. Unable to reproduce. Please provide a link to an actual code project that exhibits the problem, the results from mvn -version, the exact maven command being applied and any global and/or profile configurations that could impact the sourceEncoding.

@mjj4791
Author

mjj4791 commented Apr 17, 2020

This issue seems to be very elusive!
I have one situation where I had this issue (very persistently).
I tried my best to reproduce the issue in a small test, but I cannot get it to reproduce in isolation.

It seems that there is more at play than just the text string "\u0000" in the code comments; that alone does not trigger the issue.

What I do see is that the downloaded code/test result JSON/XML file contains several \uxxxx sequences, such as \u0009 (which represents a tab character in the source code).

In my case I see this in the file '%TEMP%\j4ccl_dataout_?_?.json?.tmp':
<LINE><NBR>2459<\/NBR><TEXT><![CDATA[ ; \u0000 this is a string 6 characters long, which just happens to resemble the Unicode character 0x0 encoding....]]><\/TEXT><\/LINE>

However, in the file .\target\program-listings<file>.inc, I see this:
<LINE><NBR>2459</NBR><TEXT><![CDATA[ ; ? this is a string 6 characters long, which just happens to resemble the Unicode character 0x0 encoding....]]></TEXT></LINE>

The text \u0000 was converted into character 0. So it seems that the conversion of the XML file sometimes converts text that merely looks like a Unicode escape into the actual character.
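One possible explanation, offered only as a hypothesis: in the temp file excerpt the forward slashes are escaped in JSON style (<\/NBR>) but the backslash in front of u0000 is single. Any conforming JSON decoder would read a single-backslash \u0000 inside a JSON string as character 0. Whether the plugin actually decodes this file as JSON is an assumption; the sketch below only shows the standard JSON behavior using Python's json module.

```python
import json

# The source line as written in the program: six literal characters \u0000.
source_line = "; \\u0000 this is a string"

# A JSON encoder escapes the backslash, so the text survives a round trip.
encoded = json.dumps(source_line)
round_trip = json.loads(encoded)
print(encoded)
print(round_trip == source_line)

# If the producer writes the backslash unescaped, the JSON text itself
# contains a real \u0000 escape, and decoding yields character 0.
unescaped_payload = '"; \\u0000 this is a string"'
decoded = json.loads(unescaped_payload)
print(repr(decoded))
```

If this is what happens, checking whether the backslash is doubled in the .tmp file of the failing project versus the simple test program might help narrow down a reproducible case.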

I say sometimes because if I do exactly the same in a simple test program, I get the same contents in the 'j4ccl_dataout_?_?.json?.tmp' file, but I do NOT get character 0 in the program listing XML.

I hope this helps. Unfortunately, I am unable to provide a reproducing test case for this.

@feckertson
Contributor

Reopen when a reproducible test case is available.
