Refactor integration tests for clarity and speed by stevedlawrence · Pull Request #865 · apache/daffodil

stevedlawrence · 2022-10-28T15:47:37Z

The current integration tests have too much boilerplate that can be
refactored with some new APIs. Additionally, the integration tests are
incredibly slow due to all the forking and creation of new VMs. This
fixes both of these issues.

The new integration test API creates a new runCLI function, accepting
an array of string arguments (with a new "args" string interpolator to
make specifying this array easier), various execution parameters, a
function that contains the normal Expect logic all our of integration
tests already use, and finally the expected exit code.

A new path function is created to help deal with paths in a OS
agnostic manner.

New withTempFile and withTempDirectory functions are added to help
with creating temporary files in a special temporary daffodil directory,
and to ensure they are cleaned up when the test is complete.

A new withSysProp function is used to temporarily change and restore
system properties, helping to avoid changes that might interfere with
other tests.

With these new functions, integration tests now look something like
this:

    val schema = path("path/to/schema.dfdl.xsd")
    runCLI(args"parse -s $schema") { cli =>
      cli.send("data to parse", inputDone = true)
      cli.expect("<data>string to expect</data>")
    } (ExitCode.Success)

The runCLI function supports executing Daffodil in either a new process
or thread, defaulting to thread unless a test requires a separator
process (e.g. classpath changes). Streams are setup for handling
stdin/out/err. When threaded, the thread calls the
org.apache.daffodil.Main run function, and configures Main and log4j to
output to these streams. When using a new processing, the daffodil
sh/bat script is directly executed.

A new Expect instance is configured so that it can read/write to these
streams and validate the output, with a new CLITester class used to
provide helper wrappers around Expect to make logic a bit less
verbose.

Multiple changes were also made to Main and the Debugger to ensure they
always use the specified streams instead of always using stdin/out/err.
This is really only needed for this integration testing, but could
potentially be useful in other use cases where stdin/out/err need need
to be mimicked without forking a new process.

Note that the thread approach isn't thread safe, but integration tests
are already not run in parallel. By using threads instead of spawning
new processes where possible, we drastically decrease the overhead, with
test suites now taking a number of seconds instead of a number of
minutes, averaging and order of magnitude speed up.

This also tweaks various CLI tests to not expect data from files,
instead hardcoding what to expect in each test. This is a bit less
exact, but it make the logic simpler and more consistent, and allows us
to remove many files from the repo. Also found and removed many debugger
command files that are never actually used.

This also enables the blob CLI tests but adds an assumption so that they
are skipped if the large input files have not been generated. This gives
us a reminder that those tests exist and we can periodically ensure they
work by generating the files.

This also ensures coverage is enabled in the github workflow every time
we execute SBT. Otherwise, changes of coverage enablement cause a
recompile. By doing this we avoid recompiling the same files multiple
times which saves a couple of minutes overall. Note that we originally
disabled coverage on some steps because it caused failures, but whatever
caused that seems to have been fixed.

DAFFODIL-2381

tuxji

I think the new API looks fine and I would like to see the rest of the integration tests converted to use it too. The converted tests are much smaller as well as run much faster, which is a big quality of code improvement. As I noted in a couple of places, we might even be able to remove the old API entirely (which I would prefer).

We may need to pay attention to other places in Daffodil which call some variation of "println" and determine if these places need to be updated or not. We also may need to increase code coverage (this pull request currently hits only 65% code coverage).

tuxji · 2022-10-28T22:07:37Z

+  /*
+   * TODO: Update new API to support a way to set DAFFODIL_JAVA_OPTS. Maybe
+   * runCLI optionally accepts environment options, and they are provided it
+   * switches from using a thread to forking a process.


I have evidence from both another Daffodil source file (see daffodil-lib/src/test/scala/org/apache/daffodil/util/TestXMLCatalogAndValidate.scala) and a web search that you can set the Java system property xml.catalog.files at runtime and still see the change take effect upon the next catalog lookup. This particular test could call System.setProperty("xml.catalog.files", testXcatalogFile) and check for the desired result without needing to fork a process, although it would have to reset the property afterwards.

I don't see any other test using DAFFODIL_JAVA_OPTS in such a deliberate and meaningful way, so I think we can avoid the need to accept an optional environment map and switch runCLI from using a thread to forking a process. This would allow us to clean up Util.java by removing all the old API functions once all of the integration tests are converted to the new API (all the process-related code simply goes away).

tuxji · 2022-10-28T22:23:15Z

-    val testXcatalogFile = if (Util.isWindows) Util.cmdConvert(xcatalogFile) else xcatalogFile
+    val xcatalogFile = path("daffodil-cli/src/it/resources/org/apache/daffodil/CLI/xcatalog_invalid.xml")

    val DAFFODIL_JAVA_OPTS = Map("DAFFODIL_JAVA_OPTS" -> ("-Dxml.catalog.files=" + testXcatalogFile + " -Xms256m -Xmx2048m -Dfile.encoding=UTF-8"))


This is the only test which deliberately passes DAFFODIL_JAVA_OPTS in such a purposeful way, and it's not even necessary since you can do the same thing with System.setProperty.

(Later I found another test which also does a similar thing with properties, but maybe it can set these properties at runtime too.)

I've converted pretty much everything to this new API, you are right that tests that set DAFFODIL_JAVA_OPTS don't really need to do that, and in the case where it is needed there are workarounds using setProperty.

However, most of the UserDefinedFunction integration tests set the DAFFODIL_CLASSPATH variable. Essentially what these tests do is put all .class files defined in daffodil-udf on the classpath, and then individual tests also add specific META-INF/services directories on the classpath to tests loading specific UDF. I'm not sure there's a way to modify the classpath like this without forking a new process? Any ideas oh if that's possible?

Another consideration is that the UserDefinedFunctionService object loads all the available UDF's on initialization and stores their information in a mutable HashMap. So if we were able to have classpath specific for these tests, we would also need to ensure this object is reset/reloaded appropriately. It's possible sbt magic already handles resetting objects? I feel like they don't persist inbetween tests, or maybe test suites?

Any thoughts?

Note that even if we do fork for only the UDF integration tests, it still feels like a pretty big win. With the current changes, all integration tests take about 2 minutes on my machine (half of that is UDFs), which is a big improvement over the usual 20+ minutes.

Yes, it's possible but far from easy to search a different classpath at runtime without forking a new process. Java 9 and later versions have removed the hackish (it was never part of the API) ability to modify the system classloader at runtime. The only portable solution is to create a custom classloader and modify your application to use the custom classloader instead of the system classloader. There appears to be no solution that will allow the UserDefinitionFuction integration tests to avoid forking unless we modify not only these tests, but also the daffodil-udf code (if it's possible to restrict the scope of the changes to only that library), or else the Main class itself. I think we will have to be satisfied with the speedup we get from converting most of the integration tests and accept the slowness of forking for the UDF integration tests. Running all integration tests in 2 minutes instead of 20+ minutes is still wonderful.

Here are the two best StackOverflow questions about changing the Java classpath at runtime. I also googled for Scala as well as Java answers and didn't find a better approach although I did find a Scala library which lets you search any directory or jars for classes (the drawback is you can't load these classes unless the classes are also loadable by your system classloader or a custom classloader).

https://stackoverflow.com/questions/252893/how-do-you-change-the-classpath-within-java
https://stackoverflow.com/questions/60764/how-to-load-jar-files-dynamically-at-runtime
http://software.clapper.org/classutil/

tuxji · 2022-10-28T22:28:31Z

+import org.apache.logging.log4j.core.config.AbstractConfiguration
+import org.apache.logging.log4j.core.config.ConfigurationSource

 object Util {


Once all of the integration tests are converted to the new API, we should remove as many unused definitions from Util.scala as possible (daffodilPath, binPath, start, startNoConvert, getShell, cmdConvert, etc.).

tuxji · 2022-10-28T22:32:52Z

    }
  }
+
  /* See DFDL-1264


When you see some commented out tests, I hope you will uncomment them to verify the tests still behave the same way before you comment them back out again. We may find that the tests no longer behave the same way due to other changes in Daffodil and should be deleted or changed anyway.

tuxji · 2022-10-28T22:40:50Z

+
    val java_opts = Map("DAFFODIL_JAVA_OPTS" ->
      ("-Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl " +
        "-Djavax.xml.xml.validation.SchemaFactory=com/sun/org/apache/xerces/internal/jaxp/validation/XMLSchemaFactory"))


OK, this test is indeed another test which sets DAFFODIL_JAVA_OPTS in a purposeful way. However, maybe these properties can be set at runtime with System.setProperty too. I would try that first before having to keep the old API around or make runCLI switch from using a thread to starting a process.

tuxji · 2022-10-28T22:50:47Z

            }
            val formatStr = maxCols.map(max => "%" + -max + "s").mkString("  ")
-            println(formatStr.format(headers: _*))
+            STDOUT.println(formatStr.format(headers: _*))


We have a bunch of other places in other Daffodil source files which call println or System.out.println or System.err.println or Console.out.println. These places probably get called only in special circumstances which won't affect the integration tests, but it's something we may have to watch out for and keep in mind. If it turns out to be necessary, we can pass around a PrintStream to be called or define a smart Daffodil println function to replace some of these calls.

mbeckerle

+1

This is a big improvement. The speed will be very nice, but also elimintes a bunch of portability concerns with the tests also where MS-Windows-isms were not adeuqately abstracted.

That is, this fixes DAFFODIL-2381

I belive this will also make debugging these tests more straightforward, as one can just set breakpoints in Daffodil per usual and debug the test, rather than various temp/hack ways Ih've had to modify the tests (have them call main, instead of spawning a process) for debugging in IDE.

stevedlawrence · 2022-11-04T16:40:24Z

I've completed the refactor using a new API and address all comments, I think this is ready for a full review.

stevedlawrence · 2022-11-04T17:11:53Z

Looks like the PR checks finished in 17-25, with most taking about 20 mins. Compared to the most recent merged commit which took between 30-60 averaging on the higher end, this is a nice speed up.

mbeckerle · 2022-11-04T19:34:32Z

This will be post 3.4.0, i.e., on the 3.5.0 branch yes?

stevedlawrence · 2022-11-04T19:38:03Z

This will be post 3.4.0, i.e., on the 3.5.0 branch yes?

Correct, once this gets the +1's, I'll hold off merging until 3.4.0 is released.

mbeckerle

+1

tuxji

+1

tuxji · 2022-11-04T18:55:16Z

+   * verifies the expected exit code.
+   *
+   * For performance reasons, this defaults to running the CLI in a new thread
+   * unless the classpaths parameter is nonempty or he fork parameter is set to


he -> the (fork parameter....)

tuxji · 2022-11-04T18:56:34Z

+   * @param classpaths sequence of paths to add to the classpath. If non-empty,
+   *   runs the CLI in a new process instead of a thread and will likely decrease
+   *   performance
+   * @param fork if true, forces the the CLI in a new process


the the CLI in -> the CLI to run in

tuxji · 2022-11-04T18:58:47Z

+   * @param testFunc function to call to send input to the CLI and validate
+   *   output from CLI stdout/stderr.
+   * @param expectedExitCode the expected exit code of the CLI. In the actual
+   *   exit code does not match


Second sentence doesn't make sense grammatically, please delete or revise. Also please decide if param descriptions should end with a period or end without a period and end all of them the same way.

tuxji · 2022-11-04T19:08:51Z

+   *   exit code does not match
+   *
+   * @throws AssertionError if the actual exit code does not match the expected exit code
+   * @throws ExpectIOException if the an CLITester expect validation operation fails


Please choose only one article (the / a).

tuxji · 2022-11-04T19:11:14Z

+        (toIn, fromOut, fromErr, Right(process))
+      } else {
+        // create a new thread for the CLI test to run, using piped
+        // input/output streams to connected the thread and the CLItester


connected -> connect

tuxji · 2022-11-04T20:57:03Z

+
+    withSysProp("javax.xml.parsers.SAXParserFactory" -> "com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl") {
+      withSysProp("javax.xml.xml.validation.SchemaFactory" -> "com/sun/org/apache/xerces/internal/jaxp/validation/XMLSchemaFactory") {
+        runCLI(args"parse -s $schema -r matrix") { cli =>


tuxji · 2022-11-04T21:00:59Z

-  val output4 = Util.getExpectedString("output4.txt")
-  val output6 = Util.getExpectedString("output6.txt")
-  val output12 = Util.getExpectedString("output12.txt")
-  val savedParserFile = new File("savedParser.xsd.bin")


This PR inlines these files' expected strings into the tests below and uses withTempFile { parser instead of savedParserFile. Nice!

tuxji · 2022-11-04T21:05:21Z

-  // See DFDL-1141
-  /*@Test def test_3036_CLI_Saving_SaveParser_debug() {
+  // DAFFODIL-1141
+  /*@Test*/ def test_3037_CLI_Saving_SaveParser_trace(): Unit = {


This PR uncomments the test's body and keeps only the @Test commented out. Nice, this will keep the test compiling even after changes to functions.

tuxji · 2022-11-04T21:18:12Z

-      shell.close()
-    }
+  // DAFFODIL-1346
+  /*@Test*/ def test_3576_CLI_Unparsing_validate(): Unit = {


Again, nice job keeping a commented out test compiling.

tuxji · 2022-11-04T21:29:28Z

-    file("daffodil-cli/src/it/resources/org/apache/daffodil/CLI/debugger/1382"),
-    file("daffodil-cli/src/it/resources/org/apache/daffodil/CLI/debugger/1591"),
-    file("daffodil-cli/src/it/resources/org/apache/daffodil/CLI/debugger/1602"),
-    file("daffodil-cli/src/it/resources/org/apache/daffodil/CLI/debugger/1863"),


Does the PR delete these files because their contents had already been inlined as input strings into the corresponding CLI tests in an earlier PR? Nothing seems to read or use these files even before this PR was created, so it's time to remove them.

Looking at the commit history, when debugger tests were originally added (nearly 10 years) ago they used the -d option and passed in these files. A short time later (04e50f9), we switched to ExpectIt and changed these tests to inline the debugger commands, and I guess we just forgot to remove the files.

stevedlawrence · 2022-11-07T14:25:07Z

We also may need to increase code coverage (this pull request currently hits only 65% code coverage).

I didn't really add any new code except in couple minor places. I think lack of coverage in this PR is because I've made changes to printlns that were just already missing coverage.

In the past I know I've hesitated to add new CLI tests to get coverage because of the high overhead. But now that integration tests are much quicker, we may want to consider adding new tests to improve our CLI coverage. I think that's out of scope for this PR though, since this is more about just improving the CLI test infrastructure.

mbeckerle · 2022-11-07T15:13:34Z

When this says 64% coverage, is that the unit tests plus the CLI tests? Or just the CLI tests alone?

stevedlawrence · 2022-11-07T15:19:56Z

I believe it's both unit tests + CLI tests. But most of the lines missing coverage are println's in Main.scala that can only be tested with he CLI/Integration tests.

mbeckerle · 2022-11-07T15:59:32Z

Something doesn't add up.

We were at roughly 80% coverage on a 110K-lines code base.

We dropped to 64% coverage. That's a 16% drop, meaning 17K-lines are no longer covered that were.

That's can't be explained by println statements in the Main/CLI code.

So I'm assuming my arithmetic here doesn't make sense. But I clearly don't understand what we mean by 80+% dropping to 64% coverage.

stevedlawrence · 2022-11-07T16:09:27Z

I think that 64% is just the the coverage of this PR, not the coverage of the entire codebase. So of the lines this PR changed to src/main, 64% of those lines have a test that hit them. The overall coverage is still about 80%.

Here's the codecov page for this PR that makes it a bit more clear:

https://app.codecov.io/gh/apache/daffodil/pull/865

mbeckerle · 2022-11-07T16:13:25Z

Ok. Then I suggest adding a ticket to enhance code coverage here. But I wouldn't hold up committing this much improvement for that.

stevedlawrence · 2022-11-08T12:47:12Z

DAFFODIL-2746 added to improve CLI code coverage.

The current integration tests have too much boilerplate that can be refactored with some new APIs. Additionally, the integration tests are incredibly slow due to all the forking and creation of new VMs. This fixes both of these issues. The new integration test API creates a new `runCLI` function, accepting an array of string arguments (with a new "args" string interpolator to make specifying this array easier), various execution parameters, a function that contains the normal `Expect` logic all our of integration tests already use, and finally the expected exit code. A new `path` function is created to help deal with paths in a OS agnostic manner. New `withTempFile` and `withTempDirectory` functions are added to help with creating temporary files in a special temporary daffodil directory, and to ensure they are cleaned up when the test is complete. A new `withSysProp` function is used to temporarily change and restore system properties, helping to avoid changes that might interfere with other tests. With these new functions, integration tests now look something like this: val schema = path("path/to/schema.dfdl.xsd") runCLI(args"parse -s $schema") { cli => cli.send("data to parse", inputDone = true) cli.expect("<data>string to expect</data>") } (ExitCode.Success) The runCLI function supports executing Daffodil in either a new process or thread, defaulting to thread unless a test requires a separator process (e.g. classpath changes). Streams are setup for handling stdin/out/err. When threaded, the thread calls the org.apache.daffodil.Main run function, and configures Main and log4j to output to these streams. When using a new processing, the daffodil sh/bat script is directly executed. A new `Expect` instance is configured so that it can read/write to these streams and validate the output, with a new CLITester class used to provide helper wrappers around `Expect` to make logic a bit less verbose. Multiple changes were also made to Main and the Debugger to ensure they always use the specified streams instead of always using stdin/out/err. This is really only needed for this integration testing, but could potentially be useful in other use cases where stdin/out/err need need to be mimicked without forking a new process. Note that the thread approach isn't thread safe, but integration tests are already not run in parallel. By using threads instead of spawning new processes where possible, we drastically decrease the overhead, with test suites now taking a number of seconds instead of a number of minutes, averaging and order of magnitude speed up. This also tweaks various CLI tests to not expect data from files, instead hardcoding what to expect in each test. This is a bit less exact, but it make the logic simpler and more consistent, and allows us to remove many files from the repo. Also found and removed many debugger command files that are never actually used. This also enables the blob CLI tests but adds an assumption so that they are skipped if the large input files have not been generated. This gives us a reminder that those tests exist and we can periodically ensure they work by generating the files. This also ensures coverage is enabled in the github workflow every time we execute SBT. Otherwise, changes of coverage enablement cause a recompile. By doing this we avoid recompiling the same files multiple times which saves a couple of minutes overall. Note that we originally disabled coverage on some steps because it caused failures, but whatever caused that seems to have been fixed. DAFFODIL-2381

tuxji reviewed Oct 28, 2022

View reviewed changes

mbeckerle approved these changes Nov 3, 2022

View reviewed changes

stevedlawrence force-pushed the daffodil-2381-integration-tests branch from ebb4a9a to 2502782 Compare November 4, 2022 16:37

stevedlawrence changed the title ~~RFC: Refactor integration tests for clarity and speed~~ Refactor integration tests for clarity and speed Nov 4, 2022

mbeckerle approved these changes Nov 4, 2022

View reviewed changes

tuxji approved these changes Nov 4, 2022

View reviewed changes

stevedlawrence force-pushed the daffodil-2381-integration-tests branch from db8b43b to a759ebc Compare November 8, 2022 13:50

stevedlawrence merged commit 1a0954a into apache:main Nov 8, 2022

stevedlawrence deleted the daffodil-2381-integration-tests branch November 8, 2022 16:54

Conversation

stevedlawrence commented Oct 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tuxji left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mbeckerle left a comment

Choose a reason for hiding this comment

Uh oh!

stevedlawrence commented Nov 4, 2022

Uh oh!

stevedlawrence commented Nov 4, 2022

Uh oh!

mbeckerle commented Nov 4, 2022

Uh oh!

stevedlawrence commented Nov 4, 2022

Uh oh!

mbeckerle left a comment

Choose a reason for hiding this comment

Uh oh!

tuxji left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

stevedlawrence commented Nov 7, 2022

Uh oh!

mbeckerle commented Nov 7, 2022

Uh oh!

stevedlawrence commented Nov 7, 2022

Uh oh!

mbeckerle commented Nov 7, 2022

Uh oh!

stevedlawrence commented Nov 7, 2022

Uh oh!

mbeckerle commented Nov 7, 2022

Uh oh!

stevedlawrence commented Nov 8, 2022

Uh oh!

Reviewers

Assignees

Labels

stevedlawrence commented Oct 28, 2022 •

edited

Loading

tuxji left a comment •

edited

Loading