Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start using a C++ unit testing framework #643

Closed
jsiwek opened this issue Oct 18, 2019 · 10 comments
Closed

Start using a C++ unit testing framework #643

jsiwek opened this issue Oct 18, 2019 · 10 comments
Labels
Type: Project A self-contained project — for example an intern project, a tech evaluation, or prototyping
Milestone

Comments

@jsiwek
Copy link
Contributor

jsiwek commented Oct 18, 2019

This follows up from part of #633. We want to decide on a testing framework/setup to use and then add the basic infrastructure to use it in Zeek. E.g. pick a few util functions and add tests for them to serve as examples. Catch2, googletest, Boost.Test, doctest, and CAF were mentioned frameworks:

For basic CHECK and REQUIRE tests with fixtures, I think Catch2, googletest, Boost.Test, etc. are all pretty interchangeable. One thing that may speak for using CAF's testing headers (aside from Zeek having already access to it) is that it allows deterministic testing of actors in case Zeek ever decides to parallelize some tasks / components with actors. Also, Broker's unit tests already use the CAF headers. Having multiple test frameworks across Zeek's submodules introduces additional complexity.

Agree that's a nice bonus, except actor-ifying Zeek isn't something that's really made it into the near-term roadmap and Broker is niche enough that's it's not too terrible that it uses a separate testing framework.

I'd prioritize low-barrier/low-impact unit testing -- make it easy for people (core devs or external contributors) to write/run tests. My wishlist in order of what I'd like most in a unit test framework:

  • Ability to write unit tests directly alongside the production code (opposed to always in separate files)
  • Low impact on compile times
  • Test grouping/naming/matching, so it's easy to run subsets of tests together (probably regardless of whether the framework itself provides this, we can integrate w/ CMake/CTest to get this ability)

Other nice-to-haves (IMO) to consider:

  • Header-Only / Single-Header: just seems a bit easier to maintain / integrate that way
  • Test timeouts to fail/terminate if something takes too long, but we may not end up with too many of these (they'll either be more suitable as a btest or a Broker unit test anyway), and again can maybe fallback on organizing some tests to run via CTest and rely on its timeout mechanism
  • Parallelized test execution just to get things done quicker

Any thoughts on requirements/features or initial opinions on frameworks ?

doctest seems to align with the low-barrier/low-impact goal so I'm interested to try it.

@jsiwek jsiwek added the Type: Project A self-contained project — for example an intern project, a tech evaluation, or prototyping label Oct 18, 2019
@jsiwek jsiwek added this to the 3.1.0 milestone Oct 18, 2019
@jsiwek jsiwek added this to Unassigned / Todo in Release 3.1.0 via automation Oct 18, 2019
@Neverlord
Copy link
Member

I bundled the setup I used for C++ unit testing from #633 in b780672. Feel free use that as a starting point or to pick and choose.

@rsmmr
Copy link
Member

rsmmr commented Oct 21, 2019

After looking some more at doctest, I also think that that might be good choice for us. I like how the tests can be put right into the source code, and compiling along with the main binary solves the problem of having to redeclare all those globals in a separate test executable.

@JustinAzoff
Copy link
Contributor

Another nice-to-have: Supporting benchmarks as well as tests. As we refactor and modernize more core bits, would be great to have individual benchmarks for things like 'Dict'.

@Neverlord
Copy link
Member

Is doctest consensus at this point? I still have the unit tests for broker/Data.cc laying around, so I'd be happy to do the scaffolding and add this initial couple of tests if no one else is already working on it. 🙂

@jsiwek
Copy link
Contributor Author

jsiwek commented Oct 28, 2019

@Neverlord doesn't feel like strong consensus, but yeah, seems a couple people are curious if doctest will work well. If you did find time to try setting up a few tests with it, will be nice to hear your opinion of that experience. Especially since CAF's test framework was a viable alternative, interested to know if the reasons for preferring doctest were legit (e.g. workflow benefits).

@Neverlord
Copy link
Member

nice to hear your opinion of that experience

Well, the initial scaffolding is easy enough:

  • Copy the current doctest.h to src/3rdparty
  • Add a configure option to turn unit tests on (side note: there was some discussion in the last PR of whether the default should be on or off, but I think since we are baking the unit testing into the zeek binary here, this should be off by default to make sure no one accidentally builds a "test zeek")
  • Hook doctest into the main function:
    • I've extend the CLI handling slightly, so that we look for a --testing flag as first argument
    • Motivation: don't interfere with Zeek's CLI parsing
    • CLI now looks somewhat like this: zeek --testing [doctest-options] -- [zeek-options] (when not passing --testing as first argument everything behaves as usual)
    • Alternative: let Doxygen recognize anything with a --dt- prefix and ignore then for Zeek, it's certainly a matter of taste but I'd prefer a clear separation
  • Implement some simple first unit test to get started

I've pushed my initial scaffolding if anyone wants to take a look (see 0955947).

Unfortunately, linking is where trouble starts. Doctest bundles the unit tests with production code, but this does not play nicely with the way we currently build the zeek binary. I've added a unit test to src/broker/Data.cc. This ends up in a static library brokercomm. Turns out this prevents doctest from finding the tests. We can either use extra, non-trivial CMake scripts to force the necessary symbols into the binary, or we could use CMake object libraries instead.

I figured the latter should be simple, because ZeekSubdir.cmake and ZeekPluginStatic.cmake already seem to support that. Alas... it's broken. Setting bro_HAVE_OBJECT_LIBRARIES to YES results in several errors. I tried fixing them. Here are the diffs to the individual CMake files:

zeek/zeek

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 564fd7acb..bf55a332a 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -15,7 +15,7 @@ set(bro_PLUGIN_BIF_SCRIPTS CACHE INTERNAL "Zeek script stubs for BIFs in Zeek pl
 
 # If TRUE, use CMake's object libraries for sub-directories instead of
 # static libraries. This requires CMake >= 2.8.8.
-set(bro_HAVE_OBJECT_LIBRARIES FALSE)
+set(bro_HAVE_OBJECT_LIBRARIES YES)
 
 configure_file(version.c.in ${CMAKE_CURRENT_BINARY_DIR}/version.c)
 configure_file(util-config.h.in ${CMAKE_CURRENT_BINARY_DIR}/util-config.h)
@@ -139,7 +139,9 @@ list(APPEND BINPAC_OUTPUTS "${BINPAC_OUTPUT_CC}")
 ########################################################################
 
 set(bro_SUBDIR_LIBS CACHE INTERNAL "subdir libraries" FORCE)
+set(bro_SUBDIR_DEPS CACHE INTERNAL "subdir dependencies" FORCE)
 set(bro_PLUGIN_LIBS CACHE INTERNAL "plugin libraries" FORCE)
+set(bro_PLUGIN_DEPS CACHE INTERNAL "plugin dependencies" FORCE)
 
 add_subdirectory(analyzer)
 add_subdirectory(broker)
@@ -396,12 +398,12 @@ add_dependencies(generate_outputs generate_outputs_stage2a generate_outputs_stag
 
 # Build __load__.zeek files for standard *.bif.zeek.
 bro_bif_create_loader(bif_loader "${bro_BASE_BIF_SCRIPTS}")
-add_dependencies(bif_loader ${bro_SUBDIRS})
+add_dependencies(bif_loader ${bro_PLUGIN_DEPS} ${bro_SUBDIR_DEPS})
 add_dependencies(zeek bif_loader)
 
 # Build __load__.zeek files for plugins/*.bif.zeek.
 bro_bif_create_loader(bif_loader_plugins "${bro_PLUGIN_BIF_SCRIPTS}")
-add_dependencies(bif_loader_plugins ${bro_SUBDIRS})
+add_dependencies(bif_loader_plugins ${bro_PLUGIN_DEPS} ${bro_SUBDIR_DEPS})
 add_dependencies(zeek bif_loader_plugins)
 
 # Install *.bif.zeek.
@@ -452,6 +454,9 @@ install(FILES
         DESTINATION include/zeek/3rdparty/tsl-ordered-map
 )
 
+########################################################################
+## Unit test setup
+
 ########################################################################
 ## Clang-tidy target now that we have all of the sources
 

zeek/cmake

diff --git a/ZeekPluginStatic.cmake b/ZeekPluginStatic.cmake
index 339b0c8..2883a30 100644
--- a/ZeekPluginStatic.cmake
+++ b/ZeekPluginStatic.cmake
@@ -35,6 +35,7 @@ function(bro_plugin_end_static)
     add_dependencies(${_plugin_lib} generate_outputs)
 
     set(bro_PLUGIN_LIBS ${bro_PLUGIN_LIBS} "${_target}" CACHE INTERNAL "plugin libraries")
+    set(bro_PLUGIN_DEPS ${bro_PLUGIN_DEPS} "${_plugin_lib}" CACHE INTERNAL "plugin dependencies")
 endfunction()
 
 macro(_plugin_target_name_static target ns name)
diff --git a/ZeekSubdir.cmake b/ZeekSubdir.cmake
index 8ae987b..8c74a36 100644
--- a/ZeekSubdir.cmake
+++ b/ZeekSubdir.cmake
@@ -11,5 +11,7 @@ function(bro_add_subdir_library name)
     endif ()
 
     set(bro_SUBDIR_LIBS "${_target}" ${bro_SUBDIR_LIBS} CACHE INTERNAL "subdir libraries")
+    set(bro_SUBDIR_DEPS "bro_${name}" ${bro_SUBDIR_DEPS} CACHE INTERNAL "subdir dependencies")
     add_clang_tidy_files(${ARGN})
 endfunction()
+

After applying the patches, I get tons of duplicate symbols errors:

|| duplicate symbol 'file_analysis::Analyzer::~Analyzer()' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/entropy/CMakeFiles/plugin-Zeek-FileEntropy.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'file_analysis::Analyzer::~Analyzer()' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/entropy/CMakeFiles/plugin-Zeek-FileEntropy.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'file_analysis::Analyzer::~Analyzer()' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/entropy/CMakeFiles/plugin-Zeek-FileEntropy.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'vtable for file_analysis::Analyzer' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/entropy/CMakeFiles/plugin-Zeek-FileEntropy.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'typeinfo name for file_analysis::Analyzer' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/entropy/CMakeFiles/plugin-Zeek-FileEntropy.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'typeinfo for file_analysis::Analyzer' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/entropy/CMakeFiles/plugin-Zeek-FileEntropy.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'file_analysis::Analyzer::id_counter' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/entropy/CMakeFiles/plugin-Zeek-FileEntropy.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'file_analysis::Analyzer::SetAnalyzerTag(file_analysis::Tag const&)' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/entropy/CMakeFiles/plugin-Zeek-FileEntropy.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'file_analysis::Analyzer::~Analyzer()' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/extract/CMakeFiles/plugin-Zeek-FileExtract.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'file_analysis::Analyzer::~Analyzer()' in:
||     src/file_analysis/analyzer/data_event/CMakeFiles/plugin-Zeek-FileDataEvent.dir/__/__/Analyzer.cc.o
||     src/file_analysis/analyzer/extract/CMakeFiles/plugin-Zeek-FileExtract.dir/__/__/Analyzer.cc.o
|| duplicate symbol 'file_analysis::Analyzer::~Analyzer()' in:
--- snip ---

It's probably not hard to fix, but I didn't want to invest too much time since I don't know how deep the rabbit hole goes and what surprises are coming up after fixing that. If object libraries (and doctest) are the way we want to go, I can continue from here later - or someone else takes over. 🙂

@jsiwek
Copy link
Contributor Author

jsiwek commented Oct 29, 2019

Switching to use object libraries (and then also removing the alternate, now-unused CMake code paths) generally sounds nice. A guess at what your patch is missing is just removing the src/plugins.cc.in file? See the comment:

// Note: This won't be necessary anymore once we can assume CMake >2.8.8
// as a required depencendy. If so, switch bro_HAVE_OBJECT_LIBRARIES
// in src/CMakeLists.txt to TRUE and remove this.

A related commit:
8e7ef00, so yeah, if that wasn't the only remaining thing to do, could be a rabbit hole since we've had that code path disabled for ~6 years (and I can help look into it more).

@Neverlord
Copy link
Member

A guess at what your patch is missing is just removing the src/plugins.cc in file?

The file zeek/src/plugins.cc.in gets ignored when setting bro_HAVE_OBJECT_LIBRARIES to true. Removing it has no effect. Looking at the diff for the commit, I've also removed ${PLUGIN_INIT} from bro_SRCS to be sure, but the duplicate symbols still exist.

Commit 8e7ef00 also removed a static in src/plugin/Macros.h, but this file no longer exists. The plugin setup seems to have changed a lot, so I have no idea where to start.

@rsmmr
Copy link
Member

rsmmr commented Oct 30, 2019

No immediate thought on what's broken here with the disabled cmake path, but generally going with object libraries sounds good; that was the plan already several years ago until it became clear we couldn't because of old cmakes ... Now that we don't have that cmake restriction anymore, it'll be good to get this cleaned up and remove the work-arounds.

@jsiwek
Copy link
Contributor Author

jsiwek commented Nov 15, 2019

Initial doctest integration merged via #682

@jsiwek jsiwek closed this as completed Nov 15, 2019
Release 3.1.0 automation moved this from Unassigned / Todo to Done Nov 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Project A self-contained project — for example an intern project, a tech evaluation, or prototyping
Projects
No open projects
Release 3.1.0
  
Done
Development

No branches or pull requests

4 participants