Automatic testing of releases vs. previous one #6

Open
GoogleCodeExporter opened this issue Aug 12, 2015 · 10 comments
Assignees
Labels
DevOps Testing, deployment, automation pri-Medium Worth assigning to a milestone
Milestone

Comments

@GoogleCodeExporter

Implement automatic testing of releases with a wide range of command line
options. Results of simulations should be automatically parsed and compared
to fixed references. This should decrease the chance of introducing new
bugs into old parts/features of the code during further development.

Original issue reported on code.google.com by yurkin on 26 Nov 2008 at 6:54
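
A minimal sketch of what such reference-based checking could look like (not ADDA's actual test code; the executable call, the parsed quantity, the file name, and the tolerance are hypothetical placeholders):

```python
# Hypothetical sketch: run one command-line variant and compare a parsed value
# against a stored reference within a tolerance. All names are examples only.
import re
import subprocess

def parse_value(path, key):
    """Return the first floating-point number on the line starting with `key`."""
    number = re.compile(r"[-+]?\d+\.?\d*(?:[eE][-+]?\d+)?")
    with open(path) as f:
        for line in f:
            if line.startswith(key):
                return float(number.search(line).group())
    raise ValueError(f"{key} not found in {path}")

def check_case(cmdline, result_file, key, reference, rel_tol=1e-8):
    subprocess.run(cmdline, check=True)       # e.g. ["./adda", "-grid", "16"]
    value = parse_value(result_file, key)     # e.g. ("CrossSec-Y", "Cext")
    if abs(value - reference) > rel_tol * abs(reference):
        raise AssertionError(f"{key}: got {value}, expected {reference}")
```

The fixed references would then be stored alongside the list of tested command lines, one set per variant.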

@GoogleCodeExporter GoogleCodeExporter added OpSys-All comp-Scripts Related to Makefiles, wrappers, developer and testing scripts pri-Medium Worth assigning to a milestone labels Aug 12, 2015
@GoogleCodeExporter
Author

Original comment by yurkin on 26 May 2009 at 7:29

  • Added labels: Milestone-0.80

@GoogleCodeExporter
Author

Original comment by yurkin on 12 Jul 2010 at 8:35

  • Added labels: Milestone-1.1
  • Removed labels: Milestone-0.80

@GoogleCodeExporter
Author

Another possibility is to run two executables (e.g. the current version versus the
previous release). The specified files from the produced results (e.g. stdout, log,
CrossSec-Y) are compared (identical or not). If a difference is found, the files are
sent to a graphical diff program (e.g. TortoiseMerge), so one can quickly evaluate
whether the difference is significant or not. Typical examples of the latter are
differences in the last digits or in values that are negligibly small (analytical
zeros). I am currently using this technique while preparing release 1.0.

The advantage is that there is no need either to parse parameters or to store a
database of benchmark results. Thus a vast range of command-line parameters can
easily be covered just by storing all variants of the command line.

Disadvantages (i.e. advantages of the original idea):
1) Only parsing of the results allows complete flexibility over, e.g., which values
to compare and to what accuracy (i.e. what difference is considered natural and what
is suspicious). The newly proposed method only allows choosing which files to
compare. Thus it does not seem possible (with the new method) to build a fully
automatic test suite, i.e. one that requires no user intervention when the program
is working fine.

2) It will not allow testing the compilation result on a new platform, because in
that case a (verified) executable of the previous release is not readily available.

Original comment by yurkin on 5 Sep 2010 at 2:14
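
For illustration, a minimal sketch of this two-executable scheme (not the actual test script; the executable paths, the output-file layout, and the diff viewer are assumptions, and each run is assumed to write its files directly into its working directory):

```python
# Hypothetical sketch: run old and new builds with the same command line,
# compare selected output files byte-by-byte, and open a GUI diff only when
# they differ. Paths, file names and the diff tool are placeholders.
import filecmp
import subprocess
from pathlib import Path

EXECS = {"old": "../adda_prev", "new": "../adda_new"}  # placeholder executables
FILES_TO_COMPARE = ["stdout", "log", "CrossSec-Y"]     # as listed in the comment
GUI_DIFF = "meld"                                      # any graphical diff program

def run_case(cmdline_args):
    for name, exe in EXECS.items():
        workdir = Path(name)
        workdir.mkdir(exist_ok=True)
        with open(workdir / "stdout", "w") as out:
            # separate working directories keep the two sets of results apart
            subprocess.run([exe, *cmdline_args], cwd=workdir, stdout=out, check=True)
    for fname in FILES_TO_COMPARE:
        a, b = Path("old") / fname, Path("new") / fname
        if not filecmp.cmp(a, b, shallow=False):
            # let the user judge whether the difference is significant
            subprocess.run([GUI_DIFF, str(a), str(b)])
```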

@GoogleCodeExporter
Author

The test suite described in the previous comment has been added to the repository
after some improvements. See r1021. The main script is at tests/2exec/comp2exec.
The comments inside it should be sufficient to understand its usage.

Currently it produces a lot of "false alarms" due to round-off errors, but it can
still be used to perform an extensive test set in a short time (if a GUI diff
program is used). The next step should be to replace the literal comparison of
number-rich files (like mueller or CrossSec) by computing differences of the
numbers and comparing them to some threshold. An example of such an implementation
is provided by the test scripts of the near_field package (misc/near_field/RUNTESTS/).

r1021 - 7a89d7b

Original comment by yurkin on 16 Feb 2011 at 8:16
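
The numeric comparison mentioned above could look roughly like this (a sketch under the assumption that the compared files are whitespace-separated mixtures of labels and numbers; the thresholds are arbitrary placeholders):

```python
# Hypothetical sketch of a numeric-aware file comparison: instead of a literal
# diff, parse both files as whitespace-separated tokens and report only
# numeric differences exceeding a relative/absolute threshold.
def read_tokens(path):
    tokens = []
    with open(path) as f:
        for line in f:
            for token in line.split():
                try:
                    tokens.append(float(token))
                except ValueError:
                    tokens.append(token)  # keep non-numeric tokens for exact match
    return tokens

def files_agree(path_a, path_b, rel_tol=1e-10, abs_tol=1e-15):
    a, b = read_tokens(path_a), read_tokens(path_b)
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if isinstance(x, float) and isinstance(y, float):
            if abs(x - y) > max(abs_tol, rel_tol * max(abs(x), abs(y))):
                return False
        elif x != y:          # headers and labels must match exactly
            return False
    return True
```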

@GoogleCodeExporter
Author

Original comment by yurkin on 22 Apr 2011 at 2:40

  • Added labels: Milestone-1.2
  • Removed labels: Milestone-1.1

@GoogleCodeExporter
Author

r1107 greatly improves the performance of the 2exec tests. It is now possible to run
them almost unattended to perform a thorough list of tests, so this solves the
problem of testing new releases against previous ones.

Creating a test that does not use a reference executable is still desirable, but its
priority is not that high.

r1107 - 60ed73c

Original comment by yurkin on 9 Feb 2012 at 3:45

  • Added labels: Priority-Medium
  • Removed labels: Milestone-1.2, Priority-High

@myurkin myurkin added task Not directly related to code (e.g., documentation) and removed Type-Task labels Aug 13, 2015
@myurkin myurkin self-assigned this Nov 12, 2015
@myurkin myurkin assigned dsmunev and unassigned myurkin Jul 10, 2018
@myurkin myurkin added this to the 1.5 milestone Jul 10, 2018
@myurkin
Member

myurkin commented Oct 14, 2020

Currently the scripts in tests/2exec do not test the fallback (reserve) FFT options (FFT_TEMPERTON and CLFFT_APPLE). This seems easy to implement, e.g. in tests/2exec/test_all, by using the FFTCOMP definition in comp2exec.
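
A rough sketch of such a loop (how comp2exec actually consumes FFTCOMP is not specified here, so the environment-variable mechanism below is purely an assumption):

```python
# Purely illustrative: invoke the comparison script once per FFT option.
# Whether comp2exec reads FFTCOMP from the environment, from an argument, or
# needs patching instead is an assumption of this sketch.
import os
import subprocess

FFT_OPTIONS = ["FFT_TEMPERTON", "CLFFT_APPLE"]  # the fallback options named above

for fft in FFT_OPTIONS:
    env = dict(os.environ, FFTCOMP=fft)
    subprocess.run(["./comp2exec"], cwd="tests/2exec", env=env, check=True)
```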

@myurkin
Member

myurkin commented Oct 14, 2020

A few more ideas:

  • Make convenient comparison with a previous (semi-stable) version possible, similar to what is now possible with the previous release. This can be accompanied by an automatic build script based on a specific version hash.
  • For the non-interactive regime, output some error statistics at the end (how many differences were found), as sketched below. This is slightly related to Color in stdout/stderr #254.
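
A possible shape for that summary (hypothetical; the per-file results would come from whatever comparison routine the script already uses):

```python
# Hypothetical sketch: count comparison outcomes during a non-interactive run
# and print a short summary at the end instead of relying on a GUI diff.
from collections import Counter

stats = Counter()

def record(filename, differs):
    stats["total"] += 1
    if differs:
        stats["differ"] += 1
        print(f"DIFF: {filename}")

def print_summary():
    matched = stats["total"] - stats["differ"]
    print(f"Compared {stats['total']} files: {matched} matched, "
          f"{stats['differ']} differed.")
```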

@myurkin
Member

myurkin commented Dec 2, 2020

Probably, GitHub Actions can somehow be used for that.

@myurkin myurkin added DevOps Testing, deployment, automation and removed task Not directly related to code (e.g., documentation) comp-Scripts Related to Makefiles, wrappers, developer and testing scripts labels Mar 18, 2022
@myurkin myurkin changed the title Automatic testing of releases Automatic testing of releases vs. previous one Mar 21, 2022
@myurkin
Member

myurkin commented Nov 30, 2023

This paper seems to be relevant for ADDA testing (in general):
Ding J., Hu X.-H., and Gudivada V. A machine learning based framework for verification and validation of massive scale image data, IEEE Trans. Big Data 7, 451–467 (2017).

At least, it can be used to understand modern terminology in computer science. For instance, metamorphic testing (when no true solution is known) is a relevant term.
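
As a generic illustration of that term (not tied to ADDA), a metamorphic test checks a known relation between outputs for related inputs instead of comparing against a true solution:

```python
# Generic illustration of a metamorphic relation: no exact reference value is
# needed; instead, outputs for related inputs must satisfy a known relation.
import math

def metamorphic_check(f, x, tol=1e-12):
    # relation: f(x) == f(pi - x) holds for f = sin, whatever the "true" value is
    return abs(f(x) - f(math.pi - x)) <= tol

assert metamorphic_check(math.sin, 0.3)
```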
