Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Find a better way to test Trixi2Vtk #680

Open
efaulhaber opened this issue Jun 29, 2021 · 13 comments
Open

Find a better way to test Trixi2Vtk #680

efaulhaber opened this issue Jun 29, 2021 · 13 comments
Assignees

Comments

@efaulhaber
Copy link
Member

Currently, Trixi2Vtk tests are comparing hashes of the generated VTK files. This doesn't work reliably, especially not when testing on different operating systems.

@sloede
Copy link
Member

sloede commented Aug 19, 2021

I am trying to collect some ideas here, and weigh the pros and cons. Feel free to amend this post with more arguments or new ideas (and to leave a message afterwards stating what you did):

0 Do not change anything

That is, compare SHA hashes of the VTK files generated while testing against reference values.

Pro

  • Zero initial overhead (we already have this!)
  • Works for all file formats (VTI, VTK, VTS)
  • Transparent to changes to underlying data written etc. (the hash is oblivious to the file format)
  • Highly sensitive: anything changes, the hash will change

Con

  • Overly sensitive: VTK format changes or machine-accuracy differences in Trixi results already produce changed hashes even though no differences are visible to the user
  • Not portable: different hashes for different OS, different versions of zlib
  • No diff: a change in the hash only says something has change, but not what has changed
  • Not very robust & high maintenance: need to continuously update the hashes for any change --> real risk of people just updating hashes without checking files

1 Store VTK uncompressed

Achievable by passing compress=false to vtk_grid as an additional option.

Pro

  • Might circumvent the issue of different zlib implementations causing different hashes
  • Easy to implement

Con

  • Does not reflect what we use in practice (we store as compressed when running Trixi2Vtk)
  • Other disadvantages of current approach remain

2 Compare only headers

Instead of comparing the entire file and its contents, we could just compare the headers without the data. For example, for a VTK file, the file would look something like:

<?xml version="1.0" encoding="utf-8"?>
<VTKFile type="UnstructuredGrid" version="1.0" byte_order="LittleEndian" header_type="UInt64" compressor="vtkZLibDataCompressor">
  <UnstructuredGrid>
    <Piece NumberOfPoints="259997" NumberOfCells="197440">
      <Points>
        <DataArray type="Float64" Name="Points" NumberOfComponents="3" format="appended" offset="0"/>
      </Points>
      <Cells>
        <DataArray type="Int32" Name="connectivity" NumberOfComponents="1" format="appended" offset="543461"/>
        <DataArray type="Int32" Name="offsets" NumberOfComponents="1" format="appended" offset="1369328"/>
        <DataArray type="UInt8" Name="types" NumberOfComponents="1" format="appended" offset="1598133"/>
      </Cells>
      <CellData>
        <DataArray type="Float64" Name="rho" NumberOfComponents="1" format="appended" offset="1598380"/>
        <DataArray type="Float64" Name="v1" NumberOfComponents="1" format="appended" offset="2969685"/>
        <DataArray type="Float64" Name="v2" NumberOfComponents="1" format="appended" offset="4283158"/>
        <DataArray type="Float64" Name="p" NumberOfComponents="1" format="appended" offset="5822483"/>
      </CellData>
    </Piece>
  </UnstructuredGrid>
  <AppendedData encoding="raw">
  [raw encoded data]
  </AppendedData>
</VTKFile>

We could just read in the files line by line until AppendedData, then diff the string against a reference.

Pros

  • Robust against changes in the file format, the compression, or even changes to the solution
  • On error, there is a chance to find out what went wrong

Cons

  • Maybe too insensitive against, e.g., errors during interpolation
  • Requires new code for creating a suitable diff

3 Read in VTU files and compare directly

Since ParaView files are "somewhat" compatible to XML, we could just parse them as XML and then compare. Alternatively, we could try to read them in as actual VTK files.

Pros

  • Fine-grained control over sensitivity (e.g., allow differences of floating point values up to a given epsilon)

Cons

  • Tremendous effort to implement testing infrastructure
  • Need to store entire VTU files as references

@sloede
Copy link
Member

sloede commented Aug 19, 2021

My current favorite would be to try 1) and if it doesn't help enough, go for 2).

@efaulhaber
Copy link
Member Author

It looks to me like the headers of two simulations in 2) are identical if the same number of cells and the same polydeg has been used. So, the output could have completely different nodal coordinates and nodal values, and the tests would still pass, right?

@sloede
Copy link
Member

sloede commented Aug 19, 2021

It looks to me like the headers of two simulations in 2) are identical if the same number of cells and the same polydeg has been used. So, the output could have completely different nodal coordinates and nodal values, and the tests would still pass, right?

Yes, correct. Although I think it's unlikely (but not impossible) that the number of cells is "accidentally" generated correctly, this check no. 2) will tell us nothing about the contents of the individual arrays. If that's not enough, I think we'd have to go towards something like 3) and either parse the files as XML or directly write a (simplified) parser to be able to compare values one by one against a reference file.

@ranocha
Copy link
Member

ranocha commented Aug 20, 2021

We're already doing this in practice, aren't we? Speaking at least for me, whenever I see a failing hash check, I expect it to be caused by minor differences and tend to think we should just update the hash. Effectively, we could just remove the checks of the hashes and would only get some smoke tests 🤷

@sloede
Copy link
Member

sloede commented Aug 20, 2021

Yes, I agree, @ranocha - I have done the same...

In the past, I have done such tests for comparing two files by first comparing their headers and then compare the contents variable by variable. I expected equality for integer types and allowed for 1e-12 difference for floating point values.

The VTK XML format is not super sophisticated, so I guess it would be possible to write a rudimentary reader in Julia. Given that we will likely always rely on ParaView for 3D visualization, I think the question about a more sophisticated test routine is not "if" but rather "when".

@sloede
Copy link
Member

sloede commented Aug 22, 2021

To support this discussion at the next meeting, I went ahead and implemented a rudimentary VTK XML reader: ReadVTK.jl. It is capable of reading in all unstructured VTK files produced by Trixi2Vtk, i.e., those files ending on .vtu.

Thus, it should be sufficient for comparing VTK files against references during testing, should we decide to go down this road. If necessary, adding support for the remaining file types wouldn't be too hard either.

@sloede
Copy link
Member

sloede commented Aug 24, 2021

The consensus at today's Trixi meeting was to rewrite Trixi2Vtk.jl testing to use ReadVTK.jl for comparing generated VTK files against reference files. Maybe @FelipeSantillan will be able to help us with this.

@sloede
Copy link
Member

sloede commented Oct 20, 2021

@FelipeSantillan @ranocha @jlchan

This is what I had in mind for ReadVTK.jl-based testing (but feel free to comment/improve):

Rewrite test_trixi2vtk such that it takes the names of reference VTK files against which to test. Then, for each reference file/test file (we should probably adopt a unique naming convention and stick to it), run the following checks (in this order) in a separate function:

  • Compare if expected file exists
  • Compare headers
    • meta data such as file type, version, byte order, compressor should match
    • number of cells and points should match as well
  • Compare points
    • Shape should match
    • Values should match approximately (FP comparison)
  • Compare cells
    • Connectivity, offsets, cell types should match exactly (all integers IIRC)
  • Compare {point, cell} data
    • Number of variables should match
    • For each variable in the reference file, a variable should exist in the file to test and vice versa (i.e., no missing, no extra variables)
    • For each variable in the reference file, compare to the variable in the file to test:
      • Length should match
      • Types should match
      • Values should match approximately for floating point data, exactly for all other data

A possible API for said test function could be

function compare_vtk_files(reference_file, vtk_file, atol=1e-12)
  ...
end

where atol (and possible rtol?) allow to specify tolerances for floating point comparisons.

Each test category should be in its own @testset such that an error can immediately be tracked to the offending part of the VTK file. Equally important: If possible a failed test should not result in the remaining tests to be skipped (unless they are rendered impossible, e.g., when a variable is missing). This way, you get the full picture of failures after a single run and don't have to step through it multiple times.

The code that compares and checks a single file against its reference (i.e., compare_vtk_files above) should be stateless and usable outside the testing framework, such that it can be used to also manually test VTK files against references. Maybe it could even become an exported function from Trixi2Vtk?

I would put the reference files in a repo, e.g., Trixi2Vtk_reference_files (or Trixi2Vtk_test_files). In the repo, I would probably create a folder structure that mimics the test structure with a clear rule to translate from test name to folder, e.g., "2D" -> "TreeMesh" -> "uniform test" to 2d/treemesh/uniform-test etc. In the test scripts of Trixi2Vtk, I would store the latest commit hash from that reference file repo for which to retrieve all reference files, such that files can be overwritten/deleted without destroying testing for older versions.

Now, these are my thoughts that I had when starting to write ReadVTK.jl, based on experience from other codes I used to work with, where we compared results against reference files. Please note that these are merely taken to be suggestions, and I might very well have missed something, but maybe this can be used to start a conversation and give @FelipeSantillan a starting point.

@sloede
Copy link
Member

sloede commented Oct 20, 2021

Regarding the ReadVTK.jl package registration: For the time being, @FelipeSantillan could install ReadVTK.jl locally as a dev version and develop initial testing capabilities based on it. Once he finds that the interface seems to work well enough, we need to register it as a package such that it can be used for CI testing. Alternatively, we could just register v0.1.0 right now, and then do a breaking release v0.2 (if necessary) once Felipe has gathered some experience with it.

The former approach might be slightly cleaner, the latter is likely more convenient. What do you think?

@ranocha
Copy link
Member

ranocha commented Oct 20, 2021

Your overall proposal sounds good to me - thanks 👍
If ReadVTK.jl is already in a "somewhat usable" shape, I think it's okay to register it right now. That way, we will not have to wait for three days when everything is fine.

@sloede
Copy link
Member

sloede commented Oct 20, 2021

Your overall proposal sounds good to me - thanks 👍 If ReadVTK.jl is already in a "somewhat usable" shape, I think it's okay to register it right now. That way, we will not have to wait for three days when everything is fine.

OK. I think it is very well in a usable state right now. I'll check again, though, and then register it as a new package if it seems OK.

@ranocha
Copy link
Member

ranocha commented Oct 20, 2021

Moreover, we should copy the elixirs we use for testing Trixi2Vtk.jl over to Trixi2Vtk.jl - the exact content of elixirs shouldn't be part of the public API of Trixi.jl, I think. This has already caused problems in trixi-framework/Trixi2Vtk.jl#46

ranocha added a commit to trixi-framework/Trixi2Vtk.jl that referenced this issue Jan 13, 2022
This allows us to merge simple improvements such as #50 without tedious and useless work.

Xref trixi-framework/Trixi.jl#680
ranocha added a commit to trixi-framework/Trixi2Vtk.jl that referenced this issue Jan 13, 2022
This allows us to merge simple improvements such as #50 without tedious and useless work.

Xref trixi-framework/Trixi.jl#680
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants