Change `ScalarType` to `U8` for colors in lookup tables stored in binary files #47

Siimeloni · 2024-08-15T12:20:05Z

When trying to parse a binary .vtk file the program had an error in the Vtk::parse_legacy_be() function.

2024-08-15T11:56:05.184570Z ERROR GST convert failed error=
   0: Parse error: Complete

The file has following structure:

# vtk DataFile Version 3.0
vtk output
BINARY
DATASET UNSTRUCTURED_GRID
POINTS 2804 float
---data---

CELL_TYPES 3633
--- data---

CELL_DATA 3633

SCALARS colors float 1
LOOKUP_TABLE color_table
---data---

LOOKUP_TABLE color_table 2
---data---
contains (in binary, encoded as u8):
1 0 0 1
1 1 1 1
-----------

The error happens in the attribute_lookup_table() function.

It was not possible to parse this file because vtkio was expecting the color values (r, g, b, a) to be encoded in f32 binary and not in u8. Since we are able to write these files on our own we tested output with the color values written as f32, but on import ParaView would throw an error. This does not happen with the u8 color values, so we assume that u8 is the correct encoding here.

When changing the appropriate location in vtkio we are able to successfully parse these files.

…ary files

elrnv · 2024-08-16T05:09:37Z

This looks like a bug, nice find and thank you for the PR!
The correct behavior is in the spec and is thankfully consistent with your comment:

The definition of color scalars (i.e., unsigned char values directly mapped to color) varies depending upon the number of values (nValues) per scalar. If the file format is ASCII, the color scalars are defined using nValues float values between (0,1). If the file format is BINARY, the stream of data consists of nValues unsigned char values per scalar value.

We will need to parse as float for ASCII and u8 for binary. Could you update the PR to do this and add a test? I would be happy to accept.

MrMuetze · 2024-08-20T08:59:56Z

Heyo 👋 I've been looking into this issue together with @Siimeloni and we've found a couple of things that probably make this fix a bit more involved. We wanted to summarize our findings so far to maybe get some input from @elrnv and to just have something to come back to in the future.

Changing the parser part is only half of the required fix here as we also need to adjust the writer side. We have mostly looked at the cube_complex_test() since it covers binary and ASCII for parsing and writing. For the parser, the fix can be done as described by @elrnv and correctly parses F32 for ASCII input and U8 for binary input.

Here we run into the first issue. Since the cube_complex_test() tests the writer and the parser (and the writer hasn't been adjusted so far), the test fails because for binary output, the writer still writes F32 values. Thus the parser does not encounter U8 which leads to a test failure.

Then I started to look into the writer code more deeply. I guess that, up until this point, the assumptions under which the VTK model and the parser/writer infrastructure haven been devised include that all attributes for a given dataset always remain in the same type independent of the input/output format. The writer functions are relatively rigid in regards to the output type. They just write whatever buffer they have been fed. For this specific lookup table we would have to decide whether to write F32 or U8 based on the output type. This is something that cannot be easily done right away (if I haven't overlooked anything).

Since I wanted to try something anyway, I hacked something into the writer code (adjusting

vtkio/src/writer.rs

Lines 881 to 883 in 02d9100

    
           IOBuffer::F32(v) => { 
        
               write_buf_impl(v, &mut self.0, W::write_f32::<BO>)?; 
        
           }

) to convert F32 input to U8 if the length would match the lookup table data from the test. This leads to the writer/parser combination working nicely, but of course that is not a practical solution. Additionally, the test still would not work because a difference in the VTK Model (test expects F32, but now we parse a U8 buffer).

This might be another issue that came up. With the changes described above, the VTK model might contain an IOBuffer::F32 or an IOBuffer::U8 for the lookup table, while still representing the same kind of data. So based on input/output types, we might have to do one of the following conversions when writing the model again:

F32 -> F32
F32 -> U8
U8 -> F32
U8 -> U8

I'm not sure how practical it is that the VTK model is directly dependent on the input data type at this point. Maybe it would make sense to alway convert this lookup table data to either F32 or U8? Another question is if we should adjust all writer functions to be more flexible in terms of their output or if it would be more sensible to add new functions that handle this specific lookup table case? This is the point where we would like to get some feedback. Maybe we have overlooked something up to this point or are not seeing the best solution. We would love to hear your thoughts on this!

Best regards

This is implemented in the writer and parser.

elrnv · 2024-08-22T05:57:25Z

Good point. I added an writer implementation here: GiGainfosystems#1 . Please review to see if I missed anything.

Siimeloni scalartype color lookup

MrMuetze · 2024-09-03T07:03:31Z

I've incorporated the remaining changes discussed on the other PR and I think this should be ready now. Thanks again for your input and work on this!

elrnv · 2024-09-04T17:04:10Z

Just one comment above. Otherwise ready to merge. Thank you, @MrMuetze, for adding these changes here!

MrMuetze · 2024-09-04T18:48:24Z

Cool, Im not sure what comment you are referring to though. 😅 Am I not seeing something on my end?

elrnv · 2024-09-04T17:02:54Z

src/writer.rs

@@ -869,8 +863,8 @@ mod write_vtk_impl {
            }

            match buf {
-                IOBuffer::Bit(v) => write_buf_impl(v, &mut self.0, |x| x * 255 as u8)?,


Could we change this to if x != 0 { 255 } else { 0 } to make it explicit. I think all of these other than u8 and f32 are just trying to predict the intent of the user who may have made a mistake. Since Bit is either zero or 1, I think it's more appropriate to have 1 correspond to the largest byte value.

I think this doesn't apply anymore. I've removed the * 255 part here (see GiGainfosystems#1 (comment)). My reasoning was that the bit case also includes a Vec<u8> (which is a bit curious). So I handled it like the other match arm.

pub enum IOBuffer { /// Bit array is stored in 8 bit chunks. Bit(Vec<u8>), ... }

So for this type of conversion only the float values need to be multiplied by 255.

Oh yes, you are absolutely right, thank you for that reference!

elrnv · 2024-09-04T23:14:59Z

sorry about that, I forgot to submit the review! Do you see the comment now?

Change ScalarType to U8 for colors in lookup tables stored in bin…

dac0525

…ary files

Add support for Ascii and Binary color scalars

b3c1d84

This is implemented in the writer and parser.

MrMuetze and others added 3 commits September 2, 2024 15:34

Merge pull request #1 from elrnv/Siimeloni-scalartype-color-lookup

7311794

Siimeloni scalartype color lookup

Remove unused error, add new tests, fix one writer implementation

85574ae

Merge branch 'master' into Siimeloni-patch-binary-color-fix

569e7d3

elrnv requested changes Sep 4, 2024

View reviewed changes

elrnv merged commit 661f7cd into elrnv:master Sep 6, 2024
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Change `ScalarType` to `U8` for colors in lookup tables stored in binary files #47

Change `ScalarType` to `U8` for colors in lookup tables stored in binary files #47

Siimeloni commented Aug 15, 2024

elrnv commented Aug 16, 2024

MrMuetze commented Aug 20, 2024

elrnv commented Aug 22, 2024

MrMuetze commented Sep 3, 2024

elrnv commented Sep 4, 2024

MrMuetze commented Sep 4, 2024

elrnv Sep 4, 2024

MrMuetze Sep 6, 2024

elrnv Sep 6, 2024

elrnv commented Sep 4, 2024

Change ScalarType to U8 for colors in lookup tables stored in binary files #47

Change ScalarType to U8 for colors in lookup tables stored in binary files #47

Conversation

Siimeloni commented Aug 15, 2024

elrnv commented Aug 16, 2024

MrMuetze commented Aug 20, 2024

elrnv commented Aug 22, 2024

MrMuetze commented Sep 3, 2024

elrnv commented Sep 4, 2024

MrMuetze commented Sep 4, 2024

elrnv Sep 4, 2024

Choose a reason for hiding this comment

MrMuetze Sep 6, 2024

Choose a reason for hiding this comment

elrnv Sep 6, 2024

Choose a reason for hiding this comment

elrnv commented Sep 4, 2024

Change `ScalarType` to `U8` for colors in lookup tables stored in binary files #47

Change `ScalarType` to `U8` for colors in lookup tables stored in binary files #47