Add reader for renishaw .wdf format #55

pietsjoh · 2022-10-27T14:35:11Z

This reader is based on py-wdf-reader with extra insights from gwyddion.
Compared to py-wdf-reader, more metadata is extracted (PSET Blocks).
Moreover, changes are made to meet rosettasciio's format.

Progress of the PR

Minimal example of the bug fix or the new feature

from rsciio.renishaw import file_reader
file_reader("file.wdf")

codecov · 2022-10-27T14:38:27Z

Codecov Report

Patch coverage: 88.35% and project coverage change: +0.11 🎉

Comparison is base (5b43a34) 85.33% compared to head (fd29c2c) 85.45%.

❗ Current head fd29c2c differs from pull request most recent head 87a1917. Consider uploading reports for the commit 87a1917 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #55      +/-   ##
==========================================
+ Coverage   85.33%   85.45%   +0.11%     
==========================================
  Files          73       75       +2     
  Lines        9076     9747     +671     
  Branches     2053     2140      +87     
==========================================
+ Hits         7745     8329     +584     
- Misses        860      926      +66     
- Partials      471      492      +21

Impacted Files	Coverage Δ
rsciio/renishaw/_api.py	`88.29% <88.29%> (ø)`
rsciio/renishaw/__init__.py	`100.00% <100.00%> (ø)`

... and 19 files with indirect coverage changes


github-advanced-security

CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

rsciio/tests/test_renishaw.py

rsciio/tests/generate_renishaw_test_file.py

rsciio/renishaw/_api.py

sem-geologist · 2023-04-25T11:40:53Z

@pietsjoh, what is still missing for this to be ready for review?

sem-geologist · 2023-04-26T09:05:40Z

I see that some lines in the code are marked with caveats that there could be unknown and unreadable structures. Is this based on reverse engineering? I think this format is potentially a moving target. In that case it would be easier to write the parser with kaitai_struct, as it is much easier to adapt the parsing when the binary file specification gets updated by the OEM.

I have experience reverse engineering such a moving target (e.g. https://github.com/sem-geologist/peaksight-binary-parser). I think it would then be useful to communicate with gwyddion and have a single reverse-engineered implementation written in kaitai, which gwyddion would compile into a C++ library and which this project would use compiled to Python. BTW, there are some free kaitai_struct descriptions for JPEG and EXIF parsing ready, which could then be reused without needing to add PIL as an optional dependency.

I saw that PIL is checked as an optional dependency to extract the JPEG and EXIF data. What about using imageio instead, as that is already one of the requirements for this library?

rsciio/renishaw/_api.py

+            detector["integration_time"] = (
+                exposure_per_frame * detector["frames"] / 1000
+            )  # s
+        except TypeError:


pietsjoh · 2023-04-26T14:39:24Z

I see that some lines in the code are marked with caveats that there could be unknown and unreadable structures. Is this based on reverse engineering?

Yes, this format is reverse engineered. I based my work on the py-wdf-reader with some extra inspiration from gwyddion.

I think this format is potentially a moving target. In that case it would be easier to write the parser with kaitai_struct, as it is much easier to adapt the parsing when the binary file specification gets updated by the OEM.

I have experience reverse engineering such a moving target (e.g. https://github.com/sem-geologist/peaksight-binary-parser). I think it would then be useful to communicate with gwyddion and have a single reverse-engineered implementation written in kaitai, which gwyddion would compile into a C++ library and which this project would use compiled to Python.

I have no experience with kaitai yet, but this might be an option in the future. However, I don't know if Renishaw changes their format often enough for this to be useful.

I saw that PIL is checked as an optional dependency to extract the JPEG and EXIF data. What about using imageio instead, as that is already one of the requirements for this library?

My implementation is close to how py-wdf-reader reads the EXIF tags. I tried using imageio instead, but couldn't get it to work. The picture with the EXIF data is still included in the metadata (even when PIL is not installed), so I figured that it is not a big problem if I leave it like this. But I can give it another go and try to use imageio for reading the EXIF tags.

@pietsjoh, what is still missing for this to be ready for review?

The main issue is licensing, as I based my work on the py-wdf-reader. I just asked in alchem0x2A/py-wdf-reader#43 how to handle this. Moreover, I asked Renishaw directly for some insights on some of the TODOs/questions I have. But this could also be implemented in a different PR.

jlaehne · 2023-04-26T19:49:38Z

@pietsjoh, what is still missing for this to be ready for review?

The main issue is licensing, as I based my work on the py-wdf-reader. I just asked in alchem0x2A/py-wdf-reader#43 how to handle this. Moreover, I asked Renishaw directly for some insights on some of the TODOs/questions I have. But this could also be implemented in a different PR.

I would consider this ready to review ... the remaining questions can be handled along the way :-)

docs/supported_formats/renishaw.rst

pietsjoh · 2023-04-28T14:42:04Z

I tried loading the exif tags with imageio and remember now why I stayed with PIL. imageio returns exif as a binary sequence, whereas PIL returns exif as a dict (tag number as key and tag as value). I can easily deal with the dict. However, I would need to write an extra parser for the binary sequence.
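
To illustrate what such an extra parser would involve: an EXIF payload is laid out like a TIFF file (byte-order mark, magic number 42, then image file directories made of 12-byte entries). The sketch below is not code from this PR. The function name `parse_exif_ifd` and the hand-built `demo` payload are hypothetical, it assumes any leading `Exif\0\0` marker has already been stripped, and it only handles the BYTE, ASCII, SHORT and LONG field types of the first directory.

```python
import struct

# Byte size and struct code for the few TIFF field types handled in this sketch.
_TYPES = {1: (1, "B"), 2: (1, "c"), 3: (2, "H"), 4: (4, "I")}


def parse_exif_ifd(blob: bytes) -> dict:
    """Parse the first IFD of a TIFF-structured EXIF payload into {tag: value}."""
    order = {b"II": "<", b"MM": ">"}[blob[:2]]  # little- or big-endian
    magic, ifd_offset = struct.unpack(order + "HI", blob[2:8])
    if magic != 42:
        raise ValueError("not a TIFF/EXIF payload")
    (n_entries,) = struct.unpack_from(order + "H", blob, ifd_offset)
    tags = {}
    for i in range(n_entries):
        entry = ifd_offset + 2 + 12 * i
        tag, typ, count = struct.unpack_from(order + "HHI", blob, entry)
        if typ not in _TYPES:
            continue  # other types (e.g. RATIONAL) are skipped in this sketch
        size, code = _TYPES[typ]
        # Values of up to 4 bytes are stored inline, larger ones at an offset.
        if size * count <= 4:
            start = entry + 8
        else:
            (start,) = struct.unpack_from(order + "I", blob, entry + 8)
        raw = blob[start:start + size * count]
        if typ == 2:  # ASCII, NUL-terminated
            tags[tag] = raw.rstrip(b"\x00").decode("ascii")
        else:
            values = struct.unpack(order + code * count, raw)
            tags[tag] = values[0] if count == 1 else values
    return tags


# Hand-built little-endian payload with two entries (synthetic example data).
make = b"Example\x00"
demo = (
    b"II" + struct.pack("<HI", 42, 8)                 # header, first IFD at offset 8
    + struct.pack("<H", 2)                            # number of entries
    + struct.pack("<HHIHH", 0x0112, 3, 1, 1, 0)       # Orientation: inline SHORT
    + struct.pack("<HHII", 0x010F, 2, len(make), 38)  # Make: ASCII stored at offset 38
    + struct.pack("<I", 0)                            # no further IFD
    + make
)
tags = parse_exif_ifd(demo)
```

The result has the same shape as the dict described above (tag number as key, tag value as value), which is what makes the PIL route the shorter one.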

pietsjoh · 2023-06-05T14:36:27Z

pre-commit.ci autofix

jlaehne · 2023-06-05T21:06:40Z

@ericpre: from my side, this is ready to be merged

pietsjoh mentioned this pull request Oct 27, 2022

Contributors / Maintainers of this repository? alchem0x2A/py-wdf-reader#43

Open

jlaehne mentioned this pull request Nov 2, 2022

Implementation of Spectroscopy File Readers LumiSpy/lumispy#130

Open

7 tasks

jlaehne added type: new format status: WIP labels Nov 2, 2022

pietsjoh force-pushed the renishaw branch from 0f224af to e05f2fc Compare January 12, 2023 13:21

github-advanced-security bot found potential problems Jan 12, 2023


pietsjoh force-pushed the renishaw branch 2 times, most recently from 011a6bc to 8983449 Compare January 23, 2023 11:54

github-advanced-security bot found potential problems Jan 26, 2023


pietsjoh force-pushed the renishaw branch from 031b4cb to 5d15f24 Compare January 26, 2023 12:43

pietsjoh force-pushed the renishaw branch 3 times, most recently from 3f735a4 to 9e3882e Compare February 17, 2023 15:13

github-advanced-security bot found potential problems Feb 17, 2023

rsciio/tests/generate_renishaw_test_file.py (fixed)

pietsjoh force-pushed the renishaw branch from 0904a25 to 767d2a5 Compare February 17, 2023 15:38

github-advanced-security bot found potential problems Mar 3, 2023

rsciio/renishaw/_api.py (2 alerts, fixed)

github-advanced-security bot found potential problems Mar 16, 2023

rsciio/renishaw/_api.py (fixed)

pietsjoh force-pushed the renishaw branch 2 times, most recently from d18d146 to 17d1987 Compare April 26, 2023 13:32

github-advanced-security bot found potential problems Apr 26, 2023


rsciio/renishaw/_api.py

            detector["integration_time"] = (
                exposure_per_frame * detector["frames"] / 1000
            )  # s
        except TypeError:

Check notice: Code scanning / CodeQL, "Empty except" (Note)

'except' clause does nothing but pass and there is no explanatory comment.
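
One way to address a notice like this is to make the fallback explicit and document why the exception is expected. The sketch below is only an illustration, not the change that was actually committed: the helper name `integration_time_s` is hypothetical, and it assumes the exposure is stored in milliseconds (suggested by the `/ 1000` and `# s` in the snippet) and that a missing value shows up as `None`.

```python
def integration_time_s(exposure_per_frame_ms, frames):
    """Return the total integration time in seconds, or None if a value is missing."""
    try:
        return exposure_per_frame_ms * frames / 1000  # ms -> s
    except TypeError:
        # At least one input is None, i.e. the value was not found in the
        # file's metadata, so the integration time cannot be derived.
        return None


complete = integration_time_s(500, 4)
missing = integration_time_s(None, 4)
```

Returning `None` with a comment keeps the behaviour of a silent `pass` while telling the reader (and the scanner) why the exception is harmless.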

jlaehne marked this pull request as ready for review April 26, 2023 19:48

jlaehne added status: needs review and removed status: WIP labels Apr 26, 2023

jlaehne reviewed Apr 26, 2023

docs/supported_formats/renishaw.rst (resolved)

pietsjoh added 22 commits June 5, 2023 15:49

33e725b  Refactor setting navigation axes
792b01f  Add tests for Z-Scan and unspecified file
35da94a  Map metadata
f6e2799  Add tests metadata/original_metadata (spectrum)
156abf5  Extend documentation
3de9a1d  Implement linexy as distance, reorder functions
b4a4eb1  Revert signal axis, refactor init + parse_WDF1
656bebd  Extend test suite (mostly original_metadata)
12b1523  Resolve CodeQL warnings
d2009d9  Resolve CodeQL warnings
10a6b22  Add support for time and focus_track axes
232630b  Add extra tests for metadata parser
ff7704a  Match metadata from WXDM Block
4fa4f5e  Map wxdm metadata
2200ccb  Implement code scanning suggestions
48359b6  Add tests for integration time
736dae7  Add load_unmatched_metadata option
39c527d  Fix tests
741c74d  Update docs/comments, add default value for enums
ce2b6e6  Fix typos in documentation
0f15ff3  Change license format according to softwarefreedom
87a1917  Update test file locations

pietsjoh force-pushed the renishaw branch from fd29c2c to 87a1917 Compare June 5, 2023 14:01

jlaehne approved these changes Jun 5, 2023


ericpre merged commit 525fbfe into hyperspy:main Jun 5, 2023
29 of 30 checks passed

ericpre removed the status: needs review label Jun 5, 2023

ericpre added this to the v0.1.0 initial release milestone Jun 5, 2023

ericpre mentioned this pull request Jun 6, 2023

Add test for DM4: complex packed (FFTs) #135

Merged

2 tasks
