Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add reader for renishaw .wdf format #55

Merged
merged 32 commits into from
Jun 5, 2023
Merged

Conversation

pietsjoh
Copy link
Contributor

@pietsjoh pietsjoh commented Oct 27, 2022

This reader is based on py-wdf-reader with extra insights from gwyddion.
Compared to py-wdf-reader, more metadata is extracted (PSET Blocks).
Moreover, changes are made to meet rosettasciio's format.

Progress of the PR

  • Adapt to general rosettasciio structure
  • Extend metadata extraction
  • add tests
  • map metadata
  • look at WHTL, WMAP parsing
  • update docstring (if appropriate)
  • update user guide (if appropriate)
  • add an changelog entry in the upcoming_changes folder (see upcoming_changes/README.rst)
  • Check formatting changelog entry in the readthedocs doc build of this PR (link in github checks)
  • ready for review

Minimal example of the bug fix or the new feature

from rsciio.msa import api
api.file_reader("file.wdf")

@codecov
Copy link

codecov bot commented Oct 27, 2022

Codecov Report

Patch coverage: 88.35% and project coverage change: +0.11 🎉

Comparison is base (5b43a34) 85.33% compared to head (fd29c2c) 85.45%.

❗ Current head fd29c2c differs from pull request most recent head 87a1917. Consider uploading reports for the commit 87a1917 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #55      +/-   ##
==========================================
+ Coverage   85.33%   85.45%   +0.11%     
==========================================
  Files          73       75       +2     
  Lines        9076     9747     +671     
  Branches     2053     2140      +87     
==========================================
+ Hits         7745     8329     +584     
- Misses        860      926      +66     
- Partials      471      492      +21     
Impacted Files Coverage Δ
rsciio/renishaw/_api.py 88.29% <88.29%> (ø)
rsciio/renishaw/__init__.py 100.00% <100.00%> (ø)

... and 19 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

@pietsjoh pietsjoh force-pushed the renishaw branch 2 times, most recently from 011a6bc to 8983449 Compare January 23, 2023 11:54
rsciio/tests/test_renishaw.py Fixed Show fixed Hide fixed
rsciio/tests/test_renishaw.py Fixed Show fixed Hide fixed
rsciio/tests/test_renishaw.py Fixed Show fixed Hide fixed
rsciio/tests/test_renishaw.py Fixed Show fixed Hide fixed
rsciio/tests/test_renishaw.py Fixed Show fixed Hide fixed
rsciio/tests/test_renishaw.py Fixed Show fixed Hide fixed
rsciio/renishaw/_api.py Fixed Show fixed Hide fixed
rsciio/renishaw/_api.py Fixed Show fixed Hide fixed
rsciio/renishaw/_api.py Fixed Show fixed Hide fixed
@sem-geologist
Copy link
Contributor

@pietsjoh, what is still missing for this to be ready for review?

@sem-geologist
Copy link
Contributor

I see in code there is marked some lines with caveats that there could be unknown and unreadable structures. Is this based on Reverse engineering? I think this format is potentially moving target. In such case it would be easier to write parsing using kaitai_struct as it is much easier to adapt parsing then when binary file specification gets updated by OEM.

I have experience with RE such moving target (i.e. https://github.com/sem-geologist/peaksight-binary-parser). I think it would be then useful to communicate with gwyddion, and have single Reverse engineering implementation written in kaitai, where gwyddion would compile it into C++ library and this would use python compilation of code. BTW there are some free kaitai_struct descriptions for jpeg and EXIF parsing definitions ready, which could be reused then without needing to add PIL as optional dependency.

I saw there is PIL checked as optional dependency to extract jpeg and EXIF; What about using imagio instead as that is one of requirements for this library?

@pietsjoh pietsjoh force-pushed the renishaw branch 2 times, most recently from d18d146 to 17d1987 Compare April 26, 2023 13:32
detector["integration_time"] = (
exposure_per_frame * detector["frames"] / 1000
) # s
except TypeError:

Check notice

Code scanning / CodeQL

Empty except Note

'except' clause does nothing but pass and there is no explanatory comment.
@pietsjoh
Copy link
Contributor Author

I see in code there is marked some lines with caveats that there could be unknown and unreadable structures. Is this based on Reverse engineering?

Yes, this format is reverse engineered. I based my work on the py-wdf-reader with some extra inspiration from gwyddion.

I think this format is potentially moving target. In such case it would be easier to write parsing using kaitai_struct as it is much easier to adapt parsing then when binary file specification gets updated by OEM.

I have experience with RE such moving target (i.e. https://github.com/sem-geologist/peaksight-binary-parser). I think it would be then useful to communicate with gwyddion, and have single Reverse engineering implementation written in kaitai, where gwyddion would compile it into C++ library and this would use python compilation of code.

I have no experience with kaitai yet, but this might be an option in the future. However, I don't know if Renishaw changes their format often enough for this to be useful.

I saw there is PIL checked as optional dependency to extract jpeg and EXIF; What about using imagio instead as that is one of requirements for this library?

My implementation is close to how py-wdf-reader reads the EXIF-Tags. I tried using imageio instead, but couldn't get it to work. The picture with the exif-data is still included in the metadata (even when PIL is not installed). So I figured that it is not a big problem if I leave it like this. But I can give it another go and try to use imageio for reading the EXIF tags.

@pietsjoh, what is still missing for this to be ready for review?

The main issue is licensing, as I based my work on the py-wdf-reader. I just asked in alchem0x2A/py-wdf-reader#43 how to handle this. Moreover, I asked Renishaw directly for some insights on some of the TODO's/questions I have. But this could also be implemented in a different PR.

@jlaehne jlaehne marked this pull request as ready for review April 26, 2023 19:48
@jlaehne
Copy link
Contributor

jlaehne commented Apr 26, 2023

@pietsjoh, what is still missing for this to be ready for review?

The main issue is licensing, as I based my work on the py-wdf-reader. I just asked in alchem0x2A/py-wdf-reader#43 how to handle this. Moreover, I asked Renishaw directly for some insights on some of the TODO's/questions I have. But this could also be implemented in a different PR.

I would consider this ready to review ... the remaining questions can be handled along the way :-)

@pietsjoh
Copy link
Contributor Author

I tried loading the exif tags with imageio and remember now why I stayed with PIL. imageio returns exif as a binary sequence, whereas PIL returns exif as a dict (tag number as key and tag as value). I can easily deal with the dict. However, I would need to write an extra parser for the binary sequence.

@pietsjoh
Copy link
Contributor Author

pietsjoh commented Jun 5, 2023

pre-commit.ci autofix

@jlaehne
Copy link
Contributor

jlaehne commented Jun 5, 2023

@ericpre: from my side, this is ready to be merged

@ericpre ericpre merged commit 525fbfe into hyperspy:main Jun 5, 2023
29 of 30 checks passed
@ericpre ericpre added this to the v0.1.0 initial release milestone Jun 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants