Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sample file and unit tests for Datx files #7

Open
brandondube opened this issue May 3, 2019 · 21 comments
Open

Sample file and unit tests for Datx files #7

brandondube opened this issue May 3, 2019 · 21 comments
Labels
help wanted Extra attention is needed

Comments

@brandondube
Copy link
Owner

The prysm.io submodule should have full unit test coverage. To do this, tests for the datx reader are needed. Tests require a sample file -- someone will need to provide one via donation or license to prysm for this to happen. The file could be synthesized with prysm, saved as asc, then converted with Mx.

@brandondube brandondube added the help wanted Extra attention is needed label May 3, 2019
@JakobSilbermann
Copy link

Hi Brandon,
if it is still nedded i could offer a datx file.
Best Regards
Jakob

@brandondube
Copy link
Owner Author

Sure, that would be great! Could you make a PR with...

  • the datx file in the right top-level folder
  • A license file that either grants permission for its use and distribution with prysm, or just assign copyright of it to the prysm developers
  • Tweak the sample files Python file to have a datx lookup?

I can help with any of those steps if you need. Thanks!

@JakobSilbermann
Copy link

Hi, i will do. I searched for an example of a license file for the .dat file already included but did not find it. Is there one? If not what license would you suggest. Permission for for its use and distribution with prysm would not be a problem.

@brandondube
Copy link
Owner Author

All of the existing files sample files are provided by the same copyright holders as prysm (basically, me) so no license is necessary. It would be like having a license for each .py file, not needed!

The MIT license used by prysm boils down to "do whatever you want but support is not a given" and could be apt. If you wanted to, you could retain your copyright by providing the file with an MIT license. Then the IP remains yours, but it can be redistributed here without violating your intellectual property rights.

If you don't really care so much, a verbal permission here leaves the ownership ambiguously yours without a clear license, but an effective enough "verbal contract" formed.

@JakobSilbermann
Copy link

I just added the datx file in my fork i will do a pull request if rest of work is done

@brandondube
Copy link
Owner Author

Great!

@JakobSilbermann
Copy link

JakobSilbermann commented Apr 17, 2021

Ok here is everything done,

  • Datx is in top level folder
  • license file added took yours and changed a bit.
  • Ttweaked sample files Python file to have a datx lookup

I still need some testing on zemax io stuff i think i will do the PR next week when i did this i also wanted to add a notebook as example can i add this somwhere?

@brandondube
Copy link
Owner Author

For examples, there isn't an io example so it would be a good start to that. If you wanted to cover as much of such an example as you could (beyond just datx/zemax sag), I would much appreciate it.

Could you make a pull request for me to review please?

@JakobSilbermann
Copy link

Hi i am coming back with some problems i encouterd when loading 2 different datx files. The two files are in my repository and they are not synthesized like you proposed above but measured data. I will put them in two different comments so that it is not to much:

  • valid_zygo_datx_file.datx here h5py has some issues shown in the screenshot. I use anaconda and from googeling the problem i tried it with deinstaltion of the h5py that comes with conda and reinstalling it via pip but id did not help. Maybe the problem is on my machine only but it would be great if you cat get that file out of my fork and try also to open it.

valid_zygo_datx_file datx

@JakobSilbermann
Copy link

JakobSilbermann commented Apr 18, 2021

  • valid_zygo_datx_file2.datx here the error is: ValueError: datx file does not use expected phase unit, contact the prysm author with a sample file to resolve. Some debugging showed that it works, when line 626 in io.py becomes if punit == 'inge': i got this from reading the datx file and then looking at the value of punit. Then i figured out this line and punit = str(phase_obj.attrs['Unit'][0])#[2:-1] # this for some reason is "b'Fringes'", need to slice off b' and ' and commented the slicing out. So it seems that here are datx files that do not have this issue. Maybe it would be good that the error also shows the punit value so that the user can react? After commentign it out everything went smooth without the h5py error mentioned above so it seems that this error comes from the above file. Both are from the same Mx Version i think.
    valid_zygo_datx_file2 datx

@brandondube
Copy link
Owner Author

Before going into the technicals above, can we cover some meta stuff?

Your datx files are in the top level directory, please place them in the sample_files folder. Please also do the changes on a separate branch, your master is now in a way "beyond saving." By this I mean, those two datx files are forever in the git storage twice, once to go in and once to go out (be moved, in this case). That's about 6MB, a huge increase. I would prefer to not have that kind of bloat, lest cloning prysm start to take as long as cloning numpy.

It would be good, too, for you to work from the v020-dev branch. I am happy to merge any contributions into master for a hypothetical v0.19.2, but I won't personally do any development on those (v0.20 is my focus). If there is a git conflict between v020-dev and master it will make unnecessary headaches.

Anyway....

The second file opens fine for me (seems a nice part, too, 4 nm RMS(!)). I realized a while ago, but never got around to changing it. I misunderstood the binary string decoding in python at the time I wrote the datx parser. Lines like the punit one should be changed like:

#L656
            if value.dtype == 'object':
                value = value[0].decode('UTF-8')  # object dtype is a string

Compare str(b'abc') to .decode('UTF-8'). UTF-8 is a safe get on the encoding scheme (I think). Zemax uses UTF-16 / wchars, but since Mx is largely written in python I doubt they are using anything but UTF-8 (it is extraordinarily painful to use UTF-16).

That problem is fixed in 00e3b64, and file2 opens A-ok.

The first file is more problematic. I actually loath the datx format; it is an egregious regression from the dat format. dat may be 'a pain' because it's a fixed binary format (just look how repetitive the dat parser is) but datx is just unstructured h5 in disguise. I've found that every interferometer model has its own format disgused as datx (I've seen dynafiz, NX2, MST, NX9000).

Since there is no manual or specification for datx, everything in prysm for it is me reverse engineering the format.

I get the same OSError as you there. I did some print debugging, and the dataset does exist in the file, so that isn't it. I changed my h5py from conda to pip, and now the above commit hash is broken because h5py made a breaking change from bytes=>str there, in a minor release (audible groan). After plastering over that, it still doesn't work. I do not know what would be wrong at that point, perhaps the file is corrupted or there is some arcane permission issue.

@JakobSilbermann
Copy link

JakobSilbermann commented Apr 18, 2021

Sorry for messing things up while working on the forked master branch. So i will work from now on the v0.20 dev branch and add all of the things i have done + files in the sample_files folder. When i am done i will do a PR for this branch, and leave my master as it is. I think later after the release i can fethh everything from your updated master prysm repo. Is there anything else i should keep in mind or something i could break somewhere?

Placing the files in the top level folder was my missunderstanding i was first supposed to put the files in the sample folder but then i just read top-level folder in your comment and missed out the word right :( Sorry it is the first time doing something on github and i did not know that such failures can have tricky conseqences like this.

For the second file i will check back with my colleague go trough some other files from that interferometer. I think now that i was wrong with this statement

Both are from the same Mx Version i think.

These are definitly two different interferometer models. I will get more information on this and test more files from the two models. BTW i could also add a sample file from Trioptics muphase software later or now. But i am not sure if i have the skills to reverse engineer it. So thank you for your patience , help and quick answers!

@brandondube
Copy link
Owner Author

You didn't really mess anything up, it is just an optimization. Since the scale and scope of commits is small, I would prefer to not pay the bloat. What I would do is check out origin/v-020-dev into a new branch and then merge your changes into io.py and sample_data.py.

You need not delete and re-fork or anything like that. Worst-case, you could just delete your master branch and check out origin/master again. Mind that you don't lose your changes doing that, though.

The bloat only comes because these are large binary files; if they were python source files, something like < 1kB is nothing, but a few MB is a noticeable amount. The reason I say it is "beyond saving" is that it is difficult/impossible to delete something from git as if it never existed, since git is designed to preserve history. I once lost about 1,000 commits of history because git got borked when I tried to rewind a commit.

@JakobSilbermann
Copy link

JakobSilbermann commented Apr 25, 2021

Hi, took me a while to reply :) i branched the V-020-dev branch in my fork now into a new branch contributions-Prysm-020-dev: when i have everything updated there i will do a PR. I hope now everything is set up correct. If not please leave me a short message.

@brandondube
Copy link
Owner Author

I don't see any commits from you in. your branch there?

Either way, I do not see an issue with the setup -- as long as you don't merge master into anything (which will bring the MBs of diff history), it will be fine.

Please go ahead and make a PR whenever you are ready, thanks!

@JakobSilbermann
Copy link

Yes ist is correct i did not commit anything yet. I first want to get things straight with two different datx files and why one of them causes this error. I am still checking that out.

@brandondube
Copy link
Owner Author

Check that you own the file and have the most permissive possible rights for it is where I would start. I also see in grimbough/Rhdf5lib#11 That os.environ['HDF5_USE_FILE_LOCKING'] = 'FALSE' may help. Perhaps some special arguments are needed in h5py.File(...) or something. I'll look more during the weekend.

@JakobSilbermann
Copy link

Ok cool thanks for the hint! As you mentioned in your first post i also prepared a syntetic interferogram and saved it as zygo ascii. I will also check with the two different software versions how the saved datx files look like later this week and if the error can be repoduced.

@JakobSilbermann
Copy link

JakobSilbermann commented Apr 28, 2021

Ok so we did the test with a syntetic interferogram saved as ascii opened it with the same version of Mx, (that did not work first hand) and saved it as datx. And it worked! I could open it with prysm. Thas good news. And i might have an idea why the other file did not work.
This file includes also a measurement the radius of the mirror. Therfore a second interferogram of the cats eyes position is saved within the datx file alongside the interferogram of the mirror. From these two measurments the exact radius can be calculated. I had several files where we measured the radius along with the surface and none of them could be imported with prysm. So maybe there are several phase datas nested in the h5 format that causes this error.
I also tried check the input you gave yesterday but i was not sucessful. I will work on it....

@JakobSilbermann
Copy link

Besides checking why this special type of file creates this problem i will use the datx file from the synthetic interferogram as valid sample datx file to finalize this work package and do a PR finally later this week.

@brandondube
Copy link
Owner Author

👋 @JakobSilbermann is there any update on this? Prysm v0.20 is out soon and I was hoping this could be included in the release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants