4D-STEM data reads incorrectly #17

Closed
ercius opened this issue Jun 5, 2019 · 1 comment

ercius commented Jun 5, 2019

Newer versions of DM with the STEMX capability for 4D-STEM now write 4-dimensional DM4 files with a different shape than before. This is actually an improvement in the layout of the data on disk, but ncempy.io.dm does not currently read this data correctly.

There is a workaround. You can use the low-level API in ncempy.io.dm.fileDM to create a memmap of the data on disk with the correct ordering:

import numpy as np
import ncempy.io as nio

d1 = nio.dm.fileDM('filename.dm4')
data = np.memmap(d1.fid, dtype=d1._DM2NPDataType(d1.dataType[1]), mode='r', offset=d1.dataOffset[1],
                 shape=(d1.xSize[1], d1.ySize[1], d1.zSize[1], d1.zSize2[1])[::-1], order='C')
image0 = data[:, :, 0, 0]  # the first diffraction pattern

Note how the shape keyword uses [::-1] to reverse the order of the data shape. The shape is now [ky, kx, scanY, scanX], making the diffraction patterns the fastest data to access according to C-ordering.
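As a quick illustration (a sketch reusing the variables from the snippet above), the two kinds of slices then look like this:

# data has shape (ky, kx, scanY, scanX) after the [::-1] reversal
dp = data[:, :, 0, 0]         # full diffraction pattern at scan position (0, 0)
pixel_map = data[0, 0, :, :]  # detector pixel (0, 0) across all scan positions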

I will attempt to add automatic identification of this change in the file format, but it will take some time to figure out.

sk1p commented Dec 5, 2019

I also stumbled on this.

"The shape is now [ky, kx, scanY, scanX], making the diffraction patterns the fastest data to access according to C-ordering."

Actually, it is the other way around: in numpy, the rightmost axis is the fastest to access. You can easily check this using the strides attribute on the array, or even by benchmarking:

>>> image0.strides  # image0 from your code snippet above, applied to some sample data
(143654400, 74820)

This means that between each pixel of image0 we need to jump 74820 bytes in the input data. For fast access, this should be the itemsize of the dtype, for example 4 for float32 data.

>>> b = np.random.randn(128, 128, 128, 128)
>>> b.strides
(16777216, 131072, 1024, 8)

%timeit np.copy(b[..., 0, 0])  # on my system: 534 µs ± 25.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> b[..., 0, 0].strides
(16777216, 131072)

%timeit np.copy(b[0, 0, ...])  # on my system: 7.02 µs ± 1.14 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> b[0, 0, ...].strides
(1024, 8)

Or, from your example above, compare the timing of np.copy(data[:, :, 0, 0]) with np.copy(data[0, 0, ...]). The np.copy is needed to actually measure the access cost; without it, numpy just creates a view into the memory map without touching the underlying data.
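For example (a sketch using the memmap from the snippet above; the actual numbers will depend on the file and the disk):

%timeit np.copy(data[:, :, 0, 0])  # one full diffraction pattern: large strides, slow
%timeit np.copy(data[0, 0, ...])   # one detector pixel over all scan positions: contiguous, fast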

So sadly, in this new format, while it is fast to access all scan positions for a single pixel of a diffraction pattern, accessing a whole diffraction pattern is very expensive.
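If whole diffraction patterns are what you need repeatedly, one possible workaround (just a sketch, assuming the data set fits in memory) is to make a transposed, C-contiguous copy once so that each pattern becomes contiguous:

# read everything once and reorder so (scanY, scanX) are the slow axes
# and (ky, kx) are the fast, contiguous axes
reordered = np.ascontiguousarray(data.transpose(2, 3, 0, 1))  # shape (scanY, scanX, ky, kx)
dp = reordered[0, 0]  # now a contiguous diffraction pattern at scan position (0, 0)

This pays the cost of reading the whole file once, after which each pattern is a simple contiguous read from memory.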
