You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Newer versions of DM with the STEMX capability for 4D-STEM now write 4-dimensional DM4 files with a different shape than previously was done. This is actually an improvement in the layout of the data on disk, but ncempy.io.dm does not currently read this data correctly.
There is a workaround. You can use the low level API in ncemy.io.dm.fileDM to create a memmap on disk with the correct ordering:
d1=nio.dm.fileDM('filename.dm4')
data=np.memmap(d1.fid,dtype=d1._DM2NPDataType(d1.dataType[1]), mode='r', offset=d1.dataOffset[1], shape=(d1.xSize[1],d1.ySize[1],d1.zSize[1],d1.zSize2[1])[::-1],order='C')
image0=data[:,:,0,0] # the first diffraction pattern
Note how the shape keyword has [::-1] to reverse the order of the data shape. The shape is now [ky, kx, scanY, scanX] making the diffraction patterns the fastest data to access according to C-ordering.
I will attempt to add automatic identification of this change in the file format, but it will take some time to figure out.
The text was updated successfully, but these errors were encountered:
The shape is now [ky, kx, scanY, scanX] making the diffraction patterns the fastest data to access according to C-ordering.
Actually, it is the other way around: in numpy, the rightmost axis is the fastest to access. You can easily check this using the strides attribute on the array, or even by benchmarking:
>>>image0.strides# image0 from your code snippet above, applied to some sample data
(143654400, 74820)
This means, between each pixel in image0, we need to jump 74820 bytes in the input data. In case of fast access, this should be a number corresponding to the dtype, for example 4 for float32 data.
>>>b=np.random.randn(128, 128, 128, 128)
>>>b.strides
(16777216, 131072, 1024, 8)
%timeitnp.copy(b[..., 0, 0]) # on my system: 534 µs ± 25.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)>>>b[..., 0, 0].strides
(16777216, 131072)
%timeitnp.copy(b[0, 0, ...]) # on my system: 7.02 µs ± 1.14 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)>>>b[0, 0, ...].strides
(1024, 8)
Or, from your example above, compare the timing of np.copy(data[:, :, 0, 0]) with np.copy(data[0, 0, ...]). The np.copy is needed to actually evaluate the performance - without it, numpy just creates a view into the memory map, without accessing the underlying data.
So sadly, in this new format, while it is fast to access all scan positions for a single pixel of a diffraction pattern, accessing a whole diffraction pattern is very expensive.
Newer versions of DM with the STEMX capability for 4D-STEM now write 4-dimensional DM4 files with a different shape than previously was done. This is actually an improvement in the layout of the data on disk, but
ncempy.io.dm
does not currently read this data correctly.There is a workaround. You can use the low level API in ncemy.io.dm.fileDM to create a memmap on disk with the correct ordering:
Note how the shape keyword has [::-1] to reverse the order of the data shape. The shape is now [ky, kx, scanY, scanX] making the diffraction patterns the fastest data to access according to C-ordering.
I will attempt to add automatic identification of this change in the file format, but it will take some time to figure out.
The text was updated successfully, but these errors were encountered: