# Loading 4DSTEM datasets from ePSIC

With the installation of the DE16 camera and the new Freescan arbitrary scan generator, there are now several different ways of collecting 4DSTEM data at ePSIC. This notebook demonstrates how to load each of the data types you might encounter into the HyperSpy/pyxem ecosystem.

## MerlinEM data acquired with the JEOL scan generator

Opening MerlinEM data acquired with the JEOL scan generator is straightforward as a data loader has been included in pyxem. The rows of the data are determined by picking out the flyback pixels from their longer exposure time. Data is automatically loaded lazily (that is, it is not loaded into RAM).

In [None]:
import pyxem as pxm
import hyperspy.api as hs
import numpy as np
%matplotlib notebook

In [None]:
merlin_jeol = pxm.load_mib('/dls/e02/data/2021/cm28158-1/Merlin/20210118_152746_data.mib')

In [None]:
print(merlin_jeol)
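
The flyback idea described above can be illustrated with a small NumPy sketch. This is not pyxem's implementation: the function name `infer_scan_width`, the `rel_threshold` parameter and the synthetic exposure values are all assumptions made for illustration. It simply looks for frames whose exposure is clearly longer than the typical one and takes their spacing as the row length.

```python
import numpy as np

def infer_scan_width(exposures, rel_threshold=1.5):
    """Guess the number of frames per scan row from per-frame exposure times.

    Hypothetical helper: frames with an exposure longer than
    rel_threshold * median are treated as flyback frames.
    """
    exposures = np.asarray(exposures, dtype=float)
    typical = np.median(exposures)
    flyback = np.flatnonzero(exposures > rel_threshold * typical)
    if flyback.size < 2:
        raise ValueError("need at least two flyback frames to infer a row length")
    spacings = np.diff(flyback)
    if not np.all(spacings == spacings[0]):
        raise ValueError("flyback frames are not evenly spaced")
    return int(spacings[0])

# Synthetic example: 3 rows of 4 frames, the first frame of each row is longer
exposures = np.tile([1.8, 0.6, 0.6, 0.6], 3)
width = infer_scan_width(exposures)
print(width)
```

A scan without any longer-exposure frames raises an error in this sketch, which mirrors why data without flyback pixels cannot be split into rows this way.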

Typically, the data has also been converted to HDF5 format using ePSIC's conversion scripts (which use pyxem's data loader). The converted HDF5 files can be loaded using HyperSpy, either lazily or not.

In [None]:
merlin_jeol_conv = hs.load('/dls/e02/data/2021/cm28158-1/processing/Merlin/20210118 145609/20210118_145609.hdf5',lazy=True)

In [None]:
print(merlin_jeol_conv)

## MerlinEM data acquired with the Freescan

Loading Freescan data is slightly more difficult due to the arbitrary nature of the scan. For regular rectangular scans, there remain straightforward approaches. The current version of pyxem (0.13.0) will read the mib file as TEM data, that is, as a one-dimensional stack of frames, because the Freescan doesn't include any flyback pixels.

In [None]:
merlin_freescan = pxm.load_mib('/dls/e02/data/2021/cm28158-1/Merlin/Freescan_cal/20210119_Freescan_600us_256X256_2.mib')

In [None]:
print(merlin_freescan)

Once loaded, it's then possible to reshape the data to the correct shape.

In [None]:
merlin_freescan_reshaped = hs.signals.Signal2D(merlin_freescan.data.reshape((256,256,515,515)))

You can then convert the data back into the electron diffraction datatype for pyxem to use.

In [None]:
merlin_freescan_reshaped.set_signal_type('electron_diffraction')
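
The reshape step relies on the frames being stored in row-major scan order, so that frame `i` ends up at scan position `(i // n_cols, i % n_cols)`. The following is a minimal NumPy-only sketch of that mapping on a tiny synthetic stack; the helper name `reshape_frame_stack` is hypothetical and the array sizes are chosen only to keep the example small.

```python
import numpy as np

def reshape_frame_stack(frames, scan_shape):
    """Reshape a (n_frames, det_y, det_x) stack into (rows, cols, det_y, det_x).

    Hypothetical helper; assumes frames were recorded row by row.
    """
    frames = np.asarray(frames)
    n_rows, n_cols = scan_shape
    if frames.shape[0] != n_rows * n_cols:
        raise ValueError(
            f"{frames.shape[0]} frames cannot fill a {n_rows}x{n_cols} scan"
        )
    return frames.reshape((n_rows, n_cols) + frames.shape[1:])

# Six 2x2 "detector frames" arranged as a 2x3 scan
stack = np.arange(6 * 2 * 2).reshape(6, 2, 2)
cube = reshape_frame_stack(stack, (2, 3))

# Frame 3 is the first frame of the second row
assert np.array_equal(cube[1, 0], stack[3])
```

Checking the frame count before reshaping gives a clearer error than the one NumPy raises when the scan shape does not match the number of frames.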

I have also modified the pyxem loader to accept the scan shape as an input, so that the mib files are loaded directly into the correct shape. This will hopefully go into a later version of pyxem; in the meantime, feel free to ask for the code.

In [None]:
merlin_freescan_mod = pxm.load_mib('/dls/e02/data/2021/cm28158-1/Merlin/Freescan_cal/20210119_Freescan_600us_256X256_2.mib', scan=(256,256))

In [None]:
print(merlin_freescan_mod)

### Loading non-rectangular scans

Loading non-rectangular scans is more difficult, due to the arbitrary nature of the scan positions. The initial mib files can be loaded as usual, but will be assigned as TEM data.

In [None]:
merlin_arbitrary = pxm.load_mib('/dls/e02/data/2021/cm28158-1/Merlin/Freescan_cal/20210119_Freescan_600us_256_256_subsamples2X_2.mib')

In [None]:
print(merlin_arbitrary)

It is then necessary to import the scan positions from the xyz file. This can be done using the following function.

In [None]:
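import csv

def import_scan_pos(filename):
    """Read (x, y) scan positions from a comma-delimited xyz file."""
    positions = []
    with open(filename, newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')

        # Skip the four header lines
        for _ in range(4):
            next(reader)

        # The first two columns hold the integer x and y positions
        for row in reader:
            positions.append((int(row[0]), int(row[1])))
    return positions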

In [None]:
scan_positions = import_scan_pos('/dls/e02/data/2021/cm28158-1/DE16/Scan Coordinates XYZ Files/0256x0256/Subsampled_02x/0256x0256_Subsampled_02x_00001.xyz')

In [None]:
print(scan_positions[0:10])
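
To try the parsing step without access to the real coordinate files, the sketch below writes a small stand-in file and reads it back. The file layout is an assumption inferred from the function above (four header lines, then comma-delimited rows whose first two columns are integers); the header text, the third column and the helper name `read_scan_positions` are invented for illustration.

```python
import csv
import os
import tempfile

def read_scan_positions(filename, n_header_lines=4):
    """Read integer (x, y) pairs from a comma-delimited file (hypothetical helper)."""
    positions = []
    with open(filename, newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        for _ in range(n_header_lines):
            next(reader)
        for row in reader:
            if not row:  # tolerate blank lines
                continue
            positions.append((int(row[0]), int(row[1])))
    return positions

# Stand-in file: four made-up header lines followed by three positions
example_lines = ["header 1", "header 2", "header 3", "header 4",
                 "1,1,0", "3,1,0", "2,2,0"]
with tempfile.NamedTemporaryFile('w', suffix='.xyz', delete=False) as tmp:
    tmp.write("\n".join(example_lines) + "\n")
    example_path = tmp.name

example_positions = read_scan_positions(example_path)
os.remove(example_path)
print(example_positions[:10])
```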

Once the scan positions are imported, it's then possible to reshape the data using the scan positions.

In [None]:
def reshape_arbitrary_scan(data, scan_positions):
    """Place each frame of a 1D stack at its (1-based) scan position."""
    x_max = max(pos[0] for pos in scan_positions)
    y_max = max(pos[1] for pos in scan_positions)

    # Frame shape taken from the data array itself: (n_frames, rows, columns)
    frame_shape = data.data.shape[1:]

    # Positions that were never scanned are left as zeros
    data_np = np.zeros((x_max, y_max) + tuple(frame_shape))

    # The final scan position is not used
    for i, pos in enumerate(scan_positions[:-1]):
        data_np[pos[0]-1, pos[1]-1, :, :] = data.inav[i].data

    return hs.signals.Signal2D(data_np)

In [None]:
merlin_arbitrary.compute()

In [None]:
merlin_arbitrary_reshaped = reshape_arbitrary_scan(merlin_arbitrary, scan_positions)

In [None]:
print(merlin_arbitrary_reshaped)
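
The same placement of frames can be written without a Python loop by using NumPy integer-array indexing. This is a sketch under stated assumptions rather than a drop-in replacement: `scatter_frames` is a hypothetical name, positions are assumed to be 1-based as in the function above, and it works on a plain NumPy stack (for example `merlin_arbitrary.data` after `compute()`). Unlike the loop version, it keeps the dtype of the input frames instead of converting to float64, and it uses as many positions as there are frames.

```python
import numpy as np

def scatter_frames(frames, positions, fill_value=0):
    """Scatter a (n_frames, det_y, det_x) stack onto a sparse scan grid.

    Hypothetical helper; positions are 1-based (x, y) pairs.
    """
    frames = np.asarray(frames)
    pos = np.asarray(positions, dtype=int)
    n = min(len(frames), len(pos))
    xs = pos[:n, 0] - 1
    ys = pos[:n, 1] - 1
    out = np.full((xs.max() + 1, ys.max() + 1) + frames.shape[1:],
                  fill_value, dtype=frames.dtype)
    out[xs, ys] = frames[:n]  # one assignment for all frames
    return out

# Three 2x2 frames placed on a sparsely sampled grid
frames = np.arange(3 * 2 * 2, dtype=np.uint16).reshape(3, 2, 2)
positions = [(1, 1), (3, 1), (2, 2)]
cube = scatter_frames(frames, positions)
print(cube.shape, cube.dtype)
```

Keeping the detector dtype matters for memory: a full 256 x 256 grid of 515 x 515 frames holds about 1.7e10 values, so the array is four times larger in float64 than in a 16-bit integer type.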

## DE16 acquired data

Data from the DE16 comes in a variety of file formats. The raw data from the Streampix software is in .seq format. These files do not include any dark or gain correction but can be loaded using pims (note: pims may still need to be added to the epsic3.7 environment).

In [None]:
import pims

In [None]:
images = pims.open('/dls/e02/data/2020/mg25140-9/DE_test_data/16-47-35.930.seq')

In [None]:
def pims_to_hs(images):
    # Copy every frame into a single (frame, row, column) array
    np_data = np.zeros((len(images), images.frame_shape[0], images.frame_shape[1]))
    for i, image in enumerate(images):
        np_data[i, :, :] = image
    return hs.signals.Signal2D(np_data)

In [None]:
de_raw = pims_to_hs(images)

In [None]:
print(de_raw)
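
One detail of the conversion above is that `np.zeros` defaults to float64, so the stack can take several times more memory than the raw frames. The sketch below shows a variant that preallocates using the dtype of the first frame. The helper name `frames_to_stack` is hypothetical, and plain NumPy arrays stand in for the pims frames so that the example runs without pims; it assumes only that the input supports `len()` and integer indexing.

```python
import numpy as np

def frames_to_stack(frames):
    """Copy an indexable sequence of 2D frames into one array, keeping dtype.

    Hypothetical helper; `frames` needs len() and integer indexing.
    """
    n_frames = len(frames)
    first = np.asarray(frames[0])
    stack = np.empty((n_frames,) + first.shape, dtype=first.dtype)
    for i in range(n_frames):
        stack[i] = frames[i]
    return stack

# Stand-in for a frame reader: four 3x5 frames of 16-bit counts
fake_frames = [np.full((3, 5), i, dtype=np.uint16) for i in range(4)]
stack = frames_to_stack(fake_frames)
print(stack.shape, stack.dtype)
```

The resulting array can be passed to `hs.signals.Signal2D` in the same way as `np_data` above.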

Once the data is loaded, you can then reshape as shown for the Merlin data.

In [None]:
de_raw_reshaped = hs.signals.Signal2D(de_raw.data.reshape((100,100,256,256)))
de_raw_reshaped.set_signal_type('electron_diffraction')

In [None]:
print(de_raw_reshaped)

### Importing non-raw data

Direct Electron have their own software in which you can convert seq files to other formats. This software also performs dark/gain reference correction (with proprietary algorithms). There are a number of file formats to export to. The following code shows how to import the HDF5 files that have been reshaped to the correct dimensions. HDF5 files can only be exported for rectangular arrays.

In [None]:
import h5py

In [None]:
f = h5py.File('/dls/e02/data/2021/cm28158-1/DE16/256X256 test.h5', 'r')

In [None]:
de_hdf = np.array(f['4DSTEM_experiment']['data']['datacubes']['datacubes_0']['data'])

In [None]:
de_hdf_conv = hs.signals.Signal2D(de_hdf)

In [None]:
print(de_hdf_conv)

In [None]:
de_hdf_conv.plot()

In the case of non-rectangular scan arrays, data can be saved in one of the other formats. The MRC format does not currently work with the HyperSpy loader...

In [None]:
de_mrc = hs.load('/dls/e02/data/2021/cm28158-1/DE16/Test_256_subsampled_02x_01.mrc')