Data format requirements: Displacement data file formats
Chorus can import a full matrix of displacement data from five different displacement data file formats:
- .npy 3D: .npy file containing a 3D NumPy ndarray in NumPy[d, g, t] format. $\color{lime}{\textsf{(Recommended)}}$
- .npy 2D: .npy file containing a 2D NumPy ndarray in NumPy[dn+g, t] format.
- .txt 2D: .txt file containing rows of whitespace-separated values, where A-scans form rows. When loaded with numpy.loadtxt using default parameters, a 2D ndarray in NumPy[dn+g, t] format is returned.
- .mat 3D: Version 7.2 or lower .mat file containing a single 3D MATLAB array in MATLAB(t+1, g+1, d+1) format.
- .mat 2D: Version 7.2 or lower .mat file containing a single 2D MATLAB array in MATLAB(t+1, dn+g+1) format.
Inside the file, the displacement values should ideally be in units of nanometres (nm). However, Chorus offers the option to multiply all values in the file by a conversion factor at import in order to convert to nanometres.
After the user supplies a path to their chosen file, the format is automatically detected out of the list of five formats above based on the file extension (.npy, .txt or .mat) and, for .npy and .mat files, the number of dimensions of the array stored inside (3D or 2D).
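The detection rule described above can be sketched as a lookup on the file extension and the number of array dimensions. This is only an illustration: the function name `detect_format` and the format labels are hypothetical and are not taken from the Chorus codebase.

```python
from pathlib import Path

# Hypothetical labels for the five supported formats (not Chorus identifiers).
SUPPORTED_FORMATS = {
    (".npy", 3): "npy 3D",
    (".npy", 2): "npy 2D",
    (".txt", 2): "txt 2D",
    (".mat", 3): "mat 3D",
    (".mat", 2): "mat 2D",
}


def detect_format(path, ndim):
    """Return a format label from the file extension and the array's ndim."""
    ext = Path(path).suffix.lower()
    try:
        return SUPPORTED_FORMATS[(ext, ndim)]
    except KeyError:
        raise ValueError(
            f"Unsupported combination: extension {ext!r} with a {ndim}D array"
        )


print(detect_format("scan.npy", 3))  # npy 3D
```

Any other combination (for example a 4D array, or an unlisted extension) raises a `ValueError` in this sketch.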
To import a new full matrix, select File > Open full matrix....
In the dialog that opens, enter the path to your displacement data file, and the other required information.
Chorus requires multiple pieces of information regarding a 1D, periodic full matrix dataset in order to display and process it.
Many of these pieces of information (such as the time basis of the A-scans, the material properties of the sample etc.) can be input manually by the user. The largest single set of data Chorus requires is a file containing a full matrix of displacement measurements (the output of the ultrasound sensors used during the full matrix capture).
As illustrated in the Array Requirements page of this wiki, the displacements measured during full matrix capture using a 1D, periodic array can be organised into a 3D data array (the full matrix), where values of displacement are stored in a 3D data volume with dimensions of generation index $g$, detection index $d$ and time index $t$.
Following a Python-ready zero-indexed system:
- Both $g$ and $d$ range in value from $0$ to $(n-1)$, where $n$ is the number of elements in the 1D periodic array.
- $t$ varies from $0$ to $(m-1)$, where $m$ is the number of displacement samples acquired per A-scan.
The number of individual displacement measurements forming a full matrix is therefore equal to $n^2 m$.
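The size of a full matrix follows directly from the shape of the array that holds it. A minimal sketch, using small illustrative values of $n$ and $m$ and zeros in place of real measurements:

```python
import numpy as np

n = 4  # number of array elements (illustrative value)
m = 6  # displacement samples per A-scan (illustrative value)

# Placeholder full matrix with one axis each for d, g and t.
U = np.zeros((n, n, m))

print(U.shape)  # (4, 4, 6)
print(U.size)   # 96, i.e. n * n * m
```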
Full matrices of displacement measurements may be imported into Chorus from files following one of five different formats. The purpose of this wiki article is to describe these five compatible file formats. The five formats are:
- .npy 3D: .npy file containing a 3D NumPy ndarray in NumPy[d, g, t] format. $\color{lime}{\textsf{(Recommended)}}$
- .npy 2D: .npy file containing a 2D NumPy ndarray in NumPy[dn+g, t] format.
- .txt 2D: .txt file containing rows of whitespace-separated values, where A-scans form rows. When loaded with numpy.loadtxt using default parameters, a 2D ndarray in NumPy[dn+g, t] format is returned.
- .mat 3D: Version 7.2 or lower .mat file containing a single 3D MATLAB array in MATLAB(t+1, g+1, d+1) format.
- .mat 2D: Version 7.2 or lower .mat file containing a single 2D MATLAB array in MATLAB(t+1, dn+g+1) format.
The following figure summarises these five compatible displacement data file formats, and illustrates that, no matter the external file format, all imported displacement data will be converted to the internal format used within Chorus: a 3D numpy ndarray in numpy[d,g,t] format.
The format used inside the Python code of Chorus to store the full matrix of displacement values (U) is illustrated in the figure above. The displacement values are stored in a 3D NumPy ndarray, which may be indexed by NumPy[d,g,t].
When U is an ndarray in NumPy[d,g,t] format:
- Individual A-scans form rows. An A-scan can be accessed as U[d, g, :] or simply U[d, g].
- An iso-detection plane can be accessed as U[d, :, :] or simply U[d].
- An iso-generation plane can be accessed as U[:, g, :] or simply U[:, g].
- An iso-time plane can be accessed as U[:, :, t].
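The indexing patterns listed above can be tried on a small synthetic array. In this sketch the sizes are illustrative and each value encodes its own indices (U[d, g, t] = 100d + 10g + t), so the result of every slice is easy to check by eye.

```python
import numpy as np

n, m = 3, 5  # illustrative sizes: 3 elements, 5 samples per A-scan

# Synthetic full matrix in which U[d, g, t] == 100*d + 10*g + t.
d_idx, g_idx, t_idx = np.meshgrid(
    np.arange(n), np.arange(n), np.arange(m), indexing="ij"
)
U = 100 * d_idx + 10 * g_idx + t_idx

a_scan = U[2, 1]          # same as U[2, 1, :], shape (m,)
iso_detection = U[2]      # same as U[2, :, :], shape (n, m)
iso_generation = U[:, 1]  # same as U[:, 1, :], shape (n, m)
iso_time = U[:, :, 4]     # shape (n, n)

print(a_scan)  # [210 211 212 213 214]
```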
The author considered this format to be the most logical and performant arrangement of the measurements as a NumPy ndarray (a widely used and well-documented Python class). Documentation for the NumPy ndarray class can be found here. The notation for indexing on NumPy ndarrays is documented here.
The smallest logical sub-group of displacement values within the full matrix is a single A-scan. This is the series of displacement measurements recorded over time for a fixed combination of generation and detection element. Chorus performs a variety of processing operations on individual A-scans (such as de-trending, frequency-domain filtering & 1D interpolation). These can be performed most rapidly if the displacement values for a single A-scan are stored contiguously in memory. Since NumPy stores array data in row-major (C-contiguous) order by default, it is performant to store A-scans as the rows of a NumPy ndarray. With A-scans stored as rows in an ndarray, they can be accessed without striding. Many functions that operate on ndarrays (in NumPy and other Python libraries) operate along the last axis (rows) by default for this reason.
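The contiguity argument can be checked by inspecting strides and flags on a small array. This is a sketch with illustrative sizes; the dtype is set explicitly so that the stride values are predictable.

```python
import numpy as np

# 4 x 4 elements, 1000 samples per A-scan, 8-byte floats, C order (the default).
U = np.zeros((4, 4, 1000), dtype=np.float64)

a_scan = U[1, 2]         # one A-scan: a view onto one contiguous run of memory
iso_time = U[:, :, 10]   # one iso-time plane: a strided view

print(U.strides)                       # (32000, 8000, 8)
print(a_scan.flags["C_CONTIGUOUS"])    # True
print(iso_time.flags["C_CONTIGUOUS"])  # False
```

Consecutive samples of an A-scan sit 8 bytes apart, whereas neighbouring values in an iso-time plane are 8000 bytes apart, which is why per-A-scan operations suit this layout.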
The .npy file format is a simple format for saving a single NumPy ndarray. Documentation for this file format can be found here.
.npy files can be produced using the numpy.save function on an ndarray created in a Python session.
Chorus can import a full matrix of displacement data from .npy files containing an ndarray in one of two array formats:
- 3D: The ndarray is 3-dimensional, and indexable by numpy[d,g,t]. This is identical to the internal representation of the full matrix within Chorus, and is hence the $\color{lime}{\textsf{recommended}}$ file format.
- 2D: The ndarray is 2-dimensional, and indexable by numpy[dn+g,t]. With A-scans stored as rows in the 2D array, rows $0$ to $n-1$ contain the set of A-scans with detection index $d=0$, and increasing generation index $g$. The detection index $d$ increments to $d=1$ at row $n$, which contains A-scan $d=1$, $g=0$.
Chorus automatically detects if the ndarray is 3D or 2D. If 2D, Chorus automatically re-shapes the array to obtain its internal 3D numpy[d,g,t] format.
Internally, Chorus uses the numpy.load function to import data from .npy files. See the function load_full_matrix_from_npy_file in the Chorus codebase.
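As an illustration of both .npy layouts, the sketch below writes a small synthetic full matrix in 3D and 2D form with numpy.save, reads the files back with numpy.load, and reshapes the 2D array to [d, g, t]. The file names and the way $n$ is recovered from the row count are assumptions made for this example; they are not taken from load_full_matrix_from_npy_file.

```python
import math
import os
import tempfile

import numpy as np

n, m = 3, 5  # illustrative sizes
U = np.arange(n * n * m, dtype=float).reshape(n, n, m)  # indexed as U[d, g, t]

with tempfile.TemporaryDirectory() as tmp:
    path_3d = os.path.join(tmp, "full_matrix_3d.npy")  # hypothetical file name
    path_2d = os.path.join(tmp, "full_matrix_2d.npy")  # hypothetical file name

    np.save(path_3d, U)                    # 3D layout
    np.save(path_2d, U.reshape(n * n, m))  # 2D layout, row index d*n + g

    loaded_3d = np.load(path_3d)
    loaded_2d = np.load(path_2d)

# The 2D layout has n**2 rows, so n can be recovered from the row count.
rows, samples = loaded_2d.shape
n_detected = math.isqrt(rows)
if n_detected * n_detected != rows:
    raise ValueError("Number of rows is not a perfect square")

restored = loaded_2d.reshape(n_detected, n_detected, samples)
print(np.array_equal(restored, U))  # True
```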
Chorus can import a full matrix of displacement data from a .txt file with the following format:
- Displacement values are separated by whitespace (default setting of numpy.savetxt).
- Rows are separated by newline characters '\n' (default setting of numpy.savetxt).
- Displacement data from one A-scan forms one row, with time index $t$ increasing from left to right.
- Rows $0$ to $n-1$ contain the set of A-scans for detection index $d=0$, with generation index $g$ increasing by 1 every row.
- Row $n$ contains the A-scan for $d=1$, $g=0$.
- Detection index increases by 1 every $n$ rows.
- The total number of rows is $n^2$.
A .txt file of this format can be produced using the numpy.savetxt function, passing a 2D ndarray in numpy[dn+g,t] format and using default parameters otherwise.
Internally, Chorus uses the numpy.loadtxt function with default parameters to load displacement measurements from .txt files with the above format. See the function load_full_matrix_from_txt_file in the Chorus codebase.
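A round trip through the .txt format might look like the sketch below, which uses numpy.savetxt and numpy.loadtxt with default parameters on a small synthetic full matrix. The file name and sizes are illustrative, and the final reshape is an assumption about how the 2D rows map back to [d, g, t], based on the row ordering described above.

```python
import os
import tempfile

import numpy as np

n, m = 2, 4  # illustrative sizes
U = np.arange(n * n * m, dtype=float).reshape(n, n, m)  # indexed as U[d, g, t]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "full_matrix.txt")  # hypothetical file name

    # One A-scan per row, row index d*n + g, values separated by spaces.
    np.savetxt(path, U.reshape(n * n, m))

    rows = np.loadtxt(path)  # 2D ndarray of shape (n**2, m)

restored = rows.reshape(n, n, m)
print(rows.shape)                   # (4, 4)
print(np.array_equal(rows[n], U[1, 0]))  # True: row n holds d=1, g=0
```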
Chorus can import a full matrix of displacement measurements from a MATLAB .mat file with the following characteristics:
- The .mat file must contain only one variable (the array of displacement values).
- The .mat file must be of version 7.2 or earlier.
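The version requirement is set by the scipy.io.loadmat function, which Chorus uses internally to load data from .mat files. scipy.io.loadmat reads .mat files up to version 7.2, but not version 7.3 files, which use an HDF5-based format. MATLAB can save .mat files in version 7.3 (for example when '-v7.3' is passed to save, or when version 7.3 is set as the default in the MATLAB preferences). Such files are not compatible with the scipy.io.loadmat function, and hence not compatible with Chorus.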
.mat files of version 7 (which is compatible with scipy.io.loadmat) can be saved easily from a MATLAB workspace using the MATLAB function save, specifying the 'version' parameter as '-v7'.
Chorus further requires that the MATLAB array stored within the .mat file obeys one of the following two formats:
- 3D: The MATLAB array is 3-dimensional, and indexable in MATLAB via MATLAB(t+1, g+1, d+1).
- 2D: The MATLAB array is 2-dimensional, and indexable in MATLAB via MATLAB(t+1, dn+g+1).
Chorus automatically detects if the ndarray returned by scipy.io.loadmat is 3D or 2D, and applies appropriate transpositions and/or reshaping to convert to the internal numpy[d,g,t] format.
See the function load_full_matrix_from_mat_file in the Chorus codebase.
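The transposition and reshaping step can be sketched with NumPy alone. The helper below is hypothetical (it is not the code of load_full_matrix_from_mat_file); it takes a dictionary shaped like the one scipy.io.loadmat returns, where metadata keys begin with a double underscore, and converts the single stored array to [d, g, t] order.

```python
import math

import numpy as np


def mat_dict_to_full_matrix(mat_contents):
    """Convert a loadmat-style dict holding one array to [d, g, t] order."""
    # loadmat adds metadata keys such as '__header__'; skip them.
    names = [key for key in mat_contents if not key.startswith("__")]
    if len(names) != 1:
        raise ValueError("Expected exactly one variable in the .mat file")
    arr = np.asarray(mat_contents[names[0]])

    if arr.ndim == 3:
        # MATLAB (t, g, d) -> NumPy [d, g, t]: reverse the axis order.
        return np.ascontiguousarray(arr.transpose(2, 1, 0))
    if arr.ndim == 2:
        # MATLAB (t, d*n + g) -> rows of A-scans -> [d, g, t].
        m, rows = arr.shape
        n = math.isqrt(rows)
        if n * n != rows:
            raise ValueError("Number of A-scans is not a perfect square")
        return np.ascontiguousarray(arr.T).reshape(n, n, m)
    raise ValueError(f"Expected a 2D or 3D array, got {arr.ndim}D")


# Stand-ins for what loadmat would return for the two .mat layouts.
n, m = 3, 5
U = np.arange(n * n * m, dtype=float).reshape(n, n, m)
mat_3d = {"__header__": b"", "U": U.transpose(2, 1, 0)}
mat_2d = {"__header__": b"", "U": U.reshape(n * n, m).T}

print(np.array_equal(mat_dict_to_full_matrix(mat_3d), U))  # True
print(np.array_equal(mat_dict_to_full_matrix(mat_2d), U))  # True
```

The call to np.ascontiguousarray copies the transposed data so that each A-scan is again stored contiguously in memory.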
Internally, Chorus expects displacement values to be expressed in units of nanometres (nm). Therefore, external files would ideally store displacement values in nm to match. However, converting between units is typically a trivial operation of multiplying by a scalar conversion factor (e.g. to convert m to nm, multiply by $10^9$), and Chorus offers the option to apply such a conversion factor to all values in the file at import.
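Applying a conversion factor amounts to one scalar multiplication over the whole array. A minimal sketch for data stored in metres, with illustrative values and a hypothetical constant name:

```python
import numpy as np

M_TO_NM = 1e9  # 1 m = 10^9 nm (constant name chosen for this example)

# Illustrative displacements in metres.
u_m = np.array([1.5e-9, -2.0e-9, 0.0])

u_nm = u_m * M_TO_NM  # the same displacements in nanometres
print(u_nm)
```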
Chorus was originally designed to handle displacement measurements from a laser interferometer, which output voltage values (in V) that could be converted to units of nm according to the scale factor