<font size = "5"> **Chapter 1: [Introduction](CH1_00-Introduction.ipynb)** </font>


<hr style="height:1px;border-top:4px solid #FF8200" />

# Open DM3 Images, Spectra, Spectrum-Images and  Image-Stacks with pyNSID 

part of 

<font size = "5"> **[MSE672:  Introduction to Transmission Electron Microscopy](_MSE672_Intro_TEM.ipynb)**</font>

by Gerd Duscher, Fall 2021

Microscopy Facilities<br>
Joint Institute of Advanced Materials<br>
Materials Science & Engineering<br>
The University of Tennessee, Knoxville

Background and methods for the analysis and quantification of data acquired with transmission electron microscopes.

---
This notebook reads a dm3 file and translates the data into a **[pyNSID](https://pycroscopy.github.io/pyNSID/)**-style hdf5 (h5py) file that is compatible with the **[pycroscopy](https://pycroscopy.github.io/pycroscopy/)** package.

Because many other packages and programs for TEM data manipulation are based on the ``hdf5`` file format, it is relatively easy to convert back and forth between them.



## First, we load the necessary packages
Please see the [pyTEMlib](CH1-Prerequesites.ipynb#TEM-Library) section of the [Prerequisites](CH1-Prerequesites.ipynb) notebook for information on the necessary packages.

In [1]:
# import matplotlib and numpy
#                       use "inline" instead of "notebook" for non-interactive plots
%pylab --no-import-all notebook
%gui qt

# import TEMlib from pyTEM
import pyTEMlib
import file_tools_nsid  as ft     # File input/ output library

# import pyNSID and h5py; not really needed here, but important for low-level access to the data.
#import pyNSID as nsid
#import h5py

# For archiving reasons it is a good idea to print the version numbers out at this point
print('pyTEM version: ',pyTEMlib.__version__)
__notebook__='CH1-Reading_File'
__notebook_version__='2020_06_08'


Populating the interactive namespace from numpy and matplotlib
windows
pyTEM version:  0.2020.04.2


## Open a file 

This function opens an hdf5 file in the pyUSID style, which enables you to keep track of your data analysis.

Please see the **[Installation](CH1-Prerequisites.ipynb#TEM-Library)** notebook for installation.

We want to consolidate files that belong together into one dataset. For example, a spectrum-image dataset consists of:
* Survey image
* EELS spectra
* Z-contrast image acquired simultaneously with the spectra


So load the top-level dataset first; in the above example, that is the survey image.

Please note that the plotting routine of ``matplotlib`` was introduced in the **[Matplotlib and Numpy for Micrographs](CH1-Data_Representation.ipynb)** notebook.

**Use the file p1-3hr.dm3 from TEM_data directory for a practice run**

In [2]:
# We might run this code cell several times, so we first try to close the file if it is open.
try:
    h5_file.close()
except NameError:
    # h5_file has not been defined yet (first run of this cell)
    pass

# Load file
h5_file = ft.h5_open_file()#os.path.join(current_directory,filename))
current_channel = h5_file['Measurement_000/Channel_000']
current_dataset = current_channel['nDim_Data']

ft.h5_plot(current_dataset)

parent
Found  EELS_spectrum  in dm3 file
Cannot overwrite file. Using:  01-EELS Acquire_STO-30.hf5


NameError: name 'validate_dimensions' is not defined

## Data Structure

The data themselves reside in an ``hdf5 dataset``, which we name ``current_dataset``.

In [None]:
print(f'size of current dataset is {current_dataset.shape}')

The current_dataset has additional information stored as attributes, which can be accessed through the ``attrs`` property.

In [None]:
print(current_dataset.attrs.keys())

The current_channel (like a directory in a file system) contains several groups.

Below I show how to access one of those groups.

In [None]:
print(current_channel.keys())

print(current_channel['data_type'][()])
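
The layout described above (groups acting like directories, datasets holding the data, attributes holding metadata) can be tried out without a dm3 file. The following is a minimal sketch with plain ``h5py`` on an in-memory file; the file name, the array, and the ``units`` attribute are illustrative assumptions and not the content that pyTEMlib writes.

```python
import h5py
import numpy as np

# In-memory hdf5 file (nothing is written to disk); all names and values are illustrative.
with h5py.File('sketch_structure.hf5', 'w', driver='core', backing_store=False) as h5_sketch:
    # groups act like directories; intermediate groups are created automatically
    channel = h5_sketch.create_group('Measurement_000/Channel_000')
    dataset = channel.create_dataset('nDim_Data', data=np.zeros((4, 3)))
    dataset.attrs['units'] = 'counts'        # attributes are small key/value metadata
    channel['data_type'] = 'image'           # a scalar dataset stored inside the group

    dataset_shape = dataset.shape
    attribute_keys = list(dataset.attrs.keys())
    channel_keys = sorted(channel.keys())
    data_type = channel['data_type'][()]     # [()] reads a scalar dataset
    if isinstance(data_type, bytes):         # newer h5py versions return strings as bytes
        data_type = data_type.decode('utf-8')

print(dataset_shape, attribute_keys, channel_keys, data_type)
```

The ``[()]`` indexing is the same idiom used in the cell above to read the ``data_type`` entry of ``current_channel``.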

An important group in ``current_channel`` is the ``original_metadata`` group, where all the original metadata of your file reside in the ``attributes``. This is usually a long list for ``dm3`` files.

In [None]:
for key,value in current_channel['original_metadata'].attrs.items():
    print(key, value)

In [None]:
print(current_channel.keys())

## Adding Data

To add another dataset that belongs to this measurement, we will use **h5_add_channel** from **file_tools** in the pyTEMlib package.

Here is how we add a channel there.

We can also add a new measurement group (add_measurement in pyTEMlib) for similar datasets.

This is equivalent to making a new directory in a file structure on your computer.

In [None]:
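import pyUSID as usid

def h5_add_channel(current_channel):
    """Create the next free indexed 'Channel' group in the measurement of current_channel."""
    # current_channel.name looks like '/Measurement_000/Channel_000'
    measurement_group = h5_file[current_channel.name.split('/')[1]]
    name = usid.io.hdf_utils.assign_group_index(measurement_group, 'Channel')

    additional_channel = measurement_group.create_group(name)
    return additional_channel

def h5_add_measurement(h5_file):
    """Create the next free indexed 'Measurement' group at the top level of the file."""
    new_measurement_group = usid.io.hdf_utils.create_indexed_group(h5_file, 'Measurement')
    return new_measurement_group

# The two definitions above only show how it works; here we call the versions in file_tools.
ft.h5_add_measurement(h5_file)
ft.h5_tree(h5_file)  # wraps usid.hdf_utils.print_tree(h5_file)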
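
The "new directory" analogy can be reproduced with plain ``h5py``. This is a minimal sketch, not the pyUSID implementation: ``next_indexed_name`` is a hypothetical helper, and the three-digit naming pattern is assumed from the group names ``Measurement_000`` and ``Channel_000`` used in this notebook.

```python
import h5py

def next_indexed_name(parent, base_name):
    """Hypothetical helper: return the first unused name of the form base_name_###."""
    index = 0
    while f'{base_name}_{index:03d}' in parent:
        index += 1
    return f'{base_name}_{index:03d}'

# In-memory hdf5 file; nothing is written to disk.
with h5py.File('sketch_groups.hf5', 'w', driver='core', backing_store=False) as h5_sketch:
    h5_sketch.create_group('Measurement_000/Channel_000')
    measurement = h5_sketch['Measurement_000']

    # add a second channel to the existing measurement
    measurement.create_group(next_indexed_name(measurement, 'Channel'))
    # add a new measurement group, like a new top-level directory
    h5_sketch.create_group(next_indexed_name(h5_sketch, 'Measurement'))

    tree = []
    h5_sketch.visit(tree.append)   # collects the path of every object in the file
    tree = sorted(tree)

print(tree)
```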

We use the above functions to add the content of an (arbitrary) data file to the current file.

This is important if you want, for example, to add a Z-contrast or survey image to a spectrum image.

These functions therefore enable you to collect the data from different files that belong together.


In [None]:
new_channel = ft.h5_add_data(current_channel)

ft.h5_tree(h5_file)  #wraps usid.hdf_utils.print_tree(h5_file)

## Adding additional information

Similarly, we can add a whole new measurement group or a structure group.

This function will be contained in the KinsCat module of pyTEMlib.

If you loaded the example image with graphite and ZnO, both are viewed along the [1,1,1] zone axis.


In [None]:
import pyTEMlib.KinsCat as ks         # Kinematic sCattering Library
                             # with atomic form factors from Kirkland's book

def h5_add_crystal_structure(h5_file, crystal_tags):
    """Store the crystal structure dictionary crystal_tags in a new indexed 'Structure' group."""
    structure_group = usid.io.hdf_utils.create_indexed_group(h5_file,'Structure')
    
    structure_group['unit_cell'] = crystal_tags['unit_cell']
    structure_group['relative_positions'] = crystal_tags['base']
    structure_group['title'] = str(crystal_tags['crystal_name'])
    structure_group['_'+crystal_tags['crystal_name']] = str(crystal_tags['crystal_name'])
    structure_group['elements'] = np.array(crystal_tags['elements'],dtype='S')
    # use the zone axis given in crystal_tags, otherwise default to [1,1,1]
    if 'zone_axis' in crystal_tags:
        structure_group['zone_axis'] = np.array(crystal_tags['zone_axis'], dtype=float)
    else:
        structure_group['zone_axis'] = np.array([1.,1.,1.], dtype=float)
        
    h5_file.flush()
    return structure_group
                                                                                 
crystal_tags = ks.structure_by_name('Graphite')
h5_add_crystal_structure(h5_file, crystal_tags)
                                                                                
crystal_tags = ks.structure_by_name('ZnO')
ft.h5_add_crystal_structure(h5_file, crystal_tags)

usid.hdf_utils.print_tree(h5_file)


## Keeping Track of Analysis and Results
Notebooks are notorious for becoming confusing, especially if one uses different notebooks for different tasks but stores the results in the same file.

If you like the result of a calculation, log it.

The function will write your calculation to the pyUSID-style file and attach a time stamp.

The two functions below are part of  file_tools of pyTEMlib.

In [None]:
info_dictionary = {}
info_dictionary['analysis'] = 'Nothing'
info_dictionary['name'] = 'Nothing'

log_group = ft.log_results(current_dataset, info_dictionary)

usid.hdf_utils.print_tree(h5_file)
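
To make the idea of a time-stamped log concrete, here is a minimal sketch in plain ``h5py``. It is not the code of ``ft.log_results``: the helper ``log_results_sketch``, the group name ``Log_###``, and the attribute name ``time_stamp`` are assumptions made for illustration only.

```python
import datetime
import h5py
import numpy as np

def log_results_sketch(h5_dataset, info):
    """Hypothetical stand-in for a logging helper: store info in a new 'Log_###' group next to h5_dataset."""
    parent = h5_dataset.parent
    index = 0
    while f'Log_{index:03d}' in parent:     # find the first unused log index
        index += 1
    log_group = parent.create_group(f'Log_{index:03d}')
    for key, value in info.items():
        log_group[key] = value              # each entry becomes a small dataset
    log_group.attrs['time_stamp'] = datetime.datetime.now().isoformat()
    return log_group

# In-memory hdf5 file; nothing is written to disk.
with h5py.File('sketch_log.hf5', 'w', driver='core', backing_store=False) as h5_sketch:
    dataset = h5_sketch.create_dataset('Measurement_000/Channel_000/nDim_Data',
                                       data=np.arange(6.0).reshape(2, 3))
    log_group = log_results_sketch(dataset, {'analysis': 'Nothing', 'name': 'Nothing'})
    log_name = log_group.name
    logged_keys = sorted(log_group.keys())
    has_time_stamp = 'time_stamp' in log_group.attrs

print(log_name, logged_keys, has_time_stamp)
```

Keeping the log in the same group as the dataset it was derived from means the result and its origin travel together in one file.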


## An example for a log
We log the Fourier transform of the image we loaded.

First, we perform the calculation.

In [None]:
## Access the data of the loaded image
data = current_dataset

## The data log goes in the dictionary out_tags
out_tags = {}
## data tag contains the newly calculated result
out_tags['data'] = np.fft.fftshift(np.fft.fft2(data))

## meta data (can be anything, but good practice is to be compatible with pyUSID data set)
out_tags['analysis']= 'Fourier_Transform'

out_tags['spatial_origin_x'] = data.shape[0]/2
out_tags['spatial_origin_y'] = data.shape[1]/2

for dim in current_dataset.dims:
    if dim.label == 'x': scale_x = dim[0][1]-dim[0][0]
    if dim.label == 'y': scale_y = dim[0][1]-dim[0][0]     
        
out_tags['spatial_scale_x'] = 1.0/scale_x/data.shape[0]
out_tags['spatial_scale_y'] = 1.0/scale_y/data.shape[1]
out_tags['spatial_size_x'] = data.shape[0]
out_tags['spatial_size_y'] = data.shape[1]
out_tags['spatial_units'] = '1/nm'


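# half-width of the reciprocal-space field of view in 1/nm:
# number of pixels from the center times the reciprocal pixel size
FOV_x = out_tags['spatial_origin_x'] * out_tags['spatial_scale_x']
FOV_y = out_tags['spatial_origin_y'] * out_tags['spatial_scale_y']
out_tags['image_extent'] = [-FOV_x, FOV_x, FOV_y, -FOV_y]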
fig = plt.figure()
plt.imshow(np.log2(1+np.abs(out_tags['data'])),origin='upper', extent = out_tags['image_extent'])
plt.xlabel('reciprocal distance ['+ out_tags['spatial_units']+']');
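
The reciprocal pixel size stored in ``spatial_scale_x`` (one over the real-space pixel size times the number of pixels) can be cross-checked against the frequency axis that ``numpy`` provides. This is a small sketch; the pixel size and image width are illustrative values, not taken from the example file.

```python
import numpy as np

pixel_size = 0.02     # nm per pixel; illustrative value
n_pixels = 512        # illustrative image width in pixels

# reciprocal pixel size as computed in the cell above: 1 / (pixel size * number of pixels)
reciprocal_scale = 1.0 / pixel_size / n_pixels

# numpy's frequency axis for the same sampling, shifted so that zero sits in the center
frequencies = np.fft.fftshift(np.fft.fftfreq(n_pixels, d=pixel_size))

frequency_step = frequencies[1] - frequencies[0]
nyquist = 1.0 / (2.0 * pixel_size)   # largest spatial frequency that can be represented

print(reciprocal_scale, frequency_step, frequencies.min(), nyquist)
```

The step of the frequency axis equals the reciprocal pixel size, and the axis runs from minus the Nyquist frequency upward, so the half-width of the Fourier-transformed image in 1/nm is ``1/(2*pixel_size)``.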


Now that we like this result, we log it.

Please note that saving only the Fourier transform would not be sufficient, because we also need the scale and other metadata to interpret it.

In [None]:
import importlib
importlib.reload(ft)


out_tags['name'] = 'fft'
out_tags['units'] = '1/nm'
out_tags['data_type'] = 'image'

log_group = ft.log_results(current_dataset, out_tags)
log_dataset = log_group['nDim_Data']
ft.h5_tree(h5_file)
fig = plt.figure()
plt.title(log_group['analysis'][()])
plt.imshow(np.log2(1+np.abs(log_dataset)),origin='upper', extent = log_group['image_extent'][()])
plt.xlabel('reciprocal distance ['+ log_group['units'][()]+']');


Please close the file

In [None]:
print(h5_file.filename)
h5_file.close()


## Open h5_file
Open the h5_file that we just created

In [None]:
h5_file = ft.h5_open_file()

current_channel = h5_file['Measurement_000/Channel_000']
current_dataset = current_channel['nDim_Data']

ft.h5_plot(current_dataset)

In [None]:
plt.figure()
plt.imshow(np.array(current_dataset));

### Short check if we got the data right
We print the tree and plot the data.

In [None]:
# See if a tree has been created within the hdf5 file:
ft.h5_tree(h5_file)
image_tags = dict(h5_file['Measurement_000/Channel_000'].attrs)
for key in image_tags:
    if 'original' not in key:
        #print(key,': ',image_tags[key])
        pass
current_channel = h5_file['Measurement_000/Channel_000']



### Add more data to this set

Often more than one dataset belongs together.
For instance, a spectrum image has a survey image and a Z-contrast image recorded with the survey image.

Here we just load another image, for example *p1-3-hr3b.dm3*.

In [None]:
current_channel = ft.h5_add_data(current_channel)
    
measurement_group = current_channel.parent
    
for key in list(measurement_group.keys()):
    if 'title' in measurement_group[key].keys(): 
        print(key,': ',measurement_group[key]['title'][()])
    else:
        print(key,': ')   

Let's see what you selected


In [None]:
current_dataset = current_channel['nDim_Data']

ft.h5_plot(current_dataset)

## If we are done, we close the pyUSID style file.

This is necessary to make the file ready to be opened by another notebook or program.

In [None]:
h5_file.close()
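
When working with ``h5py`` directly, a ``with`` block closes the file automatically, even if an error occurs inside the block. The sketch below writes a small file to a temporary directory and reopens it read-only; the file name and the data are illustrative assumptions.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as folder:
    file_name = os.path.join(folder, 'sketch_close.hf5')

    # the file is closed automatically at the end of the with block
    with h5py.File(file_name, 'w') as h5_sketch:
        h5_sketch.create_dataset('nDim_Data', data=np.ones((2, 2)))
    closed_after_block = not bool(h5_sketch)   # a closed h5py file evaluates to False

    # because the file was closed, it can now be opened again, here in read-only mode
    with h5py.File(file_name, 'r') as h5_again:
        reopened_shape = h5_again['nDim_Data'].shape

print(closed_after_block, reopened_shape)
```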

## Navigation

<font size = "4"> 
    
**Back: [Matplotlib and Numpy for Micrographs](CH1_03-Data_Representation.ipynb)**<br>
**Next: [Diffraction](CH2_00-Diffraction.ipynb)**<br>
**Up Chapter 1: [Introduction](CH1_00-Introduction.ipynb)**<br>
**List of Content: [Front](_MSE672_Intro_TEM.ipynb)**
</font>