# Initial test of data plotting
- Early stage build of a project. 
- This is a test of pulling images from an h5 file and displaying them.
- Future projects will use a modified version of this code. 

The data used is from Oak Ridge National Laboratory<br>
**Big Data Analytics for Scanning Transmission Electron Microscopy Ptychography**

# Using this program

To use this program, the h5 file must be saved on the device where the notebook is running. <br>
**You will need to install pycroscopy if you do not already have it installed.**<br>
Note: pip install pycroscopy will work if your system is configured for such a command.
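
Before running the notebook, it can help to confirm that the required packages are importable. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the package list simply mirrors the imports used later in this notebook.

```python
from importlib.util import find_spec

def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]

# Import names used later in this notebook
# (sklearn is the import name of scikit-learn)
required = ["numpy", "h5py", "matplotlib", "sklearn", "pyUSID", "pycroscopy"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Any name reported as missing would need to be installed before the import cell can run.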

# Initialization

The code will import the required packages as well as allow for the import of the data.

In [1]:
# System check
import sys
!conda install --yes --prefix {sys.prefix} numpy scipy matplotlib scikit-learn Ipython ipywidgets h5py
!{sys.executable} -m pip install -U --no-deps pyUSID==0.0.4
!{sys.executable} -m pip install -U --no-deps pycroscopy==0.60.1

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Requirement already up-to-date: pyUSID==0.0.4 in d:\booge\anaconda3\lib\site-packages (0.0.4)
Requirement already up-to-date: pycroscopy==0.60.1 in d:\booge\anaconda3\lib\site-packages (0.60.1)


In [2]:
# Import Libraries
import os
import h5py
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from IPython.display import display, HTML
import ipywidgets as widgets
from sklearn.cluster import KMeans

sys.path.append('..')
import pyUSID as usid
import pycroscopy as px

# Make Notebook take up most of page width
display(HTML(data="""
<style>
    div#notebook-container    { width: 95%; }
    div#menubar-container     { width: 65%; }
    div#maintoolbar-container { width: 99%; }
</style>
"""))

# set up notebook to show plots within the notebook
%matplotlib notebook

# Load the dataset

This will create a popup window for the user to select and load in the h5 file. 

In [3]:
# The user will select the location of the data
h5_path = px.io_utils.file_dialog('*.h5', '4D STEM dataset formatted according to USID')
print('Working on:\n' + h5_path)
# Open the file
h5_file = h5py.File(h5_path, mode='r')

Working on:
C:/Users/booge/Documents/Globus/62.tar/62/20120212_21_GB.h5


# Raw Data

This section does the following three things:
- Selects the dataset containing the raw data
- Converts the data from an HDF5 dataset to a USIDataset
- Reads the parameters

In [4]:
usid.hdf_utils.print_tree(h5_file)

/
├ Measurement_000
  ---------------
  ├ Mean_Ronchigram
  ├ Position_Indices
  ├ Position_Labels
  ├ Position_Values
  ├ Ptychography_Data
  ├ Spectroscopic_Indices
  ├ Spectroscopic_Labels
  ├ Spectroscopic_Mean
  ├ Spectroscopic_Values


# Error present

There is currently an issue with this code that prevents it from running past this step. <br>
According to the tree generated above, there is no Raw_Data dataset in the h5 file. <br>
I need to learn whether there is another way to access the data, or whether the original program overwrote the data file. <br>
The original program opened the file in r+ mode rather than r. 

In [5]:
# Select the dataset containing the raw data to start working with:
h5_main = usid.hdf_utils.find_dataset(h5_file, 'Raw_Data')[-1]

# Upgrade this object from a regular HDF5 dataset to a USIDataset:
h5_main = usid.USIDataset(h5_main)

# Read some necessary parameters:
h5_pos_inds = h5_main.h5_pos_inds
num_rows, num_cols = h5_main.pos_dim_sizes
h5_spec_inds = h5_main.h5_spec_inds
num_sensor_rows, num_sensor_cols = h5_main.spec_dim_sizes

IndexError: list index out of range
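
The `IndexError` above arises because indexing `[-1]` into an empty list fails: the search for `'Raw_Data'` returned no matches. A minimal sketch of a more defensive lookup follows, using plain Python lists in place of the real search results. The helper `last_match` is hypothetical, and treating `Ptychography_Data` (which does appear in the tree) as the main dataset is an untested assumption.

```python
def last_match(matches, name):
    """Return the last search result, or raise a descriptive error if none exist."""
    if not matches:
        raise LookupError(f"No dataset named {name!r} was found in the file")
    return matches[-1]

# Stand-in for search results: dataset name -> list of matches.
# Names are taken from the tree printed above; 'Raw_Data' is absent.
search_results = {
    "Raw_Data": [],
    "Ptychography_Data": ["/Measurement_000/Ptychography_Data"],
}

# Try candidate names in order and keep the first one that exists
# (the candidate order is an assumption)
chosen = None
for candidate in ("Raw_Data", "Ptychography_Data"):
    try:
        chosen = last_match(search_results[candidate], candidate)
        break
    except LookupError as err:
        print(err)

print("Selected:", chosen)
```

Failing with a message that names the missing dataset makes this kind of problem easier to diagnose than a bare `IndexError`.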

# Raw Ronchigrams

The raw ronchigrams are plotted in this section.

In [None]:
coarse_row = int(0.5*num_rows)
coarse_col = int(0.5*num_cols)
# Row-major flat index: each full row contributes num_cols positions
coarse_pos = coarse_row * num_cols + coarse_col

current_ronch = np.reshape(h5_main[coarse_pos], (num_sensor_rows, num_sensor_cols))

fig, axes = plt.subplots(ncols=2, figsize=(14,7))
# Note: Axes.hold() was removed in Matplotlib 3.0; axes now always keep existing artists
axes[0].set_title('Mean Response')
main_map = axes[0].imshow(np.reshape(h5_main.parent['Spectroscopic_Mean'], (num_rows, num_cols)), 
                          cmap=px.plot_utils.cmap_jet_white_center(), origin='lower')
main_vert_line = axes[0].axvline(x=coarse_col, color='k')
main_hor_line = axes[0].axhline(y=coarse_row, color='k')
axes[1].set_title('Ronchigram at current pixel')
img_zoom = axes[1].imshow(current_ronch,cmap=px.plot_utils.cmap_jet_white_center(), origin='lower')

def move_zoom_box(event):
    if not main_map.axes.in_axes(event):
        return
    
    coarse_col = int(round(event.xdata))
    coarse_row = int(round(event.ydata))
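    # Pass sequences: newer Matplotlib versions do not accept scalar x/y data
    main_vert_line.set_xdata([coarse_col, coarse_col])
    main_hor_line.set_ydata([coarse_row, coarse_row])
    
    # Row-major flat index: each full row contributes num_cols positions
    coarse_pos = coarse_row * num_cols + coarse_col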
    current_ronch = np.reshape(h5_main[coarse_pos], (num_sensor_rows, num_sensor_cols))

    img_zoom.set_data(current_ronch)
    #img_zoom.set_clim(vmax=ronch_max, vmin=ronch_min)
    fig.canvas.draw()
    

cid = main_map.figure.canvas.mpl_connect('button_press_event', move_zoom_box)
# widgets.interact(move_zoom_box, coarse_row=(0, num_rows, 1), 
#                  coarse_col=(0, num_cols, 1));
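
The click handler above converts a (row, column) position on the map into an index along the flattened position axis. Here is a small NumPy sketch of that mapping, assuming positions are stored in row-major (C) order, which matches how the map is reshaped for display. The grid size is arbitrary example data.

```python
import numpy as np

num_rows, num_cols = 3, 5  # example grid, deliberately not square

# A flattened list of positions reshaped into a 2D map,
# in the same way as the mean response map
flat = np.arange(num_rows * num_cols)
grid = flat.reshape(num_rows, num_cols)

row, col = 2, 1
# Row-major flattening: each full row contributes num_cols entries
flat_index = row * num_cols + col

# NumPy provides the same conversion in both directions
assert flat_index == np.ravel_multi_index((row, col), (num_rows, num_cols))
back = np.unravel_index(flat_index, (num_rows, num_cols))
assert (row, col) == tuple(int(i) for i in back)

print(flat_index, grid[row, col])
```

Multiplying the row by `num_rows` instead of `num_cols` gives the same index only when the grid is square.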

# Singular Value Decomposition

Here we perform a singular value decomposition (SVD), which factors the data into singular vectors (U and V) and singular values (S). 

In [None]:
# Choose how many components you want
num_svd_comps = 256

proc = px.processing.SVD(h5_main, num_components=num_svd_comps)

h5_svd_group = proc.compute()
    
h5_u = h5_svd_group['U']
h5_v = h5_svd_group['V']
h5_s = h5_svd_group['S']
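
For intuition about what `U`, `S`, and `V` contain, here is a minimal sketch using `numpy.linalg.svd` on random stand-in data rather than the pycroscopy `SVD` process used above. The matrix shape (positions by sensor pixels) and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_positions, num_pixels = 20, 12  # stand-in sizes
data = rng.normal(size=(num_positions, num_pixels))

# Thin SVD: data = U @ diag(S) @ V, singular values sorted in descending order
u, s, v = np.linalg.svd(data, full_matrices=False)
print(u.shape, s.shape, v.shape)  # (20, 12) (12,) (12, 12)

# Keeping every component reproduces the data up to floating-point error
reconstructed = (u * s) @ v
print(np.allclose(data, reconstructed))

# Keeping only the leading components gives a low-rank approximation
k = 4
approx = (u[:, :k] * s[:k]) @ v[:k]

# Fraction of the total squared signal captured by the first k components
captured = np.sum(s[:k] ** 2) / np.sum(s ** 2)
print(f"{captured:.2f}")
```

Choosing the number of components is then a trade-off between how much of the signal is retained and how much data has to be stored and processed.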

# SVD Visualization

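The results of the decomposition are visualized. 
- S holds the singular values, which indicate how much variance each component captures
- U holds the left singular vectors (one column per component), plotted as abundance maps
- V holds the right singular vectors (one row per component), plotted as endmembers 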

In [None]:
# Choose how many components of U and V to display
num_plot_comps = 16

In [None]:
# Visualize variance of the principal components
fig, axes = usid.plot_utils.plot_scree(h5_s, title='Variance')

In [None]:
# Visualize the abundance maps from SVD:
loadings = np.reshape(h5_u[:, :num_plot_comps], (num_rows, num_cols, -1)).transpose([2, 0, 1])
fig, axes = usid.plot_utils.plot_map_stack(loadings, num_comps=num_plot_comps, title='Abundance Maps',
                                         cmap=px.plot_utils.cmap_jet_white_center())

In [None]:
# Visualize the Endmembers from SVD:
eigenvectors = np.reshape(h5_v[:num_plot_comps], (-1, num_sensor_rows, num_sensor_cols))
fig, axes = usid.plot_utils.plot_map_stack(eigenvectors, num_comps=num_plot_comps, title='Endmembers',
                                         cmap=px.plot_utils.cmap_jet_white_center())
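
The reshaping used for the abundance maps and endmembers can be checked on small stand-in arrays. This is a sketch with random data and made-up sizes. Only the `np.reshape` and `transpose` pattern is taken from the cells above.

```python
import numpy as np

num_rows, num_cols = 3, 4
num_sensor_rows, num_sensor_cols = 2, 5
num_comps = 6
num_plot_comps = 2

rng = np.random.default_rng(1)
# Stand-ins: U is positions x components, V is components x sensor pixels
u = rng.normal(size=(num_rows * num_cols, num_comps))
v = rng.normal(size=(num_comps, num_sensor_rows * num_sensor_cols))

# Columns of U -> stack of position maps, component index first
loadings = np.reshape(u[:, :num_plot_comps],
                      (num_rows, num_cols, -1)).transpose([2, 0, 1])
print(loadings.shape)  # (2, 3, 4)

# Rows of V -> stack of sensor-sized images
eigenvectors = np.reshape(v[:num_plot_comps],
                          (-1, num_sensor_rows, num_sensor_cols))
print(eigenvectors.shape)  # (2, 2, 5)

# Each map is one column of U (or one row of V) reshaped on its own
assert np.array_equal(loadings[1], u[:, 1].reshape(num_rows, num_cols))
assert np.array_equal(eigenvectors[1],
                      v[1].reshape(num_sensor_rows, num_sensor_cols))
```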

# Clustering

Here, the number of SVD components is limited, and the positions are clustered according to the components that remain. 

In [None]:
# Choose how many SVD components to use in clustering
spectral_components = 128
# Choose how many clusters to use
num_clusters = 32

In [None]:
estimator = KMeans(n_clusters=num_clusters)

proc = px.processing.Cluster(h5_u, estimator, num_comps=spectral_components)

h5_kmeans_group = proc.compute()
    
h5_labels = h5_kmeans_group['Labels']
h5_centroids = h5_kmeans_group['Mean_Response']

# In case we take existing results, we need to get these parameters
num_comps_for_clustering = h5_centroids.shape[1]

In [None]:
label_mat = np.reshape(h5_labels, (num_rows, num_cols))
fig, axes = px.viz.cluster_utils.plot_cluster_labels(label_mat, num_clusters=num_clusters)

In [None]:
e_vals = np.reshape(h5_u[:, :spectral_components], 
                    (num_rows, num_cols, -1))
fig = px.viz.cluster_utils.plot_cluster_dendrogram(label_mat, e_vals, 
                                                   num_comps_for_clustering, 
                                                   num_clusters, 
                                                   last=num_clusters);
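
To illustrate the idea of clustering positions on a reduced set of components, here is a minimal sketch with a plain NumPy implementation of Lloyd's k-means iteration on synthetic data. It stands in for, and is not, the scikit-learn `KMeans` estimator and pycroscopy `Cluster` process used above. The function `simple_kmeans`, the synthetic features, and all sizes are hypothetical.

```python
import numpy as np

def simple_kmeans(features, init_centroids, num_iters=20):
    """Basic Lloyd's iteration: assign points to the nearest centroid, then update."""
    centroids = np.array(init_centroids, dtype=float)
    for _ in range(num_iters):
        # Squared distance from every point to every centroid: (n_points, n_clusters)
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(len(centroids)):
            members = features[labels == c]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[c] = members.mean(axis=0)
    return labels, centroids

rng = np.random.default_rng(0)
num_rows, num_cols, num_comps = 4, 6, 8

# Synthetic stand-in for U: left half of the map near -5, right half near +5
col_of_pos = np.tile(np.arange(num_cols), num_rows)
offsets = np.where(col_of_pos < num_cols // 2, -5.0, 5.0)
noise = rng.normal(scale=0.1, size=(num_rows * num_cols, num_comps))
u_like = offsets[:, None] + noise

# Keep only the leading components before clustering
kept = u_like[:, :3]

# Initialise with one point from each half (positions 0 and num_cols - 1)
labels, centroids = simple_kmeans(kept, kept[[0, num_cols - 1]])

# Reshape the per-position labels back into a 2D map
label_mat = labels.reshape(num_rows, num_cols)
print(label_mat)
```

With real data the initial centroids are not known in advance, which is one reason to rely on a library estimator with a more careful initialisation.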

# Close and Save

In [None]:
h5_file.close()