# 3-channel image compiler for EPMA elemental maps of petrographic thin sections

This notebook contains a script with step-by-step explanations to create a widget that stacks three 8-bit grayscale `.tiff` files into a single 3-channel RGB image and saves it to your local drive as a new `.tiff` file. 

The first three code blocks were provided by Akshay Mehra.

## Overview


### This is going to need to be written after I have worked on this some more.

The purpose of this notebook is to composite EPMA elemental map data into 3-channel RGB images. This is part of a workflow for mineral grain classification of petrographic thin sections, developed as a class project for MLGeo 2023 within the scope of Nicole Aikin's PhD thesis. 

The goal is to produce 3-channel RGB composite images as training and testing datasets for a machine learning model that classifies mineral grains based on the elemental composition at each microprobe measurement point, or pixel.

#### Steps
1. make widget
2. create RGB `.tiffs`
3. ...
4. ..

### Data
Each thin section has nine or ten elemental map `.tiff` files, one for each element. 
Image dimensions differ between thin sections: 703 x 1100, 575 x 1026, and 695 x 1152 pixels (the last size is shared by two thin sections). The table lists the dimensions and the number of maps for each thin section.

|Thin section|Index|Dimensions|# of maps|
|---|---|---|---|
|84.7-NA-2-1_Cold|0|695 x 1152|9|
|84.7-NA2-2_Cold|1|695 x 1152|9|
|78.7-10-1_Hot|2|703 x 1100|10|
|78.7-10-2_Hot|3|575 x 1026|10|

 

#### **Creating the widget**
##### 1. Loading the images and storing them as 3D numpy array image stacks for each thin section
##### 2. Defining the widget interface and 3-channel compiler and file saving functions. 


In [16]:
# load all packages

import numpy as np                      
import os                               # to work with filepaths and local directories
import matplotlib.pyplot as plt         # to display the output images in the notebook
%matplotlib inline

# Python image library (PIL) lets you access .tiff files
from PIL import Image

# to create and add functionality to a widget
import ipywidgets as widgets

# to display the widget
from IPython.display import display

# to save arrays to .tiff files
import imageio
# to check your OS in order to open a file saving prompt
import platform
import tempfile
import shutil
from tkinter import Tk, filedialog



The first step is to load in `.tiff`s and turn them into 2D arrays. We'll do this with `Image.open('filePath')` and `np.asarray(imageName)`. 

`np.asarray(imageName)` parses the `.tiff` file row by row, so the 2D array that it creates represents the `.tiff` raster visually:

This image: <br>
 `image = ` <br>   
 `1 2 3` <br>
 `4 5 6` <br>
 `7 8 9`  

becomes this array:
  
`image_array =` <br> 
`[[1, 2, 3]` <br>
`[4, 5, 6]` <br>
`[7, 8, 9]]`               
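
As a small illustration of this row-by-row layout, the sketch below types in the 3 x 3 example by hand (instead of loading a `.tiff`) and reads values back by `[row, column]` index. The name `image_array` follows the example above; the other variable names are only illustrative.

```python
import numpy as np

# The 3 x 3 example raster from above, typed in by hand instead of loaded from a .tiff
image_array = np.asarray([[1, 2, 3],
                          [4, 5, 6],
                          [7, 8, 9]])

# shape is (number of rows, number of columns)
print(image_array.shape)

# indexing is [row, column], both counted from 0
top_right = image_array[0, 2]      # first row, third column
bottom_left = image_array[2, 0]    # third row, first column
second_row = image_array[1, :]     # the whole second row

print(top_right, bottom_left, second_row)
```

The same `[row, column]` indexing applies to the full-size elemental maps once they are converted with `np.asarray`.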

We're doing this step separately because it allows us to look at the data before we continue working with it:

In [2]:
# LOADING THE TIFFS
# Load in a single image
singleMap = Image.open('/Users/jonathanlindenmann/jonspace/000_aut_23/ESS469/Git/ML-Geo-Pixel-Poppers/Aikin_Data/78.7-10-1_Hot/RAW/UGG-W3-87.7-10.1-Full_Al Ka_Al Ka EDS.Tiff')
# Turn this image into an array
singleMapArray = np.asarray(singleMap)


#print(singleMapArray)

Just from looking at the values in some of the channels (i.e. maps) this way, we can see that we need to format the data:
- There are decimal places (values slightly off whole integers) that we need to round to the nearest integer value.
- There are values as high as 960, when we need values to be between 0 and 255.

We can do both by first scaling the values from the current range (0 - maxValueOfChannel) to the range 0 - 255, then rounding to nearest integers. BUT, if we do this to each channel, we will scale each channel a different amount. 
Instead, we want to keep the current relative size of each value across the entire map stack by scaling all channels from the range 0 - maxValueOfMapStack to the range 0 - 255. 
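
A minimal sketch of this stack-wide scaling is shown here. It is not part of the notebook's pipeline: the function name `scale_stack_to_uint8` and the toy values are made up for illustration, and it assumes all values are non-negative and that the microprobe sensitivity question raised in this notebook has been settled in favor of one shared scale.

```python
import numpy as np

def scale_stack_to_uint8(map_stack):
    """Scale a whole map stack from 0..max(stack) to 0..255 with ONE shared factor.

    Assumes non-negative input values (hypothetical helper, not from the notebook).
    """
    map_stack = np.asarray(map_stack, dtype=np.float64)
    stack_max = map_stack.max()
    if stack_max <= 0:
        # nothing to scale (all values are zero); avoid dividing by zero
        return np.zeros(map_stack.shape, dtype=np.uint8)
    scaled = map_stack / stack_max * 255.0
    # round to the nearest integer, then store as 8-bit values
    return np.rint(scaled).astype(np.uint8)

# toy stack: 2 x 2 pixels, 2 channels, values made up for illustration
toy_stack = np.zeros((2, 2, 2))
toy_stack[:, :, 0] = [[0.0, 192.0], [480.0, 960.0]]   # channel holding the stack maximum
toy_stack[:, :, 1] = [[0.0, 96.0], [192.0, 384.0]]    # weaker channel

scaled_stack = scale_stack_to_uint8(toy_stack)
print(scaled_stack[:, :, 0])
print(scaled_stack[:, :, 1])
```

Because both channels are divided by the same maximum, the weaker channel stays weaker after scaling instead of being stretched to the full 0 - 255 range.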

# RESOLVE DATA SCALING - IMPORTANT!:
Confirm that the microprobe sensitivity is the same for each chemical map. Otherwise, we need to account for this in our data, or make a deliberate choice of scaling system for the available chemical maps.

### ALSO IMPORTANT FOR JON TO DO: get the use of filenames using the os package down so that we can have names of elements and thin sections instead of index numbers!!
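
One possible approach to this filename task is sketched here. It assumes that every map filename follows the pattern of the filenames that appear in this notebook, i.e. that the second-to-last underscore-separated field starts with the element symbol (as in `..._Ca Ka_Ca Ka EDS.Tiff`). The helper name `element_from_filename` is hypothetical, and the pattern should be checked against all files before relying on it.

```python
import os

def element_from_filename(filename):
    """Guess the element symbol from a map filename (ASSUMED naming pattern)."""
    stem = os.path.splitext(filename)[0]          # drop the '.Tiff' extension
    fields = stem.split('_')                      # e.g. [..., 'Ca Ka', 'Ca Ka EDS']
    if len(fields) < 2:
        return None                               # filename does not follow the pattern
    line_field = fields[-2].split()               # e.g. ['Ca', 'Ka']
    return line_field[0] if line_field else None

# filenames as they appear elsewhere in this notebook
example_names = [
    'CC-84.7-R21-NA2-1_full_Ca Ka_Ca Ka EDS.Tiff',
    'CC-84.7-R21-NA2-1_full_Ti Ka_Ti Ka (Sp 5).Tiff',
    'UGG-W3-87.7-10.1-Full_Al Ka_Al Ka EDS.Tiff',
]
elements = [element_from_filename(name) for name in example_names]
print(elements)
```

A list like `elements` could then be stored next to the map filenames, so that dropdowns and plots can show element symbols instead of index numbers.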



We have four thin sections, and each one has nine or ten `.tiff` files of elemental maps. These `.tiff` files are stored in `projectRoot/Aikin_Data/thinSectionName/RAW/`. We want to turn all `.tiff`s for a given thin section into 2D arrays and stack them along the third dimension of a `mapStack` array. We can then collect these map stacks in a `list` object, `thinSections`, which contains one entry for each thin section, holding that thin section's image stack together with its name. 
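
To make "stacking along the third dimension" concrete, here is a short sketch with three made-up 2 x 3 arrays standing in for loaded maps. It uses `np.stack`, which is an alternative to pre-allocating an empty array and filling it layer by layer; the array names are only illustrative.

```python
import numpy as np

# three made-up 2 x 3 "maps" standing in for the arrays loaded from .tiff files
map_a = np.full((2, 3), 1.0)
map_b = np.full((2, 3), 2.0)
map_c = np.full((2, 3), 3.0)

# stack along a new last axis: result has shape (rows, columns, number of maps)
map_stack = np.stack([map_a, map_b, map_c], axis=-1)
print(map_stack.shape)

# layer k of the stack is map k
second_layer = map_stack[:, :, 1]
print(np.array_equal(second_layer, map_b))
```

All maps in one stack need the same shape, which is why each thin section gets its own stack.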


## Breakdown of the original code provided by Akshay (now modified):
1. Create the list `thinSections`. Each entry is a `dict` object that contains the thin section `name` (the name of the folder that contains the 'RAW' folder) and its `stack`.
2. Create a loop that iterates over every directory in the parent folder with the name 'RAW' and extracts the `.tiff`s within into a 3D array, then adds the image stack to the `thinSections` list object:
    1. Navigate to a directory called 'RAW'.
    2. In the 'RAW' directory, use the name of the folder the 'RAW' folder is in as the `thinSectionName` and add it to the `thinSections` list as a `dict` object with the key `name`.
    3. Find the number of files in the 'RAW' folder.
    4. For all 'RAW' folders that have files in them:
        1. Create a boolean that logs whether or not the image stack array has been created.
        2. Create an empty array for the image stack.
        3. Count the number of image stacks/RAW folders being iterated over.
    5. For each `.tiff`, open the file, turn it into an array, and add it to the `mapStack`, which is the loop's working image stack, at the layer specified by the map counter (`tsMapCount`). The map counter indexes each 2D layer by counting the layers.

Refer to the code to understand the rest. 
            

Things added: 
- Map filename element within each thin section element. 

In [3]:
# We know that we can turn one image into an array...can we stack all the images for a thin section into one, 3D array?

# Create a list (a plain Python list, NOT a NumPy array!) that will contain all thin sections and their names as dict objects.

thinSections = [] # empty LIST for all the stacked thin section arrays

iteration = 0 # index of the thin section (RAW folder) currently being processed

# Everything in this for loop will iterate over thin section. Everything in this loop is happening to ONE THIN SECTION at a time. 
# Okay, for every folder in our parent folder

for root, dirs, files in os.walk('.//'):
    # './/' means the CURRENT working directory. In this case, it is the one that the jupyter notebook is stored in (perfect!)
        
        #didnt work # the walk() function is parsing in a weird order, so this fixes that with more computation:
        #files = sorted(files)
        
        # Does the directory contain the word RAW?
        if "RAW" in root:
            # If so, what is the name of the thin section? Use the root (as in /root/RAW/mapsAreHere.tiff) folder as the name of the thin section.
            thinSectionName = os.path.basename(os.path.dirname(root))
            # Add the name to the list.
            thinSections.append(dict({'name': thinSectionName}))
            
            mapNamesArray = []
            # Are there files in the RAW folder?
            numberOfFiles = len(files)
                
            # If yes, let's figure out what the files are. Particularly, we want to identify files that aren't .tiffs, since we're not using those. 
            if numberOfFiles > 0:
                # make an empty list for all filenames
                fileList = []
                # make an empty list for booleans: Is this file a .tif(f)? True/False (for later)
                tiff_filter = []

                # check each file in the RAW folder
                for file in files:
                    # get the name of the file
                    filename = file
                    # get a TRUE for a .tif(f) and a FALSE for other filetypes
                    is_tiff = os.path.splitext(filename)[1].lower() in ('.tiff', '.tif')
            
                    # add the results to the lists
                    fileList.append(filename)
                    tiff_filter.append(is_tiff)

                #print("There are", numberOfFiles, "files in ", thinSectionName, "\u200B/RAW/:", fileList)
                #print("Booleans (if .tif(f) then True): ", tiff_filter)

                # now we'll make a list of the filenames of ONLY the .tif(f) files in the RAW folder. 
                # In our case, there are only .tiff files, anyways.
                tiffList = [value for value, condition in zip(fileList, tiff_filter) if condition]
                numberOfTiffs = len(tiffList)
                # print(tiffList)
                # We'll also add this list of .tif(f) filenames, so that we can reference it later and keep track of which array
                # is for which element
                thinSections[iteration]['mapNames'] = tiffList

                # Next we want to actually load the .tiffs and put them in our mapStack in our thinSection list 
                # along with the thin section name and the map filenames. 
                # by default, when we start making the map stack now, there is no map stack array, 
                # so we need to make one, but need to make it the shape of the thin section map data.
                # We do this by setting its condition now (False), then creating the array and setting the condition to true in an 
                # if statement that only runs if the condition is false. This way, we run the for loop for every file, but only create
                # the mapStack array during the first iteration, using the shape of the first map. 
                arraySet = False 
                # We also keep track of the number of maps we've added.
                tsMapCount = 0


                # for each .tiff file,
                for file in tiffList:
                    # open the file
                    single_tsMap = Image.open(os.path.join(root, file))
                    # convert it to a 2D numpy array
                    single_MapArray = np.asarray(single_tsMap)
                    #print(single_MapArray.shape)
                    
                    #the next two lines of code aren't needed anymore because of the fileList and tiff_filter functionalities
                    # get the filename of the map .tiff
                    #tsMapName = file
                    #print(tsMapName)
                    
                    #during the first iteration of the loop, create the map stack array:
                    if not arraySet: 
                        # get the shape of the image data
                        tsShape = single_MapArray.shape
                        # create an empty array with that shape and a third dimension with a spot for each map layer
                        mapStack = np.empty([tsShape[0], tsShape[1], numberOfTiffs])
                        # set the map stack condition to created.
                        arraySet = True


                    # Now we want to append the 2D map array to the mapstack at the right 
                    # 3D index (0 for the first map, 1 for the second, ... 8 for the ninth map)
                    mapStack[:, :, tsMapCount] = single_MapArray
                    # increase the map counter and repeat for the next .tif(f) file. 
                    tsMapCount += 1
                
                # Now, we've got a list with each thin section as a dictionary element. 
                # In each element, there are two keys (the name of the thin section, and the list of .tiffs used in the map stack, both of which we added earlier)
                # Next we'll add the mapStack we just created for the thin section we are currently iterating over. 
                thinSections[iteration]['stack'] = mapStack 
                

            # we increase the iteration counter and iterate over the next RAW folder/thin section.     
            iteration += 1




#these print commands are for troubleshooting.
#print(thinSectionName)
#print("tsArray.shape:", tsArray.shape)
#print ("Number of thin sections ", iteration)
#print("sectionCount:", sectionCount)
#print("thinSections:", thinSections)
#print("length of thinSections:", len(thinSections))
for section in thinSections:
    tsName = section.get('name', 'N/A') 
    print("Keys for ", tsName)
    for key in section.keys():
        print(f"  {key}")
    print()


Keys for  84.7-NA-2-1_Cold
  name
  mapNames
  stack

Keys for  84.7-NA2-2_Cold
  name
  mapNames
  stack

Keys for  78.7-10-1_Hot
  name
  mapNames
  stack

Keys for  78.7-10-2_Hot
  name
  mapNames
  stack



We now have `thinSections`, which is a list that contains 4 elements, one for each thin section. Each element is a `dict` that represents one thin section and contains 
- `name`: name of the thin section (from the name of the folder that contains the 'RAW' folder),
- `mapNames`: list of the filenames of the individual `.tiff` files, and
- `stack`: 3D numpy array that is an image stack of all the `.tiff`s of that thin section.
- (`mappedElements`: list of the individual elements in each map; not added yet)
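
The sketch below shows how an entry with these keys can be used to fetch one map by its filename. The entry `example_section`, its filenames, and the helper `get_map_by_name` are made up for illustration; only the keys `name`, `mapNames`, and `stack` come from the notebook.

```python
import numpy as np

# a made-up entry with the same keys as the entries in thinSections
example_section = {
    'name': 'example_thin_section',
    'mapNames': ['map_Al.tiff', 'map_Ca.tiff', 'map_Fe.tiff'],
    'stack': np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3),
}

def get_map_by_name(section, map_name):
    """Return the 2D layer of section['stack'] that belongs to map_name."""
    layer_index = section['mapNames'].index(map_name)   # raises ValueError if absent
    return section['stack'][:, :, layer_index]

ca_map = get_map_by_name(example_section, 'map_Ca.tiff')
print(ca_map)
```

This works because the order of `mapNames` matches the order of the layers in `stack`.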

Next we want to create a widget to work with these image stacks. 

## Creating the widget

We want to be able to

- create RGB images
    - choose a thin section
    - select a map from the thin section's map stack for each channel
    - compile the selected maps into an RGB image and display the image
    - save the image to the right location with right filename

It would be nice to add additional functionalities, such as the ability to view the individual 8-bit chemical maps with different colormaps.
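
Stripped of the widget controls, the compile step comes down to picking three layers from a map stack. The following is a minimal sketch of that step, assuming the stack values are already in (or close to) the 0 - 255 range. The function name `compose_rgb` and the toy values are hypothetical. Values outside 0 - 255 are clipped before conversion to 8-bit.

```python
import numpy as np

def compose_rgb(map_stack, red_index, green_index, blue_index):
    """Pick three layers of a map stack and return a (rows, cols, 3) uint8 array."""
    rgb = map_stack[:, :, [red_index, green_index, blue_index]]
    # clip to the displayable range and store as 8-bit integers
    return np.clip(rgb, 0, 255).astype(np.uint8)

# toy stack: 2 x 2 pixels, 4 channels, each channel filled with one made-up value
toy_stack = np.stack([np.full((2, 2), v) for v in (10.0, 20.0, 300.0, 40.0)], axis=-1)

rgb_image = compose_rgb(toy_stack, red_index=2, green_index=0, blue_index=3)
print(rgb_image.shape, rgb_image.dtype)
print(rgb_image[0, 0])
```

In the toy example, the red channel value of 300 is clipped to 255, which shows why scaling the data before compositing matters.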

1. Add the names of each layer to the dropdown lists for each channel. (customized dropdown fields) Add a 'display vector'.
2. 



In [20]:
# create the dropdown list items lists:
#for thin section names:
tsNamesList = []
for k in range (len(thinSections)):
    thisThinSection = thinSections[k]
    tsName = thisThinSection['name']
    tsNamesList.append(tsName)
print(tsNamesList)

# mapNames within each thin section:
# we'll make a list of lists of mapNames, one inner list per thin section (9 maps for the cold and 10 maps for the hot thin sections)
print(len(thinSections))
mapNamesList = []
for k in range(len(thinSections)):
    # #Create the array at the beginning of the loop
    # if mapNamesListArraySet == False:
    
    #     mapNamesListArraySet = True

    thisThinSection = thinSections[k]
    this_tsMapNames = thisThinSection['mapNames']
    mapNamesList.append(this_tsMapNames)
#print(mapNamesList)

thinSectionNames = [section['name'] for section in thinSections]  # list of all thin section names
# Create a dropdown widget for thin section selection
thinSectionDropdown = widgets.Dropdown(
    options= [None] + tsNamesList,
    description='Thin Section:'
)

# Create dropdown widgets for the three dimensions
channel1Dropdown = widgets.Dropdown(
    options=[None],
    description='Red'
)

channel2Dropdown = widgets.Dropdown(
    options=[None],
    description='Green'
)

channel3Dropdown = widgets.Dropdown(
    options=[None],
    description='Blue'
)

# i need a function that can figure out the thin section list index from just the thin section name:
def find_index_by_name(thinSections, selected_thinSectionName):
    for index, section in enumerate(thinSections):
        if section['name'] == selected_thinSectionName:
            return index  # return the index value of the name is found. 
    return None  # in case there is no match


# Update channel dropdown options based on the selected thin section
def update_channel_options(change):
    selected_thinSectionName = change.new
    selected_thinSectionIndex = find_index_by_name(thinSections, selected_thinSectionName)
    # if no thin section is selected (None), there are no maps to offer
    if selected_thinSectionIndex is None:
        channelOptions = [None]
    else:
        channelOptions = [None] + mapNamesList[selected_thinSectionIndex]
    channel1Dropdown.options = channelOptions
    channel2Dropdown.options = channelOptions
    channel3Dropdown.options = channelOptions

# Attach the update_channel_options function to the observe method of thinSectionDropdown
thinSectionDropdown.observe(update_channel_options, names='value')


# Create a button widget to trigger visualization
visualizeBtn = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='success',  # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Click to compile selection to RGB and view the output.',
    icon='magnifying-glass'
)

# Create a button widget to save files
savefileBtn = widgets.Button(
    description='Save as...',
    disabled=False,
    button_style='success',  # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Compile to RGB and save as .tiff.',
    icon='download'
)


# Create an output widget to display the image
output = widgets.Output()

   





def visualize(b):

    # compile should: collect the three maps selected in the dropdowns and stack them into an RGB array, then return the RGB array.
    # the dropdowns contain names, not the indices of the data we want to access, so we need to get the indices first. 
    # get the map stack and its shape from the selected thin section
    
    
    # initialize the RGB map stack array
    # What shape is the selected thin section?
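    selected_thinSectionName = thinSectionDropdown.value
    selected_thinSectionIndex = find_index_by_name(thinSections, selected_thinSectionName)
    # stop early if the thin section or any of the three channels has not been selected yet
    if selected_thinSectionIndex is None or None in (channel1Dropdown.value, channel2Dropdown.value, channel3Dropdown.value):
        print("Select a thin section and a map for each channel first.")
        return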
    thisThinSection = thinSections[selected_thinSectionIndex]
    thisStack = thisThinSection['stack']
    thisStackShape = thisStack.shape

    # initialize the RGB map stack array with the selection thin section shape and 3 channels. (dtype='int' cleans up decimals)
    RGB_stack = np.empty([thisStackShape[0], thisStackShape[1], 3], dtype='int')
     

    # What are the mapStack indices of the selected channels?
    # compare the selected map name to a list of all maps for the thin section and retrieve the index number of the map.
    # this is the same index number as for the mapStack array, so we can use it to access the data we need. 
    
    # make a list of all the map filenames in this thin section
    this_mapStackNames = mapNamesList[selected_thinSectionIndex]
    #print("this_mapStackNames:", this_mapStackNames)

    # there seem to be some hidden whitespaces making the filenames not match. remove any whitespaces from all the filenames in the list:
    this_mapStackNames_stripped = np.char.strip(this_mapStackNames)
    
    # channel 1    
    # remove whitespaces from the selected filename, as well
    selected_mapName1 = channel1Dropdown.value.strip()
    # find the index number of the selected map (same for mapStackNames (where the names are) and thisStack (where the data is))
    selected_mapNameIndices1 = np.nonzero(np.char.strip(this_mapStackNames_stripped) == selected_mapName1) #thanks, chatgpt, for this line
            # here is a quick breakdown if you'd like to know:
                # - np.char.strip(this_mapStackNames_stripped) strips whitespaces again (seems like a good type of redundancy to have here)
                # - adding == selected_mapName1 creates an array the same shape as this_mapStackNames_stripped, but with False at every 
                # position without selected_mapName1 as the value, and True otherwise.
                # - np.nonzero() returns the indices of all the True values (we only have one). Note: the output is in this form: 
                # selected_mapNameIndices1 = (array([x]),)  where x is the index of where selected_MapName1 is found in this_mapStackNames_stripped.
                # --> the result is that I am using ...Indices1 as a variable, then extracting ...Index1 from it, even though there's nothing else in it. 
    # did that return an index number?
    if selected_mapNameIndices1[0].size > 0: 
    # if yes, extract the first element of the array (b/c of the np.nonzero output format)
        selected_mapNameIndex1 = selected_mapNameIndices1[0][0]
        print(f"Index of {selected_mapName1} in this_mapStackNames: {selected_mapNameIndex1}")
    else:
        print(f"{selected_mapName1} not found in this_mapStackNames")

    # channel 2, repeat
    # Get the selected map name and strip whitespaces
    selected_mapName2 = channel2Dropdown.value.strip()
    # Find the index of the selected_mapName in this_mapStackNames_stripped
    selected_mapNameIndices2 = np.nonzero(np.char.strip(this_mapStackNames_stripped) == selected_mapName2)
    # Check if the selected_mapName is found in this_mapStackNames
    if selected_mapNameIndices2[0].size > 0:
    # Extract the first element of the array (assuming unique values)
        selected_mapNameIndex2 = selected_mapNameIndices2[0][0]
        print(f"Index of {selected_mapName2} in this_mapStackNames: {selected_mapNameIndex2}")
    else:
        print(f"{selected_mapName2} not found in this_mapStackNames")


     # channel 3, repeat
    # Get the selected map name and strip whitespaces
    selected_mapName3 = channel3Dropdown.value.strip()
    # Find the index of the selected_mapName in this_mapStackNames_stripped
    selected_mapNameIndices3 = np.nonzero(np.char.strip(this_mapStackNames_stripped) == selected_mapName3)
    # Check if the selected_mapName is found in this_mapStackNames
    if selected_mapNameIndices3[0].size > 0:
    # Extract the first element of the array (assuming unique values)
        selected_mapNameIndex3 = selected_mapNameIndices3[0][0]
        print(f"Index of {selected_mapName3} in this_mapStackNames: {selected_mapNameIndex3}")
    else:
        print(f"{selected_mapName3} not found in this_mapStackNames")

    # now the indices of the selected channels are known. we can use them to assign the right channels from this thin section's map stack
    # to the R, G, and B channels.
    RGB_stack[:, :, 0] = thisStack[:, :, selected_mapNameIndex1]
    RGB_stack[:, :, 1] = thisStack[:, :, selected_mapNameIndex2]
    RGB_stack[:, :, 2] = thisStack[:, :, selected_mapNameIndex3]
        
    # finally, display the output in the widget.
    with output:
        output.clear_output() # just in case
        plt.figure(figsize=(11, 8.25))
        plt.imshow(RGB_stack)  # RGB data ignores colormaps; integer values outside 0-255 are clipped for display
        plt.show()




system = platform.system()

def savefile(b):

     # compile should: collect the three maps selected in the dropdowns and stack them into an RGB array, then return the RGB array.
    # the dropdowns contain names, not the indices of the data we want to access, so we need to get the indices first. 
    # get the map stack and its shape from the selected thin section
    
    
    # initialize the RGB map stack array
    # What shape is the selected thin section?
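    selected_thinSectionName = thinSectionDropdown.value
    selected_thinSectionIndex = find_index_by_name(thinSections, selected_thinSectionName)
    # stop early if the thin section or any of the three channels has not been selected yet
    if selected_thinSectionIndex is None or None in (channel1Dropdown.value, channel2Dropdown.value, channel3Dropdown.value):
        print("Select a thin section and a map for each channel first.")
        return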
    thisThinSection = thinSections[selected_thinSectionIndex]
    thisStack = thisThinSection['stack']
    thisStackShape = thisStack.shape

    # initialize the RGB map stack array with the selection thin section shape and 3 channels. (dtype='int' cleans up decimals)
    RGB_stack = np.empty([thisStackShape[0], thisStackShape[1], 3], dtype='int')
     

    # What are the mapStack indices of the selected channels?
    # compare the selected map name to a list of all maps for the thin section and retrieve the index number of the map.
    # this is the same index number as for the mapStack array, so we can use it to access the data we need. 
    
    # make a list of all the map filenames in this thin section
    this_mapStackNames = mapNamesList[selected_thinSectionIndex]
    #print("this_mapStackNames:", this_mapStackNames)

    # there seem to be some hidden whitespaces making the filenames not match. remove any whitespaces from all the filenames in the list:
    this_mapStackNames_stripped = np.char.strip(this_mapStackNames)
    
    # channel 1    
    # remove whitespaces from the selected filename, as well
    selected_mapName1 = channel1Dropdown.value.strip()
    # find the index number of the selected map (same for mapStackNames (where the names are) and thisStack (where the data is))
    selected_mapNameIndices1 = np.nonzero(np.char.strip(this_mapStackNames_stripped) == selected_mapName1) #thanks, chatgpt, for this line
            # here is a quick breakdown if you'd like to know:
                # - np.char.strip(this_mapStackNames_stripped) strips whitespaces again (seems like a good type of redundancy to have here)
                # - adding == selected_mapName1 creates an array the same shape as this_mapStackNames_stripped, but with False at every 
                # position without selected_mapName1 as the value, and True otherwise.
                # - np.nonzero() returns the indices of all the True values (we only have one). Note: the output is in this form: 
                # selected_mapNameIndices1 = (array([x]),)  where x is the index of where selected_MapName1 is found in this_mapStackNames_stripped.
                # --> the result is that I am using ...Indices1 as a variable, then extracting ...Index1 from it, even though there's nothing else in it. 
    # did that return an index number?
    if selected_mapNameIndices1[0].size > 0: 
    # if yes, extract the first element of the array (b/c of the np.nonzero output format)
        selected_mapNameIndex1 = selected_mapNameIndices1[0][0]
        print(f"Index of {selected_mapName1} in this_mapStackNames: {selected_mapNameIndex1}")
    else:
        print(f"{selected_mapName1} not found in this_mapStackNames")

    # channel 2, repeat
    # Get the selected map name and strip whitespaces
    selected_mapName2 = channel2Dropdown.value.strip()
    # Find the index of the selected_mapName in this_mapStackNames_stripped
    selected_mapNameIndices2 = np.nonzero(np.char.strip(this_mapStackNames_stripped) == selected_mapName2)
    # Check if the selected_mapName is found in this_mapStackNames
    if selected_mapNameIndices2[0].size > 0:
    # Extract the first element of the array (assuming unique values)
        selected_mapNameIndex2 = selected_mapNameIndices2[0][0]
        print(f"Index of {selected_mapName2} in this_mapStackNames: {selected_mapNameIndex2}")
    else:
        print(f"{selected_mapName2} not found in this_mapStackNames")


     # channel 3, repeat
    # Get the selected map name and strip whitespaces
    selected_mapName3 = channel3Dropdown.value.strip()
    # Find the index of the selected_mapName in this_mapStackNames_stripped
    selected_mapNameIndices3 = np.nonzero(np.char.strip(this_mapStackNames_stripped) == selected_mapName3)
    # Check if the selected_mapName is found in this_mapStackNames
    if selected_mapNameIndices3[0].size > 0:
    # Extract the first element of the array (assuming unique values)
        selected_mapNameIndex3 = selected_mapNameIndices3[0][0]
        print(f"Index of {selected_mapName3} in this_mapStackNames: {selected_mapNameIndex3}")
    else:
        print(f"{selected_mapName3} not found in this_mapStackNames")

    # now the indices of the selected channels are known. we can use them to assign the right channels from this thin section's map stack
    # to the R, G, and B channels.
    RGB_stack[:, :, 0] = thisStack[:, :, selected_mapNameIndex1]
    RGB_stack[:, :, 1] = thisStack[:, :, selected_mapNameIndex2]
    RGB_stack[:, :, 2] = thisStack[:, :, selected_mapNameIndex3]
 
    
        #turn the output array back into a .tiff (?)
        #rgb_array_uint8 = np.clip(RGB_stack, 0, 255).astype(np.uint8)
        #save the file to the working directory
        #imageio.imwrite('output_image.tiff', rgb_array_uint8)  

# # Save the file using the appropriate file saving prompt
#     if system == 'Darwin':  # macOS
#         from tkinterdnd2 import TkinterDnD, TkinterDnDThemed
#         from tkinterdnd2.simpledialog import askstring
#         root = TkinterDnDThemed.Tk()
#         root.withdraw()  # Hide the main window
#         file_path = askstring('Save File', 'Enter a file name:', initialfile='output_image.tiff', filetypes=[('TIFF files', '*.tiff')])
#     elif system == 'Windows':
#         from tkinter import Tk, filedialog
#         root = Tk()
#         root.withdraw()  # Hide the main window
#         file_path = filedialog.asksaveasfilename(defaultextension='.tiff', filetypes=[('TIFF files', '*.tiff')])

#     if file_path:
#         rgb_array_uint8 = np.clip(RGB_stack, 0, 255).astype(np.uint8)
#         imageio.imwrite(file_path, rgb_array_uint8)
#         print(f'File saved to: {file_path}')

    # Save the file to a temporary directory
    temp_dir = tempfile.mkdtemp()
    temp_file_path = os.path.join(temp_dir, 'output_image.tiff')
    rgb_array_uint8 = np.clip(RGB_stack, 0, 255).astype(np.uint8)
    imageio.imwrite(temp_file_path, rgb_array_uint8)

    # Open a file dialog to select the destination directory
    root = Tk()
    root.withdraw()  # Hide the main window
    file_path = filedialog.asksaveasfilename(defaultextension='.tiff', filetypes=[('TIFF files', '*.tiff')])
    root.destroy()  # close the hidden Tk window once the dialog has returned

    if file_path:
        shutil.copy(temp_file_path, file_path)
        print(f'File saved to: {file_path}')

    # Remove the temporary directory
    shutil.rmtree(temp_dir)

# Connect the button's click event to the visualization function
visualizeBtn.on_click(visualize) #when clicking the visualize button, run visualize(b)
savefileBtn.on_click(savefile)
# Display the widgets
display(thinSectionDropdown, channel1Dropdown, channel2Dropdown, channel3Dropdown, visualizeBtn, savefileBtn)
display(output)


['84.7-NA-2-1_Cold', '84.7-NA2-2_Cold', '78.7-10-1_Hot', '78.7-10-2_Hot']
4


Dropdown(description='Thin Section:', options=(None, '84.7-NA-2-1_Cold', '84.7-NA2-2_Cold', '78.7-10-1_Hot', '…

Dropdown(description='Red', options=(None,), value=None)

Dropdown(description='Green', options=(None,), value=None)

Dropdown(description='Blue', options=(None,), value=None)

Button(button_style='success', description='Visualize', icon='magnifying-glass', style=ButtonStyle(), tooltip=…

Button(button_style='success', description='Save as...', icon='download', style=ButtonStyle(), tooltip='Compil…

Output()

Index of CC-84.7-R21-NA2-1_full_Ca Ka_Ca Ka EDS.Tiff in this_mapStackNames: 2
Index of CC-84.7-R21-NA2-1_full_Fe Ka_Fe Ka EDS.Tiff in this_mapStackNames: 0
Index of CC-84.7-R21-NA2-1_full_Ti Ka_Ti Ka (Sp 5).Tiff in this_mapStackNames: 4


Index of CC-84.7-R21-NA2-1_full_Ca Ka_Ca Ka EDS.Tiff in this_mapStackNames: 2
Index of CC-84.7-R21-NA2-1_full_Fe Ka_Fe Ka EDS.Tiff in this_mapStackNames: 0
Index of CC-84.7-R21-NA2-1_full_Ti Ka_Ti Ka (Sp 5).Tiff in this_mapStackNames: 4


# Data analysis

At this point, we don't have a training dataset yet to give us information about what the values in 9 or 10 channels for each pixel represent, but we can analyze the relationships in the data and try to figure out what the most informative channels are. Of course, without labels, we can only go so far to interpret what is going on.

1. Variance
2. Mutual information
3. PCA

We'll find the variance of each channel's data (from the mean). We're not thinking about structure in the data at this point, just basic variance. 

In [6]:
#### 1. Variance ####

# for each channel, get the variance

#this gets the variance for a channel. 
#it needs to be placed inside a for loop that iterates over the entire map stack
#inside another for loop that iterates over the entire thinSections list.

# get the 2D array for the map
thinSectionIndex = 0
this_ts = thinSections[thinSectionIndex]
this_tsStack = this_ts['stack']
mapChannelIndex = 0
this_mapChannel = this_tsStack[:,:,mapChannelIndex]

# find the variance (= sum of squares / number of values)

# mean
mean = np.mean(this_mapChannel)

# sum of squares
sumOfSquares = 0
for row in this_mapChannel:
    for pixel in row:
        pixelDifference = pixel - mean
        sumOfSquares += pixelDifference**2

# number of values
dimensions = this_mapChannel.shape
n = dimensions[0] * dimensions[1]

channelVariance = sumOfSquares / n

# the quick way:
# channelVariance = np.var(this_mapChannel)

print(channelVariance)

# I'll need to save all the variances to a list, say 'channelVariances' (note the 's')


# next, we want to see which channels have the highest variance.

# Get the indices of the top three channels with the highest variances
# topThree = np.argsort(channelVariances)[::-1][:3] to get the three highest variance channels. 
# print(topThree)

# visualize as a bar plot for each thin section with a bin for each channel

#plt.bar(range(len(channelVariances)), channelVariances)
#plt.xlabel("Channel Index") # add the element names
#plt.ylabel("Variance")
#plt.title("Channel Variances")
#plt.show()

635.7768907585347
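
The per-channel loop described in the comments above can also be written with NumPy's axis arguments. The sketch below is one way to do it, using a made-up toy stack instead of the real map stacks. The helper names `channel_variances` and `top_channels` are hypothetical, while `channelVariances` and `topThree` follow the names suggested in the comments.

```python
import numpy as np

def channel_variances(map_stack):
    """Variance of every channel of a (rows, columns, channels) stack."""
    # axis=(0, 1) collapses rows and columns, leaving one value per channel
    return np.var(map_stack, axis=(0, 1))

def top_channels(variances, k=3):
    """Indices of the k channels with the highest variance, highest first."""
    return np.argsort(variances)[::-1][:k]

# toy stack: 4 channels whose spread increases with the channel index (made-up data)
rng = np.random.default_rng(0)
toy_stack = np.stack(
    [rng.normal(loc=100.0, scale=s, size=(50, 60)) for s in (1.0, 5.0, 10.0, 20.0)],
    axis=-1,
)

channelVariances = channel_variances(toy_stack)
topThree = top_channels(channelVariances)
print(channelVariances)
print(topThree)
```

`np.var` divides by the number of values by default, so it matches the sum of squares divided by `n` computed by hand above.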


In [7]:
#### 2. Mutual Information ####

In [None]:
#### 3. PCA ####



In [None]:
import pandas as pd
from sklearn.decomposition import PCA
from sklearn import preprocessing


# this bit of code formats the map stacks into tables of pixels,
# where each row holds the values from all channels for one pixel. Each pixel is a datapoint with 9 or 10 different 'columns' of data.
# the end result is one pandas DataFrame per thin section (the thin sections have different numbers of maps, so they are kept separate).

pixelTables = []

for thin_section in thinSections:
    this_tsStack = thin_section['stack']
    print(this_tsStack.shape)

    # reshape from (rows, columns, channels) to (pixels, channels)
    numberOfChannels = this_tsStack.shape[2]
    pixelStacks = this_tsStack.reshape(-1, numberOfChannels)
    pixelTables.append(pd.DataFrame(pixelStacks, columns=thin_section['mapNames']))

# work with one thin section at a time
thinSectionIndex = 0
data = pixelTables[thinSectionIndex]
print(data.shape)


# we can now scale and center the data (rows are pixels, columns are channels):
scaled_data = preprocessing.scale(data)

# pca = PCA()
# pca.fit(scaled_data)
# pca_data = pca.transform(scaled_data)

# we want to make a scree plot to see how each PC did.

# first find the percentage of variation accounted for by each PC:
#percent_variation = np.round(pca.explained_variance_ratio_ * 100, decimals = 1)

# labels for the scree plot:
#labels = ['PC' + str(x) for x in range(1, len(percent_variation) + 1)]

# # now we want plot the bar plot
# plt.bar(x=range(1, len(percent_variation)+1), height=percent_variation, tick_label=labels)
# plt.ylabel("Percentage of explained variance")
# plt.xlabel("Principal component")
# plt.title("Scree plot")
# plt.show()

# # we can then decide if the first two PCs will be good enough to describe the data. If yes, make a PCA plot:

# pca_df = pd.DataFrame(pca_data, index=[NOTSUREWHATTOPUTHEREYET], columns=labels)

# plt.scatter(pca_df.PC1, pca_df.PC2)
# plt.ylabel("PC2")
# plt.xlabel("PC1")
# plt.title("PCA graph")
# plt.show()

# last, we want to know the loading scores of PC1:

# loading_scores = pd.Series(pca.components_[0], index=elements)
# sorted_loading_scores = loading_scores.abs().sort_values(ascending=False)

# topThreeElements = sorted_loading_scores[0:3].index.values

# print(loading_scores[topThreeElements])
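
As a self-contained illustration of the PCA steps outlined in the comments above (scale, fit, percentage of explained variance, loadings), here is a sketch that runs on made-up data. Note that it swaps in a plain NumPy SVD for scikit-learn's `PCA` so that it only depends on NumPy. The function name `pca_svd` and the toy stack are hypothetical, and the results on the real map stacks may look quite different.

```python
import numpy as np

def pca_svd(pixel_table):
    """PCA of a (pixels, channels) table via SVD of the standardized data."""
    table = np.asarray(pixel_table, dtype=np.float64)
    centered = table - table.mean(axis=0)
    std = table.std(axis=0)
    std[std == 0] = 1.0                      # leave constant channels unscaled
    scaled = centered / std
    # rows of vt are the principal axes (loadings), ordered by explained variance
    u, s, vt = np.linalg.svd(scaled, full_matrices=False)
    explained_variance_ratio = s**2 / np.sum(s**2)
    scores = scaled @ vt.T                   # pixel coordinates in PC space
    return scores, vt, explained_variance_ratio

# made-up stack: 3 channels, where channel 1 is a noisy copy of channel 0
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 30))
toy_stack = np.stack(
    [base, base + 0.1 * rng.normal(size=(20, 30)), rng.normal(size=(20, 30))],
    axis=-1,
)

# (rows, columns, channels) -> (pixels, channels)
pixel_table = toy_stack.reshape(-1, toy_stack.shape[2])
scores, loadings, ratio = pca_svd(pixel_table)
print(np.round(ratio * 100, 1))   # percentage of variation per PC, as for a scree plot
```

Because two of the three toy channels carry nearly the same signal, the first principal component accounts for most of the variation. On real maps, channels that load strongly on the first components are candidates for the most informative channels.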

