<a href="https://colab.research.google.com/github/nneibaue/ocean_explorer/blob/staging/explorer_official.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1>Ocean Data Explorer</h1>

# Introduction

This notebook provides some basic visualization tools in Python using Colab's output features. It is fairly basic, but it should serve as a decent example of how Python and Colab can be useful for this kind of analysis. Google provides a free runtime in the cloud, so there is no need to install Python or set anything up locally. The free version of Colab has more than enough features, memory, and drive space for our purposes here.


## Research Context

Samples are collected at different depths for a given location in the ocean (e.g. a latitude/longitude pair). Each sample is measured for the concentrations of various elements via a 2D scan, yielding a concentration value at each pixel. A given pixel may contain a non-trivial concentration of one or more elements.

It is of particular interest how a given element (Cu in this case) is distributed among different element groups for a given scan. For example, one pixel could contain non-trivial concentrations of Cu, Mg, Br, and Zn, whereas another pixel might only contain Fe and Mg. 
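
To make the idea of an element group concrete, here is a minimal sketch. It assumes that each element already has a boolean mask marking the pixels where its concentration is non-trivial. The names `masks` and `pixel_groups` are hypothetical and are not part of the `ocean` library, the 2x2 data is invented for illustration, and the `|`-joined label is just one convenient way to encode a group.

```python
import numpy as np

# Hypothetical 2x2 boolean masks: True where the element's concentration
# is non-trivial at that pixel (toy data, not from a real scan).
masks = {
    'Cu': np.array([[True, False], [False, False]]),
    'Fe': np.array([[False, True], [False, False]]),
    'Mg': np.array([[True, True], [False, False]]),
}


def pixel_groups(masks):
    """Return an object array of '|'-joined element names, one per pixel."""
    elements = sorted(masks)
    shape = masks[elements[0]].shape
    groups = np.empty(shape, dtype=object)
    for idx in np.ndindex(shape):
        # Collect every element that is non-trivial at this pixel
        present = [e for e in elements if masks[e][idx]]
        groups[idx] = '|'.join(present)
    return groups


groups = pixel_groups(masks)
print(groups)
```

In this toy scan, the top-left pixel belongs to the group containing Cu and Mg, the top-right pixel to the group containing Fe and Mg, and the bottom row belongs to no group at all.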


## Problem Statement

* Given a dataset for a single location, how does the distribution of an element vary with depth? Assumptions include:
  * There can be many scans at a given depth
  * No two scans overlap in space
  * Concentration values ($[x]$) at a pixel are only considered non-trivial if:
  $$
  [x] > \bar{[x]} + 2 \cdot \sigma_x 
  $$
  where $\bar{[x]}$ is the average concentration of element $x$ and $\sigma_x$ is its standard deviation
  * When filtering by an element, a pixel's concentration values are only considered non-trivial if the element in question itself satisfies the above condition
    * E.g. a pixel may contain non-trivial amounts of Ca and Mg, but not Cu. If we are filtering by Cu, then this pixel is rejected
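
The threshold and the filter-by rule above can be sketched with plain NumPy. This is an illustration under stated assumptions, not the code used by the `ocean` library: `nontrivial_mask` is a hypothetical helper, the mean and standard deviation are assumed to be taken over all pixels of a single scan, and the 10x10 arrays are toy data.

```python
import numpy as np


def nontrivial_mask(concentration, n_sigma=2.0):
    """True where a pixel exceeds mean + n_sigma * std of the array."""
    threshold = concentration.mean() + n_sigma * concentration.std()
    return concentration > threshold


# Toy 10x10 scans (invented values): Cu has one hot pixel, Ca has two.
cu = np.zeros((10, 10))
cu[2, 3] = 10.0

ca = np.zeros((10, 10))
ca[2, 3] = 10.0
ca[5, 5] = 10.0

cu_mask = nontrivial_mask(cu)
ca_mask = nontrivial_mask(ca)

# Filtering by Cu: a Ca pixel is kept only where Cu is also non-trivial.
ca_filtered_by_cu = ca_mask & cu_mask

print('Non-trivial Cu pixels:', int(cu_mask.sum()))
print('Non-trivial Ca pixels:', int(ca_mask.sum()))
print('Ca pixels kept when filtering by Cu:', int(ca_filtered_by_cu.sum()))
```

Here the Ca pixel at `(5, 5)` passes its own threshold but is rejected when filtering by Cu, because Cu is not above its threshold at that pixel.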


**Please don't edit this notebook directly. To make changes, first make a copy of the notebook.**

# Setup

The following cell clones the GitHub repo so that the project's own modules can be imported.

In [1]:
#@title Clone Github Repo

BRANCH_NAME = "staging" #@param {type:"string"}

import os
import sys
import shutil

ROOT = '/content'
os.chdir(ROOT)
REPO_NAME = 'ocean_explorer'
REPO_URL = f'https://github.com/nneibaue/{REPO_NAME}'
REPO_PATH = os.path.join(ROOT, REPO_NAME)


def import_from_github(branch):
  # Remove old repo
  print('Removing old repo...')
  !rm -rf $REPO_PATH
  
  print('Cloning from github...')
  !git clone $REPO_URL
  os.chdir(REPO_PATH)
  
  if branch != 'master':
    !git checkout --track origin/$branch
    !git config user.email "colab_anon@gmail.com"
  else:
    !git pull
    
  if REPO_PATH not in sys.path:
    print(f'Adding {REPO_PATH} to path')
    sys.path.append(REPO_PATH)
  
  os.chdir(ROOT)

import_from_github(BRANCH_NAME)

Removing old repo...
Cloning from github...
Cloning into 'ocean_explorer'...
remote: Enumerating objects: 190, done.
remote: Counting objects: 100% (190/190), done.
remote: Compressing objects: 100% (126/126), done.
remote: Total 599 (delta 111), reused 134 (delta 55), pack-reused 409
Receiving objects: 100% (599/599), 587.60 KiB | 827.00 KiB/s, done.
Resolving deltas: 100% (368/368), done.
Branch 'staging' set up to track remote branch 'staging' from 'origin'.
Switched to a new branch 'staging'
Adding /content/ocean_explorer to path


In [2]:
#@title Imports
#@markdown `refresh_modules` function

# etsp stuff
import ocean
import plotting as op
import ocean_utils as utils

# Colab output stuff
from google.colab import drive
from google.colab import widgets
from IPython.display import display, HTML
import ipywidgets

# General
import numpy as np
import random
import re
import pandas as pd
from importlib import reload

# Plotting
from cycler import cycler
import altair as alt
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
# Namespace class to keep things organized
class Namespace:
  def __init__(self, **kwargs):
    self.__dict__.update(**kwargs)

# Convenience function to reload github and reimport
def refresh_modules(branch):
  import_from_github(branch)
  reload(ocean)
  reload(op)
  reload(utils)

In [3]:
#@title Connect Google Drive

drive.mount('/content/gdrive')
DRIVE_BASE = '/content/gdrive/My Drive'

Mounted at /content/gdrive


# Analysis

In [4]:
#@title Data Import

#@markdown Enter drive path to folder containing Profiles:
BASE_DIR = "software_development/etsp/XRF data deglitch" #@param{type:"string"}
experiment_dir = os.path.join(ocean.DRIVE_BASE, BASE_DIR)
#depth_path = "software_development/etsp/XRF data/25m" #@param {type:"string"}
#@markdown Enter elements separated by comma
ELEMENTS_OF_INTEREST = "Br,Ca,Cu,Fe,K,Cl,Mn,S,Si,Zn" #@param {type:"string"}
ELEMENTS_OF_INTEREST=ELEMENTS_OF_INTEREST.split(',')
ORBITALS = "K" #@param {type:"string"}
profiles = ocean.load_profiles(experiment_dir,
                               elements_of_interest=ELEMENTS_OF_INTEREST,
                               orbitals=[ORBITALS],
                               normalized=True)

.DS_Store is not a valid name for a Depth!
/content/gdrive/My Drive/software_development/etsp/XRF data deglitch/ETSP_PS3_profile/900m/.DS_Store is not a valid name for Scan directory
Successfully imported data for 900m
/content/gdrive/My Drive/software_development/etsp/XRF data deglitch/ETSP_PS3_profile/10m/.DS_Store is not a valid name for Scan directory
Successfully imported data for 10m
/content/gdrive/My Drive/software_development/etsp/XRF data deglitch/ETSP_PS3_profile/25m/.DS_Store is not a valid name for Scan directory
Successfully imported data for 25m
/content/gdrive/My Drive/software_development/etsp/XRF data deglitch/ETSP_PS3_profile/40m/.DS_Store is not a valid name for Scan directory
Successfully imported data for 40m
/content/gdrive/My Drive/software_development/etsp/XRF data deglitch/ETSP_PS3_profile/50m/.DS_Store is not a valid name for Scan directory
Successfully imported data for 50m
/content/gdrive/My Drive/software_development/etsp/XRF data deglitch/ETSP_PS3_profile

## Data Preview



In [5]:
list(profiles.keys())

['ETSP_PS3', 'ETSP_PS2', 'ETSP_poorlydeglitched']

In [7]:
#@title Flag Noisy Scans { form-width: "45%" }
run_noise_flagging_ui = True #@param {type:"boolean"}
PROFILE_NAME = "ETSP_PS2" #@param {type:"string"}

def noise_flagging_ui(profile_name):
  profile = profiles[profile_name]
  children = []
  def flag_scan_wrapper(scan, val):
    utils.set_noisy_scan_flag(scan, val, profile.experiment_dir)
  for depth in profile.depths:
    plots = []
    for scan in depth.scans:
      fig, ax = plt.subplots(figsize=(5, 5))
      plot = ipywidgets.HTML(
          scan.get_detsum('Cu').plot(ax=ax, raw=True, base64=True))
      checkbox = ipywidgets.Checkbox(value=scan.isNoisy, description='Noisy')
      checkbox.observe(lambda val, scan=scan: flag_scan_wrapper(scan, val['new']), 'value')
      plots.append(ipywidgets.VBox([plot, checkbox]))
    children.append(ipywidgets.HBox(plots))
  tab = ipywidgets.Tab(children=children, layout=ipywidgets.Layout(width='100%'))
  for i, depth in enumerate(profile.depths):
    tab.set_title(i, depth.depth)
  display(tab)

if run_noise_flagging_ui:
  noise_flagging_ui(PROFILE_NAME)

Tab(children=(HBox(children=(VBox(children=(HTML(value="<img src='data:image/png;base64,iVBORw0KGgoAAAANSUhEUg…

In [None]:
#@title Plot All Detsums (Out of Order 8/26!!)

elements_to_plot = 'Br,Ca,Cu,Fe,K,Cl,Mn,S,Si,Zn' #@param {type:"string"}
sort_by = "element" #@param ["element", "depth"]

#@markdown To show plots, check box below and run cell
show_plots = False #@param {type:"boolean"}

def plot_all_detsums(depths, elements=None, sort_by='element'):
  '''Plots the raw data from all detsums of the given elements.

  Args:
    depths: list of Depth objects
    elements: optional list of elements. E.g. ['Cu', 'Fe']. If this 
      is `None`, then all elements will be plotted
    sort_by: string. Can either be 'element' or 'depth'. This will
      determine how the detsums are sorted before they are rendered
      to the screen. This is set to 'element' by default
    
  Returns: raw detsums plotted in a grid
  '''

  # Triple looping to get detsums from all depths
  detsums = []
  for d in depths:
    for s in d.scans:
      for detsum in s.detsums:
        if elements is not None:
          if detsum.element not in elements:
            continue # skip to next iteration
        detsums.append(detsum)

  # Determine sorting function 
  if sort_by == 'element':
    sort_func = lambda d: d.element
  elif sort_by == 'depth':
    sort_func = lambda d: int(d.depth.split('m')[0]) # Turn depth into integer for sorting
  else:
    raise ValueError("`sort_by` must be 'element' or 'depth'")
  
  # Sort detsums
  detsums = sorted(detsums, key=sort_func)

  # Build grid
  ncols = 4
  nrows = max(1, -(-len(detsums) // ncols))  # ceiling division, at least one row
  g = widgets.Grid(nrows, ncols)
  for i, detsum in enumerate(detsums):
    # Fill the grid left to right, top to bottom
    row, col = divmod(i, ncols)
    with g.output_to(row, col):
      print(f'    {detsum.element}    |    {detsum.depth}    |    {detsum.scan_name}')
      display(ipywidgets.HTML(detsum.plot(raw=True, base64=True)))

## Example Usage
# =====================================
# Uncomment this line to plot all detsums from Iron and Copper, e.g.:
# plot_all_detsums(profiles[PROFILE_NAME].depths, elements=['Fe', 'Cu'])

# Plot the selected elements when `show_plots` is checked.
# The Depth objects come from the profile selected in the previous cell.
if show_plots:
  plot_all_detsums(profiles[PROFILE_NAME].depths,
                   elements=elements_to_plot.split(','),
                   sort_by=sort_by)

del elements_to_plot, sort_by, show_plots

## **Plotting UIs**

In [None]:
list(profiles.keys())

In [None]:
#@title Ribbon Plot UI { form-width: "200px" }
run_ribbon_plot_ui = True #@param {type:"boolean"}
PROFILE_NAME = "ETSP_PS3" #@param {type:"string"}

from ipywidgets.embed import embed_minimal_html
def ribbon_plot_ui(profile_name):
  profile = profiles[profile_name]
  experiment_dir = profile.experiment_dir
  status_indicator = ipywidgets.Output()
  with status_indicator:
    display(ipywidgets.HTML('<h3 style="color:green">Ready</h3>'))
  graph_output = ipywidgets.Output()
  # element_inputs = {}
  # element_filter = {}
  test = {}
  smalltextbox = ipywidgets.Layout(width='50px', height='25px')
  # filter_func = lambda n: lambda x: np.mean(x) + np.std(x)*n
  
  element_filter = op.ElementFilterPanel(profile,
                                    input_type='text',
                                    orientation='horizontal',
                                    experiment_dir=experiment_dir)#, **layout_kwargs)
  filter_settings = op.SettingsController(element_filter)
  
  # for e in ELEMENTS_OF_INTEREST:
  #   element_inputs[e] = ipywidgets.Textarea(value='2', layout=smalltextbox)
  #   element_filter[e] = filter_func(2)
    
  filter_by_control = ipywidgets.Dropdown(options=ELEMENTS_OF_INTEREST,
                                          value='Cu', description='Filter by:',
                                          layout=ipywidgets.Layout(width='200px'))
  
  combine_scans_checkbox = ipywidgets.Checkbox(value=True, description='Combine Scans')
  
  combine_detsums_checkbox = ipywidgets.Checkbox(value=False, description='Combine Detsums')
  
  normalize_by_control = ipywidgets.Dropdown(options=['counts', 'pixels'],
                                            value='counts',
                                            description='Normalize By',
                                            layout=ipywidgets.Layout(width='200px'))
  
  
  N_input = ipywidgets.Textarea(value='8', layout=ipywidgets.Layout(width='150px'), description='N')
  update_button = ipywidgets.Button(description='Update Plot')                          
  clear_output_control = ipywidgets.Checkbox(value=False, description='Clear output after each run')
  
  # element_filter_input = ipywidgets.HBox(
  #     [ipywidgets.VBox([ipywidgets.HTML(f'<h3>{e}</h3>'), element_inputs[e]]) for e in ELEMENTS_OF_INTEREST]
  # )
  
  save_html_button = ipywidgets.Button(description='Save HTML')
  def update_plot(b):
    element_filter.save_settings()
    status_indicator.clear_output()
    # for e in ELEMENTS_OF_INTEREST:
    #   val = float(element_inputs[e].value)
    #   element_filter[e] = filter_func(val)
    with status_indicator:
      display(ipywidgets.HTML('<h3 style="color:red">Working...</h3>'))

    info_banner_html = (f'Filter by: {filter_by_control.value} | '
                      f'Comb. Scans: {combine_scans_checkbox.value} | '
                      f'Comb. Detsums: {combine_detsums_checkbox.value} | '
                      f'N: {N_input.value} | '
                      f'Normalize By: {normalize_by_control.value} | ')
    info_banner = ipywidgets.HTML(info_banner_html)

    plot = op.ribbon_plot(profile, element_filter=element_filter.filter_dict,
                filter_by=filter_by_control.value,
                combine_detsums=combine_detsums_checkbox.value,
                combine_scans=combine_scans_checkbox.value,
                N=int(N_input.value),
                normalize_by=normalize_by_control.value,
                base64=True,
                experiment_dir=experiment_dir)
    if clear_output_control.value:
      graph_output.clear_output()

    with graph_output:
      #display(ipywidgets.HTML(plot))
      display(ipywidgets.VBox([ipywidgets.HTML(plot), ipywidgets.HTML(info_banner_html)]))

    status_indicator.clear_output()
    with status_indicator:
      display(ipywidgets.HTML('<h3 style="color:green">Ready</h3>'))
  
  update_button.on_click(update_plot)
  

  update_plot('this param does not matter here')  

  #https://stackoverflow.com/questions/55336771/align-ipywidget-button-to-center
  controls_bot = ipywidgets.Box([element_filter.widget, filter_settings.widget],
                                layout=ipywidgets.Layout(display='flex', align_items='center'))
  controls_top = ipywidgets.HBox([ipywidgets.VBox([update_button, status_indicator]),
                              ipywidgets.VBox([filter_by_control, normalize_by_control]),
                              ipywidgets.VBox([combine_scans_checkbox, combine_detsums_checkbox]),
                              N_input, clear_output_control])

  controls = ipywidgets.VBox([controls_top, controls_bot],
                            layout=ipywidgets.Layout(
                                border='1px solid black',
                                width='100%',
                            ))
  app = ipywidgets.VBox([graph_output, controls])
  display(app)
  return filter_settings

  # def save_html(b):
  #   embed_minimal_html(os.path.join(experiment_dir, 'myplots.html'), views=[graph_output], title='Ribbon Plot Examples')

  # save_html_button.on_click(save_html)
if run_ribbon_plot_ui:    
  a = ribbon_plot_ui(PROFILE_NAME)

In [None]:
#@title Image UI { form-width: "200px" }
run_image_ui = False #@param {type:"boolean"}

PROFILE_NAME = 'ETSP_PS3' #@param {type:"string"}
def image_ui(profile_name):
  profile = profiles[profile_name]
  experiment_dir = profile.experiment_dir
  plot_area = ipywidgets.Output()
  status_indicator = ipywidgets.Output()
  group_indicator = ipywidgets.Output()
  
  with group_indicator:
    display(ipywidgets.HTML('<h3 style="color:orange">No Group Selected</h3>'))

  with status_indicator:
    display(ipywidgets.HTML('<h3 style="color:green">Ready</h3>'))

  
  # Making the controls

  layout_kwargs = dict(width='85%', border='1px solid black')
  #settings_layout = dict(width='20%', border='1px solid blue')

  depth_selector = op.PropSelector(profile.depths, orientation='horizontal', title='Depths to plot',
                                   description_func=lambda d: d.depth, **layout_kwargs)
  element_filter = op.ElementFilterPanel(profile,
                                    orientation='horizontal',
                                    experiment_dir=experiment_dir)

  filter_settings = op.SettingsController(element_filter)

  element_plot_selector = op.PropSelector(ELEMENTS_OF_INTEREST,
                                          orientation='horizontal',
                                          title='Elements to plot',
                                          **layout_kwargs)

  element_group_selector = op.PropSelector(ELEMENTS_OF_INTEREST,
                                           orientation='horizontal',
                                           title='Groups to show',
                                           **layout_kwargs)

  combine_detsums_checkbox = ipywidgets.Checkbox(value=False, indent=False, description='Combine Detsums')
  update_button = ipywidgets.Button(description='Update')
  raw_data_toggle = ipywidgets.ToggleButtons(value='Filtered', options=['Filtered', 'Raw'])
  show_groups_toggle = ipywidgets.ToggleButton(value=True, description='Show Groups')
  exclusive_groups_toggle = ipywidgets.ToggleButtons(value='Exclusive', options=['Exclusive', 'Nonexclusive'])

  controls_bottom = ipywidgets.HBox([update_button,
                                          show_groups_toggle,
                                          exclusive_groups_toggle,
                                          raw_data_toggle,
                                          #combine_detsums_checkbox,
                                          status_indicator],
                                         layout=ipywidgets.Layout(padding='5px', **layout_kwargs))


  controls_right = ipywidgets.HBox([
      element_filter.widget, ipywidgets.VBox([
        combine_detsums_checkbox,
        filter_settings.widget])
      ],
      layout=ipywidgets.Layout(**layout_kwargs))
  controls_top = ipywidgets.VBox([depth_selector.widget,
                                   element_plot_selector.widget,
                                   controls_right,
                                   #ipywidgets.HBox([group_indicator, show_groups_toggle]),
                                   element_group_selector.widget,
                                   group_indicator])

  # controls_right = ipywidgets.VBox([ipywidgets.HTML('Hightlight group\ncontaining elements:'),
  #                                   element_group_selector.widget],
  #                                  layout=ipywidgets.Layout(border='1px solid black'))

  # controls = ipywidgets.HBox([controls_left, controls_right], layout=ipywidgets.Layout(width='85%'))
  controls = ipywidgets.VBox([controls_top, controls_bottom])
                            
  rows = []
  rows_raw = []
  rows_groups_exclusive = []
  rows_groups_exclusive_raw = []
  rows_groups_nonexclusive = []
  rows_groups_nonexclusive_raw = []
  
  current_group = []


  def show_plots(val):
    plot_area.clear_output()
    show = show_groups_toggle.value
    exclusive = exclusive_groups_toggle.value == 'Exclusive'
    with plot_area:
      if raw_data_toggle.value == 'Raw':
        if show and exclusive:
          display(ipywidgets.VBox(rows_groups_exclusive_raw))
        elif show and not exclusive:
          display(ipywidgets.VBox(rows_groups_nonexclusive_raw))
        elif not show:
          display(ipywidgets.VBox(rows_raw))
      elif raw_data_toggle.value == 'Filtered':
        if show and exclusive:
          display(ipywidgets.VBox(rows_groups_exclusive))
        elif show and not exclusive:
          display(ipywidgets.VBox(rows_groups_nonexclusive))
        elif not show:
          display(ipywidgets.VBox(rows))
        
    status_indicator.clear_output()
    with status_indicator:
      display(ipywidgets.HTML('<h3 style="color:green">Ready</h3>'))


  def update_group(element, val):
    nonlocal current_group
    if val:
      current_group.append(element)
    else:
      current_group.remove(element)

    # Keep the group in the same order as ELEMENTS_OF_INTEREST
    sorted_group = [e for e in ELEMENTS_OF_INTEREST if e in current_group]
    group_indicator.clear_output()
    with group_indicator:
      if not current_group:
        display(ipywidgets.HTML('<h3 style="color:orange">No group selected</h3>'))
      else:
        group_str = '  |  '.join(sorted_group)
        display(ipywidgets.HTML(f'<h3 style="color:red">Group Selected: {group_str}</h3>'))
    current_group = sorted_group

  def generate_plots(b):
    print(element_filter.filter_dict)
    element_filter.save_settings()
    status_indicator.clear_output()
    with status_indicator:
      display(ipywidgets.HTML('<h3 style="color:red">Working....</h3>'))

    depths_to_plot = depth_selector.selected_props
    if not depths_to_plot:
      status_indicator.clear_output()
      with status_indicator:
        display(ipywidgets.HTML('<h3 style="color:orange">No Depth Selected!</h3>'))
      return

    elements_to_plot = element_plot_selector.selected_props

    group = '|'.join(element_group_selector.selected_props)

    for depth in depths_to_plot:
      depth.apply_element_filter(element_filter.filter_dict[depth.depth],
                                 combine_detsums=combine_detsums_checkbox.value)

    def get_row(depth, elements, raw, show_groups, exclusive):
      detsums = sorted(depth.detsums, key=lambda d: d.element)
      plots = []
      #group = '|'.join(current_group)
      for scan in depth.scans:
        data = scan.data['element_group'].values.reshape(scan.detsums[0].shape)
        for detsum in scan.detsums:
          if detsum.element not in elements:
            continue
          fig, ax = plt.subplots(figsize=(15, 15))
          detsum.plot(raw=raw, ax=ax)
          if show_groups and current_group:
            fn = np.vectorize(lambda group: utils.check_groups(group, current_group, exclusive=exclusive))
            rows, cols = np.where(fn(data))
            ax.scatter(cols, rows, s=20, color='red')
          plot = op.encode_matplotlib_fig(fig)
          plt.close(fig)  # close this specific figure to free memory
          plots.append(ipywidgets.HTML(plot))
      return plots

    nonlocal rows
    nonlocal rows_raw
    nonlocal rows_groups_exclusive
    nonlocal rows_groups_exclusive_raw
    nonlocal rows_groups_nonexclusive
    nonlocal rows_groups_nonexclusive_raw

    
    rows = [ipywidgets.HBox(get_row(depth, elements_to_plot, False, False, False)) for depth in depths_to_plot]
    rows_raw = [ipywidgets.HBox(get_row(depth, elements_to_plot, True, False, False)) for depth in depths_to_plot]
    rows_groups_exclusive = [ipywidgets.HBox(get_row(depth, elements_to_plot, False, True, True)) for depth in depths_to_plot]
    rows_groups_exclusive_raw = [ipywidgets.HBox(get_row(depth, elements_to_plot, True, True, True)) for depth in depths_to_plot]
    rows_groups_nonexclusive = [ipywidgets.HBox(get_row(depth, elements_to_plot, False, True, False)) for depth in depths_to_plot]
    rows_groups_nonexclusive_raw = [ipywidgets.HBox(get_row(depth, elements_to_plot, True, True, False)) for depth in depths_to_plot]

    show_plots(None)  # the argument is ignored by show_plots


  update_button.on_click(generate_plots)
  raw_data_toggle.observe(show_plots, 'value')
  show_groups_toggle.observe(show_plots, 'value')
  exclusive_groups_toggle.observe(show_plots, 'value')
  element_group_selector.observe(update_group)
  # for selector in depth_selectors.values():
  #   selector.observe(update_plot)
  display(ipywidgets.VBox([controls, plot_area]))
  #print('\n'.join(dir(plot_area)))
  
  
if run_image_ui:
  image_ui(PROFILE_NAME)