In [None]:
#%load_ext autoreload
#%autoreload 2

<a name="top"></a>
# <center>Pegasus WMS Workflow Python Example</center>

## Abstract

- This Jupyter Notebook tool provides a template for running a Pegasus Workflow Management System (WMS) workflow, comprising Python scripts, on UB-HPC, the generally accessible high-performance compute cluster of the University at Buffalo (UB) Center for Computational Research (CCR).

- This tool's GitHub repository is located at https://github.com/GhubGateway/Ghub_Pegasus_WMS_Python_Example.

- The Ghub tool name for this template is ghubex1. The files provided by this template are specific to this tool. Update or replace them with files specific to your tool as required. See the `Create Your Tool On Ghub` section for more details.

## Overview

- Select the Mapped Collection Folder and Modeling Groups. Click the `Run Workflow` button to run the workflow, which summarizes and displays time series information from experiment files contained within the selected modeling groups.

## User Guide

### [**Steps for using this tool**](#steps_for_using_this_tool)<br />

1. [Select the Mapped Collection Folder](#step_1) <br />
2. [Select the Modeling Groups](#step_2)<br />
3. [Run the Workflow](#step_3)<br />
4. [View Workflow Progress](#step_4)<br />
5. [View Workflow Results](#step_5)<br />
6. [View Log Output](#step_6)<br />

### [**Create Your Tool On Ghub**](#createyourtool)<br />

### [**Background**](#background)<br />

In [None]:
# As of 03/2024, tested with the Jupyter Notebook (202210) tool and the Python3 (ipykernel)

# Setup and preprocessing:

import sys
import os
import getpass
import platform
import shutil
import atexit
import math
import numpy as np
import pandas as pd
import time

import ipywidgets as widgets
from IPython.display import display, HTML, Markdown, clear_output, Image, Javascript
#import xml.etree.ElementTree as et

import hublib
#print (help(hublib))
import hublib.ui as ui
#print (help(ui))
import hublib.use
#print (help(hublib.use))

#print(sys.path)

# Set up the environment for this notebook

# Setup paths to executables
scriptpath = os.path.realpath(" ")
        
# Get the parent dirs
self_tooldir = os.path.dirname(scriptpath)

# Setup path to python and bash scripts
self_bindir = os.path.join(self_tooldir, "bin")

# Add to PYTHONPATH
sys.path.insert (1, self_bindir)

# Set up path to the scripts in the remotebin directory
self_remotebindir = os.path.join(self_tooldir, "remotebin")

# Set up path to the current data directory
self_datadir = os.path.join(self_tooldir, "data")

# Set up path to the current doc directory
self_docdir = os.path.join(self_tooldir, "doc")

# Set up path to the current session directory
self_workingdir = os.getcwd()

# Set up path to the user's home directory
self_homedir = os.path.expanduser("~")

# Initialize the dated run directory.
# Workflow results are not available until after a workflow is executed via Pegasus and completes
self_rundir = ""

self_user = getpass.getuser()

# Configuration parameters

import Configuration as cfg
if cfg.VERBOSE == True:
    print ('cfg.DISPERSION_MODEL: ', cfg.DISPERSION_MODEL, '\n')

# Version of Pegasus
%use pegasus-5.0.1
from launchWrapper import Wrapper

np.set_printoptions(threshold=np.inf) 

self_log_filepath = os.path.join(self_workingdir, 'ghubex1_log_file.txt')
self_log_snapshot_filepath = os.path.join(self_workingdir, 'ghubex1_log_snapshot_file.txt')
self_log_backup_filepath = os.path.join(self_workingdir, 'ghubex1_log_backup_file.txt')

widget_border_style = '1px solid black'
widget_output_border_style = '1px solid black'

BOLD = '\033[1m'
SUCCESS = '\033[92m'
WARNING = '\033[93m'
FAIL = '\033[91m'
END = '\033[0m'

dropdown_str_width = 16

dropdown_width = '965px'
dropdown_height = '30px'
button_width = '250px'
button_height = '40px'
button2_width = '150px'
button2_height = '30px'
ui_string_width = '96.5%'
ui_dropdown_width = '96.2%'

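# Clean up: remove leftover .txt, .yml, .stdout and .stderr files from the working directory
# (README.txt, *netcdf_info.txt results and the log file are kept), then close the log file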
def exit_handler():
    
    for file in os.listdir(self_workingdir):
        
        if os.path.isfile(file):
            if file.endswith('.txt'):
                if file != 'README.txt' and file.endswith('netcdf_info.txt') == False and file != 'ghubex1_log_file.txt':
                    #print ("Deleting: %s\n" %file)
                    os.remove(file)
            if file.endswith(".yml"):
                #print ("Deleting: %s\n" %file)
                os.remove(file)
            elif file.endswith(".stdout"):
                #print ("Deleting: %s\n" %file)
                os.remove(file)
            elif file.endswith(".stderr"):
                #print ("Deleting: %s\n" %file)
                os.remove(file)

    #dirpath = os.path.join(self_bindir, "__pycache__")
    #if (os.path.exists(dirpath)):
        #print ("Deleting: %s\n" %dirpath)
        #shutil.rmtree(dirpath)
        
    FH1.flush()
    FH1.close()

atexit.register(exit_handler);   

In [None]:
# prevent In[] and Out[] from displaying on left
#HTML('''
#<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>
#''')

In [None]:
#https://api.jquery.com/ready/
HTML('''
<script>
    function scroll_to_top() {
        Jupyter.notebook.scroll_to_top();
    } 
    $( window ).on( "load", scroll_to_top() );
</script>
''')

In [None]:
# Button styles
HTML('''
<style>.buttontextclass { color:black ; font-size:130%}</style>
''')

In [None]:
#https://stackoverflow.com/questions/36757301/disable-ipython-notebook-autoscrolling

In [None]:
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
}


In [None]:
# Initialize

# Note: ~/.pegasus/workflow.db is not consistent between Pegasus 4.8.1 and Pegasus 5.0.1
pegasus_workflow_db_filepath = os.path.join(self_homedir, '.pegasus', 'workflow.db')
#print ('pegasus_workflow_db_filepath: ', pegasus_workflow_db_filepath)
if os.path.exists(pegasus_workflow_db_filepath):
    if os.path.exists(pegasus_workflow_db_filepath + '.save'):
        os.remove(pegasus_workflow_db_filepath + '.save')
    shutil.copy (pegasus_workflow_db_filepath, pegasus_workflow_db_filepath + '.save')
    os.remove(pegasus_workflow_db_filepath)
    
if os.path.exists(self_log_filepath):
    shutil.move (self_log_filepath, self_log_backup_filepath)
    
FH1 = open(self_log_filepath, 'w')

show_log_output_button = widgets.Button(description="Show Log Output", disabled=False,\
    layout=widgets.Layout(width=button_width, height=button_height),\
    style= {'button_color':'lightgreen','font_weight':'bold'})

# Utility Functions

def log_info (message):
    
    if show_log_output_button.description == 'Hide Log Output': 
        with log_output:
            print (message)    
    FH1.write('%s\n' %message)
    FH1.flush()

def log_status (output_widget, message):
    
    with output_widget:
        print (message)
    log_info (message)
    
def log_success (output_widget, message):
    
    with output_widget:
        print ('%s%s%s' %(SUCCESS,message,END))
    log_info (message)
    
def log_warning (output_widget, message):
    
    with output_widget:
        print ('%s%s%s' %(WARNING,message,END))
    log_info (message)
    
def log_error (output_widget, message):
    
    with output_widget:
        print ('%s%s%s' %(FAIL,message,END))
    log_info (message)
    
if (1): #cfg.VERBOSE == True:
    
    log_info ('Operating System Platform: ' + platform.system() + ' ' + platform.release())
    log_info ('\n')

    log_info ('Environment:\n')
    log_info ('scriptpath: ' + scriptpath)
    log_info ('tooldir: ' + self_tooldir)
    log_info ('bindir: ' + self_bindir)
    log_info ('datadir: ' + self_datadir)
    log_info ('workingdir: ' + self_workingdir)
    log_info ('homedir: ' + self_homedir)
    log_info ('user: ' + self_user)
    log_info ('\n')
    
    #print (type(sys.path)) # <class 'list'>
    #print (sys.path)
    log_info ('sys.path: ' + ' '.join(str(path)+'\n' for path in sys.path))
    log_info ('\n')
    
    #print (type(os.environ["PATH"])) # <class 'str'>
    #print (os.environ["PATH"])
    log_info ('os.environ["PATH"]: ' + os.environ["PATH"])
    log_info ('\n')


In [None]:
environ = dict(os.environ)
#print (type(environ))
#print (environ)
key = 'SESSION'
if key in environ:
    session_num = str(environ[key])
else:
    session_num = 'session number unknown'
message = 'Ghub session number: ' + str(session_num)
#print ('%s%s%s' %(SUCCESS,message,END), flush=True)
log_info (message)


In [None]:
# Get the UB CCR's mapped collection folders information
mapped_collections_filename = os.path.join(self_datadir, 'ub-ccr-ghub-ISMIP6-mapped_collections.xlsx')

mapped_collections_df = pd.read_excel (mapped_collections_filename)
#print (type(mapped_collections_df))
#print (mapped_collections_df)

num_mapped_collections = len(mapped_collections_df)
#print (num_mapped_collections)

HBox_layout = widgets.Layout(height='45px', width='98%', display='flex', flex_flow='row', justify_content='flex-start')

def append_modeling_groups (modeling_group):
    model_checkbox = widgets.Checkbox(
        value = True,
        description = modeling_group,
        indent = False,
        disabled = True,
        style = {'description_width':'200px'},
        layout= {'height': '40px', 'width': '200px'})
    model = widgets.HBox(children=[model_checkbox], layout = HBox_layout)
    all_modeling_groups.append(model)

folder_list = []
description_list = []
modeling_groups_list = []

for i in range(num_mapped_collections):
    folder = str(mapped_collections_df['Folder'][i].strip(' \t\n\r'))
    folder_list.append(folder)
    description = str(mapped_collections_df['Description'][i].strip(' \t\n\r'))
    description_list.append(description)
    modeling_groups = (str(mapped_collections_df['Modeling Groups'][i].strip(' \t\n\r')))
    modeling_groups_list.append (modeling_groups)
log_info ('folder_list: ' + str(folder_list))
log_info ('description_list: ' + str (description_list))
log_info ('modeling_groups_list: ' + str (modeling_groups_list))

folder_index = 0
all_modeling_groups = []


<a name="step_1"></a>
## Step 1: Select the Mapped Collection Folder [&#8607;](#top)

Select one of four predetermined UB CCR mapped collection folders.


In [None]:
def folder_dropdown_callback(change):
    
    global folder_index
   
    if change['type'] == 'change' and change['name'] == 'value' and change['new'] != ' ' \
        and folder_dropdown.value != None:
        
        selected_folder = folder_dropdown.value
        log_info ('selected folder: ' + selected_folder)
        folder_index = folder_list.index(selected_folder)
        initialize()
            
folder_dropdown = widgets.Dropdown(
    description = 'Folder:',
    disabled = False,
    options = folder_list,
    value = folder_list[0],
    style = {'description_width': '150px'},
    layout = widgets.Layout(width=dropdown_width, height=dropdown_height)
)
folder_dropdown.observe(folder_dropdown_callback)


In [None]:
folder_form = ui.Form([folder_dropdown], name = 'Mapped Collection Folder')
display (folder_form)

 <a name="step_2"></a>
## Step 2: Select the Modeling Groups [&#8607;](#top)

Select the modeling groups within the selected mapped collection folder. Time series data from experiment files contained within the selected modeling groups are summarized and displayed when the workflow is executed. See the `Run the Workflow` section for more information. By default, the AWI and ILTS_PIK modeling groups are selected.
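
The default selection can be sketched in isolation from the widgets. The snippet below is a minimal illustration, assuming the `Modeling Groups` spreadsheet cell is a comma-separated string (as the notebook's `split(',')` suggests); the function name `default_selection` and the sample group list are hypothetical, and the default names are spelled as in the notebook code.

```python
# Modeling groups that are checked by default, as spelled in the notebook code.
DEFAULT_GROUPS = {"AWI", "ILTS_PIK"}

def default_selection(modeling_groups_cell):
    """Map each modeling group in a comma-separated cell to its default checkbox state.

    Hypothetical helper for illustration; the notebook sets checkbox values directly.
    """
    groups = [g.strip() for g in modeling_groups_cell.split(",") if g.strip()]
    return {g: g in DEFAULT_GROUPS for g in groups}

# Illustrative input only; the real list comes from the mapped collections spreadsheet.
selection = default_selection("AWI,ILTS_PIK,JPL1,NCAR")
selected = [group for group, checked in selection.items() if checked]
print(selected)
```

Restricting the default to two groups keeps the first run small; `Select All` simply sets every value to `True`.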

In [None]:
def select_default_modeling_groups():
    
    # Reduce the computational load by default
    global all_modeling_groups
    for i in range(len(all_modeling_groups)):
        modeling_group = all_modeling_groups[i].children[0].description
        if modeling_group == 'AWI' or modeling_group == 'ILTS_PIK':
            all_modeling_groups[i].children[0].value = True
        else:
            all_modeling_groups[i].children[0].value = False
        
def select_default_models_button_callback(p):
    select_default_modeling_groups()
    
select_default_models_button = widgets.Button(description="Select Defaults", disabled=False,\
                             layout=widgets.Layout(width=button2_width, height=button2_height, justify_content="flex-end"),\
                             style= {'button_color':'lightgray','font_weight':'bold'})
select_default_models_button.add_class("button2textclass")
select_default_models_button.on_click (select_default_models_button_callback)

def select_all_modeling_groups():
    global all_modeling_groups
    for i in range(len(all_modeling_groups)):
        all_modeling_groups[i].children[0].value = True
        
def select_all_models_button_callback(p):
    select_all_modeling_groups()
    
select_all_models_button = widgets.Button(description="Select All", disabled=False,\
                             layout=widgets.Layout(width=button2_width, height=button2_height, justify_content="flex-end"),\
                             style= {'button_color':'lightgray','font_weight':'bold'})
select_all_models_button.add_class("button2textclass")
select_all_models_button.on_click (select_all_models_button_callback)

def unselect_all_modeling_groups():
    global all_modeling_groups
    for i in range(len(all_modeling_groups)):
        all_modeling_groups[i].children[0].value = False
        
def unselect_all_models_button_callback(p):
    unselect_all_modeling_groups()
    
unselect_all_models_button = widgets.Button(description="Unselect All", disabled=False,\
                             layout=widgets.Layout(width=button2_width, height=button2_height, justify_content="flex-end"),\
                             style= {'button_color':'lightgray','font_weight':'bold'})
unselect_all_models_button.add_class("button2textclass")

unselect_all_models_button.on_click (unselect_all_models_button_callback)

In [None]:
models_form_output = widgets.Output(layout={'border': widget_border_style})
display(models_form_output)

In [None]:
# Run Workflow

maxwalltime = ui.Number(
    name = 'Maximum Walltime',
    description = 'Maximum Walltime [min]',
    units = 'min',
    value = '30.0',
    min = '30.0',
    max = '60.0'
)

workflow_run_options_form = ui.Form([maxwalltime], name = 'Workflow Run Options')

def run_workflow(p):
    
    # print (p) #Button    
    
    global self_workflow_succeeded
    self_workflow_succeeded = False
    global self_workflow_results_filepath
    
    workflow_progress.clear_output()
    workflow_results.clear_output()
        
    with workflow_progress:
        
        runWorkflowButton.disabled = True
        show_log_output_button.disabled = True
        
        start_time = time.time()

        try:
            
            log_info ('folder_index: ' + str(folder_index))
            
            selected_modeling_groups = []
            for i in range(len(all_modeling_groups)):
                if all_modeling_groups[i].children[0].value == True:
                      selected_modeling_groups.append (all_modeling_groups[i].children[0].description)
                        
            if len(selected_modeling_groups) > 0:
                
                ice_sheet = folder_dropdown.value.split('/')[-1]
            
                self_workflow_results_filepath = os.path.join(self_workingdir, '%s_processed_netcdf_info.txt' %ice_sheet)
                log_info ('self_workflow_results_filepath: ' + self_workflow_results_filepath)
                
                for file in os.listdir(self_workingdir):
                    if os.path.isfile(file):
                        if file.endswith('netcdf_info.txt'):
                            os.remove(file)
 
                #Note: Workflow execution time depends on the current UB CCR workload.
                log_status (workflow_progress, "Pegasus workflow in progress...")
            
                #'''
                Wrapper (" ", \
                    self_tooldir, self_bindir, self_datadir, self_workingdir, self_rundir, \
                    folder_list[folder_index], description_list[folder_index], ','.join(selected_modeling_groups), int(maxwalltime.value))
                #'''
            
                log_status (workflow_progress, "\nWorkflow elapsed time: " + str((time.time() - start_time)/60.0) + " minutes\n")

                # Check if the results files were created and transferred from CCR 
                # to determine if workflow completed successfully

                if os.path.exists(self_workflow_results_filepath):

                    log_status (workflow_progress, "Workflow completed successfully\n")
                    self_workflow_succeeded = True

                    with workflow_results:

                        print("%s: \n\n" %self_workflow_results_filepath)
                        f = open(self_workflow_results_filepath,'r')
                        for line in f:
                            print(line.rstrip())
                        f.close()

                else:

                    log_error (workflow_progress, "Workflow did not complete successfully")
                    log_error (workflow_progress, "%s not generated by the workflow\n" %self_workflow_results_filepath)
                    self_workflow_succeeded = False

                    filepath = os.path.join(self_workingdir, 'pegasus.analysis')
                    if (os.path.exists(filepath)):
                        print("pegasus.analysis:\n")
                        f = open(filepath, 'r')
                        output = f.read()
                        f.close()
                        print (output)
                
                finish_workflow_processing()
                    
            else:
                
                log_error (workflow_progress, '\nERROR: No modeling groups are selected. Please select at least one modeling group.')

        except Exception as e:
        
            log_error (workflow_progress, "Workflow Exception: %s\n" %str(e))
            
        runWorkflowButton.disabled = False
        show_log_output_button.disabled = False
            

runWorkflowButton = widgets.Button(description="Run Workflow", disabled=False,\
    layout=widgets.Layout(width=button_width, height=button_height),\
    style= {'button_color':'lightgreen','font_weight':'bold'})
runWorkflowButton.add_class("buttontextclass")
runWorkflowButton.on_click (run_workflow)
#help (runWorkflowButton)


<a name="step_3"></a>
## Step 3: Run the Workflow [&#8607;](#top)

Click the `Run Workflow` button to run the workflow.  

- The Python scripts are encapsulated as a workflow by the launchWrapper.py script in the tool's bin directory. The Pegasus Workflow Management System (WMS) automates and manages the execution of the workflow jobs, including staging the jobs, distributing the work, submitting the jobs to run in parallel on CCR's UB-HPC compute cluster, as well as handling data flow dependencies and overcoming job failures. See the `Background` section for more information on the Pegasus WMS.<br />

- The get_netcdf_info.py Python script is executed in parallel for each of the selected modeling groups. This script uses the Python xarray package to analyze time series data from experiment files contained within the mapped collection folder's selected modeling groups and creates a json file for each of the selected modeling groups. The process_netcdf_info.py Python script reads the json files created by get_netcdf_info.py, determines unique time series information for each experiment type, and creates a text file containing the summarized time series information. This text file is returned from CCR and the results are displayed in the `View Workflow Results` section when the workflow completes.

- You will receive an email when the workflow completes and the results are ready for review.<br />

- If an error is encountered while running the workflow, the cause of the error will be written to the log output file, ghubex1_log_file.txt. See the `View Log Output` section for more information.<br />
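
The fan-out / fan-in shape of the workflow described above can be sketched locally. This is not the tool's actual code: a thread pool stands in for Pegasus, `get_info` and `process_info` are hypothetical stand-ins for get_netcdf_info.py and process_netcdf_info.py, and the JSON layout (experiment name mapped to a list of `[start_year, end_year]` spans) and the sample values are assumptions made for illustration only.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Made-up per-group results; the real script derives these from netCDF files with xarray.
FAKE_RESULTS = {
    "AWI": {"ctrl": [[2015, 2100]], "exp05": [[2015, 2100]]},
    "ILTS_PIK": {"ctrl": [[2015, 2100], [2005, 2100]]},
}

def get_info(group, out_dir):
    """Stand-in for get_netcdf_info.py: write one JSON file for one modeling group."""
    path = Path(out_dir) / f"{group}_netcdf_info.json"
    path.write_text(json.dumps(FAKE_RESULTS[group]))
    return path

def process_info(json_paths):
    """Stand-in for process_netcdf_info.py: collect unique time spans per experiment."""
    summary = {}
    for path in json_paths:
        for experiment, spans in json.loads(Path(path).read_text()).items():
            summary.setdefault(experiment, set()).update(tuple(s) for s in spans)
    return {exp: sorted(spans) for exp, spans in sorted(summary.items())}

def run(groups):
    """Run one job per group in parallel, then a single summarizing job."""
    with tempfile.TemporaryDirectory() as tmp:
        with ThreadPoolExecutor() as pool:
            paths = list(pool.map(lambda g: get_info(g, tmp), groups))
        return process_info(paths)

summary = run(["AWI", "ILTS_PIK"])
print(summary)
```

The summarizing step can only start once every per-group file exists, which is the data flow dependency that Pegasus tracks on the cluster.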


In [None]:
display(workflow_run_options_form)
display(runWorkflowButton)

In [None]:
def send_user_email(workflow_succeeded):

    email_subject = 'Ghub session #' + session_num + '.'
    
    if workflow_succeeded:
        email_text = 'Your ghubex1 job is complete!\r'
        email_text = email_text+'\rOutput files can be accessed on theghub.org in the following directory:'
        email_text = email_text+'\r' + str(self_workingdir)
    else:
        email_text = 'Your ghubex1 job failed.'
        email_text = email_text+'\rPlease check theghub.org for further information, in the directory:'
        email_text = email_text+'\r' + str(self_workingdir)        
        
    email_cmd = 'submit --progress silent mail2self -t "'+email_text+'" -s "'+email_subject+'"'
    
    # email debugging
    #start_time = time.time()
    os.system(email_cmd)
    #elapsed_time = time.time() - start_time
    #print ('email elapsed time: ', elapsed_time)
    
def finish_workflow_processing():
    
    try:

        log_info ('\nfinish_workflow_processing...')
        
        # workflow.yml is created by the Wrapper class in launchWrapper.py
        #filepath = os.path.join(self_workingdir, 'workflow.yml')
        #if os.path.exists(filepath):
            #print ("Deleting: %s\n" %filepath)
            #os.remove(filepath)

        for file in os.listdir(self_workingdir):
            if os.path.isfile(file):
                
                if file[0] == ".":
                    if file != ".gitattributes":
                        #print ("Deleting: %s\n" %file)
                        os.remove(file)

                #if file.startswith('python') and file.endswith('.stdout'):
                    #log_info ('\n%s:\n' %file)
                    #f = open(file,'r')
                    #for line in f:
                        #log_info (line)
                    #f.close()
                    #os.remove(file)
                    
                if file.startswith('python') and file.endswith('.stderr'):
                    log_info ('\n%s:\n' %file)
                    f = open(file,'r')
                    for line in f:
                        log_info (line)
                    f.close()
                    os.remove(file)
         
        filepath = os.path.join(self_workingdir, 'pegasus.analysis')
        if (os.path.exists(filepath)):
            filesize = os.path.getsize(filepath)
            log_info ('\npegasus.analysis filesize: ' + str(filesize))
            log_info ('pegasus.analysis:\n')
            f = open(filepath, 'r')
            output = f.read()
            f.close()
            log_info (output)
            os.remove(filepath)
        
        filepath = os.path.join(self_workingdir, "pegasusstatus.txt")
        if os.path.exists(filepath):
            #print ("Deleting: %s\n" %filepath)
            os.remove(filepath)

        filepath = os.path.join(self_workingdir, "pegasusjobstats.csv")
        if os.path.exists(filepath):
            #print ("Deleting: %s\n" %filepath)
            os.remove(filepath)

        filepath = os.path.join(self_workingdir, "pegasussummary-time.csv")
        if os.path.exists(filepath):
            #print ("Deleting: %s\n" %filepath)
            os.remove(filepath)

        filepath = os.path.join(self_workingdir, "pegasussummary.csv")
        if os.path.exists(filepath):
            #print ("Deleting: %s\n" %filepath)
            os.remove(filepath)

        # send email to user
        send_user_email(self_workflow_succeeded)
                
        log_info ('finish_workflow_processing done.')
        
    except Exception as e:
        log_error (workflow_progress, "EXCEPTION: %s\n" % str(e))


<a name="step_4"></a>
## Step 4: View Workflow Progress [&#8607;](#top)


In [None]:
workflow_progress = widgets.Output(layout={'border': widget_output_border_style})
display(workflow_progress)

<a name="step_5"></a>
## Step 5: View Workflow Results [&#8607;](#top)


In [None]:
workflow_results = widgets.Output(layout={'border': widget_output_border_style})
display(workflow_results)

<a name="step_6"></a>
## Step 6: View Log Output [&#8607;](#top)

- If an error is encountered while running this tool,
the cause of the error will be written to the log output file, ghubex1_log_file.txt.

- Click the `Show Log Output` button to open the `Log Output` window and view the log output file.


In [None]:
def show_log_output(change):
    
    if os.path.exists(self_log_filepath):
            
        if show_log_output_button.description == 'Show Log Output':
        
            show_log_output_button.description = 'Hide Log Output'
        
            with log_output:
            
                if os.path.exists(self_log_filepath):
                    print("%s: \n\n" %self_log_filepath)
                    f = open(self_log_filepath,'r')
                    for line in f:
                        print(line.rstrip())
                    f.close()
                else:
                    log_error (log_output, '%s does not exist. Please contact us.' %self_log_filepath)
        else:
        
            show_log_output_button.description = 'Show Log Output'
            log_output.clear_output()
    else:
        log_error (log_output, '%s does not exist. Please contact us.' %self_log_filepath)

show_log_output_button.add_class("buttontextclass")
show_log_output_button.on_click(show_log_output)
display (show_log_output_button)

In [None]:
log_output = widgets.Output(layout={'border': widget_output_border_style})
display (log_output)

In [None]:
def disable_widgets():

    global all_modeling_groups

    #modeling_groups_selection_form.disabled = True
    # Children need to be explicitly disabled.
    for i in range(len(all_modeling_groups)):
        all_modeling_groups[i].children[0].disabled = True
    show_log_output_button.disabled = True
    #downloadTXTButton.disabled = True
    
def enable_widgets():
    
    global all_modeling_groups

    #modeling_groups_selection_form.disabled = False
    # Children need to be explicitly enabled.
    for i in range(len(all_modeling_groups)):
        all_modeling_groups[i].children[0].disabled = False
    show_log_output_button.disabled = False
    #downloadTXTButton.disabled = False
         
def initialize():
    
    global all_modeling_groups
    
    disable_widgets()
    
    # Display form
    with models_form_output:
        
        modeling_groups = list(modeling_groups_list[folder_index].split(','))
        all_modeling_groups = []
        for i in range (len(modeling_groups)):
            append_modeling_groups(modeling_groups[i])
        modeling_groups_checkbox_form = ui.Form(all_modeling_groups, name = '')
        modeling_groups_buttons_form = ui.Form([select_default_models_button, select_all_models_button, unselect_all_models_button], name = '')
        modeling_groups_selection_form = ui.Form([modeling_groups_checkbox_form, modeling_groups_buttons_form], name = 'Modeling Group Selections')
        #Reduce the computational load by default
        select_default_modeling_groups()
        
        clear_output()
        display(modeling_groups_selection_form)
        
    enable_widgets()


In [None]:
# Start processing.
initialize()


<a name="createyourtool"></a>
## Create Your Tool On Ghub [&#8607;](#top)

### Host GIT repository on HUB

Follow the instructions on the https://theghub.org/tools/create web page. Enter a name for your tool (ghubex1 was entered for this template). For the Repository Host, select Host GIT repository on HUB. For the Publishing Option, select Jupyter Notebook.

Note: when a new tool is created you will receive an email with a link to the tool's status page. The tool's status page will allow you to let the Ghub administrators know when you are ready to update, install, approve or publish your tool.

Note: published tools are launched from the Ghub Dashboard's My Tools component.

### Update Your Tool

1) Launch the Workspace 10 Tool from the Ghub Dashboard's My Tools component and in an xterm terminal window enter:

	git clone https://github.com/GhubGateway/Ghub_Pegasus_WMS_Python_Example ghubex1

	git clone https://theghub.org/tools/<your tool name>/git/<your tool name> <your tool name>

2) Copy template files from the ghubex1 bin and remotebin directories to your tool's bin and remotebin directories.<br />
3) Update the launchWrapper.py script in your tool's bin directory with the script required to plan the workflow for your tool. See Ghub_Pegasus_WMS_Workflow_Python_Example.pdf in the ghubex1 doc directory for more information on Ghub Pegasus WMS workflows.<br />
4) Compare the invoke script in your tool's middleware directory with the invoke script in the ghubex1 middleware directory and update as required.

### Launch the Python Scripts for Your Tool

1) Launch the Jupyter Notebooks (202210) tool from the Ghub Dashboard's My Tools component and open the \<your tool name\>/\<your tool name\>.ipynb Jupyter Notebook.<br />
2) Update \<your tool name\>/\<your tool name\>.ipynb with the user interface required for your tool.<br />
3) Save the notebook updates.<br />
4) Click the Appmode button.<br />
5) Click the Run Workflow button to launch the Python Scripts.<br />

### Commit Your Tool Updates

1) Enter git add to add a new file or to update an existing file.<br />
2) Enter git commit -m "commit message" to describe your updates.<br />
3) Enter git push origin master to push your updates to the GIT repository on Ghub.<br />


<a name="background"></a>
## Background [&#8607;](#top)

- The Python scripts are encapsulated as a workflow. The Pegasus Workflow Management System (WMS) automates and manages the execution of the workflow jobs, including staging the jobs, distributing the work, submitting the jobs to run in parallel on CCR's UB-HPC compute cluster, as well as handling data flow dependencies and overcoming job failures. See https://pegasus.isi.edu/documentation/index.html for more information on the Pegasus Workflow Management System (WMS).

- The submit command enables Ghub users to execute code on CCR's UB-HPC compute cluster. See https://theghub.org/kb/development/using-submit for more information on the submit command. See https://help.hubzero.org/documentation/current/tooldevs/grid/pegasuswf for more information on submitting a pegasus-plan for a Pegasus WMS workflow.

- This Jupyter-based tool uses Python 3. See https://theghub.org/resources?alias=jupyterexamples for more information on developing Jupyter-based tools on Ghub.

- This tool is deployed on Debian 10 to run in Tool or App mode style. See https://theghub.org/kb/development/deploy-styles-for-jupyter-tools for more information on deploying Jupyter-based tools on Ghub.
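
The submit command is also what this notebook uses to email the user when a workflow finishes (`submit --progress silent mail2self -t ... -s ...`). A minimal sketch of building that call as an argument list, so that quotes in the message cannot break the command line, might look like the following; the helper name `build_mail2self_command` and the sample subject and text are hypothetical, and only the flags already used in this notebook are assumed.

```python
import shlex

def build_mail2self_command(subject, text):
    """Return the argument list for the mail2self call used in this notebook.

    Hypothetical helper; the notebook itself builds a single string for os.system.
    """
    return ["submit", "--progress", "silent", "mail2self", "-t", text, "-s", subject]

# Sample values for illustration only.
argv = build_mail2self_command(
    "Ghub session #12345.",
    'Your "ghubex1" job is complete!',
)

# A shell-safe string form, useful for logging what would be run.
command_line = shlex.join(argv)
print(command_line)

# On Ghub, the list could be passed directly to subprocess.run(argv);
# it is not executed here because submit exists only on the hub.
```

Passing a list to `subprocess.run` avoids the shell entirely, whereas a concatenated string relies on the message containing no unescaped double quotes.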