<h1><span style="color:red">Welcome to the SuAVE Jupyter Notebook Server</span></h1>

This is the Jupyter Notebook Dispatcher module of the SuAVE platform. This environment enables you to write and execute Python scripts to process and analyze data in SuAVE surveys and image galleries. In most included scripts, the derived data (secondary variables, image characteristics, predictive labels, etc.) can be added to SuAVE surveys for visual analysis.  

Look several cells below for the types of operations supported by your selected JupyterHub.

You can execute cells in sequence by clicking 'Run' or pressing Shift-Enter. From this module (the "dispatcher") you can launch other notebooks that perform computation, image processing, modeling, and statistical tasks.



In [None]:
!pip install papermill

In [None]:
# from google.colab import drive
# drive.mount('/content/drive')

Mounted at /content/drive


## 1. Check if the passed parameters are correct 

In [None]:
# Check if the parameters are correct
import webbrowser
import ntpath
import os
from IPython.display import Markdown, display
def printmd(string):
    display(Markdown(string))

# Everything before '/SuaveDispatch' in the notebook URL is the hub base URL
base_url = full_notebook_url.partition('/SuaveDispatch')[0]

if dzc_file == "undefined":
    dzc_file = ""

# Tile collections under lib-staging-uploads are looked up under /lib-nfs/dzgen
if len(dzc_file) > 20 and "lib-staging-uploads" in dzc_file:
    localdzc = dzc_file.replace("https://maxim.ucsd.edu/dzgen/lib-staging-uploads","/lib-nfs/dzgen")
    full_images = localdzc.replace("/content.dzc","/full_images/")
else:
    localdzc = "dzc not available on NFS storage"
    full_images = "full images not available on NFS storage"

# True only when the full-size image path actually exists on this hub
images_available = os.path.exists(full_images)
    

printmd("<b><span style='color:red'>Verify survey parameters: </span></b>")

print("Base Survey URL: ", survey_url)
print("Enabled Views: ", views)
print("Default View: ", view)
print("User ID: ", user)
print("Additional Parameters: ", params)
print("Data File: ", csv_file)
print("Image Tile Collection URL: ", dzc_file)
print("Active Object: ", active_object)
print("Jupyter Hub URL: ", base_url)
print("Local Tile Collection Path : ", localdzc)
print("Local Full-size Image Path: ", full_images)
if os.path.exists(full_images):
    print("Full-size Images Available")
else:
    print("Full-size Images Not Available")
    

<b><span style='color:red'>Verify survey parameters: </span></b>

Base Survey URL:  https://suave2.sdsc.edu/main/file=jil146_SDG_Indicators_2018___clone__.csv
Enabled Views:  
Default View:  bucket
User ID:  jil146
Additional Parameters:  none
Data File:  jil146_SDG_Indicators_2018___clone__.csv
Image Tile Collection URL:  https://maxim.ucsd.edu/dzgen/uploads/275b305e3f83cad50623f9c6946abb4b/content.dzc
Active Object:  null
Jupyter Hub URL:  https://datahub.ucsd.edu/user/jil146/notebooks/jupyter-suave
Local Tile Collection Path :  dzc not available on NFS storage
Local Full-size Image Path:  full images not available on NFS storage
Full-size Images Not Available
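
The URL-to-path mapping in the cell above can be isolated into a small function, which makes it easier to test against different tile collection URLs. This is a minimal sketch rather than the platform's own code: `resolve_local_paths` and the `abc123` collection id are hypothetical, while the two prefixes are copied from the cell above.

```python
# Prefixes copied from the parameter-check cell; everything else is illustrative.
REMOTE_PREFIX = "https://maxim.ucsd.edu/dzgen/lib-staging-uploads"
LOCAL_PREFIX = "/lib-nfs/dzgen"


def resolve_local_paths(dzc_url):
    """Return (local_dzc, full_images_dir), or (None, None) when the URL is unmapped."""
    if not dzc_url or dzc_url == "undefined" or REMOTE_PREFIX not in dzc_url:
        return None, None
    local_dzc = dzc_url.replace(REMOTE_PREFIX, LOCAL_PREFIX)
    full_images_dir = local_dzc.replace("/content.dzc", "/full_images/")
    return local_dzc, full_images_dir


# Hypothetical collection id, used only to show the mapping
example = REMOTE_PREFIX + "/abc123/content.dzc"
print(resolve_local_paths(example))

# A URL outside lib-staging-uploads (as in the output above) has no local path
print(resolve_local_paths(
    "https://maxim.ucsd.edu/dzgen/uploads/275b305e3f83cad50623f9c6946abb4b/content.dzc"))
```

Returning `None` instead of a message string keeps the "not available" case distinguishable from a real path; the caller can then decide what text to show.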


<h2><span style="color:red">2. Load notebooks from github repository</span></h2>

<span style="color:red">Skip this cell if you are already running notebooks from your repo. Do not clone your repo a second time!</span> 

In [None]:
!rm -rf myclone
!git clone --depth 1 "https://github.com/DDDyylan/Suave_on_Colab.git" myclone
# Rebuild the dispatcher URL so that it points into the cloned repository
query = ('survey_url=' + survey_url + '&views=' + views + '&view=' + view
         + '&user=' + user + '&csv=' + csv_file + '&dzc=' + dzc_file
         + '&active_object=' + active_object)
url1 = base_url + '/myclone/SuaveDispatch.ipynb?' + query
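
An alternative to joining the query string by hand is `urllib.parse.urlencode`, which percent-encodes each value so that characters such as `&` or `=` inside a parameter cannot break the URL. The sketch below is an illustration under assumptions: `build_dispatch_url` is a hypothetical helper, and the hub and survey addresses are placeholders, not real endpoints.

```python
from urllib.parse import urlencode


def build_dispatch_url(base_url, params):
    """Append percent-encoded query parameters to the dispatcher notebook URL."""
    return base_url + "/SuaveDispatch.ipynb?" + urlencode(params)


# Placeholder values for illustration only
demo_params = {
    "survey_url": "https://example.org/main/file=demo.csv",
    "views": "",
    "view": "bucket",
    "user": "demo",
}
demo_url = build_dispatch_url("https://hub.example.org/myclone", demo_params)
print(demo_url)
```

Because `urllib.parse.parse_qs` decodes percent-encoded values, a URL built this way can still be split back into the original parameters.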

## 3. Retrieve the survey file for processing

As before, press Shift-Enter to run this cell and continue to the next one. This cell only downloads the survey data file so that later cells can process it.

In [None]:
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

csv_url = survey_url.split("main")[0] + "surveys/" + csv_file

# get the survey data file
http = urllib3.PoolManager()
r = http.request('GET', csv_url, preload_content=False)

# save the file to the suave folder on the mounted Google Drive
path = "/content/drive/MyDrive/suave/" + csv_file 
with open(path, 'wb') as out:
    while True:
        data = r.read(1024)
        if not data:
            break
        out.write(data)

r.release_conn()
printmd("<b><span style='color:red'>Survey file retrieved. Run next cell to continue.</span></b>")


<b><span style='color:red'>Survey file retrieved. Run next cell to continue.</span></b>
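
The two steps in the cell above (deriving the CSV address from the survey URL, then copying the response in 1024-byte chunks) can be sketched without network access. The helper names `survey_csv_url` and `copy_in_chunks` are hypothetical, and the in-memory streams stand in for the HTTP response and the output file.

```python
import io


def survey_csv_url(survey_url, csv_file):
    """Derive the CSV location the same way the cell above does."""
    # Note: this splits at the first occurrence of "main" anywhere in the URL.
    return survey_url.split("main")[0] + "surveys/" + csv_file


def copy_in_chunks(src, dst, chunk_size=1024):
    """Copy a readable binary stream to a writable one; return the bytes copied."""
    total = 0
    while True:
        data = src.read(chunk_size)
        if not data:
            break
        dst.write(data)
        total += len(data)
    return total


print(survey_csv_url("https://suave2.sdsc.edu/main/file=demo.csv", "demo.csv"))

# In-memory stand-ins for the HTTP response and the destination file
payload = b"a,b\n1,2\n" * 500
buffer = io.BytesIO()
copied = copy_in_chunks(io.BytesIO(payload), buffer)
print(copied, "bytes copied")
```

Reading in fixed-size chunks keeps memory use constant regardless of the survey file size, which is why the cell passes `preload_content=False` to `urllib3`.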

## 4. Now, select a notebook to do some work

Select a notebook, then continue to the next cell. Note that you will see only those operations that are supported on your selected hub.

In [None]:
from __future__ import print_function
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

from collections import OrderedDict

nb_menu = OrderedDict()
nb_menu_counter = 1

menulist = [
('Arithmetic Operations','arithmetic/SuaveArithmetic.ipynb','any'),
('Descriptive Statistics','stats/DescriptiveStats.ipynb','any'),
('Generate Contingency Tables','stats/Generate_Contingency_Tables.ipynb','any'),
('Generate Factor Contributions','stats/Generate_Factor_Contributions.ipynb','any'),
('Named Entity Recognition','tagger/NER.ipynb','any'),
('Color Statistics','colors/ColorStats.ipynb','image'),
('Classify Images','classify/ImageClassify.ipynb','image'),
('Generate LeNet CNN Model v2','predict/PredictiveModel_v2.ipynb','image'),
('Extend LeNet CNN Model','predict/ExtendModel.ipynb','image'),
('Generate SVM Model','svm/SVMPredictiveModel.ipynb','image'),
('Generate Aggregate Maps', 'maps/Generate_Aggregate_Maps_Suave.ipynb','any'),
('Extend SVM Model','svm/ExtendSVM.ipynb','image'),
('Generate SDG Dataset','SDG/GenerateSDGDataset.ipynb','largedataset'),
('Explore with Holoviz','holoviz/holoviz.ipynb','any'),
('Enhance Dataset','wrangling/qualgeoimage.ipynb','any'),
('Knowledge Graph Query','kg/kg_query.ipynb','any'),
('Annotate with NEMO','nemo/suave_nemo.ipynb','any'),
('Transfer Learning','transfer_learning/transfer_learning.ipynb','image'),
('Spatial Statistics','spatialstats/SpatialStats.ipynb','any')

]

if os.path.isfile(localdzc) and not os.path.isdir(full_images):
    print("ATTENTION!  This hub supports image-based processing, but full-size images are not available for this survey. \n Full-size image operations are not available from this menu.\n Contact the admin at zaslavsk@sdsc.edu to re-generate images from image tiles.")
elif not os.path.isfile(localdzc):
    print("ATTENTION!  Image tiles must be available on nfs storage.\n This hub does not support nfs mounted storage. \n Full-size image operations are not available from this menu.")
elif not os.path.isdir('/lib-nfs/largedatasets'):
    print("ATTENTION!  Large datasets such as the SDG database must be available on nfs storage.\n This hub does not support nfs mounted storage.")


# For a setup where dzc's are only on NFS share, use this:

for label, nb, nbtype in menulist:
    if nbtype == 'any':
        nb_menu[str(nb_menu_counter) + '. ' + label] = nb
        nb_menu_counter +=1
    elif nbtype == 'image':
        if os.path.isfile(localdzc) and os.path.isdir(full_images):
            nb_menu[str(nb_menu_counter) + '. ' + label] = nb
            nb_menu_counter +=1
    elif nbtype == 'largedataset':
        if os.path.isdir('/lib-nfs/largedatasets'):
            nb_menu[str(nb_menu_counter) + '. ' + label] = nb
            nb_menu_counter +=1
        

def f(notebooks_menu):
    return notebooks_menu
out = interact(f, notebooks_menu=nb_menu.keys());

printmd("<b><span style='color:red'>Select a Jupyter notebook and then run next cell</span></b>")


ATTENTION!  Image tiles must be available on nfs storage.
 This hub does not support nfs mounted storage. 
 Full-size image operations are not available from this menu.


interactive(children=(Dropdown(description='notebooks_menu', options=('1. Arithmetic Operations', '2. Descript…

<b><span style='color:red'>Select a Jupyter notebook and then run next cell</span></b>
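
The menu filtering above depends on two facts about the hub: whether full-size images are present and whether the large-dataset share is mounted. A minimal sketch of the same filtering as a pure function follows; `build_menu` and its boolean arguments are hypothetical names, and the sample entries are a subset of `menulist`.

```python
from collections import OrderedDict


def build_menu(menulist, images_ok, large_ok):
    """Return a numbered menu holding only the notebooks this hub can run."""
    allowed = {"any": True, "image": images_ok, "largedataset": large_ok}
    menu = OrderedDict()
    for label, nb, nbtype in menulist:
        if allowed.get(nbtype, False):
            # Number entries by their position in the filtered menu
            menu["{}. {}".format(len(menu) + 1, label)] = nb
    return menu


sample = [
    ('Arithmetic Operations', 'arithmetic/SuaveArithmetic.ipynb', 'any'),
    ('Color Statistics', 'colors/ColorStats.ipynb', 'image'),
    ('Generate SDG Dataset', 'SDG/GenerateSDGDataset.ipynb', 'largedataset'),
    ('Spatial Statistics', 'spatialstats/SpatialStats.ipynb', 'any'),
]

# A hub without NFS storage offers only the 'any' notebooks
demo_menu = build_menu(sample, images_ok=False, large_ok=False)
print(list(demo_menu))
```

In the notebook itself the two flags would come from the `os.path.isfile(localdzc) and os.path.isdir(full_images)` and `os.path.isdir('/lib-nfs/largedatasets')` checks.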

## 5. Open the selected notebook and pass survey parameters to it

Once the URL for the next notebook is constructed, click that URL to open it.

In [None]:
import urllib.parse
import papermill as pm

chosen_nb_name = nb_menu[out.widget.result]
specific = 'myclone/operations/'+chosen_nb_name

# Turn the dispatcher URL's query string into papermill parameters.
# Note: parse_qs drops parameters whose value is empty (e.g. views=).
url = urllib.parse.urlparse(url1)
query_dict = urllib.parse.parse_qs(url.query)
input_parameters = {k: v[0] for k, v in query_dict.items()}
input_parameters['full_notebook_url'] = url1

# Execute the notebook with the input parameters
try:
    pm.execute_notebook(
        input_path=specific,
        output_path=path+chosen_nb_name.split('/')[-1],
        parameters=input_parameters
    )
except Exception as err:
    # Report the error instead of silently swallowing it
    print("Notebook execution raised an error:", err)
    print("The chosen notebook has been created. It may take a while to access it.")

<b><span style='color:red'>Click the URL to open the selected notebook:</span></b>

https://datahub.ucsd.edu/user/jil146/notebooks/jupyter-suave/operations/arithmetic/SuaveArithmetic.ipynb?surveyurl=https://suave2.sdsc.edu/main/file=jil146_SDG_Indicators_2018___clone__.csv&views=&view=bucket&user=jil146&csv=jil146_SDG_Indicators_2018___clone__.csv&dzc=https://maxim.ucsd.edu/dzgen/uploads/275b305e3f83cad50623f9c6946abb4b/content.dzc&activeobject=null
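
When a query string is converted into notebook parameters, empty values need attention: `urllib.parse.parse_qs` omits them unless `keep_blank_values=True` is passed. The sketch below shows the difference; `query_to_parameters` is a hypothetical helper and the URL is a placeholder.

```python
from urllib.parse import urlparse, parse_qs


def query_to_parameters(url, keep_blank=True):
    """Map each query parameter to its first value."""
    query = parse_qs(urlparse(url).query, keep_blank_values=keep_blank)
    return {key: values[0] for key, values in query.items()}


# Placeholder URL with an empty 'views' parameter
demo = "https://hub.example.org/myclone/SuaveDispatch.ipynb?views=&view=bucket&user=demo"

with_blanks = query_to_parameters(demo)
without_blanks = query_to_parameters(demo, keep_blank=False)
print(with_blanks)
print(without_blanks)
```

Whether to keep blank values depends on the target notebook: if its parameters cell already defines a default for `views`, dropping the empty value lets that default apply.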
