<h1 id="Using-Python-code-and-Dataframes">Using Python code and Dataframes</h1>

<p>This notebook will walk you through some of the programmatic features available in the GenePattern Notebook Environment. The <a href="http://www.genepattern-notebook.org/programmatic/">GenePattern Notebook Programmatic Features Guide</a> contains information and examples about additional features not covered here.</p>

<div class="alert alert-info">First, sign in below.</div>


In [1]:
# Requires GenePattern Notebook: pip install genepattern-notebook
import gp
import genepattern

# Username and password removed for security reasons.
genepattern.GPAuthWidget(genepattern.register_session("https://gp-beta-ami.genepattern.org/gp", "", ""))

# Toggle Code View <a id="toggle_code"/>

We will start by looking at the code behind a simple GenePattern module, ConvertLineEndings, which converts a file's line endings to the format the GenePattern server expects. We will also run the module so that we can use its output file later.

<div class="alert alert-info"><ol>
<li>Click the gear icon for the cell below and select <b>Toggle Code View</b></li>
<li>Run the module</li>
</ol></div>


In [2]:
convertlineendings_task = gp.GPTask(genepattern.get_session(0), 'urn:lsid:broad.mit.edu:cancer.software.genepattern.module.analysis:00002')
convertlineendings_job_spec = convertlineendings_task.make_job_spec()
convertlineendings_job_spec.set_parameter("input.filename", "https://datasets.genepattern.org/data/ccmi_tutorial/2017-12-15/BRCA_HUGO_symbols.preprocessed.gct")
convertlineendings_job_spec.set_parameter("output.file", "BRCA_40_samples.cvt.gct")
genepattern.GPTaskWidget(convertlineendings_task)

# Send to Code

The GenePattern Python Library integrates seamlessly with GenePattern cells. To see code that references a GenePattern job or one of its result files, click a job result in a GenePattern Job Cell and select “Send to Code” in the menu.

<div class="alert alert-info">Once the above job is completed, click on the *BRCA_40_samples.cvt.gct* output and select *Send to Code.* This will create a code cell immediately below the job cell.</div>

# Send to Dataframe

The GenePattern Python Library also provides functionality for common GenePattern file formats, allowing them to integrate seamlessly with [Pandas](http://pandas.pydata.org/), a popular Python data analysis library.

Both the [GCT and ODF file formats](http://software.broadinstitute.org/cancer/software/genepattern/file-formats-guide) are easily loaded as Pandas Dataframes. Code examples of how to load these files are available in GenePattern Job Cells by clicking a GCT or ODF job result and selecting “Send to Dataframe” in the menu.

<div class="alert alert-info">Scroll up and click on the same <a href="#toggle_code">*BRCA_40_samples.cvt.gct*</a> output you clicked on before and then select *Send to Dataframe.* This will generate a code cell which loads the output data into a pandas Dataframe. Execute this code and the Dataframe will display as a table in your notebook.</div>
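
To give a feel for what loading a GCT file into a Dataframe involves, here is a minimal sketch that parses a tiny GCT file with pandas alone. It assumes the GCT 1.2 layout (a version line, a dimensions line, then a tab-separated table whose first two columns are Name and Description). The inline data and the `read_gct` helper are hypothetical and are not part of the GenePattern library; the code generated by *Send to Dataframe* may differ.

```python
import io

import pandas as pd

# Hypothetical two-gene, two-sample GCT file (GCT v1.2 layout).
GCT_TEXT = (
    "#1.2\n"
    "2\t2\n"
    "Name\tDescription\tsample_a\tsample_b\n"
    "TP53\ttumor protein p53\t5.0\t7.5\n"
    "BRCA1\tBRCA1 DNA repair associated\t3.2\t1.1\n"
)


def read_gct(handle):
    """Parse a GCT file-like object into a DataFrame indexed by Name and Description."""
    # Skip the version line and the dimensions line; the third line is the header.
    return pd.read_csv(handle, sep="\t", skiprows=2, index_col=[0, 1])


df = read_gct(io.StringIO(GCT_TEXT))
print(df.shape)          # rows x sample columns
print(list(df.columns))  # sample names
```

Indexing by both Name and Description leaves only the numeric sample columns in the table body, which keeps later operations such as sorting straightforward.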

# UI Builder

The UI Builder is a way to display any Python function as an interactive widget. This will render the parameters of the function as a web form.

The UI Builder also makes use of many Python features. It will display the docstring as the function description, will infer parameter types from default values and will display parameter annotations as helpful text near each input.
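
As an illustration of the Python features involved, the sketch below uses the standard `inspect` module to read a function's docstring, annotations, and default values. It is not the UI Builder's actual implementation; `describe` and `scale` are hypothetical names chosen for this example.

```python
import inspect


def describe(func):
    """Collect the metadata a form builder could read from a function."""
    params = []
    for name, p in inspect.signature(func).parameters.items():
        has_default = p.default is not inspect.Parameter.empty
        has_annotation = p.annotation is not inspect.Parameter.empty
        params.append({
            "name": name,
            # The annotation text doubles as help text; empty if none was given.
            "help": p.annotation if has_annotation else "",
            # Infer the type from the default value, falling back to str.
            "type": type(p.default).__name__ if has_default else "str",
            "default": p.default if has_default else None,
        })
    return {"description": inspect.getdoc(func) or "", "parameters": params}


def scale(value: "Number to scale", factor: "Multiplier" = 2.0):
    """Multiply a value by a factor."""
    return value * factor


form = describe(scale)
print(form["description"])
for p in form["parameters"]:
    print(p)
```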

Python variables may be used as input when filling out a UI Builder form. To do this, simply type the name of the variable into the input field. When the form is submitted, the widget will pass a reference to the variable to the resulting function call.
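
One way to picture this behavior is a lookup of the typed text in the notebook's namespace. The sketch below is only an assumption about how such a lookup could work, not the widget's real code; `resolve_input` and the `namespace` dictionary are hypothetical.

```python
def resolve_input(text, namespace):
    """Return the object bound to `text` if it names a variable, else the text itself."""
    if text.isidentifier() and text in namespace:
        return namespace[text]
    return text


# Stand-in for the notebook's user namespace.
namespace = {"my_table": [3, 1, 2]}

resolved = resolve_input("my_table", namespace)
literal = resolve_input("not a variable", namespace)

print(resolved is namespace["my_table"])  # the same object, not a copy
print(literal)
```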

The simplest way to render a function using the UI Builder is to import the *genepattern* package and then attach the *build_ui* decorator to the function's definition.

<div class="alert alert-info"><ol>
<li>Click the gear icon and select <b>Toggle Code View</b> in the cell below to see how the code was rendered using the UI Builder.</li>
</ol></div>

Inputs for any GenePattern cell can optionally take Python variable names. For example, from your <b>Send to Dataframe</b> example above you should have a variable that looks something like this: <pre>brca_40_samples_cvt_gct_1595725</pre> (The number at the end will be different for each run of the module.) We can pass the value of this variable (i.e. the Dataframe) by entering the variable name into the <b>dataframe</b> input in the UI below. Make sure that you have run the cell that creates the Dataframe before this next step.
<div class="alert alert-info"><ol start=2>
<li>Copy the Dataframe variable name from your cell above and paste it into the <b>dataframe</b> variable in the <b>sort_results</b> cell below</li>
<li>Run the cell</li>
</ol></div>


In [3]:
from IPython.display import display

@genepattern.build_ui(parameters={

    # use this to hide the output variable on the UIBuilder form if you don't want users of this
    # notebook to see it
    "output_var": {
        "name": "results",
        "description": "These are the results",
        "hide": True
    }
})
def sort_results(dataframe: "The variable name of the GCT dataframe",
                 column_name: "The name of the column by which to sort." = 'TCGA-A7-A0CE-11.htseq'):
    """
    Sort the samples in the dataframe by the specified column.
    """
    # Sort by the column chosen in the form rather than a hard-coded column name.
    display(dataframe.dataframe.sort_values(column_name, ascending=False))

## Parameter Types
The function below declares its input as a <b>file parameter</b>. This lets the parameter receive files through 'Send to' from GenePattern job cells or, in reverse, 'pull' results from previously executed GenePattern jobs. To do this, add a parameter type definition to the UI Builder definition.
<pre>
   parameters={
    "url": {
        "type": "file",
        "kinds": ["gct"]
    }
   }
You can also optionally define what kinds of files the parameter accepts. Kinds are identified by file extension (the part of the filename after the final period, usually three letters long). In the example above, the <b>url</b> parameter accepts only files of the type <b>gct</b>, which we saw in the Data Preparation section.
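
The idea of matching a file against a list of kinds can be sketched with the standard library. This is an illustrative assumption about extension matching, not the UI Builder's implementation; `accepts` is a hypothetical helper and the second URL is a made-up example.

```python
import posixpath
from urllib.parse import urlparse


def accepts(url, kinds):
    """Return True if the file extension in `url` is one of `kinds`."""
    path = urlparse(url).path
    # splitext keeps only the text after the final period, e.g. ".gct".
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return extension in {k.lower() for k in kinds}


gct_url = "https://datasets.genepattern.org/data/ccmi_tutorial/2017-12-15/BRCA_HUGO_symbols.preprocessed.gct"
odf_url = "https://example.org/results/table.odf"  # made-up URL for contrast

print(accepts(gct_url, ["gct"]))
print(accepts(odf_url, ["gct"]))
```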

<div class="alert alert-info"><ol>
<li>Click the gear icon and select <b>Toggle Code View</b> to see how the parameter definition is added</li>
<li>Click on the "url" parameter and select the result file from the ConvertLineEndings job above.</li>
<li>Run the <b>print_url</b> cell.</li>
</ol></div>

In [4]:
import genepattern
import gp.data

@genepattern.build_ui(parameters={
    "url": {
        "type": "file",
        "kinds": ["gct"]
    },
    # use this to hide the output variable on the UIBuilder form if you don't want users of this
    # notebook to see it
    "output_var": {
        "name": "results",
        "description": "These are the results",
        "hide": True
    }
})
def print_url(url):
    # print() returns None, so there is nothing useful to return here.
    print("The result file URL is: " + url)

## The Tool Menu
By virtue of using the GenePattern UI Builder, the two functions we defined above (sort_results and print_url) have been added to the Tool menu. This allows you to reuse them, along with their new UI, elsewhere in the notebook.

<div class="alert alert-info"><ol>
<li>Click the <b>Insert</b> menu item and select <b>Insert Cell Below</b> to add a cell to the notebook.</li>
<li>Click the <b>Cell</b> menu item and select <b>Cell Type</b> and then <B>GenePattern</B> to make it a GenePattern cell.  This should also cause the Tool Manager to appear at the left of your page.</li>
<li>Click on the Tool Manager and select the <b>Notebook</b> tab at the top.</li>
<li>Click on the <b>print_url</b> tool to make this new cell another instance of the print_url function we used earlier.</li>

</ol></div>

<h1 id="Extra-Credit">Extra Credit</h1>

<p>Below are extra credit exercises to complete if you have time.</p>

<h3 id="1.-Change-UI-Builder-Description">1. Change UI Builder Description</h3>

<div class="alert alert-info">View the code for the UI Builder <b>sort_results</b> cell above, change the function&#39;s description to specify whether the sorting is ascending or descending, then execute the cell to see the new description take effect.</div>

<h3 id="2.-Custom-Code">2. Custom Code</h3>

<div class="alert alert-info">Write a simple Python function and insert <b>@genepattern.build_ui</b> on the line above the function definition to tell GenePattern to wrap it with a user interface.</div>

<h3 id="3.-Installing libraries">3. Installing Libraries</h3>


<p>Most pip-installable libraries can be added to your notebook server. Try installing one of your favorite libraries by running the following in a code cell, replacing <a href="https://pythontips.com/2013/07/30/20-python-libraries-you-cant-live-without/">"Pillow"</a> with a library of your choice.</p>
<div class="alert alert-info">!pip install Pillow --target='/home/jovyan/.ipython'</div>
<br/>
<p>Then import it. Note that the import name can differ from the install name: Pillow is imported as <b>PIL</b>.</p>
<div class="alert alert-info">import PIL</div>
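
If you want to confirm that an install worked before importing, a small check with the standard `importlib` module is one option. This is a sketch; `is_importable` is a hypothetical helper. Keep in mind that the module name you import can differ from the name you pass to pip (for instance, Pillow is imported as `PIL`).

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the current import path."""
    return importlib.util.find_spec(module_name) is not None


print(is_importable("json"))                           # standard library module
print(is_importable("surely_not_a_real_module_name"))  # not installed
```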


