# Seurat Pipeline in GenePattern

---

**Notebook Authors**: Jonathan Zamora, Alex Wenzel, and Edwin F. Juárez, PhD
<br>
**Contact Information**: jzamoraa@ucsd.edu & ejuarez@ucsd.edu

## Background & Summary

`Seurat` is an R toolkit for single-cell genomics. It is widely used by both dry-lab and wet-lab researchers for Quality Control (QC), Preprocessing, Batch Correction, and Clustering of single-cell RNA-seq (scRNA-seq) data, as well as for other scRNA-seq analyses.

The following notebook serves as an introduction to the expansive pipeline of `Seurat` analyses by providing a walkthrough of the following:
- `Seurat.QC`
- `Seurat.Preprocessing`
- `Seurat.Clustering`
- `Seurat.VisualizeMarkerExpression`

Notably, our `Seurat` pipeline notebook pulls data from the **`Human Cell Atlas`**, allowing users to analyze any scRNA-seq dataset from its wide collection of readily available data.

The underlying code for this notebook is built using the `Seurat v4.0.1` package, and the source code for `Seurat` can be found here: https://github.com/satijalab/seurat



![GenePattern Seurat Pipeline](Seurat-Pipeline.png)

## Activate Demo Mode

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>
Click run in the cell below to activate demo mode.
</div>

In [55]:
import nbtools
import demo_mode
from IPython.display import display, HTML

@nbtools.build_ui(parameters={'output_var': {'hide': True}})
def activate_demo_mode():
    """Enable demo mode for this notebook"""
    
    # Any job you enable as a demo job will need to be listed here, 
    # along with the module name and any parameters you want matched.
    # Parameters that are not listed will not be matched. You can 
    # list the same module multiple times, assuming you use different 
    # parameter match sets.
    
    demo_mode.set_demo_jobs([
        {
            'name': 'Seurat.QC',
            'job': 396891,                                # Make sure to set the permissions of any demo job to 'public'
            'params': {
                'input_file':'demo_mode.py.zip',
                'hca_url': 'MM', 
                'file_name': 'MultipleMyeloma'
            }
        }, 
        {
            'name': 'Seurat.Preprocessing',
            'job': 396941,
            'params': {
                'input_rds': 'MultipleMyeloma.rds',   # For file parameters, just list the file name, not URL
                'file_name': 'MultipleMyeloma.Preprocessed_1',
                'min_n_features': '0',
                'max_n_features': '6000',
                'max_percent_mitochondrial': '80',
            }
        },
        {
            'name': 'Seurat.Clustering',
            'job': 396942,
            'params': {
                'input.seurat.rds.file':'MultipleMyeloma.Preprocessed_1.rds',
                'output.filename': 'MultipleMyeloma.clustered_1', 
            }
        },
        {
            'name': 'Seurat.VisualizeMarkerExpression',
            'job': 396944,
            'params': {
                'input_file':'MultipleMyeloma.clustered_1.rds',
                'genes': 'CD14, LYZ, CCR7, IL7R, S100A4, MS4A1, CD8A, CD19, CD38', 
                'output_file_name': 'MultipleMyeloma.Markers_1', 
            }
        },
    ])
    
    # To activate demo mode, just call activate(). This example wraps 
    # the activation call behind a UI Builder cell, but you could have 
    # it called in different ways.

    demo_mode.activate()
    display(HTML('<div class="alert alert-success">Demo mode activated</div>'))
    
    # The code in this call has been left expanded for tutorial purposes

UIBuilder(description='Enable demo mode for this notebook', function_import='nbtools.tool(id="activate_demo_mo…
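
To make the matching rule described in the comments of the cell above more concrete, here is a minimal sketch of how a submitted job could be compared against the demo job list. It is an illustration only: `find_demo_job` and `DEMO_JOBS` are hypothetical names, not part of the `demo_mode` module, and the real implementation may differ. The sketch assumes that a demo job matches when the module name is the same and every listed parameter has the same value, while parameters that are not listed are ignored.

```python
from typing import Optional

# Hypothetical demo job list, in the same shape passed to set_demo_jobs() above
DEMO_JOBS = [
    {
        'name': 'Seurat.QC',
        'job': 396891,
        'params': {'hca_url': 'MM', 'file_name': 'MultipleMyeloma'},
    },
    {
        'name': 'Seurat.Clustering',
        'job': 396942,
        'params': {'output.filename': 'MultipleMyeloma.clustered_1'},
    },
]


def find_demo_job(module_name: str, submitted: dict, demo_jobs: list) -> Optional[int]:
    """Return the job number of the first matching demo job, or None."""
    for entry in demo_jobs:
        if entry['name'] != module_name:
            continue
        # Only the parameters listed in the demo entry are compared;
        # extra submitted parameters (e.g. 'pattern') are ignored.
        if all(submitted.get(k) == v for k, v in entry['params'].items()):
            return entry['job']
    return None


match = find_demo_job(
    'Seurat.QC',
    {'hca_url': 'MM', 'file_name': 'MultipleMyeloma', 'pattern': 'MT-'},
    DEMO_JOBS,
)
no_match = find_demo_job(
    'Seurat.QC',
    {'hca_url': 'MM', 'file_name': 'other'},
    DEMO_JOBS,
)
print(match, no_match)
```

Under this assumption, a job only falls back to a real submission when at least one of the listed parameters differs, which is why the instructions in this notebook ask you to type the parameter values exactly.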

# Login to GenePattern

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>
Log in with your GenePattern credentials.
</div>

In [33]:
# Requires GenePattern Notebook: pip install genepattern-notebook
import gp
import genepattern

# Username and password removed for security reasons.
genepattern.display(genepattern.session.register("https://cloud.genepattern.org/gp", "", ""))

GPAuthWidget()

# Data Input: Human Cell Atlas

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>
    
To kick off our Seurat Pipeline, we select scRNA-seq data from the **`Human Cell Atlas`**. 
    
Once we have selected some scRNA-seq datasets for our analysis, we can provide each individual scRNA-seq dataset as input to the `Seurat.QC` module via the `hca url` parameter.
    
Below, please choose a project of your liking from the Human Cell Atlas:

</div>

<div class="well well-sm">  
    
Gist of the papers/data: Development of a software technique to analyze single cells from a progressive multiple myeloma (MM) patient to identify major genetic subclones that exhibit distinct transcriptional signatures relevant to cancer progression.

Further reading: 
- https://data.humancellatlas.org/explore/projects/0c3b7785-f74d-4091-8616-a68757e4c2a8
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6071640/
- https://clincancerres.aacrjournals.org/content/26/4/935

</div>

In [54]:
import requests
import os
from tqdm.notebook import tqdm
import pandas as pd
import numpy as np
import nbtools
from nbtools import UIBuilder, UIOutput
import IPython
                
def iterate_matrices_tree(tree, keys=()):
    if isinstance(tree, dict):
        for k, v in tree.items():
            yield from iterate_matrices_tree(v, keys=(*keys, k))
    elif isinstance(tree, list):
        for file in tree:
            yield keys, file
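    else:
        # Fail loudly (even when assertions are disabled) on an unexpected node type
        raise TypeError(f"Unexpected node type in matrices tree: {type(tree).__name__}")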


def get_hca_data(project):
    '''
    Gets data for a given project from HCA using API
    '''
    title_list = [] # project titles
    size_list = [] # file sizes for each project file
    name_list = [] # file names for each project file
    source_list = [] # file source for each project file
    uuid_list = [] # unique identification for each project
    url_list = [] # url to download each project file

    for i in range(len(project['hits'])):

        current_project = project['hits'][i]['projects'][0]

        if 'matrices' in current_project.keys():
            
            # initialize sublists to hold each project's data
            size_sublist = []
            name_sublist = []
            source_sublist = []
            uuid_sublist = []
            url_sublist = []

            for path, file_info in iterate_matrices_tree(current_project['matrices']):

                if file_info['url'] != "": # ensure the file url is not empty

                    size = file_info['size']
                    name = file_info['name']
                    source = file_info['source']
                    uuid = file_info['uuid']
                    url = file_info['url']

                    size_sublist.append(size)
                    name_sublist.append(name)
                    source_sublist.append(source)
                    uuid_sublist.append(uuid)
                    url_sublist.append(url)

            if name_sublist != []: # ensure there is >= 1 file name for a project
                title_list.append(current_project['projectTitle'])
                size_list.append(size_sublist)
                name_list.append(name_sublist)
                source_list.append(source_sublist)
                uuid_list.append(uuid_sublist)
                url_list.append(url_sublist)
    
    return title_list, size_list, name_list, source_list, uuid_list, url_list


def choose_hca_catalog(catalog_version):
    """Choose HCA Catalog Version and display info"""
    catalog = catalog_version
    endpoint_url = 'https://service.azul.data.humancellatlas.org/index/projects'

    # get initial response to query size of project list
    init_response = requests.get(endpoint_url, params={'catalog': catalog})
    init_response.raise_for_status()
    init_response_json = init_response.json()
    init_project = init_response_json
    num_projects = init_project['pagination']['total']

    # get response with size = num projects to ensure all projects are displayed
    response = requests.get(endpoint_url, params={'catalog': catalog, 'size': num_projects})
    response.raise_for_status()

    response_json = response.json()
    project = response_json
    
    titles, sizes, names, sources, uuids, urls = get_hca_data(project) # get all project data from HCA via API
    
    # return dictionary for use with the UIBuilder Cell
    return dict(zip(titles, titles)), dict(zip(titles, urls)), dict(zip(titles, names)), dict(zip(titles, sizes)), dict(zip(titles, sources))

# The catalog name selects which HCA DCP v2.0 catalog to query (here, "dcp11")
hca_titles_titles, hca_titles_urls, hca_titles_names, hca_titles_sizes, hca_titles_sources = choose_hca_catalog("dcp11")


@nbtools.build_ui(
    
    name = "Choose Human Cell Atlas Project",
    
    description = """ \
                    The following cell will allow you to select a Human Cell Atlas project \
                    from the v2.0 Data Cloud Portal. \
                    \
                    To proceed, simply select a project from the drop-down\
                    menu, and after selecting your project, the cell will then display all download URLs \
                    for the selected project's matrices. \
                  """,
    
    parameters =
    {
        "project_title":
        {
            "name": "Project Title",
            "type": "choice",
            "choices": hca_titles_titles
        },
        
        "output_var":
        {
            "hide": True
        }
    }
)


def choose_hca_project(project_title):
    
    print(f"SELECTED PROJECT TITLE: '{project_title}'\n")
    print("THE DOWNLOAD URLS FOR YOUR PROJECT'S MATRICES ARE BELOW:\n")
    
    file_urls = set()
    
    for key, url in enumerate(hca_titles_urls[project_title]):
        name = hca_titles_names[project_title][key]
        print("FILE NAME:", name)
        print("FILE URL:", url)
        
        file_urls.add(url) # a set keeps each download URL only once
        
        file_size = round((hca_titles_sizes[project_title][key] / (1024 * 1024)), 2)

        if file_size < 1024:
            print(f"FILE SIZE: {file_size} MB")
        else:
            print(f"FILE SIZE: {round(file_size / 1024, 2)} GB")
        
        #print("SUGGESTED JOB MEMORY FOR SEURAT.QC:", suggested_memory)
        
        print("FILE SOURCE:", hca_titles_sources[project_title][key])
        print()

    return UIOutput(name="Your HCA Files are now available!",
                    description="The file urls for your HCA project are shown below:",
                    files=list(file_urls),
                    status="File URLs:"
                   )

UIBuilder(description="                     The following cell will allow you to select a Human Cell Atlas pro…
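
The least obvious part of the cell above is how `iterate_matrices_tree` walks the nested `matrices` entry of a project. The sketch below runs a copy of that function (with an explicit error for unexpected node types) on a small made-up tree. The keys and nesting depth in `toy_matrices` are assumptions chosen for illustration; the real HCA response may be organized differently.

```python
def iterate_matrices_tree(tree, keys=()):
    """Yield (path, file_info) for every file in a nested dict of lists."""
    if isinstance(tree, dict):
        for k, v in tree.items():
            yield from iterate_matrices_tree(v, keys=(*keys, k))
    elif isinstance(tree, list):
        for file in tree:
            yield keys, file
    else:
        raise TypeError(f"Unexpected node type: {type(tree).__name__}")


# Made-up tree; the real HCA response may use different keys and depth
toy_matrices = {
    'genusSpecies': {
        'Homo sapiens': {
            'organ': {
                'blood': [
                    {'name': 'a.loom', 'size': 2048, 'url': 'https://example.org/a.loom'},
                    {'name': 'b.loom', 'size': 4096, 'url': ''},
                ],
            },
        },
    },
}

found = list(iterate_matrices_tree(toy_matrices))
paths = [path for path, _ in found]

# Same filter as get_hca_data(): skip files without a download URL
names_with_url = [info['name'] for _, info in found if info['url'] != '']

print(paths[0])
print(names_with_url)
```

Each yielded pair carries the tuple of dictionary keys that led to the file, so the caller can ignore the path (as `get_hca_data` does) or use it to group files by species or organ.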

# QC

In [53]:
from nbtools import UIBuilder, UIOutput
UIOutput(name="Temp file for demo mode",
                    description="Use this file for the parameter 'input file' in Seurat.QC below to activate demo mode:",
                    files=['demo_mode.py.zip'],
                   )

UIOutput(description="Use this file for the parameter 'input file' in Seurat.QC below to activate demo mode:",…

<div class="alert alert-info">
<p class="lead"> Loading your HCA Dataset into Seurat.QC in demo mode <i class="fa fa-info-circle"></i></p>  
    
- For the **input file** parameter, select `demo_mode.py.zip` from the dropdown menu
- For the **hca url** parameter, type `MM` 
    - **Note**: Normally you would drag and drop the URL created by the UI Builder cell above, but for demo mode, we are using this dummy parameter.
- Under the `QC Plots` section of the parameters, change the **file name*** parameter to `MultipleMyeloma`
</div>

In [52]:
seurat_qc_task = gp.GPTask(genepattern.session.get(0), 'urn:lsid:genepattern.org:module.analysis:00416')
seurat_qc_job_spec = seurat_qc_task.make_job_spec()
seurat_qc_job_spec.set_parameter("input_file", "")
seurat_qc_job_spec.set_parameter("hca_url", "")
seurat_qc_job_spec.set_parameter("column_name", "percent.mt")
seurat_qc_job_spec.set_parameter("pattern", "MT-")
seurat_qc_job_spec.set_parameter("file_name", "seurat_qcd_dataset")
seurat_qc_job_spec.set_parameter("first_feature", "nFeature_RNA")
seurat_qc_job_spec.set_parameter("second_feature", "nCount_RNA")
seurat_qc_job_spec.set_parameter("third_feature", "percent.mt")
seurat_qc_job_spec.set_parameter("export_txt", "False")
seurat_qc_job_spec.set_parameter("job.memory", "2 Gb")
seurat_qc_job_spec.set_parameter("job.walltime", "02:00:00")
seurat_qc_job_spec.set_parameter("job.cpuCount", "1")
genepattern.display(seurat_qc_task)


GPTaskWidget(lsid='urn:lsid:genepattern.org:module.analysis:00416')

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>
    
- Click on the dropdown menu for the **PDF to display*** parameter and choose `MultipleMyeloma.pdf`
</div>

In [51]:
import os
import requests
import genepattern
from IPython.display import IFrame
@genepattern.build_ui(name="Display PDF", parameters={
    "image": {
        "name": "PDF to display:",
        "description": "PDF file (typically named Rplots.pdf) from the Seurat.QC module",
        "type": "file",
        "kinds": ["pdf"]
    },
    "height":{"default":850, "hide":True},
    "width":{"default":850, "hide":True},
    "output_var": {
        "default":"output_var",
        "hide": True
    }
})
def displayPdf(image, height, width):
    job_widget = nbtools.UIOutput(status="Getting file from the GenePattern server...")
    display(job_widget)
    f = gp.GPFile(genepattern.session.get(0), image)
    basename=os.path.basename(image)
    resp = requests.get(image, headers={
        'Authorization': f.server_data.authorization_header(), 
        'User-Agent': 'GenePatternRest'})
    job_widget.status = 'Writing pdf file to your workspace. This may take a minute.'
    with open(basename, "wb") as f:
        f.write(resp.content)
    
    job_widget.status = basename+' successfully written to the same folder as this notebook!'
    display(IFrame(basename,width, height))
    return

UIBuilder(function_import='nbtools.tool(id="Display PDF", origin="Notebook").function_or_method', name='Displa…

<div class="well well-sm">  
    
- nFeature_RNA: Number of unique genes detected per cell
- nCount_RNA: Total number of counts (reads or UMIs) detected per cell
- percent.mt: Percentage of counts that come from mitochondrial genes
</div>
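
As a worked example of the three metrics listed above, the sketch below computes them with `pandas` on a tiny made-up count matrix. This is not the module's own code (Seurat.QC runs in R). It assumes that mitochondrial genes are the ones whose names start with `MT-`, which mirrors the `pattern` value shown in the Seurat.QC cell.

```python
import pandas as pd

# Made-up count matrix: rows are genes, columns are cells
counts = pd.DataFrame(
    [[5, 0, 2],
     [0, 3, 1],
     [10, 0, 0],
     [5, 1, 1]],
    index=['CD14', 'MS4A1', 'MT-CO1', 'MT-ND1'],
    columns=['cell_1', 'cell_2', 'cell_3'],
)

# nFeature_RNA: number of genes with at least one count in each cell
n_feature = (counts > 0).sum(axis=0)

# nCount_RNA: total counts in each cell
n_count = counts.sum(axis=0)

# percent.mt: share of counts from genes whose name starts with "MT-"
is_mito = counts.index.str.startswith('MT-')
percent_mt = 100 * counts.loc[is_mito].sum(axis=0) / n_count

qc = pd.DataFrame({
    'nFeature_RNA': n_feature,
    'nCount_RNA': n_count,
    'percent.mt': percent_mt,
})
print(qc)
```

In this toy matrix, `cell_1` has 15 of its 20 counts in mitochondrial genes, so its `percent.mt` is 75, the kind of value that usually points to a stressed or dying cell.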

# Preprocessing

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>  
    
Select the following parameters:  
- **input rds***: Select `MultipleMyeloma.rds` from the dropdown menu
- **file name***: Type `MultipleMyeloma.Preprocessed_1`
- **min n features***: Type `0` -- This corresponds to the minimum number of genes per cell
- **max n features***: Type `6000` -- This corresponds to the maximum number of genes per cell
- **max percent mitochondrial***: Type `80` -- This corresponds to the maximum percent of counts associated with mitochondrial genes per cell
</div>

In [50]:
seurat_preprocessing_task = gp.GPTask(genepattern.session.get(0), 'urn:lsid:genepattern.org:module.analysis:00415')
seurat_preprocessing_job_spec = seurat_preprocessing_task.make_job_spec()
seurat_preprocessing_job_spec.set_parameter("input_rds", "")
seurat_preprocessing_job_spec.set_parameter("file_name", "seurat_preprocessed_dataset")
seurat_preprocessing_job_spec.set_parameter("min_n_features", "100")
seurat_preprocessing_job_spec.set_parameter("max_n_features", "20000")
seurat_preprocessing_job_spec.set_parameter("max_percent_mitochondrial", "5")
seurat_preprocessing_job_spec.set_parameter("norm_method", "LogNormalize")
seurat_preprocessing_job_spec.set_parameter("scale_factor", "10000")
seurat_preprocessing_job_spec.set_parameter("feat_sel_method", "vst")
seurat_preprocessing_job_spec.set_parameter("num_features", "2000")
seurat_preprocessing_job_spec.set_parameter("num_to_label", "10")
seurat_preprocessing_job_spec.set_parameter("vdl_num_dims", "2")
seurat_preprocessing_job_spec.set_parameter("vdhm_num_dims", "15")
seurat_preprocessing_job_spec.set_parameter("cells", "500")
seurat_preprocessing_job_spec.set_parameter("keep_scale_data", "TRUE")
seurat_preprocessing_job_spec.set_parameter("numpcs", "50")
seurat_preprocessing_job_spec.set_parameter("job.memory", "2 Gb")
seurat_preprocessing_job_spec.set_parameter("job.walltime", "02:00:00")
seurat_preprocessing_job_spec.set_parameter("job.cpuCount", "1")
genepattern.display(seurat_preprocessing_task)

GPTaskWidget(lsid='urn:lsid:genepattern.org:module.analysis:00415')
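
To see what the three filtering parameters do, the sketch below applies them to a small made-up table of per-cell QC metrics. It is only an illustration in `pandas`: `filter_cells` is a hypothetical helper, and the use of strict inequalities is an assumption, since the module's exact comparison is not shown in this notebook.

```python
import pandas as pd

# Made-up per-cell QC metrics
qc = pd.DataFrame(
    {
        'nFeature_RNA': [50, 1200, 7000, 3000],
        'percent.mt': [2.0, 4.0, 3.0, 90.0],
    },
    index=['cell_1', 'cell_2', 'cell_3', 'cell_4'],
)


def filter_cells(qc, min_n_features, max_n_features, max_percent_mitochondrial):
    """Keep cells strictly inside the feature range and below the mito cutoff."""
    keep = (
        (qc['nFeature_RNA'] > min_n_features)
        & (qc['nFeature_RNA'] < max_n_features)
        & (qc['percent.mt'] < max_percent_mitochondrial)
    )
    return qc[keep]


# Values typed in the instructions vs. the defaults listed in the cell
lenient = filter_cells(qc, 0, 6000, 80)
default = filter_cells(qc, 100, 20000, 5)
print(list(lenient.index))
print(list(default.index))
```

The two parameter sets keep different cells: the values from the instructions drop the cell with 7000 genes, while the defaults drop the cell with only 50 genes, and both drop the cell with 90% mitochondrial counts.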

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>
    
- Click on the dropdown menu for the **PDF to display*** parameter and choose `MultipleMyeloma.Preprocessed_1.pdf`
</div>

In [48]:
import os
import requests
import genepattern
from IPython.display import IFrame
@genepattern.build_ui(name="Display PDF", parameters={
    "image": {
        "name": "PDF to display:",
        "description": "PDF file (typically named Rplots.pdf) from the Seurat.Preprocessing module",
        "type": "file",
        "kinds": ["pdf"]
    },
    "height":{"default":850, "hide":True},
    "width":{"default":850, "hide":True},
    "output_var": {
        "default":"output_var",
        "hide": True
    }
})
def displayPdf(image, height, width):
    job_widget = nbtools.UIOutput(status="Getting file from the GenePattern server...")
    display(job_widget)
    f = gp.GPFile(genepattern.session.get(0), image)
    basename=os.path.basename(image)
    resp = requests.get(image, headers={
        'Authorization': f.server_data.authorization_header(), 
        'User-Agent': 'GenePatternRest'})
    job_widget.status = 'Writing pdf file to your workspace. This may take a minute.'
    with open(basename, "wb") as f:
        f.write(resp.content)
    
    job_widget.status = basename+' successfully written to the same folder as this notebook!'
    display(IFrame(basename,width, height))
    return

UIBuilder(function_import='nbtools.tool(id="Display PDF", origin="Notebook").function_or_method', name='Displa…

<div class="well well-sm">
Sometimes, you'll want to try two or three different parameter options at the same time. Let's submit a second job with different parameters now.
</div>

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>  
    
- Insert a new GenePattern cell below    
- Select Seurat.Preprocessing from the list of available modules
- Select the same .rds file for the **input rds*** parameter (i.e., MultipleMyeloma.rds) from the dropdown menu
- Make sure to give it a helpful name by changing the value of the **file name*** parameter
- Change the filtering parameters (i.e., `min n features`, `max n features`, `max percent mitochondrial`) so that your preprocessing is more stringent
- Under `Normalization & Dimension reduction`, set a larger number of principal components to be computed
- Submit a job     
    + **Note:** when prompted "You are running in demo mode... would you like to submit job anyway?", click "OK"

</div>
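
When choosing parameters for a second run, it can help to know what the normalization defaults in the Seurat.Preprocessing cell do. The sketch below shows the usual definition of log-normalization: each cell's counts are divided by that cell's total, multiplied by the scale factor, and transformed with log(1 + x). It is written in NumPy for illustration and is not the module's own code; `log_normalize` is a hypothetical helper and the count matrix is made up.

```python
import numpy as np

# Made-up count matrix: rows are genes, columns are cells
counts = np.array([
    [1.0, 0.0],
    [3.0, 10.0],
    [0.0, 10.0],
])


def log_normalize(counts, scale_factor=10000):
    """Scale each cell to `scale_factor` total counts, then apply log(1 + x)."""
    totals = counts.sum(axis=0, keepdims=True)  # total counts per cell
    return np.log1p(counts / totals * scale_factor)


normalized = log_normalize(counts)
print(normalized.round(3))
```

Because every cell is rescaled to the same total before the log transform, cells with very different sequencing depth become comparable, and zero counts stay at zero.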

# Clustering

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>
  
Select the following parameters:  
- **input seurat rds file***: Select `MultipleMyeloma.Preprocessed_1.rds` from the dropdown menu
- **output filename***: Type `MultipleMyeloma.clustered_1`
</div>

In [47]:
seurat_clustering_task = gp.GPTask(genepattern.session.get(0), 'urn:lsid:broad.mit.edu:cancer.software.genepattern.module.analysis:00408')
seurat_clustering_job_spec = seurat_clustering_task.make_job_spec()
seurat_clustering_job_spec.set_parameter("input.seurat.rds.file", "")
seurat_clustering_job_spec.set_parameter("output.filename", "<input.seurat.rds.file_basename>.clustered")
seurat_clustering_job_spec.set_parameter("maximum_dimension", "10")
seurat_clustering_job_spec.set_parameter("resolution", "0.05")
seurat_clustering_job_spec.set_parameter("reduction", "umap")
seurat_clustering_job_spec.set_parameter("nmarkers", "50")
seurat_clustering_job_spec.set_parameter("seed", "17")
seurat_clustering_job_spec.set_parameter("job.memory", "2 Gb")
seurat_clustering_job_spec.set_parameter("job.walltime", "02:00:00")
seurat_clustering_job_spec.set_parameter("job.cpuCount", "1")
genepattern.display(seurat_clustering_task)

GPTaskWidget(lsid='urn:lsid:broad.mit.edu:cancer.software.genepattern.module.analysis:00408')

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>
        
- Click on the dropdown menu for the **PDF to display*** parameter and choose `MultipleMyeloma.clustered_1.pdf`

</div>

In [46]:
import os
import requests
import genepattern
from IPython.display import IFrame
@genepattern.build_ui(name="Display PDF", parameters={
    "image": {
        "name": "PDF to display:",
        "description": "PDF file (typically named Rplots.pdf) from the Seurat.Clustering module",
        "type": "file",
        "kinds": ["pdf"]
    },
    "height":{"default":850, "hide":True},
    "width":{"default":850, "hide":True},
    "output_var": {
        "default":"output_var",
        "hide": True
    }
})
def displayPdf(image, height, width):
    job_widget = nbtools.UIOutput(status="Getting file from the GenePattern server...")
    display(job_widget)
    f = gp.GPFile(genepattern.session.get(0), image)
    basename=os.path.basename(image)
    resp = requests.get(image, headers={
        'Authorization': f.server_data.authorization_header(), 
        'User-Agent': 'GenePatternRest'})
    job_widget.status = 'Writing pdf file to your workspace. This may take a minute.'
    with open(basename, "wb") as f:
        f.write(resp.content)
    
    job_widget.status = basename+' successfully written to the same folder as this notebook!'
    display(IFrame(basename,width, height))
    return

UIBuilder(function_import='nbtools.tool(id="Display PDF", origin="Notebook").function_or_method', name='Displa…

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>  
    
- Insert a new GenePattern cell below    
- Select Seurat.Clustering from the list of available modules
- For the **input seurat rds file*** parameter, select the `.rds` output from the `Seurat.Preprocessing` job you submitted earlier
- Submit a job with different parameters
    - **Note:** when prompted "You are running in demo mode... would you like to submit job anyway?", click "OK"
- Make sure to give it a helpful name by changing the value of the **output filename*** parameter, and make sure that you use the output from the Seurat.Preprocessing job that you submitted earlier
</div>

# Visualize Marker Expression

Here we visualize marker expression as violin plots and on a UMAP.

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>  
    
- For the parameter named **input file** click on the down arrow (<i class="fa fa-sort-down"></i>) and select the `MultipleMyeloma.clustered_1.rds` created by `Seurat.Clustering` 
- For the parameter named **genes**, type the list of genes/markers you would like to visualize. If you list more than one, separate them with a comma and a space, e.g., <code>CD14, LYZ, CCR7, IL7R, S100A4, MS4A1, CD8A, CD19, CD38</code>
- For the **output file name*** parameter, type `MultipleMyeloma.Markers_1`
</div>

In [38]:
seurat_visualizemarkerexpression_task = gp.GPTask(genepattern.session.get(0), 'urn:lsid:genepattern.org:module.analysis:00421')
seurat_visualizemarkerexpression_job_spec = seurat_visualizemarkerexpression_task.make_job_spec()
seurat_visualizemarkerexpression_job_spec.set_parameter("input_file", "")
seurat_visualizemarkerexpression_job_spec.set_parameter("genes", "")
seurat_visualizemarkerexpression_job_spec.set_parameter("group_plots", "Horizontally")
seurat_visualizemarkerexpression_job_spec.set_parameter("output_file_name", "")
seurat_visualizemarkerexpression_job_spec.set_parameter("job.memory", "2 Gb")
seurat_visualizemarkerexpression_job_spec.set_parameter("job.walltime", "02:00:00")
seurat_visualizemarkerexpression_job_spec.set_parameter("job.cpuCount", "2")
genepattern.display(seurat_visualizemarkerexpression_task)

GPTaskWidget(lsid='urn:lsid:genepattern.org:module.analysis:00421')
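
The **genes** parameter is a single string in which markers are separated by a comma and a space. If you keep your markers in a Python list, a small helper like the one below can check the list and rebuild the string before you paste it into the cell. `parse_genes` is a hypothetical helper written for this example; how the module itself parses the string is not shown in this notebook.

```python
def parse_genes(genes: str) -> list:
    """Split a comma-separated gene string and strip surrounding whitespace."""
    return [g.strip() for g in genes.split(',') if g.strip()]


genes = parse_genes('CD14, LYZ, CCR7, IL7R, S100A4, MS4A1, CD8A, CD19, CD38')
print(len(genes), genes[:3])

# Rebuild the string in the "comma and a space" form the instructions ask for
genes_param = ', '.join(genes)
print(genes_param)
```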

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>  
    
- For the `PDF to display*` parameter, choose `MultipleMyeloma.Markers_1.pdf`
- Click <b>Run</b>
</div>

In [32]:
import os
import requests
import genepattern
from IPython.display import IFrame
@genepattern.build_ui(name="Display PDF", parameters={
    "image": {
        "name": "PDF to display:",
        "description": "PDF file (typically named Rplots.pdf) from the Seurat.VisualizeMarkerExpression module",
        "type": "file",
        "kinds": ["pdf"]
    },
    "height":{"default":850, "hide":True},
    "width":{"default":850, "hide":True},
    "output_var": {
        "default":"output_var",
        "hide": True
    }
})
def displayPdf(image, height, width):
    job_widget = nbtools.UIOutput(status="Getting file from the GenePattern server...")
    display(job_widget)
    f = gp.GPFile(genepattern.session.get(0), image)
    basename=os.path.basename(image)
    resp = requests.get(image, headers={
        'Authorization': f.server_data.authorization_header(), 
        'User-Agent': 'GenePatternRest'})
    job_widget.status = 'Writing pdf file to your workspace. This may take a minute.'
    with open(basename, "wb") as f:
        f.write(resp.content)
    
    job_widget.status = basename+' successfully written to the same folder as this notebook!'
    display(IFrame(basename,width, height))
    return

UIBuilder(function_import='nbtools.tool(id="Display PDF", origin="Notebook").function_or_method', name='Displa…

<div class="well well-sm">
<p class="lead"> Results interpretation <i class="fa fa-info-circle"></i></p>   
    
- Markers of T-Cells (two kinds)
- Markers of B-Cells

</div>

<div class="alert alert-info">
<p class="lead"> Instructions <i class="fa fa-info-circle"></i></p>  
    
- Insert a new GenePattern cell below    
- Select Seurat.VisualizeMarkerExpression from the list of available modules
- For the parameter named **input file**, click on the down arrow (<i class="fa fa-sort-down"></i>) and select the `.rds` output from the `Seurat.Clustering` job you ran above.
- Submit a job with the same genes to be visualized (e.g., <code>CD14, LYZ, CCR7, IL7R, S100A4, MS4A1, CD8A, CD19, CD38</code>)
- Make sure to give it a helpful name by changing the value of the **output file name*** parameter and that you select the output of the Seurat.Clustering job you submitted earlier 
    + **Note:** when prompted "You are running in demo mode... would you like to submit job anyway?", click "OK"
- While your job is running/pending, insert a new Display PDF UIBuilder cell below:

    + Select the cell below
    + Click on `Cell` from the menu above, then `Cell Type`, then `GenePattern`
    + In the catalogue that pops out, search for `Display PDF` and click on one of the available UIBuilder cells
- Display the output of the `Seurat.VisualizeMarkerExpression` job you ran
</div>

# References




1. Satija, R., Farrell, J., Gennert, D. et al. Spatial reconstruction of single-cell gene expression data. Nat Biotechnol 33, 495–502 (2015). https://doi.org/10.1038/nbt.3192

2. Hao, Yuhan, Stephanie Hao, Erica Andersen-Nissen, William M. Mauck III, Shiwei Zheng, Andrew Butler, Maddie J. Lee, et al. “Integrated Analysis of Multimodal Single-Cell Data.” Cell 184, no. 13 (June 2021): 3573-3587.e29. https://doi.org/10.1016/j.cell.2021.04.048.

3. HCA: https://data.humancellatlas.org/explore/projects/0c3b7785-f74d-4091-8616-a68757e4c2a8

4. Fan, Jean et al. “Linking transcriptional and genetic tumor heterogeneity through allele analysis of single-cell RNA-seq data.” Genome research vol. 28,8 (2018): 1217-1227. doi:10.1101/gr.228080.117 -- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6071640/

5. Ryu, Daeun, Seok Jin Kim, Yourae Hong, Areum Jo, Nayoung Kim, Hee-Jin Kim, Hae-Ock Lee, Kihyun Kim, and Woong-Yang Park. “Alterations in the Transcriptional Programs of Myeloma Cells and the Microenvironment during Extramedullary Progression Affect Proliferation and Immune Evasion.” Clin Cancer Res 26, no. 4 (February 15, 2020): 935-944. doi:10.1158/1078-0432.CCR-19-0694 -- https://clincancerres.aacrjournals.org/content/26/4/935.long#sec-6
