# Developer Jupyter Reviewer Tutorial

This tutorial is for developers who want to make a Reviewer object from scratch.

If you are looking to use a pre-made Reviewer, refer to the Quick Start Jupyter Reviewer Tutorial (TBD).


# Introduction


There are 3 parts you need to generate to make a standard Jupyter Reviewer:

**1. ReviewData Object** A ReviewData object consists of 3 pandas dataframes:
1. data: a dataframe with the data that a user wants to review, row by row (e.g. samples, participants, mutations, etc.)
2. annot: annotations that a user wants to write for each row (e.g. notes, flags, etc.)
3. history: a timeline of changes a user makes to the annot table
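
To make the three tables concrete, here is a minimal pandas sketch. The column names, the `update_annotation` helper, and the history layout are illustrative assumptions, not the actual `ReviewData` internals.

```python
import pandas as pd

# data: one row per item to review (here, samples), indexed by item id
data = pd.DataFrame(
    {"purity_estimate": [0.42, 0.77], "maf_fn": ["s1.maf", "s2.maf"]},
    index=["sample_1", "sample_2"],
)

# annot: same index as data, one column per annotation (initially empty)
annot = pd.DataFrame(index=data.index, columns=["purity", "notes"], dtype=object)

# history: a log of every change made to annot, in the order it happened
history_records = []


def update_annotation(annot, history_records, idx, **values):
    """Hypothetical helper: write values to annot and log the change."""
    for col, val in values.items():
        annot.loc[idx, col] = val
    history_records.append({"index": idx, "timestamp": pd.Timestamp.now(), **values})


update_annotation(annot, history_records, "sample_1", purity=0.4, notes="looks clean")
update_annotation(annot, history_records, "sample_1", purity=0.45)

history = pd.DataFrame(history_records)
```

After the two updates, `annot` holds only the latest values for `sample_1`, while `history` keeps one row per change.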

**2. ReviewDataApp** A ReviewDataApp is a plotly.dash application that displays the data in a ReviewData object for review. It already includes prebuilt functionality for a user to add annotations and view the history of the ReviewData.

As a developer, you will define custom dash components you want to display for the type of review you are implementing (purity, mutation, etc.). This includes tables, graphs, or other components of interest. plotly.dash also enables interactivity, so you can define special functions to allow for interactive viewing of charts and graphs, and to auto-calculate values you may want to use for your annotations (more on autofill below).

When a ReviewDataApp is passed a ReviewData object, it will read the ReviewData object to render your components.

**3. Autofill Dictionary** Autofill allows you to connect outputs of the ReviewDataApp to the annotations for the ReviewData object. At runtime, it adds buttons to the top annotation panel. When a user clicks one, the current values of the selected components in the dash app are copied into the specified annotation inputs in the annotation panel.

The following tutorial will walk through each of these steps in more detail.

## What a user sees

A user using your custom Reviewer will have a notebook that looks like the following:


In [None]:
import YourReviewer

# instantiate Your reviewer
reviewer = YourReviewer()

# set the review object
reviewer.set_review_data()

# set the review app
reviewer.set_review_app()

# run the app
reviewer.run()


# The basic structure

To build your custom reviewer, you will need to do the following:

1. Create a new class that inherits `ReviewerTemplate`
1. Define 3 abstract methods:
    1. gen_review_data()
    2. gen_review_app()
    3. gen_autofill()
    
    
Your file will look something like the following:
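
A minimal skeleton is sketched here. So that the snippet runs on its own, it defines a stand-in `ReviewerTemplate` with `abc`; in a real file you would instead import it with `from JupyterReviewer.ReviewerTemplate import ReviewerTemplate`. The argument lists are placeholders, not the required signatures.

```python
from abc import ABC, abstractmethod


class ReviewerTemplate(ABC):
    """Stand-in for JupyterReviewer's ReviewerTemplate (assumption for this sketch)."""

    @abstractmethod
    def gen_review_data(self, *args, **kwargs): ...

    @abstractmethod
    def gen_review_app(self, *args, **kwargs): ...

    @abstractmethod
    def gen_autofill(self): ...


class YourReviewer(ReviewerTemplate):
    def gen_review_data(self, review_data_fn, description="", df=None, **kwargs):
        # Preprocess df and define default annotations here,
        # then build and return a ReviewData object.
        raise NotImplementedError

    def gen_review_app(self, **kwargs):
        # Instantiate a ReviewDataApp, add AppComponents, and return it.
        raise NotImplementedError

    def gen_autofill(self):
        # Call self.add_autofill(...) for each component you want to link.
        raise NotImplementedError


reviewer = YourReviewer()  # works because all three abstract methods are defined
```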


## gen_review_data()

To create a `ReviewData` object, the following parameters must be defined:
- `review_data_fn`: A file path to save the object as a pkl file.
- `description`: A string describing the ReviewData object's source of data and purpose
- `df`: pandas dataframe containing the data to review. Each row corresponds to a single item to be annotated.
- `review_data_annotation_dict`: A dictionary of `ReviewDataAnnotation` objects where you can define what annotations you want for your review (name with the key). This dictionary will be used to render the annotation inputs in the `ReviewDataApp`.
- `reuse_existing_review_data_fn`: You can reuse the annotations of a previous `ReviewData` object.

For your custom reviewer, you may have specific plots or calculations you want to make, and known/common annotations that someone reviewing the data should use. You define these special features for your type of reviewer in `gen_review_data(self, ...)`. The main features would be:
1. Preprocessing the input `df` dataframe, such as precomputing data/graphs and adding columns you may need for your ReviewDataApp. 
2. `review_data_annotation_dict`: a str: `ReviewDataAnnotation` dictionary. The key string will be the column name in the review data object's annotation table. `ReviewDataAnnotation` consists of:
    1. `type`: view `ReviewData.AnnotationType` Enum
    1. `options`: a list of valid values (for checklist and radioitems)
    1. `validate_input`: a named, non-local function (cannot be a lambda function) that takes a single parameter and returns True/False
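
The restriction on `validate_input` comes from pickling: a function is pickled by name, so it has to be findable at module level. The sketch below shows the three cases; `is_named_module_level` is a hypothetical helper for illustration and is not part of JupyterReviewer.

```python
def valid_purity(x):
    """Module-level validator: purity must lie in [0, 1)."""
    return 0 <= x < 1


def make_nested_validator():
    def nested(x):  # defined inside another function, so it is "local"
        return x > 0
    return nested


def is_named_module_level(func):
    """True if func can be located by its qualified name, which pickle relies on."""
    qualname = func.__qualname__
    return "<lambda>" not in qualname and "<locals>" not in qualname


print(is_named_module_level(valid_purity))             # suitable for validate_input
print(is_named_module_level(lambda x: x < 1))          # lambda: not suitable
print(is_named_module_level(make_nested_validator()))  # local function: not suitable
```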

Once you have done any preprocessing and defined any default annotations, you can create and return the `ReviewData` object. 

## gen_review_app()

This creates a dash application where you can define what components to include. The `ReviewDataApp` already has built-in components to handle iterating through the items in any `ReviewData` object, rendering the annotation inputs defined by the `ReviewData` object's `review_data_annotation_list`, and displaying the history.

**What is Dash?**

Dash is a library that makes it easy to generate custom dashboards in python. I recommend reviewing the [Dash Tutorial](https://dash.plotly.com/installation) first before proceeding. 

In short, to create a dash app, you define:
1. Layout: how you want the app to look
2. Callbacks: functions to define interactivity with the components in your layout

The `ReviewDataApp` is built so that you can add components and interactivity without having to deal directly with some of the idiosyncrasies of the package.


To create your custom app, you first instantiate a `ReviewDataApp`. Then you add a series of `AppComponent`'s.

In [None]:
app = ReviewDataApp()
app.add_component(AppComponent(...), ...)
app.add_component(AppComponent(...), ...)

**AppComponent()**
To create an `AppComponent`, you will specify:
- **`name`**: A string naming the particular component
- **`layout`**: Using plotly dash's html and bootstrap libraries, define what your component will look like (Divs, Graphs, Tables, etc.)
- **`callback_output`**: A list of `Output()`'s. The first argument is the id of the subcomponent in your layout, and the second argument is what attribute of that component to update with your callback functions
- **`callback_input`**: A list of `Input()`'s. The arguments are similar to `Output()`. If these subcomponents' attributes change, it will run your `internal_callback` function.
- **`callback_state`**: A list of `State()`'s. The arguments are similar to `Output()`. If the `internal_callback` function is triggered, the current values of these subcomponent attributes will be passed as parameters to the `internal_callback` function.
- **`new_data_callback`**: A function whose first two arguments are assumed to be (1) a `ReviewData` object's `data` table, and (2) an index value of the `ReviewData` object. The next parameters are defined IN ORDER of the `Input()`'s defined by the `callback_input` argument, followed by the `State()`'s defined by the `callback_state` argument. The output of this function is a **list** whose entries correspond IN ORDER to the `Output()`'s listed in `callback_output`. This function will be called whenever a user switches to a new item to review.
- **`internal_callback`**: A function with the EXACT same signature as `new_data_callback`. This function will be called whenever a user changes the attributes of subcomponents listed in `callback_input`.
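
The ordering rules can be sketched without dash. The component ids in the comments and the direct calls at the end are assumptions made for illustration; in the app, `ReviewDataApp` calls these functions for you.

```python
import pandas as pd

# Hypothetical review data: one row per sample
df = pd.DataFrame({"purity": [0.4, 0.8]}, index=["sample_1", "sample_2"])

# Suppose the component declares (ids are illustrative):
#   callback_output = [Output('title', 'children'), Output('slider', 'value')]
#   callback_input  = [Input('slider', 'value')]
#   callback_state  = [State('notes-box', 'value')]


def new_data_callback(df, idx, slider_value, notes_value):
    """Runs when the reviewer switches to item idx: reset every output."""
    purity = float(df.loc[idx, "purity"])
    return [f"{idx} (purity {purity})", purity]  # one entry per Output, in order


def internal_callback(df, idx, slider_value, notes_value):
    """Runs when the slider moves: same signature, same output order."""
    return [f"{idx} (slider at {slider_value})", slider_value]


# Mimic the two situations by calling the functions directly
on_switch = new_data_callback(df, "sample_2", 0.5, "")
on_slide = internal_callback(df, "sample_2", 0.3, "check again")
```

Both functions return a list of length two because `callback_output` lists two `Output()`'s.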


**Custom args for callback functions**
Sometimes your callback functions need parameters that are specific to your reviewer type, or defined by the user (e.g. pointing to a specific column name in the `ReviewData` object, specific parameters for displaying graphs, etc.). When adding a component to the app, you can also specify these arguments as keyword arguments.

```
premade_component = AppComponent(..., new_data_callback=lambda df, idx, y: df.loc[idx] + y, ...)

class PrebuiltReviewer(ReviewerTemplate):
    ...
    
    # Specific to reviewer type
    def gen_review_app(self):
        app = ReviewDataApp()
        app.add_component(premade_component,
                          y=10) 
        return app
        
    # OR define by the user
    def gen_review_app(self, y):
        app = ReviewDataApp()
        app.add_component(premade_component,
                          y=y) 
        return app
```
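
One way to picture how such keyword arguments reach the callback is `functools.partial`. This is only an illustration with a made-up dataframe; the tutorial does not show how `ReviewDataApp` forwards them internally.

```python
import functools

import pandas as pd


def new_data_callback(df, idx, y):
    # Same logic as the lambda above: add y to the selected row
    return df.loc[idx] + y


# Binding y=10 up front leaves a function that only needs (df, idx)
bound_callback = functools.partial(new_data_callback, y=10)

df = pd.DataFrame({"value": [1, 2]}, index=["a", "b"])
result = bound_callback(df, "a")
```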

**Premade components**

`ReviewDataApp` objects also include a function `add_table_from_path()` that creates a simple table by reading the file whose path is stored in a given column.


In [None]:
def gen_review_app(self, test_param) -> ReviewDataApp:
        app = ReviewDataApp()
        app.add_table_from_path(table_name='DFCI MAF file', 
                                component_name='maf-component-id', 
                                table_fn_col='DFCI_bucket_sample_dfci_maf_fn', 
                                table_cols=['Hugo_Symbol', 'Chromosome', 't_alt_count', 't_ref_count', 'Tumor_Sample_Barcode'])


        def gen_data_summary_table(df, idx, cols):
            r = df.loc[idx]
            return [[html.H1(f'{r.name} Data Summary'), dbc.Table.from_dataframe(r[cols].to_frame().reset_index())]]

        app.add_component(AppComponent(name='sample-info-component', 
                                       layout=html.Div(children=[html.H1('Data Summary'), 
                                                       dbc.Table.from_dataframe(df=pd.DataFrame())],
                                                       id='sample-info-component'
                                                      ), 
                                       callback_output=[Output('sample-info-component', 'children')],
                                       new_data_callback=gen_data_summary_table),
                               cols=['BETA_ploidy',
                                     'BETA_purity',
                                     'BETA_purity_lower',
                                     'BETA_purity_upper']
                              )

        def plot_interactive_graph(df: pd.DataFrame, idx: str, slider_value, test_param):
            x = np.arange(0, 1, 0.1)
            fig = go.Figure()
            fig.add_trace(go.Scatter(x=x, y=x))
            print(f'plot_interactive_graph: {test_param}')
            return [fig, 0.5]

        def interactive_graph_change_lines(df: pd.DataFrame, idx:str, slider_value, test_param):
            fig = plot_interactive_graph(df, idx, slider_value, test_param)[0] # cache?
            fig.add_vline(slider_value)
            print(f'interactive_graph_change_lines: {test_param}')
            return [fig, dash.no_update] # or just return the original file


        app.add_component(AppComponent('test-interactive-graph',
                                      html.Div(children=[dcc.Graph(figure={}, id='a-figure'), 
                                                dcc.Slider(0, 1, 0.1, value=0.5, 
                                                           id='a-slider'
                                                          )
                                               ]),
                                      new_data_callback=plot_interactive_graph,
                                      internal_callback=interactive_graph_change_lines,
                                      callback_output=[Output('a-figure', 'figure'), Output('a-slider', 'value')],
                                      callback_input=[Input('a-slider', 'value')],
                                      callback_states_for_autofill=[State('a-slider', 'value')] # see Autofill
                                   ),
                       test_param=test_param
                       )
        
        return app
    

## gen_autofill()

Sometimes you may have a lot of annotations, or one of your components produces a value that you want to use as an annotation. Typing these in manually can be tedious, so the `ReviewDataApp` has functionality to link the current state of your subcomponents in the app to the annotation input panel.

You can add these "links" with `self.add_autofill()`:
- component_name: the name of the component to read from.
- autofill_dict: a dictionary linking the annotation column name in the reviewdata object (key) to the State of something in the layout of the app, or a constant value.

Prereqs: Only States specified when creating components with `AppComponent()` with parameter `callback_states_for_autofill` can be used for autofilling
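
Conceptually, an autofill dictionary mixes references to component states with constants. The sketch below uses plain tuples as a stand-in for dash's `State(...)` objects, and `resolve_autofill` is a hypothetical function showing what happens when the autofill button is clicked; neither is the library's implementation.

```python
def state(component_id, prop):
    """Stand-in for dash's State(component_id, prop): a plain tuple."""
    return (component_id, prop)


autofill_dict = {
    "purity": state("a-slider", "value"),  # filled from the app's current state
    "class": "Option 1",                   # constant value
}


def resolve_autofill(autofill_dict, current_states):
    """Hypothetical: replace state references with their current values."""
    resolved = {}
    for annot_name, source in autofill_dict.items():
        if isinstance(source, tuple):
            resolved[annot_name] = current_states[source]
        else:
            resolved[annot_name] = source
    return resolved


# Pretend the slider currently sits at 0.7
current_states = {("a-slider", "value"): 0.7}
filled = resolve_autofill(autofill_dict, current_states)
```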


In [None]:
def gen_autofill(self):
    self.add_autofill('test-interactive-graph', {'purity': State('a-slider', 'value'), 
                                                 'class': 'Option 1'})

# Full example

In [1]:
%load_ext autoreload
%autoreload 2

In [14]:
import pandas as pd
import numpy as np
import functools
import time
import os

In [15]:
from JupyterReviewer.ReviewData import ReviewData, ReviewDataAnnotation
from JupyterReviewer.ReviewDataApp import ReviewDataApp, AppComponent
from JupyterReviewer.ReviewerTemplate import ReviewerTemplate



import plotly.express as px
from plotly.subplots import make_subplots
from jupyter_dash import JupyterDash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output, State
from dash.exceptions import PreventUpdate
from dash import Dash, dash_table
import dash
import dash_bootstrap_components as dbc
import functools
import plotly.graph_objects as go

# For pickling to work, the validation function must be defined explicitly (not as a lambda)
def validate_purity(x):
    return x < 0.5


class PrebuiltReviewer(ReviewerTemplate):
    def gen_review_data(self,
                        review_data_fn: str, 
                        description: str='', 
                        df: pd.DataFrame = pd.DataFrame(), 
                        review_data_annotation_dict: {str: ReviewDataAnnotation} = {}, 
                        reuse_existing_review_data_fn: str = None,
                       ):
        
        df['new_column'] = 'preprocessing'

        return  ReviewData(review_data_fn=review_data_fn,
                           description=description,
                           df=df,
                           review_data_annotation_list = {'purity': ReviewDataAnnotation('number', validate_input=validate_purity),
                                                          'rating': ReviewDataAnnotation('number', options=range(10)),
                                                          'description': ReviewDataAnnotation('text'),
                                                          'class': ReviewDataAnnotation('radioitem', options=[f'Option {n}' for n in range(4)]),
                                                         })
    
    def gen_review_app(self, test_param) -> ReviewDataApp:
        app = ReviewDataApp()
        app.add_table_from_path(table_name='DFCI MAF file', 
                                component_name='maf-component-id', 
                                table_fn_col='DFCI_bucket_sample_dfci_maf_fn', 
                                table_cols=['Hugo_Symbol', 'Chromosome', 't_alt_count', 't_ref_count', 'Tumor_Sample_Barcode'])


        def gen_data_summary_table(df, idx, cols):
            r = df.loc[idx]
            return [[html.H1(f'{r.name} Data Summary'), dbc.Table.from_dataframe(r[cols].to_frame().reset_index())]]

        app.add_component(AppComponent(name='sample-info-component', 
                                              layout=html.Div(children=[html.H1('Data Summary'), 
                                                                 dbc.Table.from_dataframe(df=pd.DataFrame())],
                                                       id='sample-info-component'
                                                      ), 
                                              callback_output=[Output('sample-info-component', 'children')],
                                              new_data_callback=gen_data_summary_table, 
                                              ),
                               cols=['BETA_ploidy',
                                     'BETA_purity',
                                     'BETA_purity_lower',
                                     'BETA_purity_upper']
                              )

        def plot_interactive_graph(df: pd.DataFrame, idx: str, slider_value, test_param):
            x = np.arange(0, 1, 0.1)
            fig = go.Figure()
            fig.add_trace(go.Scatter(x=x, y=x))
            print(f'plot_interactive_graph: {test_param}')
            return [fig, 0.5]

        def interactive_graph_change_lines(df: pd.DataFrame, idx:str, slider_value, test_param):
            fig = plot_interactive_graph(df, idx, slider_value, test_param)[0] # cache?
            fig.add_vline(slider_value)
            print(f'interactive_graph_change_lines: {test_param}')
            return [fig, dash.no_update] # or just return the original file


        app.add_component(AppComponent('test-interactive-graph',
                                      html.Div(children=[dcc.Graph(figure={}, id='a-figure'), 
                                                dcc.Slider(0, 1, 0.1, value=0.5, 
                                                           id='a-slider'
                                                          )
                                               ]),
                                      new_data_callback=plot_interactive_graph,
                                      internal_callback=interactive_graph_change_lines,
                                      callback_output=[Output('a-figure', 'figure'), Output('a-slider', 'value')],
                                      callback_input=[Input('a-slider', 'value')],
                                      callback_states_for_autofill=[State('a-slider', 'value')]
                                   ),
                       test_param=test_param
                       )
        
        return app
    
    def gen_autofill(self):
        self.add_autofill('test-interactive-graph', {'purity': State('a-slider', 'value'), 
                                                     'class': 'Option 1'})
    
    

# User POV

In [None]:
test_reviewer = PrebuiltReviewer()
test_reviewer.set_review_data(review_data_fn = '/Users/cchu/Desktop/Methods/JupyterReviewer/data/Prebuilt_reviewer.Dev_Reviewer.pkl', 
                                description='testing', 
                                df = cchu_purities_df, # optional if directory above already exists. 
                                 review_data_annotation_dict = {'purity': ReviewDataAnnotation('number', validate_input=validate_purity),
                                                                'rating': ReviewDataAnnotation('number', options=range(10)),
                                                                'description': ReviewDataAnnotation('text'),
                                                                'class': ReviewDataAnnotation('radioitem', options=[f'Option {n}' for n in range(4)]),
                                                               }
                             )
test_reviewer.set_review_app(test_param='testing param kwargs')

# User customization
test_reviewer.app.add_component(AppComponent('test-add-component', html.Div(html.H1('New component'))))
test_reviewer.review_data.add_annotation(ReviewDataAnnotation('another_annotation', 'number'))

# Run
test_reviewer.run()

In [None]:
# User can reuse the app
another_reviewer = PrebuiltReviewer()
another_reviewer.set_review_data(...)
another_reviewer.app = test_reviewer.app # do not need to re-add customizations
another_reviewer.autofill_dict = test_reviewer.autofill_dict
another_reviewer.run()


# Reviewing Data 

1. Enforce consistent and meaningful annotation
1. Consolidate multiple sources of data in a single place
1. Make review dashboard flexible for different needs

The only constraints are:
1. Each row corresponds to a specific data item you want to annotate. It is independent of the other rows in the table
1. History cannot be "undone"

Below is an example of a table where each row corresponds to a sample. The columns contain all the data that I plan to use or look at when annotating, such as:
- existing annotations (such as clinical data)
- paths to files I can plot as a graph or display as a table

Recommendations:
- Do as much automation for annotations as possible first. You can use this tool to manually check and update these annotations
- Preprocess your files so when each sample's data is rendered, it will take less time to switch between samples.

## Setting up your review session

`ReviewData` is an object meant to mirror how one may go about annotating a spreadsheet: going row by row and filling in or editing the corresponding columns. Instantiate your `ReviewData` session by specifying:
1. A directory to store the metadata related to your review session
1. A pandas dataframe with all the information you need for each data point. The data point IDs must be the index
1. What you want to annotate, and how to validate it (in progress)

If the ReviewData session directory already exists and has the expected files, it will simply reload those existing files. Some caveats:
1. If you add items to `annotation_data`, the corresponding columns will be added to the annot table. Deleting an item from the list will not remove its column from the table, but you will no longer be able to update that column in the app.
1. Any changes to `df` between re-runs will NOT change any of the values or paths in the data table. You will have to manually update the path/data.tsv file if this is what you want to do. If you need to update your input data, I generally recommend making a new session (there is an option to "autofill" annotations, so you do not necessarily have to redo everything).
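
The first caveat can be pictured with a small pandas sketch. `sync_annot_columns` is a hypothetical helper written for this illustration; it only mimics the described behavior of adding new annotation columns while never dropping existing ones.

```python
import pandas as pd


def sync_annot_columns(annot, annotation_names):
    """Hypothetical helper: add missing annotation columns, never drop existing ones."""
    for name in annotation_names:
        if name not in annot.columns:
            annot[name] = pd.NA
    return annot


# Existing session with two annotation columns
annot = pd.DataFrame({"Purity": [0.4, None], "Notes": ["ok", None]}, index=["s1", "s2"])

# Re-run with 'Notes' removed from the list and 'Flag' added
annot = sync_annot_columns(annot, ["Purity", "Flag"])
```

The `Notes` column and its values survive the re-run even though it is no longer in the list.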

In [13]:
if not os.path.exists('review_sessions'):
    os.mkdir('review_sessions')
    
# get local path
def valid_purity(x):
    return (x >= 0) and (x < 1)

rd_path = 'review_sessions/Jupyter_Reviewer_Tutorial.pkl'
rd = ReviewData(review_data_fn=rd_path,
                df=data_df,
                description='Example jupyter reviewer description',
                review_data_annotation_list=[ReviewDataAnnotation('Purity', annot_type='number', validate_input=valid_purity),
                                             ReviewDataAnnotation('Flag', annot_type='number', options=range(10)),
                                             ReviewDataAnnotation('Notes', annot_type='text'),
                                             ReviewDataAnnotation('Follow up', annot_type='radioitem', options=['Continue', 'Rerun', 'Remove'])]
               )
rd.annot.head()


Loading existing review session review_sessions/Jupyter_Reviewer_Tutorial.pkl...


| | Purity | Flag | Notes | Follow up |
|---|---|---|---|---|
| 0 | | | | |
| 1 | | | | |
| 2 | | | | |
| 3 | | | | |
| 4 | | | | |


# Interactive Review data with Plotly Dash

Plotly dash is a package that allows you to create dashboards pythonically. It has built-in objects and functions to easily assemble components so you can display multiple things at once and implement interactivity.


In [22]:
import plotly.express as px
from plotly.subplots import make_subplots
from jupyter_dash import JupyterDash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output, State
from dash.exceptions import PreventUpdate
from dash import Dash, dash_table
import dash
import dash_bootstrap_components as dbc
import functools

## 1. Instantiate the App by passing in your `ReviewData` object

In [24]:
test_app = ReviewDataApp(test_rd)

To run, call `run_app()`

## 2. Add simple components


Already implemented is a table from a given path. 


In [25]:
test_app.add_table_from_path('DFCI MAF file', 
                             'maf-component-id', 
                             'DFCI_local_sample_dfci_maf_fn', 
                             ['Hugo_Symbol', 'Chromosome', 't_alt_count', 't_ref_count', 'Tumor_Sample_Barcode'])



## 3. Custom components

You may want to use this if you want to:
- display graphs
- implement interactive components
- utilize multiple inputs to produce a plot (note that it's better to precompute as much as possible)

To add a custom component, you need to define:
1. A name for the component
1. A dash layout (link to site on how to make these). Fill your components with empty data first.
1. `callback_output`: which components in your dash layout your callback will update


In [None]:

def gen_data_summary_table(df, idx, cols):
    r = df.loc[idx]
    return [[html.H1(f'{r.name} Data Summary'), dbc.Table.from_dataframe(r[cols].to_frame().reset_index())]]

test_app.add_custom_component('sample-info-component', 
                              html.Div(children=[html.H1('Data Summary'), 
                                                 dbc.Table.from_dataframe(df=pd.DataFrame())],
                                       id='sample-info-component'
                                      ), 
                              callback_output=[Output('sample-info-component', 'children')],
                              new_data_callback=gen_data_summary_table, 
                              cols=['BETA_ploidy',
                                     'BETA_purity',
                                     'BETA_purity_lower',
                                     'BETA_purity_upper'])



**How to write a callback**

1. The first parameter must be a pandas series. Automatically, the `ReviewDataApp` will pass in the data associated with the current row as the first parameter. All data references are made via key access to the `ReviewData` data table. 
1. Any additional arguments can be specified in `test_app.add_custom_component()` as `kwargs`. This way, you can reuse existing functions, and customize arguments
1. Your output must look like your `callback_output` argument, with the outputs corresponding by order of which components to send the results to
    1. You may notice that in the above example, `gen_data_summary_table()` returns a nested list. This is because my `callback_output` parameter is a list with a single dash `Output()` object. This dash `Output()` object refers to the `children` of the component `'sample-info-component'`, which is the `html.Div` dash component. 
    1. The outer brackets correspond to the outer brackets of my input to `callback_output`. The inner brackets hold the value sent to update the `children` attribute of the `html.Div` object, which consists of a list of two components. 
    
Alternatively, you can pass a dictionary to `callback_output` specifying keywords to assign values to each specified dash `Output()` object.
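
The nesting described above can be checked without dash. In this sketch, plain strings stand in for the dash components (`html.H1`, `dbc.Table`) and for the `Output()` object, purely for illustration.

```python
# A single Output, so the callback must return a list of length one
callback_output = ["Output('sample-info-component', 'children')"]


def gen_data_summary_table(sample_name):
    heading = f"H1: {sample_name} Data Summary"  # stand-in for html.H1(...)
    table = "Table: summary values"              # stand-in for dbc.Table(...)
    children = [heading, table]  # inner list: the new value of `children`
    return [children]            # outer list: one entry per Output


result = gen_data_summary_table("sample_1")
```

Here `len(result)` matches `len(callback_output)`, and `result[0]` is the two-item list assigned to `children`.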

## 4. Custom Interactive Components

Each component can be made of multiple subcomponents. If you want several subcomponents to interact with each other, they need to be grouped into one large component: components added in one `add_custom_component()` call cannot interact with components added in separate calls. 

Below is an example where an interactive table can be used to modify the graph and recalculate the purity based on the selected mutations. 


In [27]:
from scipy.stats import beta, kruskal
import plotly.graph_objects as go

tumor_f_bin_width = 1.0/500.0
tumor_f_bins = np.arange(0, 1, tumor_f_bin_width)
pval_threshold = 1.1E-4
def plot_beta(maf_df, data_id):
    
    if maf_df.empty:
        raise ValueError("There are no mutations in the maf dataframe.")

    for idx, r in maf_df.iterrows():
        pdf = beta.pdf(tumor_f_bins, r['t_alt_count'] + 1, r['t_ref_count'] + 1)
        maf_df.loc[idx, tumor_f_bins] = pdf / (sum(pdf) * tumor_f_bin_width)

    sum_pdf = maf_df[tumor_f_bins].sum(axis=0)
    sum_pdf = sum_pdf / (sum_pdf.sum() * tumor_f_bin_width)
    if 'tumor_f' not in maf_df.columns:
        maf_df['tumor_f'] = maf_df['t_alt_count'].astype(float) / (maf_df['t_alt_count'] + maf_df['t_ref_count'])
    maf_df = maf_df.sort_values(by='tumor_f',
                                ascending=False).reset_index()

    clonal_muts = [maf_df.index[0]]  # Get the first one
    for j in np.arange(maf_df.shape[0], 1, -1):
        h_stat, pval = kruskal(*maf_df.iloc[:j].apply(lambda x: np.concatenate((np.ones(x['t_alt_count']),
                                                                                np.zeros(x['t_ref_count']))),
                                                      axis=1).tolist())
        if pval > pval_threshold:
            clonal_muts = maf_df.index[:j].tolist()
            break

    subclonal_muts = maf_df.index[clonal_muts[-1] + 1:].tolist() if clonal_muts[-1] < maf_df.shape[0] else []

    clonal_prod_pdf = maf_df.loc[clonal_muts, tumor_f_bins].product(axis=0)
    clonal_prod_pdf = clonal_prod_pdf / (clonal_prod_pdf.sum() * tumor_f_bin_width)
    half_purity = clonal_prod_pdf.argmax()
    purity = clonal_prod_pdf.index[half_purity] * 2

    log_clonal_prod_pdf = np.log10(clonal_prod_pdf)
    log_clonal_prod_pdf = log_clonal_prod_pdf - np.max(log_clonal_prod_pdf)
    cis = log_clonal_prod_pdf[log_clonal_prod_pdf >= -1].index.tolist()
    purity_lower_ci = cis[0] * 2
    purity_upper_ci = cis[-1] * 2
    
    # plotly plot
    # Step 1: make the figure
    maf_df['clonal_status'] = maf_df.index.map(lambda x: 'clonal' if x in clonal_muts else 'subclonal')
    maf_df['Mut_Label'] = maf_df['Hugo_Symbol'] + ':' + maf_df['Start_position'].astype(str) + ':' + maf_df['Protein_Change'].astype(str) + ':' + maf_df['Variant_Classification'].astype(str)
    to_plot_maf_df = maf_df.set_index('Mut_Label')[list(tumor_f_bins)].stack().reset_index()
    to_plot_maf_df['clonal_status'] = to_plot_maf_df['Mut_Label'].map(maf_df[['Mut_Label', 'clonal_status']].set_index('Mut_Label')['clonal_status'])
    to_plot_maf_df['pdf_log10'] = np.log10(to_plot_maf_df[0])
    fig = px.line(to_plot_maf_df, x='level_1', y='pdf_log10', color='clonal_status', 
                  hover_data=['Mut_Label'], title=f'{data_id}: purity = {round(purity, 2)} [{round(purity_lower_ci, 2)} - {round(purity_upper_ci, 2)}]')
    fig.add_trace(go.Scatter(x=tumor_f_bins, y=np.log10(clonal_prod_pdf),
                    mode='lines',
                    name='clonal product pdf'))
    fig.add_trace(go.Scatter(x=tumor_f_bins, y=np.log10(sum_pdf),
                    mode='lines',
                    name='all mutations sum pdf'))
    
    fig.add_vrect(x0=cis[0], x1=cis[-1], line_width=0, fillcolor="red", opacity=0.2)
    fig.add_vline(x=clonal_prod_pdf.index[half_purity], name='Half purity')
    
    ylim_min=10 ** (-4)
    ylim_max=10 ** 2
    fig.update_yaxes(range=[np.log10(ylim_min), np.log10(ylim_max)])

    return fig, purity, purity_lower_ci, purity_upper_ci
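
The core of the purity estimate in `plot_beta` can be worked through on its own. The sketch below uses NumPy only, made-up read counts, and log densities instead of the product of pdfs (the maximum is the same); it mirrors the step where the most likely allele fraction is doubled to get the purity, and it leaves out the clonal/subclonal test and the plotting.

```python
import numpy as np

bin_width = 1.0 / 500.0
# Allele-fraction grid; the endpoints 0 and 1 are skipped so the logs stay finite
bins = np.linspace(bin_width, 1 - bin_width, 499)


def log_beta_pdf_unnormalized(x, alt, ref):
    """log of x**alt * (1 - x)**ref, i.e. Beta(alt + 1, ref + 1) up to a constant."""
    return alt * np.log(x) + ref * np.log1p(-x)


# Two hypothetical clonal mutations: (t_alt_count, t_ref_count)
mutations = [(30, 70), (45, 105)]

# Multiplying the per-mutation densities equals summing their logs
log_product = sum(log_beta_pdf_unnormalized(bins, alt, ref) for alt, ref in mutations)

half_purity = bins[np.argmax(log_product)]  # most likely allele fraction
purity = 2 * half_purity
print(round(float(purity), 3))
```

Both example mutations have an allele fraction of 0.3 (30/100 and 45/150), so the combined density peaks at 0.3 and the estimated purity is about 0.6.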


In [28]:
beta_table_cols = ['CHIP_mut_status', 
                  'aSCNA', 
                  'Hugo_Symbol', 
                  'Chromosome', 
                  'Start_position', 
                  'Variant_Classification', 
                  'Protein_Change', 
                  't_alt_count', 
                  't_ref_count', 
                  'total_count', 
                  'tumor_f', 
                  'gnomADg_AF']
blank_beta_df = pd.DataFrame(columns=beta_table_cols)
blank_beta_df.loc[0, beta_table_cols] = 'Test'

@functools.lru_cache(maxsize=32) # faster to reload
def read_maf(fn):
    return pd.read_csv(fn, sep='\t')

def beta_graph_callback(df, idx, 
                        reload_beta_graph_button, 
                        selected_rows, 
                        beta_table_fn_col, 
                        beta_table_display_col):
    r = df.loc[idx]
    maf_df = read_maf(r[beta_table_fn_col])
    selected_rows = maf_df[maf_df['pass_known_driver_filter']].reset_index().index.tolist()
    fig, purity, purity_lower_ci, purity_upper_ci = plot_beta(maf_df.loc[selected_rows], r.name)
    return [fig, maf_df[beta_table_display_col].to_dict('records'), selected_rows, purity, 2]

def internal_beta_graph_callback(df, idx, 
                        reload_beta_graph_button, 
                        selected_rows, 
                        beta_table_fn_col, 
                        beta_table_display_col):
    r = df.loc[idx]
    maf_df = read_maf(r[beta_table_fn_col])
    fig, purity, purity_lower_ci, purity_upper_ci = plot_beta(maf_df.loc[selected_rows], r.name)
    return [fig, maf_df[beta_table_display_col].to_dict('records'), selected_rows, purity, 2]
    
test_app.add_custom_component('beta-graph', 
                              html.Div([html.H1("Beta MAF"), 
                                        html.Button('Reload Beta Plot', id='reload-beta-button', n_clicks=0),
                                        dash_table.DataTable(
                                                              id='beta-maf-table',
                                                              columns=[{"name": i, "id": i} for i in beta_table_cols],
                                                              data=blank_beta_df.to_dict('records'),
                                                              filter_action="native",
                                                              sort_action="native",
                                                              sort_mode="multi",
                                                              column_selectable="single",
                                                              row_selectable="multi",
                                                              selected_columns=[],
                                                              selected_rows=[0],
                                                              page_action="native",
                                                              page_current= 0,
                                                              page_size= 12,
                                         ), 
                                        html.Div([html.P('Purity: ', style={'display': 'inline'}), html.P(0, id='beta-graph-purity', style={'display': 'inline'})]), 
                                        html.Div([html.P('Ploidy: ', style={'display': 'inline'}), html.P(0, id='beta-graph-ploidy', style={'display': 'inline'})]), 
                                        dcc.Graph(id='beta-graph', figure={})]), # todo just make name the heading
                              callback_output=[Output('beta-graph', 'figure'), 
                                               Output('beta-maf-table', 'data'), 
                                               Output('beta-maf-table', 'selected_rows'),
                                               Output('beta-graph-purity', 'children'),
                                               Output('beta-graph-ploidy', 'children')
                                              ],
                              callback_input=[Input('reload-beta-button', 'n_clicks'), 
                                               State('beta-maf-table', 'selected_rows')],
                              new_data_callback=beta_graph_callback, 
                              internal_callback=internal_beta_graph_callback,
                              add_autofill=True,
                              autofill_dict={'purity': Input('beta-graph-purity', 'children')},
                              beta_table_fn_col='BETA_annot_maf_fn',
                              beta_table_display_col=beta_table_cols
                             )
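
The `read_maf` helper in the cell above is wrapped in `functools.lru_cache`, so a MAF file is parsed only once per filename. The following is a minimal sketch of that behavior using only the standard library. `load_table` and the file names are hypothetical stand-ins for `read_maf` and real paths, and the returned list is a placeholder for the dataframe.

```python
import functools

load_count = 0  # counts how many times a file is "really" read


@functools.lru_cache(maxsize=32)
def load_table(fn):
    """Hypothetical stand-in for read_maf; the list replaces pd.read_csv(fn, sep='\t')."""
    global load_count
    load_count += 1
    return [fn, "row-1", "row-2"]


first = load_table("sample_a.maf")    # cache miss: the loader runs
second = load_table("sample_a.maf")   # cache hit: the loader is skipped
other = load_table("sample_b.maf")    # different argument, so another miss

info = load_table.cache_info()        # hit/miss counters kept by lru_cache
same_object = first is second         # the cached object itself is returned

load_table.cache_clear()              # drop cached results, e.g. if a file changed on disk
reloaded = load_table("sample_a.maf") # the loader runs again
```

Because the cache returns the same object on every hit, modifying a cached dataframe in place would also change what later calls receive. Copying before modifying avoids that.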

The `beta-graph` component above also shows how a component's outputs can be used to autofill annotations, for example after you recalculate a value. The requirements are:
1. The value you want to autofill must be a property of one of your output components, which temporarily stores the data (here, the `children` of `beta-graph-purity`).
2. The keys of `autofill_dict` must match the column names of the annotation table in your ReviewData object.
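
The second requirement is easy to get wrong through a typo or a difference in capitalization. As a rough sketch, a check like the one below could be run before adding the component. `missing_annotation_columns` is a hypothetical helper, not part of the library, and the annotation column names are assumed for illustration.

```python
def missing_annotation_columns(autofill_dict, annot_columns):
    """Return the autofill keys that have no matching annotation column."""
    return sorted(set(autofill_dict) - set(annot_columns))


# Assumed annotation columns; in practice, use the columns of your own annot table.
annot_columns = ['purity', 'ploidy', 'notes']

# The values are placeholder strings here; in the app they are dash Input/State objects.
ok_autofill = {'purity': 'beta-graph-purity.children'}
bad_autofill = {'purity': 'beta-graph-purity.children',
                'Purity_CI': 'beta-graph-purity-ci.children'}

ok_missing = missing_annotation_columns(ok_autofill, annot_columns)
bad_missing = missing_annotation_columns(bad_autofill, annot_columns)
```

An empty result means every autofill key has a column to write to.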



# Run the app

If you are running the notebook in a VM, you may need to specify a host and port. To view the app, you will need to forward the corresponding port.

You can run the app directly in the notebook with `mode='inline'`, or in a separate window with `mode='external'`.
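
If the app fails to start, the chosen port may already be in use. The following is a small standard-library sketch for checking this before calling `run`. `port_is_free` is a hypothetical helper, and it assumes the app would be reachable on the local loopback address.

```python
import socket


def port_is_free(port, host='127.0.0.1'):
    """Return True if nothing accepts connections on (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds,
        # which means something is already listening on that port.
        return s.connect_ex((host, port)) != 0


free = port_is_free(8065)  # 8065 is the port used in the run cell
```

If the check returns `False`, passing a different `port` value when running the app is one way around the conflict.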

In [29]:
test_app.run(mode='external', port=8065)

Dash app running on http://0.0.0.0:8065/






This produces the baseline dashboard, which simply allows you to iterate through each row, add annotations, and view the history of annotations.

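Run app parameters:
- mode: `'inline'` renders the app directly in the notebook; `'external'` serves it so it can be opened in a separate window.
- port: the port the app is served on (`8065` in the example above). If the notebook runs in a VM, this is the port you need to forward.
- host: the host address the app is served on. You may need to set this when running in a VM.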