# Developer Jupyter Reviewer Tutorial

This tutorial is for developers who want to make a Reviewer object from scratch.

If you are looking to use a pre-made Reviewer, refer to the Quick Start Jupyter Reviewer Tutorial (TBD).


# Introduction


There are 5 parts you need to generate to make a standard Jupyter Reviewer:

**1. ReviewData Object** A ReviewData object consists of 3 pandas dataframes:
1. data: a dataframe with the data that a user wants to review, row by row (e.g., samples, participants, mutations)
2. annot: a dataframe with the annotations that a user wants to write for each row (e.g., notes, flags)
3. history: a timeline of the changes a user makes to the annot table


**2. ReviewData Annotations** Set what kind of data the reviewer needs to enter

**3. ReviewDataApp** A ReviewDataApp is a plotly.dash application that displays the data in a ReviewData object for review. It already includes prebuilt functionality for a user to add annotations and view the history of the ReviewData.

As a developer, you will define custom dash components you want to display for the type of review you are implementing (purity, mutation, etc.). This includes tables, graphs, or other components of interest. plotly.dash also enables interactivity, so you can define special functions to allow for interactive viewing of charts and graphs, and to auto-calculate values you may want to use for your annotations (more on autofill below).

When a ReviewDataApp is passed a ReviewData object, it will read the ReviewData object to render your components.

**4. ReviewData Annotation display parameters** Define how to display the annotation inputs in the app

**5. Autofill Dictionary** Autofill lets you connect outputs of the ReviewDataApp to the annotations of the ReviewData object. At runtime, it adds buttons to the top annotation panel. When a user clicks one, the current values of the selected components in the dash app are copied into the specified annotation inputs in the annotation panel.

The following tutorial will walk through each of these steps in more detail.
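
As a rough picture of how the three ReviewData tables relate, here is a sketch that uses plain Python containers in place of pandas dataframes. The `annotate` helper, the column names, and the shape of each history entry are hypothetical and only illustrate the idea that every change to `annot` is also logged in `history`.

```python
from datetime import datetime

# Stand-ins for the three ReviewData tables (the real ones are pandas dataframes).
data = {
    "sample_1": {"purity_estimate": 0.42},
    "sample_2": {"purity_estimate": 0.77},
}
# One annotation row per data row, initially empty.
annot = {index: {"notes": None, "flag": None} for index in data}
# One entry per change made to annot.
history = []


def annotate(index, column, value):
    """Hypothetical helper: update one annot cell and log the change."""
    annot[index][column] = value
    history.append({"index": index, "column": column, "value": value,
                    "timestamp": datetime.now()})


annotate("sample_1", "notes", "looks clean")
annotate("sample_1", "flag", "keep")
```

After these two calls, `annot` holds the latest values for `sample_1`, while `history` keeps both changes in the order they were made.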

## What a user sees

A user using your custom Reviewer will have a notebook that looks like the following:


# The basic structure

To build your custom reviewer, you will need to do the following:

1. Create a new class that inherits `ReviewerTemplate` in the `JupyterReviewer/Reviewers/` directory
1. Define 5 abstract methods:
    1. gen_review_data()
    1. gen_review_app()
    1. set_default_review_data_annotations()
    1. set_default_review_data_annotations_app_display()
    1. set_default_autofill()
    
Your file will look something like the following:
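
A minimal sketch of such a file is shown here. So that it runs on its own, it uses `ReviewerTemplateSketch`, a hypothetical stand-in for the real `ReviewerTemplate`. The method names follow the full example at the end of this tutorial, and the argument lists are assumptions.

```python
from abc import ABC, abstractmethod


class ReviewerTemplateSketch(ABC):
    """Hypothetical stand-in for ReviewerTemplate so this sketch runs on its own."""

    @abstractmethod
    def gen_review_data(self, review_data_fn, description="", df=None): ...

    @abstractmethod
    def set_default_review_data_annotations(self): ...

    @abstractmethod
    def gen_review_app(self): ...

    @abstractmethod
    def set_default_review_data_annotations_app_display(self): ...

    @abstractmethod
    def set_default_autofill(self): ...


class MyCustomReviewer(ReviewerTemplateSketch):
    def gen_review_data(self, review_data_fn, description="", df=None):
        # Preprocess df here, then build and return a ReviewData object.
        raise NotImplementedError

    def set_default_review_data_annotations(self):
        # Call self.add_review_data_annotation(...) once per annotation.
        raise NotImplementedError

    def gen_review_app(self):
        # Build a ReviewDataApp, add components, and return it.
        raise NotImplementedError

    def set_default_review_data_annotations_app_display(self):
        # Call self.add_review_data_annotations_app_display(...) once per annotation.
        raise NotImplementedError

    def set_default_autofill(self):
        # Call self.add_autofill(...) once per autofill link.
        raise NotImplementedError


# Instantiable because all five abstract methods are defined.
reviewer = MyCustomReviewer()
```

In a real reviewer you would inherit from `ReviewerTemplate` instead and replace each `raise NotImplementedError` with the logic described in the sections below.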


## gen_review_data()

To create a `ReviewData` object, the following parameters must be defined by the developer, the user, or both:
- `review_data_fn`: A file path to save the object as a pkl file.
- `description`: A string describing the ReviewData object's source of data and purpose
- `df`: pandas dataframe containing the data to review. Each row corresponds to a single item to be annotated.
- `review_data_annotation_dict`: A dictionary of `ReviewDataAnnotation` objects where you can define what annotations you want for your review (name with the key). This dictionary will be used to render the annotation inputs in the `ReviewDataApp`.
- `reuse_existing_review_data_fn`: You can reuse the annotations of a previous `ReviewData` object.

For your custom reviewer, you may have specific plots or calculations you want to make, and known or common annotations that someone reviewing the data should use. You define these special features for your type of reviewer in `gen_review_data(self, ...)`. The main features would be:
1. Preprocessing the input `df` dataframe, such as precomputing data/graphs and adding columns you may need for your ReviewDataApp. 
2. `review_data_annotation_dict`: a dictionary mapping `str` keys to `ReviewDataAnnotation` objects. The key string will be the column name in the review data object's annotation table. `ReviewDataAnnotation` consists of:
    1. `type`: see the `ReviewData.AnnotationType` Enum
    1. `options`: a list of valid values (for checklist and radioitems)
    1. `validate_input`: a named, non-local function (cannot be a lambda function) that takes a single parameter and returns True/False

Once you have done any preprocessing and defined any default annotations, you can create and return the `ReviewData` object. 

## set_default_review_data_annotations()

Here you define what data to collect during the review session. Use the prebuilt method `self.add_review_data_annotation()`, which will add the columns to the `ReviewData.annot` table and also store the corresponding validation parameters. These are associated with the `ReviewData` object generated from the above `gen_review_data()`.

Parameters for `ReviewerTemplate.add_review_data_annotation()` are:
- `name`: The name of the column in the `annot` table. It will also be used in the app.
- `review_data_annotation`: a `ReviewDataAnnotation` object. Its parameters are:
    - `annot_value_type`: one of `["multi", "float", "int", "string"]`
    - `options`: list of options that are valid for this annotation
    - `validate_input`: A custom function to validate inputs
    - `default`: value to automatically fill annotations with
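
To make the role of these parameters concrete, here is a sketch using `AnnotationSketch`, a hypothetical stand-in that mirrors the `ReviewDataAnnotation` parameters listed above. The `is_valid` method and the way it combines `options` with `validate_input` are assumptions for illustration, not the library's actual validation logic.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence


def validate_purity(x):
    """Named, module-level validator (not a lambda)."""
    return 0 <= x <= 1


@dataclass
class AnnotationSketch:
    """Hypothetical stand-in mirroring the ReviewDataAnnotation parameters."""
    annot_value_type: str
    options: Optional[Sequence[Any]] = None
    validate_input: Optional[Callable[[Any], bool]] = None
    default: Any = None

    def is_valid(self, value) -> bool:
        # Assumption: a value must be among the options (if any are given)
        # and must pass the custom validator (if one is given).
        if self.options is not None and value not in self.options:
            return False
        if self.validate_input is not None and not self.validate_input(value):
            return False
        return True


purity = AnnotationSketch("float", validate_input=validate_purity)
rating = AnnotationSketch("int", options=range(10))
```

With these definitions, `purity.is_valid(0.3)` passes while `purity.is_valid(1.5)` does not, and `rating` only accepts the integers 0 through 9.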

## gen_review_app()

This creates a dash application where you can define what components to include. The `ReviewDataApp` already has built-in components to handle iterating through the items in any `ReviewData` object, rendering the annotation inputs defined by the `ReviewData` object's `review_data_annotation_list`, and the history.

**What is Dash?**

Dash is a library that makes it easy to generate custom dashboards in Python. I recommend reviewing the [Dash Tutorial](https://dash.plotly.com/installation) before proceeding.

In short, to create a dash app, you define:
1. Layout: how you want the app to look
2. Callbacks: functions to define interactivity with the components in your layout

The `ReviewDataApp` is built so that you can add components and interactivity without having to deal directly with some of the idiosyncrasies of the plotly dash package.


To create your custom app, you first instantiate a `ReviewDataApp`. Then you add a series of `AppComponent`'s.

**AppComponent()**

To create an `AppComponent`, you will specify:
- **`name`**: A string naming the particular component
- **`layout`**: Using plotly dash's html and bootstrap libraries, define how your component will look (Divs, Graphs, Tables, etc.)
- **`callback_output`**: A list of `Output()` objects. The first argument is the id of the subcomponent in your layout, and the second argument is the attribute of that subcomponent to update with your callback functions
- **`callback_input`**: A list of `Input()` objects. The arguments are similar to `Output()`. If any of these subcomponents' attributes change, your `internal_callback` function will run.
- **`callback_state`**: A list of `State()` objects. The arguments are similar to `Output()`. If the `internal_callback` function is triggered, the current values of these subcomponent attributes will be passed as parameters to the `internal_callback` function.
- **`new_data_callback`**: A function whose first two arguments are assumed to be (1) a `ReviewData` object's `data` table, and (2) an index value of the `ReviewData` object. The remaining parameters follow IN ORDER: first the `Input()` objects defined by the `callback_input` argument, then the `State()` objects defined by the `callback_state` argument. The output of this function is a **list** whose entries correspond IN ORDER to the `Output()` objects listed in `callback_output`. This function will be called whenever a user switches to a new item to review.
- **`internal_callback`**: A function with the EXACT same signature as `new_data_callback`. This function will be called whenever a user changes the attributes of subcomponents listed in `callback_input`.
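
The ordering rules for the two callbacks can be sketched without dash. In this illustration a plain dict stands in for the `data` table and a dict stands in for a figure. It assumes one `Input` (a slider value), one `State` (a label), and two `Output`s (a figure and a slider value); all of these names are hypothetical.

```python
def new_data_callback(data, idx, slider_value, label_state):
    """Runs when the reviewer switches to a new item: rebuild everything."""
    row = data[idx]
    figure = {"title": f"{idx} ({label_state})", "y": row["y"]}
    # Returned list follows the order of callback_output: [figure, slider value].
    return [figure, 0.5]


def internal_callback(data, idx, slider_value, label_state):
    """Runs when an Input changes: same signature, same output order."""
    figure = new_data_callback(data, idx, slider_value, label_state)[0]
    figure["vline"] = slider_value  # mark the current slider position
    return [figure, slider_value]


data = {"sample_1": {"y": 2.0}}
on_new_item = new_data_callback(data, "sample_1", 0.5, "raw")
on_slider_move = internal_callback(data, "sample_1", 0.8, "raw")
```

Both functions take the data table and index first, then the `Input` values, then the `State` values, and both return one list entry per `Output`.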


**Custom args for callback functions**

Sometimes your callback functions need parameters that are specific to your reviewer type or defined by the user (e.g., the name of a specific column in the `ReviewData` object, or specific parameters for displaying graphs). When adding a component to the app, you can specify these arguments as keyword arguments.

```
premade_component = AppComponent(..., 
                                 new_data_callback=lambda df, idx, y: df.loc[idx] + y, 
                                 ...)

class PrebuiltReviewer(ReviewerTemplate):
    ...
    
    # Specific to reviewer type
    def gen_review_app(self):
        app = ReviewDataApp()
        app.add_component(premade_component,
                          y=10) # <-------------------
        return app
        
    # OR define by the user
    def gen_review_app(self, y):
        app = ReviewDataApp()
        app.add_component(premade_component,
                          y=y)  # <-------------------
        return app
```
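
One way to picture what happens to such keyword arguments is that they are bound to the callback once, so the callback can later be called with only the data table and index. The sketch below does this with `functools.partial` and a plain dict in place of the dataframe. This is an assumption for illustration, not necessarily how `ReviewDataApp.add_component()` binds them internally.

```python
import functools


def new_data_callback(data, idx, y):
    """Mirrors the lambda above, but returns a list (one entry per Output)."""
    return [data[idx] + y]


# Assumption: the extra keyword argument is bound up front, as
# add_component(premade_component, y=10) would need to do in some form.
bound_callback = functools.partial(new_data_callback, y=10)

data = {"sample_1": 1, "sample_2": 2}
result = bound_callback(data, "sample_2")
```

Here `result` is `[12]`: the value stored for `sample_2` plus the bound `y`.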

**Premade components**

`ReviewDataApp` objects also include a method `add_table_from_path()` that creates a simple table from a file whose path is stored in a column of the data table. You can use it just like `add_component()`, but you only need to specify which column in the ReviewData object holds the file path, and which columns in the file's table to display.


In [None]:
def gen_review_app(self, test_param) -> ReviewDataApp:
    app = ReviewDataApp()
    # Display selected columns of the table whose file path is stored in `table_fn_col`
    app.add_table_from_path(table_title='DFCI MAF file',
                            component_id='maf-component-id',
                            table_fn_col='DFCI_bucket_sample_dfci_maf_fn',
                            table_cols=['Hugo_Symbol', 'Chromosome', 't_alt_count', 't_ref_count', 'Tumor_Sample_Barcode'])

    return app

## set_default_review_data_annotations_app_display()

After the user has set the review data object's annotation data and the corresponding app, the user now has to specify how to display the annotations in the input form of the app.

`Reviewer.set_default_review_data_annotations_app_display()` is only called in the public method `Reviewer.set_default_review_data_annotations_configuration()`, which is used if the user wants to use your default annotations and display configuration. To define `Reviewer.set_default_review_data_annotations_app_display()`, use the prebuilt `self.add_review_data_annotations_app_display()` method for each annotation to include. The parameters are:
- `name`: Name of a column that exists in the `ReviewData.annot` table (specified in `set_default_review_data_annotations()`)
- `app_display_type`: one of `['text', 'textarea', 'number', 'checklist', 'radioitem', 'select']`


## set_default_autofill()

Sometimes you may have a lot of annotations, or one of your components produces a value that you want to use as an annotation. It can be tedious to type these values into the form manually, so the `ReviewDataApp` can link the current state of your subcomponents in the app to the annotation input panel.

You can set these "links" with `self.add_autofill()`:
- `component_name`: the name of the component to read from.
- `autofill_dict`: a dictionary linking the annotation column name in the reviewdata object (key) to the State of something in the layout of the app, or a constant value.

Prerequisite: only `State` objects passed to `AppComponent()` through the `callback_states_for_autofill` parameter can be used for autofilling.

Note that the user can also add autofill links with `reviewer.add_autofill()` if they decide to add components or annotations to the existing reviewer.
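
Conceptually, an autofill link maps an annotation column either to the current state of a subcomponent or to a constant. The sketch below resolves such links with plain dicts. The `(component id, attribute)` tuples, `current_states`, and `resolve_autofill` are hypothetical stand-ins, not part of the `ReviewDataApp` API.

```python
# Hypothetical snapshot of subcomponent states, keyed by (component id, attribute).
current_states = {("a-slider", "value"): 0.3}

# Annotation column -> state key (tuple) or constant value.
autofill_links = {
    "purity": ("a-slider", "value"),  # read from the app's current state
    "class": "Option 1",              # constant value
}


def resolve_autofill(links, states):
    """Return the values an autofill button would copy into the annotation inputs."""
    filled = {}
    for annot_name, source in links.items():
        # A tuple is treated as a state key; anything else is a constant.
        filled[annot_name] = states[source] if isinstance(source, tuple) else source
    return filled


filled = resolve_autofill(autofill_links, current_states)
```

Here `filled` is `{"purity": 0.3, "class": "Option 1"}`, which corresponds to the values a user would otherwise type into the annotation panel by hand.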


# Full example

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
import pandas as pd
import numpy as np
import functools
import time
import os

In [4]:
from JupyterReviewer.ReviewData import ReviewData, ReviewDataAnnotation
from JupyterReviewer.ReviewDataApp import ReviewDataApp, AppComponent
from JupyterReviewer.ReviewerTemplate import ReviewerTemplate



import plotly.express as px
from plotly.subplots import make_subplots
from jupyter_dash import JupyterDash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output, State
from dash.exceptions import PreventUpdate
from dash import Dash, dash_table
import dash
import dash_bootstrap_components as dbc
import plotly.graph_objects as go

# For pickling to work, need to explicitly define function
def validate_purity(x):
    return x < 0.5


class PrebuiltReviewer(ReviewerTemplate):
    def gen_review_data(self,
                        review_data_fn: str,
                        description: str = '',
                        df: pd.DataFrame = None,
                        review_data_annotation_list: list = None,  # list of ReviewDataAnnotation
                        reuse_existing_review_data_fn: str = None,
                       ):
        # Avoid a shared mutable default: build a fresh dataframe when none is given
        if df is None:
            df = pd.DataFrame()

        df['new_column'] = 'preprocessing'

        return ReviewData(review_data_fn=review_data_fn,
                          description=description,
                          df=df,
                         )
    
    def set_default_review_data_annotations(self):
        self.add_review_data_annotation('purity', ReviewDataAnnotation('float', validate_input=validate_purity))
        self.add_review_data_annotation('rating', ReviewDataAnnotation('int', options=range(10)))
        self.add_review_data_annotation('description', ReviewDataAnnotation('string'))
        self.add_review_data_annotation('class', ReviewDataAnnotation('multi', options=[f'Option {n}' for n in range(4)]))
    
    def gen_review_app(self, test_param) -> ReviewDataApp:
        app = ReviewDataApp()
        app.add_table_from_path(table_title='DFCI MAF file', 
                                component_id='maf-component-id', 
                                table_fn_col='DFCI_bucket_sample_dfci_maf_fn', 
                                table_cols=['Hugo_Symbol', 'Chromosome', 't_alt_count', 't_ref_count', 'Tumor_Sample_Barcode'])


        def gen_data_summary_table(df, idx, cols):
            r = df.loc[idx]
            return [[html.H1(f'{r.name} Data Summary'), dbc.Table.from_dataframe(r[cols].to_frame().reset_index())]]

        app.add_component(AppComponent(name='sample-info-component', 
                                              layout=html.Div(children=[html.H1('Data Summary'), 
                                                                 dbc.Table.from_dataframe(df=pd.DataFrame())],
                                                       id='sample-info-component'
                                                      ), 
                                              callback_output=[Output('sample-info-component', 'children')],
                                              new_data_callback=gen_data_summary_table, 
                                              ),
                               cols=['BETA_ploidy',
                                     'BETA_purity',
                                     'BETA_purity_lower',
                                     'BETA_purity_upper']
                              )

        def plot_interactive_graph(df: pd.DataFrame, idx: str, slider_value, test_param):
            x = np.arange(0, 1, 0.1)
            fig = go.Figure()
            fig.add_trace(go.Scatter(x=x, y=x))
            print(f'plot_interactive_graph: {test_param}')
            return [fig, 0.5]

        def interactive_graph_change_lines(df: pd.DataFrame, idx:str, slider_value, test_param):
            fig = plot_interactive_graph(df, idx, slider_value, test_param)[0] # cache?
            fig.add_vline(slider_value)
            print(f'interactive_graph_change_lines: {test_param}')
            return [fig, dash.no_update] # or just return the original file


        app.add_component(AppComponent('test-interactive-graph',
                                      html.Div(children=[dcc.Graph(figure={}, id='a-figure'), 
                                                dcc.Slider(0, 1, 0.1, value=0.5, 
                                                           id='a-slider'
                                                          )
                                               ]),
                                      new_data_callback=plot_interactive_graph,
                                      internal_callback=interactive_graph_change_lines,
                                      callback_output=[Output('a-figure', 'figure'), Output('a-slider', 'value')],
                                      callback_input=[Input('a-slider', 'value')],
                                      callback_states_for_autofill=[State('a-slider', 'value')]
                                   ),
                       test_param=test_param
                       )
        
        return app
    
    def set_default_review_data_annotations_app_display(self):
        self.add_review_data_annotations_app_display('purity', 'number')
        self.add_review_data_annotations_app_display('rating', 'number')
        self.add_review_data_annotations_app_display('description', 'text')
        self.add_review_data_annotations_app_display('class', 'radioitem')
    
    def set_default_autofill(self):
        self.add_autofill('test-interactive-graph', State('a-slider', 'value'), 'purity')
        self.add_autofill('test-interactive-graph', 'Option 1', 'class')
    
    

# User POV

In [None]:
test_reviewer = PrebuiltReviewer()
test_reviewer.set_review_data(review_data_fn = '/Users/cchu/Desktop/Methods/JupyterReviewer/data/Prebuilt_reviewer.Dev_Reviewer.pkl', 
                                description='testing', 
                                df = cchu_purities_df, # optional if directory above already exists. 
                             )
test_reviewer.set_review_app(test_param='testing param kwargs')
test_reviewer.set_default_review_data_annotations_configuration()
test_reviewer.set_autofill()

# User customization
test_reviewer.app.add_component(AppComponent('Test Add Component', html.Div(html.P('New component'))))
test_reviewer.review_data.add_annotation('another_annotation', ReviewDataAnnotation('number'))

# Run
test_reviewer.run(mode='external', port=8085)

In [None]:
# User can reuse the app
another_reviewer = PrebuiltReviewer()
another_reviewer.set_review_data(...)
another_reviewer.app = test_reviewer.app # do not need to re-add customizations
another_reviewer.autofill_dict = test_reviewer.autofill_dict
another_reviewer.run()


In [13]:
if not os.path.exists('review_sessions'):
    os.mkdir('review_sessions')
    
# get local path
def valid_purity(x):
    return (x >= 0) and (x < 1)

rd_path = 'review_sessions/Jupyter_Reviewer_Tutorial.pkl'
rd = ReviewData(review_data_fn=rd_path,
                df=data_df,
                description='Example jupyter reviewer description',
                review_data_annotation_list=[ReviewDataAnnotation('Purity', annot_type='number', validate_input=valid_purity),
                                             ReviewDataAnnotation('Flag', annot_type='number', options=range(10)),
                                             ReviewDataAnnotation('Notes', annot_type='text'),
                                             ReviewDataAnnotation('Follow up', annot_type='radioitem', options=['Continue', 'Rerun', 'Remove'])]
               )
rd.annot.head()


Loading existing review session review_sessions/Jupyter_Reviewer_Tutorial.pkl...


|   | Purity | Flag | Notes | Follow up |
|---|--------|------|-------|-----------|
| 0 |        |      |       |           |
| 1 |        |      |       |           |
| 2 |        |      |       |           |
| 3 |        |      |       |           |
| 4 |        |      |       |           |
