# FOV Dropout Detection Example

This notebook is meant to be completed in sequential order and walks you through the main steps of the dropout detection pipeline. At the end, it covers some functions that may be useful for further analyses.


### Installation

Move to the directory you downloaded the code to and pip install:

   `$ pip install .`

### Import the module

In [None]:
from dropout_detection import TranscriptImage, DropoutResult

### Read in a MERSCOPE result

`TranscriptImage` takes an experiment barcode and reads the `detected_transcripts.csv` file in that barcode's directory.

By default, it searches in `/allen/programs/celltypes/production/mfish` for the project and barcode to read in. This can be changed by setting the `merscope_out_dir` parameter. The remaining parameters are rarely needed; they are documented in the function header in the source code.

In [None]:
# Arguments
barcode = 1233424702

# Create TranscriptImage object
ts = TranscriptImage(barcode)

### Run the dropout detection pipeline

The `TranscriptImage.run_dropout_pipeline()` function handles all the steps in the dropout detection pipeline. It requires a path to a directory for the tissue mask output (used to determine on-tissue FOVs) and a path to the codebook for the barcode. Both paths can be either `pathlib.Path` or `str`.

`TranscriptImage.run_dropout_pipeline()` also takes the keyword argument `threshold`, which defaults to `0.15`.

In [None]:
# import pathlib for easier filepaths
from pathlib import Path
workbook_results = Path.cwd() / 'workbook_results/'

# Arguments
mask_out_dir = workbook_results / 'tissue_masks/'
codebook_path = Path.cwd() / 'codebooks/codebook_0_wholebrain031822a_VA142.csv'
threshold = 0.15

# Detect dropouts
ts.run_dropout_pipeline(mask_out_dir, codebook_path, threshold=threshold)
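
Because both paths may be given as `str` or `pathlib.Path`, it can help to normalize them and create the output directory before calling the pipeline. The following is a minimal sketch, not part of `dropout_detection`: the helper `prepare_paths` is hypothetical, and whether `run_dropout_pipeline()` creates missing directories itself is not stated here, so creating them up front is only a precaution.

```python
from pathlib import Path
from typing import Tuple, Union

PathLike = Union[str, Path]


def prepare_paths(mask_out_dir: PathLike, codebook_path: PathLike) -> Tuple[Path, Path]:
    """Hypothetical helper: normalize both arguments to Path objects.

    Creates the mask output directory if needed and checks that the
    codebook file exists, so problems surface before the pipeline starts.
    """
    mask_dir = Path(mask_out_dir)
    codebook = Path(codebook_path)
    mask_dir.mkdir(parents=True, exist_ok=True)
    if not codebook.is_file():
        raise FileNotFoundError(f"Codebook not found: {codebook}")
    return mask_dir, codebook
```

With the variables from the cell above, `mask_out_dir, codebook_path = prepare_paths(mask_out_dir, codebook_path)` could be run just before `ts.run_dropout_pipeline(...)`.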

The results from the dropout detection pipeline are stored in an FOV dataframe. This can be saved to disk via either of the following functions:

In [None]:
ts.save_fov_pkl(workbook_results)
ts.save_fov_tsv(workbook_results)

### A note about classes

The `TranscriptImage` class is a subclass of the `DropoutResult` class, meaning it inherits all the functions and variables of the `DropoutResult` class. All of the post-dropout detection analysis functions are from the `DropoutResult` class. Thus, they can be used by the `TranscriptImage` class, but only after `TranscriptImage.run_dropout_pipeline()` is run.

The reason for this separation is to allow for dropout analysis on previously run dropout pipelines. If you have the FOV table from an old dropout detection run, you can use this to create a `DropoutResult` and look at the results without needing to re-run the entire dropout detection pipeline (note that the plotting analyses will all need to load in the transcripts as well).

In [None]:
# Example DropoutResult object; more on DropoutResult initialization is in "The DropoutResult Class" section below
dr = DropoutResult(workbook_results / f'{barcode}_fovs.pkl')
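
The relationship between the two classes can be illustrated with a toy model. This is only a sketch of the inheritance pattern described above: `ResultSketch` and `ImageSketch` are hypothetical stand-ins, the FOV table is reduced to a plain dictionary, and the error raised before the pipeline runs is an assumption about behavior, not the package's actual implementation.

```python
class ResultSketch:
    """Hypothetical stand-in for DropoutResult: holds an FOV table and analysis methods."""

    def __init__(self, fovs=None):
        self.fovs = fovs  # None until results are available

    def dropped_fov_ids(self):
        """Analysis method: needs the FOV table to exist."""
        if self.fovs is None:
            raise RuntimeError("No FOV table yet; run the pipeline first.")
        return [fov_id for fov_id, dropped in self.fovs.items() if dropped]


class ImageSketch(ResultSketch):
    """Hypothetical stand-in for TranscriptImage: inherits the analysis methods."""

    def __init__(self, barcode):
        super().__init__(fovs=None)
        self.barcode = barcode

    def run_pipeline(self):
        """Placeholder for the real pipeline: fills in a toy FOV table."""
        self.fovs = {0: False, 1: True, 2: False, 3: True}


image = ImageSketch(barcode=1233424702)
try:
    image.dropped_fov_ids()
except RuntimeError as err:
    print(err)  # analysis is unavailable before the pipeline runs

image.run_pipeline()
print(image.dropped_fov_ids())  # [1, 3]

# A result object can also be built directly from a previously saved table.
saved = ResultSketch(fovs={0: True, 1: False})
print(saved.dropped_fov_ids())  # [0]
```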

For the remainder of the notebook, the functions used are from the `DropoutResult` class but are called on a `TranscriptImage` object.

### Get the dropout summary

In [None]:
ts.dropout_summary()

### Draw the dropped genes

The `DropoutResult.draw_dropped_genes()` function plots all dropped genes for an experiment. For each gene, the transcripts are plotted alone on the left and a copy of the plot with dropped FOVs highlighted is plotted on the right.

`draw_dropped_genes()` requires a directory (`pathlib.Path` or `str`) to store the drawings in. It also takes the keyword argument `max_genes`, which limits the number of genes plotted (genes are plotted in descending order of transcript count).

In [None]:
ts.draw_dropped_genes(workbook_results / 'images', max_genes=3)

In [None]:
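# Display the genes
from IPython.display import Image, display
for gene in ['Slc17a7', 'Sv2b', 'Ccn3']:
    # Pass the path explicitly as a filename string so it is not treated as raw image data
    display(Image(filename=str(workbook_results / f'images/{gene}.png')))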

### Draw a total dropout map

The `DropoutResult.draw_total_dropout_map()` function plots a graphical overview of the dropout across all genes and FOVs.

It can optionally take the kwarg `out_file` to save the image somewhere. If an `out_file` is provided, the plot will be closed after drawing and only viewable as a file. If an `out_file` is not provided the plot will show and not be saved.

In [None]:
# Args
out_file = workbook_results / 'images/total_dropout_map.png'

# To save the map to a file instead of displaying it, pass out_file:
# ts.draw_total_dropout_map(out_file=out_file)
ts.draw_total_dropout_map()

# Further Analyses

### The DropoutResult Class

Any of the image-based analyses requires a transcripts DataFrame. To get one, either provide a previously read DataFrame or provide an experiment barcode (as you did for the `TranscriptImage` class).

**Args:**

**fovs (pd.DataFrame/str/Path):** FOV table with stored results

**transcripts (pd.DataFrame) [default=None]:** pandas dataframe of transcript information for the experiment

**experiment_id (int) [default=-1]:** id for the experiment. Used to read in transcript information if transcripts argument is not provided

**merscope_out_dir (str/Path) [default='/allen/programs/celltypes/production/mfish']:** MERSCOPE output directory (contains all projects) for reading transcripts via experiment_id

**project (str) [default='']:** Name of the project the experiment is in for reading transcripts via experiment_id. If unspecified, found automatically.

**region (int) [default=0]:** region which holds the transcripts of interest for reading transcripts via experiment_id

### Draw a gene

If you would like to draw a specific gene and view its dropout, use `DropoutResult.draw_dropout()`. 

`draw_dropout()` requires a gene as an argument. It also takes the kwarg `out_file`. If an `out_file` is provided, the plot will be closed after drawing and only viewable as a file. If an `out_file` is not provided the plot will show and not be saved.

In [None]:
ts.draw_dropout('Fxyd6')

### Draw a representative set of genes

`DropoutResult.draw_top_mid_bot()` draws, ranked by transcript count, the top 20 genes, the bottom 10 genes, and a subset of 10 genes from the middle, starting from the last gene to average 100 transcripts per FOV. Depending on transcript count, this function can take a while!

In [None]:
# Write into the same images directory used earlier in this notebook
ts.draw_top_mid_bot(workbook_results / 'images')
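
To make the top/middle/bottom selection concrete, here is a rough sketch of how such a selection could work. It is not the package's code: `pick_top_mid_bot` and its parameters are hypothetical, and it assumes "the last gene to average 100 transcripts per FOV" means the lowest-ranked gene whose total count divided by the number of FOVs is at least 100.

```python
def pick_top_mid_bot(gene_counts, n_fovs, n_top=20, n_mid=10, n_bot=10, min_avg=100):
    """Hypothetical selection of representative genes by transcript count.

    gene_counts maps gene name -> total transcripts; n_fovs is the number of FOVs.
    """
    ranked = sorted(gene_counts, key=gene_counts.get, reverse=True)
    top = ranked[:n_top]
    bot = ranked[-n_bot:] if n_bot else []

    # Index of the last gene (in descending order) averaging >= min_avg per FOV.
    above = [i for i, g in enumerate(ranked) if gene_counts[g] / n_fovs >= min_avg]
    start = above[-1] if above else 0
    mid = ranked[start:start + n_mid]
    return top, mid, bot


# Toy data: 50 genes with strictly decreasing counts across 250 FOVs.
counts = {f"gene{i}": 1000 * (50 - i) for i in range(50)}
top, mid, bot = pick_top_mid_bot(counts, n_fovs=250)
print(top[0], top[-1])  # gene0 gene19
print(mid[0], mid[-1])  # gene25 gene34
print(bot[0], bot[-1])  # gene40 gene49
```

With small gene panels the three groups can overlap in this sketch; how the real function handles that case is not covered here.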

### Dropped Gene Analysis Functions

`DropoutResult.get_dropped_genes(self, fov=-1, dic=False)`

    Get a list of the dropped genes. If an FOV is specified, gets the list of dropped genes for specified FOV. If dic=True, creates a dictionary of FOVs and dropped genes

    Args:
        fov (int) [default=-1]: If specified will return the dropped genes for the specified FOV
        dic (bool) [default=False]: If True, will return a dictionary of FOVs and dropped genes

`DropoutResult.get_dropped_gene_counts(self, fov=-1, dic=False)`

    Get the number of dropped genes. If an FOV is specified, gets the number of dropped genes for specified FOV. If dic=True, creates a dictionary of FOVs and dropped gene counts

    Args:
        fov (int) [default=-1]: If specified will return the number of dropped genes for the specified FOV
        dic (bool) [default=False]: If True, will return a dictionary of FOVs and dropped gene counts
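
The `fov`/`dic` argument pattern shared by these functions can be sketched on toy data. The functions below are hypothetical re-implementations for illustration only: the data layout, the ordering of the returned lists, and the choice to let `dic=True` take precedence over `fov` are all assumptions.

```python
from collections import defaultdict

# Toy data: (fov, gene) pairs flagged as dropped. Values are illustrative only.
DROPPED = [(3, "Sv2b"), (3, "Ccn3"), (7, "Sv2b"), (9, "Slc17a7")]


def get_dropped_genes_sketch(dropped, fov=-1, dic=False):
    """Hypothetical re-implementation of the fov/dic argument pattern."""
    if dic:
        by_fov = defaultdict(list)
        for f, gene in dropped:
            by_fov[f].append(gene)
        return dict(by_fov)
    if fov != -1:
        return [gene for f, gene in dropped if f == fov]
    # No FOV given: unique genes dropped anywhere, in first-seen order.
    return list(dict.fromkeys(gene for _, gene in dropped))


def get_dropped_gene_counts_sketch(dropped, fov=-1, dic=False):
    """Counts version: same arguments, lengths instead of lists."""
    result = get_dropped_genes_sketch(dropped, fov=fov, dic=dic)
    if dic:
        return {f: len(genes) for f, genes in result.items()}
    return len(result)


print(get_dropped_genes_sketch(DROPPED))                  # ['Sv2b', 'Ccn3', 'Slc17a7']
print(get_dropped_genes_sketch(DROPPED, fov=3))           # ['Sv2b', 'Ccn3']
print(get_dropped_gene_counts_sketch(DROPPED, dic=True))  # {3: 2, 7: 1, 9: 1}
```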

`DropoutResult.get_considered_genes(self, fov=-1, dic=False)`

    Get a list of all genes with at least 1 FOV considered for dropout. An FOV is considered for dropout only if its 4 cardinal neighbors average at least 100 transcripts. 
    If an FOV is specified, gets list for the specified FOV
    If dic is set to True, returns a dictionary of considered genes for each considered fov

    Args:
        fov (int) [default=-1]: If specified will return the list of considered genes for the specified FOV
        dic (bool) [default=False]: If set to True, returns a dictionary of considered fovs and considered genes 
        
`DropoutResult.get_considered_gene_counts(self, fov=-1, dic=False)`

    Get the number of genes with at least 1 FOV considered for dropout. An FOV is considered for dropout only if its 4 cardinal neighbors average at least 100 transcripts.
    If an FOV is specified, get the number of genes for which the FOV was considered
    If dic is set to True, returns a dictionary of the number of considered genes for each considered fov

    Args:
        fov (int) [default=-1]: If specified will return the number of considered genes for the specified FOV
        dic (bool) [default=False]: If set to True, returns a dictionary of considered fovs and number of considered genes 
        
`DropoutResult.get_false_positive_genes(self, fov=-1, dic=False)`

    Get a list of all genes which had an FOV which was determined not dropped due to False Positive Correction. If an FOV is specified, get a list of all false positive genes for that FOV. If dic=True, return a dictionary of fovs and their false positive genes.

    Args:
        fov (int) [default=-1]: If specified, return false positive genes for a specific FOV
        dic (bool) [default=False]: If True, return a dictionary of FOVs and their false positive genes

`DropoutResult.get_false_positive_gene_counts(self, fov=-1, dic=False)`

    Get the number of genes which had an FOV that was determined not dropped due to False Positive Correction. If an FOV is specified, get the number of false positive genes for that FOV. If dic=True, return a dictionary of FOVs and their false positive gene counts

    Args:
        fov (int) [default=-1]: If specified, return the number of false positive genes for a specific FOV
        dic (bool) [default=False]: If True, return a dictionary of FOVs and their false positive gene counts


### Dropped FOV Analysis Functions

`DropoutResult.get_dropped_fovs(self, gene='', dic=False)`

    Get a list of dropped FOVs. If a gene is specified, gets a list of dropped FOVs for specified gene. If dic=True return a dictionary of genes and dropped FOVs

    Args:
        gene (str) [default='']: If specified will return the dropped FOVs for the specified gene
        dic (bool) [default=False]: If True, will return a dictionary of genes and their dropped FOVs
        
`DropoutResult.get_dropped_fov_counts(self, gene='', dic=False)`

    Get the number of unique dropped FOVs. If a gene is specified, gets the number of dropped FOVs for specified gene. If dic=True return a dictionary of genes and dropped FOV counts

    Args:
        gene (str) [default='']: If specified will return the dropped FOV count for the specified gene
        dic (bool) [default=False]: If True, will return a dictionary of genes and their dropped FOV counts

`DropoutResult.get_considered_fovs(self)`

    Get a list of all on-tissue FOVs

`DropoutResult.get_considered_fov_counts(self)`

    Get the number of on-tissue FOVs

`DropoutResult.get_false_positive_fovs(self, gene='', dic=False)`

    Get a list of all FOVs which were not considered dropped due to False Positive Correction. If a gene is specified, return the false positive FOVs for that gene. If dic=True, return a dictionary of false positive FOVs for each gene

    Args:
        gene (str) [default='']: If specified, return false positive FOVs for that gene
        dic (bool) [default=False]: If True, return a dictionary of genes and their false positive FOVs

`DropoutResult.get_false_positive_fov_counts(self, gene='', dic=False)`

    Get the number of FOVs which were not considered dropped due to False Positive Correction. If a gene is specified, return the number of false positive FOVs for that gene. If dic=True, return a dictionary of false positive FOV counts for each gene

    Args:
        gene (str) [default='']: If specified, return the number of false positive FOVs for that gene
        dic (bool) [default=False]: If True, return a dictionary of genes and their false positive FOV counts

### Miscellaneous Analysis Functions

`DropoutResult.get_dropout_count(self)`

    Get the total number of dropped FOVs.
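
The difference between a "total" and a "unique" count of dropped FOVs can be shown on toy data. This is only one possible reading of the descriptions above, and it is an assumption: here the total counts every dropped (FOV, gene) pair, while the unique count (as described for `get_dropped_fov_counts()`) counts each FOV once.

```python
# Toy data: (fov, gene) pairs flagged as dropped. Values are illustrative only.
dropped_pairs = [(3, "Sv2b"), (3, "Ccn3"), (7, "Sv2b"), (9, "Slc17a7")]

# Assumed reading of get_dropout_count(): every dropped (FOV, gene) pair counts.
total_dropouts = len(dropped_pairs)

# Assumed reading of get_dropped_fov_counts(): each FOV counts once.
unique_dropped_fovs = len({fov for fov, _ in dropped_pairs})

print(total_dropouts)       # 4
print(unique_dropped_fovs)  # 3
```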