# Tutorial

This tutorial is a Jupyter notebook that illustrates the steps involved in using Tapqir. If you would like to experiment with a live version of the notebook, download it or run it in Google Colab using the links above.

In [1]:
# If you are running this tutorial in Google Colab, do these steps first
# 1) Change runtime to GPU (Runtime -> Change runtime type -> GPU)
# 2) Run this cell to install Tapqir in Google Colab
import sys
IN_COLAB = "google.colab" in sys.modules
if IN_COLAB:
    !pip install git+https://github.com/gelles-brandeis/tapqir.git > install.log

## Data preparation

In this tutorial we will analyze data from an experiment on Rpb1-SNAP$^{549}$ binding to DNA$^{488}$ ([Dynamics of RNA polymerase II and elongation factor Spt4/5 recruitment during activator-dependent transcription](https://www.pnas.org/content/117/51/32348)).

The data were acquired with [Glimpse](https://github.com/gelles-brandeis/Glimpse) and pre-processed with the imscroll program (see [CoSMoS_Analysis](https://github.com/gelles-brandeis/CoSMoS_Analysis/wiki)). To prepare the data for Tapqir, we need to specify the locations of the following files and folders:

* `title` - text describing the experiment
* `header_dir` - name of the folder containing the glimpse and header files
* `ontarget_aoiinfo` - file designating target molecule (DNA) locations in the binder channel
* `offtarget_aoiinfo` - file designating off-target (non-DNA) locations in the binder channel
* `driftlist` - file recording the stage movement that took place during the experiment
* `frame_start` - starting frame (optional)
* `frame_end` - ending frame (optional)
* `ontarget_labels` - on-target labels predicted by another program (e.g., spot-picker) (optional, for comparison)
* `offtarget_labels` - off-target labels predicted by another program (e.g., spot-picker) (optional, for comparison)

In [4]:
IS_TUTORIAL = True
DOWNLOADED = False

In [5]:
%%capture
# Download glimpse files for the tutorial (only once)
if IS_TUTORIAL and not DOWNLOADED:
    !wget http://centaur.biochem.brandeis.edu/Rpb1SNAP549_glimpse.zip
    !unzip Rpb1SNAP549_glimpse.zip && rm Rpb1SNAP549_glimpse.zip
    DOWNLOADED = True

In [6]:
# new empty directory to store data files and analysis results
!mkdir Rpb1SNAP549_analysis

In [7]:
!tapqir config Rpb1SNAP549_analysis

Edit the `options.cfg` file to indicate the locations of glimpse and preprocessing files:

```
[glimpse]
title = Rpb1-SNAP549 binding to DNA488
dir = Rpb1SNAP549_glimpse/garosen00267
ontarget_aoiinfo = Rpb1SNAP549_glimpse/green_DNA_locations.dat
offtarget_aoiinfo = Rpb1SNAP549_glimpse/green_nonDNA_locations.dat
driftlist = Rpb1SNAP549_glimpse/green_driftlist.dat
frame_start = 1
frame_end = 790
ontarget_labels
offtarget_labels
```
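Before running the next step, it can be useful to confirm that the paths entered in `options.cfg` actually exist. The following is a minimal sketch using only the Python standard library. The helper `missing_paths` is hypothetical and not part of Tapqir, and the location of `options.cfg` in the usage lines is an assumption that you may need to adjust.

```python
import configparser
from pathlib import Path

# Keys in the [glimpse] section that should point to existing files or folders.
PATH_KEYS = ("dir", "ontarget_aoiinfo", "offtarget_aoiinfo", "driftlist")


def missing_paths(cfg_text, base_dir="."):
    """Return (key, path) pairs from [glimpse] whose path does not exist.

    Hypothetical helper for illustration; it is not part of Tapqir.
    """
    # allow_no_value lets bare keys such as `ontarget_labels` parse without error.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(cfg_text)
    section = parser["glimpse"]
    missing = []
    for key in PATH_KEYS:
        value = section.get(key)
        if not value:
            continue  # key is absent or left empty, so there is nothing to check
        if not (Path(base_dir) / value).exists():
            missing.append((key, value))
    return missing


# Assumed location of the config file; change it to match your setup.
cfg_file = Path("Rpb1SNAP549_analysis/options.cfg")
if cfg_file.exists():
    for key, value in missing_paths(cfg_file.read_text()):
        print(f"{key}: path not found: {value}")
```

Paths are resolved relative to `base_dir` (the current working directory by default), which matches the relative paths used in the example configuration above.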

In [10]:
!tapqir glimpse Rpb1SNAP549_analysis

INFO - Processing glimpse files ...
100%|████████████████████████████████████████| 790/790 [00:07<00:00, 112.21it/s]
INFO - On-target data: N=331 AOIs, F=790 frames, P=14 pixels, P=14 pixels
INFO - Off-target data: N=526 AOIs, F=790 frames, P=14 pixels, P=14 pixels
INFO - Data is saved in Rpb1SNAP549_analysis/data.tpqr

## Data analysis

We will analyze the data using the time-independent model.

The probability that a target-specific spot is present in a frame, $p(\mathsf{specific})$, is calculated as $p(\theta>0)$.
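As a small worked illustration of this relation, the sketch below assumes that $\theta$ takes integer values $0, 1, \dots, K$, with $\theta = 0$ meaning that no target-specific spot is present, so that $p(\theta>0) = 1 - p(\theta=0)$. The function `p_specific` and the probabilities are made up for illustration and are not Tapqir output.

```python
def p_specific(theta_probs):
    """Return p(theta > 0) given theta_probs[k] = p(theta = k).

    Assumption: index 0 corresponds to "no target-specific spot".
    """
    return sum(theta_probs[1:])


# Hypothetical posterior over theta for a single AOI and frame (K = 2).
theta_probs = [0.2, 0.5, 0.3]
print(f"p(specific) = {p_specific(theta_probs):.2f}")
```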

### Fitting the data to the model

In [1]:
!tapqir --debug fit marginal Rpb1SNAP549_analysis -it 10

INFO - Device - cuda
INFO - Floating precision - torch.float64
INFO - Tapqir version - v1.1.6+496.ge87a229.dirty
INFO - Model - marginal
INFO - Loaded data from Rpb1SNAP549_analysis/data.tpqr
INFO - Step #100. Loaded model params and optimizer state from Rpb1SNAP549_analysis/marginal/v1.1.6
INFO - Optimizer - Adam
INFO - Learning rate - 0.005
INFO - Batch size - 13
INFO - nojit
100%|███████████████████████████████████████████| 10/10 [00:03<00:00,  2.63it/s]
INFO - Device - cpu
INFO - Floating precision - torch.float64
INFO - Parameters were saved in Rpb1SNAP549_analysis/params.tpqr

In [2]:
!tapqir fit cosmos Rpb1SNAP549_analysis -it 100

INFO - Device - cuda
INFO - Floating precision - torch.float64
INFO - Tapqir version - v1.1.6+496.ge87a229.dirty
INFO - Model - cosmos
INFO - Loaded data from Rpb1SNAP549_analysis/data.tpqr
INFO - Optimizer - Adam
INFO - Learning rate - 0.005
INFO - Batch size - 13
INFO - nojit
100%|█████████████████████████████████████████| 100/100 [00:26<00:00,  3.80it/s]
INFO - Device - cpu
INFO - Floating precision - torch.float64
INFO - Parameters were saved in Rpb1SNAP549_analysis/params.tpqr

### View fitting progress in TensorBoard

In [16]:
%load_ext tensorboard

In [None]:
%tensorboard --logdir Rpb1SNAP549_analysis