# Running the MerFISH pipeline

This notebook describes how to run the MerFISH pipeline from a notebook. Documentation for the pipeline is available at http://imaxt.ast.cam.ac.uk/docs/merfish/docs/index.html

## Copying data

To copy data to the server, use SFTP. The connection details are:

 * Protocol: SFTP
 * URI: imaxt.ast.cam.ac.uk
 * Port: 2222
 * Authentication: Use your archive username and password
 * Path where to store data: /storage/*username*

On Linux the `lftp` command is very useful. For example, to transfer the contents of the directory `merfish_sample_001` using 10 parallel transfers:

```bash
lftp -e 'mirror -R --parallel=10 merfish_sample_001; quit;' \
sftp://username@imaxt.ast.cam.ac.uk:2222/storage/username
```

where *username* is your archive username.
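The same transfer can be launched from Python, for example from a notebook cell. The sketch below only assembles the `lftp` command shown above from the connection details; the helper name `build_lftp_command` and its arguments are illustrative and not part of any pipeline API.

```python
import shlex


def build_lftp_command(username, directory, parallel=10,
                       host="imaxt.ast.cam.ac.uk", port=2222):
    """Return the lftp argument list that uploads `directory` to the server.

    Hypothetical helper: it only formats the command from this section.
    Directory names containing spaces or quotes would need extra quoting.
    """
    # 'mirror -R' is a reverse mirror (upload); '--parallel' sets the
    # number of files transferred at the same time.
    script = f"mirror -R --parallel={parallel} {directory}; quit;"
    url = f"sftp://{username}@{host}:{port}/storage/{username}"
    return ["lftp", "-e", script, url]


cmd = build_lftp_command("username", "merfish_sample_001")
print(shlex.join(cmd))  # the command line as it would be typed in a shell

# To actually run the transfer (requires lftp to be installed):
#     import subprocess
#     subprocess.run(cmd, check=True)
```

Passing the command as a list to `subprocess.run` avoids the shell, so no line continuation or shell quoting is involved.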

<div class="alert alert-block alert-warning">
    <b>N.B.:</b> The location of the data on the server is /data/meds1_a/jimaxt/<em>username</em>
</div>

## Writing a pipeline definition file

Below is a default configuration file. Edit it to point to the location of the data and to supply the additional information the pipeline needs. The main parameters are:

 * ``data_description``: this structure contains the location and characteristics of the data. Most of the keywords here need to be modified for each experiment. Eventually this information will be in the metadata.
 * ``output_dir``: location of the output of the analysis. This should be a subdirectory of ``/data/meds1_a/processed/merfish``.
 * ``decoding.bead_planes``: a list of the z offsets used to determine the offsets between cycles. These should be the planes with the highest concentration of beads. The first plane is z=0, so for example ``[0, 1, 2, 3]`` uses only the first 4 offsets.
 * ``decoding.rna_planes``: as above but for the RNA; these are the offsets that will be decoded.

On the server, a default configuration file can be obtained by typing:

```bash
python -m merfish_pipeline config
```
 
More documentation is available at http://imaxt.ast.cam.ac.uk/docs/merfish/docs/pipedef.html
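Because plane offsets are counted from z=0, an off-by-one in ``bead_planes`` or ``rna_planes`` is easy to make. The following is a minimal sketch of a sanity check that could be run on the ``decoding`` section before submitting. It is not the pipeline's own validation: the function `check_decoding` and its `n_planes` argument (the number of z planes per field) are hypothetical. The dictionary could be obtained, for example, with `yaml.safe_load` from PyYAML, assuming that package is installed.

```python
def check_decoding(decoding, n_planes):
    """Return a list of problems found in the `decoding` section.

    `n_planes` is the number of z planes acquired per field
    (hypothetical argument, supplied by the user).
    """
    problems = []
    for key in ("bead_planes", "rna_planes"):
        planes = decoding.get(key, [])
        if not isinstance(planes, list):
            problems.append(f"{key} must be a list")
            continue
        for z in planes:
            # Planes are counted from z=0, so valid values are 0 .. n_planes-1.
            is_index = isinstance(z, int) and not isinstance(z, bool)
            if not is_index or not 0 <= z < n_planes:
                problems.append(
                    f"{key}: {z!r} is not a plane index in 0..{n_planes - 1}"
                )
    return problems


# Use the first 4 planes for the beads; plane 12 does not exist in a
# stack of 10 planes, so it is reported.
decoding = {"bead_planes": list(range(4)), "rna_planes": [4, 5, 6, 12]}
print(check_decoding(decoding, n_planes=10))
```

An empty list means no problems were found. Empty plane lists, as in the default file, pass the check unchanged.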

Below is a ready-to-use pipeline definition file. Modify the parameters and run the cell to create a file called ``merfish.yaml`` in the current directory.

In [None]:
%%writefile merfish.yaml

version: 1

extra_pip_packages: merfish-pipeline
name: merfish

# Data description file
data_description: 
  path: /data/meds1_b/imaxt_data/merfish/20190530/2019_05_30_tumourtissue4t1_acry_sds2d-cleared
  name: 2019_05_30_tumourtissue4t1  # Unique name to identify the sample
  raw: raw_data                     # Directory containing the Tiff images

  stagepos: matlab_processing/stagePos.csv                               # Stage pos file
  codebook: matlab_processing/C1E1_codebook.csv                          # Codebook file
  data_organization: reformatted_raw_for_matlab/data_organization.xlsx   # Data organization file

# Directory where to write data
output_dir: /data/meds1_b/imaxt/merfish/output2

# Configuration for doing the mosaic
mosaic:
  compute_offsets: True
  reference_channel: bit=0
  store_arrays: False

# Configuration for decoding
decoding:
  decode: True  # Run decoding
  fovs: []      # List of fields to decode. Empty for all.
  bead_planes: []     # List of bead offsets to use
  rna_planes: []       # List of RNA offsets to use
  threshold: 3  # Extraction threshold
  minarea: 5    # Minimum area of features to extract

resources:
  workers: 80    # number of workers
    
comments: >
  These are comments that can be added by the user.
  There can be any number of lines indented.

  Paragraphs are separated by an empty line

## Submitting the pipeline

Prior to submitting a pipeline you need to be authenticated. To do so run

```bash
owl api login
```

from a terminal. This only needs to be done once.

Pipelines can then be submitted with:

In [None]:
!owl pipeline submit --conf merfish.yaml
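If the submission is scripted rather than typed into a cell, it can help to confirm that the definition file exists first. The sketch below wraps the same `owl pipeline submit --conf` command shown above; the helper name `submit_command` is illustrative and not part of `owl`.

```python
from pathlib import Path


def submit_command(conf="merfish.yaml"):
    """Return the argument list for `owl pipeline submit`.

    Hypothetical helper: raises FileNotFoundError if the pipeline
    definition file is missing, before anything is submitted.
    """
    path = Path(conf)
    if not path.is_file():
        raise FileNotFoundError(f"pipeline definition file not found: {path}")
    return ["owl", "pipeline", "submit", "--conf", str(path)]


# To submit for real (requires `owl` on PATH and a prior `owl api login`):
#     import subprocess
#     subprocess.run(submit_command("merfish.yaml"), check=True)
```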

## Tracking progress

You can track progress from the terminal with the command:

```bash
owl pipeline status jobid
```

Here *jobid* is the identifier of the submitted pipeline job. Progress can also be followed in a browser on the archive web page at: https://imaxt.ast.cam.ac.uk/archive/owl


<p style='text-align: right;'> Last Updated: 2018-08-19</p>
