# PathFind scripts

***

## Introduction

A series of scripts was developed to give users access to the results of the Sanger Pathogen Informatics analysis pipelines. These are referred to as the **pathfind** or **pf** scripts.


| PathFind (pf) script | Description                                                                                     |
| :-:                  | ---                                                                                             |
| **status**           | returns the pipeline status (Running, Done or Failed) for each of the lanes                     |
| **data**             | finds the location on disk of sequence data tracked by the Pathogen Informatics pipelines       |
| **info**             | returns metadata allowing you to match up internal sample identifiers with supplier identifiers |
| **accession**        | returns the sample and raw data accession numbers for the sequence data                         |
| **qc**               | returns the location of the Kraken report(s) generated by the QC pipeline                       |
| **map**              | returns the location of the BAM files produced by the mapping pipeline                          |
| **snp**              | returns the location of the VCF files produced by the SNP calling pipeline                      |
| **assembly**         | returns the location of the contig FASTA files produced by the assembly pipeline                |
| **annotation**       | returns the location of the GFF files produced by the annotation pipeline                       |
| **rnaseq**           | returns the location of the expression count files produced by the RNA-Seq pipeline             |
| **ref**              | finds the location of reference files on disk                                                   |

The source code for the Perl module behind the pf scripts is available from the [sanger-pathogens GitHub organisation](https://github.com/sanger-pathogens) as [Bio-Path-Find](https://github.com/sanger-pathogens/Bio-Path-Find).

## Learning outcomes

By the end of this tutorial you can expect to be able to:

  * Find the pipeline status for your lane(s) using the pf scripts
  * Find the data for your lane(s) using the pf scripts
  * Find the quality control (QC) results for your lane(s) using the pf scripts
  * Find the analysis pipeline results for your lane(s) using the pf scripts
  * Find a reference using the pf scripts

## Tutorial sections

  * [Introducing the pf scripts](intro.ipynb)
  * [Pipeline status](status.ipynb)
  * [Finding your data](data.ipynb)
  * [Sample information and accessions](metadata.ipynb)
  * [Quality control](qc.ipynb)
  * [Analysis pipeline results](pipeline-results.ipynb)
  * [Finding a reference](reference.ipynb)

## Authors

This tutorial was created by [Victoria Offord](https://github.com/vaofford).

## Running the commands from this tutorial

You can run the commands in this tutorial either directly from the Jupyter notebook (if using Jupyter), or by typing the commands in your terminal window.

### Running commands on Jupyter

If you are using Jupyter, command cells (like the ones below) can be run by selecting the cell and either clicking _Cell -> Run_ in the menu above or pressing _Ctrl+Enter_. Let's give this a try by printing our working directory with the `pwd` command and then listing the files within it with `ls -l`. Run the commands in the two cells below.

In [None]:
pwd

In [None]:
ls -l

### Running commands in the terminal

You can also follow this tutorial by typing all the commands you see into a terminal window. This is similar to the "Command Prompt" window on MS Windows systems, which allows the user to type DOS commands to manage files.

To get started, select the cell below with the mouse and then either press _Ctrl+Enter_ or choose _Cell -> Run_ in the menu at the top of the page.

In [None]:
echo cd $PWD

Open a new terminal on your computer, type the command printed by the previous cell and press Enter. The command will look similar to this:

In [None]:
cd /home/manager/pathogen-informatics-training/Notebooks/RNA-Seq/

Now you can follow the instructions in the tutorial from here.

## Prerequisites

This tutorial assumes that you have [Bio-Path-Find](https://github.com/sanger-pathogens/Bio-Path-Find) installed on your computer. For installation instructions, please see the Bio-Path-Find repository linked above.

To check that you have installed Bio-Path-Find correctly, you can run the following command:

In [None]:
pf -h

This should return the help message for the PathFind (pf) scripts.
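
If the help message does not appear, it can be useful to check whether the `pf` command is visible to your shell at all. The snippet below is a minimal sketch, assuming a POSIX-compatible shell such as bash; `check_command` is a hypothetical helper name used only for illustration and is not part of Bio-Path-Find.

```shell
# check_command is a hypothetical helper (not part of Bio-Path-Find).
# It reports whether the named command can be found on the PATH.
check_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check for pf and print a hint if it cannot be found.
check_command pf || echo "pf is not on your PATH; revisit the installation instructions"
```

If `pf` is reported as missing, the installation has probably not completed or the directory containing `pf` has not been added to your `PATH`.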

## Let's get started!

To get started with the tutorial, head to the first section: [Introducing the pf scripts](intro.ipynb).