# 1. Jupyter notebooks

This is a Jupyter notebook!

It can run bash commands:

In [None]:
ls -1

In [None]:
mkdir this_is_a_folder

In [None]:
rmdir this_is_a_folder

It has conda installed:

In [None]:
conda info

It also has mamba:

In [None]:
mamba info

And Snakemake:

In [None]:
snakemake --help

And git:

In [None]:
git status
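The individual checks above can be condensed into a single loop. This is only a minimal sketch: `missing_tools` is a helper name made up for this notebook, not something provided by any of the tools.

```shell
# Hypothetical helper: print every command name that is NOT on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        # `command -v` succeeds only if the shell can resolve the name.
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Check the four tools used in this notebook; prints nothing if all are found.
missing_tools conda mamba snakemake git
```

An empty output means everything needed for the rest of the notebook is available.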

Let's download the pipeline used for the laser microdissection work package:

In [None]:
git clone https://github.com/3d-omics/bioinfo_detection_limit_test

Move into the folder

In [None]:
cd bioinfo_detection_limit_test

# 2. The laser microdissection pipeline

This is the pipeline used in WP4 for microdissection testing:

![pipeline_complex](assets/img/rulegraph_complex.svg)

Simplified:

![rulegraph](assets/img/rulegraph.svg)

In short:
- Reads are trimmed with `fastp` (adapters, short reads and low-quality reads are removed).
- Reads are mapped with `bowtie2`, first to the hosts (human + chicken), and then the unmapped remainder to the MAGs.
- Count tables are computed with `coverm`.
- Reference-free rarefaction curves are computed with `nonpareil`.
- Every read is taxonomically assigned with `kraken2`.

Make sure you are inside the repository (skip this cell if you already ran the `cd` above, since running it twice fails):

In [None]:
cd bioinfo_detection_limit_test

# 3. Tell Snakemake to download the necessary tools (5 min)

- `--use-conda`: use the conda package manager to download and manage software.
- `--conda-frontend mamba`: use mamba instead of conda, which is faster.
- `--conda-create-envs-only`: just download the software. Do not execute the pipeline.
- `--jobs 1`: use only one CPU.

In [None]:
snakemake \
    --use-conda \
    --conda-frontend mamba \
    --conda-create-envs-only \
    --jobs 1
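To confirm that the environments were created, one can list them. The sketch below assumes Snakemake's default location for conda environments, `.snakemake/conda` inside the working directory (it will be somewhere else if `--conda-prefix` was set). `list_conda_envs` is a hypothetical helper, not a Snakemake command.

```shell
# Hypothetical helper: list the environment directories created by Snakemake.
# Assumes the default prefix `.snakemake/conda`; pass another path if needed.
list_conda_envs() {
    local prefix="${1:-.snakemake/conda}"
    if [ ! -d "$prefix" ]; then
        echo "No environments found in $prefix" >&2
        return 1
    fi
    # One directory per environment; the names are hashes chosen by Snakemake.
    find "$prefix" -mindepth 1 -maxdepth 1 -type d | sort
}

# Do not abort the cell if the folder does not exist yet.
list_conda_envs || true
```

If nothing is listed, the download step probably did not finish and is worth re-running.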

# 4. Repository contents

There should be three folders (check the left panel or run `ls -1`):
- `config`: information about the reads, references and parameters.
- `resources`: where to put the input databases, FASTQ files, references, etc.
- `workflow`: the code. Enter at your own risk.

# 5. Look at the configuration files

## 5.1 Samples
- Sample names
- Library ID
- Library type
- Paths to forward and reverse FASTQ
- Adapters

In [None]:
cat config/samples.tsv
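A quick way to see how many libraries the pipeline will process is to count the data rows of the table. The sketch assumes a tab-separated file whose first line is a header and in which lines starting with `#` are comments; `count_samples` is a hypothetical helper.

```shell
# Hypothetical helper: count the data rows of a samples table.
count_samples() {
    # Skip the header (line 1), commented-out lines and blank lines.
    awk 'NR > 1 && !/^#/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Only runs when the file is present in the current folder.
if [ -f config/samples.tsv ]; then
    count_samples config/samples.tsv
fi
```

Running it again after editing the table is an easy check that the edit did what was intended.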

## 5.2 Features
  - Hosts: names and where to find them
  - MAG catalogues: names and where to find them
  - Additional databases

In [None]:
cat config/features.yml

## 5.3 Parameters

In [None]:
cat config/params.yml

# 6. Check the pipeline with `snakemake -n -p`

- `-n` means dry-run
- `-p` means print the commands

In [None]:
snakemake -n -p

# 7. Run it!

In [None]:
snakemake --use-conda -j 4

# 8. Results

Trimmed reads, mappings, count tables, etc. are stored in the `results` folder:

In [None]:
tree -L 2 results
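If `tree` is not installed on your machine, `find` gives a similar overview. A minimal sketch, where `show_dir` is a hypothetical helper that uses `tree` when it is available and falls back to `find` otherwise:

```shell
# Hypothetical helper: show a directory down to a given depth (default 2).
show_dir() {
    local dir="$1" depth="${2:-2}"
    if command -v tree > /dev/null 2>&1; then
        tree -L "$depth" "$dir"
    else
        # Plain list of paths, limited to the same depth.
        find "$dir" -maxdepth "$depth" | sort
    fi
}

# Only runs when the pipeline has already produced a results folder.
if [ -d results ]; then show_dir results 2; fi
```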

# Reports

The pipeline generates reports for every step and every sample. They are stored in the `reports` folder.

Jupyter can't render the reports properly, so you will have to download them (right click) and open them in your browser.
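Downloading the reports one by one gets tedious, so it can help to pack the whole folder into a single archive first. A minimal sketch, assuming the reports live in `reports` as described above; `bundle_dir` is a hypothetical helper.

```shell
# Hypothetical helper: pack a folder into a .tar.gz and print the archive name.
bundle_dir() {
    local dir="${1:-reports}"
    local archive="${2:-${dir}.tar.gz}"
    tar -czf "$archive" "$dir" && echo "$archive"
}

# Only runs when the pipeline has already produced a reports folder.
if [ -d reports ]; then bundle_dir reports; fi
```

The resulting `reports.tar.gz` can then be downloaded with a single right click and unpacked locally.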

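# Snakemake is regenerative

After a change to the configuration, Snakemake works out which results are affected and re-runs only those steps. Try the following changes, running the pipeline again after each one:

- Comment out or delete the last sample in `config/samples.tsv`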
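One way to make this edit from a cell is to delete the last line with `sed` while keeping a backup. This is a sketch under two assumptions: the last line of the file is the last sample (no trailing blank line), and `drop_last_line` is a hypothetical helper name.

```shell
# Hypothetical helper: remove the last line of a file, keeping <file>.bak.
drop_last_line() {
    # `-i.bak` edits in place and saves the original as <file>.bak;
    # `$d` deletes the last line.
    sed -i.bak '$d' "$1"
}

# Only runs when the file is present in the current folder.
if [ -f config/samples.tsv ]; then
    drop_last_line config/samples.tsv
    # Show what changed relative to the backup (diff exits non-zero on differences).
    diff config/samples.tsv.bak config/samples.tsv || true
fi
```

To undo the change, move the backup over the edited file: `mv config/samples.tsv.bak config/samples.tsv`.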

In [None]:
snakemake --use-conda --conda-frontend mamba -j 4

- Remove a reference from `config/features.yml`

In [None]:
snakemake --use-conda --conda-frontend mamba -j 4

- Change the minimum required read length to 50 bp (look for it in `config/params.yml`)
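The exact name of the setting is not shown in this notebook, so the first step is to locate it. The sketch below searches the parameters file and offers a hypothetical helper, `set_yaml_scalar`, for simple `key: value` lines. The key `length_required` in the commented-out call is a guess based on the `fastp` option of the same name; check the `grep` output before using it.

```shell
# Hypothetical helper: set `key: value` for a simple scalar key in a YAML file.
# It rewrites every line of the form "<indent>key: <anything>" (including any
# trailing comment) and keeps the original as <file>.bak.
set_yaml_scalar() {
    local file="$1" key="$2" value="$3"
    sed -E -i.bak "s/^([[:space:]]*${key}:[[:space:]]*).*/\1${value}/" "$file"
}

# Only runs when the file is present in the current folder.
if [ -f config/params.yml ]; then
    # Locate the setting first; grep exits non-zero when nothing matches.
    grep -n -i "length" config/params.yml || true
    # Then, with the real key name taken from the output above:
    # set_yaml_scalar config/params.yml length_required 50
fi
```

For anything beyond a single scalar line, editing the file by hand in the Jupyter file browser is the safer option.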

In [None]:
snakemake --use-conda --conda-frontend mamba -j 4

# Play with the other pipelines
- [Host Genomics (GATK)](https://github.com/3d-omics/Bioinfo_Macro_Host_Genomics)
- [Host Transcriptomics](https://github.com/3d-omics/Bioinfo_Macro_Host_Transcriptomics)
- [Metatranscriptomics](https://github.com/3d-omics/Bioinfo_Macro_Microbial_Metatranscriptomics)
- ~~[Metagenome Assembly](https://github.com/3d-omics/Bioinfo_Macro_Genome_Resolved_Metagenomics)~~ This one takes too much time

## 0. Get out of the folder

In [None]:
cd

## 1. Clone and enter the folder

In [None]:
git clone [url_goes_here]
cd [repo_name]

## 2. Edit the config files
- [x] samples.tsv
- [ ] features.yml
- [ ] params.yml

## 3. Download software

In [None]:
snakemake --use-conda --conda-frontend mamba --conda-create-envs-only --jobs 1

## 4. Test

In [None]:
snakemake -n -p

## 5. Run

## 6. Peek into the results

In [None]:
tree -

## 7. Verify the reports