# flairOutOfTheBox.ipynb
## Marcus Viscardi, February 28, 2022
Working directory: /data16/marcus/working/220228_flairOutOfTheBox

***From 220228_readme.txt:***
*Going to try to run flair the way that it was built to be used. So it will*
*ID transcripts and then assess differential expression on its own. Hopefully*
*this will help to resolve the issues I have been facing with it being unable*
*to integrate well w/ the rest of my pipeline!*
[...]
*The goal of this script will be to write up the manifest file and consistently call the*
*FLAIR script(s). Hopefully this will make troubleshooting a bit easier!*

### reads_manifest.tsv production:
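This file should follow the format below: four tab-separated columns (sample ID, condition, batch, path to the reads file). FLAIR uses it to pull in reads from each library and to decide how samples are grouped into conditions and batches when comparing technical/biological replicates.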

***From [FLAIR github](https://github.com/BrooksLabUCSC/flair#flair-quantify)***
*sample1	conditionA	batch1	./sample1_reads.fq*
*sample2	conditionA	batch1	./sample2_reads.fq*
*sample3	conditionA	batch2	./sample3_reads.fq*
*sample4	conditionB	batch1	./sample4_reads.fq*
*sample5	conditionB	batch1	./sample5_reads.fq*
*sample6	conditionB	batch2	./sample6_reads.fq*
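A quick sanity check on the manifest can catch formatting slips before FLAIR runs. The sketch below is not part of FLAIR; `check_manifest_lines` is a hypothetical helper. It assumes the four-column, tab-separated layout shown above. It also flags underscores in the first three fields. That second check is an assumption: the counts matrix columns used later in this notebook (e.g. `sample2_totalRNA_batch1`) look like sample, condition and batch joined with underscores, so an underscore inside a field would make those names ambiguous.

```python
def check_manifest_lines(lines):
    """Return (line_number, problem) tuples for lines of a reads manifest."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            problems.append(
                (number, f"expected 4 tab-separated fields, found {len(fields)}")
            )
            continue
        # Assumption: underscores in sample/condition/batch would clash with
        # column names of the form sample_condition_batch.
        for name, value in zip(("sample", "condition", "batch"), fields[:3]):
            if "_" in value:
                problems.append((number, f"underscore in {name} field: {value!r}"))
    return problems


example = [
    "sample1\tconditionA\tbatch1\t./sample1_reads.fq\n",
    "sample2\tcondition_B\tbatch1\t./sample2_reads.fq\n",  # underscore in condition
    "sample3 conditionA batch2 ./sample3_reads.fq\n",      # spaces instead of tabs
]
problems = check_manifest_lines(example)
for number, problem in problems:
    print(f"line {number}: {problem}")
```

The path column is deliberately not checked for underscores, since file names such as `sample1_reads.fq` contain them in the FLAIR example itself.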

In [None]:
from nanoporePipelineCommon import pick_libs_return_paths_dict, get_dt
libs = ["polyA1", "polyA2", "polyA3", "totalRNA1", "totalRNA2", "totalRNA3"]
fastq_dict = pick_libs_return_paths_dict(libs,
                                         output_dir_folder="cat_files",
                                         file_midfix="cat",
                                         file_suffix=".fastq")

manifest_path = f"/data16/marcus/working/220228_flairOutOfTheBox/{get_dt(for_file=True)}__generated_reads_manifest.tsv"

with open(manifest_path, "w") as manifest:
    for lib, path in fastq_dict.items():
        sample_id = f"set{lib[-1:]}"
        condition_id = f"{lib[:-1]}"
        manifest_line = f"{sample_id}\t{condition_id}\tbatch1\t{path}\n"
        manifest.write(manifest_line)
print(f"\nDone. Manifest generated at: {manifest_path}")
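The cell above derives the sample and condition IDs by slicing off the last character of each library name, which works for single-digit replicates like `polyA1`. As a minimal sketch, assuming every library name is a condition followed by a replicate number, a regular expression handles multi-digit replicates too. `split_library_name` is a hypothetical helper, not something from `nanoporePipelineCommon`.

```python
import re


def split_library_name(lib):
    """Split e.g. 'totalRNA2' into ('set2', 'totalRNA')."""
    # Non-greedy condition prefix, then all trailing digits as the replicate.
    match = re.fullmatch(r"(.+?)(\d+)", lib)
    if match is None:
        raise ValueError(f"library name has no trailing replicate number: {lib!r}")
    condition_id, replicate = match.groups()
    return f"set{replicate}", condition_id


for lib in ["polyA1", "totalRNA3", "polyA10"]:
    print(lib, "->", split_library_name(lib))
```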

### Run FLAIR [*quantify*](https://github.com/BrooksLabUCSC/flair#quantify)

In [None]:
import subprocess
import os
os.chdir("/data16/marcus/working/220228_flairOutOfTheBox/")

threads = 30

flair_call = f"python3 /data16/marcus/scripts/brooksLabUCSC_flair/flair.py quantify -t {threads} " \
             f"--generate_map -r {manifest_path} " \
             f"-i /data16/marcus/genomes/elegansRelease100/Caenorhabditis_elegans.WBcel235.cdna.all.fa"
subprocess.call(flair_call, shell=True)
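An alternative to one long `shell=True` string is to build the call as an argument list, which avoids quoting problems if a path ever contains spaces. The sketch below uses only the flags already present in the cell above. `build_quantify_command` is a hypothetical helper, and the manifest name passed to it is a placeholder, not the generated file.

```python
import shlex


def build_quantify_command(flair_script, manifest, isoforms_fasta, threads=30):
    """Assemble the `flair.py quantify` call as a list of arguments."""
    return [
        "python3", str(flair_script), "quantify",
        "-t", str(threads),
        "--generate_map",
        "-r", str(manifest),
        "-i", str(isoforms_fasta),
    ]


command = build_quantify_command(
    "/data16/marcus/scripts/brooksLabUCSC_flair/flair.py",
    "reads_manifest.tsv",  # placeholder; the notebook passes manifest_path here
    "/data16/marcus/genomes/elegansRelease100/Caenorhabditis_elegans.WBcel235.cdna.all.fa",
)
# Show the shell-quoted command for inspection (shlex.join needs Python 3.8+).
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

With `check=True`, `subprocess.run` raises `CalledProcessError` on a non-zero exit code, so a failed FLAIR run does not pass silently.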

### Run FLAIR [*diffExp*](https://github.com/BrooksLabUCSC/flair#flair-diffexp)

In [None]:
# subprocess.call("rm -r ./output_dir", shell=True)

# THIS CRASHES THE NOTEBOOK, run the call in the terminal

# print(f"Removed.")
# flair_call = f"python3 /data16/marcus/scripts/brooksLabUCSC_flair/flair.py diffExp -q ./counts_matrix.tsv -o /data16/marcus/working/220228_flairOutOfTheBox/output_dir"
# subprocess.call(flair_call, shell=True)
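Since the diffExp call has to run in a terminal, one option is to have the notebook write a small shell script that can then be launched by hand. This is only a sketch: `write_diffexp_script` is a hypothetical helper, the flags are the ones from the commented-out call above, and the example writes to a temporary directory rather than the working directory.

```python
import shlex
import tempfile
from pathlib import Path


def write_diffexp_script(script_path, flair_script, counts_matrix, output_dir):
    """Write a bash script containing the `flair.py diffExp` call; return its text."""
    command = ["python3", flair_script, "diffExp", "-q", counts_matrix, "-o", output_dir]
    text = "#!/usr/bin/env bash\nset -euo pipefail\n" + shlex.join(command) + "\n"
    Path(script_path).write_text(text)
    return text


with tempfile.TemporaryDirectory() as tmp_dir:
    script_file = Path(tmp_dir) / "run_diffexp.sh"
    script_text = write_diffexp_script(
        script_file,
        "/data16/marcus/scripts/brooksLabUCSC_flair/flair.py",
        "./counts_matrix.tsv",
        "/data16/marcus/working/220228_flairOutOfTheBox/output_dir",
    )
    written_text = script_file.read_text()
print(script_text)
```

The script can then be run from the working directory with `bash run_diffexp.sh`.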

### Run FLAIR [*diff_iso_usage.py*](https://github.com/BrooksLabUCSC/flair#diffisoscript)

In [None]:

flair_call = f"python3 /data16/marcus/scripts/brooksLabUCSC_flair/bin/diff_iso_usage.py " \
             f"counts_matrix.tsv sample2_totalRNA_batch1 sample2_polyA_batch1 /data16/marcus/working/220228_flairOutOfTheBox/diff_iso_attempt.txt"
subprocess.call(flair_call, shell=True)


### Let's look at counts_matrix.tsv

In [None]:
import pandas as pd
pd.set_option('display.width', 150)
pd.set_option('display.max_columns', None)

counts_matrix = pd.read_table("./counts_matrix.tsv")
counts_matrix.sort_values("sample2_totalRNA_batch1", ascending=False)
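To compare conditions rather than single libraries, the per-sample columns can be collapsed by condition. The sketch below runs on made-up toy counts, not real output. It assumes that the first column holds the isoform IDs (the name `ids` here is a guess) and that every other column is named `sample_condition_batch`, as in `sample2_totalRNA_batch1`. `sum_by_condition` is a hypothetical helper.

```python
import pandas as pd

# Toy stand-in for counts_matrix.tsv; the values are invented for illustration.
toy_counts = pd.DataFrame({
    "ids": ["isoformA", "isoformB"],
    "sample1_polyA_batch1": [10.0, 0.0],
    "sample2_polyA_batch1": [14.0, 2.0],
    "sample1_totalRNA_batch1": [3.0, 5.0],
    "sample2_totalRNA_batch1": [1.0, 7.0],
})


def sum_by_condition(counts):
    """Sum counts across all samples that share a condition."""
    data = counts.set_index(counts.columns[0])
    # The condition is the middle piece of sample_condition_batch.
    conditions = pd.Index(
        [name.split("_")[1] for name in data.columns], name="condition"
    )
    # Group the transposed table by condition, then flip it back.
    return data.T.groupby(conditions).sum().T


per_condition = sum_by_condition(toy_counts)
print(per_condition)
```

Summing raw counts ignores differences in library depth, so this is only a first look, not a substitute for the diffExp step.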

In [None]:
import rpy2
rpy2.__version__