# SuSiE fine-mapping workflow

## Generate region-level genotype and phenotype data

### Genotype data partition by region

This step is required for fine-mapping with SuSiE.
`TADB_enhanced_cis.bed` lists the TADB extended region for each gene, with columns gene_id, chr, start, end. The complete file is available on [GitHub](https://github.com/cumc/fungen-xqtl-analysis/blob/main/resource/TADB_enhanced_cis.bed). Here we use a trimmed version that contains only the regions analyzed in the MWE, i.e., `protocol_example.protein.enhanced_cis_chr21_chr22.bed`.
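Before partitioning genotypes, it can help to sanity-check the region list. The sketch below is not part of the pipeline: `check_region_list` is a hypothetical helper, and it assumes whitespace-separated columns in the order described above (gene_id, chr, start, end), with start and end in columns 3 and 4. If your file uses a different layout, pass the start and end column indices as the second and third arguments.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): sanity-check a region list.
# Assumed layout: gene_id, chr, start, end (start/end in columns 3 and 4 by default).
check_region_list() {
    local file="$1" start_col="${2:-3}" end_col="${3:-4}"
    awk -v s="$start_col" -v e="$end_col" '
        BEGIN { bad = 0 }
        /^#/ || NF == 0 { next }    # skip header/comment lines and blank lines
        NF != 4 { printf "line %d: expected 4 columns, found %d\n", NR, NF; bad = 1; next }
        $s !~ /^[0-9]+$/ || $e !~ /^[0-9]+$/ { printf "line %d: non-integer coordinate\n", NR; bad = 1; next }
        $s + 0 >= $e + 0 { printf "line %d: start is not smaller than end\n", NR; bad = 1 }
        END { exit bad }            # non-zero exit status if any row failed
    ' "$file"
}

# Illustrative rows only (made-up genes and coordinates); the second row is deliberately invalid.
demo=$(mktemp)
printf 'gene_A\tchr21\t1000\t5000\ngene_B\tchr22\t9000\t2000\n' > "$demo"
check_region_list "$demo" || echo "region list needs attention"
rm -f "$demo"
```

On the real data, the call would be `check_region_list protocol_example/protocol_example.protein.enhanced_cis_chr21_chr22.bed`.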

In [6]:
sos run pipeline/genotype_formatting.ipynb genotype_by_region \
    --region-list protocol_example/protocol_example.protein.enhanced_cis_chr21_chr22.bed \
    --genoFile protocol_example/protocol_example.genotype.chr21_22.bed \
    --cwd output \
    --container oras://ghcr.io/cumc/bioinfo_apptainer:latest

INFO: Running genotype_by_region_1: 
INFO: genotype_by_region_1 (index=1) is completed.
INFO: genotype_by_region_1 (index=4) is completed.
INFO: genotype_by_region_1 (index=3) is completed.
INFO: genotype_by_region_1 (index=5) is completed.
INFO: genotype_by_region_1 (index=0) is completed.
INFO: genotype_by_region_1 (index=2) is completed.
INFO: genotype_by_region_1 (index=6) is completed.
INFO: genotype_by_region_1 (index=7) is completed.
INFO: genotype_by_region_1 (index=8) is completed.
INFO: genotype_by_region_1 (index=9) is completed.
INFO: genotype_by_region_1 (index=10) is completed.
INFO: genotype_by_region_1 (index=14) is completed.
INFO: genotype_by_region_1 (index=13) is completed.
INFO: genotype_by_region_1 (index=11) is
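The log above reports one `completed` line per substep, and the lines arrive out of order, so it is easy to miss a gap. As a rough check, the sketch below counts completion lines in a saved copy of the log. `count_completed` is a hypothetical helper; it assumes the log was written to a file and may still contain ANSI color sequences. Whether each substep index corresponds to exactly one row of the region list is also an assumption to verify against your own run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count "is completed." lines in a saved SoS log file.
count_completed() {
    local esc
    esc=$(printf '\033')
    # Strip ANSI color sequences first, then count completion lines.
    sed "s/${esc}\[[0-9;]*m//g" "$1" | grep -c 'is completed\.$'
}

# Illustrative log written to a temporary file, mimicking the lines shown above.
log=$(mktemp)
printf 'INFO: Running \033[32mgenotype_by_region_1\033[0m: \n' > "$log"
printf 'INFO: \033[32mgenotype_by_region_1\033[0m (index=0) is \033[32mcompleted\033[0m.\n' >> "$log"
count_completed "$log"
rm -f "$log"
```

Comparing this count with the number of data rows in the region list gives a quick indication of whether every substep finished.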

## Fine-mapping using individual-level data

In [None]:
sos run pipeline/cis_workhorse.ipynb susie_twas \
    --name protocol_example_protein \
    --genoFile output/protocol_example.protein.enhanced_cis_chr21_chr22_genotype_by_region/protocol_example.genotype.chr21_22.genotype_by_region_files.txt \
    --phenoFile output/phenotype/protocol_example.protein.region_list.txt \
                output/phenotype/protocol_example.protein.region_list.txt \
    --covFile output/covariate/protocol_example.protein.protocol_example.samples.protocol_example.genotype.chr21_22.pQTL.plink_qc.prune.pca.Marchenko_PC.gz \
              output/covariate/protocol_example.protein.protocol_example.samples.protocol_example.genotype.chr21_22.pQTL.plink_qc.prune.pca.Marchenko_PC.gz \
    --phenotype-names trait_A trait_B \
    --no-indel \
    --container oras://ghcr.io/cumc/pecotmr_apptainer:latest
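The command above passes two entries each to `--phenoFile`, `--covFile`, and `--phenotype-names`, which suggests the three lists are paired by position. Under that assumption, a pre-flight check that the lists have equal length can catch a dropped or duplicated argument before a long run. `lists_match` is a hypothetical helper, and the trait names are illustrative values.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; assumes the three argument lists are paired by position.
cov=output/covariate/protocol_example.protein.protocol_example.samples.protocol_example.genotype.chr21_22.pQTL.plink_qc.prune.pca.Marchenko_PC.gz
pheno=output/phenotype/protocol_example.protein.region_list.txt

pheno_files=("$pheno" "$pheno")
cov_files=("$cov" "$cov")
phenotype_names=(trait_A trait_B)   # illustrative names

lists_match() {
    # Succeeds only when every count passed in equals the first one.
    local first="$1" n
    for n in "$@"; do
        [ "$n" -eq "$first" ] || return 1
    done
}

if lists_match "${#pheno_files[@]}" "${#cov_files[@]}" "${#phenotype_names[@]}"; then
    echo "inputs are paired: ${#phenotype_names[@]} phenotype(s)"
else
    echo "mismatched number of phenotype, covariate, and name entries" >&2
fi
```

Once the check passes, the arrays can be expanded directly on the command line, for example `--phenoFile "${pheno_files[@]}"`.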

**Please skip the analyses below; they need to be updated to the new workflow interface.**

### SuSiE results post-processing


In [None]:
sos run pipeline/SuSiE_post_processing.ipynb susie_to_tsv \
    --cwd output/ADGWAS_finemapping_extracted/Bellenguez/ \
    --rds_path $(ls GWAS_Finemapping_Results/Bellenguez/ADGWAS2022*rds) \
    --region-list ~/1300_hg38_EUR_LD_blocks_orig.tsv \
    --container containers/stephenslab.sif

## Plotting SuSiE results

### PIP landscape plot

In [None]:
sos run pipeline/SuSiE_post_processing.ipynb susie_pip_landscape_plot \
    --cwd output/test/ --plot_list plot_recipe_668 --annot_tibble ~/Annotatr_builtin_annotation_tibble.tsv -s force --container containers/stephenslab.sif  &

sos run pipeline/SuSiE_post_processing.ipynb susie_pip_landscape_plot \
    --cwd output/test/ --plot_list plot_recipe_ADGWAS_uni --annot_tibble ~/Annotatr_builtin_annotation_tibble.tsv -s force --container containers/stephenslab.sif  &

sos run pipeline/SuSiE_post_processing.ipynb susie_pip_landscape_plot \
    --cwd output/1182/ --plot_list plot_recipe_1182 --annot_tibble ~/Annotatr_builtin_annotation_tibble.tsv -s force --container containers/stephenslab.sif  &

In [None]:
sos run pipeline/SuSiE_post_processing.ipynb susie_pip_landscape_plot \
    --cwd output/test_3/ --plot_list plot_recipe --annot_tibble ~/Annotatr_builtin_annotation_tibble.tsv -s force --container containers/stephenslab.sif  &

In [None]:
sos run pipeline/SuSiE_post_processing.ipynb susie_pip_landscape_plot \
    --cwd output/5g/ --plot_list recipe_5gene --annot_tibble ~/Annotatr_builtin_annotation_tibble.tsv -s force --container containers/stephenslab.sif  &

### UpSetR plot 

In [None]:
sos run pipeline/SuSiE_post_processing.ipynb susie_upsetR_plot \
    --cwd output/updated_mQTL/ --plot_list UpsetR_recipe -s force --container containers/stephenslab.sif &
sos run pipeline/SuSiE_post_processing.ipynb susie_upsetR_plot \
    --cwd output/updated_16/ --plot_list UpsetR_recipe_16 -s force --container containers/stephenslab.sif &

In [None]:
sos run pipeline/SuSiE_post_processing.ipynb susie_upsetR_cs_plot \
    --cwd output/updated_mQTL/ --plot_list UpsetR_recipe_1 -s force --trait_to_select 3 --container containers/stephenslab.sif &
sos run pipeline/SuSiE_post_processing.ipynb susie_upsetR_cs_plot \
    --cwd output/updated_16/ --plot_list UpsetR_recipe_16  -s force --trait_to_select 3 --container containers/stephenslab.sif &

sos run pipeline/SuSiE_post_processing.ipynb susie_upsetR_cs_plot \
    --cwd output/rerun/ --plot_list UpsetR_recipe_16_rerun  -s force --trait_to_select 3 --container containers/stephenslab.sif &

sos run pipeline/SuSiE_post_processing.ipynb susie_upsetR_cs_plot \
    --cwd output/rerun/ --plot_list UpsetR_recipe_16_rerun  -s force --trait_to_select 2 --container containers/stephenslab.sif &

sos run pipeline/SuSiE_post_processing.ipynb susie_upsetR_cs_plot \
    --cwd output/rerun/ --plot_list UpsetR_recipe_16_rerun  -s force --trait_to_select 1 --container containers/stephenslab.sif &
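The last three commands above differ only in `--trait_to_select`. A minimal sketch of generating them with a loop follows. It prints the commands instead of running them, because this part of the workflow still needs to be updated to the new interface. `print_cs_plot_commands` is a hypothetical helper; the flags and paths are copied from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one susie_upsetR_cs_plot command per --trait_to_select value.
# Usage: print_cs_plot_commands <cwd> <plot_list> <value> [<value> ...]
print_cs_plot_commands() {
    local cwd="$1" plot_list="$2" k
    shift 2
    for k in "$@"; do
        printf '%s --cwd %s --plot_list %s -s force --trait_to_select %s --container %s\n' \
            "sos run pipeline/SuSiE_post_processing.ipynb susie_upsetR_cs_plot" \
            "$cwd" "$plot_list" "$k" "containers/stephenslab.sif"
    done
}

# Dry run: review the printed commands before executing any of them.
print_cs_plot_commands output/rerun/ UpsetR_recipe_16_rerun 3 2 1
```

Printing first keeps the loop safe to try. The reviewed lines can then be pasted into a cell or piped to `bash`.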