## Analysis of waist circumference (WAIST)

This notebook uses `Get_Job_Script.ipynb` to generate the sbatch scripts that run on Yale's cluster. The scripts apply [various LMM workflows](https://github.com/statgenetics/UKBB_GWAS_dev/tree/master/workflow) to perform association analysis of the INT-WAIST trait, followed by LD clumping and extraction of the associated regions.

## File paths on Yale cluster

- Genotype files in PLINK format:
`/gpfs/gibbs/pi/dewan/data/UKBiobank/genotype_files/pleiotropy_geneticfiles/UKB_Caucasians_phenotypeindepqc120319_updated020720removedwithdrawnindiv`
- Genotype files in BGEN format:
`SAY/dbgapstg/scratch/UKBiobank/genotype_files/ukb39554_imputeddataset/`
- Summary stats for imputed variants BOLT-LMM:
`/gpfs/gibbs/pi/dewan/data/UKBiobank/results/BOLTLMM_results/results_imputed_data`
- Summary stats for imputed variants FastGWA:
`/gpfs/gibbs/pi/dewan/data/UKBiobank/results/FastGWA_results/results_imputed_data`
- Phenotype files:
`/gpfs/gibbs/pi/dewan/data/UKBiobank/phenotype_files/pleiotropy_R01/phenotypesforanalysis`
- Relationship file:
`/gpfs/gibbs/pi/dewan/data/UKBiobank/genotype_files/pleiotropy_geneticfiles/unrelated_n307259/UKB_unrelatedcauc_phenotypes_asthmat2dbmiwaisthip_agesex_waisthipratio_040620`

## 07/01/20 analysis

On the cluster, open this notebook in the JupyterLab server you set up via the ssh channel, then run the following cells.

### Bash variables for workflow configurations

In [2]:
# Common variables

tpl_file=../farnam.yml
bfile=/gpfs/gibbs/pi/dewan/data/UKBiobank/genotype_files/pleiotropy_geneticfiles/UKB_Caucasians_phenotypeindepqc120319_updated082020removedwithdrawnindiv.bed
sampleFile=/gpfs/gibbs/pi/dewan/data/UKBiobank/genotype_files/ukb39554_imputeddataset/ukb32285_imputedindiv.sample
bgenFile=`echo /gpfs/gibbs/pi/dewan/data/UKBiobank/genotype_files/ukb39554_imputeddataset/ukb_imp_chr{1..22}_v3.bgen`
unrelated_samples=/gpfs/gibbs/pi/dewan/data/UKBiobank/genotype_files/pleiotropy_geneticfiles/unrelated_n307259/UKB_unrelatedcauc_phenotypes_asthmat2dbmiwaisthip_agesex_waisthipratio_040620
formatFile_fastgwa=~/project/UKBB_GWAS_dev/data/fastGWA_template.yml
formatFile_bolt=~/project/UKBB_GWAS_dev/data/boltlmm_template.yml
formatFile_saige=~/project/UKBB_GWAS_dev/data/saige_template.yml
container_lmm=/gpfs/gibbs/pi/dewan/data/UKBiobank/lmm.sif
container_marp=/gpfs/gibbs/pi/dewan/data/UKBiobank/marp.sif
# LMM directories
lmm_dir_fastgwa=/gpfs/gibbs/pi/dewan/data/UKBiobank/results/FastGWA_results/results_imputed_data/INT-WAIST
lmm_dir_bolt=/gpfs/gibbs/pi/dewan/data/UKBiobank/results/BOLTLMM_results/results_imputed_data/INT-WAIST
lmm_dir_saige=/gpfs/gibbs/pi/dewan/data/UKBiobank/results/SAIGE_results/results_imputed_data/INT-WAIST
lmm_sos=~/project/bioworkflows/GWAS/LMM.ipynb
lmm_sbatch_fastgwa=../output/$(date +"%Y-%m-%d")_INT-WAIST-fastgwa.sbatch
lmm_sbatch_bolt=../output/$(date +"%Y-%m-%d")_INT-WAIST-bolt.sbatch
lmm_sbatch_saige=../output/$(date +"%Y-%m-%d")_INT-WAIST-saige.sbatch
phenoFile=/gpfs/gibbs/pi/dewan/data/UKBiobank/phenotype_files/pleiotropy_R01/phenotypesforanalysis/normalized_phenotypes/UKB_caucasians_BMIwaisthip_AsthmaAndT2D_INT-WAIST_withagesex_042020
## LMM variables 
covarFile=/gpfs/gibbs/pi/dewan/data/UKBiobank/phenotype_files/pleiotropy_R01/phenotypesforanalysis/normalized_phenotypes/UKB_caucasians_BMIwaisthip_AsthmaAndT2D_INT-WAIST_withagesex_042020
LDscoresFile=/gpfs/gibbs/pi/dewan/data/UKBiobank/LDSCORE.1000G_EUR.tab.gz
geneticMapFile=/gpfs/gibbs/pi/dewan/data/UKBiobank/genetic_map_hg19_withX.txt.gz
phenoCol=rankNorm_WAIST
covarCol=SEX
covarMaxLevels=10
qCovarCol="AGE BMI"
numThreads=20
bgenMinMAF=0.001
bgenMinINFO=0.8
lmm_job_size=1
### Specific to SAIGE
bgenMinMAC=4
trait_type=quantitative
loco=TRUE
sampleCol=IID
# LD clumping directories
clumping_dir=/gpfs/gibbs/pi/dewan/data/UKBiobank/results/LD_clumping/INT-WAIST
clumping_sos=../workflow/LD_Clumping.ipynb
clumping_sbatch=../output/$(date +"%Y-%m-%d")_INT-WAIST_ldclumping.sbatch
## LD clumping variables
# For sumstatsFiles, if there is more than one file, provide each path
bfile_ref=/gpfs/gibbs/pi/dewan/data/UKBiobank/results/LD_clumping/UKB_Caucasians_phenotypeindepqc120319_updated020720removedwithdrawnindiv.1210.ref_geno.bed
# Changes depending upon which traits are analyzed
# In this case INT-WAIST and asthma
sumstatsFiles="/gpfs/gibbs/pi/dewan/data/UKBiobank/results/BOLTLMM_results/results_imputed_data/INT-WAIST/UKB_caucasians_BMIwaisthip_AsthmaAndT2D_INT-WAIST_withagesex_042020_rankNorm_WAIST.boltlmm.snp_stats.gz \
              /gpfs/gibbs/pi/dewan/data/UKBiobank/results/FastGWA_results/results_imputed_data/asthma/Asthma_casesbyICD10codesANDselfreport_controlsbyselfreportandicd10_noautoimmuneincontrols_forbolt030720_ASTHMA.fastGWA.snp_stats.gz"
ld_sample_size=1210
clump_field=P
clump_p1=5e-08
clump_p2=1
clump_r2=0.2
clump_kb=2000
clump_annotate=BP
numThreads=20
clump_job_size=1
# Region extraction directories
extract_dir=/gpfs/gibbs/pi/dewan/data/UKBiobank/results/region_extraction/INT-WAIST
extract_sos=../workflow/Region_Extraction.ipynb
extract_sbatch=../output/$(date +"%Y-%m-%d")_INT-WAIST-region.sbatch
## Region extraction variables
region_file=/gpfs/gibbs/pi/dewan/data/UKBiobank/results/LD_clumping/INT-WAIST/*.clumped_region
geno_path=/gpfs/gibbs/pi/dewan/data/UKBiobank/results/UKBB_bgenfilepath.txt
sumstats_path=/gpfs/gibbs/pi/dewan/data/UKBiobank/results/BOLTLMM_results/results_imputed_data/INT-WAIST/ukb_imp_v3.UKB_caucasians_BMIwaisthip_AsthmaAndT2D_INT-WAIST_withagesex_042020.BoltLMM.snp_stats.all_chr.gz
extract_job_size=10
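The `bgenFile` variable above relies on bash brace expansion to pack all 22 autosomal BGEN paths into one space-separated string. The sketch below shows that behaviour on its own; `demo_dir` is a hypothetical directory used only for illustration, and no files need to exist because brace expansion is purely textual.

```bash
# Hypothetical directory, used only to illustrate the expansion
demo_dir=/tmp/demo_bgen

# Same pattern as the bgenFile assignment in the configuration cell
bgenFile=`echo $demo_dir/ukb_imp_chr{1..22}_v3.bgen`

# Count the whitespace-separated paths held in the single string
n_paths=$(echo $bgenFile | wc -w | tr -d ' ')
echo "number of paths: $n_paths"

# First and last entries, obtained by trimming at the first/last space
first_path=${bgenFile%% *}
last_path=${bgenFile##* }
echo "first: $first_path"
echo "last:  $last_path"
```

Because the result is a plain string rather than an array, it has to be passed unquoted (or inside a larger quoted argument string, as the job cells do) for the workflow to see 22 separate paths.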




## BoltLMM job

In [None]:
# age, sex and BMI as covariates
lmm_args="""boltlmm
    --cwd $lmm_dir_bolt
    --bfile $bfile 
    --sampleFile $sampleFile
    --bgenFile $bgenFile 
    --phenoFile $phenoFile 
    --formatFile $formatFile_bolt 
    --covarFile $covarFile 
    --LDscoresFile $LDscoresFile 
    --geneticMapFile $geneticMapFile 
    --phenoCol $phenoCol 
    --covarCol $covarCol 
    --covarMaxLevels $covarMaxLevels 
    --qCovarCol $qCovarCol 
    --numThreads $numThreads 
    --bgenMinMAF $bgenMinMAF 
    --bgenMinINFO $bgenMinINFO 
    --job_size $lmm_job_size
    --container_lmm $container_lmm
    --container_marp $container_marp
"""

sos run ~/project/bioworkflows/admin/Get_Job_Script.ipynb farnam \
    --template-file $tpl_file \
    --workflow-file $lmm_sos \
    --to-script $lmm_sbatch_bolt \
    --args "$lmm_args"
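The `"""..."""` form used for `lmm_args` is not a special bash construct: it is an empty string `""` followed by an ordinary double-quoted multi-line string and another empty string, so variables are expanded when the cell runs. A minimal sketch of what the string ends up containing, using two values copied from the configuration cell (`demo_args` and `flat_args` are names introduced here for illustration):

```bash
# Values copied from the configuration cell
phenoCol=rankNorm_WAIST
qCovarCol="AGE BMI"

# Same quoting style as lmm_args; variables expand at assignment time
demo_args="""boltlmm
    --phenoCol $phenoCol
    --qCovarCol $qCovarCol
"""

# Collapse newlines and indentation to view the arguments on one line
flat_args=$(echo $demo_args)
echo "$flat_args"
# prints: boltlmm --phenoCol rankNorm_WAIST --qCovarCol AGE BMI
```

Note that a multi-word value such as `qCovarCol` becomes several separate words after the option name, which is how the two quantitative covariates are handed over here.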

## FastGWA job

In [1]:
lmm_args="""fastGWA
    --cwd $lmm_dir_fastgwa 
    --bfile $bfile 
    --sampleFile $sampleFile
    --bgenFile $bgenFile 
    --phenoFile $phenoFile 
    --formatFile $formatFile_fastgwa 
    --covarFile $covarFile 
    --LDscoresFile $LDscoresFile 
    --geneticMapFile $geneticMapFile 
    --phenoCol $phenoCol 
    --covarCol $covarCol 
    --covarMaxLevels $covarMaxLevels 
    --qCovarCol $qCovarCol 
    --numThreads $numThreads 
    --bgenMinMAF $bgenMinMAF 
    --bgenMinINFO $bgenMinINFO 
    --job_size $lmm_job_size
    --container_lmm $container_lmm
    --container_marp $container_marp
"""

sos run ~/project/bioworkflows/admin/Get_Job_Script.ipynb farnam \
    --template-file $tpl_file \
    --workflow-file $lmm_sos \
    --to-script $lmm_sbatch_fastgwa \
    --args "$lmm_args"

## SAIGE job

In [None]:
lmm_args="""SAIGE
    --cwd $lmm_dir_saige 
    --bfile $bfile 
    --sampleFile $sampleFile
    --bgenFile $bgenFile 
    --phenoFile $phenoFile 
    --formatFile $formatFile_saige 
    --covarFile $covarFile 
    --LDscoresFile $LDscoresFile 
    --geneticMapFile $geneticMapFile 
    --phenoCol $phenoCol 
    --covarCol $covarCol 
    --covarMaxLevels $covarMaxLevels 
    --qCovarCol $qCovarCol 
    --numThreads $numThreads 
    --bgenMinMAF $bgenMinMAF 
    --bgenMinINFO $bgenMinINFO 
    --bgenMinMAC $bgenMinMAC
    --trait_type $trait_type
    --loco $loco
    --sampleCol $sampleCol
    --job_size $lmm_job_size
    --container_lmm $container_lmm
    --container_marp $container_marp
"""

sos run ~/project/bioworkflows/admin/Get_Job_Script.ipynb farnam \
    --template-file $tpl_file \
    --workflow-file $lmm_sos \
    --to-script $lmm_sbatch_saige \
    --args "$lmm_args"

## LD clumping job

In [None]:
clumping_args="""default 
    --cwd $clumping_dir
    --bfile_ref $bfile_ref
    --bgenFile $bgenFile
    --sampleFile $sampleFile 
    --sumstatsFiles $sumstatsFiles 
    --unrelated_samples $unrelated_samples 
    --ld_sample_size $ld_sample_size 
    --clump_field $clump_field
    --clump_p1 $clump_p1 
    --clump_p2 $clump_p2 
    --clump_r2 $clump_r2 
    --clump_kb $clump_kb 
    --clump_annotate $clump_annotate 
    --numThreads $numThreads 
    --job_size $clump_job_size
"""

sos run ~/project/bioworkflows/admin/Get_Job_Script.ipynb farnam \
    --template-file $tpl_file \
    --workflow-file $clumping_sos \
    --to-script $clumping_sbatch \
    --args "$clumping_args"
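Since `sumstatsFiles` holds several paths in one string, a typo in any of them only surfaces once the job is running. One possible pre-flight check is sketched below; `count_missing` is a hypothetical helper, not part of the workflows, and the demo runs on throwaway files rather than the real summary statistics.

```bash
# Hypothetical helper: print how many paths in a space-separated list do not exist
count_missing() {
    missing=0
    for path in $1; do
        [ -e "$path" ] || { echo "missing: $path" >&2; missing=$((missing + 1)); }
    done
    echo "$missing"
}

# Demo: create one of two expected files in a temporary directory
tmp_dir=$(mktemp -d)
touch "$tmp_dir/traitA.snp_stats.gz"
demo_sumstats="$tmp_dir/traitA.snp_stats.gz $tmp_dir/traitB.snp_stats.gz"

n_missing=$(count_missing "$demo_sumstats")
echo "files missing: $n_missing"
rm -rf "$tmp_dir"
```

On the cluster the same call could be made as `count_missing "$sumstatsFiles"` before generating the sbatch script.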

## Region extraction job

In [3]:
extract_args="""default
    --cwd $extract_dir
    --region-file $region_file
    --pheno-path $phenoFile
    --geno-path $geno_path
    --bgen-sample-path $sampleFile
    --sumstats-path $sumstats_path
    --format-config-path $formatFile_bolt
    --unrelated-samples $unrelated_samples
"""

sos run ~/project/bioworkflows/admin/Get_Job_Script.ipynb farnam \
    --template-file $tpl_file \
    --workflow-file $extract_sos \
    --to-script $extract_sbatch \
    --args "$extract_args"

INFO: Running farnam: Configuration for Yale `farnam` cluster
INFO: farnam is completed.
INFO: farnam output:   ../output/2021-05-03_INT-WAIST-region.sbatch
INFO: Workflow farnam (ID=w877faa9ae666436f) is executed successfully with 1 completed step.
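The sbatch file name is stamped with the date on which the cell is executed, which is why the log above shows `2021-05-03` even though this section is dated 07/01/20. The sketch below rebuilds the name the same way and submits it only if the file exists and Slurm's `sbatch` is available; the guard and the `status` variable are additions for illustration, not part of the original notebook.

```bash
# Rebuild the date-stamped name as the configuration cell does
stamp=$(date +"%Y-%m-%d")
extract_sbatch=../output/${stamp}_INT-WAIST-region.sbatch

# Submit only when the script exists and sbatch is on PATH
if [ -f "$extract_sbatch" ] && command -v sbatch >/dev/null 2>&1; then
    sbatch "$extract_sbatch"
    status=submitted
else
    echo "not submitting: $extract_sbatch missing or sbatch unavailable"
    status=skipped
fi
```

If the script was generated on an earlier day, the stamp has to be replaced by that day's date, since `date` is re-evaluated on every run.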



## 03/13/23 Burden Test for WC (using hg38 genotype array file)

### White-European extended

In [9]:
lmm_dir_regenie=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_extwhite-hg38
lmm_sbatch_regenie=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_extwhite-hg38/WC_EUR_200kWES-regenie-burden_$(date +"%Y-%m-%d").sbatch
phenoFile=~/UKBiobank/phenotype_files/pleiotropy/UKB_exome_white_WC_pcs_rerun
covarFile=~/UKBiobank/phenotype_files/pleiotropy/UKB_exome_white_WC_pcs_rerun
phenoCol=WAIST_inverserank
covarCol=sex
qCovarCol='BMI age WC_PC1 WC_PC2 WC_PC3 WC_PC4 WC_PC5 WC_PC6 WC_PC7 WC_PC8 WC_PC9 WC_PC10'
genoFile=`echo ~/UKBiobank/data/exome_files/project_VCF/072721_run/plink/ukb23156_c{1..22}.merged.filtered.bed`
bfile=~/UKBiobank/genotype_files_processed/012323_white_european_460649ind_hg38/final_files_no_outliers/*.bed
anno_file=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.anno_file
set_list=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.set_list_file
mask_file=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.mask_file
aaf_file=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.aff_file
build_mask=max
aaf_bins='0.005 0.01'
tpl_file=~/project/bioworkflows/admin/csg.yml
lmm_sos=~/project/bioworkflows/GWAS/LMM.ipynb
container_marp=~/containers/marp.sif
container_lmm=~/containers/lmm.sif 
lmm_job_size=1
ylim=20
k=10
reverse_log_p=True
numThreads=20
formatFile_regenie=~/project/UKBB_GWAS_dev/data/regenie_template.yml
bsize=1000
## Trait: leave empty for quantitative (qt) traits
trait=
minMAC=1
snpannofile=~/UKBiobank/results/ukb23155_200Kexomes_annovar/2021_10_12_hg38_exome/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.gz

lmm_args="""regenie_burden
    --cwd $lmm_dir_regenie 
    --bfile $bfile 
    --genoFile $genoFile
    --phenoFile $phenoFile 
    --formatFile $formatFile_regenie 
    --phenoCol $phenoCol
    --covarCol $covarCol  
    --qCovarCol $qCovarCol
    --bsize $bsize
    --trait $trait
    --anno_file $anno_file
    --set_list $set_list
    --mask_file $mask_file
    --aaf_file $aaf_file
    --aaf_bins $aaf_bins
    --build_mask $build_mask
    --job_size $lmm_job_size
    --ylim $ylim
    --k $k
    --reverse_log_p $reverse_log_p
    --numThreads $numThreads
    --minMAC $minMAC
    --snpannofile $snpannofile
    --container_lmm $container_lmm
    --container_marp $container_marp
"""

sos run ~/project/UKBB_GWAS_dev/admin/Get_Job_Script.ipynb csg \
    --template-file $tpl_file \
    --workflow-file $lmm_sos \
    --to-script $lmm_sbatch_regenie \
    --args "$lmm_args"

INFO: Running csg: Configuration for Columbia csg partition cluster
INFO: csg (index=0) is ignored due to saved signature
INFO: csg output:   /home/dmc2245/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_extwhite-hg38/WC_EUR_200kWES-regenie-burden_2023-03-14.sbatch
INFO: Workflow csg (ID=wda52c27ecafe945a) is ignored with 1 ignored step.



### Asian

In [10]:
lmm_dir_regenie=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_asian-hg38
lmm_sbatch_regenie=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_asian-hg38/WC_ASN_200kWES-regenie-burden_$(date +"%Y-%m-%d").sbatch
phenoFile=~/UKBiobank/phenotype_files/pleiotropy/UKB_exome_asian_WC_pcs_rerun
covarFile=~/UKBiobank/phenotype_files/pleiotropy/UKB_exome_asian_WC_pcs_rerun
phenoCol=WAIST_inverserank
covarCol=sex
qCovarCol='BMI age WC_PC1 WC_PC2 WC_PC3 WC_PC4 WC_PC5 WC_PC6 WC_PC7 WC_PC8 WC_PC9 WC_PC10'
genoFile=`echo ~/UKBiobank/data/exome_files/project_VCF/072721_run/plink/ukb23156_c{1..22}.merged.filtered.bed`
bfile=~/UKBiobank/genotype_files_processed/012323_asian_10189ind_hg38/final_files_no_outliers/*.bed
anno_file=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.anno_file
set_list=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.set_list_file
mask_file=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.mask_file
aaf_file=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.aff_file
build_mask=max
aaf_bins='0.005 0.01'
tpl_file=~/project/bioworkflows/admin/csg.yml
lmm_sos=~/project/bioworkflows/GWAS/LMM.ipynb
container_marp=~/containers/marp.sif
container_lmm=~/containers/lmm.sif 
lmm_job_size=1
ylim=0
k=10
reverse_log_p=True
numThreads=20
formatFile_regenie=~/project/UKBB_GWAS_dev/data/regenie_template.yml
bsize=1000
## Trait: leave empty for quantitative (qt) traits
trait=
minMAC=1
snpannofile=~/UKBiobank/results/ukb23155_200Kexomes_annovar/2021_10_12_hg38_exome/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.gz

lmm_args="""regenie_burden
    --cwd $lmm_dir_regenie 
    --bfile $bfile 
    --genoFile $genoFile
    --phenoFile $phenoFile 
    --formatFile $formatFile_regenie 
    --phenoCol $phenoCol
    --covarCol $covarCol  
    --qCovarCol $qCovarCol
    --bsize $bsize
    --trait $trait
    --anno_file $anno_file
    --set_list $set_list
    --mask_file $mask_file
    --aaf_file $aaf_file
    --aaf_bins $aaf_bins
    --build_mask $build_mask
    --job_size $lmm_job_size
    --ylim $ylim
    --k $k
    --reverse_log_p $reverse_log_p
    --numThreads $numThreads
    --minMAC $minMAC
    --snpannofile $snpannofile
    --container_lmm $container_lmm
    --container_marp $container_marp
"""

sos run ~/project/UKBB_GWAS_dev/admin/Get_Job_Script.ipynb csg \
    --template-file $tpl_file \
    --workflow-file $lmm_sos \
    --to-script $lmm_sbatch_regenie \
    --args "$lmm_args"

INFO: Running csg: Configuration for Columbia csg partition cluster
INFO: csg (index=0) is ignored due to saved signature
INFO: csg output:   /home/dmc2245/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_asian-hg38/WC_ASN_200kWES-regenie-burden_2023-03-14.sbatch
INFO: Workflow csg (ID=w61df2b25cf2053fa) is ignored with 1 ignored step.



### African

In [11]:
lmm_dir_regenie=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_african-hg38
lmm_sbatch_regenie=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_african-hg38/WC_AFR_200kWES-regenie-burden_$(date +"%Y-%m-%d").sbatch
phenoFile=~/UKBiobank/phenotype_files/pleiotropy/UKB_exome_african_WC_pcs_rerun
covarFile=~/UKBiobank/phenotype_files/pleiotropy/UKB_exome_african_WC_pcs_rerun
phenoCol=WAIST_inverserank
covarCol=sex
qCovarCol='BMI age WC_PC1 WC_PC2 WC_PC3 WC_PC4 WC_PC5 WC_PC6 WC_PC7 WC_PC8 WC_PC9 WC_PC10'
genoFile=`echo ~/UKBiobank/data/exome_files/project_VCF/072721_run/plink/ukb23156_c{1..22}.merged.filtered.bed`
bfile=~/UKBiobank/genotype_files_processed/012323_african_9096ind_hg38/final_files_no_outliers/*.bed
anno_file=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.anno_file
set_list=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.set_list_file
mask_file=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.mask_file
aaf_file=~/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/102121_burden_files/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.aff_file
build_mask=max
aaf_bins='0.005 0.01'
tpl_file=~/project/bioworkflows/admin/csg.yml
lmm_sos=~/project/bioworkflows/GWAS/LMM.ipynb
container_marp=~/containers/marp.sif
container_lmm=~/containers/lmm.sif 
lmm_job_size=1
ylim=0
k=10
reverse_log_p=True
numThreads=20
formatFile_regenie=~/project/UKBB_GWAS_dev/data/regenie_template.yml
bsize=1000
## Trait: leave empty for quantitative (qt) traits
trait=
minMAC=1
snpannofile=~/UKBiobank/results/ukb23155_200Kexomes_annovar/2021_10_12_hg38_exome/ukb23155_chr1_chr22_091321.hg38.hg38_multianno.renamedcols.csv.gz

lmm_args="""regenie_burden
    --cwd $lmm_dir_regenie 
    --bfile $bfile 
    --genoFile $genoFile
    --phenoFile $phenoFile 
    --formatFile $formatFile_regenie 
    --phenoCol $phenoCol
    --covarCol $covarCol  
    --qCovarCol $qCovarCol
    --bsize $bsize
    --trait $trait
    --anno_file $anno_file
    --set_list $set_list
    --mask_file $mask_file
    --aaf_file $aaf_file
    --aaf_bins $aaf_bins
    --build_mask $build_mask
    --job_size $lmm_job_size
    --ylim $ylim
    --k $k
    --reverse_log_p $reverse_log_p
    --numThreads $numThreads
    --minMAC $minMAC
    --snpannofile $snpannofile
    --container_lmm $container_lmm
    --container_marp $container_marp
"""

sos run ~/project/UKBB_GWAS_dev/admin/Get_Job_Script.ipynb csg \
    --template-file $tpl_file \
    --workflow-file $lmm_sos \
    --to-script $lmm_sbatch_regenie \
    --args "$lmm_args"

INFO: Running csg: Configuration for Columbia csg partition cluster
INFO: csg (index=0) is ignored due to saved signature
INFO: csg output:   /home/dmc2245/UKBiobank/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_african-hg38/WC_AFR_200kWES-regenie-burden_2023-03-14.sbatch
INFO: Workflow csg (ID=wbb7e0b435bb1aa86) is ignored with 1 ignored step.
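The three burden-test cells differ only in a handful of values: the output directory, the sbatch file tag, the phenotype file, the genotype directory and `ylim`. As a sketch of how those differences could be collected in one place, the hypothetical `set_population` function below sets the per-population variables from values copied out of the cells above. It uses `$HOME` in place of `~` so the paths expand inside the function, and `bfile_dir` is a name introduced here for the directory that holds the `*.bed` file.

```bash
# Hypothetical helper: set the variables that differ between populations
set_population() {
    case "$1" in
        white)   dir_tag=extwhite; file_tag=EUR; geno_tag=white_european_460649ind; ylim=20 ;;
        asian)   dir_tag=asian;    file_tag=ASN; geno_tag=asian_10189ind;           ylim=0 ;;
        african) dir_tag=african;  file_tag=AFR; geno_tag=african_9096ind;          ylim=0 ;;
        *) echo "unknown population: $1" >&2; return 1 ;;
    esac
    base=$HOME/UKBiobank
    lmm_dir_regenie=$base/results_pleiotropy/REGENIE_results/results_burden_exome/031323_WC_${dir_tag}-hg38
    lmm_sbatch_regenie=$lmm_dir_regenie/WC_${file_tag}_200kWES-regenie-burden_$(date +"%Y-%m-%d").sbatch
    phenoFile=$base/phenotype_files/pleiotropy/UKB_exome_${1}_WC_pcs_rerun
    bfile_dir=$base/genotype_files_processed/012323_${geno_tag}_hg38/final_files_no_outliers
}

# Show the settings that would be used for each population
for pop in white asian african; do
    set_population "$pop"
    echo "$pop -> $lmm_dir_regenie (ylim=$ylim)"
done
```

The shared settings (annotation files, `aaf_bins`, containers and so on) would stay in a single configuration block, and the `sos run` call could then sit inside the loop.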

