Workflow used to **apply** selected polygenic scores (PGS) to imputed genotype data using **pgs-calc** (https://github.com/lukfor/pgs-calc)
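
As a rough illustration of what applying a score means: a polygenic score is conventionally a weighted sum of effect-allele dosages, and a minimum imputation R2 threshold limits which variants contribute. The sketch below is a toy version in plain Python, not pgs-calc's implementation. The function name `toy_score`, the tuple layout, and the rule that variants below the threshold are simply skipped are assumptions made for illustration.

```python
def toy_score(variants, min_r2=0.0):
    """Sum weight * dosage over variants whose imputation R2 passes the threshold.

    `variants` is a list of (effect_weight, dosage, r2) tuples (hypothetical layout).
    """
    total = 0.0
    for weight, dosage, r2 in variants:
        # Assumption: variants with R2 below the threshold contribute nothing.
        if r2 >= min_r2:
            total += weight * dosage
    return total


# Three made-up variants: (effect weight, effect-allele dosage, imputation R2)
example = [
    (0.5, 2.0, 0.9),
    (-0.25, 1.0, 0.4),
    (1.0, 0.5, 0.2),
]

for threshold in [0, 0.3, 0.8]:
    print(threshold, toy_score(example, min_r2=threshold))
```

Raising the threshold drops poorly imputed variants from the sum, which is why the workflow below computes every score at several R2 cut-offs.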

In [1]:
import os
import glob
from datetime import date

basedir = "/labs/tassimes/rodrigoguarischi/projects/sea/apply_grs/"

# Change working directory
os.chdir(basedir)

# Run all scores present in the pgs_reference_weights folder
ref_weights_paths = glob.glob( "./pgs_reference_weights/*.txt.gz" )
ref_weights = [ os.path.basename(pgs_path).split(".")[0] for pgs_path in ref_weights_paths ]
ref_weights_paths = ",".join(ref_weights_paths)

# Create output folder named as raw_scores_<TODAYS_DATE>
today_date = date.today().strftime("%Y%m%d")
output_folder = "raw_scores_" + today_date
os.makedirs( output_folder, exist_ok=True )  # do not fail when the cell is re-run on the same day

# Dictionary with paths to imputed VCF files for HRC and TOPMed
imputed_genotypes = {
    "hrc_whites": "../imputed_data/michigan_hrc/whites/*.vcf.gz",
    "hrc_blacks": "../imputed_data/michigan_hrc/blacks/*.vcf.gz",
    "topmed_whites": "../imputed_data/topmed/whites/liftover_hg19/*no_chr_prefix.vcf.gz",
    "topmed_blacks": "../imputed_data/topmed/blacks/liftover_hg19/*no_chr_prefix.vcf.gz"
    }
      
# Run pgs-calc for hrc and topmed imputed genotypes for multiple r2 thresholds
for reference_panel in imputed_genotypes:
        
    for min_r2 in [0, 0.3, 0.5, 0.8]:
        
        print("Calculating scores for {0} at min R2 >= {1}".format( ", ".join(ref_weights), min_r2 ) )
        
        output_files_basename = output_folder + "/" + "_".join( [reference_panel, today_date, "multiGRS", ( "r" + str(min_r2).replace(".","") ) ] )
        
        info_report_filename = output_files_basename + ".info.txt"
        html_report_filename = output_files_basename + ".html"
        output_scores_filename = output_files_basename + ".scores.txt"
                
        # By default pgs-calc scores from DOSAGE information. To score from hard-called
        # GENOTYPES instead, add the following option to the command below:
        # --genotypes=GT \
        !./pgs-calc/pgs-calc apply \
            --ref {ref_weights_paths} \
            --minR2 {min_r2} \
            --threads 23 \
            --no-ansi \
            --info {info_report_filename} \
            --report-html={html_report_filename} \
            --out {output_scores_filename} \
            { imputed_genotypes[reference_panel] }
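
The `!` shell escape in the cell above only works inside IPython/Jupyter. A minimal sketch of the same call as plain Python follows, assuming the same relative paths and using only the flags that appear in the cell above. The helper names `r2_tag`, `build_pgs_calc_cmd` and `run_pgs_calc` are hypothetical and not part of pgs-calc.

```python
import glob
import subprocess


def r2_tag(min_r2):
    """Mirror the notebook's file naming: 0 -> 'r0', 0.3 -> 'r03'."""
    return "r" + str(min_r2).replace(".", "")


def build_pgs_calc_cmd(ref_weights_paths, min_r2, basename, vcf_files, threads=23):
    """Assemble the argument list with the same flags as the notebook cell."""
    return [
        "./pgs-calc/pgs-calc", "apply",
        "--ref", ref_weights_paths,
        "--minR2", str(min_r2),
        "--threads", str(threads),
        "--no-ansi",
        "--info", basename + ".info.txt",
        "--report-html=" + basename + ".html",
        "--out", basename + ".scores.txt",
        *vcf_files,
    ]


def run_pgs_calc(ref_weights_paths, min_r2, basename, vcf_pattern):
    # No shell is involved here, so the wildcard has to be expanded in Python.
    vcf_files = sorted(glob.glob(vcf_pattern))
    if not vcf_files:
        raise FileNotFoundError("No VCF files match " + vcf_pattern)
    cmd = build_pgs_calc_cmd(ref_weights_paths, min_r2, basename, vcf_files)
    # check=True raises CalledProcessError if pgs-calc exits with a non-zero status.
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` stops the loop at the first failed run instead of silently moving on to the next threshold.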

Calculating scores for PGS002026, PGS002009, PGS002244, PGS000027, PGS002133, PGS001900, PGS003356, PGS000958, wGRS49, PGS001979, PGS002197, tc20201014Shoa, PGS001351, PGS002037, PGS002161, PGS001830, TEMprsCatherineWhites, PGS002150, TEMprsCatherineBlacks, PGS000667, nonHDL20201014Shoa, PGS000013, PGS001357, PGS001933, PGS002114, PGS001105, PGS000349, PGS001917, PGS001818, HDL20201014Shoa, PGS000957, logTG20201014Shoa, PGS000889, PGS000018 at min R2 >= 0

pgs-calc 0.9.16
https://github.com/lukfor/pgs-calc
(c) 2020 - 2022 Lukas Forer


Input:
  ref: ./pgs_reference_weights/PGS002026.txt.gz,./pgs_reference_weights/PGS002009.txt.gz,./pgs_reference_weights/PGS002244.txt.gz,./pgs_reference_weights/PGS000027.txt.gz,./pgs_reference_weights/PGS002133.txt.gz,./pgs_reference_weights/PGS001900.txt.gz,./pgs_reference_weights/PGS003356.txt.gz,./pgs_reference_weights/PGS000958.txt.gz,./pgs_reference_weights/wGRS49.txt.gz,./pgs_reference_weights/PGS001979.txt.gz,./pgs_reference_weights/PGS002197.txt

[Run]     [Chr 05]...
[Run]     [Chr 09]...
[Run]     [Chr 04]...
[Run]     [Chr 08]...
[Run]     [Chr 02]...
[Run]     [Chr 12]...
[Run]     [Chr 01]...
[Run]     [Chr 06]...
[Run]     [Chr 20]...
[Run]     [Chr 07]...
[Run]     [Chr 18]...
[Run]     [Chr 17]...
[Run]     [Chr 03]...
[Run]     [Chr 15]...
[Run]     [Chr 10]...
[Run]     [Chr 16]...
[Run]     [Chr 14]...
[Run]     [Chr 0X]...
[Run]     [Chr 22]...
[Run]     [Chr 13]...
[Run]     [Chr 21]...
[Run]     [Chr 11]...
[Run]     [Chr 19]...
[Done]    [Chr 22]. Execution Time: 00:03:13
[Done]    [Chr 21]. Execution Time: 00:03:25
[Done]    [Chr 19]. Execution Time: 00:04:13
[Done]    [Chr 0X]. Execution Time: 00:04:37
[Done]    [Chr 20]. Execution Time: 00:04:45
[Done]    [Chr 17]. Execution Time: 00:05:12
[Done]    [Chr 18]. Execution Time: 00:05:20
[Done]    [Chr 15]. Execution Time: 00:05:22
[Done]    [Chr 14]. Execution Time: 00:05:34
[Done]    [Chr 16]. Execution Time: 00:05:46
[Done]    [Chr 13]. Execution Time: 00:06:10

Calculating scores for PGS002026, PGS002009, PGS002244, PGS000027, PGS002133, PGS001900, PGS003356, PGS000958, wGRS49, PGS001979, PGS002197, tc20201014Shoa, PGS001351, PGS002037, PGS002161, PGS001830, TEMprsCatherineWhites, PGS002150, TEMprsCatherineBlacks, PGS000667, nonHDL20201014Shoa, PGS000013, PGS001357, PGS001933, PGS002114, PGS001105, PGS000349, PGS001917, PGS001818, HDL20201014Shoa, PGS000957, logTG20201014Shoa, PGS000889, PGS000018 at min R2 >= 0.8

pgs-calc 0.9.16
https://github.com/lukfor/pgs-calc
(c) 2020 - 2022 Lukas Forer


Input:
  ref: ./pgs_reference_weights/PGS002026.txt.gz,./pgs_reference_weights/PGS002009.txt.gz,./pgs_reference_weights/PGS002244.txt.gz,./pgs_reference_weights/PGS000027.txt.gz,./pgs_reference_weights/PGS002133.txt.gz,./pgs_reference_weights/PGS001900.txt.gz,./pgs_reference_weights/PGS003356.txt.gz,./pgs_reference_weights/PGS000958.txt.gz,./pgs_reference_weights/wGRS49.txt.gz,./pgs_reference_weights/PGS001979.txt.gz,./pgs_reference_weights/PGS002197.t

[Run]     [Chr 12]...
[Run]     [Chr 15]...
[Run]     [Chr 06]...
[Run]     [Chr 22]...
[Run]     [Chr 07]...
[Run]     [Chr 18]...
[Run]     [Chr 21]...
[Run]     [Chr 03]...
[Run]     [Chr 0X]...
[Run]     [Chr 04]...
[Run]     [Chr 14]...
[Run]     [Chr 01]...
[Run]     [Chr 02]...
[Run]     [Chr 20]...
[Run]     [Chr 13]...
[Run]     [Chr 09]...
[Run]     [Chr 16]...
[Run]     [Chr 08]...
[Run]     [Chr 05]...
[Run]     [Chr 19]...
[Run]     [Chr 17]...
[Run]     [Chr 10]...
[Run]     [Chr 11]...
[Done]    [Chr 22]. Execution Time: 00:03:44
[Done]    [Chr 21]. Execution Time: 00:03:46
[Done]    [Chr 19]. Execution Time: 00:04:40
[Done]    [Chr 0X]. Execution Time: 00:04:54
[Done]    [Chr 20]. Execution Time: 00:05:04
[Done]    [Chr 17]. Execution Time: 00:05:43
[Done]    [Chr 15]. Execution Time: 00:05:56
[Done]    [Chr 18]. Execution Time: 00:06:03
[Done]    [Chr 16]. Execution Time: 00:06:07
[Done]    [Chr 14]. Execution Time: 00:06:08
[Done]    [Chr 13]. Execution Time: 00:06:33

HTML Report written to /oak/stanford/scg/lab_tassimes/rodrigoguarischi/projects/sea/apply_grs/raw_scores_20221229/hrc_blacks_20221229_multiGRS_r03.html.
[Done]    Html Report created and written to 'raw_scores_20221229/hrc_blacks_20221229_multiGRS_r03.html'. Execution Time: 00:00:00

Execution Time: 9 min, 7 sec

Calculating scores for PGS002026, PGS002009, PGS002244, PGS000027, PGS002133, PGS001900, PGS003356, PGS000958, wGRS49, PGS001979, PGS002197, tc20201014Shoa, PGS001351, PGS002037, PGS002161, PGS001830, TEMprsCatherineWhites, PGS002150, TEMprsCatherineBlacks, PGS000667, nonHDL20201014Shoa, PGS000013, PGS001357, PGS001933, PGS002114, PGS001105, PGS000349, PGS001917, PGS001818, HDL20201014Shoa, PGS000957, logTG20201014Shoa, PGS000889, PGS000018 at min R2 >= 0.5

pgs-calc 0.9.16
https://github.com/lukfor/pgs-calc
(c) 2020 - 2022 Lukas Forer


Input:
  ref: ./pgs_reference_weights/PGS002026.txt.gz,./pgs_reference_weights/PGS002009.txt.gz,./pgs_reference_weights/PGS002244.txt.gz,./pg

[Run]     [Chr 18]...
[Run]     [Chr 15]...
[Run]     [Chr 17]...
[Run]     [Chr 14]...
[Run]     [Chr 12]...
[Run]     [Chr 10]...
[Run]     [Chr 01]...
[Run]     [Chr 05]...
[Run]     [Chr 03]...
[Run]     [Chr 04]...
[Run]     [Chr 06]...
[Run]     [Chr 08]...
[Run]     [Chr 07]...
[Run]     [Chr 09]...
[Run]     [Chr 21]...
[Run]     [Chr 0X]...
[Run]     [Chr 11]...
[Run]     [Chr 20]...
[Run]     [Chr 02]...
[Run]     [Chr 19]...
[Run]     [Chr 16]...
[Run]     [Chr 22]...
[Run]     [Chr 13]...
[Done]    [Chr 21]. Execution Time: 00:02:43
[Done]    [Chr 22]. Execution Time: 00:02:44
[Done]    [Chr 19]. Execution Time: 00:03:41
[Done]    [Chr 20]. Execution Time: 00:03:41
[Done]    [Chr 17]. Execution Time: 00:04:14
[Done]    [Chr 18]. Execution Time: 00:04:15
[Done]    [Chr 15]. Execution Time: 00:04:15
[Done]    [Chr 0X]. Execution Time: 00:04:15
[Done]    [Chr 16]. Execution Time: 00:04:26
[Done]    [Chr 14]. Execution Time: 00:04:28
[Done]    [Chr 13]. Execution Time: 00:04:36

1 86004523
1 86005082
1 86005082
[Done]    [Chr 15]. Execution Time: 00:16:48
[Done]    [Chr 14]. Execution Time: 00:17:21
[Done]    [Chr 16]. Execution Time: 00:17:57
7 96728827
7 96728827
7 96728973
7 96728973
13 114217764
13 114217764
[Done]    [Chr 13]. Execution Time: 00:18:57
6 119160435
6 119160435
[Done]    [Chr 0X]. Execution Time: 00:21:01
[Done]    [Chr 09]. Execution Time: 00:21:38
[Done]    [Chr 10]. Execution Time: 00:22:35
[Done]    [Chr 11]. Execution Time: 00:23:03
[Done]    [Chr 12]. Execution Time: 00:23:06
[Done]    [Chr 08]. Execution Time: 00:24:24
[Done]    [Chr 07]. Execution Time: 00:24:36
[Done]    [Chr 06]. Execution Time: 00:25:49
[Done]    [Chr 05]. Execution Time: 00:26:22
[Done]    [Chr 04]. Execution Time: 00:27:44
[Done]    [Chr 03]. Execution Time: 00:27:47
[Done]    [Chr 01]. Execution Time: 00:29:36
[Done]    [Chr 02]. Execution Time: 00:31:15

[Run]     Merge score files...
[Done]    Merge score files. Execution Time: 00:00:00
[Run]     Merge report

Calculating scores for PGS002026, PGS002009, PGS002244, PGS000027, PGS002133, PGS001900, PGS003356, PGS000958, wGRS49, PGS001979, PGS002197, tc20201014Shoa, PGS001351, PGS002037, PGS002161, PGS001830, TEMprsCatherineWhites, PGS002150, TEMprsCatherineBlacks, PGS000667, nonHDL20201014Shoa, PGS000013, PGS001357, PGS001933, PGS002114, PGS001105, PGS000349, PGS001917, PGS001818, HDL20201014Shoa, PGS000957, logTG20201014Shoa, PGS000889, PGS000018 at min R2 >= 0.5

pgs-calc 0.9.16
https://github.com/lukfor/pgs-calc
(c) 2020 - 2022 Lukas Forer


Input:
  ref: ./pgs_reference_weights/PGS002026.txt.gz,./pgs_reference_weights/PGS002009.txt.gz,./pgs_reference_weights/PGS002244.txt.gz,./pgs_reference_weights/PGS000027.txt.gz,./pgs_reference_weights/PGS002133.txt.gz,./pgs_reference_weights/PGS001900.txt.gz,./pgs_reference_weights/PGS003356.txt.gz,./pgs_reference_weights/PGS000958.txt.gz,./pgs_reference_weights/wGRS49.txt.gz,./pgs_reference_weights/PGS001979.txt.gz,./pgs_reference_weights/PGS002197.t

[Run]     [Chr 22]...
[Run]     [Chr 11]...
[Run]     [Chr 02]...
[Run]     [Chr 10]...
[Run]     [Chr 15]...
[Run]     [Chr 01]...
[Run]     [Chr 14]...
[Run]     [Chr 20]...
[Run]     [Chr 16]...
[Run]     [Chr 09]...
[Run]     [Chr 13]...
[Run]     [Chr 12]...
[Run]     [Chr 06]...
[Run]     [Chr 21]...
[Run]     [Chr 03]...
[Run]     [Chr 18]...
[Run]     [Chr 04]...
[Run]     [Chr 19]...
[Run]     [Chr 17]...
[Run]     [Chr 07]...
[Run]     [Chr 08]...
[Run]     [Chr 0X]...
[Run]     [Chr 05]...
[Done]    [Chr 21]. Execution Time: 00:07:39
[Done]    [Chr 22]. Execution Time: 00:07:55
[Done]    [Chr 19]. Execution Time: 00:11:33
[Done]    [Chr 20]. Execution Time: 00:11:34
[Done]    [Chr 18]. Execution Time: 00:14:06
[Done]    [Chr 15]. Execution Time: 00:14:19
[Done]    [Chr 17]. Execution Time: 00:14:49
[Done]    [Chr 14]. Execution Time: 00:15:17
[Done]    [Chr 16]. Execution Time: 00:15:22
[Done]    [Chr 13]. Execution Time: 00:16:15
[Done]    [Chr 09]. Execution Time: 00:18:49

1 80967857
1 80967857
[Done]    [Chr 18]. Execution Time: 00:15:02
1 86004523
1 86005082
1 86005082
[Done]    [Chr 14]. Execution Time: 00:15:35
[Done]    [Chr 16]. Execution Time: 00:16:00
13 114217764
13 114217764
7 96728827
7 96728827
7 96728973
7 96728973
[Done]    [Chr 13]. Execution Time: 00:16:56
6 119160435
6 119160435
[Done]    [Chr 0X]. Execution Time: 00:19:19
[Done]    [Chr 09]. Execution Time: 00:19:45
[Done]    [Chr 10]. Execution Time: 00:20:56
[Done]    [Chr 12]. Execution Time: 00:20:58
[Done]    [Chr 11]. Execution Time: 00:21:14
[Done]    [Chr 08]. Execution Time: 00:22:03
[Done]    [Chr 07]. Execution Time: 00:23:08
[Done]    [Chr 06]. Execution Time: 00:23:50
[Done]    [Chr 05]. Execution Time: 00:24:31
[Done]    [Chr 04]. Execution Time: 00:25:07
[Done]    [Chr 03]. Execution Time: 00:26:16
[Done]    [Chr 01]. Execution Time: 00:27:51
[Done]    [Chr 02]. Execution Time: 00:28:57

[Run]     Merge score files...
[Done]    Merge score files. Execution Time: 00:00:00


HTML Report written to /oak/stanford/scg/lab_tassimes/rodrigoguarischi/projects/sea/apply_grs/raw_scores_20221229/topmed_blacks_20221229_multiGRS_r03.html.
[Done]    Html Report created and written to 'raw_scores_20221229/topmed_blacks_20221229_multiGRS_r03.html'. Execution Time: 00:00:00

Execution Time: 28 min, 21 sec

Calculating scores for PGS002026, PGS002009, PGS002244, PGS000027, PGS002133, PGS001900, PGS003356, PGS000958, wGRS49, PGS001979, PGS002197, tc20201014Shoa, PGS001351, PGS002037, PGS002161, PGS001830, TEMprsCatherineWhites, PGS002150, TEMprsCatherineBlacks, PGS000667, nonHDL20201014Shoa, PGS000013, PGS001357, PGS001933, PGS002114, PGS001105, PGS000349, PGS001917, PGS001818, HDL20201014Shoa, PGS000957, logTG20201014Shoa, PGS000889, PGS000018 at min R2 >= 0.5

pgs-calc 0.9.16
https://github.com/lukfor/pgs-calc
(c) 2020 - 2022 Lukas Forer


Input:
  ref: ./pgs_reference_weights/PGS002026.txt.gz,./pgs_reference_weights/PGS002009.txt.gz,./pgs_reference_weights/PGS002244.txt

[Run]     [Chr 16]...
[Run]     [Chr 09]...
[Run]     [Chr 04]...
[Run]     [Chr 19]...
[Run]     [Chr 12]...
[Run]     [Chr 15]...
[Run]     [Chr 21]...
[Run]     [Chr 18]...
[Run]     [Chr 13]...
[Run]     [Chr 10]...
[Run]     [Chr 11]...
[Run]     [Chr 07]...
[Run]     [Chr 05]...
[Run]     [Chr 20]...
[Run]     [Chr 14]...
[Run]     [Chr 01]...
[Run]     [Chr 17]...
[Run]     [Chr 02]...
[Run]     [Chr 03]...
[Run]     [Chr 22]...
[Run]     [Chr 06]...
[Run]     [Chr 08]...
[Run]     [Chr 0X]...
[Done]    [Chr 21]. Execution Time: 00:07:10
[Done]    [Chr 22]. Execution Time: 00:07:19
[Done]    [Chr 19]. Execution Time: 00:10:32
[Done]    [Chr 20]. Execution Time: 00:10:34
[Done]    [Chr 18]. Execution Time: 00:12:23
[Done]    [Chr 17]. Execution Time: 00:12:59
[Done]    [Chr 14]. Execution Time: 00:13:35
[Done]    [Chr 15]. Execution Time: 00:13:41
[Done]    [Chr 16]. Execution Time: 00:14:41
[Done]    [Chr 13]. Execution Time: 00:15:08
[Done]    [Chr 09]. Execution Time: 00:17:08