# TWAS
a) Manhattan plot of glyco-TWAS results (unadjusted data); annotate points to indicate MR significance



## Combine Data
TWAS and MR results generated by the pipelines were combined using the following command:
```
python /mnt/lustre/home/rl3328/rl3328/motor_qtl/combine_twas.py /mnt/lustre/lab/ctcn/hklein/motor_qtl_project/unadjusted_wp_2020/susie_twas/noQC_nonimputed_result/twas --type both --output ROSMAP_eQTL_wp_unadjusted
```
### Input
Input files are all on the new AWS-based HPC
* Combined TWAS file: `/mnt/lustre/lab/ctcn/hklein/motor_qtl_project/unadjusted_wp_2020/susie_twas/noQC_nonimputed_result/twas/ROSMAP_eQTL_wp_unadjusted.combined_twas.tsv.gz`
* Combined MR file: `/mnt/lustre/lab/ctcn/hklein/motor_qtl_project/unadjusted_wp_2020/susie_twas/noQC_nonimputed_result/twas/ROSMAP_eQTL_wp_unadjusted.combined_mr.tsv.gz`
* Reference file for gene annotation: `/mnt/lustre/home/rl3328/rl3328/resource/Homo_sapiens.GRCh38.103.chr.reformatted.collapse_only.gene.region_list.2025`


### Landscape

* The number of genes available for TWAS varies substantially across contexts. Monocyte (3 genes), DLPFC bulk (57 genes), and AC (60 genes) have far fewer genes than the other contexts. With only 57 genes, the DLPFC bulk eQTL dataset has very limited coverage, which likely explains why no gene reaches significance under the most accurate predictive model for DLPFC bulk eQTL.
* Consistent with this, the MR analysis yields no significant genes for monocyte, DLPFC bulk, or AC.

* **Given these limitations, it may be more appropriate to combine bulk and single-nucleus (sn) data for DLPFC in downstream analyses.**



#### TWAS

In [1]:
library(data.table)
library(tidyverse)
twas <- fread('/mnt/lustre/lab/ctcn/hklein/motor_qtl_project/unadjusted_wp_2020/susie_twas/noQC_nonimputed_result/twas/ROSMAP_eQTL_wp_unadjusted.combined_twas.tsv.gz')

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   3.5.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.2
✔ purrr     1.2.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::between()     masks data.table::between()
✖ dplyr::filter()      masks stats::filter()
✖ dplyr::first()       masks data.table::first()
✖ lubridate::hour()    masks data.table::hour()
✖ lubridate::isoweek() masks data.table::isoweek()

In [2]:
head(twas)
dim(twas)

chr,molecular_id,TSS,start,end,context,gwas_study,method,is_imputable,is_selected_method,rsq_cv,pval_cv,twas_z,twas_pval,type,block,method_selected_original,region,study_context,source_file
<int>,<chr>,<int>,<int>,<int>,<chr>,<chr>,<chr>,<lgl>,<lgl>,<dbl>,<dbl>,<dbl>,<dbl>,<chr>,<chr>,<lgl>,<chr>,<chr>,<chr>
10,ENSG00000055950,100987515,99320000,102120000,Inh_DeJager_eQTL,unadjusted_wp_2020,enet,True,False,-0.0002636756,0.3460743,-0.8321645,0.40531612,eQTL,chr10_100331627_104378781,False,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.twas.tsv.gz
10,ENSG00000055950,100987515,99320000,102120000,Inh_DeJager_eQTL,unadjusted_wp_2020,lasso,True,False,-0.0008220066,0.4181955,-0.9241237,0.35542192,eQTL,chr10_100331627_104378781,False,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.twas.tsv.gz
10,ENSG00000055950,100987515,99320000,102120000,Inh_DeJager_eQTL,unadjusted_wp_2020,mrash,True,False,0.0074837805,0.04222085,-2.478684,0.01318681,eQTL,chr10_100331627_104378781,False,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.twas.tsv.gz
10,ENSG00000055950,100987515,99320000,102120000,Inh_DeJager_eQTL,unadjusted_wp_2020,susie,True,True,0.0157468862,0.005809734,-0.652317,0.51419672,eQTL,chr10_100331627_104378781,True,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.twas.tsv.gz
10,ENSG00000055950,100987515,99320000,102120000,PCC_DeJager_eQTL,unadjusted_wp_2020,enet,True,False,0.1176354683,7.716102e-14,-0.2702251,0.7869871,eQTL,chr10_100331627_104378781,False,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.twas.tsv.gz
10,ENSG00000055950,100987515,99320000,102120000,PCC_DeJager_eQTL,unadjusted_wp_2020,lasso,True,False,0.1157138583,1.253604e-13,-0.1913938,0.84821705,eQTL,chr10_100331627_104378781,False,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.twas.tsv.gz


In [3]:
## TWAS analyses were performed across 10 contexts and 13872 genes
twas %>% pull(context) |> unique()#%>%length

twas %>% pull(molecular_id)%>%unique()%>%length



In [4]:
## available genes by contexts
twas |> group_by(context) |> summarize(gene = n_distinct(molecular_id))

context,gene
<chr>,<int>
AC_DeJager_eQTL,60
Ast_DeJager_eQTL,5922
DLPFC_DeJager_eQTL,57
Exc_DeJager_eQTL,8273
Inh_DeJager_eQTL,6450
Mic_DeJager_eQTL,2770
OPC_DeJager_eQTL,3054
Oli_DeJager_eQTL,5450
PCC_DeJager_eQTL,10573
monocyte_ROSMAP_eQTL,3


In [5]:

# Genes nominally significant (p <= 0.05) under the selected (best) prediction model
twas_res_sig <- twas %>% filter(is_selected_method, twas_pval <= 0.05)
twas_res_sig %>% pull(molecular_id) %>% unique() %>% length()


In [6]:
twas_DLPFC_bulk = twas |> filter(context == 'DLPFC_DeJager_eQTL') |> pull(molecular_id)%>%unique()%>%length
twas_DLPFC_bulk

In [7]:
# DLPFC genes (bulk and sn): every context except monocyte, PCC, and AC
twas_DLPFC = twas |> filter(!context %in% c("monocyte_ROSMAP_eQTL", "PCC_DeJager_eQTL", "AC_DeJager_eQTL")) |> pull(molecular_id) %>% unique() %>% length()
twas_DLPFC

In [20]:
## Run this cell after unadj_cutoff has been computed (cell 17); it is shown here for easier review
## DLPFC bulk, best model: 0 genes significant after Bonferroni correction
twas |> filter(is_selected_method, twas_pval < unadj_cutoff, str_detect(context, "DLPFC"))
## DLPFC bulk, all models: 1 gene (EPHA3) significant after Bonferroni correction
twas |> filter(twas_pval < unadj_cutoff, str_detect(context, "DLPFC"))


chr,molecular_id,TSS,start,end,context,gwas_study,method,is_imputable,is_selected_method,rsq_cv,pval_cv,twas_z,twas_pval,type,block,method_selected_original,region,study_context,source_file
<int>,<chr>,<int>,<int>,<int>,<chr>,<chr>,<chr>,<lgl>,<lgl>,<dbl>,<dbl>,<dbl>,<dbl>,<chr>,<chr>,<lgl>,<chr>,<chr>,<chr>


chr,molecular_id,TSS,start,end,context,gwas_study,method,is_imputable,is_selected_method,rsq_cv,pval_cv,twas_z,twas_pval,type,block,method_selected_original,region,study_context,source_file
<int>,<chr>,<int>,<int>,<int>,<chr>,<chr>,<chr>,<lgl>,<lgl>,<dbl>,<dbl>,<dbl>,<dbl>,<chr>,<chr>,<lgl>,<chr>,<chr>,<chr>
3,ENSG00000044524,89107621,86600000,93720000,DLPFC_DeJager_eQTL,unadjusted_wp_2020,mrash,True,False,0.04285215,2.935038e-09,-5.764569,8.186698e-09,eQTL,chr3_88209300_94537986,False,chr3_88209300_94537986,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr3_88209300_94537986.twas.tsv.gz


#### MR

In [9]:
mr <- fread('/mnt/lustre/lab/ctcn/hklein/motor_qtl_project/unadjusted_wp_2020/susie_twas/noQC_nonimputed_result/twas/ROSMAP_eQTL_wp_unadjusted.combined_mr.tsv.gz')

In [10]:
mr_sig_unadj <- mr %>% .[
  meta_pval < 0.05 / .N &          # Bonferroni correction
  cpip >= 0.7 &                    # strong causal evidence
  num_CS >= 1 &                    # relaxed threshold, as requested by Hans and Gao on Oct 24, 2025
  Q_pval > 0.01 &                  # no significant heterogeneity (Cochran's Q)
  I2 < 0.4                         # low heterogeneity
]

In [11]:
head(mr_sig_unadj)
dim(mr_sig_unadj)

Q,Q_pval,I2,context,cpip,gene_name,gwas_study,meta_eff,meta_pval,num_CS,num_IV,se_meta_eff,region,study_context,source_file
<dbl>,<dbl>,<dbl>,<chr>,<dbl>,<chr>,<chr>,<dbl>,<dbl>,<int>,<int>,<dbl>,<chr>,<chr>,<chr>
1.823,0.402,0,Exc_DeJager_eQTL,0.983,ENSG00000138111,unadjusted_wp_2020,0.0,0,3,21,0,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.mr_result.tsv.gz
0.0,1.0,0,Inh_DeJager_eQTL,0.956,ENSG00000138111,unadjusted_wp_2020,0.0,0,1,16,0,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.mr_result.tsv.gz
0.0,1.0,0,Ast_DeJager_eQTL,1.012,ENSG00000171206,unadjusted_wp_2020,0.001,0,1,59,0,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.mr_result.tsv.gz
0.001,0.979,0,Exc_DeJager_eQTL,1.026,ENSG00000171206,unadjusted_wp_2020,0.001,0,2,144,0,chr10_100331627_104378781,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_100331627_104378781.mr_result.tsv.gz
0.0,1.0,0,Exc_DeJager_eQTL,0.96,ENSG00000078403,unadjusted_wp_2020,-0.001,0,1,8,0,chr10_20625737_23053090,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_20625737_23053090.mr_result.tsv.gz
0.0,1.0,0,PCC_DeJager_eQTL,0.995,ENSG00000148572,unadjusted_wp_2020,-0.002,0,1,92,0,chr10_62446953_64035328,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr10_62446953_64035328.mr_result.tsv.gz


In [12]:
mr_sig_unadj |> pull(gene_name)%>%unique()%>%length

In [13]:
mr_sig_unadj |> filter(!context %in% c("monocyte_ROSMAP_eQTL", "PCC_DeJager_eQTL", "AC_DeJager_eQTL")) |> pull(gene_name)%>%unique()%>%length

In [14]:
mr_sig_unadj |> group_by(context) |> summarize(gene = n_distinct(gene_name))

context,gene
<chr>,<int>
Ast_DeJager_eQTL,22
Exc_DeJager_eQTL,45
Inh_DeJager_eQTL,29
Mic_DeJager_eQTL,6
OPC_DeJager_eQTL,7
Oli_DeJager_eQTL,27
PCC_DeJager_eQTL,80


In [19]:
## Regardless of context, 60 genes have both significant TWAS and significant MR associations (twas_unadj_sig is defined in cell 18 below)
intersect(twas_unadj_sig$molecular_id, mr_sig_unadj$gene_name) |> unique()|> length()

### Summarize significant TWAS & MR genes

* Significant TWAS associations were defined as meeting either of two criteria: (i) the most accurate prediction model yielded a TWAS p-value below the Bonferroni-corrected threshold, or (ii) at least half of the prediction models produced p-values below this threshold.
* Among the 14,075 total genes, 11,538 are from DLPFC. Overall, 133 genes show significant TWAS associations, of which 84 are significant in DLPFC tissues.

* Mendelian randomization (MR) associations were considered significant if they met all of the following criteria: a meta-analysis p-value passing Bonferroni correction (meta-analysis p < 0.05 divided by the number of rows in the combined MR table, `.N` in the filter), strong causal evidence as indicated by a causal posterior inclusion probability (CPIP) ≥ 0.7, at least one credible set (num_CS ≥ 1), and no evidence of substantial heterogeneity (Cochran’s Q test p > 0.01 and I² < 0.4).
* 154 genes show significant MR associations, of which 92 are significant in DLPFC tissues.
* Regardless of context, 60 genes are both MR and TWAS significant. When matched by context, 55 genes are both MR and TWAS significant, of which 23 are from DLPFC (both bulk and sn).
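The difference between the context-agnostic and the context-matched overlap can be illustrated with a small sketch. The two toy tables and their gene IDs below are hypothetical stand-ins, not notebook data. The sketch assumes the TWAS table still holds one row per gene-context pair, that is, before any collapsing to one row per gene.

```r
library(dplyr)

# Hypothetical toy tables standing in for the significant TWAS and MR results.
twas_sig_toy <- data.frame(
  molecular_id = c("ENSG_A", "ENSG_B", "ENSG_C"),
  context      = c("Exc_DeJager_eQTL", "Ast_DeJager_eQTL", "PCC_DeJager_eQTL")
)
mr_sig_toy <- data.frame(
  gene_name = c("ENSG_A", "ENSG_B", "ENSG_D"),
  context   = c("Exc_DeJager_eQTL", "Inh_DeJager_eQTL", "PCC_DeJager_eQTL")
)

# Overlap regardless of context: match on gene ID only.
any_context <- intersect(twas_sig_toy$molecular_id, mr_sig_toy$gene_name)

# Context-matched overlap: gene ID and context must both agree.
matched <- inner_join(
  twas_sig_toy, mr_sig_toy,
  by = c("molecular_id" = "gene_name", "context")
)
n_matched <- n_distinct(matched$molecular_id)

length(any_context)  # genes shared regardless of context
n_matched            # genes shared within the same context
```

In this toy case `ENSG_B` is significant in both analyses but in different contexts, so it counts toward the context-agnostic overlap only.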

In [16]:
n_imputable_genes <- twas %>%
  filter(is_imputable) %>%                  # keep only imputable rows
  distinct(molecular_id, context, gwas_study) %>%         # count unique gene-context pairs
  nrow()
n_imputable_genes

In [17]:
# Compute cutoff using Bonferroni-style correction
unadj_cutoff <- 0.05 / n_imputable_genes
unadj_cutoff

In [18]:
twas_unadj_sig <- twas %>%
  mutate(sig = twas_pval < unadj_cutoff) %>%
  group_by(molecular_id, context, gwas_study) %>%
  summarise(
    n_methods = n(),
    n_sig = sum(sig, na.rm = TRUE),
    best_sig = any(sig & is_selected_method, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(pass = (n_sig / n_methods >= 0.5) | best_sig) %>%   # half of the methods, or the selected method
  mutate(pass_strict = (n_sig / n_methods >= 0.5)) %>%       # half-of-methods criterion only
  filter(pass) %>%
  left_join(twas) %>%
  arrange(twas_pval) %>%
  distinct(molecular_id, .keep_all = TRUE)                   # keep the smallest-p row per gene
twas_unadj_sig %>% dim

[1m[22mJoining with `by = join_by(molecular_id, context, gwas_study)`


In [105]:
twas_unadj_sig |> filter(!context %in% c("monocyte_ROSMAP_eQTL", "PCC_DeJager_eQTL", "AC_DeJager_eQTL")) |> nrow()

In [21]:
gene_ref <- fread('/mnt/lustre/home/rl3328/rl3328/resource/Homo_sapiens.GRCh38.103.chr.reformatted.collapse_only.gene.region_list.2025')

In [22]:
# twas_unadj_sig |> left_join(gene_ref, by = c("molecular_id" = "gene_id")) |> select(molecular_id, gene_name, everything())

## Figure data

In [23]:
df <- twas %>%
  mutate(sig = twas_pval < unadj_cutoff) %>%
  group_by(molecular_id, context, gwas_study) %>%
  summarise(
    n_methods = n(),
    n_sig = sum(sig, na.rm = TRUE),
    best_pval = min(twas_pval, na.rm = TRUE),
    best_method_pval = min(twas_pval[is_selected_method], na.rm = TRUE),
    best_method_selected = any(sig & is_selected_method, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(
    pass = (n_sig / n_methods >= 0.5) | best_method_selected, # pass if at least half of the methods are significant, or the selected (best) method is
    final_pval = ifelse(n_sig / n_methods >= 0.5, best_pval, best_method_pval),
    # fallback: if no selected method existed (Inf), use best_pval
    final_pval = ifelse(is.infinite(final_pval), best_pval, final_pval)
  ) %>%
  left_join(twas, by = c("molecular_id", "context", "gwas_study")) %>%
  filter(twas_pval == final_pval) %>%
  distinct(molecular_id, .keep_all = TRUE) %>%
  mutate(
    logp = -log10(twas_pval) * sign(twas_z),
    pos = (start + end) / 2
  ) %>%
  merge(gene_ref, by.x = 'molecular_id', by.y = 'gene_id')

# Order chromosomes
df$chr <- factor(df$chr, levels = as.character(1:22))

[1m[22m[36mℹ[39m In argument: `best_method_pval = min(twas_pval[is_selected_method], na.rm =
  TRUE)`.
[36mℹ[39m In group 202: `molecular_id = "ENSG00000005189"`, `context =
  "Exc_DeJager_eQTL"`, `gwas_study = "unadjusted_wp_2020"`.
[33m![39m no non-missing arguments to min; returning Inf


In [24]:
dim(df)

In [25]:
head(df)

Unnamed: 0_level_0,molecular_id,context,gwas_study,n_methods,n_sig,best_pval,best_method_pval,best_method_selected,pass,final_pval,⋯,method_selected_original,region,study_context,source_file,logp,pos,#chr,TSS.y,TES,gene_name
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<int>,<int>,<dbl>,<dbl>,<lgl>,<lgl>,<dbl>,⋯,<lgl>,<chr>,<chr>,<chr>,<dbl>,<dbl>,<chr>,<int>,<int>,<chr>
1,ENSG00000000419,Exc_DeJager_eQTL,unadjusted_wp_2020,4,0,0.004887676,0.634741658,False,False,0.634741658,⋯,True,chr20_50319421_53436693,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr20_50319421_53436693.twas.tsv.gz,-0.197403,51747434,chr20,50958555,50934867,DPM1
2,ENSG00000000457,Ast_DeJager_eQTL,unadjusted_wp_2020,4,0,0.31431759,0.66695175,False,False,0.66695175,⋯,True,chr1_168438717_170228106,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr1_168438717_170228106.twas.tsv.gz,0.1759056,169540000,chr1,169894267,169849631,SCYL3
3,ENSG00000000460,Ast_DeJager_eQTL,unadjusted_wp_2020,4,0,0.550799973,0.681781965,False,False,0.681781965,⋯,True,chr1_168438717_170228106,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr1_168438717_170228106.twas.tsv.gz,-0.1663545,169540000,chr1,169662007,169854080,C1orf112
4,ENSG00000000971,PCC_DeJager_eQTL,unadjusted_wp_2020,4,0,0.006727627,0.006727627,False,False,0.006727627,⋯,True,chr1_195599253_199271134,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr1_195599253_199271134.twas.tsv.gz,2.1721381,195093752,chr1,196652043,196747504,CFH
5,ENSG00000001036,Ast_DeJager_eQTL,unadjusted_wp_2020,4,0,0.125012981,0.125012981,False,False,0.125012981,⋯,True,chr6_143399926_145632539,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr6_143399926_145632539.twas.tsv.gz,-0.9030449,143115860,chr6,143511720,143494812,FUCA2
6,ENSG00000001084,Ast_DeJager_eQTL,unadjusted_wp_2020,4,0,0.166978326,0.851335412,False,False,0.851335412,⋯,True,chr6_52730905_54027603,ROSMAP_eQTL_wp,ROSMAP_eQTL_wp_unadjusted.chr6_52730905_54027603.twas.tsv.gz,-0.0698993,53557156,chr6,53616970,53497341,GCLC


In [26]:
chr_info <- df %>%
  group_by(chr) %>%
  summarise(chr_len = max(pos)) %>%
  mutate(chr_start = lag(cumsum(chr_len), default = 0))

df <- df %>%
  left_join(chr_info, by = "chr") %>%
  mutate(cum_pos = pos + chr_start)
  
axis_df <- chr_info %>%
  mutate(center = chr_start + chr_len/2)


In [27]:
df <- df %>%
  left_join(
    mr_sig_unadj %>% rename(molecular_id = gene_name),
    by = c("molecular_id", "context")
  )

In [28]:
df %>% filter(!is.na(Q_pval)) %>% select(molecular_id, context, gene_name)

molecular_id,context,gene_name
<chr>,<chr>,<chr>
ENSG00000004534,Ast_DeJager_eQTL,RBM6
ENSG00000014123,Ast_DeJager_eQTL,UFL1
ENSG00000023516,Exc_DeJager_eQTL,AKAP11
ENSG00000030066,Exc_DeJager_eQTL,NUP160
ENSG00000073711,Ast_DeJager_eQTL,PPP2R3A
ENSG00000075213,Ast_DeJager_eQTL,SEMA3A
ENSG00000075413,Ast_DeJager_eQTL,MARK3
ENSG00000078487,Ast_DeJager_eQTL,ZCWPW1
ENSG00000081377,Ast_DeJager_eQTL,CDC14B
ENSG00000090621,PCC_DeJager_eQTL,PABPC4


In [29]:
head(df)
dim(df)

Unnamed: 0_level_0,molecular_id,context,gwas_study.x,n_methods,n_sig,best_pval,best_method_pval,best_method_selected,pass,final_pval,⋯,cpip,gwas_study.y,meta_eff,meta_pval,num_CS,num_IV,se_meta_eff,region.y,study_context.y,source_file.y
Unnamed: 0_level_1,<chr>,<chr>,<chr>,<int>,<int>,<dbl>,<dbl>,<lgl>,<lgl>,<dbl>,⋯,<dbl>,<chr>,<dbl>,<dbl>,<int>,<int>,<dbl>,<chr>,<chr>,<chr>
1,ENSG00000000419,Exc_DeJager_eQTL,unadjusted_wp_2020,4,0,0.004887676,0.634741658,False,False,0.634741658,⋯,,,,,,,,,,
2,ENSG00000000457,Ast_DeJager_eQTL,unadjusted_wp_2020,4,0,0.31431759,0.66695175,False,False,0.66695175,⋯,,,,,,,,,,
3,ENSG00000000460,Ast_DeJager_eQTL,unadjusted_wp_2020,4,0,0.550799973,0.681781965,False,False,0.681781965,⋯,,,,,,,,,,
4,ENSG00000000971,PCC_DeJager_eQTL,unadjusted_wp_2020,4,0,0.006727627,0.006727627,False,False,0.006727627,⋯,,,,,,,,,,
5,ENSG00000001036,Ast_DeJager_eQTL,unadjusted_wp_2020,4,0,0.125012981,0.125012981,False,False,0.125012981,⋯,,,,,,,,,,
6,ENSG00000001084,Ast_DeJager_eQTL,unadjusted_wp_2020,4,0,0.166978326,0.851335412,False,False,0.851335412,⋯,,,,,,,,,,


In [30]:

# Categorize each gene by TWAS significance (`pass`) and MR significance (non-missing Q_pval after the join with mr_sig_unadj)
df <- df %>%
  mutate(category = case_when(
    !is.na(Q_pval) & pass ~ "TWAS & MR",        # Both significant
    is.na(Q_pval) & pass ~ "TWAS only",         # Only TWAS significant
    !is.na(Q_pval) & !pass ~ "MR only",         # Only MR significant
    is.na(Q_pval) & !pass ~ "Not Significant",  # Neither significant
    TRUE ~ "Others"                              # Catch-all for remaining cases
  ))

In [31]:
df |> count(category)

category,n
<chr>,<int>
MR only,32
Not Significant,13997
TWAS & MR,24
TWAS only,54


In [32]:
# dir.create('data')
saveRDS(list(df = df, unadj_cutoff = unadj_cutoff, axis_df= axis_df), 'data/TWAS_manhattan_plot_data.rds')
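The saved list could feed a Manhattan plot along the following lines. This is only a sketch on simulated data, not the figure code used for the project: the column names (`cum_pos`, `logp`, `category`, `center`) follow the notebook, while all values, the three-chromosome layout, and the styling are assumptions.

```r
library(ggplot2)

# Simulated stand-in for readRDS('data/TWAS_manhattan_plot_data.rds') (hypothetical values).
set.seed(1)
toy <- list(
  df = data.frame(
    chr      = factor(rep(1:3, each = 50), levels = as.character(1:3)),
    cum_pos  = sort(runif(150, 0, 3e8)),
    logp     = rnorm(150, sd = 2),   # signed -log10(p), as in the notebook
    category = sample(c("Not Significant", "TWAS only", "MR only", "TWAS & MR"),
                      150, replace = TRUE, prob = c(0.90, 0.05, 0.03, 0.02))
  ),
  unadj_cutoff = 1e-3,
  axis_df = data.frame(chr = as.character(1:3), center = c(0.5e8, 1.5e8, 2.5e8))
)

# Points coloured by TWAS/MR category; dashed lines mark the cutoff in both directions.
p <- ggplot(toy$df, aes(x = cum_pos, y = logp, colour = category)) +
  geom_point(size = 1) +
  geom_hline(yintercept = c(-1, 1) * -log10(toy$unadj_cutoff), linetype = "dashed") +
  scale_x_continuous(breaks = toy$axis_df$center, labels = toy$axis_df$chr) +
  labs(x = "Chromosome", y = "Signed -log10(TWAS p)", colour = NULL) +
  theme_classic()
```

Because `logp` carries the sign of the TWAS z-score, the cutoff is drawn at both `-log10(unadj_cutoff)` and its negative.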