# Analysis 09: Interval Overlap Identification

In [None]:

library(tidyverse)


── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:lubridate':

    intersect, setdiff, union

The following objects are masked from 'package:dplyr':

    combine, intersect, setdiff, union

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    anyDuplicated, aperm, append, as.data.frame, basename, cbind,
    colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
    get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
    Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
    table, tapply, union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:lubridate':

    second, second<-

The following


Attaching package: 'cowplot'

The following object is masked from 'package:lubridate':

    stamp


Attaching package: 'foreach'

The following objects are masked from 'package:purrr':

    accumulate, when

Loading required package: iterators
Loading required package: parallel

# Inputs

In [None]:

ns_dir <- "data/processed/20231116_Analysis_NemaScan"
tox_file <- "data/processed/tox_data/tox_metadata.csv"


# Outputs

In [None]:
out_dir <- "data/processed/interval_overlap"
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}


# Main

This section identifies overlapping intervals and calculates the proportion of overlapping intervals for all toxicant pairs.

This script runs `pairwise_qtl_overlaps()` and processes the output.

-   The output covers all possible trait pair combinations; each row describes one trait pair and, where present, one overlap between their intervals.
-   Trait pairs with no overlaps are listed as rows with `NA` values for `peakidA` and `peakidB`.
-   Trait pairs with multiple overlaps have a row for each overlap.
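
The row layout described above can be illustrated with a small sketch. It is not the implementation of `pairwise_qtl_overlaps()`: the helper `toy_pair_overlaps()`, the toy interval table, and its column names (`chrom`, `start`, `end`, `peakid`) are all assumptions made for illustration. The log messages mention gRanges objects, so the real function may rely on GenomicRanges; this sketch uses plain base R comparisons and treats intervals as closed.

```r
# Toy QTL intervals; column names and values are illustrative assumptions.
qtl <- data.frame(
  trait  = c("traitX", "traitX", "traitY", "traitZ"),
  peakid = c("I:1500", "II:8000", "I:2500", "V:400"),
  chrom  = c("I", "II", "I", "V"),
  start  = c(1000, 7000, 1800, 100),
  end    = c(2000, 9000, 3000, 900),
  stringsAsFactors = FALSE
)

# Hypothetical helper: overlaps between the intervals of two traits.
toy_pair_overlaps <- function(qtl, trait_a, trait_b) {
  a <- qtl[qtl$trait == trait_a, ]
  b <- qtl[qtl$trait == trait_b, ]
  # Compare every interval of trait A with every interval of trait B.
  idx <- expand.grid(i = seq_len(nrow(a)), j = seq_len(nrow(b)))
  # Closed intervals overlap when they share a chromosome and
  # each one starts no later than the other ends.
  hit <- a$chrom[idx$i] == b$chrom[idx$j] &
    a$start[idx$i] <= b$end[idx$j] &
    b$start[idx$j] <= a$end[idx$i]
  idx <- idx[hit, , drop = FALSE]
  if (nrow(idx) == 0) {
    # No overlap: keep the pair as a single row with NA peak IDs.
    return(data.frame(traitA = trait_a, traitB = trait_b,
                      peakidA = NA_character_, peakidB = NA_character_,
                      stringsAsFactors = FALSE))
  }
  # One row per overlapping pair of intervals.
  data.frame(traitA = trait_a, traitB = trait_b,
             peakidA = a$peakid[idx$i], peakidB = b$peakid[idx$j],
             stringsAsFactors = FALSE)
}

# All unordered trait pairs, one block of rows per pair.
pairs <- combn(unique(qtl$trait), 2, simplify = FALSE)
toy_overlaps <- do.call(rbind, lapply(pairs, function(p) {
  toy_pair_overlaps(qtl, p[1], p[2])
}))
print(toy_overlaps)
```

In this toy table only `traitX` and `traitY` share an overlapping interval, so the other two pairs appear as rows with `NA` peak IDs.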

To get the overlaps each interval is involved in, we filter the output to only include rows where `peakidA` or `peakidB` is not `NA`.
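
A minimal sketch of this filtering step, assuming an overlap table with the `traitA`, `traitB`, `peakidA`, and `peakidB` columns described in this section. The toy values and the object names (`toy_overlaps`, `overlapping_rows`, `overlap_intervals`) are made up for illustration, and the final stacking step is one possible way to list the intervals involved, not necessarily what the script does.

```r
library(dplyr)

# Toy overlap table; the values are made up for illustration.
toy_overlaps <- tibble::tibble(
  traitA  = c("traitX", "traitX", "traitY"),
  traitB  = c("traitY", "traitZ", "traitZ"),
  peakidA = c("I:1500", NA, NA),
  peakidB = c("I:2500", NA, NA)
)

# Keep rows where at least one peak ID is present.
overlapping_rows <- toy_overlaps %>%
  dplyr::filter(!is.na(peakidA) | !is.na(peakidB))

# Assumption: stack both sides to list every interval involved in an overlap.
overlap_intervals <- dplyr::bind_rows(
  dplyr::select(overlapping_rows, trait = traitA, peakid = peakidA),
  dplyr::select(overlapping_rows, trait = traitB, peakid = peakidB)
) %>%
  dplyr::distinct()

print(overlap_intervals)
```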

The script also outputs a dataframe summarizing the overlaps.

# Output files

`inbred_qtl_overlaps.csv` - dataframe with the following columns:

-   `peakidA` - peak marker ID for trait A
-   `peakidB` - peak marker ID for trait B
-   `traitA` - trait name for trait A
-   `traitB` - trait name for trait B
-   `groupA` - use or MoA for trait A
-   `groupB` - use or MoA for trait B
-   `n_overlaps` - number of overlapping intervals

`inbred_qtl_overlaps_moa.csv` - dataframe with the same columns as `inbred_qtl_overlaps.csv` but where group represents the MoA group instead of the use group

`inbred_qtl_overlaps_summary.csv` - summary of the overlapping intervals

`inbred_condition_pair_overlaps_summary.csv` - summary of the overlapping intervals by condition pair

`overlap_qtl.csv` - list of intervals that overlap another interval

In [None]:
tox_metadata <- data.table::fread(tox_file) %>%
  dplyr::select(trait, nice_drug_label2, big_class, class, moa_class) %>%
  # replace "uM" with "µM" wherever it appears in the drug label
  # (an exact-match comparison would only change labels equal to "uM")
  dplyr::mutate(
    nice_drug_label2 = stringr::str_replace(nice_drug_label2, "uM", "µM")
  )

### Load the inbred QTL data ###
inbred_qtl <- data.table::fread(
  glue::glue("{ns_dir}/INBRED/Mapping/Processed/QTL_peaks_inbred.tsv")
) %>%
  # remove the CV_length traits
  dplyr::filter(!stringr::str_detect(trait, pattern = "CV_length"))

# join the toxicant data to the qtl data
tox.QTL.summary <- inbred_qtl %>%
  dplyr::left_join(tox_metadata, by = "trait")

### Create a dataframe with the data for just MoA or just use ###

tox_group_df <- tox.QTL.summary %>%
  dplyr::mutate(group = big_class)

tox_moa_df <- tox.QTL.summary %>%
  dplyr::mutate(group = moa_class)

#### Calculate Overlaps for all trait pairs #### ----

print("Calculating overlaps for class data")


[1] "Calculating overlaps for class data"

[1] "Generating all possible trait pairs"
[1] "Generated all possible trait pairs"
[1] "Starting to iterate over the pairs"
[1] "starting new pair"
[1] "length_2_4_D"          "length_Methyl_mercury"
Creating gRanges for trait length_2_4_D
Creating gRanges for trait length_Methyl_mercury
[1] "created both trait gRanges"
Calculating overlaps for length_2_4_D and length_Methyl_mercury
Number of overlaps: 0
[1] "calculated overlaps"
[1] "starting new pair"
[1] "length_2_4_D"         "length_Paraquat_62_5"
Creating gRanges for trait length_2_4_D
Creating gRanges for trait length_Paraquat_62_5
[1] "created both trait gRanges"
Calculating overlaps for length_2_4_D and length_Paraquat_62_5
Number of overlaps: 0
[1] "calculated overlaps"
[1] "starting new pair"
[1] "length_2_4_D"    "length_Aldicarb"
Creating gRanges for trait length_2_4_D
Creating gRanges for trait length_Aldicarb
[1] "created both trait gRanges"
Calculating overlaps for length_2_4_D and length_Aldicarb
Number of overlaps: 0


[1] "Calculating overlaps for MOA data"

[1] "Generating all possible trait pairs"
[1] "Generated all possible trait pairs"
[1] "Starting to iterate over the pairs"
[1] "starting new pair"
[1] "length_2_4_D"          "length_Methyl_mercury"
Creating gRanges for trait length_2_4_D
Creating gRanges for trait length_Methyl_mercury
[1] "created both trait gRanges"
Calculating overlaps for length_2_4_D and length_Methyl_mercury
Number of overlaps: 0
[1] "calculated overlaps"
[1] "starting new pair"
[1] "length_2_4_D"         "length_Paraquat_62_5"
Creating gRanges for trait length_2_4_D
Creating gRanges for trait length_Paraquat_62_5
[1] "created both trait gRanges"
Calculating overlaps for length_2_4_D and length_Paraquat_62_5
Number of overlaps: 0
[1] "calculated overlaps"
[1] "starting new pair"
[1] "length_2_4_D"    "length_Aldicarb"
Creating gRanges for trait length_2_4_D
Creating gRanges for trait length_Aldicarb
[1] "created both trait gRanges"
Calculating overlaps for length_2_4_D and length_Aldicarb
Number of overlaps: 0
