Issue with hovering for standardized parallel coordinate plot for RNA seq #15

Galaxy154 · 2023-05-23T20:20:37Z

Hi there,

I love this tool, but I am having an issue creating the interactive parallel coordinate plot, both with my own data and with the example data. I cannot get the gene names to show up when hovering over the individual lines. I have followed the exact steps outlined here: https://lindsayrutter.github.io/bigPint/articles/pipeline.html#step-5-deg-litre-plots-2 , but I am unable to get it to work even with the example data. When hovering over the lines, I get the sample names instead of the gene names. I think this is really cool functionality and would appreciate any help with getting the gene names to show up!

Below is an example image, along with the code I used to generate it:

library(bigPint)
library(dplyr)
library(ggplot2)
library(plotly)

data = data %>% select(ID, starts_with("B"), starts_with("L"))
str(data, strict.width = "wrap")

data_st <- as.data.frame(t(apply(as.matrix(data[,-1]), 1, scale)))
data_st$ID <- as.character(data$ID)
# Move the ID column to the front; the parentheses make the intended range 1:(n-1) explicit
data_st <- data_st[,c(length(data_st), 1:(length(data_st)-1))]
colnames(data_st) <- colnames(data)
nID <- which(is.nan(data_st[,2]))
data_st[nID,2:length(data_st)] <- 0

library(edgeR)
library(data.table)

rownames(data) = data[,1]

y = DGEList(counts=data[,-1])
group = c(1,1,1,1,2,2,2,2)

y = DGEList(counts=y, group=group)
Group = factor(c(rep("B",4), rep("L",4)))
design <- model.matrix(~0+Group, data=y$samples)
colnames(design) <- levels(Group)
y <- estimateDisp(y, design)
fit <- glmFit(y, design)
dataMetrics <- list()

contrast=rep(0,ncol(fit))
contrast[1]=1
contrast[2]=-1
lrt <- glmLRT(fit, contrast=contrast)
lrt <- topTags(lrt, n = nrow(y[[1]]))[[1]]

lrt <- setDT(lrt, keep.rownames = TRUE)[]
colnames(lrt)[1] = "ID"
lrt <- as.data.frame(lrt)

dataMetrics[[paste0(colnames(fit)[1], "_", colnames(fit)[2])]] <- lrt

ret <- plotPCP(data=data_st, saveFile = FALSE)
ret[["B_L"]]

ret <- plotPCP(data_st, dataMetrics, threshVal = 0.1, lineSize = 0.3,
lineColor = "magenta", saveFile = FALSE)
ret[["B_L"]] + ggtitle("DEGs (FDR < 0.1)")

#Making the plot
ret <- plotClusters(data_st, dataMetrics, threshVal = 0.1, nC = 2,
colList = c("#00A600FF", "#CC00FFFF"), lineSize = 0.5, verbose = TRUE)
plot(ret[["B_L_2"]])

ret <- plotPCP(data_st, dataMetrics, threshVal = 0.2, lineSize = 0.5,
lineColor = "magenta", saveFile = FALSE, hover = TRUE)
ret[["B_L"]] %>% layout(title="DEGs (FDR < 0.2)")

lindsayrutter · 2023-05-25T07:13:11Z

Hi galaxy:

Thanks for your inquiry. I see that your data has variable names "B" and "L" (instead of "S1" and "S3", as in the toy soybean_cn_sub data in the bigPint package) and that you have four samples for each variable (instead of three samples per variable, as in the toy data).

So I reproduced the code from the website you were following (here) and altered the example data (soybean_cn_sub) by changing its variable names to "B" and "L" and adding a fourth sample to each variable (a copy of the third), using the add_column() function from the tibble package. This should give a dataset similar to the one you are working with. I then ran your exact code on it, and it generated the output you intended. See the code below:

library(bigPint)
library(dplyr)
library(ggplot2)
library(plotly)
library(tibble) # to use add_column() function

# Create a dataset similar to Galaxy154's
data("soybean_cn_sub")
data = soybean_cn_sub %>% select(ID, starts_with("S1"), starts_with("S3"))
names(data) = c("ID", "B.1", "B.2", "B.3", "L.1", "L.2", "L.3")
data = add_column(data, B.4 = data[,4], .after = 4) # Add a fourth sample to "B"
data = add_column(data, L.4 = data[,8], .after = 8) # Add a fourth sample to "L"

# Check the structure of the simulated data
str(data)

# The rest of the code is the original Galaxy154's code
data_st <- as.data.frame(t(apply(as.matrix(data[,-1]), 1, scale)))
data_st$ID <- as.character(data$ID)
# Move the ID column to the front; the parentheses make the intended range 1:(n-1) explicit
data_st <- data_st[,c(length(data_st), 1:(length(data_st)-1))]
colnames(data_st) <- colnames(data)
nID <- which(is.nan(data_st[,2]))
data_st[nID,2:length(data_st)] <- 0

library(edgeR)
library(data.table)

rownames(data) = data[,1]

y = DGEList(counts=data[,-1])
group = c(1,1,1,1,2,2,2,2)

y = DGEList(counts=y, group=group)
Group = factor(c(rep("B",4), rep("L",4)))
design <- model.matrix(~0+Group, data=y$samples)
colnames(design) <- levels(Group)
y <- estimateDisp(y, design)
fit <- glmFit(y, design)
dataMetrics <- list()

contrast=rep(0,ncol(fit))
contrast[1]=1
contrast[2]=-1
lrt <- glmLRT(fit, contrast=contrast)
lrt <- topTags(lrt, n = nrow(y[[1]]))[[1]]

lrt <- setDT(lrt, keep.rownames = TRUE)[]
colnames(lrt)[1] = "ID"
lrt <- as.data.frame(lrt)

dataMetrics[[paste0(colnames(fit)[1], "_", colnames(fit)[2])]] <- lrt

ret <- plotPCP(data=data_st, saveFile = FALSE)
ret[["B_L"]]

ret <- plotPCP(data_st, dataMetrics, threshVal = 0.1, lineSize = 0.3, lineColor = "magenta", saveFile = FALSE)
ret[["B_L"]] + ggtitle("DEGs (FDR < 0.1)")

#Making the plot
ret <- plotClusters(data_st, dataMetrics, threshVal = 0.1, nC = 2, colList = c("#00A600FF", "#CC00FFFF"), lineSize = 0.5, verbose = TRUE)
plot(ret[["B_L_2"]])

ret <- plotPCP(data_st, dataMetrics, threshVal = 0.2, lineSize = 0.5, lineColor = "magenta", saveFile = FALSE, hover = TRUE)
ret[["B_L"]] %>% layout(title="DEGs (FDR < 0.2)")

The above code does seem to create what you intend, i.e. an interactive plot that displays the gene names (instead of the sample names).

Does this code work for you too? If so, then the next step is to determine how your data differs from the toy data I simulated above to resemble what I believe your data looks like.

If you are still stuck, please let me know what you get when you run the command:

str(data)

on your data frame. If your data frame has a different structure from the toy data I used above (i.e. str(data) gives a different format from the one I got in the code above), then that may point us to the source of the problem.
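As a rough sketch of what to look for in that str() output, the checks below assume the layout I understand bigPint to expect: a first column named ID of class character, and the remaining columns numeric with names of the form group.replicate (such as B.1 or L.4). The function check_layout() is a hypothetical helper written only for this illustration; it is not part of bigPint, and passing it does not guarantee that hovering will work.

```r
# Hypothetical helper (not a bigPint function): list the ways in which `df`
# departs from the layout assumed above. Returns character(0) if none found.
check_layout <- function(df) {
  if (!is.data.frame(df) || ncol(df) < 2) {
    return("expected a data frame with an ID column plus sample columns")
  }
  problems <- character(0)

  # Assumption: the first column is called "ID" and holds character gene names
  if (names(df)[1] != "ID") {
    problems <- c(problems, "first column is not named 'ID'")
  }
  if (!is.character(df[[1]])) {
    problems <- c(problems,
                  paste0("ID column has class '", class(df[[1]])[1],
                         "' rather than 'character'"))
  }

  # Assumption: every remaining column is numeric (integer counts also pass)
  sample_cols <- names(df)[-1]
  non_numeric <- sample_cols[!vapply(df[-1], is.numeric, logical(1))]
  if (length(non_numeric) > 0) {
    problems <- c(problems,
                  paste("non-numeric sample columns:",
                        paste(non_numeric, collapse = ", ")))
  }

  # Assumption: sample names look like "group.replicate", e.g. "B.1"
  bad_names <- sample_cols[!grepl("^[A-Za-z0-9]+\\.[A-Za-z0-9]+$", sample_cols)]
  if (length(bad_names) > 0) {
    problems <- c(problems,
                  paste("names not of the form group.replicate:",
                        paste(bad_names, collapse = ", ")))
  }
  problems
}

# A small made-up data frame in the assumed layout
toy <- data.frame(ID = c("gene1", "gene2"),
                  B.1 = c(5, 0), B.2 = c(7, 0),
                  L.1 = c(1, 0), L.2 = c(2, 0),
                  stringsAsFactors = FALSE)
check_layout(toy)  # character(0)

# A made-up data frame that departs from it in three ways:
# factor IDs, a character sample column, and names without a period
odd <- data.frame(ID = factor(c("gene1", "gene2")),
                  B1 = c(5, 0),
                  L1 = c("1", "0"))
check_layout(odd)
```

Running check_layout(data) on your own data frame, next to str(data), should make it quicker to spot where it differs from the simulated data.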

Thanks again!

lindsayrutter added the more-information-needed label Sep 6, 2023
