version 0.0.2
JunJunLao edited this page Nov 21, 2022 · 1 revision
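Less than a day after ClusterGVis was released, I had already received support and encouragement from many followers, which suggests people like it. There were also plenty of questions and suggestions, so this release adds some updates and optimizations that address them.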
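Questions:
Q: How do I show gene names?
- When the number of clustered genes is small, you can label all of them by setting show_row_names=TRUE. When there are very many genes, it is usually better to label only the genes you are interested in, which you can do by passing their names to markGenes.
Q: What input should I use for single-cell data?
- Some users may want to apply ClusterGVis to single-cell data, but single-cell objects do not contain tpm/fpkm/rpkm values. In that case you can use the single-cell scaledata as input. You need to set scaleData=TRUE, because by default the input data is Z-score transformed.
Q: How do I draw the line plots for multiple groups?
- This applies to expression profiles measured at multiple time points and split into treatment and control groups. Set mulGroup to give the number of samples in each group.
Q: Can I change the size and color of the GO terms?
- Yes. You can make the term text size vary with the P value, and you can also specify the colors.
Q: Can enrichment code be added for one-step enrichment, showing the top enriched terms of each cluster by default?
- Done. GO and KEGG enrichment code has been added; the enrichCluster function runs the enrichment in one step.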
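To make the Z-score step mentioned in the single-cell answer concrete, here is a minimal base R sketch of a row-wise Z-score. It is not the ClusterGVis internal code; the matrix `mat` and the helper `zscore_rows` are hypothetical names made up for illustration.

```r
# Hypothetical toy expression matrix: 3 genes (rows) x 4 samples (columns)
mat <- matrix(c(1, 2, 3, 4,
                10, 20, 30, 40,
                5, 5, 6, 6),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))

# Row-wise Z-score: scale() works on columns, so transpose before and after
zscore_rows <- function(m) t(scale(t(m)))

z <- zscore_rows(mat)

# Each gene now has mean 0 and standard deviation 1 across samples
row_means <- rowMeans(z)
row_sds <- apply(z, 1, sd)

# Applying the transform again to already Z-scored rows changes nothing
z_again <- zscore_rows(z)
```

Because rows that are already Z-scored have mean 0 and standard deviation 1, a second pass leaves the values unchanged.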
Reinstall the package to get the new features:
# install.packages("devtools")
devtools::install_github("junjunlab/ClusterGVis")
As before, load the package and cluster the data:
library(ClusterGVis)
# load data
data(exps)
# cluster the expression matrix with k-means into 8 clusters
ck <- clusterData(exp = exps,
                  cluster.method = "kmeans",
                  cluster.num = 8)
Randomly pick some gene names to label:
# add gene names: sample 30 genes without replacement
markGenes = rownames(exps)[sample(1:nrow(exps),30,replace = FALSE)]
pdf('addgene.pdf',height = 10,width = 6,onefile = FALSE)
visCluster(object = ck,
           plot.type = "heatmap",
           column_names_rot = 45,
           markGenes = markGenes)
dev.off()
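The random selection above changes on every run. If you want the same genes labelled each time, one option is to fix the seed before sampling. A minimal base R sketch, where `gene_ids` is a hypothetical stand-in for rownames(exps):

```r
# Hypothetical gene identifiers standing in for rownames(exps)
gene_ids <- paste0("gene", 1:1000)

# Fix the seed so the same 30 genes are drawn on every run
set.seed(123)
picked_a <- sample(gene_ids, 30, replace = FALSE)

# Resetting the same seed reproduces the same selection
set.seed(123)
picked_b <- sample(gene_ids, 30, replace = FALSE)

same_selection <- identical(picked_a, picked_b)
```

The seed value 123 is arbitrary; any fixed integer gives a repeatable selection.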
Load the annotation data. You can supply a three-column table (cluster id, term, p value):
# load GO term data
data("termanno2")
# check
head(termanno2,3)
# id term pval
# 1 C1 developmental process 3.17e-69
# 2 C1 anatomical structure development 1.44e-68
# 3 C1 multicellular organism development 1.36e-66
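If you want to annotate clusters with your own terms, a table that mirrors the three-column layout of termanno2 can be built by hand. This is a minimal sketch: the terms and p values are hypothetical placeholders, and the cluster ids simply follow the C1, C2, ... pattern visible in the output above.

```r
# Hypothetical annotation table: cluster id, term description, p value
my_anno <- data.frame(
  id   = c("C1", "C1", "C2", "C2"),
  term = c("term A", "term B", "term C", "term D"),
  pval = c(1e-10, 1e-8, 1e-6, 1e-4),
  stringsAsFactors = FALSE
)

# Basic sanity checks before handing the table to a plotting function
stopifnot(ncol(my_anno) == 3,
          is.numeric(my_anno$pval),
          all(my_anno$pval > 0 & my_anno$pval <= 1))

head(my_anno, 3)
```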
Plot with the default settings:
# annotate with GO terms
pdf('term.pdf',height = 10,width = 10,onefile = FALSE)
visCluster(object = ck,
           plot.type = "both",
           column_names_rot = 45,
           markGenes = markGenes,
           annoTerm.data = termanno2)
dev.off()
Adjust the side on which gene labels are drawn and their graphical parameters:
# adjust gene labels and side
pdf('term2.pdf',height = 10,width = 10,onefile = FALSE)
visCluster(object = ck,
           plot.type = "both",
           column_names_rot = 45,
           markGenes = markGenes,
           markGenes.side = "left",
           genes.gp = c('italic',fontsize = 12,col = "orange"),
           annoTerm.data = termanno2)
dev.off()
Move the line plots to the other side as well:
# adjust lines side
pdf('termlf.pdf',height = 10,width = 10,onefile = FALSE)
visCluster(object = ck,
           plot.type = "both",
           column_names_rot = 45,
           markGenes = markGenes,
           markGenes.side = "left",
           genes.gp = c('italic',fontsize = 12,col = "black"),
           annoTerm.data = termanno2,
           line.side = "left")
dev.off()
Change the term colors and scale the text size by p value:
# adjust term colors and text size
pdf('termlfts.pdf',height = 10,width = 10,onefile = FALSE)
visCluster(object = ck,
           plot.type = "both",
           column_names_rot = 45,
           markGenes = markGenes,
           markGenes.side = "left",
           genes.gp = c('italic',fontsize = 12,col = "black"),
           annoTerm.data = termanno2,
           line.side = "left",
           go.col = rep(ggsci::pal_d3()(8),each = 3),
           go.size = "pval")
dev.off()
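In the call above, go.col repeats each of 8 colors 3 times, which presumably corresponds to 8 clusters with 3 terms each in termanno2. The sketch below derives one color per row from the cluster id column instead, so the vector length always matches the table. It is only an illustration: `anno` is a hypothetical table, and grDevices::hcl.colors is swapped in for ggsci::pal_d3 to keep the example free of extra dependencies.

```r
# Hypothetical annotation table with an uneven number of terms per cluster
anno <- data.frame(
  id   = c("C1", "C1", "C1", "C2", "C2", "C3"),
  term = paste("term", 1:6),
  pval = c(1e-9, 1e-7, 1e-5, 1e-8, 1e-4, 1e-3),
  stringsAsFactors = FALSE
)

clusters <- unique(anno$id)

# One color per cluster (base R palette used here instead of ggsci)
cluster_cols <- setNames(grDevices::hcl.colors(length(clusters), "Dark 3"),
                         clusters)

# Expand to one color per row, so every term inherits its cluster's color
term_cols <- unname(cluster_cols[anno$id])
```

A vector like `term_cols` could then be supplied as go.col, assuming the argument expects one color per annotation row, as the rep(..., each = 3) pattern suggests.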
Use mulGroup to set the number of samples in each group, and set the corresponding line colors with mline.col:
# multiple groups
pdf('termlftsmp.pdf',height = 10,width = 11,onefile = FALSE)
visCluster(object = ck,
           plot.type = "both",
           column_names_rot = 45,
           show_row_dend = FALSE,
           markGenes = markGenes,
           markGenes.side = "left",
           genes.gp = c('italic',fontsize = 12,col = "black"),
           annoTerm.data = termanno2,
           line.side = "left",
           go.col = rep(ggsci::pal_d3()(8),each = 3),
           go.size = "pval",
           mulGroup = c(2,2,2),
           mline.col = c(ggsci::pal_lancet()(3)))
dev.off()
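In the call above, mulGroup = c(2,2,2) describes three groups of two samples each, paired with three line colors. Assuming the columns of the expression matrix are ordered group by group (an assumption, not something this page states), the mapping from columns to groups can be sketched in base R. The sample names are hypothetical.

```r
# Hypothetical sample names: 6 columns ordered group by group
samples <- c("ctrl_t1", "ctrl_t2", "treatA_t1", "treatA_t2",
             "treatB_t1", "treatB_t2")

# Number of samples in each group, in the same form as passed to mulGroup
mulGroup <- c(2, 2, 2)

# The group sizes must add up to the number of columns
stopifnot(sum(mulGroup) == length(samples))

# Assign a group index to every column, then list the samples per group
group_index <- rep(seq_along(mulGroup), times = mulGroup)
groups <- split(samples, group_index)
```

Checking that the group sizes sum to the number of columns before plotting is a cheap way to catch a mismatched mulGroup vector.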
Pass the result of clusterData directly and enrichment analysis is run automatically for each cluster. A random seed is set so that the results are reproducible; the default seed is 5201314:
library(org.Mm.eg.db)
# enrich for clusters
enrich <- enrichCluster(object = ck,
                        OrgDb = org.Mm.eg.db,
                        type = "BP",
                        pvalueCutoff = 0.05,
                        topn = 5,
                        seed = 5201314)
# check
head(enrich,3)
# group Description pvalue
# GO:0046034 C1 ATP metabolic process 5.823698e-07
# GO:0062012 C1 regulation of small molecule metabolic process 1.180990e-06
# GO:0002363 C1 alpha-beta T cell lineage commitment 1.202543e-06
Note: setting topn=NULL returns all enrichment results. By default the top 5 enriched terms (ranked by P value) are kept.
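The top-n selection described in the note can be illustrated in base R. This is a sketch of the idea only, not the enrichCluster implementation; the table `res` and the helper `top_n_per_group` are hypothetical.

```r
# Hypothetical enrichment results for two clusters
res <- data.frame(
  group       = c("C1", "C1", "C1", "C2", "C2"),
  Description = c("term A", "term B", "term C", "term D", "term E"),
  pvalue      = c(1e-3, 1e-7, 1e-5, 1e-2, 1e-6),
  stringsAsFactors = FALSE
)

# Keep the n smallest p values within each group; n = NULL keeps everything
top_n_per_group <- function(df, n = 5) {
  parts <- lapply(split(df, df$group), function(d) {
    d <- d[order(d$pvalue), ]
    if (is.null(n)) d else head(d, n)
  })
  out <- do.call(rbind, parts)
  rownames(out) <- NULL
  out
}

top2 <- top_n_per_group(res, n = 2)
all_terms <- top_n_per_group(res, n = NULL)
```

Here `top2` holds the two most significant terms of each cluster, while `all_terms` keeps every row, sorted by p value within its cluster.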
Plot the enrichment results:
# plot
pdf('termenrich.pdf',height = 10,width = 11,onefile = FALSE)
visCluster(object = ck,
           plot.type = "both",
           column_names_rot = 45,
           show_row_dend = FALSE,
           markGenes = markGenes,
           markGenes.side = "left",
           genes.gp = c('italic',fontsize = 12,col = "black"),
           annoTerm.data = enrich,
           line.side = "left",
           go.col = rep(ggsci::pal_d3()(8),each = 5),
           go.size = "pval",
           mulGroup = c(2,2,2),
           mline.col = c(ggsci::pal_lancet()(3)))
dev.off()
If you have any questions or ideas, feel free to discuss them on GitHub!