GO_MWU.R error #7

Ruiqi-CUB · 2020-12-09T05:10:48Z

Hello Dr. Matz,

I was trying to run the GO_MWU.R but ended up with an error at the very first step. Would you mind having a look?

The code I tried to run was

gomwuStats(input, goDatabase, goAnnotations, goDivision,
	perlPath="perl", # replace with full path to perl executable if it is not in your system's PATH already
	largest=0.1,  # a GO category will not be considered if it contains more than this fraction of the total number of genes
	smallest=5,   # a GO category should contain at least this many genes to be considered
	clusterCutHeight=0.25, # threshold for merging similar (gene-sharing) terms. See README for details.
#	Alternative="g" # by default the MWU test is two-tailed; specify "g" or "l" if you want to test for "greater" or "less" instead. 
#	Module=TRUE,Alternative="g" # un-remark this if you are analyzing a SIGNED WGCNA module (values: 0 for not in module genes, kME for in-module genes). In the call to gomwuPlot below, specify absValue=0.001 (count number of "good genes" that fall into the module)
#	Module=TRUE # un-remark this if you are analyzing an UNSIGNED WGCNA module 
)

The error I got was

go.obo scruposum_gene2go.tab scruposum_foldchange.csv CC largest=0.1 smallest=5 cutHeight=0.25

Run parameters:

largest GO category as fraction of all genes (largest)  : 0.1
         smallest GO category as # of genes (smallest)  : 5
                clustering threshold (clusterCutHeight) : 0.25

-----------------
retrieving GO hierarchy, reformatting data...

-------------
go_reformat:
Genes with GO annotations, but not listed in measure table: 41394

Terms without defined level (old ontology?..): 0
-------------
-------------
go_nrify:
0 categories, 0 genes; size range 5-0
	0 too broad
	0 too small
	0 remaining

removing redundancy:

calculating GO term similarities based on shared genes...

 Error in read.table(inname, sep = "\t", header = T, check.names = F) : 
  no lines available in input

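I checked the format of my input files but they look fine to me.

[image: image] <https://user-images.githubusercontent.com/46695842/101587791-2b5e0800-39a2-11eb-97a8-055eb2c51bc7.png>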

Would you mind having a look? Thank you so much!

z0on · 2020-12-09T05:59:18Z

Hi - thanks for trying out GO_MWU! Your gene names seem to be different in the fold-change table compared to annotations table. Remove ‘_i1’ from gene names in annotations file?

-- cheers Misha matzlab.weebly.com
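A mismatch like the one described above can be checked before calling gomwuStats. The following is a minimal sketch, not part of GO_MWU: it assumes gene IDs sit in the first column of both tables, uses made-up toy IDs, and the helper names `id_overlap` and `strip_isoform_suffix` are hypothetical. The `_i<number>` pattern is only a generalization of the `_i1` suffix mentioned in this thread.

```r
# Toy stand-ins for the two tables (hypothetical IDs, not real data).
measures <- data.frame(gene = c("TRINITY_DN1_c0_g1", "TRINITY_DN2_c0_g1"),
                       value = c(1.5, -0.7),
                       stringsAsFactors = FALSE)
annotations <- data.frame(gene = c("TRINITY_DN1_c0_g1_i1", "TRINITY_DN2_c0_g1_i1"),
                          go = c("GO:0005575", "GO:0005737;GO:0005575"),
                          stringsAsFactors = FALSE)

# Fraction of annotated genes that also appear in the measure table.
id_overlap <- function(annot_ids, measure_ids) {
  mean(annot_ids %in% measure_ids)
}

# Remove a trailing isoform suffix such as "_i1" or "_i12".
strip_isoform_suffix <- function(ids) {
  sub("_i[0-9]+$", "", ids)
}

overlap_before <- id_overlap(annotations$gene, measures$gene)
annotations$gene <- strip_isoform_suffix(annotations$gene)
overlap_after <- id_overlap(annotations$gene, measures$gene)

cat("overlap before:", overlap_before, " after:", overlap_after, "\n")
```

An overlap near zero would be consistent with the "0 categories, 0 genes" lines in the log above. If several isoforms of one gene carry annotations, stripping the suffix can leave duplicate gene rows, which may need to be merged before the table is used.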

Ruiqi-CUB · 2020-12-09T06:03:40Z

Oh right! Thanks! I just remembered that I had to add _i1 in one of the intermediate steps for annotation. I will try it!

-- Ruiqi Li PhD Student Dept. of Ecology and Evolutionary Biology University of Colorado Boulder pronouns: he/him/his

Ruiqi-CUB · 2020-12-09T18:22:06Z

Hi Dr. Matz,

It works! Thanks a lot! The BP one has been running for 2 hours, but the other two were done in a few minutes. Is there a RAM or CPU requirement for running BP?

Also, is there a particular reason that you use p-values instead of fold changes and adjusted p-values in the test?

Best,
Ruiqi

z0on · 2020-12-09T19:08:16Z

Hi Ruiqi - yeah, BP can take a long time for large, richly annotated datasets. I suspect it is a memory problem, but I am not quite sure. Try reducing the number of genes in your data (toss more of the low-abundance ones, until you have ~10-12K genes remaining) - this should not affect the GO summaries if the signal is robust enough.

I prefer "signed -log pvalues" because they tend to give stronger signals, but broadly the same result should be obtainable with just log-fold changes. Try it?

cheers
Misha
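Both suggestions above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the toy table and its column names (`baseMean`, `lfc`, `pvalue`) are hypothetical, the helpers `signed_logp` and `keep_top_abundant` are not part of GO_MWU, and base-10 logarithms are assumed for the "signed -log pvalues".

```r
# Toy differential-expression results (hypothetical values and column names).
de <- data.frame(
  gene     = c("g1", "g2", "g3", "g4"),
  baseMean = c(250, 3, 90, 1200),
  lfc      = c(1.2, -0.4, -2.0, 0.1),
  pvalue   = c(0.001, 0.5, 0.0001, 0.9),
  stringsAsFactors = FALSE
)

# Signed -log10 p-value: magnitude from the p-value, sign from the fold change.
signed_logp <- function(pvalue, lfc) {
  -log10(pvalue) * sign(lfc)
}

# Keep the n most abundant genes (or all of them if there are fewer than n).
keep_top_abundant <- function(df, n) {
  df[order(df$baseMean, decreasing = TRUE)[seq_len(min(n, nrow(df)))], ]
}

de$signed_logp <- signed_logp(de$pvalue, de$lfc)
reduced <- keep_top_abundant(de, 3)   # a real run would use n around 10000-12000

# Two-column table: gene ID and the continuous measure.
measure_table <- reduced[, c("gene", "signed_logp")]
print(measure_table)
```

Note that the filter keeps every gene above the abundance cutoff regardless of its p-value, so the measure table still spans the whole range of the ranking measure.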

Ruiqi-CUB · 2020-12-10T15:51:27Z

Hi Misha,

I really appreciate your help! I ended up running gomwuStats on a server and then downloaded the intermediate files to my laptop. It worked well.

Ruiqi

Ruiqi-CUB · 2020-12-14T18:45:34Z

Hi Misha, do you have any suggestions if I have too many GO terms (~70) on the final figure? I have already used the "strict" cutoffs.
Thanks a lot

Ruiqi

Ruiqi-CUB · 2020-12-18T01:03:44Z

Hi Misha,

I got a lot of GO terms (>50) in the figure even when I use some very strict cutoff values for -logP (level1=0.01, level2=0.001, level3=0.0001). Would you mind giving me some suggestions? I noticed that you tend to have very few GO terms in the GO MWU figures in your papers. Did you do any filtering? Thank you so much for your help!

Here is an example of my GO MWU figure.

[image: image.png]

Best,
Ruiqi

z0on · 2020-12-18T01:32:26Z

Hi Ruiqi - this is possible, I have seen that. But it can also be a mistake in setting up options or the measures table. Can you please post the resulting figure? (crank down the levels another 10-fold if you must)

Misha
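Before replotting, it can help to count how many terms would pass each cutoff. The sketch below uses a made-up table of adjusted p-values; the column name `p.adj` and the helper `count_passing` are assumptions for illustration, not a confirmed part of the GO_MWU output or API.

```r
# Toy stand-in for a table of per-term adjusted p-values (hypothetical values).
results <- data.frame(
  term  = paste0("term", 1:6),
  p.adj = c(0.2, 0.04, 0.008, 0.0005, 0.00002, 0.000003),
  stringsAsFactors = FALSE
)

# Number of terms at or below each candidate cutoff.
count_passing <- function(p_adj, cutoffs) {
  vapply(cutoffs, function(cut) sum(p_adj <= cut), integer(1))
}

cutoffs  <- c(0.01, 0.001, 0.0001)   # the levels quoted in this thread
stricter <- cutoffs / 10             # the same levels, another 10-fold down

print(data.frame(cutoff  = c(cutoffs, stricter),
                 n_terms = count_passing(results$p.adj, c(cutoffs, stricter))))
```

If the count at the loosest level stays in the dozens even after tightening, that would point toward the options or the measures table rather than the plotting thresholds.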

Ruiqi-CUB · 2020-12-18T03:07:03Z

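Sorry, it seems the figures did not go through via Gmail.

Here is the figure with level1=0.01, level2=0.001, level3=0.0001:
https://user-images.githubusercontent.com/46695842/102569272-621ed700-40a2-11eb-8da3-0d77b61b4297.png

Here is the figure with the thresholds another 10-fold lower; it still looks messy:
https://user-images.githubusercontent.com/46695842/102569321-7f53a580-40a2-11eb-96ee-d251c9951830.png

One thing I noticed in the second figure is that some GO terms were plotted even though their number is 0:
https://user-images.githubusercontent.com/46695842/102569400-a8743600-40a2-11eb-8f21-ffe1691a1a30.png

The figure gets even messier with BP, even using level1=0.001, level2=0.0001, level3=0.00001:
https://user-images.githubusercontent.com/46695842/102569747-4405a680-40a3-11eb-9c60-705d82eb96fd.png

Here is the R code in GO_MWU.R:
https://user-images.githubusercontent.com/46695842/102569823-6ac3dd00-40a3-11eb-995a-2de962d776eb.png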

z0on · 2020-12-18T04:00:29Z

This is a bit strange, but not entirely impossible... What is your measure for ranking? Do you include all genes for which there are gene expression measurements?
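One way to answer this kind of question is to compare the gene IDs in the measure table with those in the annotation table. The following is only a sketch under stated assumptions: the two toy data frames and their column names (`gene`, `value`, `go`) are hypothetical stand-ins for the real files, which would normally be read from disk.

```r
# Hypothetical stand-ins for the measure table and the GO annotation table.
measure <- data.frame(
  gene  = c("g1", "g2", "g3"),
  value = c(2.1, 0.3, 5.0)
)
annot <- data.frame(
  gene = c("g1", "g3", "g4"),
  go   = c("GO:0005575;GO:0005634", "GO:0005737", "GO:0016020")
)

# Genes present in both tables, and genes present in only one of them.
shared                 <- intersect(measure$gene, annot$gene)
annotated_not_measured <- setdiff(annot$gene, measure$gene)
measured_not_annotated <- setdiff(measure$gene, annot$gene)

cat("shared:", length(shared), "\n")
cat("annotated but not measured:", length(annotated_not_measured), "\n")
cat("measured but not annotated:", length(measured_not_annotated), "\n")
```

A large count in either "only one table" group usually points to mismatched ID formats or to a measure table that was filtered before the analysis.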

Ruiqi-CUB · 2020-12-18T04:39:28Z

The ranking measure is -log10(unadjusted P). Yes, I used all the genes for which there are gene expression measurements in the DESeq2 output. But I think it should not matter, since we specify the p-value cutoffs at the gomwuPlot step? Should I do any filtering based on parameters such as gene expression count before running GO_MWU?

Ruiqi

-- Ruiqi Li PhD Student Dept. of Ecology and Evolutionary Biology University of Colorado Boulder pronouns: he/him/his

z0on · 2020-12-18T05:14:38Z

Sounds correct... are you studying some model organism (with really good annotations) perhaps?

-- cheers Misha matzlab.weebly.com

Ruiqi-CUB · 2020-12-18T06:04:54Z

No, they are cockles from the subfamily Fraginae, which does not have good annotations.

Ruiqi-CUB · 2020-12-18T07:15:38Z

Since I am using -log10P as the ranking measure, there is no cutoff on logFC. Would that affect the results?

Ruiqi

z0on · 2020-12-18T13:01:27Z

That is correct: there must be no cutoffs. The input file must list all genes, with -log10(p) for each. Can you maybe send me your annotations file and the input file? Also, what is the experiment? (If I may ask)
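A minimal sketch of building such an input table in R, assuming the DESeq2 results have been converted to a data frame with a `pvalue` column and gene IDs as row names. The toy `res` table, the column name `neglog10p`, and the output file name are hypothetical. Dropping genes with an `NA` p-value is also an assumption, made here because such genes cannot be ranked.

```r
# Hypothetical stand-in for a DESeq2 results table converted to a data frame.
res <- data.frame(
  pvalue         = c(1e-10, 0.03, 0.5, NA, 1),
  log2FoldChange = c(0.4, -2.1, 0.1, 1.5, 0),
  row.names      = c("gene1", "gene2", "gene3", "gene4", "gene5")
)

# Keep every gene that has a p-value: no significance or fold-change cutoff.
keep <- !is.na(res$pvalue)
measure <- data.frame(
  gene      = rownames(res)[keep],
  neglog10p = -log10(res$pvalue[keep])
)

# Write a two-column CSV with a header line (gene ID, measure).
out_file <- file.path(tempdir(), "measure_table.csv")  # hypothetical file name
write.csv(measure, out_file, row.names = FALSE, quote = FALSE)
```

The point of the sketch is that filtering happens nowhere: non-significant genes stay in the table and serve as the background for the rank test.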

Ruiqi-CUB · 2020-12-18T14:44:27Z

Thanks! May I send them to your personal email?

Ruiqi-CUB · 2020-12-18T16:19:17Z

Hi Misha, I just sent all the files to your email. Please let me know if you can access them! Thanks a lot!

z0on · 2020-12-18T17:36:52Z

sure!

Ruiqi-CUB · 2020-12-18T18:52:39Z

Thanks a lot! I have sent them to you via Google Drive and Gmail.

One thought occurred to me: since I am using -log10P, the logFC won't affect the results of the rank test. For example, gene 1 (logFC=0.4, -log10P=10) and gene 2 (logFC=5, -log10P=10) would carry the same weight in the test. If I believe that genes with log2FC<1 are not really differentially expressed, should I set a cutoff at log2FC<1 for the DE results and then perform GO_MWU?

Best,
Ruiqi
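The observation about ranks can be checked with a small worked example. This is only an illustration: the four genes, their values, and the category membership are made up, and base R's `rank()` and `wilcox.test()` are used as stand-ins rather than as a description of what GO_MWU runs internally.

```r
# Hypothetical toy values echoing the two genes described above.
genes <- data.frame(
  gene      = c("gene1", "gene2", "gene3", "gene4"),
  log2FC    = c(0.4, 5, -1.2, 0.1),
  neglog10p = c(10, 10, 3, 0.2)
)

# Ranks come from the measure only; the fold change is never consulted,
# so gene1 and gene2 receive the same (tied) rank.
genes$rank <- rank(genes$neglog10p)
print(genes)

# Mann-Whitney U (Wilcoxon rank-sum) test on a made-up category of two genes.
in_category <- c(TRUE, TRUE, FALSE, FALSE)
test <- wilcox.test(genes$neglog10p[in_category],
                    genes$neglog10p[!in_category],
                    exact = FALSE)
print(test$p.value)
```

Changing the `log2FC` column leaves both the ranks and the test result unchanged, which is the behaviour the question is about.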


z0on · 2020-12-18T19:13:40Z

Hi Ruiqi - damn, it looks real to me. Amazing dataset! You somehow have very extensive annotations, that gives you extra power. Where did you get the annotations from? (there is a bunch of "obsolete" terms in it, maybe re-annotate?) Also there is the last part of GO_MWU.R that gives you "best GOs" representing independent groups of GO terms - use that to summarize your super-extensive GO list? (I just pushed the commit correcting a minor bug there :) Misha


Ruiqi-CUB · 2020-12-18T19:25:18Z

I annotated it with eggnog mapper v2.0.1b, using the database downloaded by the script that comes with it, with the following command: `python2.7 emapper.py -m diamond --output_dir ../ --translate -o unigene_fragum --cpu 8 -i ../unigene_fragum.fna` Is there a better way to annotate them? Should I use interproscan instead? Do you happen to know why the GO terms are "obsolete"? The databases should be up to date. As for the best GOs, is it reasonable to plot only the best GOs? How can I plot them with the plot command within GO_MWU? Thank you so much for your help! Ruiqi


Ruiqi-CUB · 2020-12-19T01:02:55Z

Just want to confirm: is dissim_(GO division)_(go-to-gene table filename) the same for a given GO division and go-to-gene table filename, even if the input filenames are different?

I am using a loop in R to run GO_MWU on several datasets that share the same go-to-gene table. I just found out that dissim_(GO division)_(go-to-gene table filename) is overwritten every time GO_MWU is run in the same GO division, e.g. input1.txt with BP and then input2.txt with BP.

z0on · 2020-12-19T06:26:21Z

Yes, this is correct. My code is not very efficient, but it works :)


Ruiqi-CUB · 2020-12-20T22:08:54Z

Thanks a lot!


Ruiqi-CUB · 2020-12-20T23:49:29Z

Hi Misha,

I suspect that dissim_(GO division)_(go-to-gene table filename) is not the same for different input files (gene, -log10P), even if the GO division and go-to-gene table are the same.

I ran gomwuStats with input1.csv and then input2.csv, both with CC and the same goAnnotations file gene2go.tab, so dissim_CC_gene2go.tab was overwritten once. After getting the output files, I tried to run gomwuPlot. The GO_MWU figure for input2 could be plotted, but input1 gave the error message below.

 Error in `[.data.frame`(diss, goods.names, goods.names) : 
  undefined columns selected

Then I ran gomwuStats followed by gomwuPlot for input1.csv and input2.csv separately. Both figures were plotted successfully. The dissim_MF_gene2go.tab for input1 is 15.5MB while the dissim_MF_gene2go.tab for input2.csv is 15.6MB.
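A minimal sketch of how this kind of error can arise, assuming the dissimilarity table on disk was built for a different input and therefore lacks some of the terms being requested (the toy matrix and term names are hypothetical):

```r
# Toy dissimilarity table written for another input: it only knows termA and termB.
diss <- data.frame(termA = c(0, 0.4), termB = c(0.4, 0),
                   row.names = c("termA", "termB"))

# Terms that passed the filters for the current input; termC is not in the table.
goods.names <- c("termA", "termC")

# Selecting a column name that does not exist makes `[.data.frame` stop.
result <- tryCatch(diss[goods.names, goods.names],
                   error = function(e) conditionMessage(e))
print(result)

# A quick check before plotting: which requested terms are missing from the table?
missing_terms <- setdiff(goods.names, colnames(diss))
print(missing_terms)
```

If `missing_terms` is non-empty, the dissimilarity file most likely belongs to a different run and needs to be regenerated for this input.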

Could you please check your code to see if that is the issue? If it is, would you mind modifying the code to name the file dissim_(GO division)_(input filename)_(go-to-gene table filename) instead of dissim_(GO division)_(go-to-gene table filename)?

Thank you so much!
Ruiqi

z0on · 2020-12-21T00:25:58Z

Hi Ruiqi - to be honest I would rather not touch my old perl code unless there is a critical error. If you think this is indeed an important thing to correct/add, you are welcome to create a git branch and fix that! cheers Misha


Ruiqi-CUB · 2020-12-21T01:04:31Z

Thank you so much! Would you mind pointing out where the code that saves dissim_GO_gene2go.txt and GO_input.csv is?


z0on · 2020-12-21T03:42:00Z

:) if I only knew exactly what needs to be changed I would have changed it...


Ruiqi-CUB · 2020-12-21T04:27:34Z

I guess I have to change `"dissim_".$div."_".$gen2go` to `"dissim_".$div."_".$measure."_".$gen2go` at line 153 in gomwu_b.pl and line 53 in gomwu_a.pl, and `in.dissim=paste("dissim",goDivision,goAnnotations,sep="_")` to `in.dissim=paste("dissim",goDivision,input,goAnnotations,sep="_")` at line 173 in gomwu.functions.R. I have tested it with 2 input files and it works well; at least there is no error message.
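The effect of that change on the R side can be sketched as follows. The two helper functions are hypothetical wrappers around the `paste()` calls quoted above; they are not part of gomwu.functions.R.

```r
# Hypothetical wrappers around the two naming schemes discussed above.
dissim_name_old <- function(goDivision, goAnnotations) {
  paste("dissim", goDivision, goAnnotations, sep = "_")
}
dissim_name_new <- function(goDivision, input, goAnnotations) {
  paste("dissim", goDivision, input, goAnnotations, sep = "_")
}

inputs <- c("input1.csv", "input2.csv")

# Old scheme: both inputs map to the same file, so the second run overwrites the first.
old_names <- sapply(inputs, function(x) dissim_name_old("CC", "gene2go.tab"))

# New scheme: the input filename is part of the name, so each run keeps its own file.
new_names <- sapply(inputs, function(x) dissim_name_new("CC", x, "gene2go.tab"))

print(unique(old_names))
print(new_names)
```

The Perl scripts have to build the same name when writing the file, which is why the change is needed in all three places.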

Ruiqi-CUB · 2020-12-22T17:33:41Z

Hi Dr. Matz,
Sorry to bother you again. I am trying to interpret the best GO table. Does level mean the GO term level? Does nseqs mean the number of tested sequences (genes, isoforms, orthogroups, etc.) found associated with the GO term? Thank you so much!

| | delta.rank | pval | level | nseqs | term | name | p.adj |
|---|---|---|---|---|---|---|---|
| 41 | -871 | 6.495308e-05 | 5 | 205 | GO:0000428;GO:0030880;GO:0016591;GO:0055029 | RNA polymerase complex | 5.249926e-04 |

z0on · 2020-12-22T21:17:29Z

yes and yes! Level is pretty non-informative since it is not standardized in any way across GO hierarchy (some functional groups have many levels, some only a few) You might wish to explore different tree-cut cutoffs to get the most reasonable summary - plot GO trees and cutoff levels by un-remarking two lines in the script (saying this just in case, you probably did that already) (did the edit work?..)
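To illustrate what exploring different tree-cut cutoffs does, here is a toy sketch with base R's `hclust` and `cutree`. The four terms, their dissimilarities, and the average-linkage choice are assumptions made for illustration and are not taken from the GO_MWU scripts.

```r
# Toy dissimilarity matrix for four hypothetical GO terms:
# termA/termB share most genes, termC/termD share fewer, the two pairs share almost none.
terms <- c("termA", "termB", "termC", "termD")
diss <- matrix(c(0.0, 0.1, 0.9, 0.9,
                 0.1, 0.0, 0.9, 0.9,
                 0.9, 0.9, 0.0, 0.2,
                 0.9, 0.9, 0.2, 0.0),
               nrow = 4, dimnames = list(terms, terms))

hc <- hclust(as.dist(diss), method = "average")

# Cutting higher merges more terms into one group; cutting lower keeps them apart.
groups_h025 <- cutree(hc, h = 0.25)  # termA+termB and termC+termD each form a group
groups_h015 <- cutree(hc, h = 0.15)  # only termA+termB are merged

print(groups_h025)
print(groups_h015)

# To look at the tree and a candidate cutoff:
# plot(hc); abline(h = 0.25, col = "red")
```

Raising the cut height gives fewer, broader groups and lowering it gives more, narrower ones, so the most useful summary depends on how similar the terms in the dataset are.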


Ruiqi-CUB · 2020-12-22T21:49:20Z

Thank you so much! The tree with the cut-off line works really well! It helps me get the representative GOs from the many GO terms in my analyses. Also, a negative delta.rank (the value before pval) means down-regulation, correct?

The edit works really well with a loop in R. Since BP usually takes about one hour on a server and I have so many contrasts (3 species and 3 treatments), it is much more convenient to run gomwuStats in a loop first, then explore each result with gomwuPlot later. I have posted the changes I made in a previous comment.
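A rough sketch of such a loop, written as a hypothetical helper `run_gomwu_batch` that takes the stats function as an argument. It assumes that function accepts the input filename as its first argument and a named `goDivision` argument, as in the gomwuStats call at the top of this thread.

```r
# Hypothetical helper: call a stats function once per input/division combination.
run_gomwu_batch <- function(inputs, divisions, stats_fun, ...) {
  runs <- expand.grid(input = inputs, division = divisions,
                      stringsAsFactors = FALSE)
  for (i in seq_len(nrow(runs))) {
    message("Running ", runs$input[i], " / ", runs$division[i])
    stats_fun(runs$input[i], goDivision = runs$division[i], ...)
  }
  invisible(runs)  # the table of completed runs, for later gomwuPlot calls
}

# Possible usage (file names are placeholders; not run here):
# source("gomwu.functions.R")
# run_gomwu_batch(c("input1.csv", "input2.csv"), c("MF", "BP", "CC"), gomwuStats,
#                 goDatabase = "go.obo", goAnnotations = "gene2go.tab",
#                 perlPath = "perl", largest = 0.1, smallest = 5,
#                 clusterCutHeight = 0.25)
```

Because the dissimilarity file name now includes the input filename, each run in the loop keeps its own file and gomwuPlot can be called for any of them afterwards.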


z0on · 2020-12-22T22:13:57Z

yep, negative delta-rank is down-regulation. I added your edits to the code! thanks a lot, this is really helpful.


Ruiqi-CUB · 2020-12-22T22:26:41Z

Thank you! I am honored to contribute to your code!

Ruiqi-CUB closed this as completed Jan 9, 2021
