Comut plot for 10 genes #11

varsh1090 · 2017-03-30T20:22:14Z

Hi Victor, I need some help with making a comut plot for these 10 genes -

RB1
NONO 0.520
RNF8 0.604
FAM75A6 0.607
ZNF385D 0.631
POP7 0.669
MYADM 0.695
ADCY10 0.803
NGLY1 0.830
HIST2H2AB 0.925

varsh1090 · 2017-03-30T20:24:03Z

I am going to be out of town, with limited access to my laptop. Could you please make two comut plots with these 10 genes: one without and one with the numbers listed next to each gene (all except RB1)? Thank you!

victorlin · 2017-04-06T15:03:39Z

Except for RB1 and P53 (which is labeled as TP53), none of these genes are in data/SigGenes_001.txt or data/SigGenes_005.txt. Is there another file that contains all the genes?

varsh1090 · 2017-04-06T17:25:00Z

They are from the 4datasetnonsilent file, gene column. Please make sure you count each gene:patient combination only once.

victorlin · 2017-04-06T17:32:08Z

Previously the genes were sorted based on p-value from data/SigGenes_*.txt. Should it just use a pre-defined order so that p-value will not be needed?

I am using this as an input file:

$ cat genes.txt
P53
RB1
NONO	0.520
RNF8	0.604
FAM75A6	0.607
ZNF385D	0.631
POP7	0.669
MYADM	0.695
ADCY10	0.803
NGLY1	0.830
HIST2H2AB	0.925

varsh1090 · 2017-04-06T17:59:22Z

Could the order be by gene frequency? Or it could follow the numbers I sent you next to the genes, with P53 and RB1 at the top. Sorry, I am traveling and might respond late. Thanks, Varsha

leizhou69 · 2017-04-06T18:26:37Z

Agree with Varsha: the order could be based either on frequency or on the value (from the highest, 0.925, to the lowest, 0.520).
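The two orderings discussed here can be sketched roughly as follows. This is only an illustration, not the script's actual code: `parse_gene_list`, `order_by_score`, and `order_by_frequency` are hypothetical names, the input is assumed to follow the tab-separated `genes.txt` layout shown earlier, and genes without a number (P53, RB1) are assumed to stay pinned at the top.

```python
from collections import defaultdict

# Contents of genes.txt as posted in this thread (tab-separated).
GENES_TXT = """P53
RB1
NONO\t0.520
RNF8\t0.604
FAM75A6\t0.607
ZNF385D\t0.631
POP7\t0.669
MYADM\t0.695
ADCY10\t0.803
NGLY1\t0.830
HIST2H2AB\t0.925
"""


def parse_gene_list(text):
    """Return (gene, score) pairs; score is None when a line has no number."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        score = float(fields[1]) if len(fields) > 1 else None
        entries.append((fields[0], score))
    return entries


def order_by_score(entries):
    """Unscored genes first (file order), then scored genes, highest to lowest."""
    pinned = [gene for gene, score in entries if score is None]
    scored = sorted(
        ((gene, score) for gene, score in entries if score is not None),
        key=lambda entry: entry[1],
        reverse=True,
    )
    return pinned + [gene for gene, _ in scored]


def order_by_frequency(genes, records):
    """Order genes by number of distinct patients (each gene:patient pair counts once)."""
    patients = defaultdict(set)
    for gene, patient in records:
        patients[gene].add(patient)
    return sorted(genes, key=lambda gene: len(patients[gene]), reverse=True)


print(order_by_score(parse_gene_list(GENES_TXT)))
```

`order_by_frequency` takes an iterable of `(gene, patient)` pairs; using a set per gene means repeated rows for the same pair do not inflate the count.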

List of genes with scores: data/gene-lists/genes.txt

`--p_value` and `--gene_list_file` are the two primary filter and sort methods and are used separately (one or the other), with `--num_genes` as an additional cutoff. #11
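One way such options could be wired up is an `argparse` mutually exclusive group. The sketch below is an assumption, not the script's actual parser: only the three flag names come from the commit message, while `build_parser`, the argument types, and the help strings are illustrative. It also assumes that the two primary options are never given together in one run.

```python
import argparse


def build_parser():
    """Build an illustrative parser; flag names are from the commit message."""
    parser = argparse.ArgumentParser(description="Comutation plot options (sketch)")

    # Primary filter/sort method: at most one of these may be given.
    primary = parser.add_mutually_exclusive_group()
    primary.add_argument("--p_value", type=float,
                         help="keep genes below this p-value, sorted by p-value")
    primary.add_argument("--gene_list_file",
                         help="keep genes listed in this file, in the file's order")

    # Additional cutoff applied after the primary filter.
    parser.add_argument("--num_genes", type=int, default=None,
                        help="keep at most this many genes")
    return parser


args = build_parser().parse_args(
    ["--gene_list_file", "data/gene-lists/genes.txt", "--num_genes", "10"]
)
print(args.gene_list_file, args.num_genes)
```

With this arrangement, passing both `--p_value` and `--gene_list_file` makes `argparse` exit with a usage error instead of silently preferring one.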

victorlin · 2017-04-06T19:54:18Z

The plot is generated. Here are the relevant files:

Base directory: /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/

Gene list file: data/gene-lists/genes.txt
Comutation plot: results/SCLC_comut_plot_040617.pdf
Sample IDs: results/SampleIDs_040617.txt

Also, @varsh1090, you mentioned counting each gene:patient combination only once. Each combination is counted once; however, the script currently keeps only the last mutation type encountered in the dataset file, regardless of any earlier ones.

For example, if this was part of the file:

TP53  Sample1 4
RB1   Sample1 3
TP53  Sample1 6

The only mutation type information stored for (TP53, Sample1) would be 6. Is this the desired behavior?
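This is the behavior a plain dictionary assignment produces. The following is a minimal sketch, not the script's code: `last_type_wins` and `all_types` are hypothetical helpers, and the rows are the three example lines above. The second function shows one possible alternative that keeps every mutation type seen for a pair while still treating the pair as a single entry.

```python
from collections import defaultdict

# The three example rows from this comment: (gene, sample, mutation type).
ROWS = [
    ("TP53", "Sample1", 4),
    ("RB1", "Sample1", 3),
    ("TP53", "Sample1", 6),
]


def last_type_wins(rows):
    """Store one mutation type per (gene, sample); later rows overwrite earlier ones."""
    table = {}
    for gene, sample, mutation_type in rows:
        table[(gene, sample)] = mutation_type
    return table


def all_types(rows):
    """Store every distinct mutation type per (gene, sample), in order of appearance."""
    table = defaultdict(list)
    for gene, sample, mutation_type in rows:
        if mutation_type not in table[(gene, sample)]:
            table[(gene, sample)].append(mutation_type)
    return dict(table)


print(last_type_wins(ROWS)[("TP53", "Sample1")])  # 6
print(all_types(ROWS)[("TP53", "Sample1")])       # [4, 6]
```

Both versions yield two keys for these rows, so the per-pair count is unchanged; only the stored mutation type information differs.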

varsh1090 · 2017-04-06T21:13:18Z

Thanks Victor, I'll take a look when I get a chance. I think for this figure, we might not need to mention the mutation type info.

victorlin · 2017-04-06T22:41:40Z

Sounds good. The mutation type concern also applies to all the previous comutation plots that were generated by this script.

varsh1090 · 2017-04-09T19:14:41Z

Can we also make the plot without the mutation types colored? For this figure and also the figure with the top 8 most significant genes? Thanks

* pass arguments directly to functions
* add argument to generate color map (default no color map)
* make mutation_type_map a global definition
* remove unnecessary color generating functions
* filter by num_genes was not working properly; fixed

victorlin · 2017-04-11T16:24:11Z

Generated the plots. The files are in /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041117/.

You can see the commands I used to generate the plot in the file comutations/examples.txt.

varsh1090 · 2017-04-11T17:38:08Z

Thanks! Can we also make one for the p < 0.001 genes? The top 8 in the list.

victorlin · 2017-04-19T21:41:22Z

Here is a list of all the most recent plots and sample IDs:

custom list of 10 genes: /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041117
15 genes (p < 0.001): /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041117
8 genes: /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041717
56 genes (p < 0.01): /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041917

varsh1090 · 2017-04-19T22:38:28Z

Thanks Victor!

varsh1090 assigned victorlin Mar 30, 2017

varsh1090 added the help wanted label Mar 30, 2017

varsh1090 added this to the comutation plot milestone Mar 30, 2017

victorlin added a commit that referenced this issue Apr 6, 2017

Add option to specify genes

7555780
