Comut plot for 10 genes #11

Open
varsh1090 opened this issue Mar 30, 2017 · 14 comments

@varsh1090
Contributor

Hi Victor, I need some help making a comut plot for these 10 genes:

RB1
NONO 0.520
RNF8 0.604
FAM75A6 0.607
ZNF385D 0.631
POP7 0.669
MYADM 0.695
ADCY10 0.803
NGLY1 0.830
HIST2H2AB 0.925

@varsh1090
Contributor Author

I am going to be out of town, with limited access to my laptop. Could you please make two comut plots with these 10 genes: one without and one with the numbers listed next to each gene (except RB1)? Thank you!

@victorlin
Collaborator

None of these genes except RB1 and P53 (which is labeled TP53) are in data/SigGenes_001.txt or data/SigGenes_005.txt. Is there another file that contains all the genes?

@varsh1090
Contributor Author

varsh1090 commented Apr 6, 2017 via email

@victorlin
Collaborator

Previously the genes were sorted by p-value from data/SigGenes_*.txt. Should the script instead use a predefined order, so that p-values are not needed?

I am using this as an input file:

$ cat genes.txt
P53
RB1
NONO	0.520
RNF8	0.604
FAM75A6	0.607
ZNF385D	0.631
POP7	0.669
MYADM	0.695
ADCY10	0.803
NGLY1	0.830
HIST2H2AB	0.925
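
For reference, a file in this shape could be read as sketched below. This is only an illustration, not the script's actual code: `read_gene_list` is a hypothetical helper, and it assumes one gene per line, optionally followed by whitespace (such as a tab) and a numeric score. File order is preserved, so no p-value is needed for sorting.

```python
from typing import Iterable, List, Optional, Tuple


def read_gene_list(lines: Iterable[str]) -> List[Tuple[str, Optional[float]]]:
    """Parse gene-list lines into (gene, score) pairs, keeping file order.

    Hypothetical helper: assumes one gene per line, optionally followed by
    whitespace (e.g. a tab) and a numeric score.
    """
    genes = []
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        score = float(fields[1]) if len(fields) > 1 else None
        genes.append((fields[0], score))
    return genes


example = ["P53", "RB1", "NONO\t0.520", "RNF8\t0.604"]
for gene, score in read_gene_list(example):
    # Genes without a score (P53, RB1) get an unannotated label.
    label = gene if score is None else f"{gene} ({score:.3f})"
    print(label)
```

Keeping the score as `None` when it is absent lets the same list drive both the plot with numbers and the plot without them.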

@varsh1090
Contributor Author

varsh1090 commented Apr 6, 2017 via email

@leizhou69
Collaborator

leizhou69 commented Apr 6, 2017 via email

victorlin added a commit that referenced this issue Apr 6, 2017
List of genes with scores: data/gene-lists/genes.txt

`--p_value` and `--gene_list_file` are two mutually exclusive primary filters
and sort methods, with `--num_genes` as an additional cutoff.

#11
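
The relationship between the three flags described in this commit message could be expressed with `argparse` roughly as follows. This is a sketch, not the script's actual parser: the flag names come from the commit message, while `build_parser`, the argument types, the help text, and `required=True` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in the commit message."""
    parser = argparse.ArgumentParser(description="Comutation plot (sketch)")

    # Exactly one primary filter/sort method may be chosen (assumed required).
    primary = parser.add_mutually_exclusive_group(required=True)
    primary.add_argument("--p_value", type=float,
                         help="keep genes below this p-value, sorted by p-value")
    primary.add_argument("--gene_list_file",
                         help="keep genes listed in this file, in file order")

    # Optional additional cutoff applied after the primary filter.
    parser.add_argument("--num_genes", type=int, default=None,
                        help="keep at most this many genes")
    return parser


args = build_parser().parse_args(["--p_value", "0.001", "--num_genes", "8"])
print(args.p_value, args.gene_list_file, args.num_genes)
```

With a mutually exclusive group, passing both `--p_value` and `--gene_list_file` makes `argparse` exit with a usage error instead of silently preferring one of them.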
@victorlin
Collaborator

The plot is generated. Here are the relevant files:

Base directory: /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/

  • Gene list file: data/gene-lists/genes.txt
  • Comutation plot: results/SCLC_comut_plot_040617.pdf
  • Sample IDs: results/SampleIDs_040617.txt

Also, @varsh1090, you mentioned that each gene:patient combination should be counted only once. Each combination is counted once; however, the script currently keeps only the last mutation type encountered in the dataset file and discards any earlier ones.

For example, if this was part of the file:

TP53  Sample1 4
RB1   Sample1 3
TP53  Sample1 6

The only mutation type information stored for (TP53, Sample1) would be 6. Is this the desired behavior?
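
The behavior in question can be reproduced with a plain dictionary keyed by (gene, sample). The sketch below is illustrative only and is not taken from the script: the function names are hypothetical, and the two alternatives (keep the first type, or keep every type) are just possible options, not something decided in this thread.

```python
# Rows from the example above: (gene, sample, mutation type).
rows = [("TP53", "Sample1", 4), ("RB1", "Sample1", 3), ("TP53", "Sample1", 6)]


def last_wins(rows):
    """Current behavior as described: later rows overwrite earlier ones."""
    types = {}
    for gene, sample, mutation_type in rows:
        types[(gene, sample)] = mutation_type
    return types


def first_wins(rows):
    """Alternative: keep the first mutation type seen for each pair."""
    types = {}
    for gene, sample, mutation_type in rows:
        types.setdefault((gene, sample), mutation_type)
    return types


def collect_all(rows):
    """Alternative: keep every mutation type seen for each pair, in order."""
    types = {}
    for gene, sample, mutation_type in rows:
        types.setdefault((gene, sample), []).append(mutation_type)
    return types


print(last_wins(rows)[("TP53", "Sample1")])    # 6
print(first_wins(rows)[("TP53", "Sample1")])   # 4
print(collect_all(rows)[("TP53", "Sample1")])  # [4, 6]
```

In all three variants each gene:patient pair still appears once as a key, so the per-gene counts are unaffected; only the stored mutation type differs.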

@varsh1090
Contributor Author

varsh1090 commented Apr 6, 2017 via email

@victorlin
Collaborator

Sounds good. The mutation type concern also applies to all the previous comutation plots that were generated by this script.

@varsh1090
Contributor Author

Can we also make the plots without the mutation types colored, both for this figure and for the figure with the top 8 most significant genes? Thanks

victorlin referenced this issue Apr 11, 2017
* pass arguments directly to functions
* add argument to generate color map (default no color map)
* make mutation_type_map a global definition
* remove unnecessary color generating functions
* fix filter by num_genes, which was not working properly
@victorlin
Collaborator

Generated the plots. The files are in /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041117/.

You can see the commands I used to generate the plot in the file comutations/examples.txt.

@varsh1090
Contributor Author

Thanks! Can we also make one for the p < 0.001 genes? The top 8 in the list.

@victorlin
Collaborator

Here is a list of all the most recent plots and sample IDs:

  • custom list of 10 genes: /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041117
  • 15 genes (p < 0.001): /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041117
  • 8 genes: /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041717
  • 56 genes (p < 0.01): /ufrc/zhou/share/projects/bioinformatics/SCLC/sclc-scripts/results/comutation-plot_041917

@varsh1090
Contributor Author

Thanks Victor!
