roary_plots.py KeyError: "X" not in index #423

Closed
PhilPalmer opened this issue Oct 1, 2018 · 5 comments

@PhilPalmer
PhilPalmer commented Oct 1, 2018

Hi,
I have run Prokka & Roary with the following commands:

prokka --kingdom Bacteria --outdir results --prefix ${fasta_prefix} ${fasta_file}
roary -e -n -v -r *.gff

For 10 fasta files:
[Screenshot: listing of the 10 FASTA files]

I am now trying to visualise the output using roary_plots.py.

However, when I run this command:
python roary_plots.py accessory_binary_genes.fa.newick gene_presence_absence.csv
I get the following error:

Traceback (most recent call last):
  File "../../Roary/contrib/roary_plots/roary_plots.py", line 114, in <module>
    roary_sorted = roary_sorted[[x.name for x in t.get_terminals()]]
  File "/Users/phil/miniconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 1958, in __getitem__
    return self._getitem_array(key)
  File "/Users/phil/miniconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 2002, in _getitem_array
    indexer = self.loc._convert_to_indexer(key, axis=1)
  File "/Users/phil/miniconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 1231, in _convert_to_indexer
    raise KeyError('%s not in index' % objarr[mask])
KeyError: "['Parachlamydia-related_symbiont_UWE25'] not in index"

Am I using the correct input file for the tree, or do I need to run FastTree? (If so, how do I do that?)
Or could it be that Roary did not run correctly? The contents of summary_statistics.txt are:

Core genes      (99% <= strains <= 100%)        0
Soft core genes (95% <= strains < 99%)  0
Shell genes     (15% <= strains < 95%)  1057
Cloud genes     (0% <= strains < 15%)   4468
Total genes     (0% <= strains <= 100%) 5525

I have also noticed that the core_gene_alignment.aln file is empty aside from the prefix of each FASTA file.
Might this be because the strains are too divergent from one another?

Any help resolving this would be much appreciated. Many thanks in advance.
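
For anyone hitting the same KeyError: the failing line in the traceback selects DataFrame columns by tree tip name, so the error means at least one tip label in the Newick file is not a column of gene_presence_absence.csv. A minimal sketch of a check that lists the offending labels and suggests similarly named columns is below. The function name `unmatched_tips` and the sample names are hypothetical and only for illustration; in practice the two lists would come from `t.get_terminals()` and the spreadsheet's columns.

```python
import difflib


def unmatched_tips(tip_names, sample_columns):
    """Return {tip: [similar column names]} for tips absent from the columns.

    tip_names      -- labels of the tree's terminal nodes
    sample_columns -- sample column names from the presence/absence spreadsheet
    """
    columns = list(sample_columns)
    known = set(columns)
    report = {}
    for tip in tip_names:
        if tip not in known:
            # Suggest up to three columns whose spelling is close to the tip label.
            report[tip] = difflib.get_close_matches(tip, columns, n=3, cutoff=0.6)
    return report


# Hypothetical names, not taken from any real run.
tips = ["sample_1", "sample-2"]
columns = ["sample_1", "sample_2"]

for tip, guesses in unmatched_tips(tips, columns).items():
    print(tip, "is not a column; similar columns:", guesses or "none")
```

If the report is empty, every tip has a matching column and the column selection should not raise this particular error.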

@andrewjpage
Member

Roary is designed to work with samples from the same species. From your file names it looks like you have multiple genera/species, which would explain the lack of core genes. Other pangenome applications, such as PopPUNK, might be more useful for your particular research question.
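
A quick way to spot this situation before plotting is to read the core gene count out of summary_statistics.txt; a count of 0 would be consistent with the near-empty core_gene_alignment.aln described above. The sketch below assumes each line has the shape shown in this thread (label, a parenthesised range, then the count). The function name `parse_summary` is hypothetical.

```python
import re


def parse_summary(text):
    """Map each category label (e.g. 'Core genes') to its gene count."""
    counts = {}
    for line in text.splitlines():
        # Label, then a parenthesised percentage range, then the count at line end.
        match = re.match(r"\s*(.+?)\s*\(.*\)\s*(\d+)\s*$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


# The summary posted at the top of this thread.
SUMMARY = """\
Core genes      (99% <= strains <= 100%)        0
Soft core genes (95% <= strains < 99%)  0
Shell genes     (15% <= strains < 95%)  1057
Cloud genes     (0% <= strains < 15%)   4468
Total genes     (0% <= strains <= 100%) 5525
"""

counts = parse_summary(SUMMARY)
if counts.get("Core genes", 0) == 0:
    print("No core genes found; the samples may not belong to one species.")
```

To use it on a real run, the text would be read from the summary_statistics.txt file in the Roary output directory instead of the inline string.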

@PhilPalmer
Author

Thank you for your prompt reply. I have just run it with only FASTA files from the same species and it worked. I will also have a look at PopPUNK.

@hjafar
hjafar commented May 9, 2020

I ran roary_plots.py my_tree.nwk gene_presence_absence.csv and got the following error:

Traceback (most recent call last):
  File "/home/genomic-lab/Documents/kfhrc/tools/sanger-pathogens-Roary-adaef93/contrib/roary_plots/roary_plots.py", line 77, in <module>
    t = Phylo.read(options.tree, 'newick')
  File "/home/genomic-lab/anaconda3/lib/python3.7/site-packages/biopython-1.76-py3.7-linux-x86_64.egg/Bio/Phylo/_io.py", line 62, in read
    tree = next(tree_gen)
  File "/home/genomic-lab/anaconda3/lib/python3.7/site-packages/biopython-1.76-py3.7-linux-x86_64.egg/Bio/Phylo/_io.py", line 49, in parse
    with File.as_handle(file, "r") as fp:
  File "/home/genomic-lab/anaconda3/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/home/genomic-lab/anaconda3/lib/python3.7/site-packages/biopython-1.76-py3.7-linux-x86_64.egg/Bio/File.py", line 120, in as_handle
    with open(handleish, mode, **kwargs) as fp:
FileNotFoundError: [Errno 2] No such file or directory: 'my_tree_test.nwk'

My Roary summary_statistics.txt is:
Core genes (99% <= strains <= 100%) 5443
Soft core genes (95% <= strains < 99%) 0
Shell genes (15% <= strains < 95%) 1058
Cloud genes (0% <= strains < 15%) 0
Total genes (0% <= strains <= 100%) 6501

Kindly suggest how to solve this error.

Best,
Hussain

@hjafar
hjafar commented May 9, 2020

I used this command line:
python '/home/genomic-lab/Documents/kfhrc/tools/sanger-pathogens-Roary-adaef93/contrib/roary_plots/roary_plots.py' my_tree_test.tre '/home/genomic-lab/Documents/kfhrc/annotation_pangenome_roary/demo/gene_presence_absence.csv'
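
The FileNotFoundError above means Python could not open the tree path it was given. The tree file name also differs between the posts (my_tree.nwk, my_tree_test.nwk, my_tree_test.tre), so it may be worth confirming the exact name and working directory first. A minimal sketch of such a pre-flight check follows; the helper name `missing_inputs` is hypothetical, and the paths are resolved relative to the directory the script is run from.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the given paths (as strings) that are not existing regular files."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]


# File names as they appear in the posts above; adjust to the real locations.
inputs = ["my_tree_test.nwk", "gene_presence_absence.csv"]

missing = missing_inputs(inputs)
if missing:
    print("Not found from", Path.cwd(), ":", ", ".join(missing))
else:
    print("All input files exist.")
```

Passing absolute paths for both the tree and the spreadsheet avoids depending on the current working directory.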

@neelam19051
Hi, I am trying to run roary_plots.py and I got the error below.
I have tried many times but have been unable to run it.
python3.6 roary_plots.py accessory_binary_genes.fa.newick gene_presence_absence.csv
Traceback (most recent call last):
  File "roary_plots.py", line 112, in <module>
    roary_sorted = roary_sorted[[x.name for x in t.get_terminals()]]
  File "/home/neel@m95/.local/lib/python3.6/site-packages/pandas/core/frame.py", line 2912, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/home/neel@m95/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1254, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "/home/neel@m95/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1304, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['GIMC5015'] not in index"

Thank you
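
This KeyError says the tree contains a tip named GIMC5015 that is not a column of the spreadsheet, so a useful first step may be to list the sample names the spreadsheet actually contains. The sketch below reads only the header row. It assumes the first 14 columns of gene_presence_absence.csv are per-gene metadata and the rest are samples; that count is an assumption and should be checked against the real file. The function name `sample_columns` and the tiny inline table are hypothetical.

```python
import csv
import io

# Assumption: the leading columns are gene metadata, not samples.
N_METADATA_COLUMNS = 14


def sample_columns(handle, n_metadata=N_METADATA_COLUMNS):
    """Return the header names that follow the metadata columns."""
    header = next(csv.reader(handle))
    return header[n_metadata:]


# Hypothetical stand-in for the real spreadsheet: 14 placeholder metadata
# columns followed by two sample columns.
fake_header = ["meta_%d" % i for i in range(1, 15)] + ["sample_1", "sample_2"]
fake_csv = io.StringIO(",".join(fake_header) + "\n")

print(sample_columns(fake_csv))
```

With a real file, the handle would come from `open("gene_presence_absence.csv", newline="")`. If the missing tip name is absent from the printed list, the tree and the spreadsheet probably come from different Roary runs or use differently spelled sample names.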
