KeyError: 'GO' in funannotate compare #363

@atiweb

Are you using the latest release?
Using 1.7.2

Describe the bug

[03:09 PM]: OS: linux2, 16 cores, ~ 33 GB RAM. Python: 2.7.17
[03:09 PM]: Running 1.7.2
[03:09 PM]: Now parsing 20 genomes
[03:09 PM]: working on Pythium_insidiosum Pi-s
[03:09 PM]: working on Pythium_insidiosum MTPI_04
[03:10 PM]: working on Pythium_insidiosum CR02
[03:10 PM]: working on Pythium_insidiosum CBS_573.85
[03:11 PM]: working on Pythium_insidiosum MCC_13
[03:12 PM]: working on Pythium_insidiosum CDC_B5653
[03:13 PM]: working on Phytopythium_vexans HF1
[03:13 PM]: working on Phytopythium_vexans DAOM_BR484
[03:14 PM]: working on Phytophthora_sojae P6497
[03:14 PM]: working on Pythium_oligandrum ATCC_38472_TT
[03:15 PM]: working on Pythium_oligandrum CBS_530.74
[03:15 PM]: working on Pythium_oligandrum Po37
[03:16 PM]: working on Pythium_periplocum CBS_532.74
[03:16 PM]: working on Pythium_aphanidermatum DAOM_BR444
[03:17 PM]: working on Pythium_irregulare CBS_494.86
[03:18 PM]: working on Pythium_irregulare DAOM_BR486
[03:18 PM]: working on Pythium_iwayamai DAOM_BR242034
[03:19 PM]: working on Pythium_arrhenomanes ATCC_12531
[03:20 PM]: working on Pythium_guiyangense Su
[03:21 PM]: working on Globisporangium_ultimum DAOM_BR144
[03:21 PM]: putative transcript from ncbi:DAOMBR144_012983-T1 has no ID
(ncbi:DAOMBR144_012983-T1 None ncbi:DAOMBR144_012983-T1)
[03:21 PM]: Summarizing secondary metabolism gene clusters
[03:21 PM]: Summarizing PFAM domain results
[03:21 PM]: Summarizing InterProScan results
[03:21 PM]: Loading InterPro descriptions
[03:21 PM]: Summarizing MEROPS protease results
[03:21 PM]: found 43/103 MEROPS familes with stdev >= 1.000000
[03:21 PM]: Summarizing CAZyme results
[03:22 PM]: found 104/171 CAZy familes with stdev >= 1.000000
[03:22 PM]: No COG annotations found
[03:22 PM]: Summarizing secreted protein results
[03:22 PM]: Summarizing fungal transcription factors
[03:22 PM]: Running GO enrichment for each genome
Traceback (most recent call last):
  File "/usr/local/bin/funannotate", line 4, in <module>
    __import__('pkg_resources').run_script('funannotate==1.7.2', 'funannotate')
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 658, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1438, in run_script
    exec(code, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/funannotate-1.7.2-py2.7.egg/EGG-INFO/scripts/funannotate", line 657, in <module>
    main()
  File "/usr/local/lib/python2.7/dist-packages/funannotate-1.7.2-py2.7.egg/EGG-INFO/scripts/funannotate", line 647, in main
    mod.main(arguments)
  File "/usr/local/lib/python2.7/dist-packages/funannotate-1.7.2-py2.7.egg/funannotate/compare.py", line 791, in main
    df2['GO'].astype(str)+'">'+df2['GO']+'</a>'
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexes/base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'GO'

What command did you issue?
funannotate compare --input /mnt/sdb/auto_pyt/Pythium_insidiosum/Pi-s/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_insidiosum/MTPI_04/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_insidiosum/CR02/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_insidiosum/CBS_573.85/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_insidiosum/MCC_13/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_insidiosum/CDC_B5653/fun_out/annotate_results /mnt/sdb/auto_pyt/Phytopythium_vexans/HF1/fun_out/annotate_results /mnt/sdb/auto_pyt/Phytopythium_vexans/DAOM_BR484/fun_out/annotate_results /mnt/sdb/auto_pyt/Phytophthora_sojae/P6497/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_oligandrum/ATCC_38472_TT/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_oligandrum/CBS_530.74/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_oligandrum/Po37/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_periplocum/CBS_532.74/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_aphanidermatum/DAOM_BR444/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_irregulare/CBS_494.86/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_irregulare/DAOM_BR486/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_iwayamai/DAOM_BR242034/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_arrhenomanes/ATCC_12531/fun_out/annotate_results /mnt/sdb/auto_pyt/Pythium_guiyangense/Su/fun_out/annotate_results /mnt/sdb/auto_pyt/Globisporangium_ultimum/DAOM_BR144/fun_out/annotate_results --out /mnt/sdb/auto_pyt/funannotate_compare_less --cpus 15 --run_dnds estimate

Logfiles
funannotate-compare.log
associations.txt
population.txt
Phytopythium_vexans_HF1.go.enrichment.txt

OS/Install Information

Checking dependencies for 1.7.2

You are running Python v 2.7.17. Now checking python packages...
biopython: 1.73
goatools: 0.9.5
matplotlib: 2.2.4
natsort: 6.0.0
numpy: 1.13.3
pandas: 0.24.2
psutil: 5.6.2
requests: 2.22.0
scikit-learn: 0.20.3
scipy: 1.2.2
seaborn: 0.9.0
All 11 python packages installed

You are running Perl v 5.026001. Now checking perl modules...
Bio::Perl: 1.007002
Carp: 1.42
Clone: 0.39
DBD::SQLite: 1.62
DBD::mysql: 4.046
DBI: 1.64
DB_File: 1.852
Data::Dumper: 2.167
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.49
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.31
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 2.62
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.28
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/mnt/sdb/funannotate/DB_funannotate
$PASAHOME=/mnt/sdb/funannotate/PASApipeline
$TRINITYHOME=/mnt/sdb/funannotate/trinityrnaseq-v2.9.0
$EVM_HOME=/mnt/sdb/funannotate/EVidenceModeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/mnt/sdb/funannotate/Augustus/config
$GENEMARK_PATH=/mnt/sdb/funannotate/gm_et_linux_64/gmes_petap
All 6 environmental variables are set

Checking external dependencies...
PASA: 2.3.3
CodingQuarry: 2.0
Trinity: 2.9.0
augustus: 3.3.2
bamtools: bamtools 2.5.1
bedtools: bedtools v2.26.0
blat: BLAT v36x2
diamond: 0.9.24
emapper.py: 2.0.1
ete3: 3.1.1
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2017-11-15
gmes_petap.pl: 4.38
hisat2: 2.1.0
hmmscan: HMMER 3.2.1 (June 2018)
hmmsearch: HMMER 3.2.1 (June 2018)
java: 11.0.5
kallisto: 0.46.0
mafft: v7.310 (2017/Mar/17)
makeblastdb: makeblastdb 2.9.0+
minimap2: 2.17-r943-dirty
proteinortho: 6.0.10
pslCDnaFilter: no way to determine
salmon: salmon 0.14.0
samtools: samtools 1.9-66-gc15e884
signalp: 4.1
snap: 2006-07-28
stringtie: 1.3.6
tRNAscan-SE: 1.3.1 (January 2012)
tantan: tantan 13
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.9.0+
trimal: trimAl v1.4.rev22 build[2015-05-21]
trimmomatic: 0.39
All 36 external dependencies are installed

To keep working, I added the following line after line 781 of compare.py:

df.columns = df.columns.str.replace(r'^# ', '')

I did this because, as you say in the code (# goatools keeps changing output - which really sucks....), the goatools output format has changed again.
I looked at the Phytopythium_vexans_HF1.go.enrichment.txt file and saw that the header line now begins with a # symbol, which your code did not take into account.
compare.py calls library.py at line 774:
goresult = lib.checkgoatools(file)
In library.py, at line 7293, I found:

if line.startswith('GO\tNS'):
and changed it to:
if line.startswith('GO\tNS') or line.startswith('#'):

With this change, the header line is detected and counted correctly.
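
For illustration, here is a minimal standalone sketch of the same idea, independent of funannotate's code: find the header line whether or not it starts with "# ", then strip that prefix so the first column is always named GO. The function names (find_header_index, read_enrichment) and the sample columns are hypothetical, and the sketch uses only the Python standard library instead of pandas.

```python
import csv
import io
import re

# Matches an optional leading '#' and any spaces after it (assumed prefix format).
HASH_PREFIX = re.compile(r'^#\s*')


def find_header_index(lines):
    """Return the index of the enrichment header line, or None if not found."""
    for i, line in enumerate(lines):
        # Accept both 'GO\tNS...' and '# GO\tNS...' as the header.
        if HASH_PREFIX.sub('', line).startswith('GO\tNS'):
            return i
    return None


def read_enrichment(text):
    """Parse a tab-separated enrichment table into a list of row dicts."""
    lines = text.splitlines()
    start = find_header_index(lines)
    if start is None:
        return []
    # Normalize the header so the first column is 'GO', never '# GO'.
    header = HASH_PREFIX.sub('', lines[start]).split('\t')
    body = io.StringIO('\n'.join(lines[start + 1:]))
    return list(csv.DictReader(body, fieldnames=header, delimiter='\t'))


# Illustrative data only; real goatools output has more columns.
old_style = "GO\tNS\tname\nGO:0000001\tBP\texample term\n"
new_style = "# GO\tNS\tname\nGO:0000001\tBP\texample term\n"
rows = read_enrichment(new_style)
```

Unlike a plain line.startswith('#') test, this only treats a line as the header when GO and NS follow the optional # prefix, so other comment lines in the file would not be mistaken for it.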

Together, these two changes solve the issue for me, but of course you have the final say on this and on everything else in funannotate.
Great work, and thanks.
