Hi, first of all, thanks for creating and supporting this amazing software, it's been very helpful so far.
I am building a pangenome from several E. coli strains that we sequenced in our lab. I annotated them with bakta using the latest complete db (5.1), and then fed these annotations to the complete workflow:
ppanggolin all --anno genomes.gbff.txt --output ppanggolin_results -c 2 --verbose 2 -f
However, when it comes to writing the gene data to the .h5 file, I get an error related to the object type:
Traceback (most recent call last):
File "tables/tableextension.pyx", line 1676, in tables.tableextension.Row.__setitem__
TypeError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/rds/general/user/dmarti14/home/anaconda3/envs/ppanggo/bin/ppanggolin", line 10, in <module>
sys.exit(main())
File "/rds/general/user/dmarti14/home/anaconda3/envs/ppanggo/lib/python3.10/site-packages/ppanggolin/main.py", line 219, in main
ppanggolin.workflow.all.launch(args)
File "/rds/general/user/dmarti14/home/anaconda3/envs/ppanggo/lib/python3.10/site-packages/ppanggolin/workflow/all.py", line 288, in launch
launch_workflow(args, panrgp=True, panmodule=True)
File "/rds/general/user/dmarti14/home/anaconda3/envs/ppanggo/lib/python3.10/site-packages/ppanggolin/workflow/all.py", line 61, in launch_workflow
write_pangenome(pangenome, filename, args.force, disable_bar=args.disable_prog_bar)
File "/rds/general/user/dmarti14/home/anaconda3/envs/ppanggo/lib/python3.10/site-packages/ppanggolin/formats/writeBinaries.py", line 711, in write_pangenome
write_annotations(pangenome, h5f, disable_bar=disable_bar)
File "/rds/general/user/dmarti14/home/anaconda3/envs/ppanggo/lib/python3.10/site-packages/ppanggolin/formats/writeAnnotations.py", line 342, in write_annotations
write_genedata(pangenome, h5f, annotation, genedata2gene, disable_bar)
File "/rds/general/user/dmarti14/home/anaconda3/envs/ppanggo/lib/python3.10/site-packages/ppanggolin/formats/writeAnnotations.py", line 309, in write_genedata
genedata_row["name"] = genedata.name
File "tables/tableextension.pyx", line 1681, in tables.tableextension.Row.__setitem__
TypeError: invalid type (<class 'str'>) for column ``name``
/rds/general/user/dmarti14/home/anaconda3/envs/ppanggo/lib/python3.10/site-packages/tables/file.py:113: UnclosedFileWarning:
Closing remaining open file: ppanggolin_results/pangenome.h5
Here is the complete output from the run.
2024-05-30 12:39:04 utils.py:l168 INFO Command: /rds/general/user/dmarti14/home/anaconda3/envs/ppanggo/bin/ppanggolin all --anno genomes.gbff.txt --output ppanggolin_results -c 2 --verbose 2 -f
2024-05-30 12:39:04 utils.py:l169 INFO PPanGGOLiN version: 2.0.5
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--anno: genomes.gbff.txt" has been specified in the command line with a non-default value. Its value overwrites the default value (None).
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--cpu: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--force: True" has been specified in the command line with a non-default value. Its value overwrites the default value (False).
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--output: ppanggolin_results" has been specified in the command line with a non-default value. Its value overwrites the default value (ppanggolin_output_DATE2024-05-30_HOUR12.39.04_PID2061566).
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--verbose: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2024-05-30 12:39:04 utils.py:l668 DEBUG 4 all parameters have non-default value: cpu=2, force=True, output=ppanggolin_results, verbose=2
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing annotate arguments in config file.
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--cpu: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2024-05-30 12:39:04 utils.py:l709 DEBUG 1 annotate parameters have a non-default value: cpu=2
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing cluster arguments in config file.
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--cpu: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2024-05-30 12:39:04 utils.py:l709 DEBUG 1 cluster parameters have a non-default value: cpu=2
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing graph arguments in config file.
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing partition arguments in config file.
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--cpu: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2024-05-30 12:39:04 utils.py:l709 DEBUG 1 partition parameters have a non-default value: cpu=2
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing rarefaction arguments in config file.
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--cpu: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2024-05-30 12:39:04 utils.py:l709 DEBUG 1 rarefaction parameters have a non-default value: cpu=2
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing rgp arguments in config file.
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing spot arguments in config file.
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing module arguments in config file.
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--cpu: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2024-05-30 12:39:04 utils.py:l709 DEBUG 1 module parameters have a non-default value: cpu=2
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing draw arguments in config file.
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing write_pangenome arguments in config file.
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--cpu: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2024-05-30 12:39:04 utils.py:l709 DEBUG 1 write_pangenome parameters have a non-default value: cpu=2
2024-05-30 12:39:04 utils.py:l679 DEBUG Parsing write_genomes arguments in config file.
2024-05-30 12:39:04 utils.py:l529 DEBUG The parameter "--cpu: 2" has been specified in the command line with a non-default value. Its value overwrites the default value (1).
2024-05-30 12:39:04 utils.py:l709 DEBUG 1 write_genomes parameters have a non-default value: cpu=2
2024-05-30 12:39:04 utils.py:l722 INFO 11 parameters have a non-default value.
2024-05-30 12:39:04 annotate.py:l503 INFO Reading genomes.gbff.txt the list of genome files ...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:01<00:00, 5.76file/s]
2024-05-30 12:39:06 annotate.py:l535 INFO gene identifiers used in the provided annotation files were unique, PPanGGOLiN will use them.
2024-05-30 12:39:06 writeBinaries.py:l709 INFO Writing genome annotations...
2024-05-30 12:39:06 writeAnnotations.py:l71 DEBUG Writing 8 genomes
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 160547.52genome/s]
2024-05-30 12:39:06 writeAnnotations.py:l105 DEBUG Writing 1600 contigs
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1600/1600 [00:00<00:00, 652365.74contigs/s]
2024-05-30 12:39:06 writeAnnotations.py:l148 DEBUG Writing 36656 genes
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 36656/36656 [00:00<00:00, 169713.94gene/s]
2024-05-30 12:39:06 writeAnnotations.py:l297 DEBUG Writing 36509 gene-related data (can be lower than the number of genes)
93%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 33897/36509 [00:00<00:00, 687257.45genedata/s]
I can supply a few of the annotation files that I'm using as a test if necessary.
Thanks a lot.
The error seems a bit similar to the one encountered in these issues: #95, #175 and #222. However, here the problem seems to be with the gene name and not the product.
You might try to catch any problematic characters with this grep command on your gbff files: LC_ALL=C grep -n -P [$'\x80'-$'\xFF'] *.g*ff
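If grep with -P is not available on your system, the same check can be sketched in Python. This is only a minimal sketch, not part of PPanGGOLiN: the function name find_non_ascii is made up for illustration, and it simply reports every line that contains a byte of 0x80 or above, which is what the grep pattern above looks for.

```python
import sys


def find_non_ascii(path):
    """Yield (line_number, line_text) for each line holding a byte >= 0x80.

    Hypothetical helper for illustration; it is not a PPanGGOLiN function.
    """
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            if any(byte >= 0x80 for byte in raw):
                # Decode leniently so the offending line can still be printed.
                text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
                yield number, text


if __name__ == "__main__":
    # Usage: python find_non_ascii.py genome1.gbff genome2.gbff ...
    for filename in sys.argv[1:]:
        for number, text in find_non_ascii(filename):
            print(f"{filename}:{number}:{text}")
```

Reading the files in binary mode means the scan does not fail on files that are not valid UTF-8.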
That was it! For the record, it was again one of these double-wing motif proteins (gene mmcQ) that was responsible for the error.
I tried removing the non-ASCII characters from the gff3 files and this time it worked with a test subset I was playing with.
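For anyone hitting the same problem, here is a rough sketch of how the removal could be scripted. It is an assumption about one way to do it, not the exact procedure I used: the names strip_non_ascii and clean_file are made up, and the sketch simply drops every non-ASCII character, so check the affected names and products afterwards. It writes a new file rather than editing the annotation in place.

```python
def strip_non_ascii(text):
    """Return text with every non-ASCII character dropped (hypothetical helper)."""
    return text.encode("ascii", errors="ignore").decode("ascii")


def clean_file(src, dst):
    """Copy src to dst, removing non-ASCII characters line by line.

    Returns the number of lines that were changed.
    """
    changed = 0
    # newline="" keeps the original line endings untouched.
    with open(src, encoding="utf-8", errors="replace", newline="") as fin, \
            open(dst, "w", encoding="ascii", newline="") as fout:
        for line in fin:
            cleaned = strip_non_ascii(line)
            if cleaned != line:
                changed += 1
            fout.write(cleaned)
    return changed
```

For example, clean_file("strain1.gff3", "strain1.ascii.gff3") would write a cleaned copy next to the original (both file names are placeholders), and the cleaned files can then be listed in the file passed to --anno.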