I was looking for a fast way to process large numbers of GenBank entries and found your library. It definitely offers an improvement over Biopython, but I'm wondering why you did not include GBParsy in the speed comparison. It is a parser written in pure C, and likely even faster than gb-io.
Lee TH, Kim YK, Nahm BH. GBParsy: a GenBank flatfile parser library with high speed. BMC Bioinformatics. 2008 Jul 25;9:321. doi: 10.1186/1471-2105-9-321. PMID: 18652706; PMCID: PMC2516526.
I did not include GBParsy because I was not aware of the project, and since it's not on PyPI, it's not exactly the most convenient, batteries-included GenBank parser out there. Additionally, I tried to build it from source from the GitHub repository you linked, but the code seems quite outdated (it still uses the PyString_FromStringAndSize C API, which was removed in Python 3)...
Yes, you are right: the code was written in 2008, sixteen years ago, and is probably not compatible with the current Python C API. It has also never been uploaded to PyPI or conda-forge.
Digging a bit deeper, I realized that the code in the GitHub repository is an export of the old Google Code repository and does not represent the latest version. The repository contains v0.5.0, while the supplementary file of the publication itself includes v0.6.0 (2008-07-10) at: