Arthropod database build errors in version 1.5.1 (Python 3.7) #77

Closed
JonathanVanHamme opened this issue Nov 12, 2020 · 3 comments

Comments

@JonathanVanHamme

Hi Jon! Hope you are staying sane and healthy. I'm back to arthropods after a bit of a break, and downloaded the most recent data from BOLD a couple of days ago to ensure that the samples we submitted there are included when we do taxonomy assignment here. I noticed you mentioned updating the database build method in v1.4.0, but I'm not sure whether I'm looking at the updated instructions:

https://github.com/nextgenusfs/amptk/blob/master/docs/taxonomy.rst

No problems with bold2utax.py and concatenating the resulting files, but the database builds all fail.

I'm running on:
Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-66-generic x86_64), 24 cores, 378G memory.
In a Conda environment with freshly installed AMPtk 1.5.1 that is working beautifully for fungal and bacterial data.

All three database builds appear to raise the same error (see attached):
2020Nov11_Arthropod_Database_builderrors.txt

[05:05:46 PM]: Now dereplicating sequences (collapsing identical sequences)
Traceback (most recent call last):
  File "/results/Miniconda/miniconda3/envs/amptk151/bin/amptk", line 788, in <module>
    main()
  File "/results/Miniconda/miniconda3/envs/amptk151/bin/amptk", line 779, in main
    mod.main(arguments)
  File "/results/Miniconda/miniconda3/envs/amptk151/lib/python3.7/site-packages/amptk/extract_region.py", line 614, in main
    dereplicate(derep_tmp, OutName, args=args)
  File "/results/Miniconda/miniconda3/envs/amptk151/lib/python3.7/site-packages/amptk/extract_region.py", line 58, in dereplicate
    if not sequence in seqs:
TypeError: unhashable type: 'dict'

Thanks for any help,
Jon

nextgenusfs pushed a commit that referenced this issue Nov 12, 2020
@nextgenusfs
Owner

Hi Jon -- hope you are well. Sorry about that, looks like a typo when I switched to pyfastx parsing. You can install the master branch over the top of your conda install. Just activate that environment and run:

python -m pip install --no-deps --force git+https://github.com/nextgenusfs/amptk.git
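
For anyone who lands here with the same traceback, the failure mode can be reproduced outside AMPtk. A membership test such as `if not sequence in seqs` has to hash `sequence`, so it raises this TypeError whenever `sequence` turns out to be a dict rather than a string. The sketch below is an illustration only and is not AMPtk's actual code: the `dereplicate` helper, the `(name, sequence)` record tuples, and the dict-shaped bad record are all hypothetical stand-ins.

```python
def dereplicate(records):
    """Collapse identical sequences, keeping the first name seen for each.

    `records` is an iterable of (name, sequence) pairs. This is a
    hypothetical helper for illustration, not the AMPtk implementation.
    """
    seqs = {}
    for name, sequence in records:
        # The membership test hashes `sequence`, so it must be hashable
        # (a str is fine, a dict is not).
        if sequence not in seqs:
            seqs[sequence] = name
    return seqs


# Well-formed input: sequences are plain strings.
good_records = [("seq1", "ACGT"), ("seq2", "ACGT"), ("seq3", "GGCC")]
unique = dereplicate(good_records)

# Malformed input (assumed shape): a dict ends up where the sequence
# string should be, for example after a parsing mix-up.
bad_records = [("seq1", {"seq": "ACGT"})]
try:
    dereplicate(bad_records)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(unique)
print(error_message)
```

If the same TypeError still appears after reinstalling, checking the type of the object that reaches the membership test is a reasonable first debugging step.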

@JonathanVanHamme
Author

Thanks for the quick response, Jon! Worked like a charm! Database all set up and first data set through. Nice to see that you keep adding functionality to AMPtk, it is my favourite pipeline to work with.

@nextgenusfs
Owner

Great! Let me know if any other issues show up.


2 participants