The gtdbtk module throws an error: 'EXCEPTION: OSError MESSAGE: [Errno 12] Cannot allocate memory' when running METABOLIC-C.pl. #173

Open
ZhengXiaoxuan11542 opened this issue Jan 2, 2024 · 2 comments

Comments

@ZhengXiaoxuan11542

Hello,
Thank you for maintaining such a great tool. Everything runs well with METABOLIC-C.pl until I hit an error in the GTDB-Tk step:

ERROR: Uncontrolled exit resulting from an unexpected error.
================================================================================
EXCEPTION: OSError
MESSAGE: [Errno 12] Cannot allocate memory

As a result, all three PDF figures under ~/b73l-1_matabolic_c/METABOLIC_Figures/ are blank (there may be other issues I haven't noticed). I specified the database version as 207_v2; could the error be related to that? I also ran 'gtdbtk check_install --db_version 207' to check the database, and it reported no issues. I have attached my log file for reference and would appreciate your help in resolving this.
METABOLIC_log.log
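
Since the exception above is `[Errno 12] Cannot allocate memory`, one quick check before rerunning is how much memory the node actually has available. The following is a minimal sketch, assuming a Linux host that exposes `/proc/meminfo`; the helper names and the `required_gib` threshold are hypothetical, and the real requirement should be taken from the GTDB-Tk documentation for the release in use.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer value}.

    Values are the leading integer on each line (kB for memory fields).
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_gib(meminfo_text):
    """Return MemAvailable converted to GiB, or None if the field is absent."""
    kib = parse_meminfo(meminfo_text).get("MemAvailable")
    return None if kib is None else kib / (1024 ** 2)


def has_enough_memory(meminfo_text, required_gib):
    """True if MemAvailable is at least required_gib (a caller-chosen assumption)."""
    avail = available_gib(meminfo_text)
    return avail is not None and avail >= required_gib
```

On the machine that runs METABOLIC, this could be called as `has_enough_memory(open("/proc/meminfo").read(), required_gib=N)`, with `N` chosen from the GTDB-Tk hardware requirements. On a cluster, the limit that matters may be the scheduler's per-job allocation rather than the node total, which this sketch does not inspect.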

@ChaoLab
Collaborator

ChaoLab commented Jan 3, 2024

Can you run GTDB-Tk under METABOLIC conda env to see if there are some problems?
From the error message, it seems that GTDB-Tk breaks due to the memory shortage
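
When running GTDB-Tk by hand inside the METABOLIC conda environment, it can help to first confirm which `gtdbtk` executable and which reference data directory that environment resolves to. The sketch below is one way to report this; `describe_gtdbtk_env` is a hypothetical helper, and it assumes the environment was activated with conda (which sets `CONDA_DEFAULT_ENV`) and that the reference data location is provided through the `GTDBTK_DATA_PATH` environment variable.

```python
import os
import shutil


def describe_gtdbtk_env(environ=None, which=shutil.which):
    """Summarise which gtdbtk binary and reference data the current shell sees.

    environ and which are injectable so the function can be exercised
    without a real conda environment.
    """
    environ = os.environ if environ is None else environ
    return {
        # Full path of the gtdbtk executable found on PATH, or None
        "gtdbtk_executable": which("gtdbtk"),
        # Name of the active conda environment, or None if conda is not active
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        # Directory GTDB-Tk reads its reference data from, or None if unset
        "gtdbtk_data_path": environ.get("GTDBTK_DATA_PATH"),
    }
```

Running `print(describe_gtdbtk_env())` inside and outside the METABOLIC environment and comparing the two results is a cheap way to spot a different GTDB-Tk install or data path between the two setups.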

@ZhengXiaoxuan11542
Author

> Can you run GTDB-Tk under METABOLIC conda env to see if there are some problems? From the error message, it seems that GTDB-Tk breaks due to the memory shortage

Yes, as you suggested, GTDB-Tk runs out of memory, which interrupts pplacer. However, even after adding --pplacer_cpus and --scratch_dir to the command, I still get errors (see the run inside the METABOLIC environment below). When I ran the same command with the GTDB-Tk/v207 database outside the METABOLIC env, it ran smoothly and produced the expected output.

(METABOLIC_v4.0) [xiaoxuan@tc6001 ~]$ gtdbtk classify_wf --cpus 1 -x fasta --genome_dir /public/home/xiaoxuan/bxdata/10.bin/single/b73l-1/BIN_REFINEMENT/metawrap_50_10_bins --skip_ani_screen --pplacer_cpus 1 --scratch_dir /public/home/xiaoxuan/bxdata/10.bin_gene/b73l-1_matabolic_c/pplacer --out_dir /public/home/xiaoxuan/bxdata/10.bin_gene/b73l-1_matabolic_c/intermediate_files/gtdbtk_Genome_files
[2024-01-03 18:59:39] INFO: GTDB-Tk v2.3.2
[2024-01-03 18:59:39] INFO: gtdbtk classify_wf --cpus 1 -x fasta --genome_dir /public/home/xiaoxuan/bxdata/10.bin/single/b73l-1/BIN_REFINEMENT/metawrap_50_10_bins --skip_ani_screen --pplacer_cpus 1 --scratch_dir /public/home/xiaoxuan/bxdata/10.bin_gene/b73l-1_matabolic_c/pplacer --out_dir /public/home/xiaoxuan/bxdata/10.bin_gene/b73l-1_matabolic_c/intermediate_files/gtdbtk_Genome_files
[2024-01-03 18:59:39] INFO: Using GTDB-Tk reference data version r207: /public/home/xiaoxuan/database/gtdbtk/release207/
[2024-01-03 18:59:39] INFO: Identifying markers in 4 genomes with 1 threads.
[2024-01-03 18:59:39] TASK: Running Prodigal V2.6.3 to identify genes.
[2024-01-03 18:59:39] INFO: Completed 4 genomes in 0.02 seconds (189.02 genomes/second).
[2024-01-03 18:59:39] WARNING: Prodigal skipped 4 genomes due to pre-existing data, see warnings.log
[2024-01-03 18:59:39] TASK: Identifying TIGRFAM protein families.
[2024-01-03 18:59:39] INFO: Completed 4 genomes in 0.00 seconds (878.76 genomes/second).
[2024-01-03 18:59:39] WARNING: TIGRFAM skipped 4 genomes due to pre-existing data, see warnings.log
[2024-01-03 18:59:39] TASK: Identifying Pfam protein families.
[2024-01-03 18:59:39] INFO: Completed 4 genomes in 0.00 seconds (1,543.16 genomes/second).
[2024-01-03 18:59:39] WARNING: Pfam skipped 4 genomes due to pre-existing data, see warnings.log
[2024-01-03 18:59:39] INFO: Annotations done using HMMER 3.1b2 (February 2015).
[2024-01-03 18:59:39] TASK: Summarising identified marker genes.
[2024-01-03 18:59:39] INFO: Completed 4 genomes in 0.07 seconds (55.59 genomes/second).
[2024-01-03 18:59:39] INFO: Done.
[2024-01-03 18:59:39] INFO: Aligning markers in 4 genomes with 1 CPUs.
[2024-01-03 18:59:39] INFO: Processing 4 genomes identified as bacterial.
[2024-01-03 18:59:48] INFO: Read concatenated alignment for 62,291 GTDB genomes.
[2024-01-03 18:59:48] TASK: Generating concatenated alignment for each marker.
[2024-01-03 18:59:48] INFO: Completed 4 genomes in 0.04 seconds (111.86 genomes/second).
[2024-01-03 18:59:48] TASK: Aligning 108 identified markers using hmmalign 3.1b2 (February 2015).
[2024-01-03 19:00:08] INFO: Completed 108 markers in 19.90 seconds (5.43 markers/second).
[2024-01-03 19:00:08] TASK: Masking columns of bacterial multiple sequence alignment using canonical mask.
[2024-01-03 19:01:44] INFO: Completed 62,295 sequences in 1.59 minutes (39,166.12 sequences/minute).
[2024-01-03 19:01:44] INFO: Masked bacterial alignment from 41,084 to 5,036 AAs.
[2024-01-03 19:01:44] INFO: 0 bacterial user genomes have amino acids in <10.0% of columns in filtered MSA.
[2024-01-03 19:01:44] INFO: Creating concatenated alignment for 62,295 bacterial GTDB and user genomes.
[2024-01-03 19:02:04] INFO: Creating concatenated alignment for 4 bacterial user genomes.
[2024-01-03 19:02:05] INFO: Done.
[2024-01-03 19:02:05] INFO: Using a scratch file for pplacer allocations. This decreases memory usage and performance.
[2024-01-03 19:02:05] TASK: Placing 4 bacterial genomes into backbone reference tree with pplacer using 1 CPUs (be patient).
[2024-01-03 19:02:05] INFO: pplacer version: v1.1.alpha19-0-g807f6f3
==> Step 1 of 9: Starting pplacer.Uncaught exception: Sys_error("/public/home/xiaoxuan/database/gtdbtk/release207/split/backbone/pplacer/gtdbtk_package_backbone.refpkg: No such file or directory")
Fatal error: exception Sys_error("/public/home/xiaoxuan/database/gtdbtk/release207/split/backbone/pplacer/gtdbtk_package_backbone.refpkg: No such file or directory")
==> Running pplacer v1.1.alpha19-0-g807f6f3 analysis on /public/home/xiaoxuan/bxdata/10.bin_gene/b73l-1_matabolic_c/intermediate_files/gtdbtk_Genome_files/align/gtdbtk.bac120.user_msa.fasta.gz....Process Process-10:
Traceback (most recent call last):
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/site-packages/gtdbtk/external/pplacer.py", line 124, in _worker
    raise PplacerException('An error was encountered while '
gtdbtk.exceptions.PplacerException: An error was encountered while running pplacer, check the log file: /public/home/xiaoxuan/bxdata/10.bin_gene/b73l-1_matabolic_c/intermediate_files/gtdbtk_Genome_files/classify/intermediate_results/pplacer/pplacer.backbone.bac120.out
[2024-01-03 19:02:06] ERROR: Uncontrolled exit resulting from an unexpected error.

================================================================================
EXCEPTION: FileNotFoundError
MESSAGE: [Errno 2] No such file or directory: '/public/home/xiaoxuan/bxdata/10.bin_gene/b73l-1_matabolic_c/pplacer/gtdbtk.pplacer.scratch'

Traceback (most recent call last):
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/site-packages/gtdbtk/external/pplacer.py", line 92, in run
    raise PplacerException(
gtdbtk.exceptions.PplacerException: An error was encountered while running pplacer.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/site-packages/gtdbtk/main.py", line 102, in main
    gt_parser.parse_options(args)
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/site-packages/gtdbtk/main.py", line 1186, in parse_options
    self.classify(options,all_classified_ani= all_classified_ani)
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/site-packages/gtdbtk/main.py", line 587, in classify
    reports = classify.run(genomes=genomes,
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/site-packages/gtdbtk/classify.py", line 564, in run
    high_classify_tree = self.place_genomes(user_msa_file,
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/site-packages/gtdbtk/classify.py", line 270, in place_genomes
    pplacer.run(self.pplacer_cpus, 'wag', pplacer_ref_pkg, pplacer_json_out,
  File "/public/home/xiaoxuan/miniconda3/envs/METABOLIC_v4.0/lib/python3.8/site-packages/gtdbtk/external/pplacer.py", line 100, in run
    os.remove(mmap_file)
FileNotFoundError: [Errno 2] No such file or directory: '/public/home/xiaoxuan/bxdata/10.bin_gene/b73l-1_matabolic_c/pplacer/gtdbtk.pplacer.scratch'
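
Reading the log above, the first failure in this run is pplacer reporting that `gtdbtk_package_backbone.refpkg` does not exist under `split/backbone/pplacer/` in the reference data directory. The final `FileNotFoundError` is raised by `os.remove(mmap_file)` while the earlier `PplacerException` is being handled, so it appears to be a secondary effect of pplacer never creating its scratch file. One possibility worth ruling out is that the reference data release (r207 here) does not match what the installed GTDB-Tk (v2.3.2 here) expects; the GTDB-Tk documentation lists the compatible pairs. A minimal sketch for checking the two paths before launching a run follows. The helper names are hypothetical, and the refpkg layout is taken only from the path printed in this log.

```python
import os


def expected_pplacer_inputs(data_path, scratch_dir):
    """Paths the failing run tried to use, based on the log in this thread.

    data_path is the GTDB-Tk reference data directory; scratch_dir is the
    directory passed to --scratch_dir.
    """
    refpkg = os.path.join(
        data_path, "split", "backbone", "pplacer",
        "gtdbtk_package_backbone.refpkg",
    )
    return [refpkg, scratch_dir]


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk, in input order."""
    return [path for path in paths if not os.path.exists(path)]
```

Calling `missing_paths(expected_pplacer_inputs(data_path, scratch_dir))` with the reference data directory and the `--scratch_dir` value from the command returns an empty list when both are present; any returned entry is a path to investigate before rerunning.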
