GTDBtk is being run on only one (of several) samples, even when running with the option --binning_map_mode own #641

Closed
amizeranschi opened this issue Jul 25, 2024 · 15 comments
Labels
bug Something isn't working

Comments

@amizeranschi
Contributor

Description of the bug

When the pipeline is run on several samples, it appears to run GTDBtk on only a single sample, and only that sample is featured in the output. This happens even when the pipeline is run with the option --binning_map_mode own.

The job list printed by the pipeline also suggests that there was only one GTDBtk job spawned, for one sample:

[-        ] process > NFCORE_MAG:mag:CAT                                                                          -
[-        ] process > NFCORE_MAG:mag:CAT_SUMMARY                                                                  -
[ab/2da685] process > NFCORE_MAG:mag:GTDBTK:GTDBTK_CLASSIFYWF (MEGAHIT-MetaBAT2-unclassified-unrefined-CAPES_S21) [100%] 1 of 1 ✔
[c6/da5e34] process > NFCORE_MAG:mag:GTDBTK:GTDBTK_SUMMARY                                                        [100%] 1 of 1 ✔

This issue can be reproduced using data from the test_all profile, as shown below.

Command used and terminal output

## create a sample sheet

echo 'sample,group,short_reads_1,short_reads_2,long_reads
CAPES_S11,0,s3://ngi-igenomes/test-data/mag/ERR3201918_1.fastq.gz,s3://ngi-igenomes/test-data/mag/ERR3201918_2.fastq.gz,
CAPES_S21,0,s3://ngi-igenomes/test-data/mag/ERR3201928_1.fastq.gz,s3://ngi-igenomes/test-data/mag/ERR3201928_2.fastq.gz,
CAPES_S7,0,s3://ngi-igenomes/test-data/mag/ERR3201914_1.fastq.gz,s3://ngi-igenomes/test-data/mag/ERR3201914_2.fastq.gz,' > test-samples.csv


## run the pipeline

clear
nextflow run nf-core/mag -r 3.0.2 \
-profile docker \
--outdir test-mag \
--input test-samples.csv \
--skip_spades \
--skip_prodigal \
--skip_prokka \
--skip_metaeuk \
--skip_maxbin2 \
--skip_concoct \
--gtdb_db gtdbtk_r214_data/release214/ \
--binning_map_mode own \
--busco_db bacteria_odb10.2024-01-08.tar.gz


## check results

$ /.../test-mag/Taxonomy/GTDB-Tk/MEGAHIT/MetaBAT2
total 4.0K
drwxrwxr-x 2 ubuntu ubuntu 4.0K Jul 25 07:52 CAPES_S21
drwxrwxr-x 3 ubuntu ubuntu   31 Jul 25 07:52 .
drwxrwxr-x 3 ubuntu ubuntu   30 Jul 25 07:52 ..
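
The manual check above can also be scripted. The following is a minimal sketch, not part of the pipeline: `missing_gtdbtk_samples` is a hypothetical helper name. It assumes the sample name is the first column of the sample sheet, and that GTDB-Tk results land in one subdirectory per sample, as the listing above suggests.

```bash
# Hypothetical helper: print sample-sheet samples that lack a GTDB-Tk output directory.
# $1 = sample sheet CSV (first row is the header, first column is the sample name)
# $2 = directory assumed to contain one subdirectory per sample
missing_gtdbtk_samples() {
    local samplesheet="$1" gtdbtk_dir="$2" sample
    # Skip the header row and keep only the sample column
    tail -n +2 "$samplesheet" | cut -d, -f1 | while IFS= read -r sample || [ -n "$sample" ]; do
        # Ignore blank lines
        [ -n "$sample" ] || continue
        # Report the sample if its output directory does not exist
        [ -d "$gtdbtk_dir/$sample" ] || printf '%s\n' "$sample"
    done
}
```

Called as `missing_gtdbtk_samples test-samples.csv test-mag/Taxonomy/GTDB-Tk/MEGAHIT/MetaBAT2`, it should list CAPES_S11 and CAPES_S7 for the run reported here, since only the CAPES_S21 directory exists.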

Relevant files

No response

System information

No response

@amizeranschi added the bug (Something isn't working) label Jul 25, 2024
@amizeranschi changed the title from "GTDBtk outputs are missing results for all but one sample" to "GTDBtk is being run on only one (of several) samples, even when running with the option --binning_map_mode own" Jul 25, 2024
@amizeranschi
Contributor Author

Some more info, in case it's helpful for identifying the issue: the file gtdbtk_summary.tsv only contains results for CAPES_S21, the one sample that was evaluated, although it also has empty rows for bins from the two missing samples (CAPES_S11 and CAPES_S7). This is what the file looks like:

user_genome	classification	fastani_reference	fastani_reference_radius	fastani_taxonomy	fastani_ani	fastani_af	closest_placement_reference	closest_placement_radius	closest_placement_taxonomy	closest_placement_ani	closest_placement_af	pplacer_taxonomy	classification_method	note	other_related_references(genome_id,species_name,radius,ANI,AF)	msa_percent	translation_table	red_value	warnings
MEGAHIT-MetaBAT2-CAPES_S21.16.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.2.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.2.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.4.fa																			
MEGAHIT-MetaBAT2-CAPES_S7.1.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.8.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.9.fa																			
MEGAHIT-MetaBAT2-CAPES_S7.6.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.3.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.14.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.22.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.15.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.25.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.12.fa																			
MEGAHIT-MetaBAT2-CAPES_S7.12.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.19.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.21.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.3.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.22.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.18.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.9.fa																			
MEGAHIT-MetaBAT2-CAPES_S7.7.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.12.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.30.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.6.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.1.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.5.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.6.fa																			
MEGAHIT-MetaBAT2-CAPES_S7.15.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.20.fa																			
MEGAHIT-MetaBAT2-CAPES_S11.26.fa																			
MEGAHIT-MetaBAT2-CAPES_S7.9.fa																			
MEGAHIT-MetaBAT2-CAPES_S21.1.fa	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides;s__Bacteroides uniformis	GCA_025147485.1	95.0	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides;s__Bacteroides uniformis	99.08	0.904	GCA_025147485.1	95.0	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides;s__Bacteroides uniformis	99.08	0.904	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCF_000614125.1, s__Bacteroides rodentium, 95.0, 94.97, 0.721; GCF_004793475.1, s__Bacteroides sp002491635, 95.0, 92.23, 0.651; GCA_905197435.1, s__Bacteroides sp905197435, 95.0, 87.92, 0.685; GCF_000195635.1, s__Bacteroides fluxus, 95.0, 81.73, 0.487; GCA_902388495.1, s__Bacteroides sp902388495, 95.0, 81.69, 0.506; GCA_905203765.1, s__Bacteroides sp905203765, 95.0, 81.2, 0.564; GCA_025147325.1, s__Bacteroides stercoris, 95.0, 80.23, 0.415; GCF_900129655.1, s__Bacteroides clarus, 95.0, 80.15, 0.404; GCA_021203515.1, s__Bacteroides sp021203515, 95.0, 80.09, 0.421; GCA_910578895.1, s__Bacteroides sp910578895, 95.0, 80.07, 0.368; GCA_025146565.1, s__Bacteroides eggerthii, 95.0, 80.0, 0.384; GCF_018390535.1, s__Bacteroides propionicigenes, 95.0, 79.97, 0.399; GCF_000374365.1, s__Bacteroides gallinarum, 95.0, 79.87, 0.394; GCF_004342845.1, s__Bacteroides heparinolyticus, 95.0, 79.77, 0.409; GCF_000186225.1, s__Bacteroides helcogenes, 95.0, 79.69, 0.398; GCF_900241005.1, s__Bacteroides cutis, 95.0, 79.53, 0.338; GCA_905215345.1, s__Bacteroides sp905215345, 95.0, 79.44, 0.366; GCF_002998435.1, s__Bacteroides zoogleoformans, 95.0, 79.36, 0.392; GCF_000172175.1, s__Bacteroides intestinalis, 95.0, 79.21, 0.369; GCA_945607605.1, s__Bacteroides sp945607605, 95.0, 79.2, 0.461; GCA_023458215.1, s__Bacteroides sp023458215, 
95.0, 79.19, 0.409; GCF_900142015.1, s__Bacteroides stercorirosoris, 95.0, 79.18, 0.404; GCA_910585845.1, s__Bacteroides sp910585845, 95.0, 79.15, 0.308; GCA_900555635.1, s__Bacteroides sp900555635, 95.0, 79.14, 0.427; GCF_000315485.1, s__Bacteroides oleiciplenus, 95.0, 79.07, 0.395; GCF_902364365.1, s__Bacteroides sp900556215, 95.2143, 79.07, 0.378; GCA_905207245.1, s__Bacteroides sp905207245, 95.0, 79.02, 0.39; GCF_000513195.1, s__Bacteroides timonensis, 95.0, 79.02, 0.355; GCF_020091405.1, s__Bacteroides sp900552405, 95.0, 78.97, 0.378; GCF_003464595.1, s__Bacteroides intestinalis_A, 95.2143, 78.88, 0.381; GCF_000158035.1, s__Bacteroides cellulosilyticus, 95.0, 78.81, 0.377; GCA_900755095.1, s__Bacteroides sp900755095, 95.0, 78.7, 0.122; GCA_905215555.1, s__Bacteroides sp905215555, 95.0, 78.66, 0.324; GCF_903181435.1, s__Bacteroides sp900765785, 95.0, 78.41, 0.216; GCF_009193325.2, s__Bacteroides zhangwenhongii, 95.8579, 78.39, 0.204; GCA_902362375.1, s__Bacteroides sp902362375, 95.0, 78.27, 0.204; GCF_024623065.1, s__Bacteroides acidifaciens, 95.0, 78.22, 0.202; GCF_002222615.2, s__Bacteroides caccae, 95.0, 78.14, 0.189; GCF_009193295.2, s__Bacteroides luhongzhouii, 95.0, 78.12, 0.191; GCA_910586915.1, s__Bacteroides sp910586915, 95.0, 78.1, 0.225; GCF_001688725.2, s__Bacteroides caecimuris, 95.0, 78.1, 0.197; GCF_001314995.1, s__Bacteroides ovatus, 95.0, 78.09, 0.213; GCF_000156195.1, s__Bacteroides finegoldii, 95.8579, 78.07, 0.205; GCF_002849695.1, s__Bacteroides fragilis_A, 95.0, 78.05, 0.199; GCA_002293435.1, s__Bacteroides sp002293435, 95.0, 78.04, 0.239; GCF_000210075.1, s__Bacteroides xylanisolvens, 95.0, 78.02, 0.2; GCF_000011065.1, s__Bacteroides thetaiotaomicron, 95.0, 78.02, 0.212; GCF_900130125.1, s__Bacteroides congonensis, 95.0, 78.0, 0.226; GCA_000613465.1, s__Bacteroides nordii, 95.0, 77.99, 0.182; GCF_003865075.1, s__Bacteroides faecalis, 95.0, 77.99, 0.154; GCF_014334015.1, s__Bacteroides intestinigallinarum, 95.0, 77.96, 0.201; 
GCF_900155865.1, s__Bacteroides bouchesdurhonensis, 95.0, 77.95, 0.148; GCF_019583405.1, s__Bacteroides fragilis_B, 95.0, 77.93, 0.2; GCA_944322345.1, s__Bacteroides sp944322345, 95.0, 77.9, 0.124; GCA_014385165.1, s__Bacteroides sp014385165, 95.0, 77.9, 0.155; GCF_000381365.1, s__Bacteroides salyersiae, 95.0, 77.89, 0.193; GCF_900106755.1, s__Bacteroides faecis, 95.0, 77.87, 0.209; GCA_900547205.1, s__Bacteroides sp900547205, 95.0, 77.86, 0.186; GCF_000614145.1, s__Bacteroides faecichinchillae, 95.0, 77.86, 0.135; GCF_014750685.1, s__Bacteroides sp014750685, 95.0, 77.81, 0.22; GCA_934716785.1, s__Bacteroides sp934716785, 95.0, 77.81, 0.165; GCF_012113595.1, s__Bacteroides sp012113595, 95.0, 77.73, 0.222; GCA_900766005.1, s__Bacteroides sp900766005, 95.0, 77.67, 0.202; GCF_000025985.1, s__Bacteroides fragilis, 95.0, 77.67, 0.19; GCA_900556625.1, s__Bacteroides sp900556625, 95.0, 77.65, 0.154; GCA_900761785.1, s__Bacteroides sp900761785, 95.0, 77.64, 0.186; GCA_023426145.1, s__Bacteroides sp023426145, 95.0, 77.61, 0.096; GCA_900557355.1, s__Bacteroides sp900557355, 95.0, 77.57, 0.173; GCF_014196225.1, s__Bacteroides pyogenes_A, 95.0, 77.4, 0.165; GCF_000428105.1, s__Bacteroides pyogenes, 95.0, 77.38, 0.151; GCA_022648295.1, s__Bacteroides sp022648295, 95.0, 77.23, 0.119; GCA_002471185.1, s__Bacteroides sp002471185, 95.0, 77.22, 0.071; GCA_019416685.1, s__Bacteroides sp900762525, 95.0, 77.21, 0.154; GCF_000517545.1, s__Bacteroides reticulotermitis, 95.0, 77.19, 0.121; GCA_019412865.1, s__Bacteroides sp019412865, 95.0, 77.14, 0.096; GCA_007896885.1, s__Bacteroides sp007896885, 95.0, 77.11, 0.112; GCA_009929715.1, s__Bacteroides sp009929715, 95.0, 77.08, 0.096; GCA_900766195.1, s__Bacteroides sp900766195, 95.0, 77.03, 0.105; GCA_017992695.1, s__Bacteroides sp017992695, 95.0, 77.02, 0.133; GCA_002471195.1, s__Bacteroides sp002471195, 95.0, 77.02, 0.068; GCF_000499785.1, s__Bacteroides neonati, 95.0, 77.01, 0.114	77.76	11		
MEGAHIT-MetaBAT2-CAPES_S21.10.fa	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Ruminococcus_B;s__Ruminococcus_B gnavus	GCF_008121495.1	95.0	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Ruminococcus_B;s__Ruminococcus_B gnavus	98.97	0.882	GCF_008121495.1	95.0	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Ruminococcus_B;s__Ruminococcus_B gnavus	98.97	0.882	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Ruminococcus_B;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCA_900544395.1, s__Ruminococcus_B sp900544395, 95.0, 85.35, 0.715; GCA_018883985.1, s__Ruminococcus_B intestinipullorum, 95.0, 78.68, 0.203	93.86	11		
MEGAHIT-MetaBAT2-CAPES_S21.11.fa	d__Bacteria;p__Bacillota;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Ligilactobacillus;s__Ligilactobacillus salivarius	GCF_001435955.1	95.0	d__Bacteria;p__Bacillota;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Ligilactobacillus;s__Ligilactobacillus salivarius	98.32	0.835	GCF_001435955.1	95.0	d__Bacteria;p__Bacillota;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Ligilactobacillus;s__Ligilactobacillus salivarius	98.32	0.835	d__Bacteria;p__Bacillota;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Ligilactobacillus;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCF_001654615.1, s__Ligilactobacillus aviarius_B, 95.0, 79.65, 0.076; GCF_001434535.1, s__Ligilactobacillus animalis, 95.0, 79.37, 0.126; GCF_000615845.1, s__Ligilactobacillus hayakitensis, 95.0, 79.33, 0.274; GCF_001591685.1, s__Ligilactobacillus murinus, 95.0, 79.2, 0.124; GCA_934201845.1, s__Ligilactobacillus sp934201845, 95.0, 79.11, 0.055; GCF_001436215.1, s__Ligilactobacillus agilis, 95.0, 78.97, 0.115; GCA_910587695.1, s__Ligilactobacillus sp910587695, 95.0, 78.92, 0.13; GCF_900113455.1, s__Ligilactobacillus ruminis, 95.0, 78.88, 0.076; GCF_001436315.1, s__Ligilactobacillus aviarius, 95.0, 78.85, 0.074; GCA_905204935.1, s__Ligilactobacillus sp900765635, 95.0, 78.77, 0.154; GCF_003864415.1, s__Ligilactobacillus salitolerans, 95.0, 78.76, 0.054; GCA_019120355.1, s__Ligilactobacillus faecavium, 95.0, 78.75, 0.084; GCF_000423245.1, s__Ligilactobacillus ceti, 95.0, 78.74, 0.157; GCF_001434405.1, s__Ligilactobacillus apodemi, 95.0, 78.66, 0.152; GCA_900110005.1, s__Ligilactobacillus ruminis_A, 95.0, 78.65, 0.113; GCF_001435375.1, s__Ligilactobacillus araffinosus, 95.0, 78.53, 0.1; GCF_000423265.1, s__Ligilactobacillus saerimneri, 95.0, 78.34, 0.091; GCF_001435755.1, s__Ligilactobacillus acidipiscis, 95.0, 77.9, 0.083; GCA_019116985.1, s__Ligilactobacillus excrementavium, 95.0, 
77.87, 0.08; GCA_019119415.1, s__Ligilactobacillus avistercoris, 95.0, 77.86, 0.035; GCF_022836475.1, s__Ligilactobacillus excrementipullorum, 95.0, 77.86, 0.098; GCF_024160875.1, s__Ligilactobacillus sp024160875, 95.0, 77.85, 0.167; GCF_000615765.1, s__Ligilactobacillus equi, 95.0, 77.78, 0.124; GCA_019114385.1, s__Ligilactobacillus excrementigallinarum, 95.0, 77.61, 0.13; GCF_000349725.1, s__Ligilactobacillus pobuzihii, 95.0, 77.29, 0.093; GCA_944326135.1, s__Ligilactobacillus sp944326135, 95.0, 77.26, 0.176; GCA_934215475.1, s__Ligilactobacillus sp934215475, 95.0, 77.0, 0.03	92.99	11		
MEGAHIT-MetaBAT2-CAPES_S21.13.fa	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides;s__Bacteroides fragilis_A	GCF_002849695.1	95.0	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides;s__Bacteroides fragilis_A	97.98	0.741	GCF_002849695.1	95.0	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides;s__Bacteroides fragilis_A	97.98	0.741	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCF_019583405.1, s__Bacteroides fragilis_B, 95.0, 93.23, 0.703; GCF_000025985.1, s__Bacteroides fragilis, 95.0, 86.89, 0.661; GCA_019412865.1, s__Bacteroides sp019412865, 95.0, 82.62, 0.134; GCA_900557355.1, s__Bacteroides sp900557355, 95.0, 82.06, 0.123; GCF_001314995.1, s__Bacteroides ovatus, 95.0, 81.26, 0.214; GCF_900106755.1, s__Bacteroides faecis, 95.0, 81.01, 0.243; GCF_003865075.1, s__Bacteroides faecalis, 95.0, 80.8, 0.204; GCA_025147485.1, s__Bacteroides uniformis, 95.0, 80.44, 0.175; GCF_014334015.1, s__Bacteroides intestinigallinarum, 95.0, 80.33, 0.229; GCA_902362375.1, s__Bacteroides sp902362375, 95.0, 80.3, 0.224; GCF_000614125.1, s__Bacteroides rodentium, 95.0, 79.76, 0.161; GCF_000156195.1, s__Bacteroides finegoldii, 95.8579, 79.73, 0.195; GCF_000210075.1, s__Bacteroides xylanisolvens, 95.0, 79.72, 0.225; GCF_001688725.2, s__Bacteroides caecimuris, 95.0, 79.55, 0.192; GCF_009193295.2, s__Bacteroides luhongzhouii, 95.0, 79.46, 0.202; GCA_900755095.1, s__Bacteroides sp900755095, 95.0, 79.35, 0.151; GCA_000613465.1, s__Bacteroides nordii, 95.0, 79.29, 0.224; GCF_004793475.1, s__Bacteroides sp002491635, 95.0, 79.22, 0.18; GCF_903181435.1, s__Bacteroides sp900765785, 95.0, 79.18, 0.202; GCF_009193325.2, s__Bacteroides zhangwenhongii, 95.8579, 79.17, 0.209; GCF_000011065.1, s__Bacteroides 
thetaiotaomicron, 95.0, 79.16, 0.234; GCF_012113595.1, s__Bacteroides sp012113595, 95.0, 78.99, 0.23; GCA_905197435.1, s__Bacteroides sp905197435, 95.0, 78.95, 0.173; GCF_900130125.1, s__Bacteroides congonensis, 95.0, 78.93, 0.22; GCF_014750685.1, s__Bacteroides sp014750685, 95.0, 78.87, 0.222; GCF_024623065.1, s__Bacteroides acidifaciens, 95.0, 78.79, 0.188; GCF_000381365.1, s__Bacteroides salyersiae, 95.0, 78.78, 0.21; GCF_002222615.2, s__Bacteroides caccae, 95.0, 78.72, 0.215; GCF_900155865.1, s__Bacteroides bouchesdurhonensis, 95.0, 78.7, 0.182; GCA_934716785.1, s__Bacteroides sp934716785, 95.0, 78.63, 0.151; GCA_025147325.1, s__Bacteroides stercoris, 95.0, 78.53, 0.192; GCA_910586915.1, s__Bacteroides sp910586915, 95.0, 78.53, 0.224; GCA_900547205.1, s__Bacteroides sp900547205, 95.0, 78.46, 0.198; GCA_014385165.1, s__Bacteroides sp014385165, 95.0, 78.46, 0.186; GCF_000614145.1, s__Bacteroides faecichinchillae, 95.0, 78.43, 0.173; GCA_900761785.1, s__Bacteroides sp900761785, 95.0, 78.39, 0.215; GCA_910578895.1, s__Bacteroides sp910578895, 95.0, 78.33, 0.172; GCF_018390535.1, s__Bacteroides propionicigenes, 95.0, 78.32, 0.192; GCF_000517545.1, s__Bacteroides reticulotermitis, 95.0, 78.32, 0.141; GCF_000195635.1, s__Bacteroides fluxus, 95.0, 78.31, 0.169; GCF_000172175.1, s__Bacteroides intestinalis, 95.0, 78.28, 0.18; GCA_900556625.1, s__Bacteroides sp900556625, 95.0, 78.27, 0.198; GCA_905207245.1, s__Bacteroides sp905207245, 95.0, 78.27, 0.162; GCF_020091405.1, s__Bacteroides sp900552405, 95.0, 78.26, 0.19; GCF_900142015.1, s__Bacteroides stercorirosoris, 95.0, 78.25, 0.184; GCA_019416685.1, s__Bacteroides sp900762525, 95.0, 78.17, 0.211; GCF_000499785.1, s__Bacteroides neonati, 95.0, 78.17, 0.147; GCF_000513195.1, s__Bacteroides timonensis, 95.0, 78.16, 0.178; GCF_000315485.1, s__Bacteroides oleiciplenus, 95.0, 78.15, 0.178; GCF_902364365.1, s__Bacteroides sp900556215, 95.2143, 78.15, 0.167; GCA_025146565.1, s__Bacteroides eggerthii, 95.0, 78.09, 0.161; 
GCA_902388495.1, s__Bacteroides sp902388495, 95.0, 78.06, 0.16; GCF_000374365.1, s__Bacteroides gallinarum, 95.0, 78.04, 0.137; GCA_910585845.1, s__Bacteroides sp910585845, 95.0, 78.04, 0.151; GCF_900241005.1, s__Bacteroides cutis, 95.0, 77.98, 0.172; GCA_905203765.1, s__Bacteroides sp905203765, 95.0, 77.98, 0.208; GCA_900766005.1, s__Bacteroides sp900766005, 95.0, 77.94, 0.192; GCA_900555635.1, s__Bacteroides sp900555635, 95.0, 77.94, 0.153; GCF_000186225.1, s__Bacteroides helcogenes, 95.0, 77.93, 0.173; GCF_900129655.1, s__Bacteroides clarus, 95.0, 77.92, 0.188; GCF_000158035.1, s__Bacteroides cellulosilyticus, 95.0, 77.86, 0.166; GCF_003464595.1, s__Bacteroides intestinalis_A, 95.2143, 77.86, 0.184; GCF_002998435.1, s__Bacteroides zoogleoformans, 95.0, 77.81, 0.138; GCA_023426145.1, s__Bacteroides sp023426145, 95.0, 77.77, 0.151; GCA_021203515.1, s__Bacteroides sp021203515, 95.0, 77.76, 0.175; GCA_945607605.1, s__Bacteroides sp945607605, 95.0, 77.71, 0.219; GCA_905215345.1, s__Bacteroides sp905215345, 95.0, 77.67, 0.157; GCF_014196225.1, s__Bacteroides pyogenes_A, 95.0, 77.66, 0.156; GCA_944322345.1, s__Bacteroides sp944322345, 95.0, 77.63, 0.139; GCA_017992695.1, s__Bacteroides sp017992695, 95.0, 77.63, 0.192; GCF_004342845.1, s__Bacteroides heparinolyticus, 95.0, 77.57, 0.148; GCF_000428105.1, s__Bacteroides pyogenes, 95.0, 77.54, 0.156; GCA_002293435.1, s__Bacteroides sp002293435, 95.0, 77.49, 0.136; GCA_002471185.1, s__Bacteroides sp002471185, 95.0, 77.42, 0.103; GCA_023458215.1, s__Bacteroides sp023458215, 95.0, 77.41, 0.142; GCA_905215555.1, s__Bacteroides sp905215555, 95.0, 77.33, 0.14; GCA_009929715.1, s__Bacteroides sp009929715, 95.0, 77.28, 0.109; GCA_022648295.1, s__Bacteroides sp022648295, 95.0, 76.84, 0.087; GCA_007896885.1, s__Bacteroides sp007896885, 95.0, 76.8, 0.085; GCA_900766195.1, s__Bacteroides sp900766195, 95.0, 76.8, 0.091; GCA_002471195.1, s__Bacteroides sp002471195, 95.0, 76.76, 0.099	71.82	11		
MEGAHIT-MetaBAT2-CAPES_S21.17.fa	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Phocaeicola;s__Phocaeicola dorei	GCF_013009555.1	95.359	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Phocaeicola;s__Phocaeicola dorei	99.04	0.857	GCF_013009555.1	95.359	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Phocaeicola;s__Phocaeicola dorei	99.04	0.857	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Phocaeicola;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCF_000012825.1, s__Phocaeicola vulgatus, 95.359, 95.49, 0.752; GCA_019414665.1, s__Phocaeicola sp019414665, 95.0, 92.19, 0.449; GCA_019414725.1, s__Phocaeicola sp019414725, 95.0, 91.89, 0.428; GCA_900760795.1, s__Phocaeicola sp900760795, 95.0, 91.16, 0.596; GCF_000614185.1, s__Phocaeicola sartorii, 95.0, 90.08, 0.637; GCA_011959205.1, s__Phocaeicola sp011959205, 95.0, 89.9, 0.594; GCA_902388365.1, s__Phocaeicola sp902388365, 95.0, 81.31, 0.474; GCF_000382445.1, s__Phocaeicola massiliensis, 95.0, 81.08, 0.367; GCF_021730445.1, s__Phocaeicola faecalis, 95.0, 80.47, 0.394; GCF_013618865.1, s__Phocaeicola faecicola, 95.0, 78.31, 0.126; GCF_900128455.1, s__Phocaeicola mediterraneensis, 95.0, 78.3, 0.142; GCA_900546355.1, s__Phocaeicola sp900546355, 95.0, 77.99, 0.117; GCA_900546645.1, s__Phocaeicola sp900546645, 95.0, 77.92, 0.154; GCF_000374585.1, s__Phocaeicola barnesiae, 95.0, 77.88, 0.155; GCA_019550975.1, s__Phocaeicola excrementipullorum, 95.0, 77.82, 0.092; GCF_021531075.1, s__Phocaeicola caecigallinarum_B, 95.0, 77.82, 0.134; GCA_017472925.1, s__Phocaeicola sp017472925, 95.0, 77.75, 0.186; GCF_000187895.1, s__Phocaeicola plebeius, 95.0, 77.74, 0.129; GCA_900542985.1, s__Phocaeicola sp900542985, 95.0, 77.73, 0.193; GCA_900551645.1, s__Phocaeicola sp900551645, 95.0, 77.72, 0.158; GCF_000154845.1, s__Phocaeicola 
coprocola, 95.0, 77.67, 0.149; GCF_900128495.1, s__Phocaeicola ilei, 95.0, 77.65, 0.144; GCF_000190575.1, s__Phocaeicola salanitronis, 95.0, 77.65, 0.105; GCA_000434735.1, s__Phocaeicola sp000434735, 95.0, 77.65, 0.111; GCA_902362595.1, s__Phocaeicola merdigallinarum, 95.0, 77.56, 0.149; GCF_003437535.1, s__Phocaeicola plebeius_A, 95.0, 77.55, 0.147; GCA_015060705.1, s__Phocaeicola sp015060705, 95.0, 77.54, 0.169; GCF_021530995.1, s__Phocaeicola faecigallinarum, 95.0, 77.54, 0.123; GCA_018884025.1, s__Phocaeicola faecipullorum, 95.0, 77.52, 0.097; GCA_015060735.1, s__Phocaeicola sp015060735, 95.0, 77.51, 0.167; GCA_944320855.1, s__Phocaeicola sp944320855, 95.0, 77.51, 0.114; GCF_016901995.1, s__Phocaeicola sp900066445, 95.0, 77.5, 0.128; GCF_016900355.1, s__Phocaeicola sp900551065, 95.0, 77.5, 0.142; GCA_900551445.1, s__Phocaeicola sp900551445, 95.0, 77.49, 0.169; GCA_015060765.1, s__Phocaeicola sp015060765, 95.0, 77.48, 0.188; GCF_000613805.1, s__Phocaeicola paurosaccharolyticus, 95.0, 77.47, 0.104; GCA_900544075.1, s__Phocaeicola sp900544075, 95.0, 77.45, 0.105; GCA_944321335.1, s__Phocaeicola sp944321335, 95.0, 77.43, 0.174; GCA_017467385.1, s__Phocaeicola sp017467385, 95.0, 77.42, 0.195; GCF_000157915.1, s__Phocaeicola coprophilus, 95.0, 77.39, 0.158; GCA_017503235.1, s__Phocaeicola sp017503235, 95.0, 77.38, 0.173; GCA_020026175.1, s__Phocaeicola sp020026175, 95.0, 77.37, 0.123; GCA_017416735.1, s__Phocaeicola sp017416735, 95.0, 77.37, 0.136; GCA_900552645.1, s__Phocaeicola sp900552645, 95.0, 77.28, 0.147; GCA_020026225.1, s__Phocaeicola sp020026225, 95.0, 77.26, 0.136; GCA_900553185.1, s__Phocaeicola sp900553185, 95.0, 77.21, 0.164; GCA_900556845.1, s__Phocaeicola sp900556845, 95.0, 77.18, 0.139; GCA_900552075.1, s__Phocaeicola sp900552075, 95.0, 77.09, 0.166; GCA_905198575.1, s__Phocaeicola sp905198575, 95.0, 77.09, 0.125; GCF_016902295.1, s__Phocaeicola caecigallinarum_A, 95.0, 77.08, 0.137; GCA_000432735.1, s__Phocaeicola sp000432735, 95.0, 77.08, 0.161; 
GCA_023415545.1, s__Phocaeicola sp023415545, 95.0, 77.05, 0.122; GCF_014837055.1, s__Phocaeicola faecium, 95.0, 77.01, 0.13; GCF_014837065.1, s__Phocaeicola intestinalis, 95.0, 76.99, 0.117; GCA_022767125.1, s__Phocaeicola plebeius_C, 95.0, 76.97, 0.102; GCA_017420875.1, s__Phocaeicola sp017420875, 95.0, 76.95, 0.156; GCA_944320815.1, s__Phocaeicola sp944320815, 95.0, 76.95, 0.114; GCA_944320605.1, s__Phocaeicola sp944320605, 95.0, 76.94, 0.1; GCA_944320845.1, s__Phocaeicola sp944320845, 95.0, 76.91, 0.111; GCA_019116145.1, s__Phocaeicola gallistercoris, 95.0, 76.89, 0.114; GCA_017558175.1, s__Phocaeicola sp017558175, 95.0, 76.87, 0.149; GCA_900544675.1, s__Phocaeicola sp900544675, 95.0, 76.85, 0.113; GCA_017466635.1, s__Phocaeicola sp017466635, 95.0, 76.82, 0.175; GCA_944319745.1, s__Phocaeicola sp944319745, 95.0, 76.75, 0.094; GCA_944322395.1, s__Phocaeicola sp944322395, 95.0, 76.72, 0.085; GCA_000436795.1, s__Phocaeicola sp000436795, 95.0, 76.7, 0.152; GCA_902779535.1, s__Phocaeicola sp902779535, 95.0, 76.68, 0.077; GCA_019120125.1, s__Phocaeicola excrementigallinarum, 95.0, 76.68, 0.115; GCF_002161765.1, s__Phocaeicola sp002161765, 95.0, 76.67, 0.129; GCA_004558305.1, s__Phocaeicola plebeius_B, 95.0, 76.66, 0.094; GCA_022764505.1, s__Phocaeicola sp022764505, 95.0, 76.63, 0.081; GCA_944320135.1, s__Phocaeicola sp944320135, 95.0, 76.6, 0.091; GCA_017559935.1, s__Phocaeicola sp017559935, 95.0, 76.53, 0.166; GCA_021295095.1, s__Phocaeicola sp021295095, 95.0, 76.12, 0.075; GCF_000312445.1, s__Phocaeicola abscessus, 95.0, 76.12, 0.058	78.59	11		
MEGAHIT-MetaBAT2-CAPES_S21.18.fa	d__Bacteria;p__Actinomycetota;c__Actinomycetia;o__Actinomycetales;f__Bifidobacteriaceae;g__Bifidobacterium;s__Bifidobacterium breve	GCF_001025175.1	95.0	d__Bacteria;p__Actinomycetota;c__Actinomycetia;o__Actinomycetales;f__Bifidobacteriaceae;g__Bifidobacterium;s__Bifidobacterium breve	97.82	0.873	GCF_001025175.1	95.0	d__Bacteria;p__Actinomycetota;c__Actinomycetia;o__Actinomycetales;f__Bifidobacteriaceae;g__Bifidobacterium;s__Bifidobacterium breve	97.82	0.873	d__Bacteria;p__Actinomycetota;c__Actinomycetia;o__Actinomycetales;f__Bifidobacteriaceae;g__Bifidobacterium;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCF_000269965.1, s__Bifidobacterium infantis, 95.0, 84.59, 0.697; GCF_000196555.1, s__Bifidobacterium longum, 95.0, 83.85, 0.667; GCA_022739095.1, s__Bifidobacterium sp022739095, 95.0, 83.85, 0.486; GCF_019331675.1, s__Bifidobacterium miconisargentati, 95.0, 81.5, 0.622; GCF_003129905.1, s__Bifidobacterium callitrichidarum, 95.0, 81.49, 0.604; GCF_002259745.1, s__Bifidobacterium myosotis, 95.0, 80.98, 0.596; GCF_009193305.1, s__Bifidobacterium cebidarum, 95.0, 80.89, 0.528; GCF_018555455.1, s__Bifidobacterium colobi, 95.0, 80.84, 0.552; GCF_000741695.1, s__Bifidobacterium reuteri, 95.0, 80.79, 0.568; GCF_002802915.1, s__Bifidobacterium felsineum, 95.0, 80.75, 0.556; GCF_012932375.1, s__Bifidobacterium sp012932375, 95.0, 80.65, 0.552; GCF_002860405.1, s__Bifidobacterium imperatoris, 95.0, 80.65, 0.528; GCF_000741715.1, s__Bifidobacterium saguini, 95.0, 80.32, 0.534; GCF_002802865.1, s__Bifidobacterium scaligerum, 95.0, 80.03, 0.512; GCF_018555385.1, s__Bifidobacterium santillanense, 95.0, 79.93, 0.43; GCF_019331725.1, s__Bifidobacterium saguinibicoloris, 95.0, 79.86, 0.4; GCF_010667645.1, s__Bifidobacterium platyrrhinorum, 95.0, 79.81, 0.428; GCF_002860365.1, s__Bifidobacterium parmae, 95.0, 79.75, 0.43; GCF_001042635.1, s__Bifidobacterium scardovii, 95.0, 
79.71, 0.456; GCF_000741175.1, s__Bifidobacterium callitrichos, 95.0, 79.64, 0.426; GCF_001417815.1, s__Bifidobacterium aesculapii, 95.0, 79.59, 0.396; GCF_008698235.1, s__Bifidobacterium rousetti, 95.0, 79.58, 0.436; GCF_000741785.1, s__Bifidobacterium stellenboschense, 95.0, 79.47, 0.44; GCF_010667685.1, s__Bifidobacterium aerophilum, 95.0, 79.47, 0.404; GCF_008698145.1, s__Bifidobacterium vespertilionis, 95.0, 79.41, 0.337; GCF_019331715.1, s__Bifidobacterium simiiventris, 95.0, 79.39, 0.398; GCF_019331805.1, s__Bifidobacterium pongonis, 95.0, 79.37, 0.345; GCF_009299505.1, s__Bifidobacterium ramosum, 95.0, 79.34, 0.396; GCF_009299475.1, s__Bifidobacterium avesanii, 95.0, 79.25, 0.361; GCF_012932365.1, s__Bifidobacterium sp012932365, 95.0, 79.25, 0.375; GCF_018555435.2, s__Bifidobacterium amazonense, 95.0, 79.24, 0.4; GCF_019331735.1, s__Bifidobacterium miconis, 95.0, 79.22, 0.376; GCF_014898155.1, s__Bifidobacterium eulemuris, 95.0, 79.21, 0.43; GCF_012932425.1, s__Bifidobacterium sp012932425, 95.0, 79.21, 0.48; GCF_009193355.1, s__Bifidobacterium leontopitheci, 95.0, 79.09, 0.38; GCF_001025135.1, s__Bifidobacterium bifidum, 95.0, 79.03, 0.363; GCF_014898175.1, s__Bifidobacterium lemurum, 95.0, 78.93, 0.43; GCF_003952945.1, s__Bifidobacterium samirii, 95.0, 78.9, 0.323; GCF_000800455.1, s__Bifidobacterium kashiwanohense_A, 95.1942, 78.89, 0.317; GCF_000741165.1, s__Bifidobacterium biavatii, 95.0, 78.89, 0.4; GCF_000010425.1, s__Bifidobacterium adolescentis, 95.0, 78.85, 0.351; GCF_900129045.1, s__Bifidobacterium merycicum, 95.0, 78.85, 0.363; GCA_002451435.1, s__Bifidobacterium sp002451435, 95.0, 78.82, 0.321; GCF_001025155.1, s__Bifidobacterium angulatum, 95.0, 78.78, 0.339; GCF_016899725.1, s__Bifidobacterium pullorum_C, 95.0, 78.74, 0.325; GCF_000741215.1, s__Bifidobacterium pullorum_B, 95.0, 78.74, 0.333; GCF_000770925.1, s__Bifidobacterium ruminantium, 95.0, 78.66, 0.375; GCF_012932675.1, s__Bifidobacterium sp012932675, 95.0, 78.65, 0.357; GCF_001025215.1, 
s__Bifidobacterium pseudocatenulatum, 95.0, 78.61, 0.291; GCF_000522505.1, s__Bifidobacterium moukalabense, 95.0, 78.59, 0.339; GCF_001042615.1, s__Bifidobacterium kashiwanohense, 96.3948, 78.53, 0.291; GCF_002742445.1, s__Bifidobacterium sp002742445, 95.0, 78.5, 0.295; GCF_001042595.1, s__Bifidobacterium dentium, 95.0, 78.41, 0.321; GCF_003951095.1, s__Bifidobacterium goeldii, 95.0, 78.39, 0.249; GCF_009078285.1, s__Bifidobacterium jacchi, 95.0, 78.38, 0.343; GCF_019331775.1, s__Bifidobacterium phasiani, 95.0, 78.25, 0.311; GCF_001025195.1, s__Bifidobacterium catenulatum, 96.3948, 78.22, 0.285; GCF_000771405.1, s__Bifidobacterium pullorum, 95.0, 78.17, 0.319; GCA_002298605.1, s__Bifidobacterium sp002298605, 95.0, 78.09, 0.267; GCF_010667615.1, s__Bifidobacterium choloepi, 95.0, 78.08, 0.185; GCF_012932445.1, s__Bifidobacterium sp012932445, 95.0, 78.04, 0.233; GCF_000741575.1, s__Bifidobacterium cuniculi, 95.0, 77.91, 0.221; GCA_000741495.1, s__Bifidobacterium thermophilum_A, 95.0, 77.81, 0.215; GCF_004155535.1, s__Bifidobacterium pseudolongum_C, 95.0, 77.78, 0.203; GCA_000771265.1, s__Bifidobacterium thermophilum, 95.0, 77.75, 0.249; GCF_000741255.1, s__Bifidobacterium magnum, 95.0, 77.64, 0.165; GCF_000741775.1, s__Bifidobacterium subtile, 95.0, 77.63, 0.225; GCF_002286915.1, s__Bifidobacterium italicum, 95.0, 77.6, 0.167; GCF_002259755.1, s__Bifidobacterium hapali, 95.0, 77.57, 0.177; GCF_000741135.1, s__Bifidobacterium choerinum, 95.0, 77.49, 0.189; GCA_024104275.1, s__Bifidobacterium sp024104275, 95.0, 77.48, 0.167; GCF_000260715.1, s__Bifidobacterium animalis, 95.0, 77.48, 0.174; GCF_000741535.1, s__Bifidobacterium boum, 95.0, 77.47, 0.225; GCF_000741205.1, s__Bifidobacterium gallicum, 95.0, 77.42, 0.122; GCA_022640935.1, s__Bifidobacterium sp022640935, 95.0, 77.38, 0.169; GCF_009371885.1, s__Bifidobacterium tibiigranuli, 95.0, 77.37, 0.213; GCF_000741295.1, s__Bifidobacterium globosum, 95.0, 77.36, 0.209; GCF_000771225.1, s__Bifidobacterium pseudolongum, 
95.0, 77.34, 0.199; GCF_003952025.1, s__Bifidobacterium castoris, 95.0, 77.33, 0.191; GCF_000741285.1, s__Bifidobacterium mongoliense, 95.0, 77.3, 0.181; GCA_022640915.1, s__Bifidobacterium sp022640915, 95.0, 77.25, 0.125; GCF_003408845.1, s__Bifidobacterium vaginale_H, 95.0, 77.2, 0.034; GCF_002286935.1, s__Bifidobacterium criceti, 95.0, 77.19, 0.133; GCF_003315635.1, s__Bifidobacterium aemilianum, 95.0, 77.15, 0.147; GCF_003397585.1, s__Bifidobacterium piotii, 95.0, 77.09, 0.032; GCF_000741525.1, s__Bifidobacterium bohemicum, 95.0, 77.07, 0.12; GCF_009299485.1, s__Bifidobacterium apri, 95.0, 77.05, 0.219; GCF_009727065.1, s__Bifidobacterium canis, 95.0, 77.04, 0.167; GCF_002860345.1, s__Bifidobacterium anseris, 95.0, 76.96, 0.179; GCA_003585735.1, s__Bifidobacterium sp003585735, 95.0, 76.92, 0.044; GCF_000263635.1, s__Bifidobacterium vaginale_C, 95.0, 76.71, 0.051; GCF_000741765.1, s__Bifidobacterium tsurumiense, 95.0, 76.68, 0.116; GCF_001546455.1, s__Bifidobacterium vaginale_B, 95.0, 76.61, 0.033; GCA_022649135.1, s__Bifidobacterium sp022649135, 95.0, 76.54, 0.131; GCF_900094885.1, s__Bifidobacterium commune, 95.0, 76.47, 0.084; GCF_001042655.1, s__Bifidobacterium vaginale, 95.0, 76.43, 0.024; GCF_002896555.1, s__Bifidobacterium vaginale_F, 95.0, 76.29, 0.029; GCF_003951975.1, s__Bifidobacterium dolichotidis, 95.0, 76.2, 0.08; GCF_000737845.1, s__Bifidobacterium bombi, 95.0, 76.04, 0.082	55.33	11		
MEGAHIT-MetaBAT2-CAPES_S21.20.fa	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Tannerellaceae;g__Parabacteroides;s__Parabacteroides distasonis	GCF_000012845.1	95.0	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Tannerellaceae;g__Parabacteroides;s__Parabacteroides distasonis	97.42	0.881	GCF_000012845.1	95.0	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Tannerellaceae;g__Parabacteroides;s__Parabacteroides distasonis	97.42	0.881	d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Tannerellaceae;g__Parabacteroides;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCF_004793765.1, s__Parabacteroides distasonis_A, 95.0, 94.34, 0.662; GCF_011038785.1, s__Parabacteroides sp011038785, 95.0, 90.13, 0.706; GCA_022009975.1, s__Parabacteroides sp900549585, 95.0, 81.83, 0.11; GCA_905196875.1, s__Parabacteroides sp905196875, 95.0, 78.41, 0.206; GCF_900186615.1, s__Parabacteroides bouchesdurhonensis, 95.0, 78.39, 0.212; GCA_900760525.1, s__Parabacteroides sp900760525, 95.0, 78.36, 0.202; GCF_900128505.1, s__Parabacteroides timonensis, 95.0, 78.35, 0.23; GCF_000969825.1, s__Parabacteroides gordonii, 95.0, 78.34, 0.231; GCF_015550595.1, s__Parabacteroides sp900540715, 95.0, 78.31, 0.226; GCF_003480915.1, s__Parabacteroides sp003480915, 95.0, 78.27, 0.241; GCA_015060925.1, s__Parabacteroides distasonis_B, 95.0, 78.26, 0.165; GCA_025151045.1, s__Parabacteroides johnsonii, 95.0, 78.23, 0.228; GCF_900108035.1, s__Parabacteroides chinchillae, 95.0, 78.2, 0.194; GCF_014287585.1, s__Parabacteroides sp014287585, 95.0, 78.18, 0.244; GCF_014647375.1, s__Parabacteroides faecis, 95.0, 78.15, 0.239; GCF_003473295.1, s__Parabacteroides sp003473295, 95.0, 78.12, 0.181; GCF_000969835.1, s__Parabacteroides goldsteinii, 95.0, 78.09, 0.241; GCF_900155425.1, s__Parabacteroides massiliensis, 95.0, 78.08, 0.216; GCA_934725305.1, s__Parabacteroides sp934725305, 95.0, 78.08, 
0.202; GCA_025151215.1, s__Parabacteroides merdae, 95.0, 78.08, 0.217; GCA_019114945.1, s__Parabacteroides intestinigallinarum, 95.0, 78.06, 0.311; GCA_021531385.1, s__Parabacteroides distasonis_C, 95.0, 78.05, 0.2; GCF_003363715.1, s__Parabacteroides acidifaciens, 95.0, 78.05, 0.229; GCA_022775005.1, s__Parabacteroides sp022775005, 95.0, 78.04, 0.217; GCA_019417325.1, s__Parabacteroides sp019417325, 95.0, 77.98, 0.172; GCA_021204335.1, s__Parabacteroides sp021204335, 95.0, 77.97, 0.19; GCA_017937385.1, s__Parabacteroides sp017937385, 95.0, 77.96, 0.192; GCA_900541965.1, s__Parabacteroides sp900541965, 95.0, 77.93, 0.165; GCA_017481945.1, s__Parabacteroides sp017481945, 95.0, 77.93, 0.151; GCA_900770835.1, s__Parabacteroides sp900770835, 95.0, 77.93, 0.168; GCA_910577325.1, s__Parabacteroides sp910577325, 95.0, 77.86, 0.215; GCA_021531775.1, s__Parabacteroides sp004562445, 95.0, 77.86, 0.125; GCA_934718155.1, s__Parabacteroides sp934718155, 95.0, 77.84, 0.205; GCA_017560165.1, s__Parabacteroides sp017560165, 95.0, 77.69, 0.129; GCF_023744375.1, s__Parabacteroides sp900548175, 95.0, 77.69, 0.126; GCF_022809355.1, s__Parabacteroides faecavium, 95.0, 77.63, 0.111; GCA_900552465.1, s__Parabacteroides sp900552465, 95.0, 77.4, 0.2; GCA_944325345.1, s__Parabacteroides sp944325345, 95.0, 77.39, 0.095; GCA_900552415.1, s__Parabacteroides intestinipullorum, 95.0, 77.37, 0.223; GCA_020026285.1, s__Parabacteroides sp020026285, 95.0, 77.25, 0.066; GCA_944326775.1, s__Parabacteroides sp944326775, 95.0, 77.25, 0.117; GCA_019411205.1, s__Parabacteroides sp019411205, 95.0, 77.2, 0.218; GCA_900547435.1, s__Parabacteroides sp900547435, 95.0, 77.01, 0.112; GCF_002159645.1, s__Parabacteroides sp002159645, 95.0, 76.99, 0.094; GCA_944325335.1, s__Parabacteroides sp944325335, 95.0, 76.99, 0.095; GCA_019118725.1, s__Parabacteroides intestinavium, 95.0, 76.88, 0.11; GCA_944325355.1, s__Parabacteroides sp944325355, 95.0, 76.65, 0.093; GCA_944323815.1, s__Parabacteroides sp944323815, 95.0, 
76.57, 0.116	93.9	11		
MEGAHIT-MetaBAT2-CAPES_S21.23.fa	d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae_A;g__Parasutterella;s__Parasutterella gallistercoris	GCA_000980495.1	95.0	d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae_A;g__Parasutterella;s__Parasutterella gallistercoris	99.22	0.863	GCA_000980495.1	95.0	d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae_A;g__Parasutterella;s__Parasutterella gallistercoris	99.22	0.863	d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria;o__Burkholderiales;f__Burkholderiaceae_A;g__Parasutterella;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCA_900766055.1, s__Parasutterella sp900766055, 95.0, 92.93, 0.847; GCA_900552195.1, s__Parasutterella sp900552195, 95.0, 92.2, 0.82; GCF_000205025.1, s__Parasutterella excrementihominis, 95.0, 90.24, 0.901; GCA_900554375.1, s__Parasutterella sp900554375, 95.0, 85.16, 0.696; GCF_009767915.1, s__Parasutterella sp009767915, 95.0, 78.29, 0.304	85.72	11		
MEGAHIT-MetaBAT2-CAPES_S21.24.fa	d__Bacteria;p__Actinomycetota;c__Actinomycetia;o__Actinomycetales;f__Bifidobacteriaceae;g__Bifidobacterium;s__Bifidobacterium longum	GCF_000196555.1	95.0	d__Bacteria;p__Actinomycetota;c__Actinomycetia;o__Actinomycetales;f__Bifidobacteriaceae;g__Bifidobacterium;s__Bifidobacterium longum	98.32	0.912	GCF_000196555.1	95.0	d__Bacteria;p__Actinomycetota;c__Actinomycetia;o__Actinomycetales;f__Bifidobacteriaceae;g__Bifidobacterium;s__Bifidobacterium longum	98.32	0.912	d__Bacteria;p__Actinomycetota;c__Actinomycetia;o__Actinomycetales;f__Bifidobacteriaceae;g__Bifidobacterium;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCF_000269965.1, s__Bifidobacterium infantis, 95.0, 94.8, 0.811; GCA_022739095.1, s__Bifidobacterium sp022739095, 95.0, 93.61, 0.55; GCF_001025175.1, s__Bifidobacterium breve, 95.0, 87.15, 0.725; GCF_019331675.1, s__Bifidobacterium miconisargentati, 95.0, 84.44, 0.695; GCF_003129905.1, s__Bifidobacterium callitrichidarum, 95.0, 84.18, 0.711; GCF_002259745.1, s__Bifidobacterium myosotis, 95.0, 83.46, 0.687; GCF_009193305.1, s__Bifidobacterium cebidarum, 95.0, 83.43, 0.62; GCF_000741695.1, s__Bifidobacterium reuteri, 95.0, 83.28, 0.651; GCF_018555455.1, s__Bifidobacterium colobi, 95.0, 83.24, 0.624; GCF_002802915.1, s__Bifidobacterium felsineum, 95.0, 83.16, 0.622; GCF_002802865.1, s__Bifidobacterium scaligerum, 95.0, 83.0, 0.606; GCF_012932375.1, s__Bifidobacterium sp012932375, 95.0, 82.93, 0.622; GCF_000741715.1, s__Bifidobacterium saguini, 95.0, 82.75, 0.608; GCF_002860405.1, s__Bifidobacterium imperatoris, 95.0, 82.69, 0.58; GCF_019331725.1, s__Bifidobacterium saguinibicoloris, 95.0, 82.66, 0.512; GCF_018555385.1, s__Bifidobacterium santillanense, 95.0, 82.54, 0.594; GCF_008698235.1, s__Bifidobacterium rousetti, 95.0, 82.53, 0.56; GCF_002860365.1, s__Bifidobacterium parmae, 95.0, 82.51, 0.546; GCF_001025135.1, s__Bifidobacterium bifidum, 95.0, 82.47, 
0.48; GCF_010667645.1, s__Bifidobacterium platyrrhinorum, 95.0, 82.43, 0.576; GCF_000741785.1, s__Bifidobacterium stellenboschense, 95.0, 82.42, 0.568; GCF_000741175.1, s__Bifidobacterium callitrichos, 95.0, 82.41, 0.582; GCF_001417815.1, s__Bifidobacterium aesculapii, 95.0, 82.35, 0.534; GCF_001042635.1, s__Bifidobacterium scardovii, 95.0, 82.3, 0.594; GCF_018555435.2, s__Bifidobacterium amazonense, 95.0, 81.93, 0.494; GCF_009299475.1, s__Bifidobacterium avesanii, 95.0, 81.87, 0.524; GCF_009193355.1, s__Bifidobacterium leontopitheci, 95.0, 81.75, 0.484; GCF_000741165.1, s__Bifidobacterium biavatii, 95.0, 81.7, 0.502; GCF_012932425.1, s__Bifidobacterium sp012932425, 95.0, 81.69, 0.564; GCF_019331735.1, s__Bifidobacterium miconis, 95.0, 81.66, 0.526; GCF_012932365.1, s__Bifidobacterium sp012932365, 95.0, 81.64, 0.48; GCF_009299505.1, s__Bifidobacterium ramosum, 95.0, 81.49, 0.556; GCF_019331805.1, s__Bifidobacterium pongonis, 95.0, 81.49, 0.458; GCF_010667685.1, s__Bifidobacterium aerophilum, 95.0, 81.41, 0.532; GCF_000010425.1, s__Bifidobacterium adolescentis, 95.0, 81.41, 0.482; GCF_008698145.1, s__Bifidobacterium vespertilionis, 95.0, 81.4, 0.508; GCF_014898155.1, s__Bifidobacterium eulemuris, 95.0, 81.39, 0.558; GCF_012932675.1, s__Bifidobacterium sp012932675, 95.0, 81.32, 0.47; GCF_000770925.1, s__Bifidobacterium ruminantium, 95.0, 81.31, 0.414; GCF_014898175.1, s__Bifidobacterium lemurum, 95.0, 81.26, 0.568; GCF_016899725.1, s__Bifidobacterium pullorum_C, 95.0, 81.25, 0.47; GCF_019331715.1, s__Bifidobacterium simiiventris, 95.0, 81.24, 0.528; GCF_900129045.1, s__Bifidobacterium merycicum, 95.0, 81.13, 0.472; GCF_001025155.1, s__Bifidobacterium angulatum, 95.0, 81.05, 0.428; GCF_001042615.1, s__Bifidobacterium kashiwanohense, 96.3948, 80.93, 0.438; GCF_003952945.1, s__Bifidobacterium samirii, 95.0, 80.9, 0.492; GCA_002451435.1, s__Bifidobacterium sp002451435, 95.0, 80.9, 0.398; GCF_009078285.1, s__Bifidobacterium jacchi, 95.0, 80.88, 0.448; GCF_001025215.1, 
s__Bifidobacterium pseudocatenulatum, 95.0, 80.8, 0.42; GCF_000741215.1, s__Bifidobacterium pullorum_B, 95.0, 80.77, 0.472; GCF_000771405.1, s__Bifidobacterium pullorum, 95.0, 80.73, 0.46; GCF_000800455.1, s__Bifidobacterium kashiwanohense_A, 95.1942, 80.7, 0.426; GCF_002742445.1, s__Bifidobacterium sp002742445, 95.0, 80.68, 0.41; GCF_003951095.1, s__Bifidobacterium goeldii, 95.0, 80.57, 0.367; GCF_001025195.1, s__Bifidobacterium catenulatum, 96.3948, 80.57, 0.422; GCF_000522505.1, s__Bifidobacterium moukalabense, 95.0, 80.56, 0.452; GCF_019331775.1, s__Bifidobacterium phasiani, 95.0, 80.49, 0.464; GCF_001042595.1, s__Bifidobacterium dentium, 95.0, 80.47, 0.428; GCF_012932445.1, s__Bifidobacterium sp012932445, 95.0, 80.32, 0.353; GCA_002298605.1, s__Bifidobacterium sp002298605, 95.0, 80.19, 0.345; GCF_000741575.1, s__Bifidobacterium cuniculi, 95.0, 80.18, 0.371; GCF_004155535.1, s__Bifidobacterium pseudolongum_C, 95.0, 79.86, 0.331; GCF_010667615.1, s__Bifidobacterium choloepi, 95.0, 79.7, 0.351; GCA_000771265.1, s__Bifidobacterium thermophilum, 95.0, 79.69, 0.337; GCF_002259755.1, s__Bifidobacterium hapali, 95.0, 79.6, 0.305; GCF_000741535.1, s__Bifidobacterium boum, 95.0, 79.54, 0.331; GCF_000741295.1, s__Bifidobacterium globosum, 95.0, 79.44, 0.333; GCA_000741495.1, s__Bifidobacterium thermophilum_A, 95.0, 79.41, 0.345; GCF_003952025.1, s__Bifidobacterium castoris, 95.0, 79.36, 0.351; GCF_000771225.1, s__Bifidobacterium pseudolongum, 95.0, 79.34, 0.317; GCF_009299485.1, s__Bifidobacterium apri, 95.0, 79.34, 0.307; GCF_000741135.1, s__Bifidobacterium choerinum, 95.0, 79.31, 0.331; GCF_002286915.1, s__Bifidobacterium italicum, 95.0, 79.15, 0.327; GCF_000741255.1, s__Bifidobacterium magnum, 95.0, 79.11, 0.267; GCF_002860345.1, s__Bifidobacterium anseris, 95.0, 78.98, 0.313; GCF_009371885.1, s__Bifidobacterium tibiigranuli, 95.0, 78.92, 0.331; GCF_000741285.1, s__Bifidobacterium mongoliense, 95.0, 78.9, 0.295; GCF_000260715.1, s__Bifidobacterium animalis, 95.0, 
78.88, 0.317; GCF_000741775.1, s__Bifidobacterium subtile, 95.0, 78.83, 0.337; GCA_024104275.1, s__Bifidobacterium sp024104275, 95.0, 78.79, 0.297; GCF_009727065.1, s__Bifidobacterium canis, 95.0, 78.69, 0.299; GCA_022640935.1, s__Bifidobacterium sp022640935, 95.0, 78.68, 0.275; GCF_000741205.1, s__Bifidobacterium gallicum, 95.0, 78.4, 0.213; GCF_002286935.1, s__Bifidobacterium criceti, 95.0, 78.38, 0.275; GCF_003315635.1, s__Bifidobacterium aemilianum, 95.0, 78.28, 0.255; GCF_000741525.1, s__Bifidobacterium bohemicum, 95.0, 78.09, 0.217; GCF_003951975.1, s__Bifidobacterium dolichotidis, 95.0, 78.07, 0.124; GCF_000263635.1, s__Bifidobacterium vaginale_C, 95.0, 77.97, 0.055; GCA_022640915.1, s__Bifidobacterium sp022640915, 95.0, 77.92, 0.207; GCF_000741765.1, s__Bifidobacterium tsurumiense, 95.0, 77.88, 0.175; GCA_022649135.1, s__Bifidobacterium sp022649135, 95.0, 77.87, 0.229; GCF_003408845.1, s__Bifidobacterium vaginale_H, 95.0, 77.83, 0.054; GCF_900094885.1, s__Bifidobacterium commune, 95.0, 77.62, 0.167; GCF_000737845.1, s__Bifidobacterium bombi, 95.0, 77.56, 0.106; GCA_003585735.1, s__Bifidobacterium sp003585735, 95.0, 77.47, 0.061; GCF_001546455.1, s__Bifidobacterium vaginale_B, 95.0, 77.16, 0.062; GCF_002896555.1, s__Bifidobacterium vaginale_F, 95.0, 77.15, 0.057; GCF_001042655.1, s__Bifidobacterium vaginale, 95.0, 76.93, 0.05; GCF_003397585.1, s__Bifidobacterium piotii, 95.0, 76.84, 0.058	65.44	11		
MEGAHIT-MetaBAT2-CAPES_S21.7.fa	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Agathobacter;s__Agathobacter rectalis	GCA_000020605.1	95.0	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Agathobacter;s__Agathobacter rectalis	97.75	0.909	GCA_000020605.1	95.0	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Agathobacter;s__Agathobacter rectalis	97.75	0.909	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Agathobacter;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCA_900546625.1, s__Agathobacter sp900546625, 95.0, 94.74, 0.826; GCA_900317585.1, s__Agathobacter sp900317585, 95.0, 94.67, 0.733; GCA_934365945.1, s__Agathobacter sp934365945, 95.0, 92.86, 0.692; GCA_905209075.1, s__Agathobacter sp905209075, 95.0, 81.41, 0.379; GCA_902363675.1, s__Agathobacter sp000434275, 95.0, 80.51, 0.209; GCF_001406815.1, s__Agathobacter faecis, 95.0, 79.72, 0.201; GCA_934359755.1, s__Agathobacter sp934359755, 95.0, 79.26, 0.202; GCA_900550545.1, s__Agathobacter sp900550545, 95.0, 79.12, 0.167; GCA_900549895.1, s__Agathobacter sp900549895, 95.0, 78.88, 0.144; GCA_900548765.1, s__Agathobacter sp900548765, 95.0, 78.82, 0.258; GCA_022767075.1, s__Agathobacter sp022767075, 95.0, 78.69, 0.122; GCA_900552085.1, s__Agathobacter sp900552085, 95.0, 78.43, 0.164; GCA_017409825.1, s__Agathobacter sp017409825, 95.0, 78.31, 0.081; GCA_934316545.1, s__Agathobacter sp934316545, 95.0, 78.3, 0.143; GCA_002474415.1, s__Agathobacter sp002474415, 95.0, 78.29, 0.163; GCF_002735305.1, s__Agathobacter ruminis, 95.0, 78.27, 0.1; GCA_022772675.1, s__Agathobacter sp022772675, 95.0, 78.25, 0.131; GCA_900543445.1, s__Agathobacter sp900543445, 95.0, 78.22, 0.155; GCA_022769125.1, s__Agathobacter sp022769125, 95.0, 78.13, 0.138; GCA_017936725.1, s__Agathobacter sp017936725, 95.0, 78.12, 0.099; GCA_022775045.1, 
s__Agathobacter sp022775045, 95.0, 78.04, 0.141; GCA_015057085.1, s__Agathobacter sp015057085, 95.0, 77.95, 0.099; GCA_934307695.1, s__Agathobacter sp934307695, 95.0, 77.93, 0.142; GCA_934358675.1, s__Agathobacter sp934358675, 95.0, 77.91, 0.167; GCA_015057035.1, s__Agathobacter sp015057035, 95.0, 77.8, 0.092; GCA_017623935.1, s__Agathobacter sp017623935, 95.0, 77.74, 0.106; GCA_910587655.1, s__Agathobacter sp910587655, 95.0, 77.68, 0.075; GCA_945868435.1, s__Agathobacter sp945868435, 95.0, 77.62, 0.131; GCA_944381555.1, s__Agathobacter sp944381555, 95.0, 77.59, 0.091; GCA_023666425.1, s__Agathobacter sp023666425, 95.0, 77.37, 0.076; GCA_900316805.1, s__Agathobacter sp900316805, 95.0, 77.24, 0.13; GCA_017481285.1, s__Agathobacter sp017481285, 95.0, 77.18, 0.076; GCA_023663685.1, s__Agathobacter sp023663685, 95.0, 76.96, 0.065; GCA_017523185.1, s__Agathobacter sp017523185, 95.0, 76.53, 0.05; GCA_017407605.1, s__Agathobacter sp017407605, 95.0, 76.43, 0.053; GCA_022511395.1, s__Agathobacter sp022511395, 95.0, 76.27, 0.062; GCA_021199135.1, s__Agathobacter sp021199135, 95.0, 75.91, 0.033	88.1	11		
MEGAHIT-MetaBAT2-CAPES_S21.8.fa	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Enterocloster;s__Enterocloster bolteae	GCF_002234575.2	95.0	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Enterocloster;s__Enterocloster bolteae	98.87	0.863	GCF_002234575.2	95.0	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Enterocloster;s__Enterocloster bolteae	98.87	0.863	d__Bacteria;p__Bacillota_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Enterocloster;s__	taxonomic classification defined by topology and ANI	topological placement and ANI have congruent species assignments	GCF_000424325.1, s__Enterocloster clostridioformis_A, 95.0, 91.54, 0.561; GCF_020297485.1, s__Enterocloster clostridioformis, 95.0, 91.11, 0.624; GCA_021201905.1, s__Enterocloster sp021201905, 95.0, 90.61, 0.775; GCF_005845215.1, s__Enterocloster sp005845215, 95.0, 88.11, 0.763; GCF_003434055.1, s__Enterocloster aldenensis, 95.0, 80.37, 0.407; GCA_000155435.1, s__Enterocloster pacaense, 95.0, 80.34, 0.39; GCF_000233455.1, s__Enterocloster citroniae, 95.0, 79.58, 0.371; GCA_900555045.1, s__Enterocloster sp900555045, 95.0, 78.68, 0.186; GCF_020554865.1, s__Enterocloster sp900753815, 95.0, 78.5, 0.229; GCA_001304855.1, s__Enterocloster sp001304855, 95.0, 78.44, 0.271; GCA_900770345.1, s__Enterocloster sp900770345, 95.0, 78.43, 0.165; GCF_009696375.1, s__Enterocloster porci, 95.0, 78.15, 0.157; GCA_025149125.1, s__Enterocloster asparagiformis, 95.0, 78.14, 0.197; GCF_902364025.1, s__Enterocloster lavalensis, 95.0, 78.11, 0.245; GCA_900547035.1, s__Enterocloster excrementigallinarum, 95.0, 78.05, 0.11; GCA_900549235.1, s__Enterocloster sp900549235, 95.0, 78.02, 0.175; GCA_900551225.1, s__Enterocloster sp900551225, 95.0, 77.87, 0.219; GCF_001517625.2, s__Enterocloster sp001517625, 95.0, 77.7, 0.139; GCF_013282095.1, s__Enterocloster sp900538485, 95.0, 77.63, 0.15; GCA_944382045.1, s__Enterocloster 
sp944382045, 95.0, 77.63, 0.16; GCA_018380885.1, s__Enterocloster sp900555905, 95.0, 77.61, 0.238; GCA_944384065.1, s__Enterocloster sp944384065, 95.0, 77.51, 0.146; GCA_019119575.1, s__Enterocloster excrementipullorum, 95.0, 77.44, 0.148; GCA_019118585.1, s__Enterocloster faecavium, 95.0, 77.44, 0.117; GCF_003473545.1, s__Enterocloster sp000431375, 95.0, 77.37, 0.098; GCA_944376135.1, s__Enterocloster sp944376135, 95.0, 77.31, 0.129; GCA_900543885.1, s__Enterocloster sp900543885, 95.0, 77.11, 0.097; GCA_944383785.1, s__Enterocloster sp944383785, 95.0, 77.06, 0.106; GCA_900759225.1, s__Enterocloster sp900759225, 95.0, 76.81, 0.159; GCA_900541315.1, s__Enterocloster sp900541315, 95.0, 76.78, 0.089	51.22	11		

@prototaxites
Contributor

The GTDB-Tk process only takes bins which pass a given (user-specifiable) quality threshold (a minimum completeness of 50% and maximum contamination of 10%). Could it be that none of the bins from the samples that aren't going into GTDB-Tk meet these thresholds?
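To check this outside the pipeline, the threshold filter described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual code: the column names (`bin`, `completeness`, `contamination`) are hypothetical placeholders that would need to be mapped to the real columns of the QC summary, and treating the thresholds as inclusive is an assumption.

```python
import csv
import io


def passing_bins(tsv_text, min_completeness=50.0, max_contamination=10.0,
                 name_col="bin", comp_col="completeness",
                 cont_col="contamination"):
    """Return names of bins meeting both thresholds.

    Column names are hypothetical; adapt them to the actual summary file.
    Thresholds are treated as inclusive, which is an assumption.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    keep = []
    for row in reader:
        try:
            completeness = float(row[comp_col])
            contamination = float(row[cont_col])
        except (KeyError, TypeError, ValueError):
            # Skip rows with missing or non-numeric QC values.
            continue
        if completeness >= min_completeness and contamination <= max_contamination:
            keep.append(row[name_col])
    return keep


# Made-up rows, for illustration only (not taken from the attached files).
example = (
    "bin\tcompleteness\tcontamination\n"
    "A.fa\t93.9\t1.2\n"
    "B.fa\t45.0\t0.5\n"
    "C.fa\t80.0\t12.0\n"
)
print(passing_bins(example))
```

If bins from more than one sample survive this filter but only one sample reaches GTDBTK_CLASSIFYWF, the thresholds alone do not explain the behavior.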

@amizeranschi
Contributor Author

@prototaxites

I've also observed the same behavior when running the pipeline on my own data (7 samples), where I only had GTDBtk running on a single sample. I think it's highly unlikely that only that specific sample had bins passing the quality thresholds, and none of the other 6 samples did...

@prototaxites
Contributor

Certainly sounds like it could be a bug in that case - could you double-check the bin qualities from the busco_summary.tsv file first, just to make sure? If there are bins passing the default threshold there that might help in figuring out where the problem is.

@amizeranschi
Contributor Author

Had a look now inside bin_summary.tsv (uploaded here, below) and there are definitely bins from the other samples that pass the default quality thresholds for completeness and maximum contamination.

I'm also uploading a screenshot with some filters I made in Excel.
bin_summary.tsv.txt
Screenshot

@amizeranschi
Contributor Author

@jfy133

Here is the log file for the single GTDBtk job that was run for the data posted above, as well as the full log and execution trace for the pipeline run.

command.log.txt
command.sh.txt
nextflow.log.txt
execution_trace_2024-07-25_03-36-39.txt

@amizeranschi
Contributor Author

@prototaxites
I've just realized that I was filtering earlier on the wrong column for maximum contamination (%Missing (specific) instead of %Complete and duplicated (specific)). Even after fixing this mistake, several bins from the two missing samples still pass the thresholds.
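A quick way to see which samples the passing bins belong to is to group the bin names by sample. The sketch below assumes bin files are named `<assembler>-<binner>-<sample>.<number>.fa`, as in the GTDB-Tk summary rows above (for example `MEGAHIT-MetaBAT2-CAPES_S21.20.fa`); the helper names are hypothetical and other naming schemes would need a different parser.

```python
from collections import Counter


def sample_of(bin_name):
    """Extract the sample ID from a name like 'MEGAHIT-MetaBAT2-CAPES_S21.20.fa'.

    Assumes the '<assembler>-<binner>-<sample>.<number>.fa' pattern; falls back
    to the stem if the name does not have three dash-separated parts.
    """
    stem = bin_name.rsplit(".", 2)[0]   # drop '.<number>.fa'
    parts = stem.split("-", 2)          # assembler, binner, sample
    return parts[2] if len(parts) == 3 else stem


def bins_per_sample(bin_names):
    """Count how many bins each sample contributes."""
    return Counter(sample_of(name) for name in bin_names)


counts = bins_per_sample([
    "MEGAHIT-MetaBAT2-CAPES_S21.20.fa",
    "MEGAHIT-MetaBAT2-CAPES_S21.23.fa",
    "MEGAHIT-MetaBAT2-CAPES_S7.3.fa",   # hypothetical bin name
])
print(counts)
```

Applied to the list of bins that pass the thresholds, a result with more than one sample key indicates that more than one sample should have been classified.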

@seoic

seoic commented Jul 31, 2024

I’m dealing with the same problem. I tried several times by modifying the quality thresholds for GTDB, and found that only the first bin meeting the thresholds gets processed by GTDB-tk classifywf. It doesn’t process all the bins that meet the thresholds.

@prototaxites
Contributor

I've done some brute-force print debugging (sorry manuscript...) and I can't find any obvious fault with the code in the GTDBTK subworkflow that collects the QC metrics, matches them to the input channel, and filters using the provided thresholds. At least with the test data (with a low enough min_completeness threshold), bins from both test samples should get through to the GTDBTK_CLASSIFYWF process.

So I'm not really sure what's up here...

@amizeranschi
Contributor Author

@prototaxites

Thanks a lot for looking into this. By any chance, have you tried running the commands from the first post above, and did you manage to reproduce the issue in this way?

@prototaxites
Contributor

Afraid I didn’t try to reproduce your example above as I was poking about inside a Gitpod instance - which doesn’t have enough juice to actually run GTDB! But I had hoped to catch any obvious problems with the filtering…

@seoic

seoic commented Jul 31, 2024

I also provided a directory for --gtdb_db, as amizeranschi did, and only the first bin that met the GTDB threshold was processed by GTDBTK_CLASSIFYWF. So I tried again by providing a tar.gz file for --gtdb_db, and this time it worked: all bins that met the thresholds were processed.

@amizeranschi
Contributor Author

I can also confirm that running the pipeline with --gtdb_db gtdbtk_r214_data.tar.gz instead of passing the uncompressed directory made things work properly and the outputs include results for bins from all the samples.

I'm guessing that the quickest "fix" for this issue would be to change the docs to include instructions for only passing the tar.gz archive to the --gtdb_db parameter here:

https://github.com/nf-core/mag/blob/dev/nextflow_schema.json#L539

@prototaxites
Contributor

prototaxites commented Aug 1, 2024

Ah - I think I found the problem! Could one of you test this fix with gtdb directory input, replacing the standard nextflow run command with the below?

nextflow run prototaxites/mag -r fix_gtdb_dir_input [rest of command]

edit: updated the fix

@amizeranschi
Contributor Author

@prototaxites

I ran your branch now (took 7 hours to run on the data from the first post above) with --gtdb_db gtdbtk_r214_data/release214/ and it worked fine.
