MAG_DEPTHS_PLOT fails if only a single bin is produced #638

Closed
d-callan opened this issue Jul 17, 2024 · 3 comments · Fixed by #639
@d-callan
Contributor

Description of the bug

If only a single MAG/bin is produced upstream, generating the depths heatmap fails because seaborn's clustermap can't calculate a distance matrix from a single row.

Command used and terminal output

cmd:

nextflow run nf-core/mag \
		 -c mag.config \
		 --input mag_samplesheet.csv \
		 --outdir mag_out \
		 -work-dir mag_work \
		 -r 3.0.1 \
		 --skip_gtdbtk \
		 --skip_spades \
		 --skip_spadeshybrid \
		 --skip_concoct \
		 --kraken2_db k2_pluspf_20240112.tar.gz \
		 --genomad_db genomad_db

ex depths file:

bin     3114327 3101140 3106384 3106237 3108596
MEGAHIT-MetaBAT2-3114327.1.fa   7.610989999999999       55.530100000000004      69.84535        12.7621 21.8551

error:

Traceback (most recent call last):
  File "/home/dcallan/.nextflow/assets/nf-core/mag/bin/plot_mag_depths.py", line 83, in <module>
    sys.exit(main())
  File "/home/dcallan/.nextflow/assets/nf-core/mag/bin/plot_mag_depths.py", line 70, in main
    sns.clustermap(
  File "/usr/local/lib/python3.9/site-packages/seaborn/_decorators.py", line 46, in inner_f
    return f(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/seaborn/matrix.py", line 1402, in clustermap
    return plotter.plot(metric=metric, method=method,
  File "/usr/local/lib/python3.9/site-packages/seaborn/matrix.py", line 1220, in plot
    self.plot_dendrograms(row_cluster, col_cluster, metric, method,
  File "/usr/local/lib/python3.9/site-packages/seaborn/matrix.py", line 1065, in plot_dendrograms
    self.dendrogram_row = dendrogram(
  File "/usr/local/lib/python3.9/site-packages/seaborn/_decorators.py", line 46, in inner_f
    return f(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/seaborn/matrix.py", line 784, in dendrogram
    plotter = _DendrogramPlotter(data, linkage=linkage, axis=axis,
  File "/usr/local/lib/python3.9/site-packages/seaborn/matrix.py", line 594, in __init__
    self.linkage = self.calculated_linkage
  File "/usr/local/lib/python3.9/site-packages/seaborn/matrix.py", line 661, in calculated_linkage
    return self._calculate_linkage_scipy()
  File "/usr/local/lib/python3.9/site-packages/seaborn/matrix.py", line 629, in _calculate_linkage_scipy
    linkage = hierarchy.linkage(self.array, method=self.method,
  File "/usr/local/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 1068, in linkage
    n = int(distance.num_obs_y(y))
  File "/usr/local/lib/python3.9/site-packages/scipy/spatial/distance.py", line 2572, in num_obs_y
    raise ValueError("The number of observations cannot be determined on "
ValueError: The number of observations cannot be determined on an empty distance matrix.
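
A small illustration of why the distance matrix ends up empty: hierarchical clustering works on pairwise distances between rows, and n rows give n * (n - 1) / 2 pairs, so one row gives none. This is only a sketch using the standard library; the helper name `condensed_distance_count` is made up for illustration and is not part of seaborn, scipy, or nf-core/mag.

```python
from itertools import combinations


def condensed_distance_count(n_observations: int) -> int:
    """Number of pairwise distances among n observations: n * (n - 1) / 2."""
    return n_observations * (n_observations - 1) // 2


# Depth values copied from the single-bin depths file in this report.
single_bin = [
    [7.610989999999999, 55.530100000000004, 69.84535, 12.7621, 21.8551],
]

# Every distinct pair of row indices; with a single row there are no pairs,
# so there is nothing to put into a distance matrix.
row_pairs = list(combinations(range(len(single_bin)), 2))

print(len(row_pairs))                 # pairs for the one-bin case
print(condensed_distance_count(3))    # pairs if three bins had been produced
```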


Relevant files

No response

System information

LSF Cluster using Apptainer
@d-callan d-callan added the bug Something isn't working label Jul 17, 2024
@d-callan d-callan self-assigned this Jul 17, 2024
@jfy133
Member

jfy133 commented Jul 18, 2024

@d-callan thank you for this!

To better understand the problem, could you also share the contents of .command.sh, and check the contents of the input depth files that don't have any values (given that only a single sample gets coverage)?

@d-callan
Contributor Author

d-callan commented Jul 18, 2024

the command:

plot_mag_depths.py --bin_depths MEGAHIT-MetaBAT2-3114327-binDepths.tsv --groups sample_groups.tsv --out "MEGAHIT-MetaBAT2-3114327-binDepths.heatmap.png"

cat <<-END_VERSIONS > versions.yml
"NFCORE_MAG:MAG:DEPTHS:MAG_DEPTHS_PLOT":
    python: $(python --version 2>&1 | sed 's/Python //g')
    pandas: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('pandas').version)")
    seaborn: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('seaborn').version)")
END_VERSIONS

The other depth files actually look better than this one. What I posted above were the contents of the file MEGAHIT-MetaBAT2-3114327-binDepths.tsv from the above command.

To confirm my suspicion that the error is caused by this file having only a single row, I modified my local copy of mag to attempt the clustermap only if len(df.index) > 1. Running the command manually then works.
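
The guard described above could look roughly like the sketch below. It is not the code from plot_mag_depths.py or from the linked PR: the helper names `count_bins` and `should_plot_clustermap` are hypothetical, the depths file is assumed to be tab-separated with one header line, and the standard library `csv` module stands in for the script's DataFrame so the example runs without pandas or seaborn.

```python
import csv
import io


def count_bins(handle) -> int:
    """Count data rows (bins) in a tab-separated depths table with a header."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header line ("bin" plus one column per sample)
    return sum(1 for row in reader if row)


def should_plot_clustermap(n_bins: int) -> bool:
    """Row clustering needs at least two bins to compare (hypothetical helper)."""
    return n_bins > 1


# The single-bin example from this issue, assumed to be tab-separated.
depths_text = (
    "bin\t3114327\t3101140\t3106384\t3106237\t3108596\n"
    "MEGAHIT-MetaBAT2-3114327.1.fa\t7.610989999999999\t55.530100000000004"
    "\t69.84535\t12.7621\t21.8551\n"
)

n_bins = count_bins(io.StringIO(depths_text))

if should_plot_clustermap(n_bins):
    print("enough bins: draw the clustered heatmap")
else:
    print("single bin: skip the clustered heatmap")
```

In the real script the same check would be `len(df.index) > 1` on the DataFrame. An alternative, not tested in this thread, would be to keep the plot and pass `row_cluster=False` to `sns.clustermap` when there is only one bin.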

@d-callan d-callan linked a pull request Jul 18, 2024 that will close this issue
11 tasks
jfy133 added a commit that referenced this issue Aug 16, 2024
Fix #638 [MAG_DEPTHS_PLOT fails if only a single bin is produced]
@jfy133
Member

jfy133 commented Aug 16, 2024

Thank you very much @d-callan for both the issue and PR (even better <3)!

@jfy133 jfy133 closed this as completed Aug 16, 2024