You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hello while running haystac --mode abundances on my data I got this following snakemake error :
Error in rule calculate_dirichlet_abundances:
jobid: 0
output: /crex/proj/snic2022-6-144/nobackup/user/Meta_project/Virus/Haystac_analysis/Scripts/Herpesvirus_sequences/L010_Herpesvirus_db_output/probabilities/L010/L010_posterior_abundance.tsv
log: /crex/proj/snic2022-6-144/nobackup/user/Meta_project/Virus/Haystac_analysis/Scripts/Herpesvirus_sequences/L010_Herpesvirus_db_output/probabilities/L010/L010_posterior_abundance.log (check log file(s) for error message)
conda-env: /home/user/haystac/cache/conda/3983e9d32a119d531dde75cf1fd7d0a0
RuleException:
CalledProcessError in line 169 of /home/user/.conda/envs/haystac/lib/python3.6/site-packages/haystac/workflow/rules/metagenomics.smk:
Command 'source /sw/apps/conda/latest/rackham_stage/bin/activate '/home/user/haystac/cache/conda/3983e9d32a119d531dde75cf1fd7d0a0'; set -euo pipefail; python /crex/proj/snic2022-6-144/nobackup/user/Meta_project/Virus/Haystac_analysis/Scripts/Herpesvirus_sequences/.snakemake/scripts/tmpjizpoz5a.calculate_dirichlet_abundances.py' returned non-zero exit status 1.
File "/home/user/.conda/envs/haystac/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 2352, in run_wrapper
File "/home/user/.conda/envs/haystac/lib/python3.6/site-packages/haystac/workflow/rules/metagenomics.smk", line 169, in __rule_calculate_dirichlet_abundances
File "/home/user/.conda/envs/haystac/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 569, in _callback
File "/home/user/.conda/envs/haystac/lib/python3.6/concurrent/futures/thread.py", line 56, in run
File "/home/user/.conda/envs/haystac/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 555, in cached_or_run
File "/home/user/.conda/envs/haystac/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 2364, in run_wrapper
Trying to restart job 0.`
And when looking at the log file I see :
144/nobackup/user/Meta_project/Virus/Haystac_analysis/Scripts/Herpesvirus_sequences/L010_Herpesvirus_db_output/probabilities/L010/L010_posterior_abundance.log
Traceback (most recent call last):
File "/crex/proj/snic2022-6-144/nobackup/user/Meta_project/Virus/Haystac_analysis/Scripts/Herpesvirus_sequences/.snakemake/scripts/tmp5j1s7h8q.calculate_dirichlet_abu
ndances.py", line 145, in <module>
calculate_dirichlet_abundances(
File "/crex/proj/snic2022-6-144/nobackup/user/Meta_project/Virus/Haystac_analysis/Scripts/Herpesvirus_sequences/.snakemake/scripts/tmp5j1s7h8q.calculate_dirichlet_abu
ndances.py", line 84, in calculate_dirichlet_abundances
a = ts_tv_matrix.groupby("Taxon").sum().squeeze().astype(float)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/pandas/core/generic.py", line 6324, in astype
new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/pandas/core/internals/managers.py", line 451, in astype
return self.apply(
^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/pandas/core/internals/managers.py", line 352, in apply
applied = getattr(b, f)(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/pandas/core/internals/blocks.py", line 511, in astype
new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/pandas/core/dtypes/astype.py", line 242, in astype_array_safe
new_values = astype_array(values, dtype, copy=copy)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/pandas/core/dtypes/astype.py", line 187, in astype_array
values = _astype_nansafe(values, dtype, copy=copy)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/.local/lib/python3.11/site-packages/pandas/core/dtypes/astype.py", line 138, in _astype_nansafe
return arr.astype(dtype, copy=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not convert string to float: 'ERR5024913.26762701ERR5024914.36341165ERR5024913.39738624ERR5024914.9765425ERR5024914.48021130ERR5024913.12985897ERR5024913.17020599
So I looked into the haystac code and
On line 78 in the code of calculate_dirichlet_abundances.py I found this :
if len(ts_tv_matrix.Taxon.unique()) > 1:
a = ts_tv_matrix.groupby("Taxon").sum().squeeze().astype(float)
else:
a = ts_tv_matrix.groupby("Taxon").sum().iloc[:, 0].astype(float)
a.loc["Grey_Matter"] = grey_matter.sum()
It looks like that it is this part that gives this error : ValueError: could not convert string to float: 'ERR5024913.26762701ERR5024914.36341165ERR5024913.39738624ERR5024914.9765425ERR5024914.48021130ERR5024913.12985897ERR5024913.17020599
Because the ts_tv_matrix at this step looks like that :
So it looks like my tsv files does look like has it should our did I missed something ?
Thanks for your help.
Here is the full haystac code I used : haystac analyse --mode abundances --database Herpesvirus_db --sample L010 --output L010_Herpesvirus_db_output --cores 15
The text was updated successfully, but these errors were encountered:
I hope you are doing great, and apologies for not getting back to you earlier !
You are right in terms of which line is causing that ValueError to be raised, and your ts_tv_matrix also looks fine.
I have been trying to reproduce your error locally, using the the example snippet of your ts_tv_matrix . I have managed to reproduce your error only when I use a version of pandas higher than the one that is specified in the yaml file used to create the /home/user/haystac/cache/conda/3983e9d32a119d531dde75cf1fd7d0a0conda env. In that yaml file the pandas version is specified to be 1.0.3, and it appears you might be using a version that is >= 1.4.1. Which makes sense as your python version within that env also changes to python3.11. Unfortunately I am not sure how this happened as whenever I am creating that env locally, it gets built correctly.
I think if you downgrade your pandas, or build that env with the correct dependency versions (as pinned in the envs/scipy.yaml), your analysis should work.
That being said, thank you for raising this issue ! It's a good reason to update the versions of some dependencies, which I will get round to doing soon.
Hello while running haystac --mode abundances on my data I got this following snakemake error :
And when looking at the log file I see :
So I looked into the haystac code and
On line 78 in the code of calculate_dirichlet_abundances.py I found this :
It looks like that it is this part that gives this error :
ValueError: could not convert string to float: 'ERR5024913.26762701ERR5024914.36341165ERR5024913.39738624ERR5024914.9765425ERR5024914.48021130ERR5024913.12985897ERR5024913.17020599
Because the ts_tv_matrix at this step looks like that :
So it looks like my tsv files does look like has it should our did I missed something ?
Thanks for your help.
Here is the full haystac code I used :
haystac analyse --mode abundances --database Herpesvirus_db --sample L010 --output L010_Herpesvirus_db_output --cores 15
The text was updated successfully, but these errors were encountered: