KeyError in study_data/mock_2_key.csv #2

Closed
Daniel-VM opened this issue Feb 14, 2024 · 1 comment

Daniel-VM commented Feb 14, 2024

Running the example from the main README.md file raises a KeyError for study_data/mock_2_key.csv. The script expects a column named Sample, but the corresponding column in mock_2_key.csv is named Strain.

Installed version

conda install -c bioconda hamroaster 

Command

hAMRoaster --ham_out study_data/ham_sum.tsv  \
           --name test1 \
           --AMR_key study_data/mock_2_key.csv \
           --db_files db_files

conda install -c bioconda hAMRoaster

error log:

Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
        
  import pandas as pd
Traceback (most recent call last):
  File "/opt/miniconda3/envs/hamroaster/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "index.pyx", line 153, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 182, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Sample'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/miniconda3/envs/hamroaster/bin/hAMRoaster", line 57, in <module>
    mock['Sample'] = mock['Sample'].str.lower()
                     ~~~~^^^^^^^^^^
  File "/opt/miniconda3/envs/hamroaster/lib/python3.12/site-packages/pandas/core/frame.py", line 4090, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/hamroaster/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 3809, in get_loc
    raise KeyError(key) from err
KeyError: 'Sample'
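
The traceback shows the script indexing `mock['Sample']` directly, so a key file without that header fails at that line. A minimal sketch of a pre-flight check follows, using only the standard library. The helper name `missing_columns` and the in-memory CSV contents are hypothetical stand-ins, not part of hAMRoaster, and the real key file may have different columns.

```python
import csv
import io


def missing_columns(handle, required=("Sample",)):
    """Return the required column names that are absent from a CSV header."""
    # Read only the first row; an empty file yields an empty header.
    header = next(csv.reader(handle), [])
    return [name for name in required if name not in header]


# Hypothetical stand-in for a key file that has Strain but no Sample column.
key_csv = io.StringIO("Strain,Drug\nstrain_a,drug_x\n")
missing = missing_columns(key_csv)
if missing:
    print(f"key file is missing required column(s): {missing}")
```

Running a check like this before calling the tool turns the pandas KeyError into a message that names the missing column.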

Owner

ewissel commented Mar 8, 2024

Hi! Thanks for your message.

hAMRoaster v2 requires a "Sample" column that hAMRoaster v1 did not. I've now added the column to the data file on GitHub here. The data was simulated such that all clinical isolates are in one sample, so the "Sample" column for this dataset has only one value: one. This column is more useful for simulated data or input data with multiple samples, such as the low resistance dataset referenced in the preprint.
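
For anyone with their own v1-style key file, a rough sketch of adding the missing column might look like the following. It assumes a single-sample dataset and uses the placeholder value `"one"`, taken from the description above; the exact value in the updated file is an assumption, and `add_sample_column` is a hypothetical helper, not part of hAMRoaster.

```python
import csv
import io


def add_sample_column(src, dst, value="one"):
    """Copy a CSV from src to dst, adding a constant Sample column if absent."""
    reader = csv.DictReader(src)
    fields = list(reader.fieldnames or [])
    if "Sample" not in fields:
        fields.append("Sample")
    writer = csv.DictWriter(dst, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Existing Sample values are kept; only missing ones get the default.
        row.setdefault("Sample", value)
        writer.writerow(row)


# Hypothetical v1-style key file contents, held in memory for illustration.
src = io.StringIO("Strain,Drug\nstrain_a,drug_x\n")
dst = io.StringIO()
add_sample_column(src, dst)
print(dst.getvalue())
```

With real files, `src` and `dst` would be handles from `open(..., newline="")`. Datasets with multiple samples need real per-row sample labels instead of a constant.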

Let me know if you have any additional issues. Closing for now as the added column should solve the issue.

ewissel closed this as completed Mar 8, 2024