KeyError in study_data/mock_2_key.csv #2

Daniel-VM · 2024-02-14T12:24:09Z

Running the example from the main README.md file raises a KeyError for study_data/mock_2_key.csv. The script expects a column named Sample, but mock_2_key.csv has a column named Strain instead.

Installed version

conda install -c bioconda hamroaster

Command

hAMRoaster --ham_out study_data/ham_sum.tsv  \
           --name test1 \
           --AMR_key study_data/mock_2_key.csv \
           --db_files db_files

conda install -c bioconda hAMRoaster

error log:

Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
        
  import pandas as pd
Traceback (most recent call last):
  File "/opt/miniconda3/envs/hamroaster/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "index.pyx", line 153, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 182, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Sample'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/miniconda3/envs/hamroaster/bin/hAMRoaster", line 57, in <module>
    mock['Sample'] = mock['Sample'].str.lower()
                     ~~~~^^^^^^^^^^
  File "/opt/miniconda3/envs/hamroaster/lib/python3.12/site-packages/pandas/core/frame.py", line 4090, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/hamroaster/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 3809, in get_loc
    raise KeyError(key) from err
KeyError: 'Sample'
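
The failing line in the traceback indexes `mock['Sample']` directly, so a key file without that column raises `KeyError`. A minimal sketch of a check to run before invoking the tool, assuming the key file is a comma-separated file that pandas can read. The helper name `missing_columns`, the `Drug` column, and the inline row are hypothetical stand-ins, not part of hAMRoaster or the real data file:

```python
import io

import pandas as pd


def missing_columns(df, required=("Sample",)):
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


# Hypothetical stand-in for a key file that has Strain but no Sample column.
key_csv = io.StringIO("Strain,Drug\nisolate_a,drug_x\n")
mock = pd.read_csv(key_csv)

# A non-empty list means mock['Sample'] would raise KeyError.
print(missing_columns(mock))
```

To check the real file, replace the `io.StringIO` object with the path passed to `--AMR_key`.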

ewissel · 2024-03-08T21:49:14Z

Hi! Thanks for your message.

hAMRoaster v2 requires a Sample column that hAMRoaster v1 did not. I've now added the column to the data file on GitHub. The data was simulated such that all clinical isolates are in one sample, so the "Sample" column for this dataset has only one value: one. This column is more useful for simulated or input data with multiple samples, such as the low resistance dataset referenced in the preprint.
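
For anyone working from an older local copy of the key file, the same change can be sketched with pandas. This is an illustration under assumptions, not the maintainer's actual commit: the helper name `ensure_sample_column` and the example rows are hypothetical, and the default value `"one"` is taken from the description above.

```python
import pandas as pd


def ensure_sample_column(df, default="one"):
    """Return a copy of df with a Sample column, filled with `default` if absent."""
    out = df.copy()
    if "Sample" not in out.columns:
        # Every row gets the same value because all isolates share one sample.
        out["Sample"] = default
    return out


# Hypothetical rows standing in for a key file that only has a Strain column.
key = pd.DataFrame({"Strain": ["isolate_a", "isolate_b"]})
fixed = ensure_sample_column(key)
```

Writing the result back with `fixed.to_csv(path, index=False)` would give a file that the `mock['Sample']` lookup can read; a dataset with multiple samples would need real per-row values rather than a single default.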

Let me know if you have any additional issues. Closing for now as the added column should solve the issue.

ewissel closed this as completed Mar 8, 2024
