Annotation importation dataframe has a "duration" column #479

Closed
lucasgautheron opened this issue Jul 7, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@lucasgautheron
Collaborator

Importing annotations fails when the input dataframe already has a "duration" column. This notably happens with the EL1000 importer: https://gin.g-node.org/LAAC-LSCP/tools/src/master/EL1000/annotations.py

The error is:

Traceback (most recent call last):
  File "/home/lgautheron/.conda/envs/stan/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 165, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5745, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5753, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'duration'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/scratch2/lgautheron/data/kidd-vtc/scripts/import_vtc.py", line 14, in <module>
    importer.process(
  File "/home/lgautheron/.conda/envs/stan/lib/python3.12/site-packages/EL1000/annotations.py", line 66, in process
    self.am.import_annotations(input, threads = threads)
  File "/home/lgautheron/.conda/envs/stan/lib/python3.12/site-packages/ChildProject/annotations.py", line 634, in import_annotations
    assert (input_processed["range_offset"] <= input_processed.merge(self.project.recordings,
                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lgautheron/.conda/envs/stan/lib/python3.12/site-packages/pandas/core/frame.py", line 3807, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lgautheron/.conda/envs/stan/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 3804, in get_loc
    raise KeyError(key) from err
KeyError: 'duration'

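This is probably caused by merging dataframes that share a column name. When both the input dataframe and the recordings dataframe have a "duration" column, pandas keeps both and renames them with suffixes (by default "duration_x" and "duration_y"), so the later lookup of "duration" on the merged result raises a KeyError.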
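The suspected column clash can be reproduced in isolation. The sketch below is only an illustration, not ChildProject's code: the two dataframes, their values, and the `recording_filename` join key are assumptions made for the example. The workaround at the end is one possible approach and is not necessarily the fix that was applied.

```python
import pandas as pd

# Hypothetical stand-ins for the importer input and the recordings table.
annotations = pd.DataFrame({
    "recording_filename": ["rec1.wav", "rec2.wav"],
    "range_offset": [1000, 5000],
    "duration": [1000, 5000],  # extra column that clashes with recordings
})
recordings = pd.DataFrame({
    "recording_filename": ["rec1.wav", "rec2.wav"],
    "duration": [3600000, 3600000],
})

# Both sides carry "duration", so pandas renames them with default suffixes.
merged = annotations.merge(recordings, how="left", on="recording_filename")
print(list(merged.columns))

# Looking up the unsuffixed name now fails, as in the traceback.
try:
    merged["duration"]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print("lookup failed:", lookup_failed)

# One possible workaround (an assumption, not the confirmed fix):
# drop the clashing column from the input before merging.
safe = annotations.drop(columns=["duration"]).merge(
    recordings[["recording_filename", "duration"]],
    how="left",
    on="recording_filename",
)
within_bounds = bool((safe["range_offset"] <= safe["duration"]).all())
print("offsets within recording duration:", within_bounds)
```

Passing an explicit `suffixes` argument to `merge` and reading the suffixed column would be another way to avoid the ambiguity.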

@lucasgautheron lucasgautheron added the bug Something isn't working label Jul 7, 2024
@LoannPeurey
Contributor

b63e15a
