There seems to be an issue when config.general.show_nested_entities = True and the following code path runs. Traceback below.
In particular, the problem seems to be in medcat/cat.py around L1500-1510, reproduced below.
for _ent in doc._.ents:
    entity = Span(doc, _ent['start'], _ent['end'], label=_ent['label'])
    entity._.cui = _ent['cui']
    entity._.detected_name = _ent['detected_name']
    entity._.context_similarity = _ent['context_similarity']
    entity._.id = _ent['id']
    if 'meta_anns' in _ent:
        entity._.meta_anns = _ent['meta_anns']
    _ents.append(entity)
If I replace _ent["start"] with _ent.start (and likewise for the other getitem calls, which need to become attribute access) then this code doesn't crash. Perhaps the entries were previously dicts but are now spaCy Span objects, and that causes the issue?
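To illustrate the dict-versus-object mismatch described above, here is a minimal sketch of an accessor that tolerates both shapes. The helper name `read_field` and the sample values are hypothetical and not part of MedCAT; the sketch uses plain Python stand-ins rather than real spaCy objects, and it is not necessarily how the library fixes the problem.

```python
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any


def read_field(ent: Any, name: str) -> Any:
    """Return `name` from `ent`, whether `ent` is a mapping or an object.

    Hypothetical helper for illustration only (not a MedCAT function).
    """
    if isinstance(ent, Mapping):
        # Old-style entry: a dict such as {'start': 3, 'end': 5, ...}
        return ent[name]
    # Object-style entry: attribute access, e.g. ent.start
    return getattr(ent, name)


# Stand-ins for the two shapes an entry in doc._.ents might take.
dict_ent = {"start": 3, "end": 5, "label": "example"}
span_like_ent = SimpleNamespace(start=3, end=5, label="example")

for ent in (dict_ent, span_like_ent):
    print(read_field(ent, "start"), read_field(ent, "end"))
```

Note that on a real spaCy Span only built-in attributes such as start and end are read directly; custom fields like cui are set through the `._` namespace in the snippet above, so they would probably need to be read from `ent._` rather than from `ent` itself.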
Traceback (most recent call last):
  File "/home/j/oler-medcat/src/scripts/basic_eval.py", line 211, in <module>
    main()
  File "/home/j/oler-medcat/src/scripts/basic_eval.py", line 203, in main
    prf_df, global_prf, fpnames_df = score_docs(gold_pages, cat)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/j/oler-medcat/src/scripts/basic_eval.py", line 124, in score_docs
    result = cat.get_entities(gold_doc.text)
             ^^^
  File "/home/j/oler-medcat/src/scripts/basic_eval.py", line 124, in score_docs
    result = cat.get_entities(gold_doc.text)
             ^^^
  File "/usr/lib/python3.11/bdb.py", line 90, in trace_dispatch
    return self.dispatch_line(frame)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/bdb.py", line 115, in dispatch_line
    if self.quitting: raise BdbQuit
I looked into this a little, and it seems that part may have been broken for a long time.
The main problem I ran into was not being able to come up with a test case to reach that part of the code. With the limited model available during automated testing, I couldn't find a way to get a Span that has any entries in doc._.ents. Thus, in the cases I was able to come up with, this part of the code didn't run.
We may not have had many people use this part of the library (i.e. run with show_nested_entities enabled).
That said, I've come up with a change that should fix the issue, at least in my testing. PR in a minute.
EDIT: PR #326
I'm using
spacy == 3.5.3
medcat == 1.7.0