Show nested ents results in error #325

jkgenser · 2023-06-09T16:56:21Z

I'm using
spacy == 3.5.3
medcat == 1.7.0

There seems to be an issue when config.general.show_nested_entities = True and the following code path runs. Traceback below.

In particular, the problem seems to be in medcat/cat.py around L1500-1510, reproduced below.

                for _ent in doc._.ents:
                    entity = Span(doc, _ent['start'], _ent['end'], label=_ent['label'])
                    entity._.cui = _ent['cui']
                    entity._.detected_name = _ent['detected_name']
                    entity._.context_similarity = _ent['context_similarity']
                    entity._.id = _ent['id']
                    if 'meta_anns' in _ent:
                        entity._.meta_anns = _ent['meta_anns']
                    _ents.append(entity)

If I replace _ent["start"] with _ent.start (and similarly switch the other __getitem__ calls to attribute access), this code doesn't crash. Perhaps the entries in doc._.ents were previously dicts but are now spaCy Span objects, and that causes the issue?
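
To illustrate the suspected mismatch, here is a minimal sketch using only the standard library. It is not the MedCAT code or the eventual fix: `get_field`, `dict_ent` and `attr_ent` are hypothetical names, and `SimpleNamespace` stands in for an attribute-style object such as a Span. It shows that subscripting with a string key works on a dict but fails on an attribute-style object, and how a small accessor could tolerate both shapes.

```python
from types import SimpleNamespace


def get_field(ent, name):
    """Return field `name` from a dict-style or attribute-style entity."""
    if isinstance(ent, dict):
        return ent[name]       # dict style: _ent['start']
    return getattr(ent, name)  # attribute style: _ent.start


# Hypothetical stand-ins for the two shapes an entry of doc._.ents might take.
dict_ent = {"start": 3, "end": 5}
attr_ent = SimpleNamespace(start=3, end=5)

for ent in (dict_ent, attr_ent):
    print(get_field(ent, "start"), get_field(ent, "end"))

# Subscripting the attribute-style object with a string key raises TypeError.
try:
    attr_ent["start"]
except TypeError as exc:
    print("TypeError:", exc)
```

In the quoted loop, fields such as cui and detected_name are written under `entity._`, so with real Span objects those values would presumably need to be read from `_ent._` rather than from top-level attributes; that part is an assumption and is not covered by the sketch.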

Traceback (most recent call last):
  File "/home/j/oler-medcat/src/scripts/basic_eval.py", line 211, in <module>
    main()
  File "/home/j/oler-medcat/src/scripts/basic_eval.py", line 203, in main
    prf_df, global_prf, fpnames_df = score_docs(gold_pages, cat)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/j/oler-medcat/src/scripts/basic_eval.py", line 124, in score_docs
    result = cat.get_entities(gold_doc.text)
             ^^^
  File "/home/j/oler-medcat/src/scripts/basic_eval.py", line 124, in score_docs
    result = cat.get_entities(gold_doc.text)
             ^^^
  File "/usr/lib/python3.11/bdb.py", line 90, in trace_dispatch
    return self.dispatch_line(frame)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/bdb.py", line 115, in dispatch_line
    if self.quitting: raise BdbQuit


mart-r · 2023-06-15T15:51:17Z

I looked into this a little, and it seems that part may have been broken for a long time.

The main problem I ran into was not being able to come up with a test case to reach that part of the code. With the limited model available during automated testing, I couldn't find a way to get a Span that has any entries in doc._.ents. Thus, in the cases I was able to come up with, this part of the code didn't run.

We may not have had many people use this part of the library (i.e. run with show_nested_entities enabled).

With that said, I've come up with a change that should fix the issue within my testing. PR in a minute.
EDIT: PR #326

mart-r · 2023-06-26T13:58:08Z

Will be included in the next release.

This was referenced Jun 15, 2023

Fix for Issue 325 mart-r/MedCAT#24

Closed

Fix for Issue 325 #326

Merged

mart-r closed this as completed Jun 26, 2023
