medcat.utils.preprocess_snomed.Snomed - FileNotFoundError #198

KeironO · 2022-01-25T15:07:53Z

Hi there,

Whenever I try to use the Snomed preprocessing utility, I get a FileNotFoundError:

from medcat.utils.preprocess_snomed import Snomed
snomed = Snomed("C:/path/to/dir/uk_sct2cl_32.7.0_20211124000001Z/")
cdf = snomed.to_concept_df()

This returns:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-23-5eb639e435ed> in <module>
----> 1 cdf = snomed.to_concept_df()

~\Projects\nlp\env\lib\site-packages\medcat\utils\preprocess_snomed.py in to_concept_df(self)
     50                     snomed_v = m.group(1)
     51 
---> 52             int_terms = parse_file(f'{contents_path}/sct2_Concept_Snapshot_{snomed_v}_{snomed_release}.txt')
     53             active_terms = int_terms[int_terms.active == '1']
     54             del int_terms

~\Projects\nlp\env\lib\site-packages\medcat\utils\preprocess_snomed.py in parse_file(filename, first_row_header, columns)
      7 
      8 def parse_file(filename, first_row_header=True, columns=None):
----> 9     with open(filename, encoding='utf-8') as f:
     10         entities = [[n.strip() for n in line.split('\t')] for line in f]
     11         return pd.DataFrame(entities[1:], columns=entities[0] if first_row_header else columns)

FileNotFoundError: [Errno 2] No such file or directory: 'C:/path/to/dir/uk_sct2cl_32.7.0_20211124000001Z/SnomedCT_UKClinicalRefsetsRF2_PRODUCTION_20211124T000001Z\\Snapshot\\Terminology/sct2_Concept_Snapshot_INT_20211124.txt'

The actual file is named sct2_Concept_UKCRSnapshot_GB1000000_20211124.txt

Best wishes,

Keiron
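
For anyone hitting the same error, a quick way to confirm the mismatch is to list the concept snapshot files that actually exist in the release directory and compare them with the name in the traceback. The sketch below is only an illustration: `find_concept_snapshots` is a hypothetical helper, not part of MedCAT, and the glob pattern is an assumption based on the two file names quoted in this issue.

```python
from pathlib import Path


def find_concept_snapshots(release_dir):
    """Return every concept snapshot file found anywhere under release_dir.

    Hypothetical helper, not part of MedCAT. The pattern is an assumption
    derived from the two names seen in this issue:
      sct2_Concept_Snapshot_INT_20211124.txt              (expected by the code)
      sct2_Concept_UKCRSnapshot_GB1000000_20211124.txt    (actually on disk)
    """
    # rglob searches all subdirectories, so the exact nesting
    # (e.g. Snapshot/Terminology) does not need to be known in advance.
    return sorted(Path(release_dir).rglob("sct2_Concept_*Snapshot_*.txt"))


if __name__ == "__main__":
    # Placeholder path copied from the report above; replace with a real one.
    for path in find_concept_snapshots("C:/path/to/dir/uk_sct2cl_32.7.0_20211124000001Z/"):
        print(path)
```

If the printed names do not match the one in the FileNotFoundError message, the release is using a naming scheme the preprocessing code does not expect.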


antsh3k · 2022-01-25T20:28:10Z

Dear Keiron,

Thank you for flagging to the team that the format of new UK extension releases has now changed.
I will make changes to enable the processing of the new release format and will let you know when it is done.

KeironO · 2022-01-25T20:51:08Z

@antsh3k no worries, I can have a go at fixing it if you're snowed under?

antsh3k · 2022-01-25T20:57:15Z

Don't worry, I can change it. Although, if you can test it and give feedback, that would be amazing!

The changes will be reviewed and integrated by tomorrow.

antsh3k · 2022-01-26T00:42:57Z

The changes have been made in PR #199. The following should work now:
from medcat.utils.preprocess_snomed import Snomed
snomed = Snomed("C:/path/to/dir/uk_sct2cl_32.7.0_20211124000001Z/")
snomed.uk_ext = True # Note: this will only work with UK extensions >2021 with the new release format. Prior UK extension releases should skip this step.
cdf = snomed.to_concept_df()

Let me know if there are any further issues.
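
To see at a glance how the two naming schemes in this thread differ, the file name can be split into its parts with a regular expression. This is a rough sketch, not MedCAT code: `describe_concept_file` and the pattern are hypothetical and cover only the two names quoted in this issue. It reports the parts and does not decide whether `uk_ext` should be set.

```python
import re

# Assumed pattern, based only on the two file names quoted in this issue.
CONCEPT_FILE_RE = re.compile(
    r"sct2_Concept_(?P<prefix>[A-Za-z]*)Snapshot_"
    r"(?P<namespace>[A-Za-z0-9]+)_(?P<release>\d{8})\.txt$"
)


def describe_concept_file(filename):
    """Split a concept snapshot file name into prefix, namespace and release date.

    Hypothetical helper. Returns None when the name does not fit the pattern.
    """
    match = CONCEPT_FILE_RE.search(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for name in (
        "sct2_Concept_Snapshot_INT_20211124.txt",
        "sct2_Concept_UKCRSnapshot_GB1000000_20211124.txt",
    ):
        print(name, describe_concept_file(name))
```

A non-empty prefix such as `UKCR` together with a namespace other than `INT` is what distinguishes the file reported here from the name the original code was building.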

KeironO · 2022-01-26T07:41:53Z

Seems to work fine now. Thank you for your work :D

w-is-h assigned w-is-h and antsh3k and unassigned w-is-h Jan 25, 2022

w-is-h closed this as completed Jan 26, 2022
