Thanks for your interest in our work! Due to copyright restrictions, we cannot upload the files derived from MIMIC-CXR. However, you can follow the steps below to preprocess the original MIMIC-CXR dataset:
1 Obtain the label_index
(1) Visit https://physionet.org/content/mimic-cxr-jpg/2.0.0/ and download the file mimic-cxr-2.0.0-chexpert.csv.gz, which contains the label information for each data pair.
(2) Combine the original json file with the label file mentioned above to add the label information to the original file.
(3) Convert the label information to a 0-1 array, i.e., if an item belongs to a label, set the corresponding entry to 1; otherwise, set it to 0.
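The three steps above can be sketched roughly as follows. This is a minimal sketch, not the authors' actual preprocessing code. It assumes that each entry in the json file carries a `study_id` field, that the CSV uses the 14 standard CheXpert column names listed in `LABELS` (please verify against the header of your download), and that only an explicit positive (1.0) counts as "belongs to the label", so blank, 0.0, and -1.0 (uncertain) all map to 0. The function names are hypothetical.

```python
import csv
import gzip

# Assumed CheXpert label columns; check them against the CSV header.
LABELS = [
    "Atelectasis", "Cardiomegaly", "Consolidation", "Edema",
    "Enlarged Cardiomediastinum", "Fracture", "Lung Lesion", "Lung Opacity",
    "No Finding", "Pleural Effusion", "Pleural Other", "Pneumonia",
    "Pneumothorax", "Support Devices",
]


def row_to_binary(row):
    """Convert one CSV row (a dict of strings) into a 0-1 list.

    Assumption: only an explicit positive (1.0) becomes 1; blank cells,
    0.0, and -1.0 (uncertain) all become 0.
    """
    binary = []
    for name in LABELS:
        value = (row.get(name) or "").strip()
        binary.append(1 if value != "" and float(value) == 1.0 else 0)
    return binary


def load_label_table(csv_gz_path):
    """Read the gzipped label CSV into a dict: study_id -> 0-1 list."""
    table = {}
    with gzip.open(csv_gz_path, "rt", newline="") as f:
        for row in csv.DictReader(f):
            table[int(row["study_id"])] = row_to_binary(row)
    return table


def add_label_index(entries, table):
    """Attach a "label_index" item to every entry that has a study_id.

    Entries whose study is missing from the table get an all-zero list
    (an assumption; you may prefer to drop them instead).
    """
    for entry in entries:
        default = [0] * len(LABELS)
        entry["label_index"] = table.get(int(entry["study_id"]), default)
    return entries
```

To use it, load your json file with `json.load`, call `add_label_index` on each split's list of entries with the table returned by `load_label_table("mimic-cxr-2.0.0-chexpert.csv.gz")`, and write the result back with `json.dump`.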
2 Obtain the knowledge triplets
(1) Use Stanza to extract named entities from the medical report in each data pair.
(2) Use RadGraph to obtain the triplets related to each entity.
(3) Store all the knowledge triplets (in the format of entity-relation-entity) in a list, and add the list as an item named "triplet" to the dataset.
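Step (3) could look roughly like the sketch below. It does not call Stanza or RadGraph; it only shows how already-extracted output might be turned into a "triplet" list. It assumes a RadGraph-style `entities` dictionary in which each entity has a `tokens` string and a `relations` list of `[relation, target_id]` pairs, and it assumes each triplet is stored as a three-item list. Both the input layout and the function name are assumptions, so adapt them to what your RadGraph version actually returns.

```python
def entities_to_triplets(entities, keep_entities=None):
    """Build [head, relation, tail] triplets from a RadGraph-style dict.

    entities: dict mapping entity id -> {"tokens": str, "relations": [...]}
              (assumed layout, not verified against every RadGraph release).
    keep_entities: optional collection of entity strings (for example, the
              ones found by Stanza); if given, only triplets whose head or
              tail is in this collection are kept.
    """
    keep = None
    if keep_entities is not None:
        keep = {text.lower() for text in keep_entities}

    triplets = []
    for entity in entities.values():
        for relation, target_id in entity.get("relations", []):
            target = entities.get(str(target_id))
            if target is None:
                continue  # skip relations that point to a missing entity
            head, tail = entity["tokens"], target["tokens"]
            if keep is not None and head.lower() not in keep and tail.lower() not in keep:
                continue
            triplets.append([head, relation, tail])
    return triplets
```

The returned list can then be stored on the data pair, e.g. `entry["triplet"] = entities_to_triplets(radgraph_entities, stanza_entities)`, where both arguments are placeholders for your own extraction results.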
Could you please provide the preprocessing code? From the text alone, it is difficult to follow the specific details of the json file, especially since the original dataset does not include a json file.
Very cool work! This file includes "label_index". What does it mean?
Looking forward to your reply. Thanks!