
how can I obtain the "mimic_train_kg_AO.json" file? #2

Open
mk-runner opened this issue Aug 4, 2023 · 2 comments

Comments

@mk-runner

Very cool work! This file includes a "label_index" field. What does it mean?

Looking forward to your reply. Thanks!

@chenzcv7
Owner

Thanks for your interest in our work! Because of copyright restrictions, we cannot upload files derived from MIMIC-CXR. However, you can follow the steps below to preprocess the original MIMIC-CXR dataset yourself:

  1. Obtain the label_index:
    (1) Visit https://physionet.org/content/mimic-cxr-jpg/2.0.0/ and download the file mimic-cxr-2.0.0-chexpert.csv.gz, which contains the label information for each data pair.
    (2) Combine the original JSON file with the label file above so that each entry carries its label information.
    (3) Convert the label information to a 0-1 array: if the item has a given label, set that position to 1; otherwise, set it to 0.
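
A minimal sketch of step 1, not the actual preprocessing code from this repository. It assumes that each entry of the original JSON file has a "study_id" field, that the label order follows the 14 CheXpert observation columns as listed below, and that uncertain (-1.0) and blank cells count as 0. The helper names `to_binary`, `load_label_table`, and `add_label_index` are made up for illustration, and the sample row is hand-written, not real data.

```python
import csv
import io

# Assumed label order: the 14 CheXpert observation columns.
CHEXPERT_LABELS = [
    "Atelectasis", "Cardiomegaly", "Consolidation", "Edema",
    "Enlarged Cardiomediastinum", "Fracture", "Lung Lesion", "Lung Opacity",
    "No Finding", "Pleural Effusion", "Pleural Other", "Pneumonia",
    "Pneumothorax", "Support Devices",
]


def to_binary(value):
    """Map one CSV cell to 0/1; only an explicit positive (1.0) becomes 1."""
    try:
        return 1 if float(value) == 1.0 else 0
    except (TypeError, ValueError):
        return 0  # blank or missing cell


def load_label_table(csv_file):
    """Build {study_id: 0-1 list} from an open CheXpert-style CSV file object."""
    table = {}
    for row in csv.DictReader(csv_file):
        table[int(row["study_id"])] = [
            to_binary(row.get(name)) for name in CHEXPERT_LABELS
        ]
    return table


def add_label_index(entries, table):
    """Attach a "label_index" list to every entry (all zeros if no match)."""
    for entry in entries:
        default = [0] * len(CHEXPERT_LABELS)
        entry["label_index"] = table.get(int(entry["study_id"]), default)
    return entries


# Hand-written stand-in for the label file: one study, positive for
# Atelectasis, uncertain for Cardiomegaly, blank elsewhere.
header = ",".join(["subject_id", "study_id"] + CHEXPERT_LABELS)
row = ",".join(["10000001", "50000001", "1.0", "-1.0"] + [""] * 12)
table = load_label_table(io.StringIO(header + "\n" + row + "\n"))

# Hypothetical entries standing in for the original JSON file.
entries = [
    {"id": "a", "study_id": 50000001, "report": "..."},
    {"id": "b", "study_id": 59999999, "report": "..."},
]
add_label_index(entries, table)
print(entries[0]["label_index"])
```

For the real file, the same function can read from `gzip.open("mimic-cxr-2.0.0-chexpert.csv.gz", "rt", newline="")` instead of the in-memory sample.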

  2. Obtain the knowledge triplets:
    (1) Use Stanza to extract named entities from the medical report in each data pair.
    (2) Use RadGraph to obtain the triplets related to each entity.
    (3) Store all the knowledge triplets (in entity-relation-entity format) in a list, and add the list to each dataset entry under the key "triplet".
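
A rough sketch of how step 2 (3) might look once the entities and relations are available; it is not the repository's code. It does not call Stanza or RadGraph. The `sample_entities` dictionary is a hand-written stand-in shaped like RadGraph's annotation format (entity IDs mapping to "tokens", "label", and "relations"), and `stanza_entities` is a hand-written stand-in for the entity strings that Stanza's NER would return. The function name `extract_triplets` and the lowercasing are assumptions.

```python
def extract_triplets(radgraph_entities, keep_entities=None):
    """Collect [head, relation, tail] triplets from RadGraph-style entities.

    If keep_entities is given, only entities whose text is in that set
    are used as triplet heads.
    """
    triplets = []
    for entity in radgraph_entities.values():
        head = entity["tokens"].lower()
        if keep_entities is not None and head not in keep_entities:
            continue
        for relation, target_id in entity.get("relations", []):
            target = radgraph_entities.get(str(target_id))
            if target is None:
                continue  # relation points at an unknown entity ID
            triplets.append([head, relation, target["tokens"].lower()])
    return triplets


# Hand-written stand-in for the RadGraph output of one report.
sample_entities = {
    "1": {"tokens": "opacity", "label": "OBS-DP", "relations": [["located_at", "2"]]},
    "2": {"tokens": "lung", "label": "ANAT-DP", "relations": []},
    "3": {"tokens": "left", "label": "ANAT-DP", "relations": [["modify", "2"]]},
}

# Hand-written stand-in for the entity strings found by Stanza.
stanza_entities = {"opacity", "left"}

# Hypothetical dataset entry; the list is stored under the key "triplet".
entry = {"id": "example", "report": "Opacity in the left lung."}
entry["triplet"] = extract_triplets(sample_entities, stanza_entities)
print(entry["triplet"])
```

Whether the original file keeps each triplet as a list, a tuple, or a joined string is not stated in this thread, so the list-of-lists layout here is only one possible choice.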

Hope that this answer is helpful to you!

@anothersin

Stanza

Hi,

Could you please provide the preprocessing code? From the description alone, it is difficult to work out the exact structure of the JSON file, especially since the original dataset does not include a JSON file.
