RoMCP

RoMCP is a new representation of medical features. It captures both diagnosis information and the temporal relations between days. The learned diagnosis embedding captures the key factors of the disease, and each day embedding is determined by the diagnosis together with the preceding days.

For more details, see the following paper:

Xiao Xu, Ying Wang, Tao Jin and Jianmin Wang. Learning the Representation of Medical Features for Clinical Pathway Analysis. DASFAA, 2018.

Running RoMCP

Step 1: Installation

1. Install Python and TensorFlow. We use Python 2.7 and TensorFlow 1.3.0.

2. If you want to use a GPU to accelerate computation, install CUDA.

3. Download or clone the RoMCP code.

Step 2: Preparing data

To run RoMCP, three kinds of data need to be prepared.

1. D2bow (Day to bow): The first data file is named "d2bow" and records the medical activities for every day of all visits. D2bow is a two-dimensional list in which visits are separated by -1. Each day is a list of tuples, and each tuple has two values: the medical activity index (code) and the normalized dosage.

For example:

[
    [(2, 0.8236), (6, 0.7632), (100, 0.6666), (4212, 0.1566)],
    [(9, 0.5), (12, 0.7048), (14, 0.5)],
    [(6, 0.5), (7, 0.7048), (9, 0.5), (12, 0.7048), (14, 0.5)],
    -1,
    [(5, 0.7048), (12, 0.5), (14, 0.5), (18, 0.5)],
    [(2, 0.9745), (3, 0.5), (36, 0.5), (37, 0.5)]
]

The example above is a d2bow with 2 visits: the first has 3 days and the second has 2 days.
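To make the layout concrete, the following is a minimal sketch of how a d2bow list could be grouped back into visits. The helper name split_visits is hypothetical and is not part of the RoMCP code; it only assumes the -1 separator convention described above.

```python
def split_visits(d2bow, sep=-1):
    """Group a flat d2bow list into visits, treating `sep` as a visit boundary.

    Hypothetical helper for illustration; not taken from Disease2Vec.py.
    """
    visits, current = [], []
    for day in d2bow:
        if day == sep:
            # A separator closes the visit collected so far.
            visits.append(current)
            current = []
        else:
            current.append(day)
    if current:
        # The last visit is not followed by a separator.
        visits.append(current)
    return visits


d2bow = [
    [(2, 0.8236), (6, 0.7632), (100, 0.6666), (4212, 0.1566)],
    [(9, 0.5), (12, 0.7048), (14, 0.5)],
    [(6, 0.5), (7, 0.7048), (9, 0.5), (12, 0.7048), (14, 0.5)],
    -1,
    [(5, 0.7048), (12, 0.5), (14, 0.5), (18, 0.5)],
    [(2, 0.9745), (3, 0.5), (36, 0.5), (37, 0.5)],
]

visits = split_visits(d2bow)
days_per_visit = [len(v) for v in visits]
```

Here days_per_visit holds the number of days in each visit, which matches the "3 days, then 2 days" reading of the example.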

2. D2diag (Day to diagnosis): The second data file is named "d2diag" and records the diagnosis index (code) of every day. D2diag is a one-dimensional list in which visits are separated by 0. Within a visit, every day has a diagnosis code.

For example:

[188,188,188,0,2,2]

The example above represents 2 visits: the first has three days with diagnosis code 188, and the second has two days with diagnosis code 2.
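The same grouping can be sketched for d2diag. The function name split_diagnoses is hypothetical; the sketch assumes only that 0 is reserved as the visit separator, as described above, and it uses the standard-library itertools.groupby.

```python
from itertools import groupby


def split_diagnoses(d2diag, sep=0):
    """Split a flat d2diag list into one list of daily diagnosis codes per visit.

    Hypothetical helper for illustration; not taken from Disease2Vec.py.
    """
    return [
        list(group)
        for is_sep, group in groupby(d2diag, key=lambda code: code == sep)
        if not is_sep  # drop the runs that consist of separators
    ]


d2diag = [188, 188, 188, 0, 2, 2]
diag_visits = split_diagnoses(d2diag)
```

With the example above, diag_visits contains one list of three codes (188) and one list of two codes (2).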

3. Mask: The third data file is named "mask". It is an indicator used when aggregating data by win_size. How the mask is used is implemented in the function _init_aggregate_day() in "Disease2Vec.py". Mask is a two-dimensional list in which visits are separated by [0]. In mask, each day maps to [1].

For example:

[[1],[1],[1],[0],[1],[1]]

The example above represents 2 visits: the first has three days and the second has two days.
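In the three examples, the lists have the same length and their separators (-1, 0 and [0]) fall at the same position. Assuming this alignment is required, which the README does not state explicitly, a quick sanity check before training might look like the sketch below. The function name separators_aligned is hypothetical.

```python
def separators_aligned(d2bow, d2diag, mask):
    """Return True if the three lists have equal length and aligned separators.

    Hypothetical check for illustration; it assumes the files are aligned
    position by position, as in the examples.
    """
    if not (len(d2bow) == len(d2diag) == len(mask)):
        return False
    for bow, diag, m in zip(d2bow, d2diag, mask):
        flags = (bow == -1, diag == 0, m == [0])
        # Either all three entries are separators or none of them is.
        if any(flags) and not all(flags):
            return False
    return True


d2bow = [
    [(2, 0.8236), (6, 0.7632)],
    [(9, 0.5), (12, 0.7048)],
    [(6, 0.5), (7, 0.7048)],
    -1,
    [(5, 0.7048), (12, 0.5)],
    [(2, 0.9745), (3, 0.5)],
]
d2diag = [188, 188, 188, 0, 2, 2]
mask = [[1], [1], [1], [0], [1], [1]]

aligned = separators_aligned(d2bow, d2diag, mask)
```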

Step 3: Training model

After preparing the data, feed it into the function main() in "Disease2VecRunner.py". main() splits the data into a training set and a test set and reports the NDCG and Recall scores for next-day medical activity prediction.
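For readers unfamiliar with the two metrics, the sketch below shows one common definition of Recall@k and NDCG@k with binary relevance. It is an assumption for illustration and may differ from the computation in "Disease2VecRunner.py"; the function names and the example activity codes are hypothetical.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant activity codes that appear in the top k predictions."""
    if not relevant:
        return 0.0
    hits = sum(1 for code in ranked[:k] if code in relevant)
    return hits / float(len(relevant))


def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary relevance: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log(pos + 2, 2)
        for pos, code in enumerate(ranked[:k])
        if code in relevant
    )
    ideal = sum(1.0 / math.log(pos + 2, 2) for pos in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0


# Predicted activity codes for the next day, best first (made-up values).
ranked = [12, 9, 3, 14]
# Activity codes that actually occurred on the next day (made-up values).
relevant = {9, 12, 14}

recall = recall_at_k(ranked, relevant, k=3)
ndcg = ndcg_at_k(ranked, relevant, k=3)
```

Both scores lie between 0 and 1, and NDCG additionally rewards placing the correct activities near the top of the ranking.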
