LIANGKE23/Knowledge_Assisted_Medical_Dialogue_Generation_Mechanism

Using natural language processing (NLP) technologies to develop medical chatbots makes patient diagnosis more convenient and efficient, and is a typical application of AI in healthcare. Because of its importance, a large amount of research has emerged. Recently, neural generative models have shown impressive ability as the core of chatbots, but they do not scale well when applied directly to medical conversation because they lack medical-specific knowledge. To address this limitation, a scalable medical knowledge-assisted mechanism (MKA) is proposed in this paper. The mechanism aims to help general neural generative models achieve better performance on medical conversation tasks. A medical-specific knowledge graph is designed within the mechanism; it contains 6 types of medical-related information: department, drug, check, symptom, disease, and food. In addition, a specific token concatenation policy is defined to inject medical information into the input data effectively. Our method is evaluated on two typical medical datasets, MedDG and MedDialog-CN. The results demonstrate that models combined with our mechanism outperform the original methods on multiple automatic evaluation metrics, and MKA-BERT-GPT achieves state-of-the-art performance.
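
To make the idea concrete, here is a minimal, self-contained sketch of how a knowledge sub-graph over the six entity types could be serialized and concatenated with the dialogue context. The entity names, relation names, and special tokens below are illustrative assumptions, not the exact policy defined in the paper or the code.

```python
# Minimal illustration only: entity names, relation names, and special tokens
# here are assumptions, not the exact policy used in the paper or the scripts.

# A medical knowledge sub-graph is a small set of triples over the six
# entity types (department, drug, check, symptom, disease, food).
knowledge_triples = [
    ("cough", "symptom_of", "bronchitis"),
    ("bronchitis", "treated_with", "ambroxol"),
    ("bronchitis", "checked_by", "chest X-ray"),
    ("bronchitis", "handled_in", "respiratory department"),
    ("bronchitis", "avoid_food", "cold drinks"),
]

def concatenate_knowledge(dialogue_history, triples,
                          kg_token="[KG]", sep_token="[SEP]"):
    """Flatten the triples into tokens and prepend them to the dialogue
    context, so the generative model sees knowledge and context together."""
    kg_text = " ".join(f"{h} {r} {t}" for h, r, t in triples)
    return f"{kg_token} {kg_text} {sep_token} {dialogue_history}"

print(concatenate_knowledge(
    "Patient: I have had a bad cough for two weeks.", knowledge_triples))
```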

Step 1: MKA Mechanism ----- Data Processing

The mechanism relies not only on the multi-turn dialogue data but also on the related patient self-reports and the self-generated medical knowledge graphs. A knowledge sub-graph is generated for each turn based on the self-report (EHR), and the medical keywords appearing in the dialogue of that turn also contribute to the graph generation procedure. In the Data-Prepare folder there are 6 files (a conceptual sketch of this processing follows the list below):

  1. dpt_text2kg.json: The mapping between medical departments and the departments in the knowledge graph.
  2. key_body_symprelated.json: The mapping between each body part and its corresponding expressions.
  3. kg_disease.txt: The disease entities in the medical knowledge graphs.
  4. kg_symptom.txt: The symptom entities in the medical knowledge graphs.
  5. Medical_Specific_Knowledge_Generator_new.py: Generating the medical knowledge sub-graph for each turn.
  6. data_gjy_seen.py: Combining the graphs and dialogue context as the input for the model.
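
Conceptually, the per-turn processing can be pictured as below. This is only a rough sketch under the assumption that sub-graphs are built by matching knowledge-graph entities against the self-report and the current turn; all function names are illustrative, and the real logic lives in Medical_Specific_Knowledge_Generator_new.py and data_gjy_seen.py.

```python
# Rough conceptual sketch; the actual scripts use the entity files and the
# mapping JSONs listed above, and may match entities quite differently.

def load_entities(path):
    """Load one knowledge-graph entity per line (e.g. kg_symptom.txt)."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def build_turn_subgraph(self_report, turn_utterances, symptoms, diseases):
    """Keep the symptom/disease entities mentioned in the self-report (EHR)
    or in the current dialogue turn -- a keyword-matching approximation."""
    text = self_report + " " + " ".join(turn_utterances)
    return {
        "symptom": [s for s in symptoms if s in text],
        "disease": [d for d in diseases if d in text],
    }

def build_model_input(subgraph, turn_utterances, sep="[SEP]"):
    """Combine the flattened sub-graph with the dialogue context of the turn,
    producing one model input per turn (the role data_gjy_seen.py plays)."""
    kg_tokens = " ".join(subgraph["symptom"] + subgraph["disease"])
    return kg_tokens + f" {sep} " + f" {sep} ".join(turn_utterances)
```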

Step 2: MKA Mechanism ----- Large-Scale Model Application

2-1 Using BERT-GPT: Within the bert-gpt folder, there are 4 files:

  1. preprocess.py: Data preprocessing for Bert-GPT model.
  2. bert_gpt_train_dpt.py: Training.
  3. generate.py: Testing.
  4. validate.py: Evaluation of Perplexity, BLEU, NIST, METEOR, Entropy, and Dist (a sketch of two of these metrics follows the list below).
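
For reference, here is a small self-contained sketch of two of the listed metrics, corpus-level BLEU (via NLTK) and Dist-n; validate.py may compute them with different tokenization and smoothing settings.

```python
from collections import Counter
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def bleu4(references, hypotheses):
    """Corpus-level BLEU-4 over whitespace-tokenized responses
    (one reference per generated response)."""
    refs = [[r.split()] for r in references]
    hyps = [h.split() for h in hypotheses]
    return corpus_bleu(refs, hyps,
                       smoothing_function=SmoothingFunction().method1)

def dist_n(hypotheses, n=2):
    """Dist-n: number of distinct n-grams divided by the total number of
    n-grams across all generated responses (higher means more diverse)."""
    ngrams = Counter()
    for h in hypotheses:
        toks = h.split()
        ngrams.update(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0
```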

2-2 Using Transformer: Within the transformer folder, there are 3 files:

  1. trans_preprocess.py: Data preprocessing for Transformer model.
  2. trans_train.py: Training.
  3. trans_generate.py: Testing and evaluation (a simplified decoding sketch follows the list below).
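
As a rough picture of what the testing step does, the sketch below runs a plain greedy decoding loop over a trained sequence-to-sequence model. The forward signature, token ids, and maximum length are assumptions for illustration and do not mirror trans_generate.py exactly.

```python
import torch

def greedy_generate(model, src_ids, bos_id, eos_id, max_len=64):
    """Greedy decoding: feed the encoded input (knowledge + dialogue context)
    and append the most likely next token until EOS or the length limit."""
    model.eval()
    out = torch.tensor([[bos_id]], device=src_ids.device)
    with torch.no_grad():
        for _ in range(max_len):
            # assumed forward signature: logits of shape (1, tgt_len, vocab)
            logits = model(src_ids, out)
            next_id = int(logits[0, -1].argmax())
            out = torch.cat(
                [out, torch.tensor([[next_id]], device=src_ids.device)], dim=1)
            if next_id == eos_id:
                break
    return out[0].tolist()
```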

If you use the repository, please cite the paper: Ke Liang, Sifan Wu, Jiayi Gu, "MKA: A Scalable Medical Knowledge-Assisted Mechanism for Generative Models on Medical Conversation Tasks", Computational and Mathematical Methods in Medicine, vol. 2021, Article ID 5294627, 10 pages, 2021. https://doi.org/10.1155/2021/5294627
