Welcome to my personal repository, a curated collection of cutting-edge research at the intersection of machine learning and healthcare. As an AI researcher with a strong interest in healthcare applications, I've compiled this repository to showcase innovative work, mostly in natural language processing (NLP) and multimodal learning, within the healthcare domain. While this collection reflects my personal research focus, it aims to serve as a valuable resource for anyone passionate about leveraging machine learning for healthcare. I welcome contributions and discussions, so feel free to share ideas or suggest papers!
- (2023/11) MEDITRON-70B: Scaling Medical Pretraining for Large Language Models [paper]
- Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare [paper]
- Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks [paper]
- Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data [paper]
- (2022/03) MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering [paper]
- (2023/07) Med-HALT: Medical Domain Hallucination Test for Large Language Models [paper]
- (2024/01) K-QA: A Real-World Medical Q&A Benchmark [paper]
- (2024/05) MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain [paper]
- (2023/11) MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning [paper]
- (2024/02) AI Hospital: Interactive Evaluation and Collaboration of LLMs as Intern Doctors for Clinical Diagnosis [paper]
- (2024/02) AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning [paper]
- (2024/04) Adaptive Collaboration Strategy for LLMs in Medical Decision Making [paper]
- (2024/05) Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents [paper]
- (2024/05) AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments [paper]
- (2024/05) DrHouse: An LLM-empowered Diagnostic Reasoning System through Harnessing Outcomes from Sensor Data and Expert Knowledge [paper]
- (2024/06) ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World [paper]
- (2024/07) MMedAgent: Learning to Use Medical Tools with Multi-modal Agent [paper]
- (2024/08) MEDCO: Medical Education Copilots Based on A Multi-Agent Framework [paper]
- (2024/08) Interactive Agents: Simulating Counselor-Client Psychological Counseling via Role-Playing LLM-to-LLM Interactions [paper]
- (2010/10) Data-driven approach for creating synthetic electronic medical records [paper]
- (2017/03) Generating Multi-label Discrete Patient Records using Generative Adversarial Networks [paper]
- (2023/03) EHRDiff: Exploring Realistic EHR Synthesis with Diffusion Models [paper]
- (2023/04) Synthesize High-dimensional Longitudinal Electronic Health Records via Hierarchical Autoregressive Language Model [paper]
- (2023/08) EHR-Safe: generating high-fidelity and privacy-preserving synthetic electronic health records [paper]
- LLMSYN: Generating Synthetic Electronic Health Records Without Patient-Level Data [paper]
- GenHPF: General Healthcare Predictive Framework with Multi-task Multi-source Learning [paper]
- REALM: RAG-Driven Enhancement of Multimodal Electronic Health Records Analysis via Large Language Models [paper]
- EMERGE: Integrating RAG for Improved Multimodal EHR Predictive Modeling [paper]
- EHRmonize: A Framework for Medical Concept Abstraction from Electronic Health Records using Large Language Models [paper]
- MedFuse: Multi-modal fusion with clinical time-series data and chest X-ray images [paper]
- Learning Missing Modal Electronic Health Records with Unified Multi-modal Data Embedding and Modality-Aware Attention [paper]
- From Basic to Extra Features: Hypergraph Transformer Pretrain-then-Finetuning for Balanced Clinical Predictions on EHR [paper]
- FlexCare: Leveraging Cross-Task Synergy for Flexible Multimodal Healthcare Prediction [paper]
- MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models [paper]
- Multimodal Patient Representation Learning with Missing Modalities and Labels [paper]
- Toward a Natural Language Interface for EHR Questions
  Authors: Kirk Roberts, Dina Demner‑Fushman
  Published in: AMIA Joint Summits on Translational Science Proceedings (2015)
  [paper]
- Annotating Logical Forms for EHR Questions
  Authors: Kirk Roberts, Dina Demner‑Fushman
  Published in: LREC 2016
  [paper]
- A Semantic Parsing Method for Mapping Clinical Questions to Logical Forms
  Authors: Kirk Roberts, Dina Demner‑Fushman
  Published in: AMIA Annual Symposium Proceedings (2017)
  [paper]
- emrQA: A Large Corpus for Question Answering on Electronic Medical Records
  Authors: Pampari et al.
  Published in: EMNLP 2018
  [arxiv]
- Text-to-SQL Generation for Question Answering on Electronic Medical Records
  Authors: Wang et al.
  Published in: WWW 2020
  [arxiv]
- Using FHIR to Construct a Corpus of Clinical Questions Annotated with Logical Forms and Answers
  Authors: Soni et al.
  Published in: AMIA Joint Summits on Translational Science Proceedings 2019
  [paper]
- Dataset and Enhanced Model for Eligibility Criteria-to-SQL Semantic Parsing
  Authors: Yu et al.
  Published in: LREC 2020
  [paper]
- Paraphrasing to Improve the Performance of Electronic Health Records Question Answering
  Authors: Soni, Roberts
  Published in: AMIA Joint Summits on Translational Science Proceedings (2020)
  [paper]
- Knowledge Graph-based Question Answering with Electronic Health Records
  Authors: Park et al.
  Published in: ML4H 2020
  [arxiv]
- emrKBQA: A Clinical Knowledge-Base Question Answering Dataset
  Authors: Raghavan et al.
  Published in: ACL 2021 BioNLP Workshop
  [paper]
- Question Answering for Complex Electronic Health Records Database using Unified Encoder-Decoder Architecture
  Authors: Bae et al.
  Published in: ML4H 2021
  [arxiv]
- Uncertainty-Aware Text-to-Program for Question Answering on Structured Electronic Health Records
  Authors: Kim et al.
  Published in: CHIL 2022
  [arxiv]
- DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries
  Authors: Bardhan et al.
  Published in: LREC 2022
  [arxiv]
- Learning to Ask Like a Physician
  Authors: Lehman et al.
  Published in: ACL 2022 Clinical NLP Workshop
  [arxiv]
- RadQA: A Question Answering Dataset to Improve Comprehension of Radiology Reports
  Authors: Soni et al.
  Published in: LREC 2022
  [paper]
- EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
  Authors: Lee et al.
  Published in: NeurIPS 2022 (Datasets and Benchmarks Track)
  [arxiv] | [code]
- LeafAI: query generator for clinical cohort discovery rivaling a human programmer
  Authors: Dobbins et al.
  Published in: JAMIA 2023
  [arxiv]
- Toward a Neural Semantic Parsing System for EHR Question Answering
  Authors: Soni, Roberts
  Published in: AMIA Annual Symposium Proceedings 2023
  [paper]
- quEHRy: a question answering system to query electronic health records
  Authors: Soni et al.
  Published in: JAMIA 2023
  [paper]
- ECG-QA: A Comprehensive Question Answering Dataset Combined With Electrocardiogram
  Authors: Oh et al.
  Published in: NeurIPS 2023 (Datasets and Benchmarks Track)
  [arxiv]
- MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
  Authors: Fleming et al.
  Published in: AAAI 2024
  [arxiv]
- Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization
  Authors: Van Veen et al.
  Published in: Nature Medicine (2024)
  [arxiv]
- Question Answering for Electronic Health Records: A Scoping Review of Datasets and Models
  Authors: Bardhan et al.
  Published in: JMIR 2024
  [arxiv]
- EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
  Authors: Bae, Kyung et al.
  Published in: NeurIPS 2023 (Datasets and Benchmarks Track)
  [arxiv] | [code] | [physionet]
- EHRAgent: Code Empowers Large Language Models for Few-shot Complex Tabular Reasoning on Electronic Health Records
  Authors: Shi, Xu et al.
  Published in: EMNLP 2024
  [arxiv] | [code]
- EHRNoteQA: A Patient-Specific Question Answering Benchmark for Evaluating Large Language Models in Clinical Settings
  Authors: Kweon, Kim et al.
  Published in: NeurIPS 2024 (Datasets and Benchmarks Track)
  [arxiv] | [code] | [physionet]
- A Benchmark of Domain-Adapted Large Language Models for Generating Brief Hospital Course Summaries
  Authors: Aali et al.
  Status: arXiv preprint
  [arxiv]
- Explainable Automated Fact-Checking for Public Health Claims
  Authors: Neema Kotonya, Francesca Toni
  Published in: EMNLP 2020
  [paper] | [github]
- Evidence-based Fact-Checking of Health-related Claims
  Authors: Sarrouti et al.
  Published in: Findings of EMNLP 2021
  [paper] | [code]
- HealthFC: Verifying Health Claims with Evidence-Based Medical Fact-Checking
  Authors: Juraj Vladika, Phillip Schneider, Florian Matthes
  Published in: LREC-COLING 2024
  [paper] | [github]
- DOSSIER: Fact Checking in Electronic Health Records while Preserving Patient Privacy
  Authors: Zhang et al.
  Published in: MLHC 2024
  [paper] | [code]
- EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records
  Authors: Kwon, Kim, et al.
  Published in: NeurIPS 2024
  [arxiv] | [github] | [physionet]
- FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models
  Authors: Heiman et al.
  Published in: CVPR 2025
  [arxiv] | [github]
- VeriFact: Verifying Facts in LLM-Generated Clinical Text with Electronic Health Records
  Authors: Chung et al.
  Status: arXiv preprint
  [arxiv] | [github] | [physionet]
- (2019/12) MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports [paper]
- (2019/01) MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs [paper]
- (2023/10) Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge [paper]
- (2024/03) A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities [paper]
- (2024/04) RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis [paper]
- (2024/06) Shadow and Light: Digitally Reconstructed Radiographs for Disease Classification [paper]
- (2024/08) MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine [paper]
- (2024/01) CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation [paper]
- (2024/05) Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [paper]
- (2020/04) CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT [paper]
- (2021/06) RadGraph: Extracting Clinical Entities and Relations from Radiology Reports [paper]
- (2023/08) RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction [paper]
- (2023/09) Evaluating progress in automatic chest x-ray radiology report generation [paper]
- (2023/11) Radiology-Aware Model-Based Evaluation Metric for Report Generation [paper]
- (2024/03) Evaluating GPT-V4 (GPT-4 with Vision) on Detection of Radiologic Findings on Chest Radiographs [paper]
- (2024/04) LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation [paper]
- (2024/05) GREEN: Generative Radiology Report Evaluation and Error Notation [paper]
- MediConfusion: Can You Trust Your AI Radiologist? Probing the Reliability of Multimodal Medical Foundation Models
  Authors: Sepehri et al.
  Published in: ICLR 2025
  [arxiv]
- MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models
  Authors: Xia et al.
  Published in: ICLR 2025
  [arxiv]
- GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning
  [arxiv]
- Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence
  [arxiv]