Welcome to the official repository for Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature.
Arxiv: Arxiv | Paper: PSB 2024 | Demo: Live App
If you would like to see some functionality or have a comment, open an issue on this repo, we will try to reply as soon as possible
Jul 05, 2024 Update
: Added Support for Google's API models
Millions of medical research articles are published every year. On the other side, healthcare professionals and medical researchers are expected to stay abreast of the latest scientific discoveries pertinent to their daily practice. However, with limited time and a broad field to cover, keeping up-to-date can be a challenging task. Clinfo.AI searches and synthesizes medical literature tailored to a specific clinical question to provide an answer grounded on indexed literature. By leveraging a chain of LLMS clinfo.ai, can analyze the context of the inquiry to identify and present the most relevant articles pertinent to a scientific question.
Questions based on scientific evidence reported in the literature, for example:
-
What percentage of HIV-positive patients transmit the virus to their children?
-
When do most episodes of COVID-19 rebound after stopping paxlovid treatment?
-
Does magnesium consumption significantly improve sleep quality?
Broad questions: These types of questions could potentially be answered by clinfo.AI, but it is highly probable you won’t get what you are looking for. How to correct this type of question? Provide context. For example: "Chest pain pediatrics?"
We recommend asking a specific question to get the best answer:
Original Question: "Chest pain pediatrics?"
Improved Question: "What are common causes of chest pain in pediatric patients?"
Clinfo.AI is a RetA LLM system, it consists of a collection of four LLMs working conjointly (an LLM chain) coupled to a Search Index as depicted in the above Figure:
- First, the input (the question submitted by the user) is converted to a query by an LLM (Question2Query). E.g. for PubMed, the question is converted to a query containing MeSH terms.
- The generated queries are then used to retrieve articles from indexed sources (e.g. PubMed)
- Then give an article and the original question an LLM is tasked to classify if the article is relevant (if enabled BM25 is used to rank the selected articles).
- Relevant articles are individually summarized by an LLM.
- Lasyty an LLM aggregates all summaries to provide an overview of all relevant articles.
Create an OpenAI account, get an API Key, and edit the key field OPENAI_API_KEY
in config.py
with your own key.
Clinfo.ai retrieves literature using the NCBI API; while access does not require an account, calls are limited for unregistered users. We recommend creating an NCBI account. Once generated, save the NCBI API key and email under NCBI_API_KEY
and EMAIL
, respectively.
In summary edit the following variables inside config.py:
OPENAI_API_KEY = "YOUR API TOKEN"
NCBI_API_KEY = "YOUR API TOKEN" (optional)
EMAIL = "YOUR EMAIL" (optional)
from src.clinfoai.clinfoai import ClinfoAI
from config import OPENAI_API_KEY, NCBI_API_KEY, EMAIL
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
question = "What is the prevalence of COVID-19 in the United States?"
clinfo = ClinfoAI(llm="gpt-3.5-turbo",openai_key=OPENAI_API_KEY, email= EMAIL)
answer = clinfo.forward(question=question)
src/notebooks/01_UsingClinfoAI.ipynb
has a quick run-through and explanation for each individaul clinfo.AI component.
Clinfo.ai has full integration with vLLM. We can use any open source LLM as a backbone following two simple steps:
First, we use vLLM to create an API selecting the model you want to work with:
In the following example we use Qwen/Qwen2-beta-7B-Chat
python -m vllm.entrypoints.openai.api_server --model Qwen/Qwen2-beta-7B-Chat
Instantiate a clinfoAI object with the desired LLM :
from src.clinfoai.clinfoai import ClinfoAI
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
question = "What is the prevalence of COVID-19 in the United States?"
clinfo = ClinfoAI(llm="Qwen/Qwen2-beta-7B-Chat")
answer = clinfo.forward(question=question)
While anyone can use Clinfo.AI, our goal is to augment medical experts not replace them. Read our disclaimer disclaimer and DO NOT use clinfo.AI for medical diagnosis.
If you use Clinfo.ai, please consider citing:
@inproceedings{lozano2023clinfo,
title={Clinfo. ai: An open-source retrieval-augmented large language model system for answering medical questions using scientific literature},
author={Lozano, Alejandro and Fleming, Scott L and Chiang, Chia-Chun and Shah, Nigam},
booktitle={PACIFIC SYMPOSIUM ON BIOCOMPUTING 2024},
pages={8--23},
year={2023},
organization={World Scientific}
}