

# Building a healthcare chatbot with Mixtral 8x7b, Haystack, and PubMed


*notebook by Tilde Thurium:
 [Mastodon](https://tech.lgbt/@annthurium) || [Twitter](https://twitter.com/annthurium) || [LinkedIn](https://www.linkedin.com/in/annthurium/)*

## Introduction
**📚 Check out the [Building a healthcare chatbot with Mixtral 8x7b, Haystack, and PubMed](https://haystack.deepset.ai/blog/mixtral-8x7b-healthcare-chatbot) article for a detailed run through of this example.**

**Prerequisites:**

*   [HuggingFace Access Token](https://huggingface.co/settings/tokens)

## Installing Haystack

To start, let's install the latest release of Haystack with `pip`, as well as any other libraries we're going to need:

In [None]:
%%bash

pip install haystack-ai
pip install pymed
pip install huggingface_hub
pip install transformers

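The next cell authenticates with Hugging Face. `notebook_login()` prompts for your access token; to enter the token manually instead, uncomment the `getpass` lines.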

In [None]:
# For the manual token authentication
# from getpass import getpass
# huggingface_token = getpass("Enter your Hugging Face api token:")

# For automatic token authentication
from huggingface_hub import notebook_login

notebook_login()

## PubMed Fetcher

PubMed is a large, regularly updated index of biomedical research. We are going to write our own custom component class to pull scientific papers from PubMed that are relevant to the query at hand.

The `pymed` library wraps the PubMed API so it's easier to query.


In [None]:
# pymed → A library for querying PubMed articles.
# List → Used for type hinting to specify lists.
# haystack.component → Defines reusable Haystack components.
# haystack.Document → A class that stores the text content and metadata of a document.
from pymed import PubMed
from typing import List
from haystack import component
from haystack import Document
import os

# Before launching the notebook, run this in a terminal with your actual key:
#   export NCBI_API_KEY="your_actual_api_key"
# The variable exists only in that terminal session.
# Retrieve the API key from the environment; os.getenv returns None if the variable is unset.
api_key = os.getenv("NCBI_API_KEY")

# Creates an instance of the PubMed API client with a user-defined identifier to track API usage.
pubmed = PubMed(tool = "PubMed_ChatBot", email = "hxb294@case.edu")
pubmed.api_key = api_key  # Securely set API key


# Takes a PubMed article and converts it into a Haystack Document.
def documentize(article):
  return Document(content = article.abstract, meta = {'title': article.title, 'keywords': article.keywords})

# max_results defines the maximum number of articles to fetch per query.
# Modify this value to adjust how many articles are retrieved for each search term.
max_results = 1  

# Defines a custom Haystack component that fetches articles from PubMed.
@component
class PubMedFetcher():

  # The decorator specifies that run() outputs a list of Haystack Document objects.
  # queries: list[str] → run() expects a list of search queries.
  @component.output_types(articles = List[Document])
  def run(self, queries: list[str]):

    # Takes the first reply and splits it on newlines so that each line becomes one search term.
    # Every term is stripped of surrounding whitespace, and empty lines are dropped.
    # queries = ["   diabetes research \n   insulin therapy \n"] -> ["diabetes research", "insulin therapy"]
    cleaned_queries = [q.strip() for q in queries[0].split('\n') if q.strip()]

    # Loops through each search term and fetches up to max_results articles per term.
    # pubmed.query() sends the term to PubMed; the response holds the articles PubMed returned for it.
    # The articles are converted into Haystack Document objects with documentize() and appended to the articles list.
    # queries = ["cancer treatment\nimmunotherapy\nlung cancer"] -> cleaned_queries = ["cancer treatment", "immunotherapy", "lung cancer"]
    # -> one pubmed.query() call each for "cancer treatment", "immunotherapy", and "lung cancer"
    articles = []
    
    try:
      for query in cleaned_queries:
        # The response is an iterable of article objects for this search term.
        response = pubmed.query(query, max_results = max_results)
        documents = [documentize(article) for article in response]
        articles.extend(documents)

    # Catches any errors that occur while querying PubMed and prints an error message.
    except Exception as e:
      print(e)
      print(f"Couldn't fetch articles for queries: {queries}")
    
    # Returns a dictionary containing a list of articles in Haystack Document format:
    # {'articles': [<list of Document objects>]}
    results = {'articles': articles}
    return results
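
The fetcher receives the keyword model's reply as one newline-separated string, and model output is not always tidy: it can contain blank lines, bullets, or numbering. The sketch below shows one way to normalize such a reply before querying PubMed. The helper name `clean_queries` and its `limit` parameter are hypothetical and not part of the notebook or of Haystack.

```python
import re
from typing import List, Optional

def clean_queries(reply: str, limit: Optional[int] = None) -> List[str]:
    """Split a newline-separated LLM reply into non-empty search terms."""
    terms = []
    for line in reply.splitlines():
        # Remove a leading bullet ("-", "*", "•") or a "1." / "1)" style number, if present.
        term = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip()
        if term:
            terms.append(term)
    # Optionally cap the number of terms (hypothetical convenience).
    return terms if limit is None else terms[:limit]

print(clean_queries("  diabetes research \n\n 2. insulin therapy \n- metformin\n"))
# ['diabetes research', 'insulin therapy', 'metformin']
```

A helper like this could replace the split inside `run()`, at the cost of also stripping a leading dash or number that was genuinely part of a search term.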

## LLM Setup

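Before we add our `PubMedFetcher` to a pipeline, we need the language models it will work with. We use Mixtral 8x7B twice: once to turn the question into search keywords, and once to answer the question from the fetched articles.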

In [None]:
# HuggingFaceTGIGenerator is used to interact with Hugging Face models (e.g., Mixtral-8x7B-Instruct-v0.1).
# Note: this class comes from early Haystack 2.x releases. If the import fails on a newer release,
# the likely replacement is HuggingFaceAPIGenerator (an assumption to check against your installed version).
# Pipeline and PromptBuilder, imported in a later cell, chain the components together and build the prompts.
from haystack.components.generators import HuggingFaceTGIGenerator
# If used manual token authentication -> from haystack.utils import Secret

# This is the model used specifically for keyword generation from the question you want to ask.
# Possible alternatives for future trials: BiomedBERT or BioBERT.
# This initializes the Mixtral-8x7B-Instruct-v0.1 model (hosted on Hugging Face) for keyword generation.
# warm_up() initializes the generator before the first request (the model itself stays hosted on Hugging Face).
# If you used manual token authentication, use instead: keyword_llm = HuggingFaceTGIGenerator(model = "mistralai/Mixtral-8x7B-Instruct-v0.1", token = Secret.from_token(huggingface_token))
keyword_llm = HuggingFaceTGIGenerator(model = "mistralai/Mixtral-8x7B-Instruct-v0.1")
keyword_llm.warm_up()

# This is the model used for generating the final answer to the user’s question based on the fetched articles from PubMed.
# Possible alternatives for future trials: ChatGPT, DeepSeek, or Claude.
# This initializes the Mixtral-8x7B-Instruct-v0.1 model again, this time for generating the final answer to the question based on the fetched articles.
# If you used manual token authentication, use instead: llm = HuggingFaceTGIGenerator(model = "mistralai/Mixtral-8x7B-Instruct-v0.1", token = Secret.from_token(huggingface_token))
llm = HuggingFaceTGIGenerator(model = "mistralai/Mixtral-8x7B-Instruct-v0.1")
llm.warm_up()

## Templates

In [None]:

# This is a prompt template used to generate keywords from a given medical question.
# It instructs the language model to extract num_keywords keywords from the question that can be used to search for relevant articles on PubMed.
# The prompt also includes an example (with three keywords) for the model to follow.

num_keywords = 1  # Change this value to set the number of keywords dynamically

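# This is an f-string, so Python consumes one level of braces before Jinja sees the template:
# {num_keywords} is filled in now, and the quadrupled braces around question are needed
# so that the doubled-brace Jinja placeholder for question survives.
keyword_prompt_template = f"""
Your task is to convert the following question into {num_keywords} keywords that can be used to find relevant medical research papers on PubMed.
Here is an example:
question: "What are the latest treatments for major depressive disorder?"
keywords:
Antidepressive Agents
Depressive Disorder, Major
Treatment-Resistant depression
---
question: {{{{ question }}}}
keywords:
"""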

# This template is used to create a prompt that asks the model to answer a medical question based on a list of articles fetched from PubMed.
# The model is instructed to use the content and keywords of the articles as context to generate an accurate answer.
prompt_template = """
Answer the question truthfully based on the given documents.
If the documents don't contain an answer, use your existing knowledge base.

q: {{ question }}
Articles:
{% for article in articles %}
  {{article.content}}
  keywords: {{article.meta['keywords']}}
  title: {{article.meta['title']}}
{% endfor %}

"""

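When a Jinja template is built with a Python f-string, Python processes the braces first, and a doubled brace is its escape for a single literal brace. The short sketch below shows how this affects a placeholder such as `question`. The variable names `collapsed` and `preserved` are illustrative only.

```python
num_keywords = 3  # illustrative value

# Doubled braces in an f-string collapse to single braces,
# so the result no longer contains a Jinja placeholder.
collapsed = f"Give {num_keywords} keywords for: {{ question }}"

# Quadrupled braces collapse to the doubled braces that Jinja expects.
preserved = f"Give {num_keywords} keywords for: {{{{ question }}}}"

print(collapsed)  # Give 3 keywords for: { question }
print(preserved)  # Give 3 keywords for: {{ question }}
```

An alternative design is to keep the template a plain string and pass the keyword count as an additional template variable, which avoids mixing the two brace syntaxes.
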
## RAG Pipeline

In [None]:
from haystack import Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder

# Pipeline: This creates a sequence of steps where data flows through each component.
# The components are connected in the following order:

# 1.	keyword_prompt_builder generates a prompt based on the question.
# 2.	keyword_llm generates keywords from the question.
# 3.	pubmed_fetcher uses the generated keywords to search PubMed and retrieve relevant articles.
# 4.	prompt_builder formats the articles as input for the final answer.
# 5.	llm generates the final answer based on the context provided by the articles.

keyword_prompt_builder = PromptBuilder(template = keyword_prompt_template)
prompt_builder = PromptBuilder(template = prompt_template)

fetcher = PubMedFetcher()

pipe = Pipeline()

# add_component() registers each component in the pipeline under a name. Each component performs a specific task.
# For example, keyword_prompt_builder is registered as "keyword_prompt_builder" so it can be referenced later.
pipe.add_component("keyword_prompt_builder", keyword_prompt_builder)
pipe.add_component("keyword_llm", keyword_llm)
pipe.add_component("pubmed_fetcher", fetcher)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

# connect() links the output of one component to the input of another, so data flows through the pipeline in the correct order.
# For example, the first call connects keyword_prompt_builder’s prompt output to keyword_llm’s prompt input.
pipe.connect("keyword_prompt_builder.prompt", "keyword_llm.prompt")
pipe.connect("keyword_llm.replies", "pubmed_fetcher.queries")
pipe.connect("pubmed_fetcher.articles", "prompt_builder.articles")
pipe.connect("prompt_builder.prompt", "llm.prompt")
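
To make the data flow concrete, the five steps can be mimicked with plain functions. This is only a sketch of the shapes passed between components: every function here is a hypothetical stand-in that returns canned text, and nothing calls Haystack, PubMed, or a model.

```python
from typing import Dict, List

def build_keyword_prompt(question: str) -> str:
    # Step 1: stand-in for keyword_prompt_builder.
    return f"Convert into keywords:\n{question}\nkeywords:"

def fake_keyword_llm(prompt: str) -> List[str]:
    # Step 2: stand-in for keyword_llm; a generator returns a list of reply strings.
    return ["mRNA vaccines\ncancer immunotherapy"]

def fake_fetcher(queries: List[str]) -> Dict[str, List[str]]:
    # Step 3: stand-in for pubmed_fetcher; one canned abstract per search term.
    terms = [t.strip() for t in queries[0].split("\n") if t.strip()]
    return {"articles": [f"abstract about {t}" for t in terms]}

def build_answer_prompt(question: str, articles: List[str]) -> str:
    # Step 4: stand-in for prompt_builder.
    return f"q: {question}\nArticles:\n" + "\n".join(articles)

def fake_llm(prompt: str) -> List[str]:
    # Step 5: stand-in for llm; reports how many abstracts it was given.
    return [f"answer based on {prompt.count('abstract about')} article(s)"]

question = "How are mRNA vaccines being used for cancer treatment?"
replies = fake_keyword_llm(build_keyword_prompt(question))
articles = fake_fetcher(replies)["articles"]
answer = fake_llm(build_answer_prompt(question, articles))[0]
print(answer)  # answer based on 2 article(s)
```

Note that the question enters twice, once for each prompt-building step, while the keyword replies and the articles travel along the connections.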


While we're at it, let's write an `ask` function to wrap the pipeline call. It pulls the answer out of the results and prints it after the question; an optional, commented-out line displays the answer in blue instead.

In [None]:
from IPython.display import display, HTML

# Controls how many tokens the model will generate as a response. A token in NLP (Natural Language Processing) is usually a word or part of a word.
max_new_tokens = 500

# This function takes a question, runs it through the pipeline, and generates an answer.
# The result is printed out, showing the question followed by the generated answer.
def ask(question):
  output = pipe.run(data = {"keyword_prompt_builder": {"question": question},
                            "prompt_builder": {"question": question},
                            "llm": {"generation_kwargs": {"max_new_tokens": max_new_tokens}}})
  print(question)
  print(output['llm']['replies'][0])
  # Optional: show the answer in blue instead of printing it.
  # display(HTML(f'<div style="color: blue">{output["llm"]["replies"][0]}</div>'))
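
If you display the answer as HTML, it may be worth escaping the model output first, since generated text can contain characters such as `<` or `&`. A minimal sketch, assuming a hypothetical helper named `format_answer_html` that is not part of the notebook:

```python
import html

def format_answer_html(answer: str, color: str = "blue") -> str:
    # Escape the model output so characters like "<" or "&" cannot break the markup.
    return f'<div style="color: {color}">{html.escape(answer)}</div>'

snippet = format_answer_html("Doses < 5 mg & placebo")
print(snippet)
# <div style="color: blue">Doses &lt; 5 mg &amp; placebo</div>
```

In the notebook, the returned string could be passed to `display(HTML(...))` from `IPython.display`.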


Give it a try!

In [None]:
# This is an example query passed to the pipeline to fetch relevant medical research and generate an answer.
# The function processes the question and outputs the generated response based on the relevant articles from PubMed.

ask("How are mRNA vaccines being used for cancer treatment?")

## What's next

In this tutorial, you learned to make a basic chatbot using a custom Haystack component that retrieves data from PubMed. To make this chatbot fancier, you could make improvements like:


*   Adding additional data sources
*   Giving the bot a name or personality
*   Making it a [stateful agent so it remembers past queries](https://docs.haystack.deepset.ai/docs/agent#conversational-agent-memory)

To see what else is possible with Haystack, you can [browse these tutorials](https://haystack.deepset.ai/tutorials) or check us out on [GitHub](https://github.com/deepset-ai/haystack). Thanks for reading!




