# LLM и RAG

# Кияко Елизавета и Фуад Бабаев

В этом проекте мы делаем чат-бота-врача, используя технику RAG (Retrieval-Augmented-Generation) и фреймворки huggingface и LangChain

Устанавливаем необходимые библиотеки

In [14]:
!pip install datasets langchain_community langchain_chroma langchain langchain_core tiktoken sentence-transformers==2.2.2 lark InstructorEmbedding bitsandbytes accelerate >> /dev/null

Загрузим датасет medal https://huggingface.co/datasets/bigbio/medal

### Загрузим данные MEDAL
Данные содержат медицинские статьи для различных клинических диагнозов

https://github.com/McGill-NLP/medal


Нас интересуют колонки TEXT и LABEL


In [2]:
!wget -nc -P data/ https://zenodo.org/record/4482922/files/train.csv

--2024-12-21 06:57:58--  https://zenodo.org/record/4482922/files/train.csv
Resolving zenodo.org (zenodo.org)... 188.185.48.194, 188.185.45.92, 188.185.43.25, ...
Connecting to zenodo.org (zenodo.org)|188.185.48.194|:443... connected.
HTTP request sent, awaiting response... 301 MOVED PERMANENTLY
Location: /records/4482922/files/train.csv [following]
--2024-12-21 06:57:59--  https://zenodo.org/records/4482922/files/train.csv
Reusing existing connection to zenodo.org:443.
HTTP request sent, awaiting response... 200 OK
Length: 3541556520 (3.3G) [text/plain]
Saving to: ‘data/train.csv’


2024-12-21 07:02:29 (12.5 MB/s) - ‘data/train.csv’ saved [3541556520/3541556520]



Положим на гугл диск, чтобы можно было быстро доставать

In [1]:
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [4]:
a = 'data/train.csv'
!cp $a "/content/gdrive/MyDrive"

Можем немного посмотреть на датасет

In [2]:
!cat /content/gdrive/MyDrive/train.csv | head

ABSTRACT_ID,TEXT,LOCATION,LABEL
14145090,velvet antlers vas are commonly used in traditional chinese medicine and invigorant and contain many PET components for health promotion the velvet antler peptide svap is one of active components in vas based on structural study the svap interacts with tgfÎ² receptors and disrupts the tgfÎ² pathway we hypothesized that svap prevents cardiac fibrosis from pressure overload by blocking tgfÎ² signaling SDRs underwent TAC tac or a sham operation T3 one month rats received either svap mgkgday or vehicle for an additional one month tac surgery induced significant cardiac dysfunction FB activation and fibrosis these effects were improved by treatment with svap in the heart tissue tac remarkably increased the expression of tgfÎ² and connective tissue growth factor ctgf ROS species C2 and the phosphorylation C2 of smad and ERK kinases erk svap inhibited the increases in reactive oxygen species C2 ctgf expression and the phosphorylation of smad and erk bu

### Чтение и индексация данных

In [3]:
from langchain_community.document_loaders.csv_loader import CSVLoader


FILE_PATH = '/content/gdrive/MyDrive/train.csv'
loader = CSVLoader(file_path=FILE_PATH)
docs = loader.lazy_load()

In [17]:
!pip install huggingface_hub==0.25.00

Collecting huggingface_hub==0.25.00
  Downloading huggingface_hub-0.25.0-py3-none-any.whl.metadata (13 kB)
Downloading huggingface_hub-0.25.0-py3-none-any.whl (436 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m436.4/436.4 kB[0m [31m12.0 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: huggingface_hub
  Attempting uninstall: huggingface_hub
    Found existing installation: huggingface-hub 0.27.0
    Uninstalling huggingface-hub-0.27.0:
      Successfully uninstalled huggingface-hub-0.27.0
Successfully installed huggingface_hub-0.25.0


Задаем модель для эмбединга наших документов

In [4]:
from langchain.embeddings import HuggingFaceInstructEmbeddings
import torch

emb_model = HuggingFaceInstructEmbeddings(model_name='hkunlp/instructor-large', model_kwargs={'device':'cuda' if torch.cuda.is_available() else 'cpu'})

  from tqdm.autonotebook import trange


load INSTRUCTOR_Transformer
max_seq_length  512


  model.load_state_dict(torch.load(os.path.join(input_path, 'pytorch_model.bin'), map_location=torch.device('cpu')))


Задаем индекс для наших эмбедингов

In [5]:
from langchain.vectorstores import Chroma

persist_directory = 'DB'

vectordb = Chroma(persist_directory=persist_directory, embedding_function = emb_model)
vectordb.persist()

  vectordb = Chroma(persist_directory=persist_directory, embedding_function = emb_model)
  vectordb.persist()


Проиндексируем документы. Делаем это для первых N_DOCS штук (все 3млн долго)

In [6]:
from tqdm.auto import tqdm
from langchain.docstore.document import Document

N_DOCS=1000
i = 0
batch = []

for i, doc in tqdm(enumerate(docs), total=N_DOCS):
  content = doc.page_content
  # print(content)
  text = content.split('\n')[1][6:]
  label = content.split('\n')[3][7:]
  batch.append(Document(page_content=text, metadata={'diagnosis': label}))
  vectordb.add_documents(batch)
  batch = []
  if i >= 1000:
    break

vectordb.persist()

  0%|          | 0/1000 [00:00<?, ?it/s]

In [7]:
vectordb.persist()

In [8]:
!ls -lht DB

total 24M
-rw-r--r-- 1 root root  24M Dec 21 07:24 chroma.sqlite3
drwxr-xr-x 2 root root 4.0K Dec 21 07:15 a5947ad9-c88f-40fc-959b-5aa9d4295b0f


In [9]:
import gc
gc.collect()

81

Теперь подгрузим саму LLM, мы выбрали небольшую vicuna-7b

In [10]:
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
import torch

In [11]:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "lmsys/vicuna-7b-v1.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", load_in_4bit=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=200)

llm = HuggingFacePipeline(pipeline=pipe)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Device set to use cuda:0
  llm = HuggingFacePipeline(pipeline=pipe)


Определяем ретривер и проверяем, что поиск по индексу работает

In [12]:
retriever = vectordb.as_retriever()

In [13]:
retriever.invoke("ceftobiprole bpr")

[Document(metadata={'diagnosis': 'methicillinsusceptible s aureus'}, page_content='ceftobiprole bpr is an investigational cephalosporin with activity against staphylococcus aureus including methicillinresistant s aureus mrsa strains the pharmacodynamic pd profile of bpr against s aureus strains with a variety of susceptibility phenotypes in an immunocompromised murine pneumonia model was characterized the bpr mics of the test isolates ranged from to mugml pharmacokinetic pk studies were conducted with infected neutropenic balbc mice and the bpr concentrations were measured in plasma epithelial lining fluid elf and lung tissue pd studies with these mice were undertaken with eight s aureus isolates two MSSA strains three hospitalacquired mrsa strains and three CA mrsa strains subcutaneous bpr doses of to mgkg of body weightday were administered and the NC in the number of log cfuml in lungs was evaluated after h of therapy the pd profile was characterized by using the free drug exposures

Все работает

### Prompt Engineering. Создание Prompt Template для выбранной LLM модели

In [15]:
!pip install --upgrade transformers

Collecting transformers
  Downloading transformers-4.47.1-py3-none-any.whl.metadata (44 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m44.1/44.1 kB[0m [31m3.5 MB/s[0m eta [36m0:00:00[0m
Collecting tokenizers<0.22,>=0.21 (from transformers)
  Downloading tokenizers-0.21.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Downloading transformers-4.47.1-py3-none-any.whl (10.1 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m10.1/10.1 MB[0m [31m68.1 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading tokenizers-0.21.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m3.0/3.0 MB[0m [31m43.9 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: tokenizers, transformers
  Attempting uninstall: tokenizers
    Found existing installation: tokenizers 0.20.3
    Uninstalling tokenizers-0.20.3:
      Successfully uninstalled tokenizers-0.20

Токенизатор для нашей модели

In [20]:
from pprint import pp as pprint
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)

Создадим темплейты (учитывабщие и не учитывающие историю)

prompt template можно посмотреть на https://github.com/chujiezheng/chat_templates и replicate.com. Например, для LLama 3 они тут

https://replicate.com/meta/meta-llama-3-70b-instruct
https://github.com/chujiezheng/chat_templates/blob/main/chat_templates/llama-3-chat.jinja

In [21]:
tokenizer.bos_token

'<s>'

In [22]:
"""LLama 3 template:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>

{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""

# Для викуни такой темплейт:
"""Vicuna jinja template:
{% if messages[0]['role'] == 'system' %}
    {% set loop_messages = messages[1:] %}
    {% set system_message = messages[0]['content'].strip() + '\n\n' %}
{% else %}
    {% set loop_messages = messages %}
    {% set system_message = '' %}
{% endif %}

{{ bos_token + system_message }}
{% for message in loop_messages %}
    {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}
        {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}
    {% endif %}

    {% if message['role'] == 'user' %}
        {{ 'USER: ' + message['content'].strip() + '\n' }}
    {% elif message['role'] == 'assistant' %}
        {{ 'ASSISTANT: ' + message['content'].strip() + eos_token + '\n' }}
    {% endif %}

    {% if loop.last and message['role'] == 'user' and add_generation_prompt %}
        {{ 'ASSISTANT:' }}
    {% endif %}
{% endfor %}
"""

from langchain.prompts import PromptTemplate


SYSTEM_PROMPT = """A chat between a curious user and an artificial intelligence assistant.
The assistant gives helpful, detailed, and polite answers to the user's questions."""
SYSTEM_PROMPT = ' '.join(SYSTEM_PROMPT.split('\n'))

USE_HISTORY = False
if USE_HISTORY:
    # это если у нас есть история диалога
    instruction = '''Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff.
You should answer USER questions about diseases, treatment and other questions concerning health.
Use the following context from USER to answer questions. Question is the last USER's query.
A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer.
After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question.
Be as detailed as possible, but don't make up any information that's not from the context.
If you don't know an answer, say you don't know.'''
    instruction = ' '.join(instruction.split('\n'))
    prompt_template = '''<s>{SYSTEM_PROMPT}

{instruction}

DOCUMENT: {document}

{history}
USER: {input}
ASSISTANT:'''
    prompt = PromptTemplate(input_variables=["document", 'input', 'history'],
                            partial_variables={"SYSTEM_PROMPT": SYSTEM_PROMPT, "instruction": instruction},
                            template=prompt_template)
else:
    instruction = '''Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff.
You should answer USER questions about diseases, treatment and other questions concerning health.
Use the following context from USER to answer questions. Question is the last sentence of USER's query and goes after words "Main Question: ".
Be as detailed as possible, but don't make up any information that's not from the context.
If you don't know an answer, say you don't know.'''
    instruction = ' '.join(instruction.split('\n'))
    prompt_template = '''<s>{SYSTEM_PROMPT}

{instruction}

USER: {context}. Main Question: {question}
ASSISTANT:'''                             # тут документ запихнули в первую реплику пользователя - это один из методов
    prompt = PromptTemplate(input_variables=["context", 'question'],
                            partial_variables={"SYSTEM_PROMPT": SYSTEM_PROMPT, "instruction": instruction},
                            template=prompt_template)
prompt

PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={'SYSTEM_PROMPT': "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.", 'instruction': 'Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last sentence of USER\'s query and goes after words "Main Question: ". Be as detailed as possible, but don\'t make up any information that\'s not from the context. If you don\'t know an answer, say you don\'t know.'}, template='<s>{SYSTEM_PROMPT}\n\n{instruction}\n\nUSER: {context}. Main Question: {question}\nASSISTANT:')

### Создание Chain (Langchain pipeline)

In [23]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers.string import StrOutputParser

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()} | # feature engineering (retrieval augmentation)
    prompt | # препроцессинг (prompt engineering)
    llm | # модель
    StrOutputParser() # постпроцессинг
)

In [24]:
chain.invoke('How to treat pneumonia?')



'<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user\'s questions.\n\nYour role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last sentence of USER\'s query and goes after words "Main Question: ". Be as detailed as possible, but don\'t make up any information that\'s not from the context. If you don\'t know an answer, say you don\'t know.\n\nUSER: venovenous extracorporeal membrane oxygenation ecmo is increasingly used in patients with respiratory failure who fail CT postoperative pneumonia is the most common infection LT imipenem is frequently used for empirical treatment of NP in the intensive ECU nevertheless few data are available on the impact of ecmo on pharmacokin

Видно, что нашелся довольно релевантный документ (в нем хотя бы речь про пневмонию), при этом ответ модели тоже осмысленный и содержит нужную информацию

In [25]:
invoked = chain.invoke('Tell in details what is ceftobiprole bpr?')
invoked



'<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user\'s questions.\n\nYour role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last sentence of USER\'s query and goes after words "Main Question: ". Be as detailed as possible, but don\'t make up any information that\'s not from the context. If you don\'t know an answer, say you don\'t know.\n\nUSER: ceftobiprole bpr is an investigational cephalosporin with activity against staphylococcus aureus including methicillinresistant s aureus mrsa strains the pharmacodynamic pd profile of bpr against s aureus strains with a variety of susceptibility phenotypes in an immunocompromised murine pneumonia model was characterized the bpr

In [26]:
invoked[invoked.find('ASSISTANT:'):] # это ответ модели

'ASSISTANT: Ceftobiprole bpr is an investigational cephalosporin antibiotic that has been shown to have activity against various strains of Staphylococcus aureus, including methicillin-resistant S. aureus (MRSA) strains. In a murine pneumonia model, the pharmacodynamic profile of ceftobiprole was characterized against a variety of S. aureus strains with different levels of susceptibility. The results showed that the antibacterial effects of ceftobiprole were maximized when the free drug exposure (ft mic) ranged from 0.5 to 1 µg/mL, regardless of the phenotypic profile of resistance to beta-lactams, fluoroquinolones, erythromycin, clindamycin, or tetracyclines. The ph'

In [27]:
invoked[invoked.find('USER:'):invoked.find('USER:')+1000] # это часть контекста из ретривера

'USER: ceftobiprole bpr is an investigational cephalosporin with activity against staphylococcus aureus including methicillinresistant s aureus mrsa strains the pharmacodynamic pd profile of bpr against s aureus strains with a variety of susceptibility phenotypes in an immunocompromised murine pneumonia model was characterized the bpr mics of the test isolates ranged from to mugml pharmacokinetic pk studies were conducted with infected neutropenic balbc mice and the bpr concentrations were measured in plasma epithelial lining fluid elf and lung tissue pd studies with these mice were undertaken with eight s aureus isolates two MSSA strains three hospitalacquired mrsa strains and three CA mrsa strains subcutaneous bpr doses of to mgkg of body weightday were administered and the NC in the number of log cfuml in lungs was evaluated after h of therapy the pd profile was characterized by using the free drug exposures f determined from the following parameters the percentage of time that the 

### Добавим в пайплайн историю переписки с ботом

По-хорошему, можно использовать ConversationBufferMemory (но, спойлер, так не вышло)

In [28]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

memory = ConversationBufferMemory()
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("hello my friend!")
memory.load_memory_variables({})

  memory = ConversationBufferMemory()


{'history': 'Human: hi!\nAI: hello my friend!'}

Тут были эксперименты с ConversationalRetrievalChain, но так сделать не вышло, поэтому удалили. Искали доки и примеры, но ничего не завелось, сделаем по-другому:

Переопределим промпт, чтобы использовалась история диалога

In [29]:
from langchain.prompts import PromptTemplate


SYSTEM_PROMPT = """A chat between a curious user and an artificial intelligence assistant.
The assistant gives helpful, detailed, and polite answers to the user's questions."""
SYSTEM_PROMPT = ' '.join(SYSTEM_PROMPT.split('\n'))

USE_HISTORY = True
if USE_HISTORY:
    # это если у нас есть история диалога
    instruction = '''Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff.
You should answer USER questions about diseases, treatment and other questions concerning health.
Use the following context from USER to answer questions. Question is the last USER's query.
A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer.
After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question.
Be as detailed as possible, but don't make up any information that's not from the context.
If you don't know an answer, say you don't know.'''
    instruction = ' '.join(instruction.split('\n'))
    prompt_template = '''<s>{SYSTEM_PROMPT}

{instruction}

DOCUMENT: {context}

{chat_history}
ASSISTANT:'''
    prompt = PromptTemplate(input_variables=['context', 'chat_history'],
                            partial_variables={"SYSTEM_PROMPT": SYSTEM_PROMPT, "instruction": instruction},
                            template=prompt_template)
else:
    instruction = '''Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff.
You should answer USER questions about diseases, treatment and other questions concerning health.
Use the following context from USER to answer questions. Question is the last sentence of USER's query and goes after words "Main Question: ".
Be as detailed as possible, but don't make up any information that's not from the context.
If you don't know an answer, say you don't know.'''
    instruction = ' '.join(instruction.split('\n'))
    prompt_template = '''<s>{SYSTEM_PROMPT}

{instruction}

USER: {context}. Main Question: {question}
ASSISTANT:'''                             # тут документ запихнули в первую реплику пользователя - это один из методов
    prompt = PromptTemplate(input_variables=["context", 'question'],
                            partial_variables={"SYSTEM_PROMPT": SYSTEM_PROMPT, "instruction": instruction},
                            template=prompt_template)
prompt

PromptTemplate(input_variables=['chat_history', 'context'], input_types={}, partial_variables={'SYSTEM_PROMPT': "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.", 'instruction': 'Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER\'s query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER\'s question. Be as detailed as possible, but don\'t make up any information that\'s not from the context. If you don\'t know an answer, say you don\'t know.'}, template

In [35]:
class InMemoryHistory:
    def __init__(self):
        self.messages = []

    def add_message(self, message):
        """Add a single message to the history."""
        self.messages.append(message)

    def format_history(self):
        """Format the history with special tokens."""
        return "\n".join(self.messages)

chat_history = InMemoryHistory()

chain = (
    {"context": retriever | format_docs, "chat_history": RunnablePassthrough()} | # feature engineering (retrieval augmentation)
    prompt | # препроцессинг (prompt engineering)
    llm | # модель
    StrOutputParser() # постпроцессинг
)

question = 'What is pneumania?'
chat_history.add_message(f"USER: {question}")
formatted_history = chat_history.format_history()
LLM_out = chain.invoke(formatted_history)
chat_history.add_message(f"{LLM_out[LLM_out.find('ASSISTANT: '): ]}")



In [36]:
LLM_out

'<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user\'s questions.\n\nYour role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER\'s query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER\'s question. Be as detailed as possible, but don\'t make up any information that\'s not from the context. If you don\'t know an answer, say you don\'t know.\n\nDOCUMENT: legionnaires disease is a modern environmental infectious disease it stems from the capacity of the causative agent legio

In [37]:
question = 'How to treat it?'
chat_history.add_message(f"USER: {question}")
formatted_history = chat_history.format_history()
LLM_out = chain.invoke(formatted_history)
chat_history.add_message(f"{LLM_out[LLM_out.find('ASSISTANT: '): ]}")



In [38]:
LLM_out

'<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user\'s questions.\n\nYour role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER\'s query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER\'s question. Be as detailed as possible, but don\'t make up any information that\'s not from the context. If you don\'t know an answer, say you don\'t know.\n\nDOCUMENT: legionnaires disease is a modern environmental infectious disease it stems from the capacity of the causative agent legio

In [39]:
chat_history = InMemoryHistory()

for question in ['What is pneumania?', 'How to treat it?', 'What is the most common cause of pneumonia?', 'How to avoid this disease?']:
  chat_history.add_message(f"USER: {question}")
  formatted_history = chat_history.format_history()
  LLM_out = chain.invoke(formatted_history)
  chat_history.add_message(f"{LLM_out[LLM_out.find('ASSISTANT: '): ]}")
  print(LLM_out)
  print()
  print('########################################################################')
  print()



<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER's query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question. Be as detailed as possible, but don't make up any information that's not from the context. If you don't know an answer, say you don't know.

DOCUMENT: legionnaires disease is a modern environmental infectious disease it stems from the capacity of the causative agent legionella to mul



<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER's query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question. Be as detailed as possible, but don't make up any information that's not from the context. If you don't know an answer, say you don't know.

DOCUMENT: legionnaires disease is a modern environmental infectious disease it stems from the capacity of the causative agent legionella to mul



<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER's query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question. Be as detailed as possible, but don't make up any information that's not from the context. If you don't know an answer, say you don't know.

DOCUMENT: legionnaires disease is a modern environmental infectious disease it stems from the capacity of the causative agent legionella to mul



<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER's query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question. Be as detailed as possible, but don't make up any information that's not from the context. If you don't know an answer, say you don't know.

DOCUMENT: ceftobiprole bpr is an investigational cephalosporin with activity against staphylococcus aureus including methicillinresistant s aur

Там в один момент сама модель стала за пользователя генерить вопрос и отвечать на него :))

Можно сделать то же самое, но через while и question = input(). Получится полноценная qa система в этом ноутбуке

In [41]:
chat_history = InMemoryHistory()

try:
    while True:
        question = input()
        chat_history.add_message(f"USER: {question}")
        formatted_history = chat_history.format_history()
        LLM_out = chain.invoke(formatted_history)
        chat_history.add_message(f"{LLM_out[LLM_out.find('ASSISTANT: '): ]}")
        print(LLM_out)
        print()
        print('########################################################################')
        print()
except KeyboardInterrupt:
    pass

What is pneumania?




<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER's query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question. Be as detailed as possible, but don't make up any information that's not from the context. If you don't know an answer, say you don't know.

DOCUMENT: legionnaires disease is a modern environmental infectious disease it stems from the capacity of the causative agent legionella to mul

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset


<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER's query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question. Be as detailed as possible, but don't make up any information that's not from the context. If you don't know an answer, say you don't know.

DOCUMENT: legionnaires disease is a modern environmental infectious disease it stems from the capacity of the causative agent legionella to mul



<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER's query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question. Be as detailed as possible, but don't make up any information that's not from the context. If you don't know an answer, say you don't know.

DOCUMENT: legionnaires disease is a modern environmental infectious disease it stems from the capacity of the causative agent legionella to mul



<s>A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

Your role here is specific: you are a doctor which knows everything about diseases, illnesses and medical stuff. You should answer USER questions about diseases, treatment and other questions concerning health. Use the following context from USER to answer questions. Question is the last USER's query. A relevant document is given above and goes after words "DOCUMENT: ". Try to use this information in your answer. After the "DOCUMENT: " and the body of the document you are given a chat history - conversation with USER. Answer the last USER's question. Be as detailed as possible, but don't make up any information that's not from the context. If you don't know an answer, say you don't know.

DOCUMENT: ceftobiprole bpr is an investigational cephalosporin with activity against staphylococcus aureus including methicillinresistant s aur

In [43]:
chat_history.messages

['USER: What is pneumania?',
 'ASSISTANT: Pneumonia is a lung infection caused by various types of bacteria, viruses, or fungi. It can range from a mild to a severe illness, and can affect people of all ages and backgrounds. Symptoms of pneumonia can include cough, fever, chest pain, and difficulty breathing. Treatment for pneumonia depends on the underlying cause and can include antibiotics, rest, and hydration. In severe cases, hospitalization may be necessary. Prevention measures include getting vaccinated, washing hands regularly, and avoiding close contact with people who are sick.',
 'USER: How to treat it?',
 'ASSISTANT: Pneumonia is a lung infection caused by various types of bacteria, viruses, or fungi. It can range from a mild to a severe illness, and can affect people of all ages and backgrounds. Symptoms of pneumonia can include cough, fever, chest pain, and difficulty breathing. Treatment for pneumonia depends on the underlying cause and can include antibiotics, rest, and 

Как будто получилось сделать :)