See [README](https://github.com/samdeverett/metabolomics-llm/blob/glutamate/metabolomics-llm/glutamate/README.md) for experiment motivation.

# Experiment 1: `Synthesis` 

**Question:** Can the LLM combine insights from the various pieces of context given to it? Does it provide a balanced response if the pieces of context disagree?
  
**Methodology:** Generate a natural-language summary of the result from each paper. With the summaries prepended to the prompt, ask the LLM about the effect of exercise on glutamate and assess its ability to synthesize the summaries.

In [1]:
question = "How does the presence of glutamate change in response to exercise?"

In [2]:
# Summary of results from each paper

breit = """
    Table 1 shows that exercise causes an increase in glutamate, specifically a 1.12 MFC and 0.17 log_2(MFC). 
"""

peake = """
    The plasma concentration of glutamate increased significantly only during high-intensity interval exercise.
    In skeletal muscle, glutamate consumption increases markedly during the first few minutes of exercise, particularly when muscle glycogen is low.  
"""

zauber = """
    Table 2 shows an increase in glutamate in both males and females during exercise, although only significantly for females.  
"""

howe = """
    Table 1 shows a ratio of 0.528 for glutamate after an 80.5km run versus before, suggesting a decrease in glutamate in response to exercise.  
"""

coelho = """
    Glutamate was upregulated immediately following exercise but returned to its original level shortly after.
"""

danaher = """
    Glutamate significantly decreased during the recovery period after exercise.
"""
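The Breit and Howe summaries above quote fold-change figures, one of them on a base-2 logarithmic scale. The following is a minimal sketch of how the two scales relate, assuming that MFC denotes a fold-change ratio of post-exercise to pre-exercise levels (the summaries do not define it). `log2_fold_change` is a hypothetical helper and is not part of the notebook.

```python
import math


def log2_fold_change(ratio: float) -> float:
    """Convert a fold-change ratio (assumed post/pre) to log2 units."""
    if ratio <= 0:
        raise ValueError("fold change must be positive")
    return math.log2(ratio)


# Ratios quoted in the summaries above
quoted_ratios = {
    "Breit": 1.12,   # quoted as "1.12 MFC"
    "Howe": 0.528,   # quoted as a ratio of after vs. before the run
}

for paper, ratio in quoted_ratios.items():
    # A ratio above 1 (positive log2) indicates an increase; below 1, a decrease
    direction = "increase" if ratio > 1 else "decrease"
    print(f"{paper}: ratio={ratio}, log2={log2_fold_change(ratio):+.2f} ({direction})")
```

Under that assumption, log2(1.12) is about 0.16, slightly below the 0.17 quoted in the Breit summary, which would be consistent with 1.12 being a rounded value. The Howe ratio of 0.528 corresponds to roughly -0.92 on the log2 scale, so the two papers point in opposite directions.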

In [4]:
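# Local helper module that wraps model, tokenizer, and LangChain setup
import pipeline

# Hugging Face access token, stored in a local apikey module
from apikey import HF_AUTH_TOKEN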

In [3]:
# Select model to use
model_id = 'meta-llama/Llama-2-7b-chat-hf'

In [None]:
# Load the model and tokenizer, then wrap them in a text-generation pipeline usable from LangChain
model = pipeline.init_model(model_id, HF_AUTH_TOKEN)
tokenizer = pipeline.init_tokenizer(model_id, HF_AUTH_TOKEN)
text_generation_pipeline = pipeline.init_text_generation_pipeline(model, tokenizer)
llm = pipeline.init_langchain_pipeline(text_generation_pipeline)

In [5]:
prompt = f"""
<s>[INST] <<SYS>>
Act as a helpful research assistant capable of synthesizing insights.
<</SYS>>
Using the following summaries of research papers, answer the question at the end.
Summary of Breit paper: {breit}
Summary of Peake paper: {peake}
Summary of Zauber paper: {zauber}
Summary of Howe paper: {howe}
Summary of Coelho paper: {coelho}
Summary of Danaher paper: {danaher}
QUESTION: {question}
ANSWER: [/INST]
"""
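The prompt above lists each paper by hand. As a sketch of an alternative, the same layout could be assembled from a dictionary so that papers can be added or removed without editing the template. `build_prompt` and `example_summaries` are hypothetical names, and unlike the cell above this version strips the leading and trailing whitespace from each summary.

```python
def build_prompt(summaries: dict[str, str], question: str) -> str:
    """Assemble a Llama-2-style chat prompt with one summary line per paper."""
    summary_lines = "\n".join(
        f"Summary of {author} paper: {text.strip()}"
        for author, text in summaries.items()
    )
    return (
        "<s>[INST] <<SYS>>\n"
        "Act as a helpful research assistant capable of synthesizing insights.\n"
        "<</SYS>>\n"
        "Using the following summaries of research papers, answer the question at the end.\n"
        f"{summary_lines}\n"
        f"QUESTION: {question}\n"
        "ANSWER: [/INST]"
    )


# Two of the notebook's summaries, repeated here so the block is self-contained
example_summaries = {
    "Zauber": "Table 2 shows an increase in glutamate in both males and females during exercise, although only significantly for females.",
    "Danaher": "Glutamate significantly decreased during the recovery period after exercise.",
}
example_prompt = build_prompt(
    example_summaries,
    "How does the presence of glutamate change in response to exercise?",
)
print(example_prompt)
```

Keeping the summaries in one mapping also makes it straightforward to rerun the experiment with a subset of papers, for example only those that report an increase.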

In [None]:
llm(prompt)
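To assess the response against the experiment's question, one option is a rough automated check alongside manual reading. The sketch below is a crude heuristic and an assumption, not the notebook's evaluation method: it reports which papers the response names and whether it mentions both directions of change. `coverage_report` is a hypothetical helper, and the sample text is invented for illustration; it is not model output.

```python
import re

PAPERS = ["Breit", "Peake", "Zauber", "Howe", "Coelho", "Danaher"]


def coverage_report(response: str, papers=PAPERS) -> dict:
    """Report which papers are named and which directions of change are mentioned."""
    text = response.lower()
    mentioned = [p for p in papers if p.lower() in text]
    return {
        "papers_mentioned": mentioned,
        "papers_missing": [p for p in papers if p not in mentioned],
        # Match "increase", "increased", "increasing", and so on
        "mentions_increase": bool(re.search(r"\bincreas", text)),
        "mentions_decrease": bool(re.search(r"\bdecreas", text)),
    }


# Invented text for illustration only; this is NOT output from the model
made_up_response = "Breit and Zauber report an increase, while Howe reports a decrease."
report = coverage_report(made_up_response)
print(report)
```

A response can reflect a paper's finding without naming the paper, so a missing name here is a prompt to reread the response rather than evidence of an omission.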