# BioLLM x Plants - Procedure Design - Q-As

Rachel K. Luu, Ming Dao, Subra Suresh, Markus J. Buehler (2025) [full reference to be added here]

## Load BioLLM + RAG

In [None]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings
from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from utils.formats import messages_to_prompt, completion_to_prompt


model_url = "https://huggingface.co/rachelkluu/Llama3.1-8b-Instruct-CPT-SFT-DPO-09022024-Q8_0-GGUF/resolve/main/llama3.1-8b-instruct-cpt-sft-dpo-09022024-q8_0.gguf"
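bioinspiredllm_q8 = LlamaCPP(
    model_url=model_url,  # GGUF weights are downloaded from this URL
    model_path=None,  # no local copy; rely on model_url instead
    temperature=0.1,  # low temperature for near-deterministic answers
    max_new_tokens=2048,
    context_window=16000,
    model_kwargs={"n_gpu_layers": -1},  # -1 offloads all layers to the GPU
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=False,
)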

Settings.llm = bioinspiredllm_q8
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)

documents = SimpleDirectoryReader(
    "./RAG/"
).load_data()

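# Chunking is applied when the index is built, so set these before from_documents()
Settings.chunk_size = 128
Settings.chunk_overlap = 50

vector_index = VectorStoreIndex.from_documents(documents)
# Retrieve the 10 most similar chunks per query and compact them into the prompt
query_engine = vector_index.as_query_engine(response_mode="compact", similarity_top_k=10)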
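
To illustrate what `chunk_size = 128` and `chunk_overlap = 50` imply, here is a minimal sketch of overlapping windowing. It is a simplification and not LlamaIndex's actual splitter, which works on tokenizer output and respects sentence boundaries; the helper name `sliding_chunks` and the dummy token list are hypothetical.

```python
def sliding_chunks(tokens, chunk_size=128, chunk_overlap=50):
    """Split a token list into windows that share `chunk_overlap` tokens.

    Simplified illustration only; the real splitter is sentence-aware.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    stride = chunk_size - chunk_overlap  # each window starts 78 tokens after the last
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks


# Dummy "document" of 300 placeholder tokens (hypothetical data)
tokens = [f"tok{i}" for i in range(300)]
chunks = sliding_chunks(tokens)
print([len(c) for c in chunks])
```

With these settings each new chunk advances by 78 tokens, so neighbouring chunks repeat 50 tokens of context. Small chunks with generous overlap favour precise retrieval, which is then compensated by retrieving many chunks (`similarity_top_k=10`).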

## Generate Questions

In [2]:
from protocols.proc_qa import get_technical_qs

prompt = "Design a procedure that makes a composite out of pollen grains and rhapis excelsa leaves."

num_gen = 1  # number of sampling generations
df, all_questions, qcount = get_technical_qs(num_gen, prompt, query_engine)

print(f"{qcount} total questions were generated!")
print("Here are the generated questions:")
for question in all_questions:
    print(f"- {question}")

9 total questions were generated!
Here are the generated questions:
- What are the unique properties of pollen grains and rhapis excelsa leaves?
- What are the advantages of using pollen grains and rhapis excelsa leaves in a composite material?
- What are the potential applications of a composite material made from pollen grains and rhapis excelsa leaves?
- What are the challenges in making a composite material from pollen grains and rhapis excelsa leaves?
- What are pollen grains and rhapis excelsa leaves?
- What is the process of making a composite material?
- What is a composite material?
- What are the different methods of making a composite material?
- What are the essential components of a composite material?


## Generate Answers

In [3]:
from protocols.proc_qa import ans_technicals

df, answers = ans_technicals(df, query_engine)

# Print each question with its retrieved-context answer
for _, row in df.iterrows():
    print(f"Question: {row['Question']}")
    print(f"Answer: {row['Answer']}")
    print()


Question: What are the unique properties of pollen grains and rhapis excelsa leaves?
Answer: Pollen grains have unique properties such as a hollow internal architecture, uniform microscale size, and robust hierarchical structure. They are abundant in nature and harvested by bees. Sunflower pollen grains have a distinctive core-shell structure with spiky architectures, with the shell comprising an ultra-strong sporopollenin outer layer and a flexible inner layer consisting of cellulose and pectin. Rhapis excelsa leaves have a unique hierarchical structure with a high density of longitudinal veins and a distinct pattern of transverse veins.

Question: What are the advantages of using pollen grains and rhapis excelsa leaves in a composite material?
Answer: The advantages of using pollen grains and rhapis excelsa leaves in a composite material include their unique hierarchical structures, which contribute to the material's mechanical properties such as high strength and toughness. The comb

## Save Final Data to JSON File to be used in Multi-Agent

In [5]:
filename = "rhapispollencomp"

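# orient='records' with lines=True writes JSON Lines: one {"Question": ..., "Answer": ...} object per line
json_data = df.to_json(orient='records', lines=True)
with open(f"{filename}.json", 'w') as json_file:
    json_file.write(json_data)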
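
Because the file is written with `lines=True`, it holds one JSON object per line rather than a single JSON array, so a downstream reader needs to parse it the same way. The following is a minimal round-trip sketch using an in-memory buffer; the two-row DataFrame is placeholder data, and how the multi-agent step actually loads the file is not shown in this notebook.

```python
import io

import pandas as pd

# Placeholder Q&A table with the same column names used above (hypothetical rows)
qa = pd.DataFrame(
    {
        "Question": ["What is a composite material?", "What are pollen grains?"],
        "Answer": ["placeholder answer 1", "placeholder answer 2"],
    }
)

# Same serialization call as in the notebook: one JSON object per line
jsonl_text = qa.to_json(orient="records", lines=True)

# Reading it back requires lines=True as well
restored = pd.read_json(io.StringIO(jsonl_text), lines=True)
print(restored)
```

To load the saved file instead of the buffer, the same `pd.read_json(..., lines=True)` call can be pointed at the file path.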