## Project Abstract
This project uses the Llama 3.1 language model, served locally through Ollama, for document-based question answering and personal health coaching. A custom `Modelfile` configures the model to act as a health trainer named David. The notebook workflow includes:
- Loading documents (such as `hdfs.pdf`) for context.
- Setting up the Llama 3.1 model and embedding model for semantic search.
- Creating a vector index from the documents for efficient querying.
- Defining a system prompt and input template for accurate Q&A.
- Running queries against the indexed documents and retrieving model-generated answers.

This approach supports both general Q&A and domain-specific assistance, combining Llama 3.1 with a custom persona for practical use in health and data science.

In [18]:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.ollama import Ollama
from llama_index.core import PromptTemplate
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

In [None]:

# Load every file under ./documents into a list of Document objects
documents = SimpleDirectoryReader("./documents").load_data()

In [17]:
# Use the llama3.1 model through Ollama; allow up to 120 seconds per request
llm = Ollama(model='llama3.1', request_timeout=120.0)
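
Before building the index, it can help to confirm that the Ollama server is reachable. The sketch below is a minimal preflight check using only the standard library. The address `http://localhost:11434` is taken from the request logs later in this notebook; the helper name `ollama_is_up` is hypothetical, and the check assumes the server root answers with HTTP 200 when Ollama is running.

```python
import http.client
import urllib.request


def ollama_is_up(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP GET to base_url answers with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed reply
        return False


print("Ollama reachable:", ollama_is_up())
```

If this prints `False`, starting the server (and pulling `llama3.1`) before running the remaining cells avoids a timeout at query time.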

In [20]:
print(f"loaded {len(documents)} documents")

loaded 10 documents


In [21]:

system_prompt = """
You are a Q&A assistant. Your goal is to answer questions as accurately as possible based on the instructions and context provided.
"""
# Default chat format supported by Llama 3.1
llama3_1_template_str = (
    "<|begin_of_text|>"
    "<|start_header_id|>system<|end_header_id|>\n\n"
    "{system_prompt}<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "{query_str}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)
query_wrapper_prompt=PromptTemplate(llama3_1_template_str)
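
To see what the template above produces, the sketch below fills it in with plain `str.format`, independent of LlamaIndex. The example question is made up for illustration. Note that in the cells shown in this notebook, neither `system_prompt` nor `query_wrapper_prompt` is passed to the LLM or the query engine, so this is only a way to inspect the rendered prompt text.

```python
system_prompt = (
    "You are a Q&A assistant. Your goal is to answer questions as accurately "
    "as possible based on the instructions and context provided."
)

llama3_1_template_str = (
    "<|begin_of_text|>"
    "<|start_header_id|>system<|end_header_id|>\n\n"
    "{system_prompt}<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "{query_str}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# Both placeholders must be supplied; the question here is illustrative only
rendered = llama3_1_template_str.format(
    system_prompt=system_prompt,
    query_str="What does the NameNode do?",
)
print(rendered)
```

The rendered string ends with the assistant header, which is where the model is expected to begin its reply.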

In [22]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings
embed_model=HuggingFaceEmbedding(model_name='BAAI/bge-small-en-v1.5')

Settings.llm = llm
Settings.embed_model = embed_model

2025-10-01 17:51:02,957 - INFO - Load pretrained SentenceTransformer: BAAI/bge-small-en-v1.5
2025-10-01 17:51:32,623 - INFO - 1 prompt is loaded, with the key: query


In [23]:
print('creating index')
index=VectorStoreIndex.from_documents(documents)
print('Index created successfully')

creating index
Index created successfully
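
Conceptually, a vector index embeds each text chunk and, at query time, returns the chunks whose embeddings are most similar to the query embedding. The toy sketch below illustrates that idea with the standard library only. It is not how `VectorStoreIndex` is implemented: the word-count "embedding" stands in for `BAAI/bge-small-en-v1.5`, the helper names are hypothetical, and the sample sentences are made up rather than taken from the loaded documents.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Illustrative chunks only; not extracted from hdfs.pdf
chunks = [
    "The NameNode stores the locations of block replicas for every file",
    "DataNodes send heartbeats to the NameNode every few seconds",
    "A balanced diet includes protein and vegetables",
]
best = top_k("locations of file block", chunks, k=1)
print(best[0])
```

In the real pipeline the retrieved chunks are then placed in the prompt as context for the LLM; a dense embedding model also matches paraphrases that share no words with the query, which this word-count version cannot do.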


In [27]:
query_engine=index.as_query_engine()

2025-10-01 17:53:52,372 - INFO - HTTP Request: POST http://localhost:11434/api/show "HTTP/1.1 200 OK"


In [29]:
print('Querying the index')

response = query_engine.query("How does HDFS expose the locations of a file's blocks?")

print("This is the model's reply")

print(response)

Querying the index


2025-10-01 20:18:27,174 - INFO - HTTP Request: POST http://localhost:11434/api/chat "HTTP/1.1 200 OK"


This is the model's reply
When a client wants to read from a file, it first contacts the NameNode for the locations of data blocks comprising the file. The client then reads block contents from the DataNode closest to the client.
