**Requirements**

Tutorial from: [web](https://www.datacamp.com/tutorial/run-llama-3-locally)

In [None]:
#%pip install unstructured[docx] langchain langchainhub langchain_community langchain-chroma libmagic

**Variables**
Change these to match your own paths.

In [1]:

#pdfToRead = "C:/tmp/"

**Loading the documents**
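It is good practice to develop and test your code in a Jupyter notebook before creating the app.

The cell below is kept for reference and is commented out: it loads a single PDF with PyPDFLoader (DirectoryLoader is imported but not used). The splitting step that follows reads a plain-text file directly instead.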

In [2]:
#from langchain_community.document_loaders import DirectoryLoader
#from langchain_community.document_loaders import PyPDFLoader

#file_path = "./input/c06.pdf"
#fileExtToSearch = "**/*.pdf"
#loader = PyPDFLoader(file_path)
#books = loader.load_and_split()

**Splitting the text**
Feeding the entire document to the model is not feasible, as it would exceed the model's context window. To overcome this limitation, we divide the text into smaller chunks that fit comfortably within that window.

In our case, instead of using fixed-size chunks, a custom splitter starts a new chunk at every line that begins with a disorder code (e.g., 6A00.3), so each chunk holds one disorder entry.

In [11]:
import re

from langchain.text_splitter import TextSplitter
from langchain.schema import Document


class DisorderTextSplitter(TextSplitter):
    """Split text into one chunk per disorder entry, keyed on the leading code."""

    def split_text(self, text: str) -> list[str]:
        # Split the text into paragraphs
        paragraphs = text.split('\n')
        
        # Initialize variables
        chunks = []
        current_chunk = ""
        
        for paragraph in paragraphs:
            # Check if the paragraph starts with a disorder code (e.g., 6A00.3)
            if re.match(r'^\s*6[A-Z]\d{2}(\.\d+)?([A-Z])?(?!\s+ICD-11 MMS)', paragraph):
                # If we have a current chunk, add it to the list of chunks
                if current_chunk:
                    chunks.append(current_chunk.strip())
                # Start a new chunk with this paragraph
                current_chunk = paragraph
            else:
                # If it's not a new disorder, add to the current chunk
                current_chunk += "\n" + paragraph
        
        # Add the last chunk if it exists
        if current_chunk:
            chunks.append(current_chunk.strip())
        
        return chunks

# Usage: read the source text and split it into one chunk per disorder entry
with open('./input/c06.txt', encoding="utf-8") as f:
    content = f.read()

document = Document(page_content=content, metadata={})
text_splitter = DisorderTextSplitter()
all_splits = text_splitter.split_documents([document])

In [12]:
len(text_splitter.split_text(content))

641
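
To see what the splitting rule does without LangChain or the real input file, here is a minimal standalone sketch. It reuses the same regular expression as the splitter above; the function name `split_by_code` and the sample text are invented for illustration and are not part of the notebook.

```python
import re

# Same pattern as in DisorderTextSplitter: a line that opens with a code
# such as 6A00 or 6A00.3, unless the code is followed by "ICD-11 MMS".
CODE_PATTERN = re.compile(r'^\s*6[A-Z]\d{2}(\.\d+)?([A-Z])?(?!\s+ICD-11 MMS)')


def split_by_code(text: str) -> list[str]:
    """Start a new chunk at every line that begins with a disorder code."""
    chunks, current = [], []
    for line in text.split("\n"):
        if CODE_PATTERN.match(line) and current:
            # A new entry begins: close the chunk collected so far.
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks


# Invented sample text, not taken from the real source file.
sample = (
    "Preamble line\n"
    "6A00 First invented disorder\n"
    "Description of the first disorder.\n"
    "6A00.1 Invented subtype\n"
    "Description of the subtype.\n"
    "6A00 ICD-11 MMS\n"
    "6B01 Second invented disorder"
)

chunks = split_by_code(sample)
for i, chunk in enumerate(chunks):
    print(i, repr(chunk))
```

The line `6A00 ICD-11 MMS` is rejected by the negative lookahead, so it stays attached to the preceding chunk instead of opening a new one.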

**Ollama embeddings and Chroma vector store**
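We will use LangChain to convert each chunk into an embedding and store it in a Chroma database.

The embeddings are computed by a model served through Ollama. The cell below uses mixtral:8x7b; the commented-out list names other models that can be swapped in, and each model writes to its own persist directory.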


In [20]:
from langchain_chroma import Chroma
from langchain_community.embeddings import OllamaEmbeddings

#modelli = ["gemma2:latest","phi3:medium","phi3.5:latest","mistral-nemo:latest","llama3:latest"]
modelli = ["mixtral:8x7b"]
for modello in modelli:
    vectorstore = Chroma.from_documents(
        documents=all_splits,
        embedding=OllamaEmbeddings(model=modello, show_progress=True),
        persist_directory=f"./chroma_db-{modello.replace(':','')}",
    )

OllamaEmbeddings: 100%|██████████| 641/641 [2:49:38<00:00, 15.88s/it]
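
Since the embedding run above took close to three hours, it may be worth reopening the persisted store in later sessions rather than rebuilding it. The sketch below assumes the store was written with the default collection name and that the same Ollama model is still available; the helper names `persist_dir_for` and `load_vectorstore` are hypothetical.

```python
def persist_dir_for(model_name: str) -> str:
    """Mirror the directory naming used when the store was built."""
    return f"./chroma_db-{model_name.replace(':', '')}"


def load_vectorstore(model_name: str):
    """Reopen a previously persisted Chroma store instead of re-embedding."""
    # Imported lazily so the naming helper works without LangChain installed.
    from langchain_chroma import Chroma
    from langchain_community.embeddings import OllamaEmbeddings

    return Chroma(
        persist_directory=persist_dir_for(model_name),
        # Should be the same embedding model that built the store;
        # otherwise query vectors are not comparable to the stored ones.
        embedding_function=OllamaEmbeddings(model=model_name),
    )


print(persist_dir_for("mixtral:8x7b"))
```

With Ollama running, `vectorstore = load_vectorstore("mixtral:8x7b")` would then stand in for the `Chroma.from_documents` call.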


**Testing the vector store**

In [None]:
question = "What is anxiety?"
docs = vectorstore.similarity_search(question)
docs
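
Conceptually, a similarity search embeds the question and returns the stored chunks whose vectors are closest to it. The toy sketch below illustrates that idea with hand-written three-dimensional vectors and cosine similarity. Both are assumptions for illustration only: real embeddings have many more dimensions, and the metric and index Chroma actually uses depend on its configuration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k stored items most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Invented (text, vector) pairs standing in for embedded chunks.
toy_store = [
    ("chunk about anxiety", [0.9, 0.1, 0.0]),
    ("chunk about sleep", [0.1, 0.9, 0.0]),
    ("chunk about eating", [0.0, 0.2, 0.9]),
]

query_vec = [0.8, 0.2, 0.1]  # pretend embedding of the question
results = top_k(query_vec, toy_store, k=2)
print(results)
```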

**Building Langchain chains for Q&A retrieval system**
To build a proper question-and-answer retrieval system, we will use LangChain chains and add the modules one at a time.

In our Q&A chain, we will:

- Use the vector store as the retriever and format the results.
- Provide the RAG prompt, which can be pulled from the LangChain Hub.
- Provide the Ollama Llama 3 inference function.
- Parse the results so that only the response is displayed.

Simply put, before your question is passed to the Llama 3 model, it is given context through the similarity search and the RAG prompt.
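
The data flow described above can be sketched in plain Python with stand-in functions. Everything here is hypothetical: `retrieve`, `build_prompt`, `fake_llm` and `parse_output` are stubs that only mimic the role of the retriever, the RAG prompt, the model and the output parser, and the prompt wording is not the one from the Hub.

```python
def retrieve(question: str) -> list[str]:
    """Stand-in for the retriever: returns canned chunks (invented data)."""
    return ["Invented chunk A.", "Invented chunk B."]


def format_docs(docs: list[str]) -> str:
    """Join the retrieved chunks into a single context string."""
    return "\n\n".join(docs)


def build_prompt(context: str, question: str) -> str:
    """Stand-in for the RAG prompt: places context and question in a template."""
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def fake_llm(prompt: str) -> str:
    """Stand-in for the model: reports how many context chunks it saw."""
    return f"  Stub answer using {prompt.count('Invented')} context chunks.  "


def parse_output(raw: str) -> str:
    """Stand-in for the output parser: here it only trims whitespace."""
    return raw.strip()


def answer(question: str) -> str:
    # Same order as the chain: retrieve, format, prompt, model, parse.
    context = format_docs(retrieve(question))
    prompt = build_prompt(context, question)
    return parse_output(fake_llm(prompt))


print(answer("What is anxiety?"))
```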

In [None]:
from langchain import hub
from langchain_community.llms import Ollama
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

llm = Ollama(model="llama3")

retriever = vectorstore.as_retriever()


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_prompt = hub.pull("rlm/rag-prompt")
qa_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)

**Testing the Q&A retrieval chain**
Ask questions about the indexed document to check that the chain answers from the retrieved context.

In [None]:
question = "From the context, which type of disease is being described?"
qa_chain.invoke(question)
#question = "What are the symptoms of anorexia?"
#qa_chain.invoke(question)