## Ingesting Text Data

In [1]:
# %pip install --quiet unstructured langchain
# %pip install --quiet "unstructured[all-docs]"

In [2]:
import os
from langchain_community.document_loaders import TextLoader

In [3]:
local_path = "data.txt"

# Load the local text file
if local_path:
    loader = TextLoader(local_path, encoding="latin1") 
    data = loader.load()
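
A note on the `encoding="latin1"` argument used above: Latin-1 maps every byte to a character, so decoding never raises an error, but a file that was actually saved as UTF-8 will come out with garbled non-ASCII characters. The snippet below is a small standalone illustration of that effect; the sample string is made up and is not taken from `data.txt`.

```python
# Illustration only: how UTF-8 bytes look when decoded as Latin-1.
original = "café"
raw_bytes = original.encode("utf-8")  # "é" becomes two bytes in UTF-8

as_latin1 = raw_bytes.decode("latin1")  # never fails, but mis-decodes "é"
as_utf8 = raw_bytes.decode("utf-8")     # round-trips correctly

print(as_latin1)
print(as_utf8)
```

If the previewed text shows sequences such as `Ã©`, switching the loader to `encoding="utf-8"` is worth trying.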



In [4]:
# Preview the loaded document
# data[0].page_content

## Vector Embeddings

In [5]:
!ollama pull nomic-embed-text

In [6]:
!ollama list

NAME                       ID              SIZE      MODIFIED               
nomic-embed-text:latest    0a109f422b47    274 MB    Less than a second ago    
llama3.2:latest            a80c4f17acd5    2.0 GB    3 hours ago               
llama3:latest              365c0bd3c000    4.7 GB    4 hours ago               


In [7]:
%pip install --quiet chromadb
%pip install --quiet langchain-text-splitters

Note: you may need to restart the kernel to use updated packages.





In [8]:
%pip install protobuf==5.26.1







In [9]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma

In [10]:
# Split and chunk 
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = text_splitter.split_documents(data)
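
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplification, not the internals of `RecursiveCharacterTextSplitter`, which also tries to break on separators such as paragraph and line boundaries. The helper name `chunk_text` and the sample string are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters, so content that
    straddles a boundary appears in both neighbouring chunks.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    result = []
    for start in range(0, len(text), step):
        result.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return result


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(pieces)
```

With a window of 10 and an overlap of 3, each chunk starts 7 characters after the previous one, which is why the last three characters of one chunk reappear at the start of the next.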

In [11]:
current_dir = os.getcwd()
persistent_directory = os.path.join(current_dir, "db", "chroma_db_for_GitHub")
embedding_function = OllamaEmbeddings(model="nomic-embed-text", show_progress=True)



In [12]:
if os.path.exists(persistent_directory):
    # Reuse the collection persisted by an earlier run
    vector_db = Chroma(
        persist_directory=persistent_directory,
        embedding_function=embedding_function,
        collection_name="local-rag"
    )
    print("Loaded existing Chroma vector store.")
else:
    # Embed the chunks and build a new persisted collection,
    # reusing the embedding function defined above
    vector_db = Chroma.from_documents(
        documents=chunks,
        embedding=embedding_function,
        collection_name="local-rag",
        persist_directory=persistent_directory
    )
    vector_db.persist()
    print("Created new Chroma vector store.")



Loaded existing Chroma vector store.
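
As a rough picture of what the vector store does at query time, the sketch below ranks a few stored vectors by cosine similarity to a query vector. It is only an illustration: the three-dimensional vectors, the texts, and the helper names `cosine_similarity` and `top_k` are invented, and the metric Chroma actually uses depends on how the collection is configured.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k stored vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "embeddings"; real ones from nomic-embed-text have many more dimensions
toy_store = [
    ("websocket example", [0.9, 0.1, 0.0]),
    ("REST handler docs", [0.1, 0.9, 0.0]),
    ("license text", [0.0, 0.1, 0.9]),
]
nearest = top_k([1.0, 0.2, 0.0], toy_store, k=2)
print(nearest)
```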


## Retrieval

In [13]:
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama
from langchain_core.runnables import RunnablePassthrough
from langchain.retrievers.multi_query import MultiQueryRetriever

In [14]:
# LLM from Ollama
local_model = "llama3.2"
llm = ChatOllama(model=local_model)



In [15]:
QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an assistant for a GitHub repository. Generate alternative versions of the
    user question below to retrieve relevant documents from the vector database. By generating
    multiple perspectives on the user question, your goal is to help the user overcome some of
    the limitations of distance-based similarity search. Provide these alternative questions
    separated by newlines.
    Original question: {question}""",
)

In [16]:
retriever = MultiQueryRetriever.from_llm(
    vector_db.as_retriever(), 
    llm,
    prompt=QUERY_PROMPT
)

template = """You are an assistant for a GitHub repository. Answer the question based ONLY on the following context:
{context}
Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
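
The idea behind multi-query retrieval can be sketched without any LLM: produce several phrasings of the question, run a search for each, and keep the unique union of the results. The code below is a plain-Python illustration under that assumption; `multi_query_retrieve`, `fake_variants`, `fake_search`, and the canned results are all hypothetical stand-ins, not part of LangChain.

```python
from typing import Callable


def multi_query_retrieve(
    question: str,
    generate_variants: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
) -> list[str]:
    """Search once per question variant and merge results without duplicates."""
    seen = set()
    merged = []
    for query in generate_variants(question):
        for doc in search(query):
            if doc not in seen:  # keep only the first occurrence of each chunk
                seen.add(doc)
                merged.append(doc)
    return merged


def fake_variants(question: str) -> list[str]:
    # Stand-in for the LLM call that rewrites the question
    return [f"{question} (variant {i})" for i in range(1, 4)]


FAKE_RESULTS = {
    "What is in main.go? (variant 1)": ["chunk A", "chunk B"],
    "What is in main.go? (variant 2)": ["chunk B", "chunk C"],
    "What is in main.go? (variant 3)": ["chunk A", "chunk D"],
}


def fake_search(query: str) -> list[str]:
    # Stand-in for the vector store lookup
    return FAKE_RESULTS.get(query, [])


merged_docs = multi_query_retrieve("What is in main.go?", fake_variants, fake_search)
print(merged_docs)
```

Each variant triggers its own embedding and search, which is why several `OllamaEmbeddings` progress bars appear for a single `chain.invoke` call.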

In [17]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
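
Read left to right, the chain above retrieves context for the question, fills the prompt template, calls the model, and extracts the text of the reply. The sketch below spells out that data flow with ordinary functions. All of the `stub_*` helpers and `run_chain` are hypothetical placeholders, so this shows the shape of the pipeline rather than LangChain's implementation.

```python
TEMPLATE = (
    "Answer the question based ONLY on the following context:\n"
    "{context}\n"
    "Question: {question}\n"
)


def stub_retrieve(question: str) -> list[str]:
    # Placeholder for the retriever
    return ["main.go is written in Go."]


def stub_format(inputs: dict) -> str:
    # Placeholder for the prompt template step
    return TEMPLATE.format(
        context="\n".join(inputs["context"]),
        question=inputs["question"],
    )


def stub_llm(prompt_text: str) -> dict:
    # Placeholder for the model: echoes the last line of the prompt
    return {"content": "echo: " + prompt_text.splitlines()[-1]}


def stub_parse(message: dict) -> str:
    # Placeholder for the output parser: keep only the text
    return message["content"]


def run_chain(question: str) -> str:
    # The same question feeds both the retriever and the prompt
    inputs = {"context": stub_retrieve(question), "question": question}
    return stub_parse(stub_llm(stub_format(inputs)))


answer = run_chain("Which language is main.go?")
print(answer)
```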

In [18]:
!ollama pull llama3.2


In [20]:

response = chain.invoke("What are the contents of the examples/using-web-socket/main.go file, and which language is it written in? Please elaborate.")

OllamaEmbeddings: 100%|██████████| 1/1 [00:03<00:00,  3.46s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.10s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.08s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.09s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.12s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.10s/it]


In [21]:
print(response)

The `main.go` file contains an example of using WebSockets with the Go programming language. Here's a breakdown of the contents:

```go
package main

import (
	"fmt"
	"log"

	"github.com/gorilla/websocket"
)

const (
	websocketURL = "ws://localhost:8080"
)

var upgrader = websocket.Upgrader{
	ReadBufferSize:  1024,
	WriteBufferSize: 1024,
}

type Contest struct {
	Title string
	StartDate time.Time
	EndDate time.Time
}

func main() {
	// Establish a connection to the WebSocket server
 conn, _, err := upgrader.Upgrade("localhost", "8080", "")
 if err != nil {
 log.Fatal(err)
 }

 // Handle incoming messages from the client
 for {
 conn.SetReadDeadline(time.Now().Add(30 * time.Second))
 message, r, err := conn.ReadMessage()
 if err != nil {
 log.Println(err)
 break
 }
 switch message := message.(type) {
 case websocket.TextMessage:
 fmt.Println("Received text message:", string(message))
 // Simulate sending a response back to the client
 sendText(conn, "Hello from server!")
 case websocke
```

In [26]:
response = chain.invoke("Fix this issue that waqas raised: Godoc missing for AddRestHandler feature to override the Path. Where can I learn more about gofr to fix this issue?")

OllamaEmbeddings: 100%|██████████| 1/1 [00:03<00:00,  3.53s/it]

In [27]:
print(response)

It seems like you're having an issue with a Go module or package that's not properly installed or configured, leading to a "Godoc missing" error when trying to access documentation for a specific feature.

The `AddRestHandler` feature is related to the `net/http` package in Go. To fix the issue, I'll provide some steps you can take:

1. **Check your dependencies**: Make sure that the `net/http` and `github.com/gorilla/mux` packages are properly installed. Run `go get -u github.com/gorilla/mux` to update them if necessary.
2. **Verify your import statements**: Ensure that your Go file is importing the correct package paths for `AddRestHandler`. The typical import statement for this feature would be:
```go
import (
    "github.com/gorilla/mux"
)
```
3. **Check the Gorilla MUX documentation**: Visit the official Gorilla MUX documentation to ensure that you're using the `AddRestHandler` correctly. You can find it at: <https://pkg.gorilla.org/mux/v1.8.0/doc/>
4. **GoFR documentation**: The 