In [1]:
!python -m pip install -r requirements.txt

Collecting ipywidgets (from -r requirements.txt (line 5))
  Downloading ipywidgets-8.1.2-py3-none-any.whl.metadata (2.4 kB)
Collecting widgetsnbextension~=4.0.10 (from ipywidgets->-r requirements.txt (line 5))
  Downloading widgetsnbextension-4.0.10-py3-none-any.whl.metadata (1.6 kB)
Collecting jupyterlab-widgets~=3.0.10 (from ipywidgets->-r requirements.txt (line 5))
  Downloading jupyterlab_widgets-3.0.10-py3-none-any.whl.metadata (4.1 kB)
Downloading ipywidgets-8.1.2-py3-none-any.whl (139 kB)
Downloading jupyterlab_widgets-3.0.10-py3-none-any.whl (215 kB)
Downloading widgetsnbextension-4.0.10-py3-none-any.whl (2.3 MB)

We will set up a locally running LLM, a locally running vector database, and an embedding function. These will be:
* Ollama running Mistral
* ChromaDB, running locally with persistent storage. The vectors are stored under `./chroma` in the current directory; this can be changed via the `chroma_dir` variable.
* `DefaultEmbeddingFunction` (part of the ChromaDB library), which takes care of creating embeddings

It is assumed that the ChromaDB database is already set up. To set it up, you can use the `create_knowledgebase` notebook.
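
Conceptually, the retriever embeds the question and returns the `k` stored passages that lie closest to it. The sketch below illustrates that idea only: it uses word counts as a toy stand-in for real embeddings, and `embed`, `cosine`, `top_k`, and the sample passages are hypothetical names and data, not part of ChromaDB or dspy.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, passages: list[str], k: int = 3) -> list[str]:
    """Return the k passages most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


# Hypothetical sample passages, for illustration only.
passages = [
    "curl -F sends multipart form data",
    "curl -H adds a request header",
    "ls lists directory contents",
    "curl -o writes output to a file",
]
results = top_k("multipart form upload with curl", passages, k=2)
print(results)
```

A real embedding function maps text to dense vectors that capture meaning rather than exact word overlap, so the ranking it produces will differ from this toy version.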

In [5]:
import dspy
from dspy.retrieve.chromadb_rm import ChromadbRM
from chromadb.utils.embedding_functions import DefaultEmbeddingFunction

chroma_dir = './chroma'
chroma_collection = 'man_data'

chroma_rm = ChromadbRM(
    collection_name=chroma_collection,
    persist_directory=chroma_dir,
    embedding_function=DefaultEmbeddingFunction(),
    k=3,
)

mistral_ollama = dspy.OllamaLocal(model='mistral')
dspy.configure(
    lm=mistral_ollama,
    rm=chroma_rm
)

In [6]:
# test the vanilla LLM
mistral_ollama('Who is the president of Brazil?')

ConnectionError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x16dab5a50>: Failed to establish a new connection: [Errno 61] Connection refused'))
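
The `ConnectionError` above reports a refused connection on `localhost:11434`, which usually means nothing was listening there when the cell ran, for example because the Ollama server had not been started. A minimal sketch of a pre-flight check, assuming the host and port from the traceback; `port_is_open` is a hypothetical helper, not part of dspy or Ollama:

```python
import socket


def port_is_open(host: str = "localhost", port: int = 11434, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if not port_is_open():
    print("Nothing is listening on localhost:11434; start the LLM server before running the cells below.")
```

This only confirms that something accepts TCP connections on the port. It does not verify that the `mistral` model is available.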

`DocumentFAQ` is a pipeline that retrieves passages from the knowledge base for a question and then uses an LLM to answer it.
It features a Chain of Thought step that should improve the quality of the predictions.

In [3]:
class DocumentFAQSignature(dspy.Signature):
    """Answer questions based on the provided context."""

    context = dspy.InputField(desc="facts here are assumed to be true")
    question = dspy.InputField()
    answer = dspy.OutputField()


class DocumentFAQ(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()

        # Use the constructor argument instead of a hard-coded value
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_answer = dspy.ChainOfThought(DocumentFAQSignature)
    
    def forward(self, question) -> dspy.Prediction:
        context = self.retrieve(question).passages
        prediction = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=prediction.answer)

The following block tests the pipeline without any training or optimization. It retrieves three items from the knowledge base and puts them in the context. Then, using Ollama, it predicts the answer and displays it.

In [8]:
# Test pipeline

question = 'How can I make a multipart upload with curl? Can you write an example for me?'
pipeline = DocumentFAQ()
prediction = pipeline(question)  # calling the module invokes forward()
print(prediction.answer)

To make a multipart upload using `curl`, you can use the `--form` or `-F` option. Here's an example of how to send a file named `image.jpg` along with some metadata as key-value pairs:

```bash
# Replace 'https://example.com/upload.php' with your actual upload URL
curl --form "name=JohnDoe" \
     --form "email=john.doe@example.com" \
     --form "file;file=@image.jpg" \
     https://example.com/upload.php
```

In this example, we use the `--


The next pipeline uses a multi-hop step to increase the accuracy of predictions for more complex queries.

In [21]:
from dsp.utils import deduplicate

class GenerateSearchQuery(dspy.Signature):
    """Write a simple search query that will help answer a complex question."""

    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    query = dspy.OutputField()

class MultihopFAQ(dspy.Module):
    def __init__(self, passages_per_hop=2, max_hops=2):
        super().__init__()

        self.generate_query = [dspy.ChainOfThought(GenerateSearchQuery) for _ in range(max_hops)]
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        self.generate_answer = dspy.ChainOfThought(DocumentFAQSignature)
        self.max_hops = max_hops
    
    def forward(self, question):
        context = []
        
        for hop in range(self.max_hops):
            query = self.generate_query[hop](context=context, question=question).query
            passages = self.retrieve(query).passages
            context = deduplicate(context + passages)

        pred = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=pred.answer)
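
The hop loop relies on `deduplicate` so that the context does not accumulate repeated passages when two hops retrieve the same text. A minimal sketch of that idea, assuming first-occurrence order is preserved; `dedupe_keep_order` and the sample passages are hypothetical stand-ins, not the `dsp.utils` implementation:

```python
def dedupe_keep_order(items):
    """Drop repeated items while keeping the first occurrence of each."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


# Two hops with two passages each, one of which overlaps.
hop1 = ["passage A", "passage B"]
hop2 = ["passage B", "passage C"]
context = dedupe_keep_order(hop1 + hop2)
print(context)  # ['passage A', 'passage B', 'passage C']
```

With the defaults `passages_per_hop=2` and `max_hops=2`, the final context therefore holds at most four passages, and fewer when the hops overlap.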

In [22]:
# Test the multi hop pipeline

question = 'How do I set the cache headers in curl so that the server does not give me cached results?'
pipeline = MultihopFAQ()
prediction = pipeline(question)  # calling the module invokes forward()
print(prediction.answer)

To prevent the server from serving you cached results and also not sending any cache-related headers, you can use the `--no-cache` option in curl along with other options if needed. Here's an example of how to use it:
```bash
curl --no-cache [OPTIONS] URL
```
Replace `[OPTIONS]` with any additional options you might need, such as `--header`, `--data`, or `--compressed`. For instance:
```bash
curl --no-cache --header "User-Agent: Mozilla/5.0" https://example.com/path-to-resource
```
This command will send


In [23]:
# inspect the LLM usage
from summarize_usage import summarize_usages


summarize_usages(mistral_ollama.history[-3:])

LLM usage for chatcmpl-da39a3ee5e6b4b0d3255bfef95601890afd80709: 156 prompt, 150 completion, 306 total
LLM usage for chatcmpl-da39a3ee5e6b4b0d3255bfef95601890afd80709: 992 prompt, 150 completion, 1142 total
LLM usage for chatcmpl-da39a3ee5e6b4b0d3255bfef95601890afd80709: 620 prompt, 150 completion, 770 total
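
The three lines above can be aggregated into a single total for the multi-hop call. The sketch below sums the reported figures; the record layout (`prompt_tokens`, `completion_tokens`) and the `total_usage` helper are assumptions for illustration and may not match what `summarize_usages` or `mistral_ollama.history` actually use.

```python
# Figures copied from the output above; the dict layout is hypothetical.
usages = [
    {"prompt_tokens": 156, "completion_tokens": 150},
    {"prompt_tokens": 992, "completion_tokens": 150},
    {"prompt_tokens": 620, "completion_tokens": 150},
]


def total_usage(records):
    """Sum prompt and completion tokens across a list of usage records."""
    prompt = sum(r["prompt_tokens"] for r in records)
    completion = sum(r["completion_tokens"] for r in records)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": prompt + completion,
    }


totals = total_usage(usages)
print(totals)  # {'prompt_tokens': 1768, 'completion_tokens': 450, 'total_tokens': 2218}
```

All three completions stop at exactly 150 tokens, which is consistent with the truncated answers shown earlier and suggests that a completion-token limit is in effect. If the installed dspy version exposes a `max_tokens` argument on `dspy.OllamaLocal`, raising it may yield complete answers.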
