### Chat with your PDFs using byaldi + Claude 🚀

How does it work?
- first download your chosen model (e.g. [ColPali](https://huggingface.co/vidore/colpali))
- then create an index for your PDF
- search the index for your chosen query
- pass the top search result to Claude along with your query
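The last two steps can be sketched as one small helper. This is a minimal sketch, not part of byaldi or claudette: `answer_from_index` is a hypothetical name, `retriever` is assumed to be any object with a byaldi-style `search(query, k=...)` method whose results expose a `base64` attribute, and `ask` is assumed to be any callable that accepts a list of image bytes and text (such as a claudette `Chat`).

```python
import base64
from typing import Any, Callable


def answer_from_index(retriever: Any, ask: Callable[[list], Any], query: str) -> Any:
    """Retrieve the best-matching page image and ask the chat model about it.

    Hypothetical helper: `retriever` and `ask` are duck-typed stand-ins for
    the byaldi model and the claudette Chat used later in this notebook.
    """
    # Search the index and keep only the top hit.
    results = retriever.search(query, k=1)
    if not results:
        raise ValueError(f"No pages matched the query: {query!r}")
    # The page image is assumed to come back base64-encoded; decode to raw bytes.
    image_bytes = base64.b64decode(results[0].base64)
    # Send the page image and the question together.
    return ask([image_bytes, query])
```

With the objects created later in the notebook, a call would look like `answer_from_index(model, chat, query)`.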

In this notebook we'll chat with an academic paper and a financial report.

### Setup
- Follow the [byaldi setup instructions](https://github.com/AnswerDotAI/byaldi/)
- Install claudette: `pip install claudette`

In [10]:
import base64
import os
from byaldi import RAGMultiModalModel
from claudette import *

os.environ["HF_TOKEN"] = "YOUR_HF_TOKEN" # to download the ColPali model
os.environ["ANTHROPIC_API_KEY"] = "YOUR_ANTHROPIC_API_KEY"
model = RAGMultiModalModel.from_pretrained("vidore/colpali")

Verbosity is set to 1 (active). Pass verbose=0 to make quieter.


Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00,  1.33it/s]
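The cell above writes placeholder strings into the environment. If you prefer to export the keys in your shell instead, a small guard can catch a missing or placeholder value before any download or API call is attempted. This is a sketch: `require_env` is a hypothetical helper, and treating values that start with `YOUR_` as placeholders is an assumption based on the strings used above.

```python
import os


def require_env(name: str, placeholder_prefix: str = "YOUR_") -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper: values that are empty or still start with the
    placeholder prefix are treated as unset.
    """
    value = os.environ.get(name, "")
    if not value or value.startswith(placeholder_prefix):
        raise RuntimeError(f"Set the {name} environment variable before running this notebook.")
    return value
```

Calling `require_env("HF_TOKEN")` and `require_env("ANTHROPIC_API_KEY")` at the top of the notebook would then fail early if either key is missing.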


### Academic Paper

We're going to chat with the epochal [Attention Is All You Need](https://arxiv.org/pdf/1706.03762) paper.

Specifically, we're going to ask "What's the BLEU score for the transformer base model?" The answer to this question is found in Table 2 on page 8.

<img src="./attention_table.png" alt="Table 2" width="512" height="512">

In [2]:
query = "What's the BLEU score for the transformer base model?"

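First, let's index the paper. Setting `store_collection_with_index=True` stores the base64-encoded page images alongside the index, which is what lets us pass a page image to Claude later.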

In [3]:
model.index(
    input_path="attention.pdf",
    index_name="attention",
    store_collection_with_index=True,
    overwrite=True
)

overwrite is on. Deleting existing index attention to build a new one.
Added page 1 of document 0 to index.
Added page 2 of document 0 to index.
Added page 3 of document 0 to index.
Added page 4 of document 0 to index.
Added page 5 of document 0 to index.
Added page 6 of document 0 to index.
Added page 7 of document 0 to index.
Added page 8 of document 0 to index.
Added page 9 of document 0 to index.
Added page 10 of document 0 to index.
Added page 11 of document 0 to index.
Added page 12 of document 0 to index.
Added page 13 of document 0 to index.
Added page 14 of document 0 to index.
Added page 15 of document 0 to index.
Index exported to .byaldi/attention
Index exported to .byaldi/attention


Now, let's search the index for our query. We expect the top result to be page 8.

In [4]:
results = model.search(query, k=1)
results[0].page_num

8
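When the top hit is not the page you expected, it can help to request more results and compare them. The helper below is a minimal sketch: `summarize_results` is a hypothetical name, and it assumes each result exposes `page_num` and `score` attributes.

```python
from typing import Any, Iterable


def summarize_results(results: Iterable[Any]) -> list[tuple[int, float]]:
    """Return (page_num, score) pairs sorted from highest to lowest score.

    Hypothetical helper: assumes each result has `page_num` and `score`.
    """
    pairs = [(r.page_num, float(r.score)) for r in results]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

For example, `summarize_results(model.search(query, k=3))` would list the three best-matching pages with their scores.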

Finally, we decode the top search result's base64 image to bytes and pass it to Claude along with our query. We use Claude 3.5 Sonnet (`models[1]` in claudette's model list when this notebook was run), which accepts image input. If everything works as expected, Claude should tell us that the BLEU score is 27.3 for EN-DE and 38.1 for EN-FR.

**Note**: The image passed to Claude is large, so depending on your account settings you might hit a token limit. If this happens, try the smaller Financial Report example further down in the notebook.
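To get a rough idea of how many tokens a page image might cost before sending it, you can read its pixel dimensions and apply the rule of thumb from Anthropic's vision documentation, roughly `width * height / 750` tokens. The sketch below rests on two assumptions: that the decoded page is a PNG, and that the rule of thumb still applies. Large images may also be downscaled by the API, so treat the result as an approximate upper bound. `png_size` and `estimate_image_tokens` are hypothetical helper names.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(image_bytes: bytes) -> tuple[int, int]:
    """Read (width, height) in pixels from the IHDR chunk of a PNG."""
    # A PNG starts with an 8-byte signature, then the IHDR chunk:
    # 4-byte length, 4-byte type, 4-byte width, 4-byte height (big-endian).
    if image_bytes[:8] != PNG_SIGNATURE or image_bytes[12:16] != b"IHDR":
        raise ValueError("Not a PNG image")
    width, height = struct.unpack(">II", image_bytes[16:24])
    return width, height


def estimate_image_tokens(image_bytes: bytes) -> int:
    """Approximate token cost of an image as width * height / 750 (rule of thumb)."""
    width, height = png_size(image_bytes)
    return round(width * height / 750)
```

Once `image_bytes` has been decoded in the next cell, `estimate_image_tokens(image_bytes)` gives a number to compare against your limit.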

In [5]:
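# Decode the base64-encoded page image returned by the search
image_bytes = base64.b64decode(results[0].base64)
# models[1] resolved to claude-3-5-sonnet-20240620 when this notebook was run
chat = Chat(models[1])
# Send the page image and the question in a single message
chat([image_bytes, query])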

According to the table in the image, the BLEU score for the Transformer (base model) is:

- 27.3 for EN-DE (English to German)
- 38.1 for EN-FR (English to French)

<details>

- id: `msg_01VsGfhxT8kq2Yt2Lq3v4xkx`
- content: `[{'text': 'According to the table in the image, the BLEU score for the Transformer (base model) is:\n\n- 27.3 for EN-DE (English to German)\n- 38.1 for EN-FR (English to French)', 'type': 'text'}]`
- model: `claude-3-5-sonnet-20240620`
- role: `assistant`
- stop_reason: `end_turn`
- stop_sequence: `None`
- type: `message`
- usage: `{'input_tokens': 1522, 'output_tokens': 58, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0}`

</details>

### Financial Report

Now we're going to chat with a financial report for a fictitious company called ACME. The report contains monthly revenue data for ACME's five products, creatively named A, B, C, D, and E.

We're going to ask "In which month did Product C generate the most revenue?". The expected answer is **June**.

<img src="./product_c.png" alt="Product C monthly revenue" width="512" height="512">

In [6]:
query = "In which month did Product C generate the most revenue?"

First, let's index the report.

In [7]:
model.index(
    input_path="financial_report.pdf",
    index_name="financial_report",
    store_collection_with_index=True,
    overwrite=True
)

overwrite is on. Deleting existing index financial_report to build a new one.
Added page 1 of document 1 to index.
Added page 2 of document 1 to index.
Added page 3 of document 1 to index.
Added page 4 of document 1 to index.
Added page 5 of document 1 to index.
Added page 6 of document 1 to index.
Index exported to .byaldi/financial_report
Index exported to .byaldi/financial_report


Now, let's search the index for our query. We expect the top result to be page 4.

In [8]:
results = model.search(query, k=1)
results[0].page_num

4

Finally, we decode the top search result's base64 image to bytes and pass it to Claude along with our query. We use Claude 3.5 Sonnet (`models[1]` in claudette's model list when this notebook was run), which accepts image input. If everything works as expected, Claude should tell us Product C generated the most revenue in **June**.

In [9]:
chat = Chat(models[1])
image_bytes = base64.b64decode(results[0].base64)
chat([image_bytes, query])

According to the bar graph showing monthly revenue for Product C, the month with the highest revenue was June. The bar for June is visibly the tallest, reaching above 2500 on the revenue scale, indicating it generated the most revenue compared to all other months shown.

<details>

- id: `msg_011g8XCR3VsXBSZVfzs5b9GY`
- content: `[{'text': 'According to the bar graph showing monthly revenue for Product C, the month with the highest revenue was June. The bar for June is visibly the tallest, reaching above 2500 on the revenue scale, indicating it generated the most revenue compared to all other months shown.', 'type': 'text'}]`
- model: `claude-3-5-sonnet-20240620`
- role: `assistant`
- stop_reason: `end_turn`
- stop_sequence: `None`
- type: `message`
- usage: `{'input_tokens': 1573, 'output_tokens': 59, 'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0}`

</details>