![Header image](./header.png)

# Exercise 4.C: OCR-Supported LLM Information Extraction

The goal of this exercise is to build an agent demo that lets you ask questions about the content of documents. The underlying method is called retrieval-augmented generation (RAG).
The tasks in this exercise are:
- install a local large language model using the Ollama application
- set up a new environment with the required packages
- implement a simple chatbot using langchain[2]
- test the chatbot on a specific technical document

We use Ollama[1] to run the LLM locally and the langchain[2] framework to access the model.

- [1] https://ollama.com/
- [2] https://www.mistral.com/

# Considerations

- Read the tutorials carefully, especially [1]
- Install Ollama on your computer
- Install the additional packages into the environment by uncommenting the `%pip install` commands and running them once
- Select a model that fits the memory of your laptop
- This exercise is less about coding and more about integrating a local LLM

# Requirements

- R0: Install the required packages using the pip commands
- R1: Install the Ollama software
- R2: Find a model that runs on your machine
- R3: Start the server for the model
- R4: Connect the server to the notebook
- R5: Run the code parts until the first query
- R6: Improve your query using the techniques from the class slides


# Setup

In [None]:
#%pip install -qU mistralai python-dotenv

# Imports

In [None]:
import os
from dotenv import load_dotenv
import pprint
from mistralai import Mistral
import base64
import mimetypes

In [None]:
#
# The API key must be stored in a .env file in this folder. Content: MISTRAL_API_KEY=xc.....
#
load_dotenv()
api_key = os.environ["MISTRAL_API_KEY"]
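
If the key is missing, `os.environ[...]` fails with a bare `KeyError` that does not say what to do next. The following is a minimal sketch of a friendlier check. `require_env` is a hypothetical helper written for this notebook; it is not part of `python-dotenv` or `mistralai`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name` or raise a descriptive error.

    Hypothetical helper for this notebook, not a library function.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add a line '{name}=<your key>' to the .env file "
            "in this folder and call load_dotenv() again."
        )
    return value
```

After `load_dotenv()`, the key could then be read with `api_key = require_env("MISTRAL_API_KEY")`.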

# Prepare LLM

In [None]:
client = Mistral(api_key=api_key)

# OCR a single image

In [None]:
image_file_receipt = "./documents/graph1.png"

In [None]:
def load_image(image_path):
  """Return the image at image_path as a base64 data URL."""
  mime_type, _ = mimetypes.guess_type(image_path)
  if mime_type is None:
    raise ValueError(f"Cannot determine MIME type of {image_path}")
  with open(image_path, "rb") as image_file:
    image_data = image_file.read()
  base64_encoded = base64.b64encode(image_data).decode("utf-8")
  return f"data:{mime_type};base64,{base64_encoded}"
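
The string returned by `load_image` is a data URL of the form `data:<mime type>;base64,<payload>`. The sketch below illustrates that format and its round trip on a throwaway file, so no real image is needed. `to_data_url` and its `default_mime` fallback are assumptions made for illustration and are not part of the exercise code.

```python
import base64
import mimetypes
import os
import tempfile


def to_data_url(path: str, default_mime: str = "application/octet-stream") -> str:
    """Encode a file as a data URL; fall back to `default_mime` for unknown extensions."""
    mime_type, _ = mimetypes.guess_type(path)
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime_type or default_mime};base64,{encoded}"


# Demo on a temporary file with made-up content.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.png")
    with open(demo_path, "wb") as f:
        f.write(b"not a real image")
    demo_url = to_data_url(demo_path)

# Split the header from the payload and decode the payload again.
header, payload = demo_url.split(",", 1)
print(header)                      # data:image/png;base64
print(base64.b64decode(payload))   # b'not a real image'
```

The MIME type is guessed from the file extension only, so the file content is never inspected.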

In [None]:
ocr_response = client.ocr.process(
  model="mistral-ocr-latest",
  document={
    "type": "image_url",
    "image_url": load_image(image_file_receipt),
  },
  include_image_base64=True,
)

In [None]:
print(ocr_response)

In [None]:
from IPython.display import Markdown
from IPython.display import Image, display, HTML

from base64 import b64decode

In [None]:
def printmd(string):
    display(Markdown(string))

In [None]:
for page in ocr_response.pages:
  # Render the recognized text of the page as Markdown.
  printmd(page.markdown)
  for image in page.images:
    # image_base64 is assumed to be a complete data URL usable as an <img> source.
    base64_str = image.image_base64
    img_html = f'<img src="{base64_str}" style="max-width: 500px;">'  # Adjust size if needed
    display(HTML(img_html))
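
To work with the raw image bytes instead of embedding them in HTML, the base64 string has to be decoded first. The sketch below assumes the string may or may not carry a `data:...;base64,` prefix. `decode_image_payload` is a hypothetical helper, and the sample string is made up rather than taken from a real OCR result.

```python
import base64


def decode_image_payload(image_base64: str) -> bytes:
    """Decode a base64 image string, with or without a 'data:...;base64,' prefix."""
    if image_base64.startswith("data:") and "," in image_base64:
        # Drop the header; only the part after the first comma is base64 payload.
        image_base64 = image_base64.split(",", 1)[1]
    return base64.b64decode(image_base64)


# Demo with made-up bytes instead of a real OCR result.
sample = "data:image/jpeg;base64," + base64.b64encode(b"fake image bytes").decode("ascii")
raw = decode_image_payload(sample)
```

The resulting bytes could then be written to disk with `open(path, "wb")` or handed to an image library.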

# Load and scan a complete PDF with the LLM

In [None]:
#
# Define PDF file to read
#
file_path = "./documents/graph2.pdf"

In [None]:
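def upload_pdf(filename):
  """Upload a PDF for OCR and return a signed URL to the uploaded file."""
  # Use a context manager so the file handle is closed after the upload.
  with open(filename, "rb") as pdf_file:
    uploaded_pdf = client.files.upload(
      file={
        "file_name": filename,
        "content": pdf_file,
      },
      purpose="ocr"
    )
  signed_url = client.files.get_signed_url(file_id=uploaded_pdf.id)
  return signed_url.url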

In [None]:
messages = [
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "Explain this article in detail for experts in the field. Pay special attention to the diagrams and the relations they show.",
      },
      {
        "type": "document_url",
        "document_url": upload_pdf(file_path),
      },
    ],
  }
]
chat_response = client.chat.complete(
  model="mistral-small-latest",
  messages=messages,
)
print(chat_response.choices[0].message.content)
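
When experimenting with different queries (R6), it can help to upload the PDF once and rebuild only the message list for each prompt. The sketch below shows one way to do that. `build_document_messages`, the prompt variants, and the placeholder URL are assumptions for illustration; the message structure mirrors the one used in the cell above.

```python
def build_document_messages(prompt: str, document_url: str) -> list[dict]:
    """Build a single-turn message list with a text prompt and a document URL."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "document_url", "document_url": document_url},
            ],
        }
    ]


# Hypothetical prompt variants to compare; the URL is a placeholder.
prompts = [
    "Summarize this article in five bullet points.",
    "List every diagram in this article and describe the relations it shows.",
]
variants = [
    build_document_messages(p, "https://example.com/placeholder.pdf") for p in prompts
]
```

In the notebook, the placeholder would be replaced by the result of a single `upload_pdf(file_path)` call, and each entry of `variants` could be passed as `messages` to `client.chat.complete`.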