# Document MObject Example
This Jupyter notebook runs on Colab once HF_TOKEN has been added as a Colab secret.

## Install Ollama

Before we get started with Mellea, we download, install, and serve Ollama. We also define `set_css`, which wraps long lines in Colab output, and register it to run before every cell.

In [None]:
!curl -fsSL https://ollama.com/install.sh | sh > /dev/null
!nohup ollama serve >/dev/null 2>&1 &

from IPython.display import HTML, display


def set_css():
    display(HTML("\n<style>\n pre{\n white-space: pre-wrap;\n}\n</style>\n"))


get_ipython().events.register("pre_run_cell", set_css)
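Because `ollama serve` is started in the background, the server may not be accepting connections yet when the next cell runs. A minimal sketch of a readiness check follows, using only the standard library. The helper name `wait_for_port` is hypothetical, and the port 11434 is assumed to be the Ollama default; adjust it if your setup differs.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 11434, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper; 11434 is assumed to be the port Ollama listens on.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet (or unreachable): wait briefly and retry.
            time.sleep(0.5)
    return False
```

Calling `wait_for_port()` in a cell before starting a session would block until the server answers or the timeout expires. It only checks that the port is open, not that a model has been pulled.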

## Install Mellea
We run `uv pip install mellea[docling]` to install Mellea together with its `docling` extra, which the document parsing below relies on.

In [None]:
!uv pip install mellea[docling]

## Create a RichDocument
Let's create a RichDocument from an arXiv paper. This loads the PDF file and parses it with the Docling parser into an intermediate representation.

In [None]:
from mellea.stdlib.docs.richdocument import RichDocument

rd = RichDocument.from_document_file("https://arxiv.org/pdf/1906.04043")

## Extract a Table from the Document
We can extract some document content, e.g. the first table:

In [None]:
from mellea.stdlib.docs.richdocument import Table

table1: Table = rd.get_tables()[0]
print(table1.to_markdown())

## Work with the Table Object
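The Table object is Mellea-ready and can be used immediately with LLMs. In this example, `table1` is transformed to gain an extra column, "Model", which contains the model string from the Feature column or "None" if there is none. Because the model may not return a parsable table on every attempt, the loop retries with a different seed until the result is a `Table`.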

In [None]:
from mellea import start_session
from mellea.backends.types import ModelOption

m = start_session()
# Try up to five seeds (0, 12, 24, 36, 48); stop at the first parsable Table.
for seed in [x * 12 for x in range(5)]:
    table2 = m.transform(
        table1,
        "Add a column 'Model' that extracts which model was used or 'None' if none.",
        model_options={ModelOption.SEED: seed},
    )
    if isinstance(table2, Table):
        print(table2.to_markdown())
        break
    else:
        print("==== TRYING AGAIN after non-useful output.====")
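The retry loop above follows a general pattern: call a generator with a sequence of seeds and keep the first result that passes a check. The sketch below isolates that pattern so it can be run without a model. The names `first_accepted` and `fake_transform` are hypothetical and not part of Mellea; `fake_transform` is only a stand-in for the real model call.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_accepted(
    attempt: Callable[[int], T],
    seeds: Iterable[int],
    accept: Callable[[T], bool],
) -> Optional[T]:
    """Call attempt(seed) for each seed and return the first accepted result."""
    for seed in seeds:
        result = attempt(seed)
        if accept(result):
            return result
    return None  # every seed produced an unusable result


# Stand-in for a model call: only seed 24 yields a "table" (a list here).
def fake_transform(seed: int):
    return ["row"] if seed == 24 else "free-form text"


result = first_accepted(
    fake_transform,
    [x * 12 for x in range(5)],
    lambda r: isinstance(r, list),
)
print(result)  # ['row']
```

In the notebook, `attempt` would wrap the `m.transform(...)` call with `model_options={ModelOption.SEED: seed}`, and `accept` would be `lambda r: isinstance(r, Table)`. Returning `None` when every seed fails makes the failure case explicit, whereas the loop above simply ends.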

The model has fulfilled the task and returned output in a parsable syntax. You could now query the result (e.g. `m.query(table2, "Are there any GPT models referenced?")`) or continue transforming it (e.g. `m.transform(table2, "Transpose the table.")`).
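Before building on the transformed table, it can be worth confirming that the requested column is actually present. The following is a rough sketch that reads the header row from a Markdown string with plain Python. It assumes the output of `to_markdown()` is a pipe-delimited table, which is not confirmed here. The helper `markdown_header` is hypothetical, and the sample table is made up for illustration; it is not taken from the paper.

```python
def markdown_header(md: str) -> list[str]:
    """Return the column names from the first row of a pipe-delimited Markdown table."""
    for line in md.splitlines():
        line = line.strip()
        if line.startswith("|"):
            # Drop the outer pipes, then split the row into cells.
            return [cell.strip() for cell in line.strip("|").split("|")]
    return []  # no table row found


# Made-up sample table, for illustration only.
sample = """
| Feature | Model |
|---------|-------|
| feature-a (model-x) | model-x |
| feature-b | None |
"""

columns = markdown_header(sample)
print(columns)  # ['Feature', 'Model']
print("Model" in columns)  # True
```

With a real result, `markdown_header(table2.to_markdown())` would play the role of `markdown_header(sample)`, subject to the format assumption above.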