## XYZ

> For while man strives he errs.  

&mdash; [Goethe, *Faust, Part 1* (Kline translation).](https://www.gutenberg.org/files/14591/14591-h/14591-h.htm#PROLOGUE_IN_HEAVEN)

It's great fun to chat with a large language model about a book you have both read. But as the LLM is scaled down in size, the quality of the conversation diminishes. This project is an experiment to see how a smaller LLM performs at this task when retrieval-augmented generation (RAG) with contextual retrieval techniques is applied. This Anthropic [blog post](https://www.anthropic.com/news/contextual-retrieval) and [guide](https://github.com/anthropics/anthropic-cookbook/blob/main/skills/contextual-embeddings/guide.ipynb) are used as references, but adapted to work without third-party services.

How to get started:
```bash
docker compose up -d
python3.9 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

In [261]:
from itertools import repeat
import json
import math
import multiprocessing
import warnings
from lxml import etree
from more_itertools import chunked
import ollama
import psycopg
from tqdm import tqdm

warnings.filterwarnings("ignore", category=DeprecationWarning) 

MODEL='llama3.2:1b'
DB_DSN='host=127.0.0.1 port=5433 dbname=postgres user=postgres password=password'

In [24]:
llm = ollama.Client(host='http://127.0.0.1:11435')

In [18]:
llm.pull(MODEL)

{'status': 'success'}

In [149]:
def llm_generate(prompt, chunks):
    # https://github.com/ollama/ollama-python
    data = '\n'.join([c[0] for c in chunks])
    stream = llm.generate(
        model=MODEL, 
        prompt=f'Using this data: "{data}". Respond to this prompt: {prompt}', 
        stream=True
    )
    for chunk in stream:
        response = chunk['response']
        print(response, end='', flush=True)
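
The prompt sent to the model is a plain string template. As a minimal sketch, the assembly can be factored into a helper so it can be inspected without a running Ollama server. The name `build_prompt` is hypothetical and not part of the notebook; it assumes `chunks` is a list of 1-tuples, as returned by `cursor.fetchall()`.

```python
def build_prompt(prompt, chunks):
    """Hypothetical helper: assemble the same prompt string llm_generate sends."""
    # Each row from cursor.fetchall() is a 1-tuple, so the text is at index 0.
    data = '\n'.join(row[0] for row in chunks)
    return f'Using this data: "{data}". Respond to this prompt: {prompt}'


# With no retrieved chunks (the "No RAG" case) the data section is empty.
print(build_prompt('What was the bargain?', []))
print(build_prompt('What was the bargain?', [('chunk one',), ('chunk two',)]))
```

Note that even in the "No RAG" case the model still sees the `Using this data: ""` preamble, which may itself influence the answer.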

### No RAG
**Grade: D**

It confuses Goethe with another writer (Friedrich Schiller in the run shown below). The details of the deal are what make it interesting, and this answer is too vague.

In [117]:
PROMPT_BARGAIN = "In Goethe's Faust what was the bargain he made with the devil?"
llm_generate(PROMPT_BARGAIN, [])

In Friedrich Schiller's drama "Faust," not Johann Wolfgang von Goethe, it is actually Faust who makes a deal with the devil. 

According to the play, Faust agrees to renounce his mortal life and earthly pleasures in exchange for 24 years of supernatural knowledge, power, and pleasure from Mephistopheles. In return, Mephistopheles grants Faust mastery over fire and darkness, as well as the ability to shape-shift into various animals.

In [242]:
def read_faust():
    'Read the play and chunk it based on part, act, scene, and character speaking.'

    # https://lxml.de/api.html#iteration
    # https://shallowsky.com/blog/programming/parsing-html-python.html
    
    faust = 'Faust, Goethe & A. S. Kline.html'
    iter = etree.iterparse(faust, html=True, events=('start', 'end'))

    # skip header
    for event, element in iter:
        if event == 'start' and element.tag == 'body':
            break

    unwanted = {
        'line-number',
        'small-font'     # picture description
    }

    acc = ''
    h1 = ''
    h2 = ''
    h3 = ''
    character = ''

    def chunk():
        if acc.strip():
            yield ((h1, h2, h3, character), acc)

    for event, element in iter:
        classes = set(element.attrib.get('class', '').split(' '))
        text = (element.text or '').strip()
        tail = (element.tail or '').strip()
        if text and not classes & unwanted: 
            if event == 'start' and element.tag == 'h1':
                yield from chunk()
                acc = ''
                h1 = text
                h2 = ''
                h3 = ''
                character = ''
            elif event == 'start' and element.tag == 'h2':
                yield from chunk()
                acc = ''
                h2 = text
                h3 = ''
                character = ''
            elif event == 'start' and element.tag == 'h3':
                yield from chunk()
                acc = ''
                h3 = text
                character = ''
            elif event == 'start' and 'play-char' in classes:
                yield from chunk()
                acc = ''
                character = text
            elif event == 'start':
                acc += text
            elif event == 'end':
                acc += tail
                acc += '\n'
    yield from chunk()
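
The parser above behaves like a small state machine: a heading clears every level nested below it, and each heading or speaker change flushes the accumulated text as one chunk. The sketch below shows that logic on simplified, pre-extracted `(kind, text)` pairs instead of lxml events. The function name `chunk_events` and the event kinds are assumptions made for illustration only.

```python
def chunk_events(events):
    """Hypothetical sketch: yield ((h1, h2, h3, character), text) chunks.

    `events` is an iterable of (kind, text) pairs where kind is one of
    'h1', 'h2', 'h3', 'char' or 'text'. This stands in for the lxml
    start/end events used in read_faust().
    """
    levels = ['h1', 'h2', 'h3', 'char']
    state = {level: '' for level in levels}
    acc = []

    for kind, text in events:
        if kind == 'text':
            acc.append(text)
            continue
        # A heading or speaker change closes the chunk collected so far.
        if any(part.strip() for part in acc):
            yield tuple(state[level] for level in levels), '\n'.join(acc)
        acc = []
        # Set this level and clear every level nested below it.
        index = levels.index(kind)
        state[kind] = text
        for lower in levels[index + 1:]:
            state[lower] = ''

    # Flush whatever is left after the last event.
    if any(part.strip() for part in acc):
        yield tuple(state[level] for level in levels), '\n'.join(acc)


demo_events = [
    ('h1', 'Part I'), ('h2', 'Scene I'),
    ('char', 'Faust'), ('text', 'First line.'), ('text', 'Second line.'),
    ('char', 'Wagner'), ('text', 'A reply.'),
    ('h2', 'Scene II'), ('char', 'Faust'), ('text', 'Another line.'),
]
demo_chunks = list(chunk_events(demo_events))
for header, body in demo_chunks:
    print(header, repr(body))
```

Each speech becomes one chunk, labelled with the headings that were in effect when it was spoken.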

In [243]:
def read_faust_simple_context():
    for context, acc in read_faust():
        # Header line: the non-empty heading and speaker names, comma separated.
        yield ','.join(x for x in context if x) + '\n' + acc

In [244]:
# Debugging purposes
with open('faust-chunks.txt', 'w') as f:
    for chunk in read_faust_simple_context():
        f.write(chunk)
        f.write('================================================\n')
        f.flush()

In [92]:
# Create the vector database table
with psycopg.connect(DB_DSN) as conn:
    with conn.cursor() as cursor:
        cursor.execute("""
            CREATE EXTENSION IF NOT EXISTS vector;
            CREATE TABLE embed (
                id bigserial,
                technique smallint,
                value vector(2048),
                chunk text
            );""")
        conn.commit()

In [265]:
def insert_embeddings(technique, read_fn):
    'Insert the play embeddings and chunks into the vector database'
    # https://ollama.com/blog/embedding-models
    # https://github.com/pgvector/pgvector/tree/master?tab=readme-ov-file#storing
    num_embeddings = sum(1 for _ in read_faust())
    with psycopg.connect(DB_DSN) as conn:
        with conn.cursor() as cursor:
            cpu_count = multiprocessing.cpu_count()
            total = math.ceil(num_embeddings / cpu_count)
            for chunks in tqdm(chunked(read_fn(), cpu_count), total=total):
                resp = llm.embed(MODEL, chunks)
                values = [json.dumps(e) for e in resp['embeddings']]
                params = zip(repeat(technique), values, chunks)
                cursor.executemany('INSERT INTO embed(technique, value, chunk) VALUES (%s, %s, %s);', params)
                conn.commit()
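
The insert loop embeds the chunks in batches and sizes the progress bar by the number of batches, not the number of chunks. The sketch below reproduces that arithmetic with the standard library only. `batched` is a hypothetical stand-in for `more_itertools.chunked`, and the texts and batch size are made up.

```python
import json
import math
from itertools import islice


def batched(iterable, size):
    """Stdlib stand-in for more_itertools.chunked: yield lists of up to `size` items."""
    iterator = iter(iterable)
    while batch := list(islice(iterator, size)):
        yield batch


texts = [f'chunk {i}' for i in range(10)]
batch_size = 4

batches = list(batched(texts, batch_size))
total = math.ceil(len(texts) / batch_size)  # progress-bar total, in batches

# json.dumps turns an embedding into the '[...]' text form that the
# notebook passes as the vector parameter.
vector_param = json.dumps([0.5, -1.0, 2.0])

print(len(batches), total, vector_param)
```

The last batch is shorter than the others whenever the chunk count is not a multiple of the batch size, which is why the total is rounded up.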

In [158]:
RAG = 0
insert_embeddings(RAG, read_faust_simple_context)

100%|██████████| 261/261 [00:00<00:00, 1457.71it/s]


In [253]:
def retrieve_chunks(technique, prompt):
    resp = llm.embed(MODEL, prompt)
    query_param = json.dumps(resp['embeddings'][0])
    with psycopg.connect(DB_DSN) as conn:
        with conn.cursor() as cursor:
            cursor.execute('SELECT chunk FROM embed WHERE technique=%s ORDER BY value <-> %s LIMIT 20;', [technique, query_param])
            return cursor.fetchall()
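
The query orders rows by `<->`, pgvector's Euclidean (L2) distance operator, and keeps the 20 nearest chunks. As a rough in-memory illustration of the same ranking, the sketch below uses toy two-dimensional vectors. The function `nearest` and the sample rows are hypothetical.

```python
import math


def nearest(query, rows, limit=20):
    """Hypothetical in-memory equivalent of ORDER BY value <-> query LIMIT n."""
    # math.dist computes the Euclidean distance between two points.
    ranked = sorted(rows, key=lambda row: math.dist(query, row[0]))
    return [chunk for _, chunk in ranked[:limit]]


rows = [
    ([0.0, 0.0], 'origin'),
    ([1.0, 1.0], 'near'),
    ([5.0, 5.0], 'far'),
]
top_two = nearest([0.9, 0.9], rows, limit=2)
print(top_two)
```

L2 distance is sensitive to vector length as well as direction; pgvector also offers a cosine distance operator (`<=>`), which may be worth comparing in this experiment.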

In [255]:
chunks = retrieve_chunks(RAG, PROMPT_BARGAIN)
chunks

[('Faust: Parts I & II,Act V,Scene VI: The Great Outer Court of the Palace,Chorus\n\n\xa0\n\n\n It’s past.\n\xa0\n\n\n',),
 ('Faust: Parts I & II,Act V,Scene I: Open Country,The Wanderer\n\n\n\n\n\n\xa0\n\n\n Yes! Here are the dusky lindens,\nStanding round, in mighty age.\n\nAnd here am I, returning to them, \n\nAfter so long a pilgrimage!\n\nIt still appears the same old place:\n\nHere’s the hut that sheltered me,\n\nWhen the storm-uplifted wave,\n\nHurled me shore-wards from the sea! \n\nMy hosts are those I would bless,\n\nA brave, a hospitable pair,\n\nWho if I meet them, I confess,\n\nMust already be white haired.\n\nAh! They were pious people! \n\nShall I call, or knock? – Greetings,\n\nIf, as open-hearted, you still\n\nEnjoy good luck, in meetings!\n\n\xa0\n\n\n',),
 ('Faust: Parts I & II,Part II,Scene III: A Spacious Hall with Adjoining Rooms,The Boy Charioteer\n Let’s hear more! Go on: go on,\nFind the riddle’s bright solution.\n\n\xa0\n\n',),
 ('Faust: Parts I & II,Part I,Sc

### RAG
**Grade: C-**

A little better than last time: at least it gets the author right. There are a lot more details, but unfortunately most of them are wrong.

In [256]:
llm_generate(PROMPT_BARGAIN, chunks)

In Johann Wolfgang von Goethe's Faust, the titular character makes a pact with the devil Mephistopheles in exchange for six years of his immortal life. This bargain is negotiated in Act V, Scene VI of both parts I and II.

According to the terms of their agreement, Faust agrees to use his immortal life to achieve three earthly goals:

1. To win over and marry the beautiful Helen Schmid (Act III, Scene IV)
2. To learn Greek and become a scholar, thereby gaining knowledge and wisdom (Act IV, Scene I)
3. To acquire the power of eloquence and persuade the crowd, thereby achieving fame and success

In return, Mephistopheles agrees to grant Faust six years of his immortal life in exchange for these three earthly goals. The devil promises that Faust will achieve all three of these goals within the specified timeframe.

It is worth noting that Goethe's Faust is a tragic work, and the bargain made by Faust ultimately leads to his downfall. Despite achieving some of his earthly goals, Faust is u

In [250]:
def read_faust_better_context(debug=False):
    "Faust is really long and exceeds llama3.2's context window, so context is added per scene."
    current_scene = (None, None, None)
    dialog = []

    def chunk():
        dialog_body = '\n'.join(dialog)
        if debug:
            dialog_body = dialog_body[:20] + '...' + dialog_body[-20:]
        scene_body = 'In ' + ','.join(x for x in current_scene if x) + '\n' + dialog_body
        for quote in dialog:
            yield scene_body, quote

    for ((document, part_or_act, scene, character), quote) in read_faust():
        if current_scene != (document, part_or_act, scene):
            yield from chunk()
            dialog = []
        
        if character:
            quote = f'{character} says:\n{quote}'

        dialog += [quote]
        current_scene = (document, part_or_act, scene)
    yield from chunk()
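
The function above pairs every speech with the full text of the scene it belongs to. The same grouping can be sketched with `itertools.groupby`, which merges consecutive items that share a key. The function `pair_with_scene` and the sample data are hypothetical; the input tuples are assumed to have the shape that `read_faust()` yields.

```python
from itertools import groupby


def pair_with_scene(chunks):
    """Hypothetical sketch: yield (scene_body, quote) for every quote in a scene.

    `chunks` yields ((document, part_or_act, scene, character), text) tuples,
    the same shape read_faust() produces.
    """
    # groupby only merges consecutive items, which matches reading the play in order.
    for scene_key, group in groupby(chunks, key=lambda item: item[0][:3]):
        quotes = [
            f'{header[3]} says:\n{text}' if header[3] else text
            for header, text in group
        ]
        scene_body = 'In ' + ','.join(x for x in scene_key if x) + '\n' + '\n'.join(quotes)
        for quote in quotes:
            yield scene_body, quote


demo = [
    (('Faust', 'Part I', 'Scene I', 'Faust'), 'Line A'),
    (('Faust', 'Part I', 'Scene I', 'Wagner'), 'Line B'),
    (('Faust', 'Part I', 'Scene II', ''), 'Stage direction'),
]
scene_pairs = list(pair_with_scene(demo))
for scene_body, quote in scene_pairs:
    print(repr(scene_body), '->', repr(quote))
```

Both speeches in Scene I receive the same scene body, so the scene text is repeated once per speech when it is later sent to the model.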

In [251]:
# Debugging purposes
with open('faust-context-chunks.txt', 'w') as f:
    for scene_body, quote in read_faust_better_context(True):
        f.write(scene_body)
        f.write('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n')
        f.write(quote)
        f.write('================================================\n')
        f.flush()

In [258]:
CONTEXT_PROMPT = '''
<scene> 
{} 
</scene> 
Here is the chunk we want to situate within the whole scene.
<chunk> 
{} 
</chunk> 
Please give a short succinct context to situate this chunk within the overall document for the purposes of improving search retrieval of the chunk. Answer only with the succinct context and nothing else. 
'''

def read_faust_and_add_context():
    for scene_body, quote in read_faust_better_context():
        context = llm.generate(
            model=MODEL, 
            prompt=CONTEXT_PROMPT.format(scene_body, quote)
        )['response']
        yield f'{quote}\n\n{context}'
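
The contextual step does not depend on which model writes the context, so the flow can be tried offline by passing the generator in as a parameter. In the sketch below, `add_context`, `fake_generate`, and the shortened `TEMPLATE` are all hypothetical stand-ins; the stub replaces the `llm.generate(...)['response']` call.

```python
TEMPLATE = '<scene>\n{}\n</scene>\n<chunk>\n{}\n</chunk>\nGive a short context for this chunk.'


def add_context(pairs, generate):
    """Hypothetical sketch: append generated context to each quote before embedding."""
    for scene_body, quote in pairs:
        context = generate(TEMPLATE.format(scene_body, quote))
        yield f'{quote}\n\n{context}'


def fake_generate(prompt):
    """Stub standing in for llm.generate(...)['response']; returns a fixed string."""
    return 'CONTEXT: spoken in the study scene.'


sample_pairs = [('In Part I,Scene I\nFaust says:\nLine A', 'Faust says:\nLine A')]
contextualized = list(add_context(sample_pairs, fake_generate))
print(contextualized[0])
```

The text that gets embedded is the original quote followed by the generated context, so a query can match either the wording of the speech or the description of where it sits in the scene.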

In [266]:
CONTEXTUAL_EMBEDDING = 1
insert_embeddings(CONTEXTUAL_EMBEDDING, read_faust_and_add_context)

  4%|▍         | 10/261 [2:52:14<78:43:25, 1129.10s/it]
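
The progress bar shows 10 of 261 batches finished at about 1129.10 seconds per batch. Each batch holds up to `cpu_count` chunks, and every chunk needs its own `llm.generate` call before it is embedded. The remaining-time estimate is simply the remaining batches multiplied by that rate, as this short calculation shows:

```python
total_batches = 261
done_batches = 10
seconds_per_batch = 1129.10  # rate reported by tqdm

remaining_seconds = (total_batches - done_batches) * seconds_per_batch
hours, remainder = divmod(remaining_seconds, 3600)
minutes = remainder // 60

print(f'about {int(hours)} h {int(minutes)} min remaining')
```

This works out to roughly 78 hours and 43 minutes, in line with the estimate in the progress bar.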

In [236]:
llm.generate(
    model=MODEL,
    prompt='Please give a short succinct answer, what color is the sky?'
)

{'model': 'llama3.2:1b',
 'created_at': '2024-10-11T00:55:22.313670626Z',
 'response': 'The color of the sky varies depending on the time of day and atmospheric conditions. At sunrise and sunset, the sky often appears pinkish or reddish due to scattering of light by atmospheric particles. During daytime, the sky typically appears blue.',
 'done': True,
 'done_reason': 'stop',
 'context': [128006, 9125, 128007, 271, 38766, 1303, 33025, 2696, 25, 6790,
  220, 2366, 18, 271, 128009, 128006, 882, 128007, 271, 5618,
  3041, 264, 2875, 99732, 4320, 11, 1148, 1933, 374, 279,
  13180, 30, 128009, 128006, 78191, 128007, 271, 791, 1933, 315,
  279, 13180, 35327, 11911, 389, 279, 892, 315, 1938, 323,
  45475, 4787, 13, 2468, 64919, 323, 44084, 11, 279, 13180,
  3629, 8111, 18718, 819, 477, 63244, 819, 4245, 311, 72916,
  315, 3177, 555, 45475, 19252, 13, 12220,