In [2]:
import os
import yaml
from dotenv import load_dotenv
from openai import OpenAI

## Setup

- Load the OpenAI API key from `.env` using the `dotenv` package
- Start the client
- Define a function to get the embedding of a piece of text

In [3]:
load_dotenv()
client = OpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)
def get_embedding(text, model="text-embedding-ada-002"):
    # Replace newlines with spaces before sending the text to the API
    text = text.replace("\n", " ")
    return client.embeddings.create(input=[text], model=model).data[0].embedding

Define a function that parses a markdown file into its frontmatter metadata and its body text. Eventually we will probably want to insert some of the metadata into the text and clean up the markdown before embedding it.

In [5]:
def parse_markdown_file(file_path):
    """
    Reads a markdown file and returns its frontmatter as a dictionary and the rest of the text as a string.

    :param file_path: Path to the markdown file.
    :return: A tuple containing a dictionary of the frontmatter and a string of the markdown text.
    """
    with open(file_path, 'r', encoding='utf-8') as file:
        lines = file.readlines()

    # Check if the file starts with frontmatter (triple dashes)
    if lines and lines[0].strip() == '---':
        # Find the closing triple dashes, tolerating trailing whitespace
        # and a missing final newline
        end_frontmatter_idx = next(
            (i for i, line in enumerate(lines[1:], start=1) if line.strip() == '---'),
            None,
        )
        if end_frontmatter_idx is None:
            # Closing triple dashes not found: treat the whole file as text
            frontmatter = {}
            markdown_text = ''.join(lines)
        else:
            # safe_load returns None for an empty frontmatter block
            frontmatter = yaml.safe_load(''.join(lines[1:end_frontmatter_idx])) or {}
            markdown_text = ''.join(lines[end_frontmatter_idx + 1:])
    else:
        frontmatter = {}
        markdown_text = ''.join(lines)
    return frontmatter, markdown_text

In [7]:
markdown_file = "/Users/tim/Library/Mobile Documents/iCloud~md~obsidian/Documents/Taelgar/People/Other Nonhumans/Grash.md"
metadata, text = parse_markdown_file(markdown_file)

In [9]:
print(metadata)
print(text)

{'headerVersion': '2023.11.25', 'tags': ['person', 'status/tim'], 'displayDefaults': {'endStatus': 'killed'}, 'campaignInfo': [{'campaign': 'DuFr', 'type': 'scryed', 'date': datetime.date(1748, 12, 28)}, {'campaign': 'DuFr', 'type': 'defeated', 'date': datetime.date(1749, 1, 20)}], 'name': 'Grash', 'born': None, 'died': datetime.date(1749, 1, 20), 'species': 'undead', 'ancestry': 'skeletal', 'gender': 'male', 'excludePublish': ['clee'], 'whereabouts': [{'type': 'away', 'start': 1747, 'end': datetime.date(1748, 11, 28), 'location': 'Kharsan'}, {'type': 'away', 'start': datetime.date(1748, 11, 28), 'end': datetime.date(1748, 12, 5), 'location': 'Garamjala'}, {'type': 'away', 'start': datetime.date(1748, 12, 5), 'end': datetime.date(1748, 12, 14), 'location': 'Xurkhaz'}, {'type': 'away', 'start': datetime.date(1748, 12, 14), 'end': 9999, 'location': 'Uzgukhar'}]}
%% Tim: Not sure if you want to flip the status to needswork or accept this page is good enough. I didn't set the active year b
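The printed text starts with an Obsidian `%% ... %%` comment, which is the kind of markup we would probably want to strip before embedding. A minimal sketch of that cleanup, combined with prepending a few frontmatter fields, might look like the following. The helper name `prepare_text_for_embedding`, the default choice of fields, and the specific regexes are assumptions for illustration, not something settled in this notebook.

```python
import re

def prepare_text_for_embedding(metadata, text, fields=("name", "species", "ancestry")):
    """Hypothetical helper: strip common markdown noise and prepend selected metadata."""
    # Remove closed Obsidian comments of the form %% ... %% (may span lines)
    cleaned = re.sub(r"%%.*?%%", "", text, flags=re.DOTALL)
    # Turn [[target|alias]] into "alias" and [[target]] into "target"
    cleaned = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]+)\]\]", r"\1", cleaned)
    # Drop heading markers at the start of lines
    cleaned = re.sub(r"^#+\s*", "", cleaned, flags=re.MULTILINE)
    # Drop emphasis characters (this also removes underscores inside words)
    cleaned = re.sub(r"[*_]", "", cleaned)
    # Collapse all whitespace, mirroring the newline handling in get_embedding
    body = " ".join(cleaned.split())
    # Only include fields that are present and non-empty in the frontmatter
    header = "; ".join(f"{key}: {metadata[key]}" for key in fields if metadata.get(key))
    return f"{header}\n{body}" if header else body
```

The result could then be passed to `get_embedding` in place of the raw `text`. Which fields actually help retrieval is something to test rather than assume.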

In [10]:
# Let's just try getting an embedding of the text first
text_embedding = get_embedding(text)

In [11]:
print(text_embedding)

[-0.0039802566, -0.014347032, 0.00090452464, -0.013908263, 0.01736966, 0.010990106, -0.048529185, -0.014820623, -0.019082945, -0.033485696, 0.022607023, 0.0019431175, -0.0059129274, -0.0028937825, -0.0067591234, 0.013984874, 0.027412582, 0.017926825, 0.01157513, -0.025908234, -0.021269822, -0.0033638915, -0.019152591, -0.0042553577, -0.014333103, 0.031953488, -0.005199058, -0.0017211215, 0.0046732323, -0.009179315, 0.007800328, -0.014806694, -0.017912896, 0.012132296, -0.03685655, -0.001504349, 0.0051677176, -0.027719023, 0.016798563, -0.0050005675, 0.014152024, 0.001728086, 0.002986063, -0.00012797413, 0.012097473, 0.004478224, -0.0020075398, -0.030867012, -0.004875205, 0.0247939, 0.036271527, 0.03721871, -0.009604154, 0.009374323, 0.0078490805, -0.0046871617, 0.039948825, 0.016269255, 0.008385353, 0.0070307422, 0.016812492, -0.018762575, 0.001286706, 0.0061253468, -0.013344132, 0.010098639, -0.022035927, -0.02826226, -0.017759675, -0.010258825, 0.027983677, 0.011916394, 0.0024288967,
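A raw vector like this is mostly useful for comparison with other vectors. One common way to compare two embeddings is cosine similarity. The sketch below is plain Python, and the function name `cosine_similarity` is introduced here for illustration rather than taken from the notebook.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

For example, `cosine_similarity(get_embedding(a), get_embedding(b))` would score how close two notes are, with values nearer 1 meaning more similar.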

Now let's experiment with our session note summarization. 

The basic protocol will be:
- Read the session note and set metadata such as the date
- Set up a chat API call with a system message to extract a short tagline and a description
- Run it and see what happens

In [13]:
sys_prompt = (
    "You are a creative and careful assistant who is skilled in extracting summaries and meaningful content from text. "
    "You will receive a query that consists of possibly some optional context, followed by a potentially long text. "
    "This text will describe a narrative of one or more days, describing the events that happened in a fictional world. "
    "Your job is to summarize these narratives. "
    "You will return a JSON object that contains three things: "
    "1. A tagline: this is a 5-10 word tagline that could be used as a subtitle for the text; it should capture the main event of the narrative succinctly and clearly, and ALWAYS start with the words *in which* "
    "2. A summary: this is no more than 100 words, in the form of a markdown list. each element of the list should succinctly, clearly, and accurately summarize a main event from the narrative. Choose carefully to only summarize the primary or most important parts of the narrative.\n"
    "3. A list of people and places: using context, infer which words in the narrative may refer to people or places, and return a list of people and a list of places that are mentioned in the narrative. "
    "Your primary concern is summarization. Your goal is to extract the most important and relevant information from the text. "
    "You will remember that this text describes events in a fictional world. "
    "The text you receive will be formatted in markdown format, and you will ignore markdown formatting characters in your responses."
)
def get_session_summary(prompt, model="gpt-4-1106-preview", max_tokens=4000, system_prompt=sys_prompt):
    input_messages = []
    input_messages.append({"role": "system", "content": system_prompt})
    input_messages.append({"role": "user", "content": prompt})
    response = client.chat.completions.create(
        model=model,
        max_tokens=max_tokens,
        messages=input_messages,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0,
        temperature=1,
    )
    return response
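
Since the system prompt asks for a JSON object, one option worth trying is the Chat Completions JSON mode (`response_format={"type": "json_object"}`), which requires that the messages mention JSON somewhere, as the system prompt here does. The sketch below only assembles the request arguments so it can be inspected without calling the API. The helper name `build_summary_request` is hypothetical, and whether JSON mode changes the output for these notes is untested here.

```python
def build_summary_request(prompt, system_prompt, model="gpt-4-1106-preview", max_tokens=4000):
    """Hypothetical helper: keyword arguments for client.chat.completions.create."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
        ],
        # JSON mode: ask the API to return a single valid JSON object
        "response_format": {"type": "json_object"},
    }

# Usage with the notebook's client (not executed here):
# response = client.chat.completions.create(**build_summary_request(prompt, sys_prompt))
```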

In [14]:
session_note_path = "/Users/tim/taelgar/taelgarverse/docs/campaigns/dunmari-frontier/session-notes/session-1-dufr.md"
metadata, text = parse_markdown_file(session_note_path)
context = "Context: this describes events happening to a group of adventurers called the Dunmar Fellowship, occurring in the D&D world of Taelgar."
prompt = context + "\n===\n" + text
summary = get_session_summary(prompt)
print(summary)

ChatCompletion(id='chatcmpl-8iopyIgy90uddY4I4Gpmlui3x33nE', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='```json\n{\n  "tagline": "In which the Dunmar Fellowship defends Karawa",\n  "summary": [\n    "The Dunmar Fellowship fights off giant hyenas attacking the village of Karawa, saving all villagers.",\n    "Seeker, Wellby Goodbarrel, Kenzo, and Delwath collaborate in battle, with assistance from a divine woman named Beli.",\n    "Post-battle, leaders Speaker Candrosa and Elder Kisa discuss past attacks with the heroes, suspecting a larger threat.",\n    "The Fellowship tracks the beasts into the desert, losing their trail amidst box canyons and rocky terrain.",\n    "Wellby uses a grappling hook to survey the land and spots a dust cloud on the old trade road.",\n    "They meet Alesh, a Dunmari scout, who recalls a decade of relative safety from the Nashtkar threatening Karawa.",\n    "Following the trail to Gomat oasis, they enco
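
Note that `content` in the response above begins with a Markdown code fence (three backticks followed by `json`), so passing it straight to `json.loads` raises a `JSONDecodeError` at the first character. A minimal sketch of one workaround is to parse only the span from the first `{` to the last `}`. The helper name `parse_json_reply` is an assumption for illustration, and this will still fail if the reply is cut off before the closing brace.

```python
import json

def parse_json_reply(content):
    """Parse the JSON object spanning the first '{' to the last '}' in a reply."""
    start = content.find("{")
    end = content.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    # Anything outside the braces, such as a surrounding code fence, is ignored
    return json.loads(content[start:end + 1])
```

With the response above, this would be called as `parse_json_reply(summary.choices[0].message.content)`.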

In [20]:
import json
print(summary.choices[0].message.content)


```json
{
  "tagline": "In which the Dunmar Fellowship defends Karawa",
  "summary": [
    "The Dunmar Fellowship fights off giant hyenas attacking the village of Karawa, saving all villagers.",
    "Seeker, Wellby Goodbarrel, Kenzo, and Delwath collaborate in battle, with assistance from a divine woman named Beli.",
    "Post-battle, leaders Speaker Candrosa and Elder Kisa discuss past attacks with the heroes, suspecting a larger threat.",
    "The Fellowship tracks the beasts into the desert, losing their trail amidst box canyons and rocky terrain.",
    "Wellby uses a grappling hook to survey the land and spots a dust cloud on the old trade road.",
    "They meet Alesh, a Dunmari scout, who recalls a decade of relative safety from the Nashtkar threatening Karawa.",
    "Following the trail to Gomat oasis, they encounter three large lizards feasting on dead sheep upon arrival."
  ],
  "people": [
    "Seeker",
    "Wellby Goodbarrel",
    "Kenzo",
    "Delwath",
    "Beli",
    "Bady

JSONDecodeError: Expecting value: line 1 column 1 (char 0)