# Bullet Journal LLM

I currently use the SuperNote as a [Bullet Journal](https://www.youtube.com/watch?v=fm15cmYU0IM). This
notebook experiments with processing the SuperNote media: it takes the text extracted from note pages
and explores what we can infer or clean up in each note.

# Simple Daily Analysis

Look at an arbitrary day's note and see what can be inferred/cleaned up.

In [6]:
from pathlib import Path
import logging
import sys

import google.generativeai as genai
import yaml

_LOGGER = logging.getLogger(__name__)
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)

NOTES_DIR = Path("../texts")
PAGES_DIR = Path("../pages")
SECRETS = Path("../secrets.yaml")

MODEL_ID = 'gemini-1.5-flash'

secrets = yaml.safe_load(SECRETS.read_text())

genai.configure(api_key=secrets["gemini_api_key"])

_LOGGER.info("Initializing client library")
model = genai.GenerativeModel(MODEL_ID)

INFO:__main__:Initializing client library


In [7]:
import random

daily_notes = list(NOTES_DIR.glob("Daily-*.txt"))

random.shuffle(daily_notes)
daily_notes[0]

PosixPath('../texts/Daily-16-P20240201073944998430qrrAJQfuUG0J.txt')

In [8]:
text = daily_notes[0].read_text()
print(text)

0201 THU
✗ aot rent
- ai throughput notes
- Bard evcls :/videos?
' Ask Photos user study ,' run synthetic data % lab
notes for synthetic data
* - review Tl Databases slides
wrap up DMA rollout plan
' summarize s nth data next
' monthly reflection
• query lens transitions 1 t 0202 FRI * E serving TPU cal
Tristan email intro
- 15 mins: DMA rollout plan cleanup
1- synth date next steps
1. gap print return label
- At throughput bullet points
. review quality report
0203 SAT, 0204 SUN
- Boo 1-oct flights to LOW
. TP stand
Fill cavity endar * .


In [17]:
OCR_PROMPT = """You are analyzing OCR text from an e-ink SuperNote A5X. The text has been extracted
automatically by the device and is not perfect, so it will need some cleaning up.
The text is in English and is in a bullet journal format.
The bullet journal can contain a variety of different types of information, such as
notes, to-do lists, and reminders. 


An entry in a daily journal can have different types of bullets:
o event
. task
x completed task
- note
< moved to future log
> moved to another daily, weekly, or monthly note

Here are examples of what a daily note looks like:

1030 SUN
o Halloween Party @ 7PM @ Kid's school 
. Pick up candy
x Buy pumpkins
> Call mom

1031 MON
- Trick-or-treaters were so cute
. Call mom


Your task is to rewrite the journal in a consistent format and fix mistakes in
the OCR. Only answer with the corrected text, and no other commentary, as the
answer will be programmatically parsed.

Original OCR text {filename}:
{note}

Corrected bullet journal:
"""

response = model.generate_content(OCR_PROMPT.format(note=text, filename=str(daily_notes[0])))

In [18]:
print(response.text)

0201 THU
x paid rent
- AI throughput notes
- Bard evcls :/videos?
- Ask Photos user study, run synthetic data % lab
notes for synthetic data
- review TL Databases slides
- wrap up DMA rollout plan
- summarize synthetic data next
- monthly reflection
- query lens transitions 1 to 0202 FRI * E serving TPU calc
- Tristan email intro
- 15 mins: DMA rollout plan cleanup
- synth data next steps
- gap print return label
- AI throughput bullet points
- review quality report
0203 SAT, 0204 SUN
- Book 1-oct flights to LOW
- TP stand
- Fill cavity calendar * 
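
The prompt asks for output that can be parsed programmatically, so a small parser is a natural next step. The following is a minimal sketch, not part of the original notebook: `Day`, `parse_journal`, `HEADER_RE`, and `ENTRY_RE` are hypothetical names, and the header pattern assumes days look like `0201 THU` or `0203 SAT, 0204 SUN` as in the output above. Lines without a recognized bullet (such as `notes for synthetic data`) are kept separately rather than dropped.

```python
import re
from dataclasses import dataclass, field

# Bullet symbols and their meanings, as listed in OCR_PROMPT.
BULLETS = {
    "o": "event",
    ".": "task",
    "x": "completed task",
    "-": "note",
    "<": "moved to future log",
    ">": "moved to another note",
}

# Assumed header shape: "0201 THU" or "0203 SAT, 0204 SUN".
HEADER_RE = re.compile(r"^\d{4} [A-Z]{3}(?:,? \d{4} [A-Z]{3})*$")
# One bullet symbol, one space, then the entry text.
ENTRY_RE = re.compile(r"^([o.x<>-]) (.+)$")


@dataclass
class Day:
    header: str
    entries: list = field(default_factory=list)   # (kind, text) pairs
    unparsed: list = field(default_factory=list)  # lines with no known bullet


def parse_journal(text):
    """Group corrected journal lines under their day headers."""
    days = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if HEADER_RE.match(line):
            days.append(Day(header=line))
            continue
        if not days:
            # Entries before any header go under an empty header.
            days.append(Day(header=""))
        match = ENTRY_RE.match(line)
        if match:
            kind = BULLETS[match.group(1)]
            days[-1].entries.append((kind, match.group(2).strip()))
        else:
            days[-1].unparsed.append(line)
    return days
```

Calling `parse_journal(response.text)` on the output above should yield two `Day` objects; a non-empty `unparsed` list is a cheap signal that the model did not fully normalize the note.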



In [27]:
import PIL.Image

# Map a text file such as "Daily-16-P2024....txt" to its page image "Daily_16.png".
page_filename = PAGES_DIR / ("_".join(daily_notes[0].name.split("-")[0:2]) + ".png")
page_png = PIL.Image.open(page_filename)

DEBUG:PIL.PngImagePlugin:STREAM b'IHDR' 16 13
DEBUG:PIL.PngImagePlugin:STREAM b'IDAT' 41 65536


In [31]:


IMG_PROMPT = """You are analyzing an image of a page from an e-ink SuperNote A5X. The note is
handwritten, so the text is not always legible and will need some cleaning up. The
text is in English and is in a bullet journal format. The bullet journal can contain a
variety of different types of information, such as notes, to-do lists, and reminders. 

An entry in a daily journal can have different types of bullets:
  o event
  . task
  x completed task
  - note
  < moved to future log
  > moved to another daily, weekly, or monthly note
*   important (can be added left of another bullet)

Here are examples of what a daily note looks like:

1030 SUN
  o Halloween Party @ 7PM @ Kid's school 
  . Pick up candy
  x Buy pumpkins
* > Call mom

1031 MON
  - Trick-or-treaters were so cute
  . Call mom


Your task is to rewrite the journal in a consistent format and fix mistakes in
the OCR. Only answer with the corrected text, and no other commentary, as the
answer will be programmatically parsed.
"""

response = model.generate_content([IMG_PROMPT, page_png])

In [32]:
print(response.text)

0201 THU
x act rent
< ai through put notes
- Board overview/videos?
*x Ask Photos user study
**x run Synthetic data colab 
**x notes for Synthetic data
**x review TT Databases slides
*x Wrap up DMA rollout plan
x Summarize synth data next
< monthly reflection
* < query lens transitions

0202 FRI
**< Serving TPU calendar
*x Tristan email intro 
*x 15 mins; DMA rollout plan Cleanup
*x synth data next steps
x gap print return label
*x AI through put bullet points
*x review quality report

0203 SAT 0204 SUN
- Booked flights to LOW
*x TP stand
* > Fill cavity 
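
The text-based and image-based transcriptions disagree in several places (for example `x paid rent` versus `x act rent`), so it can help to look at them side by side. Below is a rough sketch using the standard library's `difflib`; the helper names `diff_transcriptions` and `similarity` are hypothetical. Note that the notebook reuses the `response` variable, so both results would need to be saved under separate names (assumed here to be plain strings) before comparing.

```python
import difflib


def diff_transcriptions(text_based, image_based):
    """Return unified-diff lines between two corrected transcriptions."""
    a = [line.rstrip() for line in text_based.strip().splitlines()]
    b = [line.rstrip() for line in image_based.strip().splitlines()]
    return list(
        difflib.unified_diff(
            a, b, fromfile="ocr_text", tofile="page_image", lineterm=""
        )
    )


def similarity(text_based, image_based):
    """Character-level similarity ratio between 0.0 and 1.0."""
    return difflib.SequenceMatcher(None, text_based, image_based).ratio()
```

Printing `"\n".join(diff_transcriptions(a, b))` shows which entries changed between the two approaches. The ratio is only a coarse indicator; it does not say which transcription is closer to what was actually written.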

