<a href="https://colab.research.google.com/github/louis030195/10x-google-photo/blob/main/batch_physical_journal_to_markdown.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 📸 Journal OCR Tool

### how to use
1. take pictures of your journal using your phone
2. make sure they sync to google photos
3. go to photos.google.com on your computer
4. select all the photos you want to process
5. click the 3-dot menu (top right) and choose 'download'
6. upload the zip file below and run all cells
7. copy the generated markdown text at the end

### setup
first, let's install required packages:

In [1]:
!pip install openai pillow pillow-heif

Collecting pillow-heif
  Downloading pillow_heif-0.21.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.8 kB)
Downloading pillow_heif-0.21.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Installing collected packages: pillow-heif
Successfully installed pillow-heif-0.21.0


### enter your openai api key
you can get one from: https://platform.openai.com/account/api-keys

In [2]:
from getpass import getpass
OPENAI_API_KEY = getpass('enter your openai api key: ')

enter your openai api key: ··········
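
if you rerun the notebook often, retyping the key gets tedious. a minimal sketch, assuming you may have the key in an `OPENAI_API_KEY` environment variable: check there first and only prompt when it is missing. `resolve_api_key` and its parameters are hypothetical names used for illustration, not part of the original notebook.

```python
import os
from getpass import getpass
from typing import Callable, Mapping, Optional


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    prompt_fn: Callable[[str], str] = getpass,
    var_name: str = "OPENAI_API_KEY",
) -> str:
    """Return the key from the environment if set, otherwise prompt for it."""
    # default to the real process environment; a plain dict can be passed for testing
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        # nothing in the environment: fall back to the hidden prompt
        key = prompt_fn("enter your openai api key: ").strip()
    if not key:
        raise ValueError("no api key provided")
    return key
```

to use it, replace the `getpass` line above with `OPENAI_API_KEY = resolve_api_key()`.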


### upload your photos zip file
use the file upload widget below:

In [3]:
from google.colab import files
uploaded = files.upload()
ZIP_PATH = list(uploaded.keys())[0]  # get the uploaded zip filename

Saving Photos (1).zip to Photos (1).zip
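
before spending api calls, it can help to confirm the upload really is a zip and see which files would be processed. a minimal sketch, assuming the same extension list the extractor class uses further down; `list_images_in_zip` and `SUPPORTED_FORMATS` are hypothetical names introduced here for illustration.

```python
import os
import zipfile
from typing import List

# assumed to mirror the extensions accepted by the extractor class
SUPPORTED_FORMATS = {'.jpg', '.jpeg', '.png', '.gif', '.webp', '.heic'}


def list_images_in_zip(zip_path: str) -> List[str]:
    """Return the names of archive members that look like supported images."""
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a zip file")
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        # match on the lowercased extension, in archive order
        return [
            name for name in zip_ref.namelist()
            if os.path.splitext(name)[1].lower() in SUPPORTED_FORMATS
        ]
```

calling `list_images_in_zip(ZIP_PATH)` after the upload cell returns the filenames that would be sent for ocr; an empty list means nothing in the archive matched.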


In [5]:
import zipfile
import base64
import os
from datetime import datetime
from pathlib import Path
import time
from openai import OpenAI
from typing import List, Dict
from PIL import Image
from io import BytesIO, StringIO

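class PhotoTextExtractor:
    def __init__(self, api_key: str):
        self.client = OpenAI(api_key=api_key)
        self.supported_formats = {'.jpg', '.jpeg', '.png', '.gif', '.webp', '.heic'}
        self.output_buffer = StringIO()
        # register the heic opener up front: process_zip calls Image.open()
        # on every file before encode_image gets a chance to register it
        try:
            from pillow_heif import register_heif_opener
            register_heif_opener()
        except ImportError:
            print("pillow-heif is not installed; .heic files will fail to open")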

    def is_image(self, filename: str) -> bool:
        ext = os.path.splitext(filename)[1].lower()
        return ext in self.supported_formats

    def encode_image(self, image_bytes: bytes, filename: str) -> str:
        if filename.lower().endswith('.heic'):
            try:
                from pillow_heif import register_heif_opener
                register_heif_opener()

                img = Image.open(BytesIO(image_bytes))
                buffer = BytesIO()
                img.save(buffer, format="PNG")
                buffer.seek(0)
                return base64.b64encode(buffer.getvalue()).decode('utf-8')
            except ImportError:
                print("please install pillow-heif")
                raise
        else:
            return base64.b64encode(image_bytes).decode('utf-8')

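    def extract_text_from_image(self, image_bytes: bytes, filename: str) -> Dict:
        base64_image = self.encode_image(image_bytes, filename)
        # encode_image converts heic to png and passes other formats through
        # unchanged, so the data url must carry the matching mime type
        ext = os.path.splitext(filename)[1].lower()
        mime_type = {
            '.jpg': 'image/jpeg', '.jpeg': 'image/jpeg', '.png': 'image/png',
            '.gif': 'image/gif', '.webp': 'image/webp', '.heic': 'image/png',
        }.get(ext, 'image/jpeg')

        try:
            response = self.client.chat.completions.create(
                model="gpt-4o",
                messages=[{
                    "role": "user",
                    "content": [
                        {"type": "text", "text": """Extract and list all visible text from this image.
                        Below the OCR'd text, add a bullet list of key elements from these journal notes.
                        """},
                        {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{base64_image}", "detail": "high"}}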
                    ]
                }],
                max_tokens=1000
            )

            return {
                "filename": filename,
                "text": response.choices[0].message.content,
                "timestamp": datetime.now().isoformat()
            }

        except Exception as e:
            print(f"error processing {filename}: {str(e)}")
            return {
                "filename": filename,
                "error": str(e),
                "timestamp": datetime.now().isoformat()
            }

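    def get_photo_date(self, img: Image.Image, filename: str) -> str:
        try:
            # public exif api; unlike the private _getexif() it is not jpeg-only
            exif = img.getexif()
            # DateTimeOriginal (36867) and DateTimeDigitized (36868) live in the
            # exif sub-ifd (0x8769); DateTime (306) is in the base ifd
            exif_ifd = exif.get_ifd(0x8769)
            for source, tag in [(exif_ifd, 36867), (exif_ifd, 36868), (exif, 306)]:
                if tag in source:
                    dt = datetime.strptime(str(source[tag]), '%Y:%m:%d %H:%M:%S')
                    return dt.strftime('%d%m%y')
        except Exception:
            pass

        # no usable exif date: fall back to the current date
        return datetime.now().strftime('%d%m%y')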

    def process_zip(self, zip_path: str) -> str:
        with zipfile.ZipFile(zip_path, 'r') as zip_ref:
            for file in zip_ref.namelist():
                if self.is_image(file):
                    print(f"processing: {file}")

                    # read image bytes
                    image_bytes = zip_ref.read(file)

                    # get date
                    img = Image.open(BytesIO(image_bytes))
                    photo_date = self.get_photo_date(img, file)

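                    # extract text
                    result = self.extract_text_from_image(image_bytes, file)
                    # failed requests return an "error" key instead of "text"
                    text = result.get('text') or f"(ocr failed: {result.get('error', 'unknown error')})"

                    # append to output buffer
                    self.output_buffer.write(f"""### ocr from {file}
#ocr #screenshot

extracted on: {result['timestamp']}

{text}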

---

""")

                    time.sleep(1)  # rate limiting

        return self.output_buffer.getvalue()

# process the uploaded zip
extractor = PhotoTextExtractor(OPENAI_API_KEY)
output_text = extractor.process_zip(ZIP_PATH)

# display results
print("\n=== generated markdown text below (copy this) ===\n")
print(output_text)

processing: Screenshot 2025-01-06 at 9.30.57 AM.png
processing: Screenshot 2025-01-06 at 9.07.26 AM.png

=== generated markdown text below (copy this) ===

### ocr from Screenshot 2025-01-06 at 9.30.57 AM.png
#ocr #screenshot

extracted on: 2025-01-06T18:29:05.547310

**Extracted Text:**

1. louis030195
   - PRO
2. Dashboard
3. Due today
   - 3
4. New cards
   - 25
5. Settings
6. DECKS
7. Flashcards
8. Notes
9. mindset reframes
   - 1
10. tactical openers
   - 1
11. power moves
   - 1
12. deep connection
13. professional/startup context
14. social warmup
15. dating/romantic context
16. Import
17. Trash
   - 1
18. mindset reframes
19. Q: When feeling self-conscious at an event, remember:
   - + 1 HIDDEN SIDE
   - 0 DAYS
20. Q: When thinking "did i say something wrong?", reframe to:
   - + 1 HIDDEN SIDE
   - NEW
21. Q: When anxious about silence in conversation, remember:
   - + 1 HIDDEN SIDE
   - NEW
22. Q: When worried "am i being awkward?", remind yourself:
   - + 1 HIDDEN SIDE
   - N