# Logseq Full-Text Search

This notebook loads Markdown documents from a Logseq vault into PostgreSQL and queries them with full-text search.

## Setup

Create a `.env` file in the project root with:
```
DB_HOST=your_host
DB_PORT=5432
DB_NAME=your_database
DB_USER=your_user
DB_PASSWORD=your_password
```
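
How `init_db` reads this file is not shown in the notebook. As a minimal sketch of the idea, assuming plain `KEY=VALUE` lines and libpq-style connection keywords, the settings could be parsed like this (`read_env` and `connection_kwargs` are hypothetical helpers, not part of `logseq_searcher`):

```python
from pathlib import Path


def read_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def connection_kwargs(settings: dict[str, str]) -> dict:
    """Map the .env keys listed above onto libpq-style connection keywords."""
    return {
        "host": settings["DB_HOST"],
        "port": int(settings.get("DB_PORT", "5432")),
        "dbname": settings["DB_NAME"],
        "user": settings["DB_USER"],
        "password": settings["DB_PASSWORD"],
    }
```

The resulting dictionary has the shape that PostgreSQL drivers such as psycopg accept as keyword arguments to `connect`.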

In [None]:
from pathlib import Path

from logseq_searcher import (
    init_db,
    create_schema,
    load_logseq_vault,
    search,
    advanced_search,
    get_document,
    get_document_count,
)

# Initialize database connection from .env file
init_db(Path('../.env'))
print("Database initialized")

In [None]:
# Path to the Logseq vault
LOGSEQ_PATH = Path.home() / 'git' / 'active' / 'logseq-personal'

print(f"Logseq vault: {LOGSEQ_PATH}")
print(f"Pages directory exists: {(LOGSEQ_PATH / 'pages').exists()}")
print(f"Journals directory exists: {(LOGSEQ_PATH / 'journals').exists()}")

## Create Database Schema

`create_schema()` creates a `documents` table with the following columns:
- `id`: Auto-incrementing primary key
- `filename`: Original filename
- `doc_type`: Either 'page' or 'journal'
- `title`: Document title (derived from filename)
- `content`: Full markdown content
- `content_tsv`: Full-text search vector (auto-generated)
- `created_at`: Timestamp
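
The actual DDL lives inside `create_schema` and is not shown here. A table matching the column list above might look roughly like the following sketch, which assumes PostgreSQL 12 or later (for generated columns), the `english` text search configuration, and a GIN index; the column types and the index name are guesses:

```python
# Hypothetical DDL matching the column list above; the real statements
# are inside logseq_searcher.create_schema and may differ.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS documents (
    id          SERIAL PRIMARY KEY,
    filename    TEXT NOT NULL,
    doc_type    TEXT NOT NULL CHECK (doc_type IN ('page', 'journal')),
    title       TEXT NOT NULL,
    content     TEXT NOT NULL,
    -- Kept in sync by PostgreSQL itself whenever title or content changes.
    content_tsv TSVECTOR GENERATED ALWAYS AS (
        to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''))
    ) STORED,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- A GIN index lets the @@ match operator avoid scanning every row.
CREATE INDEX IF NOT EXISTS documents_content_tsv_idx
    ON documents USING GIN (content_tsv);
"""
```

A stored generated column is one way to get an "auto-generated" search vector; a trigger that updates the column on insert and update would be an alternative on older PostgreSQL versions.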

In [None]:
create_schema()
print("Schema created successfully")

## Load Documents
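
`load_logseq_vault` is imported from `logseq_searcher`, so its internals are not visible in this notebook. As a rough sketch of what such a loader might do before inserting rows, assuming every `.md` file directly under `pages/` and `journals/` becomes one document whose title is the filename without its extension (`iter_vault_documents` is a hypothetical name):

```python
from pathlib import Path
from typing import Iterator


def iter_vault_documents(vault: Path) -> Iterator[dict]:
    """Yield one record per Markdown file under pages/ and journals/."""
    for subdir, doc_type in (("pages", "page"), ("journals", "journal")):
        folder = vault / subdir
        if not folder.is_dir():
            continue  # tolerate a vault that lacks one of the two directories
        for md_file in sorted(folder.glob("*.md")):
            yield {
                "filename": md_file.name,
                "doc_type": doc_type,
                "title": md_file.stem,  # filename without the .md suffix
                "content": md_file.read_text(encoding="utf-8"),
            }
```

Each yielded dictionary carries the four values a row in `documents` needs from the file; `id`, `content_tsv`, and `created_at` would be filled in by the database.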

In [None]:
result = load_logseq_vault(LOGSEQ_PATH)
print(f"Loaded {result['pages']} pages and {result['journals']} journals")
print(f"Total: {result['total']} documents")

In [None]:
# Verify the data was loaded
counts = get_document_count()
for doc_type, count in counts.items():
    print(f"{doc_type}: {count} documents")

## Search Functions
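
The result rows displayed below carry `rank` and `headline` fields. The query inside `search` is not shown, but fields like these are typically produced by PostgreSQL's `ts_rank` and `ts_headline`. The sketch below builds such a query under several assumptions: `plainto_tsquery` for parsing, the `english` configuration, and psycopg-style `%s` placeholders. `build_search_query` is a hypothetical helper.

```python
from typing import Optional


def build_search_query(query: str, limit: int = 10, doc_type: Optional[str] = None):
    """Return (sql, params) for a ranked full-text search over documents."""
    sql = (
        "SELECT d.id, d.title, d.doc_type, "
        "ts_rank(d.content_tsv, q) AS rank, "
        "ts_headline('english', d.content, q) AS headline "
        "FROM documents AS d, plainto_tsquery('english', %s) AS q "
        "WHERE d.content_tsv @@ q"
    )
    params: list = [query]
    if doc_type is not None:
        # Optional filter, mirroring search(..., doc_type='journal').
        sql += " AND d.doc_type = %s"
        params.append(doc_type)
    sql += " ORDER BY rank DESC LIMIT %s"
    params.append(limit)
    return sql, params
```

Passing the user's text as a bound parameter rather than formatting it into the SQL string keeps the query safe from injection. `plainto_tsquery` requires every term to match, so a multi-word query narrows the results.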

In [None]:
def display_results(results: list) -> None:
    """Display search results in a readable format."""
    if not results:
        print("No results found.")
        return

    print(f"Found {len(results)} result(s):\n")
    for i, r in enumerate(results, 1):
        print(f"{i}. [{r['doc_type']}] {r['title']}")
        print(f"   Rank: {r['rank']:.4f}")
        print(f"   {r['headline']}")
        print()

## Example Searches

In [None]:
# Example: Search for "Feynman"
results = search("Feynman", limit=5)
display_results(results)

In [None]:
# Example: Search only in journals
results = search("Roam", limit=5, doc_type='journal')
display_results(results)

In [None]:
# Example: Search for multiple terms
results = search("Python programming", limit=5)
display_results(results)

## Advanced Search

For more control over the query, use `advanced_search`, which supports:
- `"quoted phrases"`
- `OR` for alternatives
- `-` for exclusion

In [None]:
# Example: Search for exact phrase
results = advanced_search('"favorite problems"', limit=5)
display_results(results)

In [None]:
# Fetch a single document by ID (uncomment to run; ID 1 is only an example)
# doc = get_document(1)
# print(doc['content'][:500])