TweetKB

Private X/Twitter bookmark knowledge base. Local SQLite. Terminal-first.

MIT license. Python 3.11+.

TweetKB turns saved X/Twitter bookmarks into a private knowledge base you can search, classify, enrich, review, and export. It reads from your logged-in browser, stores everything in local SQLite, and writes portable notes for Obsidian, Logseq, Markdown, JSONL, CSV, or a searchable HTML analysis spec.

It ships with no bookmark database, never uploads your archive, and never needs your X/Twitter password.

From saved bookmarks | To usable knowledge
--- | ---
Browser collection from X bookmarks | Local SQLite database with review state
Tweet text, raw visible text, links | Categories, entities, summaries, tags
Question posts and threads | Captured reply/context notes for analysis
Linked pages and visible media metadata | Obsidian notes, Markdown, JSONL, CSV, or interactive specs

Quick Start

Install as a global uv tool:

uv tool install git+https://github.com/AdityaVG13/TweetKB.git
uv tool update-shell

Open a new terminal, then run:

tweetkb init
tweetkb

That gives you the direct tweetkb command. No uv run needed after install.

Use a source checkout when you want to develop, run tests, or inspect the code:

git clone https://github.com/AdityaVG13/TweetKB.git TweetKB
cd TweetKB
uv sync --extra dev
uv run tweetkb init
uv run tweetkb

What You Can Build

Need | Command path
--- | ---
Collect bookmarks from a logged-in browser | tweetkb collect
Classify and analyze selected slices | tweetkb analyze --stage all
Capture full posts, links, and thread context | tweetkb enrich --apple-events
Export an interactive analysis bundle | tweetkb analyze-export --adapter spec --vault ./exports/spec
Review or exclude low-confidence items | tweetkb review list or tweetkb serve

Support Open Source

TweetKB is free, local-first, and open source. If it saves you time or helps you turn a messy X bookmark backlog into useful knowledge, donations help fund maintenance, documentation, testing, screenshots, and future open-source work.

Support TweetKB on Ko-fi

Privacy Model

TweetKB is built around a simple rule: your bookmark archive stays yours.

  • The repository tracks only data/.gitkeep, not a database.
  • Runtime data stays under data/ by default and is ignored by git.
  • Export folders such as obsidian-vault/ and exports/ are ignored.
  • Browser profiles, cookies, .env, and local config files are ignored.
  • Collection is read-only. It scrolls and reads visible bookmark content.
  • It never posts, likes, follows, deletes, messages, or changes account settings.
  • Cloud LLM providers are off unless you explicitly enable them.

Requirements

  • Python 3.11 or newer
  • uv
  • Chrome or Chromium for bookmark collection
  • browser-harness on PATH for Browser-Harness collection

Install uv:

curl -LsSf https://astral.sh/uv/install.sh | sh

Official uv docs: https://docs.astral.sh/uv/getting-started/installation/

Upgrade a tool install:

uv tool upgrade tweetkb

Optional local config:

cp tweetkb.example.toml tweetkb.toml

tweetkb.toml is ignored because it may contain private filesystem paths.

Start with the menu

Run:

tweetkb

From a source checkout, use uv run tweetkb.

The menu can initialize the database, open login Chrome, collect bookmarks, analyze selected slices, analyze and export to a chosen folder, enrich posts, export notes, review bookmarks, show stats, generate clusters, mine project ideas, export graphs, run TweetZip, start the review UI, run doctor checks, and run the release audit.

Use menu item "5a. Analyze + export to folder" when you want TweetKB to run analysis and write the resulting analysis documents to a specific output folder in one step.

Every menu action prints the exact tweetkb ... command before running it. Long-running commands also print progress lines such as selected counts, bookmark IDs, enrich URLs, and final totals.

Common workflow

Open the login browser:

uv run tweetkb login

Collect a small batch:

uv run tweetkb collect --limit 100 --batch-size 20

Classify first:

uv run tweetkb analyze --stage classify

Run heavier analysis only where it matters:

uv run tweetkb analyze --stage entities --include-category ai-agents,coding
uv run tweetkb analyze --stage embed --exclude-category misc --needs-review

Export to Obsidian:

uv run tweetkb export --adapter obsidian --vault ./obsidian-vault

Open the local review UI:

uv run tweetkb serve

Then open http://127.0.0.1:8765.

Collection modes

Install Browser-Harness before collecting bookmarks:

git clone https://github.com/browser-use/browser-harness ~/Developer/browser-harness
cd ~/Developer/browser-harness
uv tool install -e .
browser-harness --setup
browser-harness --doctor

TweetKB can collect through Browser-Harness managed Chrome, normal Chrome CDP, or macOS Apple Events. These collectors are deterministic local browser automation, not AI browser agents, so collection does not need an LLM model or Browser Use cloud API.

See Browser-Harness setup for managed Chrome, normal Chrome, and troubleshooting notes.

Interactive collection defaults to Apple Events against your already-open normal Chrome bookmarks tab. Open https://x.com/i/bookmarks, then choose menu item "3. Collect bookmarks" and press Enter to accept apple-events.

When you collect --all, TweetKB stops after it reaches already-saved bookmark history. Use --no-stop-at-existing only when you intentionally want a full timeline rescan.
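
The stop-at-existing behavior can be pictured with a minimal sketch. It assumes the timeline arrives newest first as a list of status IDs and that one already-saved ID is enough to stop; the function name `new_bookmarks` and its parameters are hypothetical, and TweetKB's real collector may use a different stopping rule.

```python
from typing import Iterable, Iterator


def new_bookmarks(
    timeline: Iterable[str],
    known_ids: set[str],
    stop_at_existing: bool = True,
) -> Iterator[str]:
    """Yield status IDs that are not stored yet, in timeline order."""
    for status_id in timeline:
        if status_id in known_ids:
            if stop_at_existing:
                # Reached saved history: assume everything older is stored too.
                break
            # Full rescan: skip the known ID but keep walking the timeline.
            continue
        yield status_id


# Hypothetical timeline, newest first, with two IDs already in the database.
timeline = ["105", "104", "103", "102", "101"]
known = {"103", "101"}

print(list(new_bookmarks(timeline, known)))         # ['105', '104']
print(list(new_bookmarks(timeline, known, False)))  # ['105', '104', '102']
```

In this sketch the default run misses `102`, which sits behind an already-saved bookmark. That gap is the case a full rescan is meant to cover.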

Default Browser-Harness collection:

uv run tweetkb collect --limit 100 --batch-size 20

Normal Chrome profile:

uv run tweetkb chrome-debug
uv run tweetkb collect --normal-chrome --existing-tab --limit 100

macOS Apple Events fallback:

uv run tweetkb collect --apple-events --all --batch-size 10 --wait 1

For Apple Events mode, Chrome must allow JavaScript from Apple Events.

Analyze selected bookmarks

Category filters use existing classifications. Run classification once, then target later stages:

uv run tweetkb analyze --stage classify
uv run tweetkb analyze --stage all --limit 100
uv run tweetkb analyze --stage entities --include-category ai-agents,coding
uv run tweetkb analyze --stage embed --exclude-category misc --needs-review
uv run tweetkb analyze --stage all --reviewed
uv run tweetkb analyze-export --stage all --adapter spec --vault ./exports/spec
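
As an illustration of why classification has to run first, here is a small sketch of include/exclude filtering over stored categories. The dictionary shape and the `filter_by_category` helper are assumptions made for this example, not TweetKB's internal API.

```python
from typing import Iterable, Optional

Bookmark = dict[str, Optional[str]]


def filter_by_category(
    bookmarks: Iterable[Bookmark],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> list[Bookmark]:
    """Keep bookmarks whose stored category passes both filters."""
    include_set = set(include) if include else None
    exclude_set = set(exclude or ())
    selected = []
    for bookmark in bookmarks:
        category = bookmark.get("category")
        # An unclassified bookmark (category None) can never match an include list.
        if include_set is not None and category not in include_set:
            continue
        if category in exclude_set:
            continue
        selected.append(bookmark)
    return selected


# Hypothetical rows: two classified bookmarks and one not yet classified.
rows = [
    {"id": "1", "category": "coding"},
    {"id": "2", "category": "misc"},
    {"id": "3", "category": None},
]

print([r["id"] for r in filter_by_category(rows, include=["ai-agents", "coding"])])  # ['1']
print([r["id"] for r in filter_by_category(rows, exclude=["misc"])])                 # ['1', '3']
```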

Enrich saved posts

enrich opens saved status URLs in logged-in Chrome and captures long-form post or article text. It also captures visible thread/reply context by default when the bookmarked tweet looks like a question, so answers in the discussion become part of later analysis. Add --include-media to analyze attached images; image descriptions and OCR text are stored with the bookmark and included in later classification, entity extraction, embeddings, and exports.

uv run tweetkb enrich --apple-events --limit 100 --wait 4
uv run tweetkb enrich --apple-events --include-conversation always --max-conversation-items 20
uv run tweetkb enrich --apple-events --include-links --max-links 3
OPENAI_API_KEY=... uv run tweetkb enrich --apple-events --include-media --vision-provider openai --vision-detail high
uv run tweetkb enrich --apple-events --include-media --vision-provider ollama --vision-model llava
uv run tweetkb media-export --out ./exports/media-review
uv run tweetkb analyze --stage all

Vision providers:

  • openai: sends image URLs to the OpenAI Responses API. Requires OPENAI_API_KEY.
  • ollama: downloads images locally and sends them to an Ollama vision model.
  • metadata: stores captured image alt text only, useful as a no-model fallback.

If you do not want to configure a vision API key, run media-export after enrichment. It downloads captured tweet images into a folder with manifest.jsonl, index.md, and AI_REVIEW_PROMPT.md so you can point any AI assistant at the folder for manual visual analysis.

Conversation modes:

  • auto: capture thread/reply context for question-like bookmarks
  • always: capture visible thread/reply context for every enriched bookmark
  • never: capture only the bookmarked status/article
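
The three modes reduce to a small decision function. The sketch below is only an illustration: `looks_like_question` is a hypothetical stand-in heuristic (a question mark or a leading question word), and the README does not describe how TweetKB actually detects question-like posts.

```python
QUESTION_STARTERS = ("who", "what", "when", "where", "why", "how", "which")


def looks_like_question(text: str) -> bool:
    """Hypothetical heuristic: a '?' anywhere, or a leading question word."""
    stripped = text.strip().lower()
    if "?" in stripped:
        return True
    first_word = stripped.split(maxsplit=1)[0] if stripped else ""
    return first_word in QUESTION_STARTERS


def should_capture_conversation(mode: str, tweet_text: str) -> bool:
    """Map a conversation mode and the tweet text to a capture decision."""
    if mode == "always":
        return True
    if mode == "never":
        return False
    if mode == "auto":
        return looks_like_question(tweet_text)
    raise ValueError(f"unknown conversation mode: {mode!r}")


print(should_capture_conversation("auto", "What editor do you use for Rust?"))  # True
print(should_capture_conversation("auto", "Shipping v2 today."))                # False
print(should_capture_conversation("never", "Any tips?"))                        # False
```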

Export

uv run tweetkb export --adapter obsidian --vault ./obsidian-vault
uv run tweetkb export --adapter spec --vault ./exports/spec
uv run tweetkb export --adapter markdown --vault ./exports/markdown --exclude-category misc
uv run tweetkb export --adapter jsonl --vault ./exports/jsonl --exclude-review
uv run tweetkb export --adapter csv --vault ./exports/csv --include-category ai-agents,coding,models,tools

spec writes a static index.html with search, category filters, expandable analysis sections, captured thread/link context, entities, tags, and visible media metadata. It is meant for browsing the analysis as an interactive local spec instead of reading plain Markdown files.

How analysis documents are built

TweetKB builds exports from the local SQLite database:

  1. Collection stores the bookmarked status URL, author, visible tweet text, raw article text, timestamps, and mentioned links. Collection dedupes by X status ID, so seeing the same bookmark again updates the existing row.
  2. Enrichment opens saved X URLs in logged-in Chrome. It captures fuller status or article text, optional outbound linked pages, question-aware thread/reply context, and visible image metadata when the page exposes it. By default it selects only bookmarks missing the requested enrichment, up to the limit, newest first in collection (bookmark-page) order. Enrichment does not classify anything; that happens in the analysis step.
  3. Analysis joins the original tweet text with captured enrichments, hashes that combined text, and records per-stage analysis state so changed-only runs skip unchanged classification, entity extraction, and embedding work. Then it classifies categories, extracts entities, creates tags/summaries, writes "why it matters", and stores an embedding when that stage is selected.
  4. Export turns the stored analysis into the selected format. Markdown/Obsidian write note files. spec writes one interactive HTML analysis bundle.
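
Steps 1 and 3 above can be sketched with the standard library: an upsert keyed on the status ID for dedupe, and a content hash recorded per stage so unchanged bookmarks are skipped. The table names, column names, and helper functions here are assumptions for illustration and do not reflect TweetKB's actual schema.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per X status ID, one state row per (bookmark, stage).
conn.execute("CREATE TABLE bookmarks (status_id TEXT PRIMARY KEY, text TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE analysis_state ("
    "status_id TEXT, stage TEXT, content_hash TEXT, "
    "PRIMARY KEY (status_id, stage))"
)


def upsert_bookmark(status_id: str, text: str) -> None:
    """Insert a bookmark, or update the existing row if the ID was seen before."""
    conn.execute(
        "INSERT INTO bookmarks (status_id, text) VALUES (?, ?) "
        "ON CONFLICT(status_id) DO UPDATE SET text = excluded.text",
        (status_id, text),
    )


def content_hash(tweet_text: str, enrichments: list[str]) -> str:
    """Hash the tweet text joined with its captured enrichments."""
    joined = "\n".join([tweet_text, *enrichments])
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def needs_stage(status_id: str, stage: str, digest: str) -> bool:
    """True when the stage has never run or the combined text has changed."""
    row = conn.execute(
        "SELECT content_hash FROM analysis_state WHERE status_id = ? AND stage = ?",
        (status_id, stage),
    ).fetchone()
    return row is None or row[0] != digest


def mark_stage_done(status_id: str, stage: str, digest: str) -> None:
    """Record the hash a stage was last run against."""
    conn.execute(
        "INSERT INTO analysis_state (status_id, stage, content_hash) VALUES (?, ?, ?) "
        "ON CONFLICT(status_id, stage) DO UPDATE SET content_hash = excluded.content_hash",
        (status_id, stage, digest),
    )


upsert_bookmark("1", "first capture")
upsert_bookmark("1", "updated capture")  # same ID: updates, does not duplicate
print(conn.execute("SELECT COUNT(*) FROM bookmarks").fetchone()[0])  # 1

digest = content_hash("updated capture", [])
print(needs_stage("1", "classify", digest))  # True
mark_stage_done("1", "classify", digest)
print(needs_stage("1", "classify", digest))  # False
```

Adding an enrichment changes the joined text and therefore the hash, so in this sketch the next changed-only run would pick the bookmark up again.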

Images are analyzed only when enrichment runs with --include-media; the resulting descriptions and OCR text then feed later analysis. Otherwise the spec export shows only the image URLs and alt text captured during enrichment, and analysis is based on text, links, and thread context.

Review

uv run tweetkb review list --limit 50
uv run tweetkb review approve 1234567890123456789
uv run tweetkb review exclude 1234567890123456789
uv run tweetkb review tag 1234567890123456789 research
uv run tweetkb review junk --limit 25

Compression

TweetZip is an experimental local archive format for bookmark corpora.

uv run tweetkb compress export --out ./exports/bookmarks.twz
uv run tweetkb compress verify ./exports/bookmarks.twz
uv run tweetkb compress inspect ./exports/bookmarks.twz

Instructions for AI coding agents

Use this when an AI agent is asked to download, install, or verify TweetKB.

You are installing TweetKB from source.

Rules:
- Do not ask for X/Twitter credentials.
- Do not inspect or upload browser cookies, browser profiles, `.env`, `data/`,
  `exports/`, or vault folders.
- Do not run `collect`, `enrich`, `chrome-debug`, or `--normal-chrome` unless
  the user explicitly asks you to operate their browser.
- Use synthetic data for tests.

Install and verify:
1. Ensure `uv` exists. If missing, install it from the official Astral docs.
2. Run: git clone https://github.com/AdityaVG13/TweetKB.git TweetKB
3. Run: cd TweetKB
4. Run: uv sync --extra dev
5. Run: uv run tweetkb --db /tmp/tweetkb-smoke.sqlite3 init
6. Run: uv run tweetkb --help
7. Run: uv run tweetkb
8. Select `0` to quit the menu.
9. Run: uv run pytest
10. Run: uv run ruff check .
11. Run: uv run tweetkb release-audit

Success means the CLI works, the menu opens, tests pass, lint passes, and
release audit passes.

Shell-only verification:

git clone https://github.com/AdityaVG13/TweetKB.git TweetKB
cd TweetKB
uv sync --extra dev
uv run tweetkb --db /tmp/tweetkb-smoke.sqlite3 init
uv run tweetkb --help
printf '0\n' | uv run tweetkb
uv run pytest
uv run ruff check .
uv run tweetkb release-audit

Public release audit

Run this before publishing source or building artifacts:

uv run tweetkb release-audit

For a local folder that may contain ignored databases or vault exports:

uv run tweetkb release-audit --strict-worktree

See docs/RELEASE.md for the full release checklist.

Development

uv sync --extra dev
uv run pytest
uv run ruff check .
uv run python -m compileall src tests tools
uv build
