Releases: genli-ai/market-research-skills

v1.7.1 — strict-YAML-safe descriptions (skills.sh detects all 4)

09 Jun 11:02

Fixed

  • `verifying` & `analyst-research`: frontmatter `description` is now valid under strict YAML parsers. Both previously used a single-line unquoted scalar containing `: ` (colon-space) at `Triggers:` / `Covers five scenarios:` / `Not for:`, which strict parsers reject (`mapping values are not allowed here`). Claude Code's own loader is lenient — so all four skills always worked as a plugin — but skills.sh / `npx skills` (and other strict-YAML terminals) detected only 2 of 4. Converted both to `>-` folded block scalars (same style as `local-vault`). Description text is byte-identical; triggering behavior unchanged.
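
The colon-space failure can be approximated without a YAML library. The sketch below is a heuristic only: `plain_scalar_is_risky` and `to_folded_block` are hypothetical helper names, the check covers just the `: ` case named above (a strict parser rejects more than this), and the sample description is invented for illustration.

```python
import textwrap


def plain_scalar_is_risky(value: str) -> bool:
    """Heuristic: an unquoted single-line scalar containing ': ' trips strict YAML parsers."""
    stripped = value.strip()
    quoted = stripped[:1] in ("'", '"')
    block = stripped[:1] in (">", "|")
    return not quoted and not block and ": " in stripped


def to_folded_block(key: str, value: str, width: int = 80) -> str:
    """Re-emit `key: value` as a `>-` folded block scalar with two-space indentation.

    Assumes the text uses single spaces between words, so folding the
    wrapped lines back together reproduces the original string.
    """
    lines = textwrap.wrap(
        value, width=width, break_long_words=False, break_on_hyphens=False
    )
    body = "\n".join("  " + line for line in lines)
    return f"{key}: >-\n{body}"


# Invented sample text, not the real skill description.
sample = "Fact-check claims. Triggers: verify, check"
if plain_scalar_is_risky(sample):
    print(to_folded_block("description", sample, width=20))
```

Because a folded scalar joins its lines with single spaces and `>-` strips the trailing newline, the parsed value matches the original one-line text under the stated assumption.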

Added

  • README Option 6 — `npx skills` / skills.sh cross-terminal install:
    ```bash
    npx skills add genli-ai/market-research-skills -a claude-code -s '*'
    ```
    The repo is now indexed by skills.sh (GitHub topics `skills-sh` / `claude-skill`). This path resolves the latest from `main` — no version to pin.

All four skill zips are attached (repo-wide snapshot at v1.7.1).

🤖 Generated with Claude Code

v1.7.0 — local-vault: PDF plain-text fallback + cached-whisper reuse

08 Jun 14:43

local-vault

Added

  • PDF plain-text fallback. When `pymupdf4llm` crashes mid-extraction (commonly a missing-font error such as `MuPDF no font file for digest`), the converter now falls back to a local PyMuPDF plain-text pass (`page.get_text()`) before resorting to cloud MinerU, so a font glitch no longer forces a cross-border cloud round-trip. The result is marked `converted_by: pymupdf (plain text)` so it is identifiable as a degraded artifact.
  • Reuse an already-downloaded whisper model without re-asking. When .env has no model choice yet but a model's weights are already in the user-global HuggingFace cache, the tool auto-selects the best cached tier (turbo > large-v3 > small > tiny) and proceeds — no menu, no re-download, works non-interactively. The 'show download sizes + ask' prompt only appears when there is actually something to download.
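
The cached-model selection can be pictured as a simple preference scan. This is a minimal sketch, not the tool's actual code: `pick_cached_tier` is a hypothetical name, cache discovery is left out (the caller passes in the tiers it found), and the rule that an explicit choice wins is an assumption inferred from the wording above.

```python
from typing import Iterable, Optional

# Preference order quoted from the release note: turbo > large-v3 > small > tiny.
TIER_PREFERENCE = ("turbo", "large-v3", "small", "tiny")


def pick_cached_tier(configured: Optional[str], cached: Iterable[str]) -> Optional[str]:
    """Return the tier to use, or None when a download prompt is still needed."""
    if configured:
        # Assumption: a model already chosen in .env is used as-is.
        return configured
    available = set(cached)
    for tier in TIER_PREFERENCE:
        if tier in available:
            return tier
    return None  # nothing cached: the caller would show sizes and ask


print(pick_cached_tier(None, ["tiny", "large-v3"]))
```

Returning `None` rather than a default keeps the "only prompt when there is something to download" decision with the caller.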

Changed

  • A pymupdf4llm error on a digital PDF now logs 'trying plain-text fallback first' instead of going straight to MinerU.
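
The resulting order (rich extraction, then local plain text, then cloud) can be sketched as a generic fallback chain. The converters are injected as callables so the example stays independent of any PDF library; `convert_with_fallback` and the stage labels other than `pymupdf (plain text)` are illustrative assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Converter = Callable[[str], str]


def convert_with_fallback(
    path: str, stages: Sequence[Tuple[str, Converter]]
) -> Tuple[str, str]:
    """Try each (label, converter) in order; return (text, label) from the first success."""
    errors: List[str] = []
    for label, convert in stages:
        try:
            return convert(path), label
        except Exception as exc:  # broad on purpose: extraction errors vary by library
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all converters failed: " + "; ".join(errors))


def failing_rich_pass(path: str) -> str:
    # Stand-in for a rich extractor that hits a missing-font error.
    raise ValueError("no font file for digest")


text, converted_by = convert_with_fallback(
    "report.pdf",
    [
        ("pymupdf4llm", failing_rich_pass),
        ("pymupdf (plain text)", lambda p: "plain text body"),
        ("mineru", lambda p: "cloud result"),
    ],
)
print(converted_by)
```

The returned label is what a caller could write into the `converted_by` frontmatter field, so a degraded result stays identifiable.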

Full snapshot of the repo at v1.7.0. All four skill zips attached: analyst-research, local-vault, topic-brief, verifying.

v1.6.0 — local-vault: platform-aware whisper engine + first-run model consent

06 Jun 20:16

Added

  • local-vault: the whisper engine is now auto-selected by platform, with mlx-whisper (GPU-native) on Apple Silicon and faster-whisper (CTranslate2, cross-platform CPU/CUDA) everywhere else. Both paths normalize to one transcript shape, so the rest of the pipeline is engine-agnostic. Override with KB_WHISPER_ENGINE (auto/mlx/faster).
  • local-vault: first-run consent + model-size picker. When audio/video is detected and no model has been chosen yet, the tool (in a terminal) shows each model tier with its download size — tiny ~75 MB / small ~480 MB / turbo ~1.6 GB (recommended) / large-v3 ~3 GB — and lets the user pick or skip. The choice is saved to .env (KB_WHISPER_MODEL) so it never re-asks. On consent the engine is auto-installed (pip install --user). Non-interactive runs (e.g. claude -p) never download a multi-GB model unattended — they report a hint to run once in a terminal first.
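
Platform-based engine selection with an environment override might look roughly like this. `choose_engine` is a hypothetical function; the detection rule (macOS plus an `arm64` machine string) is an assumption about how Apple Silicon would be recognized, and it would misreport a Python process running under Rosetta.

```python
import os
import platform
from typing import Optional


def choose_engine(
    override: Optional[str] = None,
    system: Optional[str] = None,
    machine: Optional[str] = None,
) -> str:
    """Return 'mlx' or 'faster', honoring KB_WHISPER_ENGINE (auto/mlx/faster)."""
    raw = override if override is not None else os.environ.get("KB_WHISPER_ENGINE", "")
    mode = raw.strip().lower() or "auto"
    if mode in ("mlx", "faster"):
        return mode
    if mode != "auto":
        raise ValueError(f"unsupported KB_WHISPER_ENGINE value: {raw!r}")
    system = system or platform.system()
    machine = machine or platform.machine()
    # Assumption: Apple Silicon reports Darwin + arm64.
    return "mlx" if (system == "Darwin" and machine == "arm64") else "faster"


print(choose_engine())
```

Passing `system` and `machine` explicitly makes the decision testable on any host.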

Changed

  • Audio/video no longer needs a manual pip install of the transcription engine — the tool installs the right one for the platform after you consent. ffmpeg remains a prerequisite (system package; the tool prints the platform-appropriate install command if missing and skips, rather than auto-installing a system package).
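
A check-and-hint preflight for ffmpeg, as opposed to auto-installing it, could be sketched as follows. `ffmpeg_hint` and the hint wording are assumptions; only `brew install ffmpeg` comes from the notes, and the other platforms get a deliberately generic message.

```python
import platform
import shutil
from typing import Callable, Optional

# Only the macOS command is taken from the release notes; the rest is generic.
INSTALL_HINTS = {
    "Darwin": "brew install ffmpeg",
    "Linux": "install the ffmpeg package from your distribution",
}
DEFAULT_HINT = "install ffmpeg and make sure it is on PATH"


def ffmpeg_hint(
    system: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return None when ffmpeg is available, else a one-line install hint."""
    if which("ffmpeg"):
        return None
    system = system or platform.system()
    return "ffmpeg not found; " + INSTALL_HINTS.get(system, DEFAULT_HINT)


hint = ffmpeg_hint()
print(hint or "ffmpeg available")
```

A caller would skip the audio/video batch when a hint comes back, rather than raising.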

Setup: brew install ffmpeg (or distro pkg) — that's it; the engine installs itself on first consent.

v1.5.0 — local-vault: audio/video local transcription (whisper)

06 Jun 19:57

Added

  • local-vault: audio & video now transcribe locally via whisper (mlx-whisper). .mp3/.m4a/.wav/.aac/.flac/.ogg and .mp4/.mov/.m4v were previously skipped; they now route to process_transcribe, producing a Markdown transcript with the same source backlink + retrieval frontmatter as every other converter. Fully local — no token, no quota, no API key (first run downloads the model ~1.6 GB once, then offline). Per-segment [mm:ss] timestamps (audio/video equivalent of page-number traceback) + auto-detected language. Video is audio-track only — ffmpeg pulls the audio stream from the container, so video and audio share one path (no keyframe OCR in this version).
  • New config knobs in scripts/config.py: TRANSCRIBE_EXTS, WHISPER_MODEL (default mlx-community/whisper-large-v3-turbo), WHISPER_LANGUAGE, TRANSCRIBE_TIMESTAMPS.
  • Fail-soft dependency preflight — when ffmpeg or mlx-whisper is missing, the whole audio/video batch is reported as skipped with a one-line install hint instead of repeated tracebacks.
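
Per-segment `[mm:ss]` timestamps can be rendered with a few lines of formatting code. This is a sketch under assumptions: the segment shape (dicts with `start` in seconds and `text`) and the names `format_timestamp` / `render_segments` are illustrative, and minutes are allowed to run past 59 rather than rolling over into hours.

```python
from typing import Dict, Iterable, Union

Segment = Dict[str, Union[float, str]]


def format_timestamp(seconds: float) -> str:
    """Format a start offset in seconds as [mm:ss], truncating fractions."""
    minutes, secs = divmod(int(seconds), 60)
    return f"[{minutes:02d}:{secs:02d}]"


def render_segments(segments: Iterable[Segment]) -> str:
    """Render one transcript line per segment, prefixed with its timestamp."""
    return "\n".join(
        f"{format_timestamp(float(seg['start']))} {str(seg['text']).strip()}"
        for seg in segments
    )


print(render_segments([
    {"start": 0.0, "text": " Welcome to the session."},
    {"start": 75.9, "text": " First topic."},
]))
```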

Notes

  • mlx-whisper is Apple-Silicon-only. On Intel/Linux the preflight degrades gracefully (files skipped with hint, no crash) — swap in a portable engine (e.g. faster-whisper) for cross-platform use.
  • Best on clear speech (lectures, interviews, meetings, podcasts). Songs/music transcribe poorly — an ASR limitation, not a bug; use the source backlink to verify against the original.

Setup for the new feature: brew install ffmpeg && python3 -m pip install --user mlx-whisper

v1.4.0 — local-vault: local HTML conversion + PDF image tightening

05 Jun 06:12

local-vault

Changed

  • .html/.htm now convert locally via pandoc instead of MinerU cloud (no token/quota, faster). HTML is pre-cleaned — style/class/id attributes and layout-only div/section/span wrappers stripped — so the vault gets the article content, not inline-CSS noise. Raw HTML is kept so complex tables survive losslessly (no [TABLE] placeholder loss).
  • Digital-PDF image extraction tightened. Size floor 5% → 12% of page area, plus a min-bytes drop (PYMUPDF4LLM_IMAGE_MIN_BYTES, default 6000) and content de-duplication (a logo repeated on every page is stored once). Fixes the slowdown / no-value-image clutter from v1.3.0.
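
The three image filters (area floor, byte floor, content de-duplication) compose naturally into one pass. The sketch below is illustrative only: `ExtractedImage` and `keep_images` are hypothetical names, a single page area is assumed for all images, and SHA-256 is one possible way to detect identical content.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ExtractedImage:
    data: bytes
    width: float
    height: float


def keep_images(
    images: Iterable[ExtractedImage],
    page_area: float,
    min_area_ratio: float = 0.12,  # size floor from the note: 12% of page area
    min_bytes: int = 6000,         # default of PYMUPDF4LLM_IMAGE_MIN_BYTES
) -> List[ExtractedImage]:
    """Drop small, lightweight, or already-seen images; keep the rest in order."""
    seen = set()
    kept = []
    for img in images:
        if len(img.data) < min_bytes:
            continue
        if (img.width * img.height) / page_area < min_area_ratio:
            continue
        digest = hashlib.sha256(img.data).hexdigest()
        if digest in seen:  # e.g. a logo repeated on every page
            continue
        seen.add(digest)
        kept.append(img)
    return kept


page_area = 612 * 792  # US Letter in points
logo = ExtractedImage(b"x" * 7000, 400, 300)
print(len(keep_images([logo, logo, ExtractedImage(b"y" * 7000, 100, 100)], page_area)))
```

Running the cheap checks before hashing avoids computing digests for images that would be dropped anyway.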

Added

  • Master switch PYMUPDF4LLM_WRITE_IMAGES (default on; .env: KB_PDF_NO_IMAGES=1) to turn digital-PDF image extraction off entirely for a text-only, fastest run.

Release bundles all four skills (verifying / topic-brief / analyst-research / local-vault). Full notes: CHANGELOG.md.

v1.3.0 — local-vault: digital-PDF images + launcher relocation

05 Jun 04:40

local-vault

Added

  • Digital PDFs now keep their images. The pymupdf4llm path previously dropped every image (text-only output). It now extracts images (≥ PYMUPDF4LLM_IMAGE_SIZE_LIMIT, default 5% of page area) into attachments/<stem>/, renaming them to ASCII-safe names so source filenames with spaces/CJK don't break the Markdown links. Extracted images are discarded when a sparse extraction falls back to MinerU.
  • New tuning knob PYMUPDF4LLM_IMAGE_SIZE_LIMIT (scripts/config.py, default 0.05).
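
ASCII-safe renaming can be done in several ways; one possible scheme is sketched here. `ascii_safe_name` is a hypothetical helper, and the naming pattern (slugged stem, zero-padded index, hash fallback when no ASCII survives) is an assumption rather than the tool's actual convention. Two stems that share the same ASCII characters would collide, so a real implementation might always mix in a hash.

```python
import hashlib
import re


def ascii_safe_name(original_stem: str, index: int, ext: str = "png") -> str:
    """Build an ASCII-only, space-free file name for the index-th image of a source file."""
    stem = re.sub(r"[^A-Za-z0-9]+", "-", original_stem).strip("-").lower()
    if not stem:
        # Nothing ASCII survived (e.g. a purely CJK name): fall back to a short hash.
        stem = hashlib.sha1(original_stem.encode("utf-8")).hexdigest()[:8]
    return f"{stem}-{index:03d}.{ext}"


print(ascii_safe_name("Q3 report (final)", 2))
print(ascii_safe_name("年度 报告 2024", 1))
```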

Changed

  • sync.command is now placed in the knowledge-base root (parent of SOURCE), not inside SOURCE. A stale auto-generated copy left in SOURCE by an older version is removed automatically (user-written launchers untouched).
  • When a different sync.command already exists at the root, an interactive terminal prompts update / skip; non-interactively our own out-of-date launcher self-heals while a user-customized one is left alone.

Release bundles all four skills (verifying / topic-brief / analyst-research / local-vault). Full notes: CHANGELOG.md.

v1.2.0 — local-vault clickable launcher in SOURCE folder

03 Jun 07:10

local-vault

  • The sync pipeline now drops a clickable sync.command into the user's SOURCE folder (macOS), with the absolute path to sync.py baked in. Tool and data live apart: under /plugin install the script sits in ~/.claude/plugins/cache/…, so the old relative-path launcher was effectively unreachable for plugin users. With the new launcher, the daily loop (drop files → double-click → read the .md) works regardless of install path. The launcher is created by the setup wizard and idempotently on every run, and it self-heals by re-pointing itself when a plugin upgrade moves sync.py.
  • New KB_NO_LAUNCHER=1 .env flag to opt out.
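
An idempotent, self-healing launcher writer can be sketched in a few lines. Everything here is an assumption apart from the file name `sync.command`, the `KB_NO_LAUNCHER=1` flag, and the idea of baking in an absolute path: `ensure_launcher` is hypothetical, the launcher body is a guess, and this version does not try to recognize user-written launchers.

```python
import os
import shlex
import stat
from pathlib import Path
from typing import Mapping, Optional


def ensure_launcher(
    target_dir: Path, sync_script: Path, env: Optional[Mapping[str, str]] = None
) -> bool:
    """Write sync.command pointing at sync_script; return True if it was (re)written."""
    env = os.environ if env is None else env
    if env.get("KB_NO_LAUNCHER") == "1":
        return False  # opt-out flag from the release note
    launcher = target_dir / "sync.command"
    script_path = shlex.quote(str(sync_script.resolve()))
    wanted = f"#!/bin/bash\nexec python3 {script_path}\n"
    if launcher.exists() and launcher.read_text(encoding="utf-8") == wanted:
        return False  # already up to date: running again changes nothing
    launcher.write_text(wanted, encoding="utf-8")
    # macOS only runs a double-clicked .command file if it is executable.
    launcher.chmod(launcher.stat().st_mode | stat.S_IXUSR)
    return True
```

Comparing the desired content with what is on disk gives both properties at once: an unchanged path is a no-op, and a moved sync.py produces different content and triggers a rewrite.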

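Summary (from the Chinese note): the sync pipeline now places a double-clickable sync.command (macOS) in the user's source-file folder, with the absolute path to sync.py hard-coded. This fixes the case where a plugin install buries the tool deep in the cache and a relative-path launcher cannot reach it. The wizard creates the launcher, every run idempotently restores it, and it self-heals after upgrades. Set KB_NO_LAUNCHER=1 to turn it off.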

All four skills (verifying / topic-brief / analyst-research / local-vault) are attached as a full snapshot of this version.

v1.1.0 — local-vault skill

02 Jun 07:33

Added

New skill: local-vault — build and query a local Markdown knowledge base ("vault").

  • Convert / sync: turn raw files (PDF, Word/docx, PowerPoint/pptx, Excel/xlsx, csv/tsv, images, html, md/txt, code) into clean Markdown with retrieval-friendly frontmatter (abstract / tags / synonyms / key_data + a source backlink). Local-first (pandoc / pymupdf4llm / openpyxl / python-pptx); cloud OCR (MinerU) only as a fallback.
  • Retrieve / answer: answer questions over the vault with retrieval discipline — vault health check, coverage self-monitoring, lossy-content flags, and Maps-of-Content (MOC) proposals that grow from real usage.
  • Incremental sync, orphan staging (deleted sources never hard-deleted), frontmatter-only enrichment (document bodies never rewritten → zero content-loss risk from the tool).

The collection is now four skills: verifying / topic-brief / analyst-research / local-vault.

This release attaches the full set of skill zips (repo-wide snapshot at v1.1.0).


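Added (from the Chinese notes)

New skill: local-vault, for building and querying a local Markdown knowledge base.

  • Convert / sync: turn raw files such as PDF, Office documents, images, and code into clean Markdown with retrieval frontmatter. Local-first, with cloud OCR (MinerU) only as a last resort.
  • Retrieve / answer: answer responsibly from the vault, with a health check, coverage self-check, lossy-content flags, and MOCs that accumulate from real usage.
  • Incremental sync, orphan staging, and frontmatter-only edits (document bodies are never rewritten, so there is zero content-loss risk).

The collection is now four skills: verifying / topic-brief / analyst-research / local-vault. This release attaches the zips of all skills (full snapshot at v1.1.0).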

v1.0.0 — analyst-research 3-mode skill + full bilingual sources

28 May 12:45

First stable release. Bumps from v0.5.0 → v1.0.0, folding the internal 0.6.x analyst-research development series (never published) into one milestone.

Highlights

  • New skill analyst-research — three-mode end-to-end research workflow: light (4-5p memo, 0 charts, 60-80 min) / medium (12-15p, 6-10 charts, half-day) / heavy (30-40p flagship, 25-35+ charts, multi-LLM optional). Mode picked at trigger time.
  • BREAKING: light-research removed as a standalone skill — preserved verbatim as analyst-research light mode.
  • MODE_REGISTRY.md as the single source of truth for the 3 modes; SessionStart announce hook; AI-disclosure footer convention (spec §8).
  • Full Chinese mirrors (.zh.md) for analyst-research; bilingual-sync notes added to topic-brief / verifying.
  • Report-language policy: reports default English, AI replies in the user's chat language.
  • Fixed repo URL (reagan475614947 → genli-ai) in README + announce hook.

Skills in this release

  • verifying — fact-check against primary sources
  • topic-brief — single-piece HTML briefing generator
  • analyst-research — 3-mode research workflow

Full changelog: see CHANGELOG.md.

v0.5.0 — new light-research skill

17 May 20:07

Added

  • New skill: light-research — lightweight research workflow that produces a 5-page decision memo / executive brief (PDF + Word) in 60–80 min. Single LLM, 0 hard stops, BLUF consulting-style summary, plain text no charts, inline footnote citations.
    • 6-step skeleton: hypothesis → search → plan → draft → self-check → freeze
    • 1-question onboarding (hypothesis only), everything else locked to sensible defaults
    • 6 source categories, 20–30 source ledger + 5–8 core PDFs in full text
    • 12-item grep self-check enforces writing discipline (dashes, headings, bold-in-body, colon ratio, filler words, meta-language, page count, etc.)
    • Self-check failure auto-rolls back to step 4 — the only quality gate, no user-confirmation hard stop

Files

  • light-research.zip — new skill (this release)
  • topic-brief.zip — unchanged since v0.4.0, re-attached per release-completeness policy
  • verifying.zip — unchanged since v0.2.0, re-attached per release-completeness policy

Install (Claude Code)

/plugin marketplace add https://github.com/reagan475614947/market-research-skills.git
/plugin update market-research-skills@market-research-skills

See CHANGELOG.md for full details (English + Chinese).