
feat: add Bloomberg adapter#145

Merged
ByteYue merged 2 commits into jackwener:main from KasumiChen:feat/bloomberg-adapter on Mar 20, 2026


@KasumiChen
Contributor

Description

Adds a Bloomberg adapter with RSS-backed listing commands plus browser-backed article extraction for standard Bloomberg story/article pages.

This PR also includes the small manifest/compiler fix needed so helper/test TS files in src/clis/bloomberg/ do not get exposed as fake commands in opencli list.

Related issue:

Type of Change

  • 🐛 Bug fix
  • ✨ New feature
  • 🌐 New site adapter
  • 📝 Documentation
  • ♻️ Refactor
  • 🔧 CI / build / tooling

What this adds

New Bloomberg commands:

  • opencli bloomberg main
  • opencli bloomberg markets
  • opencli bloomberg economics
  • opencli bloomberg industries
  • opencli bloomberg tech
  • opencli bloomberg politics
  • opencli bloomberg businessweek
  • opencli bloomberg opinions
  • opencli bloomberg feeds
  • opencli bloomberg news <link>

What works now

  • The listing commands above are RSS-backed via feeds.bloomberg.com and return structured headline data without requiring a browser.
  • bloomberg news can read standard Bloomberg story/article pages and extract:
    • title
    • summary
    • canonical link
    • media links
    • full article text
  • bloomberg news accepts either a full Bloomberg URL or a relative Bloomberg path.
  • opencli list/manifest output only shows real Bloomberg commands (not helper files like utils or test files).
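A minimal sketch of how an RSS-backed listing command could turn feed XML into structured headline data without a browser. The `Headline` shape and the `readTag` and `parseRssHeadlines` helpers are hypothetical names for illustration, not the adapter's actual code, and a regex-based reader is assumed here only to keep the example dependency-free.

```typescript
// Hypothetical shape for one headline; the adapter's real fields may differ.
interface Headline {
  title: string;
  link: string;
  pubDate: string;
}

// Read the text of the first <tag>...</tag> inside a fragment,
// unwrapping a CDATA section if the value is wrapped in one.
function readTag(fragment: string, tag: string): string {
  const pattern = new RegExp(`<${tag}(?:\\s[^>]*)?>([\\s\\S]*?)</${tag}>`, "i");
  const match = fragment.match(pattern);
  if (!match) return "";
  return match[1].replace(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/, "$1").trim();
}

// Turn raw RSS XML into structured headlines, keeping at most `limit` items.
function parseRssHeadlines(xml: string, limit = 20): Headline[] {
  const items = xml.match(/<item\b[\s\S]*?<\/item>/gi) ?? [];
  return items.slice(0, limit).map((item) => ({
    title: readTag(item, "title"),
    link: readTag(item, "link"),
    pubDate: readTag(item, "pubDate"),
  }));
}
```

In a real command the XML would come from an HTTP request to the feed, and `limit` would map to the `--limit` option used in the manual checks.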

Out of scope / current limitations

  • Audio pages and some other non-standard Bloomberg URLs are intentionally out of scope for this PR and may still fail.
  • Pages that do not expose the expected __NEXT_DATA__ story payload may still fail.
  • Bloomberg bot-protection / access-gated pages may still fail.
  • This version is for data access/extraction only. It does not bypass Bloomberg paywall, login, entitlement, or other user-side access requirements. Whether an article is readable still depends on the user's own browser/session state.
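The `__NEXT_DATA__` limitation above can be illustrated with a small sketch. Next.js pages embed their page props as JSON in a `<script id="__NEXT_DATA__">` tag; when that tag or the expected story object is absent, extraction has nothing to work with. The function names and the `props.pageProps.story` lookup path below are assumptions for illustration. The PR does not state where the story payload lives, and the adapter itself reads the page through a browser rather than from an HTML string.

```typescript
// Pull the JSON embedded in <script id="__NEXT_DATA__"> out of raw HTML.
// Returns null when the script tag is missing or its body is not valid JSON.
function extractNextData(html: string): unknown {
  const match = html.match(
    /<script[^>]*\bid=["']__NEXT_DATA__["'][^>]*>([\s\S]*?)<\/script>/i,
  );
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null;
  }
}

// Hypothetical lookup path: the real location of the story payload is an assumption.
function findStory(data: unknown): Record<string, unknown> | null {
  const story = (data as any)?.props?.pageProps?.story;
  return story && typeof story === "object" ? story : null;
}
```

A caller would treat a `null` result from either function as "page does not expose the expected payload" and report an explicit error instead of returning empty content.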

Docs / wording updates

  • Added a new adapter doc page for Bloomberg.
  • Updated adapter docs index and VitePress sidebar.
  • Added README notes that clarify:
    • which Bloomberg commands are RSS-backed
    • that bloomberg news is for standard story/article pages
    • that audio/non-standard pages may fail
    • that OpenCLI does not bypass Bloomberg access controls
  • Updated bloomberg news command/help text and error messages to be explicit about current session/access requirements.

Tests / manual checks run

Automated

  • npm run build
  • npm run docs:build
  • npx vitest run src/clis/bloomberg/utils.test.ts src/build-manifest.test.ts
  • npx vitest run tests/e2e/public-commands.test.ts tests/e2e/browser-public.test.ts -t bloomberg

Manual

  • node dist/main.js list -f json
    • verified Bloomberg commands are registered and helper/test files are not exposed as commands
  • node dist/main.js bloomberg feeds -f json
  • node dist/main.js bloomberg main --limit 1 -f json
  • node dist/main.js bloomberg news <article-link> -f json
    • verified article extraction on an accessible standard /news/articles/... page

Checklist

  • I ran the checks relevant to this PR
  • I updated tests or docs if needed
  • I included output or screenshots when useful

Documentation (if adding/modifying an adapter)

  • Added doc page under docs/adapters/
  • Updated docs/adapters/index.md table
  • Updated sidebar in docs/.vitepress/config.mts

Screenshots / Output

Relevant manual smoke summary:

  • bloomberg main --limit 1 -f json returned live RSS data successfully
  • bloomberg news successfully extracted a standard article page with non-empty content

KasumiChen and others added 2 commits March 20, 2026 09:18
- news.ts: reorder flow (goto before loadStory), increase wait times for slow hydration, add clarity comments
- utils.ts: clarify validateBloombergLink regex (use non-capturing group)
- build-manifest.ts: log warning on scanTs parse failure (match scanYaml pattern)
- public-commands.test.ts: use it.each for section RSS tests (better isolation & reporting)
Collaborator

@ByteYue ByteYue left a comment


Review: Bloomberg Adapter PR

Nice work on this PR! Well-structured with good documentation and comprehensive tests. I've pushed a follow-up commit with a few improvements:

Changes made (c5a0bac):

  1. news.ts: Reordered the execution flow so that page.goto(url) now comes before the loadStory definition, which reads more clearly top-down. Increased wait times (5s initial + 4s retry) for Bloomberg's heavy Next.js pages.

  2. utils.ts: Clarified the hostname regex in validateBloombergLink by replacing ([.]|^) with (?:\\.|^). Both forms match the same hostnames; the non-capturing group makes it explicit that the group exists only for alternation.

  3. build-manifest.ts: Added console.warn when scanTs() fails to parse a file, matching the existing scanYaml() pattern. The previous silent return null could hide real I/O errors.

  4. public-commands.test.ts: Replaced the for-loop over 7 sections with it.each() for better test isolation — each section now gets its own test case with independent pass/fail reporting.
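To make item 2 concrete, here is a hedged sketch of hostname validation using the non-capturing form, combined with the "full URL or relative path" behavior from the PR description. `resolveBloombergLink` is a hypothetical stand-in: the real `validateBloombergLink` signature and error messages are not shown in this PR. The regex is written as a literal, so the dot is escaped with a single backslash.

```typescript
const BLOOMBERG_ORIGIN = "https://www.bloomberg.com";

// Non-capturing group: the hostname is bloomberg.com itself or any subdomain of it.
const BLOOMBERG_HOST = /(?:\.|^)bloomberg\.com$/i;

// Hypothetical helper: resolve a full URL or relative path and check its hostname.
function resolveBloombergLink(input: string): string {
  let url: URL;
  try {
    // Relative paths resolve against the Bloomberg origin; absolute URLs ignore it.
    url = new URL(input.trim(), BLOOMBERG_ORIGIN);
  } catch {
    throw new Error(`Not a valid URL or path: ${input}`);
  }
  if (!BLOOMBERG_HOST.test(url.hostname)) {
    throw new Error(`Not a Bloomberg link: ${input}`);
  }
  return url.toString();
}
```

The `(?:\.|^)` prefix is what rejects lookalike hosts: `evilbloomberg.com` has neither a dot nor the start of the string immediately before `bloomberg.com`, so it fails, while `www.bloomberg.com` and `bloomberg.com` both pass.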

Considered but kept:

  • Consolidating the 8 feed command files into a single factory — this would break the static manifest scanner which relies on regex to extract name: and description: from source files. Individual files are the correct pattern for this codebase.
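The constraint described above can be sketched as follows. This is an illustrative approximation, assuming the scanner reads literal `name:` and `description:` values from source text and skips helper and test files. `scanSource`, `isHelperFile`, and the file-name patterns are hypothetical and do not reproduce the project's `scanTs()`.

```typescript
interface CommandMeta {
  name: string;
  description: string;
}

// Files that sit next to commands but must not be listed as commands.
// These patterns are assumptions for illustration.
function isHelperFile(fileName: string): boolean {
  return /\.test\.ts$/.test(fileName) || fileName === "utils.ts";
}

// Statically read name/description string literals from source text without executing it.
function scanSource(fileName: string, source: string): CommandMeta | null {
  if (isHelperFile(fileName)) return null;
  // The backreference \1 requires the closing quote to match the opening one.
  const name = source.match(/\bname:\s*(['"])(.*?)\1/);
  const description = source.match(/\bdescription:\s*(['"])(.*?)\1/);
  if (!name) {
    // Warn instead of failing silently so a skipped file is visible in build output.
    console.warn(`scan: no literal command name found in ${fileName}`);
    return null;
  }
  return { name: name[2], description: description?.[2] ?? "" };
}
```

A shared factory would typically pass `name` as a variable rather than a string literal, so a regex scan of this kind would find nothing to extract. That is the failure mode that keeping one file per feed command avoids.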

LGTM overall — solid first adapter contribution! 🎉

@ByteYue ByteYue merged commit 7700704 into jackwener:main Mar 20, 2026
11 of 12 checks passed

2 participants