feat: migrate from Chrome DevTools Protocol to arXiv HTTP API by sonesuke · Pull Request #11 · sonesuke/arxiv-cli

sonesuke · 2026-02-19T12:33:14Z

Summary

- Replace CDP-based scraping with arXiv's public HTTP API
- Add arxiv_client.rs with API-based implementation
- Remove CDP module (browser.rs, connection.rs, page.rs) and JavaScript scraping scripts
- Update dependencies: add quick-xml, chrono; remove tokio-tungstenite, futures, uuid

Key Changes

- search(): Query arXiv API with pagination support
- fetch(): Get paper details by ID with PDF text extraction
- fetch_pdf(): Download raw PDF bytes
- Date filtering with arXiv API's submittedDate format
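
For illustration, the pagination and date filtering described above could map onto the arXiv API query string roughly as follows. This is a minimal sketch using only the standard library, not the code in arxiv_client.rs: the names `build_search_url` and `encode` are hypothetical, the `all:` field prefix is an assumption, and the endpoint, the `start`/`max_results` parameters, and the `submittedDate:[YYYYMMDDHHMM TO YYYYMMDDHHMM]` range syntax are taken from the public arXiv API documentation.

```rust
/// Public arXiv API endpoint (from the arXiv API documentation).
const ARXIV_API: &str = "http://export.arxiv.org/api/query";

/// Percent-encode a string for use as a query parameter value.
/// Hypothetical helper; a real client would more likely use a URL crate.
fn encode(s: &str) -> String {
    let mut out = String::new();
    for b in s.bytes() {
        match b {
            // Unreserved characters pass through unchanged.
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            // Everything else becomes %XX.
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Build a search URL for a zero-based `page` of `page_size` results.
/// `date_range` holds optional (from, to) bounds in YYYYMMDDHHMM (GMT).
/// Hypothetical function; the signature is an assumption for this sketch.
fn build_search_url(
    query: &str,
    date_range: Option<(&str, &str)>,
    page: usize,
    page_size: usize,
) -> String {
    // Assumption: search across all fields with the `all:` prefix.
    let mut search = format!("all:{}", query);
    if let Some((from, to)) = date_range {
        search.push_str(&format!(" AND submittedDate:[{} TO {}]", from, to));
    }
    format!(
        "{}?search_query={}&start={}&max_results={}",
        ARXIV_API,
        encode(&search),
        page * page_size, // offset of the first result on this page
        page_size
    )
}
```

Under these assumptions, `build_search_url("electron", None, 2, 10)` requests the third page of ten results, so the URL carries `start=20&max_results=10`.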

Benefits

This change enables the application to work as a single binary without requiring Chrome to be installed on the system.

Test plan

- Test search functionality with various queries
- Test paper fetching with valid arXiv IDs
- Test PDF download functionality
- Verify binary runs without Chrome dependency

🤖 Generated with Claude Code

This change enables the application to work as a single binary without requiring Chrome to be installed on the system.

Changes:
- Replace CDP-based scraping with arXiv's public HTTP API
- Add arxiv_client.rs with API-based implementation
- Remove src/cdp/ module (browser.rs, connection.rs, page.rs, mod.rs)
- Remove src/scripts/ directory (JavaScript scraping scripts)
- Remove src/arxiv_search.rs (old CDP-based implementation)
- Update Cargo.toml: add quick-xml, chrono; remove tokio-tungstenite, futures, uuid
- Update config.rs: remove headless and browser_path settings
- Update main.rs: remove --head flag and CDP imports

API features:
- search(): Query arXiv API with pagination support
- fetch(): Get paper details by ID with PDF text extraction
- fetch_pdf(): Download raw PDF bytes
- Date filtering with arXiv API's submittedDate format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-advanced-security · 2026-02-20T22:19:42Z

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

* Revert "feat: migrate from Chrome DevTools Protocol to arXiv HTTP API (#11)"

  This reverts commit 011989c.
* test: improve CDP coverage by adding unit tests and E2E execution tests
* feat: auto-start devcontainer in pr-healer script
* chore: strengthen pre-commit and fix undetected clippy failures
* chore: cleanup temporary files

---------

Co-authored-by: Claude <claude@anthropic.com>

claude and others added 3 commits February 18, 2026 21:02

ci: align status check names with branch protection rules

49e8ef2

ci: broaden push triggers to all branches

1108fbf

github-advanced-security bot found potential problems Feb 20, 2026

.github/workflows/ci.yml (fixed)

.github/workflows/ci.yml (fixed)

claude added 4 commits February 21, 2026 07:23

style: format code with cargo fmt

48f1659

fix(ci): fix clippy warnings and code style

7f408e1

style: fix extra empty line in arxiv_client.rs

986f77c

ci: simplify to linux-only and fix test failures

35dae48

github-advanced-security bot found potential problems Feb 20, 2026

.github/workflows/ci.yml (fixed)

claude added 2 commits February 21, 2026 07:31

ci: limit push triggers to main to avoid duplicate runs

1365088

ci: consolidate test and verify jobs into Build & Verify

909a02e

github-advanced-security bot found potential problems Feb 20, 2026

.github/workflows/ci.yml (fixed)

ci: add explicit permissions to workflows

4c19302

sonesuke merged commit 011989c into main Feb 20, 2026
4 checks passed

sonesuke deleted the feat/migrate-to-arxiv-api branch February 20, 2026 22:49

sonesuke pushed a commit that referenced this pull request Feb 21, 2026

Revert "feat: migrate from Chrome DevTools Protocol to arXiv HTTP API (#11)"

8650db2

This reverts commit 011989c.

sonesuke pushed a commit that referenced this pull request Feb 21, 2026

Revert "feat: migrate from Chrome DevTools Protocol to arXiv HTTP API (#11)"

6b7443f

This reverts commit 011989c.

sonesuke pushed a commit that referenced this pull request Feb 21, 2026

Revert "feat: migrate from Chrome DevTools Protocol to arXiv HTTP API (#11)"

4be249f

This reverts commit 011989c.

sonesuke added a commit that referenced this pull request Feb 21, 2026

Revert "feat: migrate from Chrome DevTools Protocol to arXiv HTTP API (#11)" (#17)

97b5a3a

This reverts commit 011989c. Co-authored-by: Claude <claude@anthropic.com>

sonesuke merged 10 commits into main from feat/migrate-to-arxiv-api

3 participants
