[WIP] Add a test that we can actually download the Habr article (markdown and image): https://habr.com/ru/articles/895896 using all our supported engines #9
Summary
This PR adds comprehensive integration tests to verify that the web-capture service can successfully download real-world content from Habr.com (a Russian tech publication) using both Puppeteer and Playwright browser engines.
What Was Tested
The tests verify that we can download the Habr article at https://habr.com/ru/articles/895896 in both supported formats:
Implementation Details
New Files
- `tests/integration/habr-article.test.js` - Integration tests for Habr article downloads (5 test cases)
Modified Files
- `jest.config.mjs` - Added `**/tests/integration/**/*.test.js` to `testMatch` patterns
- `src/browser.js` - Fixed the Playwright adapter to properly handle browser context creation and the `setUserAgent` limitation
Test Results
All 5 new tests pass successfully:
Technical Notes
Playwright User Agent Limitation
During implementation, I discovered that Playwright doesn't support `setUserAgent()` after page creation (unlike Puppeteer). The user agent must be set during browser context creation. I updated the browser abstraction layer to:
- `setUserAgent()` is called on Playwright pages (as it has no effect)
This is acceptable because:
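For illustration, the context-level user agent described in this section could be sketched roughly as follows. This is not the actual code in `src/browser.js`: the helper name `createPlaywrightPage`, its `options` shape, and the no-op `setUserAgent()` on the returned adapter are all assumptions made for the example. The only Playwright calls relied on are `browser.newContext({ userAgent })` and `context.newPage()`.

```javascript
// Hypothetical helper, NOT the real adapter from src/browser.js.
// `browser` is expected to be a Playwright Browser, e.g. from chromium.launch().
async function createPlaywrightPage(browser, options = {}) {
  // In Playwright the user agent is a context-level option, so it has to be
  // supplied here, before any page exists.
  const contextOptions = options.userAgent
    ? { userAgent: options.userAgent }
    : {};
  const context = await browser.newContext(contextOptions);
  const page = await context.newPage();

  return {
    page,
    context,
    // Assumed behaviour: expose a Puppeteer-style setUserAgent() so shared
    // calling code does not throw, but do nothing, because the user agent
    // can no longer be changed once the page exists.
    async setUserAgent() {},
    // Closing the context also closes the pages created from it.
    async close() {
      await context.close();
    },
  };
}
```

With a real browser this might be used as `const { page } = await createPlaywrightPage(await chromium.launch(), { userAgent: 'my-agent' })`, where `chromium` comes from the `playwright` package and `'my-agent'` is a placeholder string.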
Test Strategy
The tests use `domcontentloaded` instead of `networkidle0` to avoid timeouts on complex pages. The tests verify:
Fixes
Fixes #7
🤖 Generated with Claude Code