
fix(cli): exclude hidden elements from scrape command output#46

Merged
echobt merged 1 commit into master from fix/bounty-issue-1707
Jan 26, 2026


@echobt echobt commented Jan 26, 2026

Summary

This PR fixes the scrape command to properly exclude hidden DOM elements from scraped content.

Changes

  • Added is_element_hidden() helper function to detect hidden elements via:
    • hidden attribute
    • style="display:none"
    • style="visibility:hidden"
    • aria-hidden="true"
  • Updated process_node_to_markdown() to skip hidden elements
  • Updated extract_text_content() to skip hidden elements
  • Added unit tests to verify hidden elements are excluded
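The checks listed above can be illustrated with a minimal sketch. This is not the code from this PR: the diff is not shown here, so the bodies below are assumptions, written in Python with only the standard library. The names `is_element_hidden` and `extract_text_content` are borrowed from the change list; `VisibleTextExtractor` and `VOID_TAGS` are hypothetical helpers introduced for illustration. The sketch assumes reasonably well-formed HTML, where every non-void element has a closing tag.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional

# Void elements never get an end tag, so they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def is_element_hidden(attrs: Dict[str, Optional[str]]) -> bool:
    """Return True if the attributes mark the element as hidden (illustrative)."""
    # `hidden` is a boolean attribute: its presence alone hides the element.
    if "hidden" in attrs:
        return True
    # aria-hidden="true"
    if (attrs.get("aria-hidden") or "").strip().lower() == "true":
        return True
    # Inline style: look for display:none or visibility:hidden declarations.
    for decl in (attrs.get("style") or "").split(";"):
        prop, sep, value = decl.partition(":")
        if not sep:
            continue
        prop = prop.strip().lower()
        value = value.replace("!important", "").strip().lower()
        if (prop, value) in {("display", "none"), ("visibility", "hidden")}:
            return True
    return False


class VisibleTextExtractor(HTMLParser):
    """Hypothetical helper: collects text that is outside hidden subtrees."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.chunks: List[str] = []
        self.hidden_depth = 0  # > 0 while inside a hidden subtree

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Once inside a hidden element, every descendant is hidden too.
        if self.hidden_depth > 0 or is_element_hidden(dict(attrs)):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth > 0:
            self.hidden_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.hidden_depth == 0 and text:
            self.chunks.append(text)


def extract_text_content(html: str) -> str:
    """Return the visible text of `html`, skipping hidden elements."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

A check in the spirit of the unit tests mentioned above might feed in markup such as `<p>Visible</p><p hidden>Secret</p>` and assert that only `Visible` survives. Tracking a depth counter rather than a single flag keeps descendants of a hidden element excluded until that element closes.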

Related

Fixes PlatformNetwork/bounty-challenge#1707

Fixes bounty issue #1707

The scrape command was including text from hidden DOM elements in its
output. This fix adds visibility checks before processing elements to
skip content that is hidden via:
- hidden attribute
- style="display:none"
- style="visibility:hidden"
- aria-hidden="true"
@echobt echobt force-pushed the fix/bounty-issue-1707 branch from e923e7d to ed7c5e9 Compare January 26, 2026 16:33
@echobt echobt merged commit c5fd732 into master Jan 26, 2026
12 checks passed


Development

Successfully merging this pull request may close these issues.

[BUG] scrape --selector Returns Text from Hidden Elements
