Releases
v0.4.1
chore: bump version to 0.4.1
Merge pull request #65 from pixlie/chore/update-readme-docs
docs: move development and contributing info to separate guide
chore: restructure documentation with docs/ folder and OS-specific guides
Merge pull request #63 from pixlie/feature/template-detection
fix: resolve clippy warnings for uninlined format args
chore: apply code formatting with cargo fmt
feat: disable domain duplicate filtering in template mode
feat: apply template detection before domain duplicate analysis
feat: implement template detection for variable and text concatenation patterns
Merge pull request #60 from pixlie/feature/real-world-tests
fix: resolve clippy warnings in real-world tests
style: apply cargo fmt formatting
feat: enhance real-world tests with complete SmartCrawler pipeline
fix: configure real-world tests to run serially
feat: add real-world integration tests (#56)
Merge pull request #59 from pixlie/fix/duplicate-root-url-loading
fix: prevent duplicate root URL loading (#58)
Merge pull request #57 from pixlie/feature/filter-domain-duplicate-html-nodes
feat: prioritize root URLs and improve element ID handling
fix: remove overly aggressive page-level duplicate filtering
feat: improve duplicate detection to include complete element structure
fix: improve domain duplicate filtering logic
feat: add domain-level duplicate HTML node filtering
Merge pull request #55 from pixlie/feature/project-restart-step-one
Formatting fixes
chore: fix formatting and linting issues
fix: change default WebDriver port from 9515 to 4444
fix: initialize rustls crypto provider to prevent runtime error
feat: project restart with WebDriver-based crawler
feat: include single-item groups in grouped data detection
Merge pull request #50 from pixlie/feature/grouped-data-extraction
fix: show complete path including grouped elements in path display
refactor: improve grouped data path display with full CSS selectors
fix: eliminate duplicates in grouped data detection
feat: implement grouped data detection with --grouped CLI option
Merge pull request #48 from brainless/feature/data-extraction
fix: resolve clippy warning in extractor module
feat: implement data extraction mode with --extract CLI option
Merge pull request #47 from brainless/chore/remove-html-cleaning
chore: remove HTML cleaning functionality
Merge pull request #46 from brainless/chore/remove-structured-content
Formatting fixes
chore: remove StructuredContent and extract_structured_data
Minor changes to Claude.md
Merge pull request #40 from brainless/feature/improve-html-cleaner
feat: extend image filtering to handle SVG elements (issue #39)
feat: improve HTML cleaner with image filtering and comment removal (fixes #39)
Merge pull request #38 from brainless/feature/clean-html-url-support
fix: initialize crypto provider in browser URL test
feat: use browser/webdriver for URL fetching in --clean-html mode
feat: add URL support to --clean-html mode (fixes #37)