fix: add score_threshold support to BestFirstCrawlingStrategy #1804

Closed
nightcityblade wants to merge 1 commit into unclecode:main from nightcityblade:fix/issue-1801
@nightcityblade
Summary

Add score_threshold parameter to BestFirstCrawlingStrategy, matching the existing behavior in BFSDeepCrawlStrategy and DFSDeepCrawlStrategy.

Fixes #1801

List of files changed and why

crawl4ai/deep_crawling/bff_strategy.py — Added score_threshold parameter to __init__ (defaulting to -infinity for backward compatibility) and added threshold check in the link discovery loop to skip URLs scoring below the threshold.
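The threshold check described above can be sketched roughly as follows. This is a standalone illustration, not the code from bff_strategy.py: `enqueue_links`, `toy_scorer`, and the tuple layout of the queue are hypothetical names and choices made for this example, and the strict less-than comparison is an assumption based on the wording "scoring below the threshold".

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def enqueue_links(
    queue: List[Tuple[float, int, str]],
    links: Iterable[str],
    depth: int,
    scorer: Callable[[str], float],
    score_threshold: float = float("-inf"),
) -> int:
    """Push links that meet the threshold; return how many were skipped.

    Hypothetical helper for illustration only.
    """
    skipped = 0
    for url in links:
        score = scorer(url)
        # Assumed rule: skip URLs scoring strictly below the threshold.
        if score < score_threshold:
            skipped += 1
            continue
        # heapq is a min-heap, so the score is negated to pop the best URL first.
        heapq.heappush(queue, (-score, depth, url))
    return skipped


def toy_scorer(url: str) -> float:
    # Hypothetical scorer: prefer documentation pages.
    return 1.0 if "docs" in url else 0.2


links = ["https://example.com/docs/a", "https://example.com/blog/b"]

filtered_queue: List[Tuple[float, int, str]] = []
skipped = enqueue_links(filtered_queue, links, depth=1, scorer=toy_scorer, score_threshold=0.5)

default_queue: List[Tuple[float, int, str]] = []
skipped_default = enqueue_links(default_queue, links, depth=1, scorer=toy_scorer)
```

With a threshold of 0.5 only the docs URL is enqueued; with the default of negative infinity both URLs are enqueued, as they were before the change.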

How Has This Been Tested?

  • Code review against BFS/DFS implementations to ensure consistent behavior
  • Default value of -infinity ensures no behavioral change for existing users
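The backward-compatibility argument can be checked in isolation. The snippet below is a minimal sketch, assuming the filter is a strict "score < threshold" comparison; `passes` and `DEFAULT_THRESHOLD` are hypothetical names, not part of crawl4ai.

```python
DEFAULT_THRESHOLD = float("-inf")


def passes(score: float, threshold: float = DEFAULT_THRESHOLD) -> bool:
    # Assumed rule: a URL is kept unless its score is strictly below the threshold.
    return not (score < threshold)


# No float is strictly less than -inf, so the default never filters anything.
sample_scores = [-1e308, -1.0, 0.0, 0.2, 1.0, float("-inf")]
all_pass_by_default = all(passes(s) for s in sample_scores)

# An explicit threshold filters only scores strictly below it.
kept_at_half = [s for s in sample_scores if passes(s, 0.5)]
```

Because nothing compares below negative infinity, existing callers that never pass score_threshold see every discovered URL enqueued exactly as before.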

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added/updated unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Add score_threshold parameter to BestFirstCrawlingStrategy, matching the
existing behavior in BFSDeepCrawlStrategy and DFSDeepCrawlStrategy.

URLs scoring below the threshold are now skipped during link discovery
instead of being unconditionally enqueued.

Fixes unclecode#1801
unclecode added a commit that referenced this pull request Mar 7, 2026
…gStrategy

Adds score_threshold parameter (default -inf for backward compatibility)
to BestFirstCrawlingStrategy, matching BFS and DFS strategies. URLs
scoring below the threshold are skipped.
Fixes #1801.
unclecode added a commit that referenced this pull request Mar 7, 2026
@unclecode
Owner

Thanks @nightcityblade — merged into develop, will be in the next release. Good addition to keep BestFirstCrawlingStrategy consistent with BFS/DFS. You've been added to CONTRIBUTORS.md.

@unclecode unclecode closed this Mar 7, 2026
Development

Successfully merging this pull request may close these issues.

[Bug]: BestFirstCrawlingStrategy missing score_threshold support

2 participants