pre-0.12.x => master #574

atuchin-m · 2025-11-20T07:43:27Z

No description provided.

github-actions

Rust Benchmark

Benchmark suite	Current: `29cf12d`	Previous: `3f09734`	Ratio
`rule-match-browserlike/brave-list`	`1991942926` ns/iter (`± 7538002`)	`2062822761` ns/iter (`± 6207839`)	`0.97`
`rule-match-first-request/brave-list`	`1091120` ns/iter (`± 7198`)	`1126665` ns/iter (`± 9322`)	`0.97`
`blocker_new/brave-list`	`132721976` ns/iter (`± 1001032`)	`146929925` ns/iter (`± 463260`)	`0.90`
`blocker_new/brave-list-deserialize`	`24291322` ns/iter (`± 243802`)	`23264715` ns/iter (`± 60876`)	`1.04`
`memory-usage/brave-list-initial`	`10213344` ns/iter (`± 3`)	`10213344` ns/iter (`± 3`)	`1`
`memory-usage/brave-list-initial/max`	`60612235` ns/iter (`± 3`)	`60612235` ns/iter (`± 3`)	`1`
`memory-usage/brave-list-initial/alloc-count`	`996170` ns/iter (`± 3`)	`1231711` ns/iter (`± 3`)	`0.81`
`memory-usage/brave-list-1000-requests`	`2692712` ns/iter (`± 3`)	`2692712` ns/iter (`± 3`)	`1`
`memory-usage/brave-list-1000-requests/alloc-count`	`69464` ns/iter (`± 3`)	`71607` ns/iter (`± 3`)	`0.97`
`url_cosmetic_resources/brave-list`	`190467` ns/iter (`± 798`)	`192394` ns/iter (`± 2074`)	`0.99`
`cosmetic-class-id-match/brave-list`	`3383031` ns/iter (`± 938484`)	`3365434` ns/iter (`± 902553`)	`1.01`

This comment was automatically generated by workflow using github-action-benchmark.

Cargo.toml

antonok-edm · 2025-11-21T00:20:57Z

src/filters/network.rs

    #[deprecated(since = "0.11.1", note = "use get_tokens_optimized instead")]
    pub fn get_tokens(&self) -> Vec<Vec<Hash>> {
-        match self.get_tokens_optimized() {
+        let mut tokens_buffer = TokensBuffer::default();
+        match self.get_tokens_optimized(&mut tokens_buffer) {
            FilterTokens::OptDomains(domains) => {
-                domains.into_iter().map(|domain| vec![domain]).collect()
+                domains.iter().map(|domain| vec![*domain]).collect()
            }
-            FilterTokens::Other(tokens) => vec![tokens],
+            FilterTokens::Other(tokens) => vec![tokens.to_vec()],
            FilterTokens::Empty => vec![],
        }
    }


since we're making a breaking version change, shall we remove this one also?

we can rename get_tokens_optimized back to get_tokens in the process

also, I am open to making these pub(crate) to prevent needing similar breaking changes in the future

Done. I also made pub(crate) related TokensBuffer and FilterTokens

antonok-edm · 2025-11-21T00:28:10Z

src/filters/network.rs

    fn matches_test(&self, request: &request::Request) -> bool;
 }

-impl NetworkMatchable for NetworkFilter {


I think we can also make NetworkMatchable into a pub(crate) trait since it's only implemented for FlatNetworkFilter now

Done. I also removed unused matches_test.
Maybe it makes sense to completely remove trait NetworkMatchable, but since it non-public, we can do it later.

The PR reduces the amount of allocations by using arrayvec crate that allocates memory on a stack instead of a heap. * the number of allocations is reduced (-19%, memory-usage/brave-list-initial/alloc-count); * building time is improved (about -10%, memory-usage/brave-list-initial); * the token limit is increased up to 256; * the tests expectations were updated (one case hit the old token limit).

The PR removes impl NetworkMatchable for NetworkFilter and port the related tests to the flatbuffer impl (that we actually use in production)

Added: - `arrayvec` dependency Changed: - Improved performance and memory usage when parsing filters Removed: - `FilterTokens`, `get_tokens`, `get_tokens_optimized`, and `NetworkMatchable` are now crate-internal only

github-actions · 2025-11-24T22:11:46Z

[puLL-Merge] - brave/adblock-rust@574

Description

This PR upgrades the adblock-rust crate from version 0.11.1 to 0.12.0. The main changes focus on performance optimization by:

Replacing heap-allocated Vec<Hash> with stack-allocated ArrayVec for token storage to avoid allocations during filter tokenization
Refactoring the NetworkMatchable trait to be internal-only and changing test infrastructure to use full Engine instances instead of directly testing filters
Improving regex manager ergonomics by making the borrow method public within the crate and unifying the API across single-threaded and multi-threaded configurations
Updating test infrastructure to use more realistic testing scenarios with complete engines rather than isolated filter matching

Possible Issues

Breaking API changes: The get_tokens() method on NetworkFilter has been deprecated and removed from public API. The get_tokens_optimized() method has been changed to get_tokens() with a different signature that requires a mutable TokensBuffer parameter. This is a breaking change for any external code that directly calls these methods.
ArrayVec capacity limit: The new TokensBuffer type has a fixed capacity of 256 elements. If tokenization ever produces more than 256 tokens, they will be silently dropped. The old code used Vec which could grow unbounded. This could potentially affect filter matching accuracy for extremely complex filters.
Test timing changes: The regex manager tests now use mock_instant::thread_local::MockClock instead of mock_instant::global::MockClock, which could affect test isolation in multi-threaded test runs.
Test expectation change: In tests/matching.rs, the expected values changed from (106860, 136085) to (106861, 136084), indicating one request changed from "passed" to "blocked". This suggests a subtle behavior change in filter matching logic, though the cause isn't obvious from the diff.

Security Hotspots

Silent token truncation (Low-Medium risk): In fast_tokenizer_no_regex in src/utils.rs (lines 49-51), when the token buffer reaches capacity minus 1, the function returns early without processing remaining tokens:
```
if tokens_buffer.capacity() - tokens_buffer.len() <= 1 {
    return; // reserve one free slot for the zero token
}
```
This could theoretically allow an attacker to craft filters that exceed the 256-token limit, causing important tokens to be dropped and potentially bypassing filter rules. However, the practical risk is low as legitimate filters are unlikely to generate this many tokens.

Privacy Hotspots

No significant privacy issues identified in this PR. The changes are primarily performance optimizations and internal refactoring that don't affect data collection, storage, or transmission.

Changes

Changes

Cargo.toml / Cargo.lock / js/Cargo.toml / package.json

Version bump from 0.11.1 to 0.12.0
Added arrayvec = "0.7" dependency

src/utils.rs

Added TokensBuffer type alias: ArrayVec<Hash, 256> (fixed-size array-vector)
Removed TOKENS_BUFFER_SIZE and TOKENS_MAX constants
Modified tokenization functions to accept &mut TokensBuffer instead of &mut Vec<Hash>
Changed early return condition to prevent buffer overflow while reserving space for zero token
Updated tokenize() to convert TokensBuffer to Vec<Hash> for return

src/filters/network.rs

Changed FilterTokens enum to use borrowed slices (&'a [Hash]) instead of owned vectors
Modified get_tokens_optimized() to get_tokens() with new signature requiring &'a mut TokensBuffer parameter
Made get_tokens() internal (pub(crate))
Removed deprecated get_tokens() method
Added matches_test() method for testing that creates a full engine
Made NetworkMatchable trait internal (pub(crate))
Removed NetworkMatchable implementation for NetworkFilter

src/filters/fb_network.rs

Removed matches_test() method from NetworkMatchable implementation for FlatNetworkFilter

src/filters/fb_network_builder.rs

Added TokensBuffer to serialization logic
Modified to use borrowed token slices from get_tokens()

src/filters/network_matchers.rs

Removed check_included_domains() and check_excluded_domains() functions
Only retained the _mapped variants of these functions

src/blocker.rs

Added RegexManagerRef type alias to abstract over single-threaded and multi-threaded implementations
Made borrow_regex_manager() method pub(crate)
Unified regex manager borrowing logic for both feature configurations
Added check_exceptions() test helper method

src/engine.rs

Added check_network_request_exceptions() test helper method
Added borrow_regex_manager() test helper method

src/regex_manager.rs

Changed mock clock import from mock_instant::global::Instant to mock_instant::thread_local::Instant for tests

src/request.rs

Modified calculate_tokens() to use TokensBuffer and convert to Vec<Hash>

tests/ (multiple files)

Refactored tests to use full Engine instances instead of directly testing NetworkFilter.matches()
Removed direct RegexManager usage in tests
Updated test assertions to check engine results rather than filter matches
Modified domain checking tests to use mapped domain functions

sequenceDiagram
    participant Client
    participant Engine
    participant Blocker
    participant FilterList
    participant TokensBuffer
    participant RegexManager
    
    Client->>Engine: check_network_request(request)
    Engine->>Blocker: check(request, resources)
    Blocker->>Blocker: borrow_regex_manager()
    Blocker->>RegexManager: get mutable reference
    RegexManager-->>Blocker: RegexManagerRef
    
    Blocker->>FilterList: check(request, regex_manager)
    
    loop For each filter in list
        FilterList->>TokensBuffer: create/clear buffer
        FilterList->>NetworkFilter: get_tokens(&mut buffer)
        NetworkFilter->>TokensBuffer: tokenize into buffer (stack allocated)
        TokensBuffer-->>NetworkFilter: token references
        NetworkFilter-->>FilterList: FilterTokens (borrowed slices)
        
        FilterList->>NetworkFilter: matches(request, regex_manager)
        NetworkFilter->>RegexManager: compile_if_needed(pattern)
        RegexManager-->>NetworkFilter: compiled regex
        NetworkFilter-->>FilterList: match result
    end
    
    FilterList-->>Blocker: BlockerResult
    Blocker-->>Engine: BlockerResult
    Engine-->>Client: BlockerResult

atuchin-m self-assigned this Nov 20, 2025

atuchin-m requested a review from a team as a code owner November 20, 2025 07:43

github-actions bot added the puLL-Merge label Nov 20, 2025

atuchin-m force-pushed the pre-0.12.x branch from 3869ee3 to 9763c33 Compare November 20, 2025 07:46

github-actions bot reviewed Nov 20, 2025

View reviewed changes

atuchin-m force-pushed the pre-0.12.x branch from 9763c33 to 99c10c3 Compare November 20, 2025 07:49

This comment was marked as resolved.

Sign in to view

atuchin-m force-pushed the pre-0.12.x branch from 99c10c3 to d284e1b Compare November 20, 2025 20:28

antonok-edm reviewed Nov 21, 2025

View reviewed changes

Cargo.toml Show resolved Hide resolved

antonok-edm reviewed Nov 21, 2025

View reviewed changes

atuchin-m force-pushed the pre-0.12.x branch 2 times, most recently from 458febb to e56ec36 Compare November 21, 2025 08:23

atuchin-m added 4 commits November 24, 2025 14:03

Pr dont use old methods in tests (pre-0.12.x) (#572)

b3f66ea

The PR removes impl NetworkMatchable for NetworkFilter and port the related tests to the flatbuffer impl (that we actually use in production)

Adjust public API before 0.12.x release

6b54aaf

v0.12.0 - more memory/performance enhancements

29cf12d

Added: - `arrayvec` dependency Changed: - Improved performance and memory usage when parsing filters Removed: - `FilterTokens`, `get_tokens`, `get_tokens_optimized`, and `NetworkMatchable` are now crate-internal only

antonok-edm force-pushed the pre-0.12.x branch from e56ec36 to 29cf12d Compare November 24, 2025 22:10

antonok-edm approved these changes Nov 24, 2025

View reviewed changes

antonok-edm enabled auto-merge November 24, 2025 22:13

antonok-edm merged commit c69021f into master Nov 24, 2025
9 checks passed

antonok-edm deleted the pre-0.12.x branch November 24, 2025 22:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

pre-0.12.x => master #574

pre-0.12.x => master #574

Uh oh!

atuchin-m commented Nov 20, 2025

Uh oh!

github-actions bot left a comment •

edited

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

Uh oh!

antonok-edm Nov 21, 2025

Uh oh!

antonok-edm Nov 21, 2025

Uh oh!

atuchin-m Nov 21, 2025

Uh oh!

antonok-edm Nov 21, 2025

Uh oh!

atuchin-m Nov 21, 2025

Uh oh!

github-actions bot commented Nov 24, 2025

Changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pre-0.12.x => master #574

pre-0.12.x => master #574

Uh oh!

Conversation

atuchin-m commented Nov 20, 2025

Uh oh!

github-actions bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Rust Benchmark

Uh oh!

This comment was marked as resolved.

Uh oh!

Uh oh!

antonok-edm Nov 21, 2025

Choose a reason for hiding this comment

Uh oh!

antonok-edm Nov 21, 2025

Choose a reason for hiding this comment

Uh oh!

atuchin-m Nov 21, 2025

Choose a reason for hiding this comment

Uh oh!

antonok-edm Nov 21, 2025

Choose a reason for hiding this comment

Uh oh!

atuchin-m Nov 21, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Nov 24, 2025

Description

Possible Issues

Security Hotspots

Privacy Hotspots

Changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions bot left a comment •

edited

Loading