Optimize regexp memory usage #201

Closed

atuchin-m
Collaborator

Currently, the regexp field consumes a lot of memory because of the size of Arc<RwLock<Option<Arc<CompiledRegex>>>>.
It takes 40 bytes per item even when None is stored.
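A rough way to see where the bytes go is to ask the compiler for the sizes involved. This is only a sketch: `CompiledRegex` below is a hypothetical stand-in for the crate's real type, the helper names are made up for illustration, and the exact size of `RwLock` depends on the platform and Rust version, so the printed numbers may not match the 40 bytes quoted above.

```rust
use std::mem::size_of;
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for the crate's CompiledRegex; the real type differs.
#[allow(dead_code)]
struct CompiledRegex {
    pattern: String,
}

// The thread-safe field type quoted above.
type SharedSlot = Arc<RwLock<Option<Arc<CompiledRegex>>>>;

/// Bytes stored inline in each rule: only the outer Arc handle.
fn inline_bytes() -> usize {
    size_of::<SharedSlot>()
}

/// Bytes of the heap payload behind the outer Arc, not counting the
/// two reference counters that Arc allocates next to the payload.
fn payload_bytes() -> usize {
    size_of::<RwLock<Option<Arc<CompiledRegex>>>>()
}

fn main() {
    println!("inline handle:    {} bytes", inline_bytes());
    println!("RwLock payload:   {} bytes", payload_bytes());
    println!("two Arc counters: {} bytes", 2 * size_of::<usize>());
}
```

The heap part is paid for every rule, including rules whose slot only ever holds None.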

Most of the rules (~90%) are non-regex.
With this change, such rules take 8 bytes per item.
It saves about 7 MB of memory in total.
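The field type used in the PR is not shown in this description. As an illustration of the 8-byte figure only, one layout that reaches it on a 64-bit target is `Option<Box<T>>`, where Rust stores None as a null pointer, so no extra tag or heap allocation is needed. The `Rule` type and its methods below are hypothetical, and this sketch does not reproduce lazy compilation or the thread-safety trade-off discussed in this PR.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the crate's CompiledRegex.
struct CompiledRegex {
    pattern: String,
}

// Hypothetical rule type: None for non-regex rules, a boxed regex otherwise.
struct Rule {
    regex: Option<Box<CompiledRegex>>,
}

impl Rule {
    /// A non-regex rule: nothing is allocated on the heap.
    fn plain() -> Self {
        Rule { regex: None }
    }

    /// A regex rule: the compiled data lives behind a single pointer.
    fn with_regex(pattern: &str) -> Self {
        Rule {
            regex: Some(Box::new(CompiledRegex {
                pattern: pattern.to_string(),
            })),
        }
    }

    /// Returns the stored pattern, if this rule has one.
    fn pattern(&self) -> Option<&str> {
        self.regex.as_deref().map(|r| r.pattern.as_str())
    }
}

fn main() {
    let plain = Rule::plain();
    let re = Rule::with_regex("ad[0-9]+");
    println!("slot size: {} bytes", size_of::<Option<Box<CompiledRegex>>>());
    println!("plain: {:?}, regex: {:?}", plain.pattern(), re.pattern());
}
```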

Important note: this makes the engine not thread-safe.
In practice, the browser uses adblock from a single sequence, so we don't need to synchronize threads.

P.S. Many uses of Arc in the code are left untouched to keep the diff reasonable.

Memory usage after loading data/rs-ABPFilterParserData.dat in tests/deserialization.rs
(Release, Ubuntu x64, jemallocator added locally for the measurement):
before: 42094712 bytes allocated / 53563392 bytes resident
after: 34332128 bytes allocated / 48881664 bytes resident

Collaborator

@antonok-edm left a comment

Thanks! I think this is a reasonable change, but we do need to be aware of other projects that use adblock-rust - could we gate this behind an off-by-default cargo feature, so that everything is still thread-safe by default?
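A minimal sketch of what such a gate could look like, assuming a hypothetical feature named `unsync-regex` (not an actual adblock-rust feature) declared in `Cargo.toml` under `[features]` and left out of `default`. The `Slot` wrapper and its methods are made up for illustration. Both variants expose the same API, so the rest of the code does not need to know which one is compiled in.

```rust
// Default build: thread-safe slot, as in the current code.
#[cfg(not(feature = "unsync-regex"))]
mod slot {
    use std::sync::{Arc, RwLock};

    pub struct Slot<T>(Arc<RwLock<Option<Arc<T>>>>);

    impl<T> Slot<T> {
        pub fn empty() -> Self {
            Slot(Arc::new(RwLock::new(None)))
        }
        pub fn set(&self, value: T) {
            *self.0.write().unwrap() = Some(Arc::new(value));
        }
        pub fn is_set(&self) -> bool {
            self.0.read().unwrap().is_some()
        }
    }
}

// Opt-in build: single-threaded slot without atomics or locks.
#[cfg(feature = "unsync-regex")]
mod slot {
    use std::cell::RefCell;
    use std::rc::Rc;

    pub struct Slot<T>(RefCell<Option<Rc<T>>>);

    impl<T> Slot<T> {
        pub fn empty() -> Self {
            Slot(RefCell::new(None))
        }
        pub fn set(&self, value: T) {
            *self.0.borrow_mut() = Some(Rc::new(value));
        }
        pub fn is_set(&self) -> bool {
            self.0.borrow().is_some()
        }
    }
}

use slot::Slot;

fn main() {
    let s: Slot<String> = Slot::empty();
    assert!(!s.is_set());
    s.set("ad[0-9]+".to_string());
    assert!(s.is_set());
}
```

With the feature off, downstream crates keep a `Send + Sync` engine. A single-threaded embedder would opt in explicitly.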

@atuchin-m atuchin-m closed this Mar 28, 2022
@atuchin-m
Collaborator Author

@antonok-edm I've recreated the PR (the current one was made from a fork): #204
As for the feature: I can do it if we need to. Let's discuss in Slack.
