Skip to content
This repository has been archived by the owner. It is now read-only.

Fix domain matching logic #162

Merged
merged 4 commits into from Jan 23, 2019
Merged

Fix domain matching logic #162

merged 4 commits into from Jan 23, 2019

Conversation

@bbondy
Copy link
Member

bbondy commented Jan 23, 2019

Domain matching logic was complex and didn't work in all cases. This caused us to miss an exception allow rule from: brave/brave-browser#2843

This rewrites the domain matching logic to be much faster on checks, but use a bit more memory.

Note that domain matching code is well tested already before this PR, so there's a lot of tests which ensure the safety of this change. New tests were also added to cover the new cases.

bbondy added 3 commits Jan 22, 2019
This also makes it more performant for execution time.
@bbondy bbondy requested review from iefremov and SergeyZhukovsky Jan 23, 2019
}
for (int i = 0; i < numNoFingerprintAntiDomainOnlyExceptionFilters; i++) {
newNoFingerprintAntiDomainOnlyExceptionFilters[i].swapData(&(noFingerprintAntiDomainOnlyExceptionFilters[i]));
}

This comment has been minimized.

Copy link
@bbondy

bbondy Jan 23, 2019

Author Member

Using the old way of simply memcpy over here no longer works because Filter's store pointers now, and they manage their own memory. So when a Filter was destroyed, it would free the pointers twice.

This comment has been minimized.

Copy link
@bbondy

bbondy Jan 23, 2019

Author Member

The perf here doesn't matter btw because this is only for parsing logic. We use deserialized dat files in production.

@bbondy bbondy force-pushed the domain-matching-hashset branch 2 times, most recently from c4ac3f7 to 3c7bc90 Jan 23, 2019
* This Source Code Form is subject to the terms of the Mozilla Public
* License, v. 2.0. If a copy of the MPL was not distributed with this
* file, You can obtain one at http://mozilla.org/MPL/2.0/. */

This comment has been minimized.

Copy link
@bbondy

bbondy Jan 23, 2019

Author Member

This class is used in HashSet's which store the domains and antiDomains for each Filter.

}
if (data) {
delete[] data;
if (domains) {

This comment has been minimized.

Copy link
@bbondy

bbondy Jan 23, 2019

Author Member

domains and antiDomains are lazily created, so check if we need to delete them.

}
if (host) {
delete[] host;
if (!borrowed_data) {

This comment has been minimized.

Copy link
@bbondy

bbondy Jan 23, 2019

Author Member

This is just a refactoring of an early return to not have an early return. Code in this block is the same as before.

@bbondy bbondy force-pushed the domain-matching-hashset branch from 3c7bc90 to c7f2bfa Jan 23, 2019
const char *p = input;
if (anti) {
if (len >= 1 && p[0] != '~') {
bool Filter::containsDomain(const char* domain, size_t domainLen,

This comment has been minimized.

Copy link
@bbondy

bbondy Jan 23, 2019

Author Member

See the header file for a description of this. Mainly it just checks if the specified domain is inside the filter's list of domains (or anti domains).

@@ -298,7 +270,8 @@ void Filter::parseOption(const char *input, int len) {
*pFilterOption = static_cast<FilterOption>(*pFilterOption | FOThirdParty);
} else if (!strncmp(pStart, "first-party", len)) {
// Same as ~third-party
*pFilterOption = static_cast<FilterOption>(*pFilterOption | FONotThirdParty);
*pFilterOption = static_cast<FilterOption>(
*pFilterOption | FONotThirdParty);

This comment has been minimized.

Copy link
@bbondy

bbondy Jan 23, 2019

Author Member

just a line length thing

@@ -387,6 +360,44 @@ bool Filter::hasUnsupportedOptions() const {
return (filterOption & FOUnsupportedSoSkipCheck) != 0;
}

bool Filter::contextDomainMatchesFilter(const char *contextDomain) {

This comment has been minimized.

Copy link
@bbondy

bbondy Jan 23, 2019

Author Member

This checks if a context domain is applicable to this filter.
For example if foo.example.com is the context domain then it will check if this filter is applicable to both foo.example.com and example.com.

Copy link
Member

SergeyZhukovsky left a comment

nothing major that caught my eyes. but needs to tested well before production. Hopefully we will see if something wrong on dev builds.

@bbondy bbondy merged commit aa0485f into master Jan 23, 2019
2 checks passed
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
@bsclifton bsclifton deleted the domain-matching-hashset branch Apr 12, 2019
@bsclifton bsclifton restored the domain-matching-hashset branch Apr 12, 2019
@bsclifton bsclifton deleted the domain-matching-hashset branch Apr 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.