Make adlist ID available, rename queries.regex_id -> queries.list_id #1841

DL6ER · 2023-12-25T04:42:29Z

What does this implement/fix?

Important

This PR needs both core and web branches with the same name to work.

Furthermore, you will have to run pihole -g at least once to upgrade your gravity database so that the list IDs become available to FTL. This should happen automatically during checkout/update.

Rename query_storage.regex_id to query_storage.list_id, as the column is already used to store exact-matching domainlist entries by their ID as well. This commit extends it further to also store the (first) matching anti-/gravity list (if available).
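For readers unfamiliar with this kind of change, here is a minimal sketch of renaming a column in an SQLite table. It is not taken from the FTL sources: the simplified `query_storage` schema and the helper name `rename_regex_id` are assumptions made for illustration, and `ALTER TABLE ... RENAME COLUMN` requires SQLite 3.25 or newer.

```python
import sqlite3


def rename_regex_id(conn):
    """Rename query_storage.regex_id to list_id if the old column still exists.

    Hypothetical helper; the real migration in FTL may look different.
    """
    columns = [row[1] for row in conn.execute("PRAGMA table_info(query_storage)")]
    if "regex_id" in columns and "list_id" not in columns:
        conn.execute("ALTER TABLE query_storage RENAME COLUMN regex_id TO list_id")
    # Return the column names after the (possible) rename
    return [row[1] for row in conn.execute("PRAGMA table_info(query_storage)")]


# Simplified, assumed schema used only for this demonstration
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE query_storage (id INTEGER PRIMARY KEY, domain TEXT, regex_id INTEGER)"
)
conn.execute("INSERT INTO query_storage (domain, regex_id) VALUES ('ads.example.com', 7)")

new_columns = rename_regex_id(conn)
stored = conn.execute("SELECT list_id FROM query_storage").fetchone()[0]
```

The rename keeps existing values, so rows written under the old column name remain readable through `list_id`. Calling the helper a second time is a no-op.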

Related issue or feature (if applicable): N/A

Pull request in docs with documentation (if applicable): N/A

By submitting this pull request, I confirm the following:

I have read and understood the contributors guide, as well as this entire template. I understand which branch to base my commits and Pull Requests against.
I have commented my proposed changes within the code.
I am willing to help maintain this change if there are issues with it later.
It is compatible with the EUPL 1.2 license
I have squashed any insignificant commits. (git rebase)

Checklist:

The code change is tested and works locally.
I based my code and PRs against the repositories developmental branch.
I signed off all commits. Pi-hole enforces the DCO for all contributions
I signed all my commits. Pi-hole requires signatures to verify authorship
I have read the above and my PR is ready for review.

github-actions · 2024-01-01T07:59:56Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions · 2024-01-01T20:03:12Z

Conflicts have been resolved.

PromoFaux · 2024-01-06T12:50:46Z

What if a blocked domain is on multiple adlists? Are we just taking the id/name of the first adlist that we find a match for?

DL6ER · 2024-01-06T15:04:47Z

> What if a blocked domain is on multiple adlists? Are we just taking the id/name of the first adlist that we find a match for?

Yes, and there is not even a guarantee which one this is as it highly depends on the tree-structure of the database index. Regardless, the ID returned here will be the one that triggered the block so that's always correct. But yeah, as you implied, there is no guarantee that disabling/removing this one adlist will cause the domain to be permitted afterwards.

PromoFaux · 2024-01-06T15:35:48Z

Is it worth potentially making list_id return an array?

DL6ER · 2024-01-06T15:41:16Z

We don't have more than one adlist_id (the first one) because we stop looking for additional matches once we find the first hit. This is a necessary simplification as it avoids a needless full scan of the index (which costs additional time). To be fully correct, we would even have to continue scanning in case something else, such as a regex or deny list entry, is also blocking this domain ... you see where this is going ...
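The trade-off described here can be sketched as follows. This is an illustrative model only, not FTL's lookup code: the names `first_match`, `all_matches`, and `on_list` are hypothetical, and the fixed list order is an assumption (the comment above notes that the real order depends on the database index).

```python
calls = []  # records which lists were actually checked


def on_list(list_id, domains):
    """Build a (list_id, matcher) pair that logs every check it performs."""
    def matches(domain):
        calls.append(list_id)
        return domain in domains
    return (list_id, matches)


def first_match(domain, entries):
    """Return the ID of the first matching list and stop scanning."""
    for list_id, matches in entries:
        if matches(domain):
            return list_id
    return None


def all_matches(domain, entries):
    """Return the IDs of all matching lists; always scans every entry."""
    return [list_id for list_id, matches in entries if matches(domain)]


entries = [
    on_list(1, {"tracker.example"}),
    on_list(2, {"ads.example.com"}),
    on_list(3, {"ads.example.com"}),
]

hit = first_match("ads.example.com", entries)
checked_first = list(calls)

calls.clear()
every = all_matches("ads.example.com", entries)
checked_all = list(calls)
```

In this sketch the early exit skips list 3 entirely, which is why only one ID is available; collecting every match means checking every list for every blocked query.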

yubiuser · 2024-01-12T22:04:50Z

Actually, I don't like adding the adlist id to the query log. The simple reason is that the information is not very useful when users use more than one list, yet it adds considerable overhead to all three Pi-hole components. The proper tool to identify which adlist(s) contain the queried domain is query.sh/Search Lists. I could instead imagine a link from each blocked query's details to the search page with a pre-filled (and executed?) search input field.

Renaming queries.regex_id -> queries.list_id is fine to me.

rdwebdesign · 2024-01-12T22:25:06Z

I agree with yubiuser.

Since the same domain can be on multiple lists (and many users have more than one list), knowing only the "first one" adlist_id is not very useful.

> I could instead imagine a link from each blocked query's details to the search page with pre-filled (and executed?) search input field.

We could add a link like /admin/search?domain=example.com on each query and add the javascript code to search the domain.
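A minimal sketch of building such a link with proper query-string escaping. The helper name `search_link` is hypothetical, and the path and the `domain` parameter are taken from the example URL above rather than from the web interface's actual code.

```python
from urllib.parse import urlencode


def search_link(domain, base="/admin/search"):
    """Build a link to the search page with the domain pre-filled."""
    return f"{base}?{urlencode({'domain': domain})}"


link = search_link("example.com")
escaped = search_link("a b&c.example")
```

Encoding the parameter ensures that unusual characters in a queried name cannot break out of the `domain` value.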

DL6ER · 2024-01-13T06:41:57Z

Note that this reasoning also calls regex_id (or list_id after the renaming) into question, as we have the same underlying issue here: there may be several regexes matching a given domain, but we also stop at the first match for efficiency reasons. Does this mean this feature is to be removed altogether and - instead - always use a link to /search ?

yubiuser · 2024-01-14T10:43:07Z

> Does this mean this feature is to be removed altogether and - instead - always use a link to /search ?

I like this idea. It seems like a regression at first, but is the right thing to do after consideration. (It's a bit like switching from line to bar graphs). If there are multiple adlists and/or regex that influence an allow/block decision it seems wrong to show a single "random" responsible adlist/regex.

rdwebdesign

Tested with web and core branches

PromoFaux · 2024-02-09T18:36:54Z

Agree with Yubi here, I'm unsure of the benefit of showing a link to the first adlist we find the domain in. Let's say this causes the user to rethink that adlist and remove it. If it is in more than one list, then they wait and see it blocked again, and again click the link and maybe rethink that next list..... it's all a rather drawn out process.

"Domain was blocked: click here to find out what lists it was on" with a link to /search seems to be the best way to go, both in computational efficiency and in meaningfulness (is that a word? It is now) to an end user.

rdwebdesign · 2024-02-09T18:42:48Z

@PromoFaux

The image on the PR is outdated.

The current code is showing a link to the Search page, as it was suggested.

PromoFaux · 2024-02-09T18:44:08Z

Lol, I didn't yet check or test the code - and as I didn't see any commits after the conversation.... I guess the old code was FP'd away :)

rdwebdesign · 2024-02-09T18:46:11Z

The related code was changed in web PR, but the conversation was kept here, in one place.

DL6ER added the Pi-hole v6.0 label Dec 25, 2023

DL6ER requested a review from a team December 25, 2023 04:42

This was referenced Dec 25, 2023

Make IDs of anti-/gravity lists available in vw_(anti)gravity pi-hole/pi-hole#5526

Merged

Query Log: Show link to groups/lists page if applicable pi-hole/web#2916

Merged

Translate anti-/gravity list IDs to negative numbers so they can be d…

7c29048

…istinguish from domains rather easily. Users are free to foil this method when they force negative IDs into the database but they will never be automatically created Signed-off-by: DL6ER <dl6er@dl6er.de>
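One way to picture the scheme this commit message describes is sketched below. The commit only states that anti-/gravity list IDs are translated to negative numbers; the exact mapping used here (simple negation of a positive row ID) and the function names are assumptions for illustration.

```python
def encode_list_id(row_id, from_gravity):
    """Store anti-/gravity list IDs as negative numbers, domainlist IDs as positive.

    Assumes database row IDs are positive, as automatically created ones are.
    """
    if row_id <= 0:
        raise ValueError("row IDs are expected to be positive")
    return -row_id if from_gravity else row_id


def decode_list_id(stored):
    """Recover the source table and the original row ID from a stored value."""
    source = "gravity" if stored < 0 else "domainlist"
    return source, abs(stored)


adlist_ref = encode_list_id(5, from_gravity=True)
domain_ref = encode_list_id(5, from_gravity=False)
```

As the commit message points out, a user who forces negative IDs into the database would defeat this distinction, which is why the sketch rejects non-positive row IDs.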

DL6ER force-pushed the tweak/list_id branch from 5f59125 to 7c29048 Compare December 25, 2023 19:13

DL6ER added 3 commits December 27, 2023 19:08

Recheck statements in forks to avoid edge-case collisions possibly le…

e022600

…ading to a crash in heavy TCP worker activity Signed-off-by: DL6ER <dl6er@dl6er.de>

Port gravity.db update to version 19 into FTL's testing harness

72aea36

Signed-off-by: DL6ER <dl6er@dl6er.de>

Use IN where = was used but a multi-value result may occur

3d41513

Signed-off-by: DL6ER <dl6er@dl6er.de>
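The difference this commit title refers to can be demonstrated with a small SQLite session. The tables below are simplified stand-ins and the queries are not the ones changed in the commit; they only illustrate that a subquery compared with `=` contributes a single row, whereas `IN` considers every row it returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: two adlists enabled for the same group
conn.executescript(
    """
    CREATE TABLE adlist_by_group (adlist_id INTEGER, group_id INTEGER);
    INSERT INTO adlist_by_group VALUES (1, 0), (2, 0);
    CREATE TABLE gravity (domain TEXT, adlist_id INTEGER);
    INSERT INTO gravity VALUES ('a.example', 1), ('b.example', 2);
    """
)

subquery = "SELECT adlist_id FROM adlist_by_group WHERE group_id = 0"

# '=' treats the subquery as a scalar: only one of its rows is used
with_eq = conn.execute(
    f"SELECT domain FROM gravity WHERE adlist_id = ({subquery})"
).fetchall()

# 'IN' compares against every row the subquery returns
with_in = conn.execute(
    f"SELECT domain FROM gravity WHERE adlist_id IN ({subquery}) ORDER BY domain"
).fetchall()
```

With `=`, domains from only one of the two lists are returned without any error, so the mistake is easy to miss; `IN` returns domains from both.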

github-actions bot added the Merge conflicts label Jan 1, 2024

Merge branch 'development-v6' into tweak/list_id

83dfd62

Signed-off-by: DL6ER <dl6er@dl6er.de>

github-actions bot removed the Merge conflicts label Jan 1, 2024

rdwebdesign approved these changes Feb 6, 2024

DL6ER merged commit 276ba42 into development-v6 Feb 9, 2024
17 checks passed

DL6ER deleted the tweak/list_id branch February 9, 2024 19:52

DL6ER added a commit that referenced this pull request Feb 9, 2024

Fix failed auto-merge in #1841

6a74642

Signed-off-by: DL6ER <dl6er@dl6er.de>

DL6ER mentioned this pull request Feb 9, 2024

Fix failed auto-merge in https://github.com/pi-hole/FTL/pull/1841 #1876

Merged

5 tasks

DL6ER added a commit that referenced this pull request Feb 9, 2024

Merge pull request #1876 from pi-hole/fix/merge

94521b8

Fix failed auto-merge in #1841
