easylist go.*. rule breaks many sites #19

Closed
wmyrda opened this issue Jul 5, 2018 · 5 comments

@wmyrda
commented Jul 5, 2018

Taking care of double rules is not enough, as there are even single rules which, by using .*., break more sites than intended.

#ab2p-block-request-R1304
{+client-header-tagger{ab2p-block-request-R1304} \
}
# |http://go.$domain=nowvideo.sx (easylist.txt: 46984)
go.*.

This rule sets the header for sites such as imasdk.googleapis.com.
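
A minimal sketch of why the pattern over-matches, assuming the host pattern is evaluated as an unanchored regular expression (which is what the behaviour reported here suggests). The `matches` helper is hypothetical and uses `grep -E` only as a stand-in for Privoxy's matcher:

```shell
#!/bin/sh
# Assumption: the host pattern is applied as an unanchored regular expression.
# matches PATTERN HOST -> exit status 0 when PATTERN matches anywhere in HOST.
matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# "go.*." is satisfied by the "go" inside "googleapis", not only by a "go." label.
for host in go.nowvideo.sx imasdk.googleapis.com; do
    if matches 'go.*.' "$host"; then
        echo "$host: tagged for blocking"
    else
        echo "$host: not tagged"
    fi
done
```

Under that assumption both hosts are tagged, although only the first one was the target of the original easylist entry.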

WORKAROUND:
Use sed -i -e '/^go\.\*\./s/^/#/' /etc/privoxy/ab2p.action to disable this rule
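
To preview what the workaround does before touching the real file, it can be tried on a throwaway copy. The sample content below is a hypothetical stand-in for /etc/privoxy/ab2p.action, and `-i` is dropped so nothing is edited in place:

```shell
#!/bin/sh
# Hypothetical sample standing in for /etc/privoxy/ab2p.action.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{+client-header-tagger{ab2p-block-request-R1304} \
}
# |http://go.$domain=nowvideo.sx (easylist.txt: 46984)
go.*.
promo.*.
EOF

# Same sed expression as the workaround, without -i: print the result instead.
# Only lines that start with the literal text "go.*." get a leading "#".
disabled=$(sed -e '/^go\.\*\./s/^/#/' "$sample")
printf '%s\n' "$disabled"
rm -f "$sample"
```

The `go.*.` line comes out commented, while `promo.*.` and the other lines are left as they were.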

P.S. The rulesets I created after all the fixes/workarounds so far still use .*. about 1200 times. Almost all of the others actually seem less harmful, with the exception of promo.*., which comes from easylist.txt as well.

@wmyrda

Author

commented Jul 23, 2018

While trying to fix this issue I did some testing, and this is what I found:

||log. - original adblock record
^log.*. - converted with fix from #23

This is still not right. After the fix it no longer catches hostnames such as blog.mypage.com, but it still catches things like loggingintothepage.mypage.com.

The only proper combination I found was ^log\.(*PRUNE).*? as this would catch log.mypage.com, but not loggingintothepage.mypage.com.
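
The difference can be checked roughly with `grep -E`. This is only a sketch: `(*PRUNE)` is a PCRE backtracking verb that `grep -E` does not understand, so the plain pattern `^log\..*` is substituted for it here, on the assumption that it accepts and rejects the same three sample hosts. The `classify` helper is hypothetical:

```shell
#!/bin/sh
# classify PATTERN HOST -> prints "match" or "no match" (ERE, unanchored search).
classify() {
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then
        echo "match"
    else
        echo "no match"
    fi
}

old='^log.*.'    # pattern produced with the fix from #23
new='^log\..*'   # ERE stand-in for ^log\.(*PRUNE).*? (assumption, see above)

for host in log.mypage.com blog.mypage.com loggingintothepage.mypage.com; do
    echo "$host: old=$(classify "$old" "$host"), new=$(classify "$new" "$host")"
done
```

The escaped dot is what makes the difference: with `\.` the text right after `log` has to be a literal dot, so loggingintothepage.mypage.com no longer qualifies.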

The proposed solution is to change every instance of . into \. in hostnames as well, not only in patterns as is done now, and to change = lst : "*." into = lst : "(*PRUNE).*?".

While changing the latter in the adblock2privoxy code was easy, I do not know Haskell, so I am not sure how to change the dots within the code. I was only able to do it partially, afterwards, with sed -i -e '/\./{/^\^/s/\./\\./}', which changes instances of . into \. but only on lines starting with ^.
My attempts to fix this in the code have failed so far, and help with fixing it is welcome.
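
A small sketch of what that sed post-processing does, run on hypothetical converted patterns instead of the real action file. A `;` is added before the closing `}` so that non-GNU sed implementations are more likely to accept the expression; otherwise it is the same command:

```shell
#!/bin/sh
# Hypothetical host patterns, as they might look after the "(*PRUNE).*?" change.
patterns='^log.(*PRUNE).*?
^ads.example.(*PRUNE).*?
log.(*PRUNE).*?'

# On lines that contain a dot and start with "^", escape the first dot only
# (the s command has no "g" flag, so later dots on the line are untouched).
escaped=$(printf '%s\n' "$patterns" | sed -e '/\./{/^\^/s/\./\\./;}')
printf '%s\n' "$escaped"
```

The first line becomes `^log\.(*PRUNE).*?`, the second line keeps its later dots unescaped, and the line without a leading `^` is not changed at all, which shows the limits of doing this outside the converter.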

@wmyrda

Author

commented Jul 23, 2018

After a bit of trial and error I came up with this. Not only does it compile, it also seems to work just as expected :) It is combined with the previous patch for #23.

diff -Naur adblock2privoxy-9999.old/adblock2privoxy/src/PatternConverter.hs adblock2privoxy-9999/adblock2privoxy/src/PatternConverter.hs
--- adblock2privoxy-9999.old/adblock2privoxy/src/PatternConverter.hs    2018-07-23 14:45:40.829753697 +0200
+++ adblock2privoxy-9999/adblock2privoxy/src/PatternConverter.hs        2018-07-23 14:47:28.325970392 +0200
@@ -34,20 +34,22 @@
             | otherwise = "/"
         host' = case host of
                     "" -> ""
-                    _  -> changeFirst.changeLast $ host
+                    _  -> changeFirst.changeMiddle.changeLast $ host
                     where
                     changeLast []     = []
                     changeLast [lst]
                         | lst == '|' || lst `elem` hostSeparators   =  []
-                        | lst == '*' || lst == '\0'                 =  "*."
-                        | otherwise                                 =  lst : "*."
+                        | lst == '*' || lst == '\0'                 =  "(*PRUNE).*?"
+                        | otherwise                                 =  lst : "(*PRUNE).*?"
                     changeLast (c:cs) = c : changeLast cs
 
+                    changeMiddle = replace "." "\\."
+
                     changeFirst []    = []
                     changeFirst (first:cs)
                         | first == '*'                       =       '.' :  '*'  : cs
                         | bindStart == Hard || proto /= ""   =             first : cs
-                        | bindStart == Soft                  =       '.' : first : cs
+                        | bindStart == Soft                  =       '^' : first : cs
                         | otherwise                          = '.' : '*' : first : cs
 
         query' = case query of

@wmyrda wmyrda referenced this issue Jul 23, 2018

Closed

privoxy #579

@essandess

Owner

commented Aug 26, 2018

@wmyrda I’m honestly still swamped with other projects, but am starting to think about thinking about addressing all the great issues you’ve raised. Rather than work through these linearly, would you please triage what you believe to be the most important issues?

Also, you raised compiler issues in another thread. That one perhaps is the most fundamental because the code refactoring should be done in such a way that it isn’t undone by a version upgrade.

It looks like this may be one of the highest priority issues to address. Would you please weigh in? Note that in markdown you can refer to stuff easily with e.g. #19 links.

@wmyrda

Author

commented Aug 26, 2018

Please do not feel like I am pushing you to do stuff; you may certainly address them whenever you like.
To make it easier to follow what is important, I will create another issue summarizing all open bugs along with my subjective importance (low/medium/severe) and the scope of required work (trivial/normal/high).

As for the compiler issue, I think help is coming.

@wmyrda wmyrda referenced this issue Aug 27, 2018

Closed

META: work plan for cureent issues #26

16 of 18 tasks complete

@wmyrda wmyrda referenced this issue Sep 13, 2018

Closed

tvp.pl #8816

essandess added a commit that referenced this issue Sep 20, 2018

@essandess

Owner

commented Sep 20, 2018

Fixed. See comments in #10.

@essandess essandess closed this Sep 20, 2018

essandess added a commit that referenced this issue Oct 12, 2018
