Version string of filter is truncated if it is not composed only of digits #16

Closed
wmyrda opened this issue Jun 19, 2018 · 6 comments

@wmyrda

wmyrda commented Jun 19, 2018

This may be trivial, but in case it is somehow connected to #15, it seems wise to report it as well.
The version of adblock_anty-dotacje.txt is 100.2, but adblock2privoxy accepts only digits, so it truncates the version to 100.
Likewise, the current version of adblock_adguard.txt is 364.2, and the result in the output file is 364.

EDIT: Looking at the source, the version number is treated as an Integer, which must be a whole number. Switching to floating point here is probably required to accept digits after the dot.
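
The truncation described above can be reproduced in isolation. The following is a minimal sketch, assuming the original parser was essentially many1 digit after the "Version: " prefix (as the later comments suggest); digitsOnlyVersion and parseDigitsOnly are hypothetical names, not identifiers from adblock2privoxy.

```haskell
import Text.Parsec (parse, many1, digit, string)
import Text.Parsec.String (Parser)

-- Hypothetical stand-in for the original parser: only digits are accepted
-- after the "Version: " prefix, and the result is read as an Integer.
digitsOnlyVersion :: Parser Integer
digitsOnlyVersion = read <$> (string "Version: " *> many1 digit)

-- Run the parser on a single header line; Nothing means a parse error.
parseDigitsOnly :: String -> Maybe Integer
parseDigitsOnly = either (const Nothing) Just . parse digitsOnlyVersion ""
```

Because many1 digit stops at the first non-digit and the parser does not require end of input, parseDigitsOnly "Version: 100.2" succeeds with Just 100 and the ".2" is silently left unread.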

essandess added a commit that referenced this issue Sep 23, 2018
@essandess
Owner

I added this code to address the issue, 6fec1a3:

    (<++>) a b = (++) <$> a <*> b
    (<:>) a b = (:) <$> a <*> b
    number = many1 digit
    subnumber = char '.' <:> number
    versionnumber = number <|> number <++> subnumber
    versionParser = (\x -> info{_version = read x}) <$> (string "Version: " *> versionnumber)
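
A note on the snippet above, offered as a sketch rather than a confirmed diagnosis of the commit: Parsec's <|> returns the result of its left alternative as soon as that alternative succeeds, so in number <|> number <++> subnumber the plain number branch wins on an input like 100.2 and the .2 is never consumed. The names leftBiased, withOptionalTail, and run below are hypothetical, and option is only one assumed way to express "digits, then an optional .digits tail".

```haskell
import Text.Parsec (parse, many1, digit, char, option, (<|>))
import Text.Parsec.String (Parser)

number :: Parser String
number = many1 digit

-- A '.' followed by digits, with the '.' kept in the result.
subnumber :: Parser String
subnumber = (:) <$> char '.' <*> number

-- Same shape as the snippet above: the left alternative succeeds first,
-- so the right-hand alternative is never reached.
leftBiased :: Parser String
leftBiased = number <|> ((++) <$> number <*> subnumber)

-- Assumed alternative: parse the digits once, then an optional ".digits" tail.
withOptionalTail :: Parser String
withOptionalTail = (++) <$> number <*> option "" subnumber

-- Hypothetical helper: Nothing on a parse error.
run :: Parser a -> String -> Maybe a
run p = either (const Nothing) Just . parse p ""
```

With these definitions, run leftBiased "100.2" gives Just "100", while run withOptionalTail "100.2" gives Just "100.2". If _version is still an Integer, read would then fail at runtime on a string such as "100.2", so the field type would need to change as well.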

Ideally, the parser should be many1 digit `sepBy` char '.' to handle versions like 8.4.3, but that gives the following type error, which I haven't tracked down yet:

    • Couldn't match type ‘[Char]’ with ‘Char’
      Expected type: Text.Parsec.Prim.ParsecT s u m String
        Actual type: Text.Parsec.Prim.ParsecT s u m [[Char]]
    • In the second argument of ‘(<$>)’, namely
        ‘(string "Version: " *> (many1 digit `sepBy` char '.'))’
      In the expression:
        (\ x -> info {_version = read x})
          <$> (string "Version: " *> (many1 digit `sepBy` char '.'))
      In an equation for ‘versionParser’:
          versionParser
            = (\ x -> info {_version = read x})
                <$> (string "Version: " *> (many1 digit `sepBy` char '.'))
   |
93 |         versionParser = (\x -> info{_version = read x}) <$> (string "Version: " *> (many1 digit `sepBy` char '.'))
   |                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@essandess
Owner

essandess commented Sep 23, 2018

@qrilka If you have a moment, I have an educational Haskell question I’ve been unable to tackle.

We would like to parse version numbers like 8.4.3.

The original code’s parser used many1 digit, and would grab the 8, but not the .4.3.

I thought that the obvious fix would be to replace many1 digit with many1 digit `sepBy` char '.'. But this fails to compile because of a type mismatch between String and [[Char]]. So I hacked in the code block above, which would parse 8.4 and omit the .3.

Why doesn’t the sepBy work here?

@qrilka
Contributor

qrilka commented Sep 23, 2018

@essandess I'm not sure I understand your point about "not work" here. sepBy works just as it's supposed to: it constructs a list of values with the separators excluded, i.e. for "8.4.3" you'll get ["8","4","3"]. I see _version is an Integer, and I wonder how you could store a multicomponent version there. Something like [Int] would be more sensible if the number of version components is not fixed (though I don't yet know how you use that information).
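
The behaviour described here can be checked with a small sketch. The names versionComponents, versionInts, and run are hypothetical; versionInts illustrates the [Int] suggestion and is not how adblock2privoxy stores versions.

```haskell
import Text.Parsec (parse, many1, digit, char, sepBy)
import Text.Parsec.String (Parser)

-- Each component is a run of digits; sepBy discards the '.' separators
-- and collects the components into a list.
versionComponents :: Parser [String]
versionComponents = many1 digit `sepBy` char '.'

-- Hypothetical variant that stores the components as numbers.
versionInts :: Parser [Int]
versionInts = map read <$> versionComponents

-- Hypothetical helper: Nothing on a parse error.
run :: Parser a -> String -> Maybe a
run p = either (const Nothing) Just . parse p ""
```

Here run versionComponents "8.4.3" gives Just ["8","4","3"], which is a [String] rather than a String; that is the mismatch GHC reports when the result is passed straight to read.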

@essandess
Owner

essandess commented Sep 23, 2018

Thanks again for the Haskell n00b pointers @qrilka!

What I mean is that the code fragment many1 digit `sepBy` char '.' does not compile in this statement:

-- versionnumber = many1 digit -- this compiles!
versionnumber = many1 digit `sepBy` char '.'  -- this doesn't compile!!!
-- versionnumber = (++) <$> many1 digit `sepBy` char '.'  -- this doesn't compile either!!!
versionParser = (\x -> info{_version = read x}) <$> (string "Version: " *> versionnumber)

What's the correct sepBy (or equivalent) parser that will grab the string "8.4.3" from a line that looks like:

Version: 8.4.3

Just to keep it simple, I'd like to parse the string alone, and ignore the fact that it is composed of parts that could be read as Int values.

In Python, this would be something like '.'.join(["8","4","3"]), after the parser found the "Version: 8.4.3" and sepBy converted it to ["8","4","3"].

essandess added a commit that referenced this issue Sep 24, 2018
@essandess
Owner

@qrilka Thanks again for the pointer. I got it:

intercalate "." <$> many1 digit `sepBy` char '.'
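
For context, here is a minimal sketch of how that expression might slot into a complete parser. It assumes _version is stored as a String, which the thread does not confirm; the Info record below is a hypothetical one-field stand-in, and versionParser takes the record as an argument only to keep the example self-contained.

```haskell
import Data.List (intercalate)
import Text.Parsec (parse, many1, digit, char, string, sepBy)
import Text.Parsec.String (Parser)

-- Hypothetical record; the real one in adblock2privoxy has other fields.
newtype Info = Info { _version :: String } deriving (Show, Eq)

-- Split on '.', then join the components back into one string,
-- much like '.'.join([...]) in Python.
versionNumber :: Parser String
versionNumber = intercalate "." <$> (many1 digit `sepBy` char '.')

-- Assumes _version is a String, so no call to read is needed.
versionParser :: Info -> Parser Info
versionParser info =
  (\x -> info { _version = x }) <$> (string "Version: " *> versionNumber)

-- Hypothetical helper: Nothing on a parse error.
parseVersionLine :: String -> Maybe Info
parseVersionLine = either (const Nothing) Just . parse (versionParser (Info "")) ""
```

With this sketch, parseVersionLine "Version: 8.4.3" gives Just (Info {_version = "8.4.3"}). Keeping read with an Integer field would instead fail at runtime on any version containing a dot.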

@qrilka
Contributor

qrilka commented Sep 24, 2018

I didn't write anything here today :)
Theoretically you could already do that in the parser, though it doesn't look very pleasant:

λ> parse (do{d1 <- many1 digit; dotDs <- many1 $ (:) <$> char '.' <*> many1 digit; return $ concat (d1:dotDs)}) "" "18.4.3"
Right "18.4.3"

essandess added a commit that referenced this issue Oct 12, 2018
essandess added a commit that referenced this issue Oct 12, 2018