Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
FormatParser, like Parsec parsers, is a Monad and MonadPlus. Some of the standard Parsec primitives are implemented (anyOf, noneOf, <|>, many...). The CSV example from *Real World Haskell* works.
- Loading branch information
0 parents
commit 9881bad
Showing
4 changed files
with
180 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
Name: FormatParser | ||
Version: 0.0.1 | ||
cabal-version: >= 1.6 | ||
Synopsis: Like Parsec, but two-way! (Experimental!) | ||
Description: A Parsec inspired library that allows you to | ||
simultaneously write parsers and formaters, | ||
avoiding duplication of effort. | ||
License: GPL | ||
Author: Christopher Olah | ||
Maintainer: Christopher Olah <chris@colah.ca> | ||
build-type: Simple | ||
Category: Text | ||
|
||
Library | ||
Build-Depends: base >= 3 && < 5 | ||
Extensions: ViewPatterns | ||
Exposed-Modules: | ||
Text.FormatParser.Primitives | ||
Other-Modules: | ||
Text.FormatParser.Definitions |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
FormatParser: Parsec-inspired (Simultaneous) Parsing & Formatting | ||
================================================================= | ||
|
||
Parsec and friends makes writing parsers in Haskell a lovely experience. But, if you're like me, you find yourself irate with the duplication of effort that comes with writing separate formaters. After all, they both convey essentially the same information! | ||
|
||
The following, from *Real World Haskell* is one of my favorite Parsec examples: | ||
|
||
```haskell | ||
csvFile = endBy line eol | ||
line = sepBy cell (char ',') | ||
cell = many (noneOf ",\n") | ||
eol = char '\n' | ||
``` | ||
|
||
Beautiful! We just said what a `csvFile` was and had a parser! But there is certainly enough information there to format the data as well. Why can't we? | ||
|
||
With FormatParser you can! | ||
|
||
```haskell | ||
import Text.FormatParser.Primitives | ||
|
||
csvFile = endBy line eol | ||
line = sepBy cell (char ',') | ||
cell = many (noneOf ",\n") | ||
eol = char '\n' | ||
``` | ||
|
||
```haskell | ||
> "abc,def,egh\nfoo,bar,blah\n" `parseBy` csvFile | ||
Just [["abc","def","egh"],["foo","bar","blah"]] | ||
> [["abc","def","egh"],["foo","bar","blah"]] `formatBy` csvFile | ||
Just "abc,def,egh\nfoo,bar,blah\n" | ||
``` | ||
|
||
The Nitty-Gritty Details | ||
------------------------- | ||
|
||
A `FormatParser s i o` parses streams of `s`s into output `o`s and formats input `i`s into streams of `s`s. | ||
|
||
It is a `Monad` and `MonadPlus`, but that only provides us with a way to route the output values, `o`. We need a way to handle the input values as well! | ||
|
||
The key is the `=|=` operator, of type `(i1 -> i2) -> FormatParser s i2 o -> FormatParser s i1 o`, which allows one to transform the input type. | ||
|
||
|
||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
|
||
module Text.FormatParser.Definitions where | ||
|
||
import Control.Monad | ||
|
||
data FormatParser s i o = | ||
FormatParser { | ||
formater :: i -> Maybe (o, [s]), | ||
parser :: [s] -> Maybe (o, [s]) | ||
} | ||
|
||
|
||
instance Monad (FormatParser s i) where | ||
return a = FormatParser (\_ -> Just (a, [])) (\b -> Just (a,b)) | ||
m >>= f = FormatParser formater' parser' | ||
where | ||
formater' val = do | ||
(mval, mstr) <- formater m val | ||
let n = f mval | ||
(nval, nstr) <- formater n val | ||
return (nval, mstr ++ nstr) | ||
parser' str = do | ||
(mval, str') <- parser m str | ||
let n = f mval | ||
(nval, str'') <- parser n str' | ||
return (nval, str'') | ||
|
||
instance MonadPlus (FormatParser s i) where | ||
mzero = FormatParser (const Nothing) (const Nothing) | ||
mplus a b = FormatParser formater' parser' | ||
where | ||
formater' val = formater a val `mplus` formater b val | ||
parser' val = parser a val `mplus` parser b val | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
{-# LANGUAGE ViewPatterns #-} | ||
|
||
module Text.FormatParser.Primitives where | ||
|
||
import Control.Monad | ||
import Text.FormatParser.Definitions | ||
|
||
parseBy :: [a] -> FormatParser a b c -> Maybe c | ||
parseBy a rw = fmap fst $ parser rw a | ||
|
||
formatBy :: b -> FormatParser a b c -> Maybe [a] | ||
formatBy a rw = fmap snd $ formater rw a | ||
|
||
a <|> b = a `mplus` b | ||
|
||
(=|=) :: (i1 -> i2) -> FormatParser s i2 o -> FormatParser s i1 o | ||
f =|= FormatParser a b = FormatParser (a.f) b | ||
|
||
oneOf vals = FormatParser formater parser | ||
where | ||
parser (x:xs) | x `elem` vals = Just (x, xs) | ||
parser _ = Nothing | ||
formater a | a `elem` vals = Just (a, [a]) | ||
formater _ = Nothing | ||
|
||
noneOf vals = FormatParser formater parse | ||
where | ||
parse (x:xs) | not (x `elem` vals) = Just (x, xs) | ||
parse _ = Nothing | ||
formater a | not (a `elem` vals) = Just (a, [a]) | ||
formater a = Nothing | ||
|
||
manyN :: Int -> FormatParser s i o -> FormatParser s [i] [o] | ||
manyN n (FormatParser childformater childParse) = FormatParser formater (parse n) | ||
where | ||
parse n (childParse -> Just (v, xs) ) = | ||
do | ||
(results, remainder) <- parse (n-1) xs | ||
return (v:results, remainder) | ||
parse n l | n <= 0 = Just ([], l) | ||
parse _ _ = Nothing | ||
formater vals | length vals >= n = do | ||
(newvals, strs) <- fmap unzip $ sequence $ map childformater vals | ||
return (newvals, concat strs) | ||
|
||
many = manyN 0 | ||
many1 = manyN 1 | ||
|
||
-- minor error, fix later | ||
sepByN :: Int -> FormatParser s i o -> FormatParser s () o2 -> FormatParser s [i] [o] | ||
sepByN 0 inter sep = sepByN 1 inter sep <|> (head =|= inter >>= return . return) | ||
sepByN n inter sep = do | ||
x <- head =|= inter | ||
let intersep = do | ||
(const ()) =|= sep | ||
a <- inter | ||
return a | ||
xs <- tail =|= manyN (n-1) intersep | ||
return (x:xs) | ||
|
||
sepBy1 = sepByN 1 | ||
sepBy = sepByN 0 | ||
|
||
endByN n inter sep = manyN n $ do | ||
a <- inter | ||
(const ()) =|= sep | ||
return a | ||
|
||
endBy1 = endByN 1 | ||
endBy = endByN 0 | ||
|
||
|
||
|
||
whitespace :: FormatParser Char a String | ||
whitespace = (const " ") =|= (many $ oneOf " \t\n") | ||
|
||
char :: Char -> FormatParser Char a Char | ||
char c = const c =|= oneOf [c] | ||
|
||
|