running parsers in API #418

Closed
noinia opened this Issue Feb 13, 2012 · 6 comments

noinia commented Feb 13, 2012

Would it be possible to extend the API so that external programs using pandoc as a library can run the readers themselves? I'm asking because this would allow more flexible error handling.

In my particular case I'm using pandoc to convert a Markdown file to HTML (as part of a web application in Snap). If pandoc fails to parse the input file correctly, some other part of my code should be executed. The API does not seem to export the functions required for this. I think there are two ways to fix this:

  1. Provide an alternative to the 'readWith' function found in Text.Pandoc.Parsing that simply returns an Either, e.g.

     readWith' :: GenParser t ParserState a  -- ^ parser
               -> ParserState                -- ^ initial state
               -> [t]                        -- ^ input
               -> Either ParseError a
     readWith' parser state input = runParser parser state "source" input

or

  2. Export the parse* functions in Text.Pandoc.Readers.* (e.g. parseMarkdown) so a program can run a particular parser itself.

In this case (1) would be sufficient. But (2) is a bit more flexible.

noinia commented Feb 27, 2012

When I initially raised the issue I thought either of the above-mentioned solutions would work. But for (1) to work you would still need to export the parsers.

What I think is a nicer solution is to add a readMarkdown' function that produces an 'Either ParseError Pandoc' as an alternative to the readMarkdown function. The readMarkdown function itself can then be implemented in terms of readMarkdown'. The same for the other readers. So that is exactly what I did :). The patch can be found here: http://fstaals.net/junk/pandoc_readers.diff
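The shape of this proposal can be sketched without pandoc itself. In the sketch below, Doc, ParseErr, readDoc' and readDoc are hypothetical stand-ins for Pandoc, ParseError, readMarkdown' and readMarkdown, and the toy "parser" merely rejects unbalanced square brackets. It is not the code from the linked patch, only an illustration of defining the error-throwing reader in terms of the Either-returning one.

```haskell
-- Hypothetical stand-ins for Pandoc and Parsec's ParseError.
newtype Doc = Doc [String] deriving (Eq, Show)
newtype ParseErr = ParseErr String deriving (Eq, Show)

-- | Either-returning reader (the role readMarkdown' plays in the proposal).
-- Toy grammar: the input is accepted only if its square brackets balance.
readDoc' :: String -> Either ParseErr Doc
readDoc' input
  | balanced 0 input = Right (Doc (lines input))
  | otherwise        = Left (ParseErr "unbalanced square brackets")
  where
    balanced :: Int -> String -> Bool
    balanced depth []         = depth == 0
    balanced depth ('[' : cs) = balanced (depth + 1) cs
    balanced depth (']' : cs) = depth > 0 && balanced (depth - 1) cs
    balanced depth (_ : cs)   = balanced depth cs

-- | Error-throwing reader, implemented in terms of readDoc'.
readDoc :: String -> Doc
readDoc = either (\(ParseErr msg) -> error msg) id . readDoc'

main :: IO ()
main = do
  print (readDoc' "a [link]\nsecond line")  -- a Right value
  print (readDoc' "a [broken link")         -- a Left value
```

A caller that wants its own fallback can pattern match on the result of readDoc', while existing callers of readDoc keep the old behaviour.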

noinia commented Mar 13, 2012

Any chance of getting this committed?

Owner

jgm commented Jul 14, 2012

Why can't you just use readMarkdown together with catch or some other function that handles errors?
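A minimal sketch of that suggestion, assuming the reader is a pure function that calls error on a parse failure. Here readDoc is a hypothetical stand-in for readMarkdown, and safeReadDoc is an illustrative helper name; only Control.Exception from base is used.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- Hypothetical pure reader that calls 'error' on bad input
-- (standing in for a reader that throws on parse failure).
readDoc :: String -> [String]
readDoc input
  | null input = error "parse error: empty input"
  | otherwise  = lines input

-- | Force the pure result in IO and turn a thrown ErrorCall into a Left.
-- Note: 'evaluate' only forces weak head normal form, so an error hidden
-- deeper inside a lazy result would not be caught here.
safeReadDoc :: String -> IO (Either String [String])
safeReadDoc input = do
  result <- try (evaluate (readDoc input))
  pure $ case result of
    Left (ErrorCall msg) -> Left msg
    Right doc            -> Right doc

main :: IO ()
main = do
  safeReadDoc "hello\nworld" >>= print
  safeReadDoc "" >>= print
```

This works, but the handler has to live in IO and only sees the error as a string, not as a structured parse error.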

noinia commented Jul 15, 2012

Hmm I guess that is an option too. However that still seems somewhat silly: we have all the error information, which we basically throw away, and then use another error handling mechanism to try to catch the error again.

Owner

jgm commented Jul 15, 2012

Actually, I have long planned to modify the API to allow the parsers to
return warning messages (e.g. about things that were discarded). Rather
than exposing the Parsec parsers themselves, I plan to provide
another function that is like readMarkdown but returns a list
of error/warning messages as well as a result. I think that would
help for your purposes, too.
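One possible shape for such a function, as a sketch only: Doc, readDocWithWarnings and the "%%" comment rule are hypothetical and do not reflect pandoc's actual API. The reader returns the result together with a list of messages about discarded input, and the plain reader is recovered by dropping the messages.

```haskell
-- Hypothetical document type standing in for Pandoc.
newtype Doc = Doc [String] deriving (Eq, Show)

-- | Reader that returns the parsed result plus warning messages.
-- Toy rule: lines starting with "%%" are discarded, with a warning each.
readDocWithWarnings :: String -> (Doc, [String])
readDocWithWarnings input = (Doc kept, warnings)
  where
    numbered    = zip [1 :: Int ..] (lines input)
    isComment l = take 2 l == "%%"
    kept        = [l | (_, l) <- numbered, not (isComment l)]
    warnings    = [ "line " ++ show n ++ ": discarded comment"
                  | (n, l) <- numbered, isComment l ]

-- | The plain reader simply drops the warnings.
readDoc :: String -> Doc
readDoc = fst . readDocWithWarnings

main :: IO ()
main = do
  let (doc, warnings) = readDocWithWarnings "first\n%% note\nsecond"
  print doc
  mapM_ putStrLn warnings
```

The parsing mechanism stays a black box: callers see only the document type and plain strings, with no parser library types in the signature.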

I actually do expose a lot of parsers in the API already, in
Text.Pandoc.Parsing, but I'd prefer not to do so. I'd rather make
the actual parsing mechanisms a black box that can be changed without
affecting the API. Not exposing parsers also reduces problems linking
with code compiled with a different version of parsec.

jgm added the API label Dec 9, 2016

Owner

jgm commented Dec 9, 2016

We now do return an Either, so this was long ago fixed.

jgm closed this Dec 9, 2016
