Unify defaults metadata and markdown metadata parsers #6328
This is intentional. The defaults file is supposed to be more or less equivalent to specifying things on the command line. When you specify |
Okay, ignore Markdown parsing part then, this patch doesn't do it anyway. The main issue here is booleans don't get parsed as booleans. And it's entirely possible to specify booleans in metadata on the CLI. |
I guess the question is: do you want to have the same semantics in the Markdown reader metadata parser and defaults metadata parser? (modulo parsing strings as Markdown) If not, I would argue you need a good reason for that, because different semantics will lead to much confusion. |
If booleans can be specified using |
@lierdakil You can now convert existing PRs to Draft mode. The link to do so is under the "Reviewers" section in the sidebar.
@lierdakil - can you clarify the status of this draft? What work is still needed before it's ready to review? |
So, okay, a bit of an explanation of how I'm going about this. There's a function
Given this modification, I'm re-using
The original question I was implicitly asking with this PR is whether or not this all seems like a sound approach.
On the positive side, this guarantees consistent handling of metadata in the defaults file and in-line as metadata blocks, with the notable exception of parsing string values as Markdown (in the current implementation, defaults file strings are treated as unformatted strings).
On the negative side, this doesn't at all guarantee that anything would be consistent with the CLI (which uses a different code path altogether). Another possible negative is that the code ends up being somewhat convoluted, although I'm not sure if there's a better way to handle deep-ish monad stacks anyway.
As for what still needs to be done, the current implementation silently ignores any parsing errors reported by
I'd like to encourage a quick look over the code as it is now (while the changeset is relatively small and contains little noise), to figure out if I should pursue this further, or if this approach should be abandoned in favour of another perhaps.
src/Text/Pandoc/App/Opt.hs
Outdated
  where
    pMetaString = pure . MetaString <$> P.manyChar P.anyChar
    runEverything :: P.ParserT Text P.ParserState PandocPure (P.F (M.Map Text MetaValue)) -> Meta
    runEverything p =
      either (const mempty) (Meta . flip P.runF def) . join . runPure $ P.readWithM p def ""
yamlToMeta _ = mempty
I'm confused about this code. What's the point of runEverything here if we're just passing in pMetaString?
Basically: to make the type checker happy. yamlMap is several monad stacks deep before we get to the actual MetaMap, runEverything has to unwrap those.
FWIW, some of those layers need to be joined with the Parser in doOpt for the purposes of error handling, but I was being a little lazy with the proof-of-concept.
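To make the unwrapping described in this thread concrete, here is a minimal sketch in plain Haskell. It is a toy model, not pandoc's code: `Future`, `Err`, `Meta`, `unwrapLossy` and `unwrapStrict` are hypothetical stand-ins, assuming the layers in question are an outer `Either` from running the pure monad, an inner `Either` from the parser, and a deferred value that still needs a final state.

```haskell
import Control.Monad (join)

-- Hypothetical stand-in for the parser's deferred-value wrapper:
-- a result that still needs a final state before it is produced.
newtype Future st a = Future { runFuture :: st -> a }

type Err  = String
type Meta = [(String, String)]  -- toy metadata map

-- Collapse Either-in-Either with join, then either discard the
-- error (yielding empty metadata) or run the deferred value.
-- This mirrors the "silently ignore errors" shape.
unwrapLossy :: Either Err (Either Err (Future Int Meta)) -> Meta
unwrapLossy = either (const mempty) (`runFuture` 0) . join

-- Same unwrapping, but the error is kept for the caller to report.
unwrapStrict :: Either Err (Either Err (Future Int Meta)) -> Either Err Meta
unwrapStrict = fmap (`runFuture` 0) . join
```

The lossy variant turns any failure into empty metadata, which is convenient for a proof-of-concept but hides parse errors; the strict variant leaves the decision to the caller.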
Quick review: I am okay with the change in T.P.Readers.Metadata. I don't quite understand some of the added complexity in T.P.App.Opt, as noted in the comment on the code. Also, I thought that one of the aims here was to ensure that CLI metadata and (leaf nodes for) metadata in defaults files are parsed the same (particularly as regards boolean values), and I don't see that in this PR.
As I said above, this PR does not touch the CLI code path at all. The idea here is to handle metadata specified from the defaults file the same as metadata specified via the YAML metadata block (because both are YAML metadata, and discrepancies are confusing). The way I achieve the intended goal is to pass the whole YAML object containing the metadata in the defaults file to the parser in |
Basically, this is what I'm trying to point out: we don't have to reinvent the wheel with parsing metadata in the defaults file, we already have a parser for YAML metadata, which works in the context of the Markdown parser. We can use that one with minimal modification (at the cost of some arguably ugly monad stack unwrapping). That reduces effort duplication and gives us some UX consistency for free.
If this sounds reasonable then I can tidy this PR up, and we can move on to review/merge and whatnot. Otherwise, we can abandon this angle and I can quickly slap together some ad-hoc measure to fix boolean parsing in defaults metadata and be done with that (this will lead to some code duplication, which irks me because DRY, but oh well).
Sorry if I'm failing to express my thoughts clearly.
P.S. I would argue using the same code to do basically the same job is highly preferable, but I acknowledge that it's a personal opinion, and that's why I'm asking if it sounds reasonable to you.
@brainchild0 - just a note that comments like the last one are motivation-killers in open-source projects. I haven't forgotten about this PR, but believe it or not, I have other work to do besides pandoc. I've indicated general agreement with the proposed change -- my last message asks for clarification only, which I happily received. The walls of text you have provided make it harder to review the PR; please keep your comments as succinct as possible. |
@jgm: Although the overall intention was different from what may have been construed, I understand the view you express. I removed the preceding comment, which I am interpreting as the target of your remark, being the only one you indicated explicitly. (Please be aware that my reading of the conversation, though perhaps inaccurate, was not one of agreement. It may be beside the point.) |
6585b96 to f916163
@jgm, so, okay, sorry it took a while, I got busy with day job and didn't find much time to work on this. Anyway, I've updated the PR. A few notes:
|
f916163 to dd9840a
This is a friendly reminder that this PR is ready for review. I'm afraid this will go out of sync with master if it marinates for much longer, and it does fix a real issue (report linked in the OP). No pressure, just in case it slipped through the cracks. |
Sorry, I just forgot about this. I'll take a look now. |
src/Text/Pandoc/Readers/MediaWiki.hs
Outdated
@@ -1,6 +1,4 @@
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RelaxedPolyRec #-}
Can you split these deletions of RelaxedPolyRec into a separate PR, since they aren't related?
src/Text/Pandoc/Readers/Markdown.hs
Outdated
return $ runF meta defaultParserState
parsed <- readWithM parser def{ stateOptions = opts } ""
case parsed of
  Right result -> return result
  Left e -> throwError e

asBlocks :: Functor f => f (B.Many Block) -> f MetaValue
asBlocks p = MetaBlocks . B.toList <$> p
asMetaBlocks might be a better name?
Sidenote: this is the same as fmap toMetaValue, where the latter is from Text.Pandoc.Builder.
@tarleb, good point.
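The equivalence mentioned in this thread can be checked with a small sketch. It assumes the pandoc-types package is available (it is not part of GHC's standard libraries); `asBlocksViaToMetaValue` is a hypothetical name used here only for comparison, and `asBlocks` is copied from the diff above with the `B.` qualifier dropped.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.Pandoc.Builder (Blocks, para, text, toList, toMetaValue)
import Text.Pandoc.Definition (MetaValue (..))

-- The helper from the diff: wrap a block list as MetaBlocks.
asBlocks :: Functor f => f Blocks -> f MetaValue
asBlocks p = MetaBlocks . toList <$> p

-- The suggested alternative, relying on the ToMetaValue
-- instance for Blocks from Text.Pandoc.Builder.
asBlocksViaToMetaValue :: Functor f => f Blocks -> f MetaValue
asBlocksViaToMetaValue = fmap toMetaValue

-- A sample input for comparing the two in GHCi.
sample :: Maybe Blocks
sample = Just (para (text "hello"))
```

In GHCi, `asBlocks sample == asBlocksViaToMetaValue sample` compares the two on one input; this is a spot check, not a proof.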
I think I'm fine with this change, though I have a nagging worry that there's something I'm overlooking. What more would be involved in making boolean values work in metadata in defaults files? I'd like to make a pandoc release very soon, so we can support ghc 8.10, and it would be convenient to include this (since we'd be going to 2.10 anyway and this is an API change). |
Btw it's helpful if all API changes are clearly indicated as such in the commit messages, so I don't forget to include them in the changelog. |
dd9840a to c6e019a
Okay, I've removed RelaxedPolyRec stuff and
As long as we're using the same code path in Markdown and defaults file metadata parsing, nothing. That's mostly the point: avoid duplicating pandoc/src/Text/Pandoc/Readers/Metadata.hs Lines 84 to 88 in f2b3377
Note to self, pandoc/src/Text/Pandoc/App/Opt.hs Lines 500 to 508 in 34775b4
Uh... about that. Are there any public-facing API changes? Now that I look at it, |
c6e019a to 3f78c85
Sorry, I had thought the module was exposed. It isn't, you're right. |
And sorry about my misunderstanding; I now see that Booleans are handled. The branch has some conflicts that need to be resolved before merging. |
3f78c85 to 42e7f1e
Rebased. |
Thanks! |
This is prompted by lierdakil/pandoc-crossref#259.
Apparently, pandoc doesn't parse metadata fields in the defaults file the way it does for Markdown files; in particular, boolean fields are parsed as strings. I took a gander, and to my surprise, found that pandoc uses a completely different code path for parsing the defaults file metadata field compared to the YAML metadata block in Markdown.
Long story short, this is a first attempt at merging those code paths. This should hopefully behave a little more consistently (although I'll admit I quickly gave up on figuring out how to parse fields as Markdown while parsing defaults file -- I believe it's feasible, but likely more than a bit tricky)
P.S. This is more a request for comments than a bona fide pull request. The code perhaps needs a little polish. And I keep forgetting about the draft PR feature.
P.P.S. Sorry if this text is a little rambly or a bit incoherent, it's 4 AM and I'm slightly sleep-deprived.
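The mismatch described in this report can be illustrated with a toy model. This is a sketch only: the `MetaValue` type below is a cut-down stand-in for pandoc's, and `leafAsString` and `leafAsYaml` are hypothetical names, not functions from the pandoc code base. It assumes the YAML-aware path recognises the scalars `true` and `false` as booleans, while the defaults-file path described here keeps every scalar as a string.

```haskell
-- Toy model only; not pandoc's actual types or functions.
data MetaValue
  = MetaBool Bool
  | MetaString String
  deriving (Eq, Show)

-- Hypothetical stand-in for the defaults-file path as reported:
-- every scalar is kept as a plain string.
leafAsString :: String -> MetaValue
leafAsString = MetaString

-- Hypothetical stand-in for a YAML-aware path: assumes the
-- scalars "true" and "false" denote booleans.
leafAsYaml :: String -> MetaValue
leafAsYaml "true"  = MetaBool True
leafAsYaml "false" = MetaBool False
leafAsYaml other   = MetaString other
```

Here `leafAsString "true"` gives `MetaString "true"` while `leafAsYaml "true"` gives `MetaBool True`, which is the kind of difference a filter checking for a boolean field would trip over.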