Support for sourcepos for latex output #29

Open. Wants to merge 3 commits into main.

@jeroen (Member) commented Dec 14, 2023

Requested by @dmurdoch. Fixes #28

Once we have settled on the format, we should try to upstream this to cmark.

@dmurdoch

I suspect they won't like it, because there are bound to be cases where the comments change the meaning of the LaTeX source, e.g. if one ends up in some verbatim environment.

@jeroen (Member, Author) commented Dec 14, 2023

Can you try it again? I have now tried to insert the comment at each line break, except inside verbatim environments.
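To make the idea concrete, here is a minimal Python sketch of appending a position comment at the end of each output line while leaving verbatim blocks untouched. It is not the code in this pull request: the function name `annotate` and the `(text, (l1, c1, l2, c2))` input layout are assumptions made for illustration, and only the `%sourcepos(l1:c1-l2:c2)` comment shape is taken from this thread.

```python
def annotate(lines):
    """Append a %sourcepos comment to every output line outside verbatim.

    `lines` is a list of (latex_text, (l1, c1, l2, c2)) pairs. This layout
    is hypothetical and exists only for this sketch.
    """
    out = []
    in_verbatim = False
    for text, (l1, c1, l2, c2) in lines:
        if text.startswith(r"\begin{verbatim}"):
            in_verbatim = True
        if in_verbatim:
            # Inside verbatim a '%' is printed literally, so add nothing.
            out.append(text)
        else:
            out.append(f"{text}%sourcepos({l1}:{c1}-{l2}:{c2})")
        if text.startswith(r"\end{verbatim}"):
            in_verbatim = False
    return out


sample = [
    ("Hello", (1, 1, 1, 5)),
    (r"\begin{verbatim}", (3, 1, 5, 3)),
    ("x % y", (4, 1, 4, 5)),
    (r"\end{verbatim}", (3, 1, 5, 3)),
    ("Bye", (7, 1, 7, 3)),
]
print("\n".join(annotate(sample)))
```

The sketch is deliberately conservative: the `\begin{verbatim}` and `\end{verbatim}` lines themselves also get no comment, so nothing can leak into the verbatim content.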

@dmurdoch

I see a couple of spots where it emits %sourcepos(0:0-0:0). I'm not sure what they have in common, but it might be worth skipping the write if the sourcepos is empty. Here's the input I used:


$equation$

$$ equation $$

This is a sample vignette that contains an error which will be detected
by HTML Tidy.

The error is the use of a non-existent tag, "foobar":

<foobar>

Run `example(processConcordance)` to see before and after reports.

This is *italic*, **bold**, _italic_, __bold__ .

# Header 1

## Header 2

### Header 3

* Item 1
* Item 2
    + Item 2a
    + Item 2b
    
1. Item 1
2. Item 2
3. Item 3
    + Item 3a
    + Item 3b    

Roses are red,   
Violets are blue.

http://example.com

[linked phrase](http://r-project.org)

![alt text](mesh.png)

@jeroen (Member, Author) commented Dec 15, 2023

Thanks, I updated it to remove those.
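For readers following along, the check suggested above can be sketched in a few lines of Python. This is only an illustration of skipping the write when the position is empty, not the actual change; the helper name `sourcepos_comment` is hypothetical, and it assumes an empty position is reported as all zeros, as in the `%sourcepos(0:0-0:0)` output quoted earlier.

```python
def sourcepos_comment(l1, c1, l2, c2):
    """Return a LaTeX comment for a source range, or '' if the range is empty.

    Assumption: a node without position information reports 0:0-0:0.
    """
    if (l1, c1, l2, c2) == (0, 0, 0, 0):
        return ""  # nothing useful to record, so emit no comment at all
    return f"%sourcepos({l1}:{c1}-{l2}:{c2})"


print(repr(sourcepos_comment(0, 0, 0, 0)))
print(repr(sourcepos_comment(3, 1, 3, 12)))
```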

@dmurdoch

This is working nicely now. I'd be happy with this change.

One other thought about suggesting it for cmark: users of cmark don't need this, since it's not that hard to insert new code between parsing and rendering if you are working in C.

A way to do that in your package without running external C code would be to offer the R interface in two steps: one to parse, one to render to some format. The return from the parse step could be an R list object that contains all the information from the parse tree, and the rendering step could use that as input and rebuild a new parse tree from it. (Pandoc used to do this by generating JSON and then reinterpreting it, but I don't think there's a need for that here.) If you're interested in pursuing this I'd be happy to try to write it, or to test it if you write it.
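As a rough illustration of the two-step shape described above, here is a toy Python sketch: one function parses into plain data, another renders from that data, and the caller can edit the data in between. It handles only `# ` headings and single-line paragraphs, does no LaTeX escaping, and has nothing to do with cmark's real parse tree; the names `parse` and `render_latex` and the dict layout are all hypothetical.

```python
def parse(markdown):
    """Step 1: turn a tiny Markdown subset into a plain list of dicts."""
    nodes = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines only separate blocks in this toy
        if line.startswith("# "):
            nodes.append({"type": "heading", "text": line[2:], "line": lineno})
        else:
            nodes.append({"type": "paragraph", "text": line, "line": lineno})
    return nodes


def render_latex(nodes):
    """Step 2: rebuild output from the (possibly modified) node list."""
    out = []
    for node in nodes:
        if node["type"] == "heading":
            out.append(f"\\section{{{node['text']}}}")
        else:
            out.append(node["text"])
    return "\n".join(out)


tree = parse("# Header 1\n\nSome text")
tree[1]["text"] = tree[1]["text"].upper()  # user code runs between the steps
print(render_latex(tree))
```

Even in this toy, nothing stops a caller from deleting a key or inventing a node type, so the render step would need validation before it could be trusted on arbitrary input.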

@jeroen (Member, Author) commented Dec 15, 2023

I think it would be very difficult to convert the entire cmark parse-tree structure into an R object that users can manipulate, and even more difficult to convert an arbitrarily modified R list back into a parse tree for cmark. Honestly, I don't think there is a reliable way to do this: there would be many ways for the user to accidentally corrupt the parse tree.

If you want to manipulate the parsed document, the best way is to work with the XML representation of the parse tree via tinkr, rather than trying to convert everything to R lists and back.
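To show what working on the XML form can look like, here is a small Python sketch that reads block-level source positions from an XML parse tree using only the standard library. It does not use tinkr, which is an R package; Python stands in purely for illustration. The XML string is hand-written and only assumed to resemble the XML that cmark emits with source positions enabled, and `block_positions` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, assumed to resemble cmark's XML output.
XML = """<document xmlns="http://commonmark.org/xml/1.0" sourcepos="1:1-3:9">
  <heading sourcepos="1:1-1:10" level="1">
    <text sourcepos="1:3-1:10">Header 1</text>
  </heading>
  <paragraph sourcepos="3:1-3:9">
    <text sourcepos="3:1-3:9">Some text</text>
  </paragraph>
</document>"""


def block_positions(xml_text):
    """Return (node type, sourcepos) for each top-level block."""
    root = ET.fromstring(xml_text)
    # Element tags look like '{namespace}heading'; keep only the local name.
    return [(child.tag.split("}")[1], child.get("sourcepos")) for child in root]


for kind, pos in block_positions(XML):
    print(kind, pos)
```

Because the tree stays in a well-defined XML schema, edits can be checked against that structure before rendering, which is harder to guarantee for an arbitrary nested list.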

@dmurdoch

Thanks, I was unaware of tinkr. I think it could have solved all of the issues I had, though I imagine it would introduce a lot of dependencies, so your latex changes are still preferable.

Development

Successfully merging this pull request may close these issues.

Feature request: make cmark functions callable