Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --html-highlight support for the HTML backend #3313

Merged
merged 3 commits into from
Oct 30, 2018

Conversation

ice1000
Copy link
Member

@ice1000 ice1000 commented Oct 24, 2018

Signed-off-by: ice1000 ice1000kotlin@foxmail.com

This can close #3137 which is originally my little suggestion.

I know I didn't add tests, examples, etc. That's because I'm not very sure about how to add tests.
A list of what I've done here:

  • Added the command line option, default value, help message for it
  • Tested with a simple .agda file with some comments, no <a> generated for them
  • Haddockumented the new parameters

Suggestions with instructions or examples are welcomed!

@ice1000 ice1000 force-pushed the md-as-is branch 5 times, most recently from e0c22be to f7d58bf Compare October 24, 2018 03:59
@ice1000
Copy link
Member Author

ice1000 commented Oct 24, 2018

I have some better idea (to make this more consistent):

  • Add another Aspect, separate non-code contents and Comment
  • We preserve non-code contents instead of all comments
  • We don't generate HTML metadata in such case, enabling markdown processors to process our generated file directly (maybe we need to rename the command line argument?)

@ice1000
Copy link
Member Author

ice1000 commented Oct 24, 2018

But I need help with the code structure -- can anyone tell me where does Agda convert Layers into Aspects? How are non-code contents and comments treated in literateMd, literateTeX and literateRsT?

@vlopezj vlopezj self-assigned this Oct 24, 2018
@nad
Copy link
Contributor

nad commented Oct 24, 2018

maybe we need to rename the command line argument?

I think that --preserve-comments sounds a little strange, because presumably you do not preserve regular Agda comments (-- ...).

@gallais
Copy link
Member

gallais commented Oct 24, 2018

I've tried to compile the following literatetest.lagda.md with --html and preserve-comments

# A Post

```
module literatetest where
```

This comment should be left as is.

and this is what I get:

<!DOCTYPE HTML>
<html>
<head>
  <meta charset="utf-8">
  <title>literatetest</title>
  <link rel="stylesheet" href="Agda.css">
</head>
<body>
  <pre>
   # A Post

```
<a id="15" class="Keyword">module</a>
<a id="22" href="literatetest.html" class="Module">literatetest</a>
<a id="35" class="Keyword">where</a>
```

This comment should be left as is.
  </pre>
</body>
</html>

I don't think that's the intended output. The <pre> tag prevents markdown from
interpreting the # A Post as an h-title. You should at least strip the header & footer.
And then there are still questions about getting markdown to handle the code blocks
properly.

@nad
Copy link
Contributor

nad commented Oct 24, 2018

I think most of the interesting code can be found in Agda.Syntax.Parser.Literate, Agda.Interaction.Highlighting.LaTeX and Agda.Interaction.Highlighting.HTML. The HTML backend treats all source files in the same way, perhaps you can get inspiration from the LaTeX backend. (Please try to avoid code duplication.)

@vlopezj
Copy link
Contributor

vlopezj commented Oct 24, 2018

@ice1000
Thanks for taking this on.

can anyone tell me where does Agda convert Layers into Aspects?
First, the Layers are converted into a stream of Tokens (this happens in the function Agda.Syntax.Parser.parseLiterateWithComments). These tokens go through the Agda lexing, parsing, type-checking etc, and produce, among other things, the highlighting information,

The highlighting information comes out as a HighlightingInfo, which is a type synonym for CompressedFile. The Agda.Interaction.Highlighting.LaTeX reads this information using the function toLaTeX and the Agda.Interaction.Highlighting.HTML reads this information in the function tokenStream.

How are non-code contents and comments treated in literateMd, literateTeX and literateRsT?
The literate parsers literateMd, literateTeX and literateRsT assign to each character in the source file three possible roles:

  • Code: Regular Agda code, including comments introduced with Agda comment syntax (-- comment).
  • Markup: Delimiters used to separate the Agda code blocks from the literate text. For example, "\begin{code}" or "\end{code}" in .lagda.tex files, or ``` in .lagda.md files.
  • Comment: Literate text (i.e. everything that's not Code or Markup).

Consecutive characters with the same role are grouped together as a Layer. These Layers are the input to the parseLiterateWithComments function. Note that this function doesn't care whether the original file was .lagda.rst, .lagda.md or .lagda.tex; it only cares about what the role of each layer is.

I hope this answered your questions.

@ice1000
Copy link
Member Author

ice1000 commented Oct 24, 2018

@vlopezj Thank you! Will implement my new idea tonight.

@ice1000
Copy link
Member Author

ice1000 commented Oct 24, 2018

I think that --preserve-comments sounds a little strange, because presumably you do not preserve regular Agda comments (-- ...).

How about --code-only or something similar?

@nad
Copy link
Contributor

nad commented Oct 24, 2018

What about --highlight-markup/--no-highlight-markup?

@ice1000
Copy link
Member Author

ice1000 commented Oct 24, 2018

What about --highlight-markup/--no-highlight-markup?

It sounds strange when markup actually means \begin{code}, \end{code} or ``` or something in rst in Agda.

More suggestions:

  • --preserve-background
  • --preserve-literate

@ice1000
Copy link
Member Author

ice1000 commented Oct 24, 2018

image

The last step is to remove the header and footer

@ice1000
Copy link
Member Author

ice1000 commented Oct 24, 2018

I think we should also remove the ``` around code blocks.

@ice1000
Copy link
Member Author

ice1000 commented Oct 24, 2018

I like --preserve-literate. I'll change the option and the variable name.

@ice1000
Copy link
Member Author

ice1000 commented Oct 24, 2018

Also, this should only work with the HTML backend:

  1. It has something to do with the HTML header/footer
  2. It's pointless to generate LaTeX/GHC/JS with comments not preserved

@ice1000 ice1000 changed the title Add preserveComments support for the HTML backend Add --preserve-literate support for the HTML backend Oct 25, 2018
@ice1000
Copy link
Member Author

ice1000 commented Oct 25, 2018

It works!
image

@ice1000
Copy link
Member Author

ice1000 commented Oct 25, 2018

I think I'll have to un-escape the special characters.

@ice1000
Copy link
Member Author

ice1000 commented Oct 25, 2018

Fixed tests

@ice1000
Copy link
Member Author

ice1000 commented Oct 25, 2018

< (agda2-highlight-add-annotations 'remove '(1 27 (comment) t) '(27 28 (comment) t) '(28 39 (comment) t) '(39 40 (comment) t) '(53 62 (keyword) t) '(65 66 (symbol) t) '(67 70 (primitivetype) t) '(81 82 (comment) t))
---
> (agda2-highlight-add-annotations 'remove '(1 27 (background) t) '(27 28 (background) t) '(28 39 (background) t) '(39 40 (background) t) '(53 62 (keyword) t) '(65 66 (symbol) t) '(67 70 (primitivetype) t) '(81 82 (background) t))
=== Old output ===
(agda2-highlight-add-annotations 'remove '(1 27 (comment) t) '(27 28 (comment) t) '(28 39 (comment) t) '(39 40 (comment) t) '(53 62 (keyword) t) '(65 66 (symbol) t) '(67 70 (primitivetype) t) '(81 82 (comment) t))
=== New output ===
(agda2-highlight-add-annotations 'remove '(1 27 (background) t) '(27 28 (background) t) '(28 39 (background) t) '(39 40 (background) t) '(53 62 (keyword) t) '(65 66 (symbol) t) '(67 70 (primitivetype) t) '(81 82 (background) t))
=== Diff ===
bash: line 8: colordiff: command not found
bash: line 8: wdiff: command not found
Accept new error [y/N/q]? 
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

Is it my fault?

… flag

Signed-off-by: ice1000 <ice1000kotlin@foxmail.com>
Signed-off-by: ice1000 <ice1000kotlin@foxmail.com>
@vlopezj
Copy link
Contributor

vlopezj commented Oct 30, 2018

@ice1000 I rebased your commits on top of @asr's fix (thanks!). Hopefully the CI will now go through now and I can merge in your feature.

@vlopezj vlopezj merged commit 12afa48 into agda:master Oct 30, 2018
@ice1000 ice1000 deleted the md-as-is branch October 30, 2018 21:45
@ice1000
Copy link
Member Author

ice1000 commented Oct 30, 2018

Thanks!
It's good to see such a nice language evolve. 👍

@UlfNorell
Copy link
Member

Thank you for contributing!

@ice1000
Copy link
Member Author

ice1000 commented Oct 31, 2018

Oh no! I can't find my test!

@ice1000 ice1000 mentioned this pull request Oct 31, 2018
@andreasabel
Copy link
Member

That looks like just the feature I need for writing talk slides with markdown! Shall try it out.

@ice1000
Copy link
Member Author

ice1000 commented Nov 1, 2018

That looks like just the feature I need for writing talk slides with markdown! Shall try it out.

Currently I use it in this way:

  1. Write .lagda.md file
  2. agda --html --html-highlight=code my.lagda.md
  3. cd html
  4. mv my.html my.md
  5. pandoc my.md
  6. wrap everything with an HTML header/footer

But it's much less work compared to using the markdown frontend + HTML backend before this PR, which requires you to unwrap nearly every <a class="Comment"> blocks manually (complicated logic, it's also a hard work for creating scripts).

I hope if someone (or me in the future maybe?) could find a way to reduce the steps.

@nad
Copy link
Contributor

nad commented Nov 4, 2018

mv my.html my.md

I suggest that when Agda outputs Markdown the extension should be .md, not .html. (Perhaps it would also be nice to support output to stdout.)

wrap everything with an HTML header/footer

Pandoc can do something like this.

@vlopezj vlopezj removed their assignment Nov 4, 2018
@vlopezj
Copy link
Contributor

vlopezj commented Nov 4, 2018

I can think of --html-output-extension option, which by default would have the value html, but can be changed to md, or to anything else. An output extension of - could mean outputting to stdout.

For usability, we can also have a --markdown command line flag which would be an alias for the conjunction of flags --html --html-highlight=code --html-output-extension=md. (Another possibility would be for Agda to give sensible default values of --html-highlight and --html-output-extension depending on the extension of the input file(s?); but this seems harder both to document and to implement.).

For even more flexibility, instead of the --html-output-extension option, one can have an --html-output-file option, with which a pattern for filenames can be specified. The default would be --html-output-file=%.html, and the user can give --html-output-file=%.md instead. Output to stdout would be --html-output-file=-, and concatenating all the output to a single file of the user's choosing could be --html-output-file=my-blog-post.md.

@ice1000
Copy link
Member Author

ice1000 commented Nov 4, 2018

For even more flexibility, instead of the --html-output-extension option, one can have an --html-output-file option, with which a pattern for filenames can be specified

I consider this silly, since HTML files generated by Agda are using relative links which depends on the HTML files' name, so it's a bad choice to change the file name.
I agree with your suggestions about the extension.

@ice1000
Copy link
Member Author

ice1000 commented Nov 4, 2018

Let me do this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: html HTML generation backend command-line Calling Agda's executable directly literate-agda
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Preserve Markdown as-is when outputting HTML
9 participants