md2html not outputing <meta charset="UTF-8"> #231

Closed
luntik2012 opened this issue Jan 25, 2024 · 5 comments

luntik2012 commented Jan 25, 2024

Hi,
I'm a complete noob in HTML and web; I'm trying to serve a single page generated by md2html using nginx.

The problem is that my file is UTF-8-encoded and browsers show garbage instead of the expected symbols. I've resolved this problem by adding <meta charset="UTF-8"> to the beginning of the md2html output.

Is it possible to have md2html add this tag automatically? Am I doing something wrong?

My nginx config:

	server {
	  listen [::]:80;
	  server_name example.com;

	  charset utf-8;
	  try_files $uri /example.html;
	  root /usr/share/webapps/example;
	}

mity (Owner) commented Jan 25, 2024

Sorry, md2c is not our thing.

This project only provides a Markdown parser lib (libmd4c.so), a Markdown-to-HTML lib (libmd4c-html.so) and a simple command-line wrapper tool, md2html. So you're quite possibly in the wrong place unless someone wrote md2c atop MD4C without our knowledge (but even then you should likely start with that person/team).

luntik2012 changed the title from "Question: nginx + md4c utf-8 output, charset tag needed?" to "Question: nginx + md2html utf-8 output, charset tag needed?" on Jan 25, 2024

luntik2012 (Author) commented

Sorry, that was a typo: I merged md4c and md2html into "md2c", but I meant md2html.

mity (Owner) commented Jan 25, 2024

Yes, md2html should likely add that to the output. It's sort of a legacy of md2html being written primarily as a test tool for the MD4C lib and only becoming a tool of its own as an afterthought.

As a workaround, you may use it without --full-html and create the complete HTML head/footer on your own. The --full-html option will likely only ever be useful to users with very basic needs.

mity closed this as completed in 4933a89 on Jan 25, 2024

luntik2012 (Author) commented Jan 25, 2024

Thanks, this works fine for me

echo '<meta charset="UTF-8">' > example.html && md2html example.md >> example.html

mity (Owner) commented Jan 25, 2024

It may more-or-less work (with some browsers) but strictly speaking you generate invalid HTML that way.

See https://developer.mozilla.org/en-US/docs/Learn/HTML/Introduction_to_HTML/The_head_metadata_in_HTML#what_is_the_html_head

Without the --full-html flag, md2html generates only the content that goes between <body> and </body>, so you should prepend the whole head

<html>
<head>
.....  <!-- custom stuff whatever your needs here are, including all the <meta charset="UTF-8"> -->
</head>
<body>

and append the respective footer

</body>
</html>
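
The head-plus-footer approach described above can be sketched as a small shell function. This is only a sketch under stated assumptions: `wrap_html` is a hypothetical helper name, not something MD4C ships; the title is inserted verbatim with no HTML escaping; and the `<!DOCTYPE html>` line and `<title>` element go beyond the skeleton above, added here because a complete HTML5 document is expected to have them.

```shell
#!/bin/sh
# wrap_html: hypothetical helper (not part of MD4C).
# Reads an HTML fragment on stdin and writes a complete HTML document
# to stdout. $1 is the page title (assumed to need no HTML escaping);
# it defaults to "Untitled".
wrap_html() {
    title=${1:-Untitled}
    # Head: declares the encoding so browsers decode the body as UTF-8.
    cat <<EOF
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>$title</title>
</head>
<body>
EOF
    # Body: copy the fragment from stdin unchanged.
    cat
    # Footer: close the elements opened in the head section.
    cat <<EOF
</body>
</html>
EOF
}
```

Assuming md2html is on the PATH, it could be used as `md2html example.md | wrap_html "Example" > example.html`, which keeps the custom head in one place instead of relying on --full-html.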

mity changed the title from "Question: nginx + md2html utf-8 output, charset tag needed?" to "md2html not outputing <meta charset="UTF-8">" on Jan 25, 2024
mity added the bug label on Jan 25, 2024
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Feb 10, 2024
Changes:

 * Changes mandated by CommonMark specification 0.31:

   - The specification expands set of Unicode characters seen by Markdown
     parser as a punctuation. Namely all Unicode general categories P
     (punctuation) and S (symbols) are now seen as such.

   - The definition of HTML comment has been changed so that `<!-->` and
     `<!--->` are also recognized as HTML comments.

   - HTML tags recognized as HTML block starting condition of type 4 has been
     updated, namely a tag `<source>` has been removed, whereas `<search>`
     added.

   Refer to [CommonMark 0.31.2](https://spec.commonmark.org/0.31.2/) for full
   specification.

Fixes:

 - [#230](mity/md4c#230):
   The fix [#223](mity/md4c#223) in 0.5.1 release
   was incomplete and one corner case remained unfixed. This is now addressed.

 - [#231](mity/md4c#231):
   `md2html --full-html` now emits `<meta charset="UTF-8">` in the HTML header.