HTML --> Latex #2

Open
hupili opened this issue Aug 26, 2012 · 3 comments

Comments

@hupili
Owner

hupili commented Aug 26, 2012

LaTeX syntax is stricter and focuses on semantic meaning, so conversion usually goes the other way: LaTeX --> HTML. There are mature tools for that direction. I'm surveying whether HTML-to-LaTeX conversion is well developed.

@hupili
Owner Author

hupili commented Aug 26, 2012

HTML 2 Latex:
http://html2latex.sourceforge.net/

This one works very well. Minor flaws detected so far:

  • The <hr /> tag is translated to "\hline" in LaTeX. "\hline" works fine in the "tabular" env, but not directly in the "document" env.
    This problem is easy to work around with a little post-processing.
  • Links are lost when the href and the link text differ.
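The "\hline" workaround could look roughly like the sketch below. It is not part of html2latex; it assumes the converter emits "\hline" on a line of its own, and the function name `fix_hline_outside_tabular` and the replacement rule are illustrative choices.

```python
def fix_hline_outside_tabular(latex: str) -> str:
    # Replace a stand-alone \hline with a full-width rule whenever it
    # appears outside any tabular environment (assumption: the converter
    # writes \hline on its own line).
    out = []
    depth = 0  # nesting depth of tabular environments
    for line in latex.splitlines():
        depth += line.count(r"\begin{tabular}")
        if depth == 0 and line.strip() == r"\hline":
            # \hline is only valid inside tabular-like environments,
            # so draw a horizontal rule across the text width instead.
            line = r"\noindent\rule{\linewidth}{0.4pt}"
        depth -= line.count(r"\end{tabular}")
        out.append(line)
    return "\n".join(out)
```

The lost-link flaw cannot be repaired this way, because the href is already gone by the time the LaTeX is produced.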

@hupili
Owner Author

hupili commented Aug 26, 2012

HTML to Latex:
http://sourceforge.net/projects/htmltolatex/

A Java package.

Features:

  • Uses "\href" instead of "\url", so more information is kept (both the target and the link text).

Currently detected flaws:

  • The "tabular" env is not invoked with a proper column format, e.g. "c|c". This is also easy to fix by post-processing.
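A minimal sketch of that post-processing step follows. The exact output of htmltolatex is not shown in this thread, so the sketch assumes the converter writes a bare "\begin{tabular}" line with no column argument; `add_column_spec` and its `align` parameter are hypothetical names.

```python
import re

def add_column_spec(latex: str, align: str = "c") -> str:
    # Assumption: the converter emits "\begin{tabular}" alone on a line,
    # without a column format. Infer the column count from the first row.
    lines = latex.splitlines()
    for i, line in enumerate(lines):
        if line.strip() != r"\begin{tabular}":
            continue
        ncols = 1  # a table whose rows contain no "&" has one column
        for row in lines[i + 1:]:
            if row.strip().startswith(r"\end{tabular}"):
                break
            if "&" in row:
                # Count cell separators, ignoring escaped "\&".
                ncols = len(re.findall(r"(?<!\\)&", row)) + 1
                break
        spec = "|".join([align] * ncols)
        lines[i] = line.replace(r"\begin{tabular}", r"\begin{tabular}{" + spec + "}")
    return "\n".join(lines)
```

For a two-column table this produces the "c|c" format mentioned above.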

@hupili
Owner Author

hupili commented Aug 26, 2012

Thanks for sharing! It turns out many people are thinking about the same thing. I've become obsessed with Markdown recently because of its very simple syntax. It's also friendly to plain-text editors like vim. Basic MD converts to HTML smoothly, but it lacks some features, such as citations and tables. Formulae, of course, are another.

I feel comfortable writing MD, HTML, LaTeX, etc.; I just want to find a more efficient way. When I'm writing notes or a simple blog post, MD seems the best choice. If the notes involve formulae, I have to turn to LaTeX. This is probably the most important reason LaTeX is so widely adopted among researchers.

After a little pondering, I ended up writing my own extensions for MD, allowing tables, formulae, etc. Formulae and tables have very simple notation in LaTeX, which is why I'd like to integrate them into MD: the two share the same design philosophy. I don't have time to rewrite the whole thing, so the solution is to put existing tools together. I'm surveying the best path for the task, e.g. MD -> preprocessing -> Markdown compiler -> HTML-to-LaTeX converter, or MD -> MD-to-LaTeX compiler -> post-processing. pandoc is one important piece to look at, but it does not work very smoothly on my side.

For formal publications, I'll stick to LaTeX. For informal writing, I'd like to use MD to speed up the process, but it should still convert smoothly to HTML and PDF so it can be posted on blogs or distributed in hard copy. In this case, citation is not a big issue for me.
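As one possible shape for the "preprocessing" stage in the MD -> preprocessing -> Markdown compiler path, the sketch below hides LaTeX formulae behind placeholder tokens so a Markdown compiler does not mangle characters like "_" and "*", then restores them afterwards. This is not the extension described above; the "$...$" and "$$...$$" delimiters, the placeholder format, and the names `protect_math` and `restore_math` are all assumptions.

```python
import re

# Display math ($$...$$, may span lines) or inline math ($...$, single line).
MATH = re.compile(r"\$\$.+?\$\$|\$[^$\n]+\$", re.DOTALL)

def protect_math(md: str):
    # Swap each formula for a token with no Markdown-special characters;
    # return the rewritten text and the list of saved formulae.
    saved = []
    def stash(match):
        saved.append(match.group(0))
        return f"MATHPLACEHOLDER{len(saved) - 1}X"
    return MATH.sub(stash, md), saved

def restore_math(text: str, saved):
    # Put the original formulae back after the Markdown compiler has run.
    # A function replacement keeps backslashes in the formulae literal.
    def unstash(match):
        return saved[int(match.group(1))]
    return re.sub(r"MATHPLACEHOLDER(\d+)X", unstash, text)
```

The protected text would be passed to whatever Markdown compiler is in use, and `restore_math` applied to its output before the HTML-to-LaTeX step.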
